Range of representable values of 32
The C++ standard says of floating literals:
If the scaled value is not in the range of representable values for its type, the program is ill-formed.
The scaled value is the significant part multiplied by 10 ^ exponent part.
Under x86-64:
float is a single-precision IEEE-754 type
double is a double-precision IEEE-754 type
long double is an 80-bit extended-precision IEEE-754 type
In this context, what is the range of representable values for each of these three types? Where is this documented, or how is it calculated?
The answer (if you're on a machine with IEEE floating point) is in float.h: FLT_MAX, DBL_MAX and LDBL_MAX. On a system with full IEEE support, these are around 3.4e+38, 1.8e+308 and 1.2e+4932 respectively. (The exact values may vary, and may be expressed differently, depending on how the compiler does its input and rounding. g++, for example, defines them to be compiler built-ins.)
EDIT:
WRT your question (since neither I nor the other responders actually answered it): the range of representable values is [-type_MAX...type_MAX], where type is one of FLT, DBL, or LDBL.
If you know the number of exponent bits and mantissa bits, then based on the IEEE-754 format, one can establish that the maximum absolute representable value is:
2^(2^(E-1)-1) * (1 + (2^M-1)/2^M)
The minimum absolute value (not including zero or denormals) is:
2^(2-2^(E-1))
For float, E is 8 and M is 23. For double, E is 11 and M is 52.
I was looking for the largest number representable in 64 bits and ended up making my own 500-digit floating-point calculator. This is what I came up with if all 64 bits are turned on:
18,446,744,073,709,551,615
18 quintillion 446 quadrillion 744 trillion 73 billion 709 million 551 thousand 615