How does an int-to-float conversion handle large numbers?

Question · Votes: 0 · Answers: 2

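If an integer is cast to float, it has to be rounded or truncated once it becomes too large to be represented exactly by a float. Here is a small test program to look at this rounding.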

#include <stdio.h>

#define INT2FLOAT(num) printf(" %d: %.0f\n", (num), (float)(num))

int main(void)
{
    INT2FLOAT((1<<24) + 1);
    INT2FLOAT((1<<24) + 2);
    INT2FLOAT((1<<24) + 3);
    INT2FLOAT((1<<24) + 4);
    INT2FLOAT((1<<24) + 5);
    INT2FLOAT((1<<24) + 6);
    INT2FLOAT((1<<24) + 7);
    INT2FLOAT((1<<24) + 8);
    INT2FLOAT((1<<24) + 9);
    INT2FLOAT((1<<24) + 10);

    return 0;
}

The output is:

 16777217: 16777216
 16777218: 16777218
 16777219: 16777220
 16777220: 16777220
 16777221: 16777220
 16777222: 16777222
 16777223: 16777224
 16777224: 16777224
 16777225: 16777224
 16777226: 16777226

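The values halfway between two representable integers are sometimes rounded down and sometimes rounded up. It looks as if some kind of round-to-even is applied. How exactly does this work? Where can I find the code that does this conversion?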

c casting floating-point
2 Answers
5
votes

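The behavior of this conversion is implementation-defined (C11 6.3.1.4/2):

If the value being converted is in the range of values that can be represented but cannot be represented exactly, the result is either the nearest higher or nearest lower representable value, chosen in an implementation-defined manner.

This means your compiler should document how it works, but you may not be able to control it.

There are various functions and macros for controlling the rounding direction when rounding a floating-point source to an integer, but I am not aware of any that apply to converting an integer to floating point.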


0
votes

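In addition to what the other answer says: Intel floating-point units, for example, use a full 80-bit floating-point representation internally, with bits to spare... so when the number is rounded to the nearest float with a 23-bit stored significand (which is what I assume from your output), the conversion can be very exact and take all the bits of the int into account.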

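IEEE 754 specifies a 32-bit float with 23 bits dedicated to storing the significand. For normalized numbers the most significant bit is implicit (it is not stored, because it is always 1), so you effectively have 24 significant bits, in the form 1xxxxxxx_xxxxxxxx_xxxxxxxx. That makes 2^24-1 the last number that uses all 24 bits with the lowest bit worth one unit (it is 11111111_11111111_11111111). 2^24 itself is still exact, but from there up to 2^25 you can represent all the even numbers and none of the odd ones, because the least significant bit needed for them is missing. This should mean you are able to represent: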

                                                     v binary point.
16777210  == 2^24-6        11111111_11111111_11111010.
16777211  == 2^24-5        11111111_11111111_11111011.
16777212  == 2^24-4        11111111_11111111_11111100.
16777213  == 2^24-3        11111111_11111111_11111101.
16777214  == 2^24-2        11111111_11111111_11111110.
16777215  == 2^24-1        11111111_11111111_11111111.
16777216  == 2^24         10000000_00000000_00000000_. <-- here the step becomes 2, as there are only 24 bits to play with.
16777217  == 2^24+1       10000000_00000000_00000000_. (there should be a 1 bit after the last 0)
16777218  == 2^24+2       10000000_00000000_00000001_.
...
33554430  == 2^25-2       11111111_11111111_11111111_.
33554432  == 2^25        10000000_00000000_00000000__. <-- here the step becomes 4, as there's another shift
33554436  == 2^25+4      10000000_00000000_00000001__.
...

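If you picture the problem in base 10, suppose our floating-point numbers have only 3 significant decimal digits and an exponent that is a power of ten. Counting up from 0, we get the following: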

  1  => 1.00E0
...
  8  => 8.00E0
  9  => 9.00E0
 10  => 1.00E1  <<< see what happened here... this is the same number as the first but with the ten's exponent incremented, meaning a one digit shift of every digit to the left.
 11  => 1.10E1
...
 98  => 9.80E1
 99  => 9.90E1
100  => 1.00E2  <<< and here.
101  => 1.01E2
...
996  => 9.96E2
997  => 9.97E2
998  => 9.98E2
999  => 9.99E2
1000 => 1.00E3  <<< exact, but from here on there is no fourth digit left to represent units.
1001 => 1.00E3  (this number cannot be represented exactly)
...
1004 => 1.00E3  (this number cannot be represented exactly)
1005 => 1.01E3  (this number cannot be represented exactly) <<< here rounding is applied, but the implementation is free to do whatever it wants.
...
1009 => 1.01E3  (this number cannot be represented exactly)
1010 => 1.01E3 <<< this is the next number that can be represented exactly with three significant digits, so the step has grown from one to ten.
...