计算除 1 以外的 IEEE 754 浮点数的机器 epsilon 有什么魔力吗?

问题描述 投票:0回答:2

换句话说,我想要一个功能如下。

Input:一个正数

a
由 IEEE 754 表示

Output:最大数

e
使
a + e == a
符合 IEEE 754 算法(
e
也可以由 IEEE 754 表示)

  • std::numeric_limits<double>::epsilon
    给出
    a
    为 1 的结果。
  • 这样的功能可以通过简单的二分法来实现。有没有一种方法花费更少的 CPU 周期? bit magic以外的方法也欢迎,只要快就行
c++ floating-point ieee-754
2个回答
1
投票

此代码计算一个数字的 ULP,假设为 IEEE-754 二进制格式:

typedef double FloatType;


/*  Return the ULP of q.

    This was inspired by Algorithm 3.5 in Siegfried M. Rump, Takeshi Ogita, and
    Shin'ichi Oishi, "Accurate Floating-Point Summation", _Technical Report
    05.12_, Faculty for Information and Communication Sciences, Hamburg
    University of Technology, November 13, 2005.

    Since this uses subnormals, it may have poor performance.
*/
FloatType ULP(FloatType q)
{
    static const FloatType Epsilon = std::numeric_limits<FloatType>::epsilon();
    static const FloatType Minimum = std::numeric_limits<FloatType>::denorm_min();
        
    /*  Scale is .75 ULP, so multiplying it by any significand in [1, 2) yields
        something in [.75 ULP, 1.5 ULP) (even with rounding).
    */
    static const FloatType Scale = 0.75 * Epsilon;

    q = std::fabs(q);

    /*  In fmaf(q, -Scale, q), we subtract q*Scale from q, and q*Scale is
        something more than .5 ULP but less than 1.5 ULP.  That must produce q
        - 1 ULP.  Then we subtract that from q, so we get 1 ULP.

        The significand 1 is of particular interest.  We subtract .75 ULP from
        q, which is midway between the greatest two floating-point numbers less
        than q.  Since we round to even, the lesser one is selected, which is
        less than q by 1 ULP of q, although 2 ULP of itself.
    */
    return std::fmax(Minimum, q - std::fma(q, -Scale, q));
}

1
投票

OP 的目标与 epsilon 的含义有一些冲突。

让我们来看看 C++ 规范。

Machine epsilon:1 与可表示的大于 1 的最小值之间的差值。

...并假设 OP 想要

1.0
以外的值,一些
double x
.


这简直了

#include <cmath>

double epsilon_per_x(double x) {
  x = fabs(x);
  return nextafter(x, INFINITY) - x;
}

我们可能想要不同的最大值而不是返回无穷大。

#include <cmath>

double epsilon_per_x(double x) {
  x = fabs(x);
  if (x >= std::numeric_limits<double>::max()) {
    return (nextafter(x/2, INFINITY) - x/2)*2;  
  } else {
    return nextafter(x, INFINITY) - x;
  }
}
© www.soinside.com 2019 - 2024. All rights reserved.