Fast square root optimization?

2018-06-04 15:23:20

If you check out this very nice page:

http://www.codeproject.com/Articles/69941/Best-Square-Root-Method-Algorithm-Function-Precisi

you'll see this routine:

#define SQRT_MAGIC_F 0x5f3759df 
 float  sqrt2(const float x)
{
  const float xhalf = 0.5f*x;

  union // get bits for floating value
  {
    float x;
    int i;
  } u;
  u.x = x;
  u.i = SQRT_MAGIC_F - (u.i >> 1);  // gives initial guess y0
  return x*u.x*(1.5f - xhalf*u.x*u.x);// Newton step, repeating increases accuracy 
}

My question is: is there any particular reason why it is not implemented like this instead:

#define SQRT_MAGIC_F 0x5f3759df 
 float  sqrt2(const float x)
{

  union // get bits for floating value
  {
    float x;
    int i;
  } u;
  u.x = x;
  u.i = SQRT_MAGIC_F - (u.i >> 1);  // gives initial guess y0

  const float xux = x*u.x;

  return xux*(1.5f - .5f*xux*u.x);// Newton step, repeating increases accuracy 
}

Looking at the disassembly, I see one fewer MUL. Is there any purpose to having xhalf at all?
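To make the comparison concrete, here is a short worked derivation of the two forms. The symbols $y_0$ (the initial guess stored in `u.x`), $y_1$ (the guess after one Newton step) and $t$ (the value held in `xux`) are notation introduced here, not names from the original article.

Both versions apply one Newton step to $y_0 \approx 1/\sqrt{x}$ and then multiply by $x$:

\[
\sqrt{x} \approx x\,y_1, \qquad y_1 = y_0\left(\tfrac{3}{2} - \tfrac{x}{2}\,y_0^{2}\right)
\]

The first version evaluates this as

\[
(x \cdot y_0)\cdot\left(1.5 - \big((0.5 \cdot x)\cdot y_0\big)\cdot y_0\right)
\]

which costs five multiplications. With $t = x \cdot y_0$, the second version evaluates

\[
t \cdot \left(1.5 - (0.5 \cdot t)\cdot y_0\right)
\]

which costs four. The two are algebraically identical because $0.5 \cdot t \cdot y_0 = 0.5 \cdot x \cdot y_0^{2}$; they can differ only in how the intermediate products are rounded.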

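With legacy x87 floating-point math, which uses 80-bit registers, the first version may be more accurate: when the multiplications are chained together in the last line, the intermediate results stay in 80-bit registers.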
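Whether intermediates really are held in wider precision depends on the compiler and target. As a rough check, and assuming a C99 compiler, the standard `FLT_EVAL_METHOD` macro from `<float.h>` reports how float expressions are evaluated. The helper name `float_eval_description` is made up for this sketch.

```c
#include <float.h>

/* Hypothetical helper: describes how this build evaluates float
   expressions, based on the C99 FLT_EVAL_METHOD macro. */
static const char *float_eval_description(void)
{
    switch (FLT_EVAL_METHOD) {
    case 0:
        return "float expressions are evaluated in float";
    case 1:
        return "float expressions are evaluated in double";
    case 2:
        return "float expressions are evaluated in long double (typical for x87)";
    default:
        return "evaluation method is indeterminate or implementation-defined";
    }
}
```

A build that reports 0 (common when SSE is used for float math) rounds every intermediate product to 32 bits, so the extra-precision argument would not apply there.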

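In the first implementation, the initial multiplication happens in parallel with the integer operations that follow it, because they use different execution resources. The second function, on the other hand, looks faster, but for that reason it is hard to say whether it really is. In addition, the statement const float xux = x*u.x; rounds the result to a 32-bit float, which may reduce overall precision.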

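You could test these functions head to head and compare them against the sqrt function from math.h (using double rather than float). That way you can see which one is faster and which one is more accurate.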
