Why don't LLVM passes optimize floating point instructions?

2018-06-04 15:15:02

This question already has an answer here:

Why doesn't GCC optimize a*a*a*a*a*a to (a*a*a)*(a*a*a)?

It's not quite true to say that no optimization is possible. I'll go through the first few lines to show where transformations are and are not allowed:

  %addtmp = fadd double %x, %x

This first line could safely be transformed to fmul double %x, 2.0e+0, but that's not actually an optimization on most architectures (fadd is generally as fast as or faster than fmul, and doesn't require materializing the constant 2.0). Note that, barring overflow, this operation is exact (as is any scaling by a power of two that neither overflows nor underflows).
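The exactness claim can be checked directly. The following is a minimal sketch in Python, whose float type is an IEEE-754 double on common platforms; the sample range, the seed, and the helper name doubling_is_exact are assumptions made for illustration.

```python
import random
from fractions import Fraction

random.seed(0)  # arbitrary seed, chosen only for reproducibility
samples = [random.uniform(-1e6, 1e6) for _ in range(10_000)]

def doubling_is_exact(x):
    # Fraction(float) converts exactly, so this compares the rounded
    # result of x + x against the mathematically exact value 2x, and
    # also checks that the fadd form and the fmul form agree.
    return Fraction(x + x) == 2 * Fraction(x) and x + x == 2.0 * x

all_exact = all(doubling_is_exact(x) for x in samples)
print(all_exact)
```

A random sample is evidence, not a proof; the proof is the observation that doubling only changes the exponent.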

  %addtmp1 = fadd double %addtmp, %x

This line could be transformed to fmul double %x, 3.0e+0. Why is this a legal transformation? Because the computation that produced %addtmp was exact, only a single rounding is incurred whether this is computed as x * 3 or x + x + x. Because these are IEEE-754 basic operations and therefore correctly rounded, the result is the same either way. What about overflow? Neither can overflow unless the other does as well.
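As a sanity check of the single-rounding argument, here is a small sketch that compares (x + x) + x with 3 * x over randomly generated doubles. The generator random_double and the exponent range are hypothetical choices; the range is kept well inside the double range so that overflow does not enter the picture.

```python
import math
import random

rng = random.Random(1)  # arbitrary seed, for reproducibility only

def random_double(rng):
    # Hypothetical generator: a significand in [1, 2) scaled by a
    # random power of two, with a random sign.
    significand = rng.uniform(1.0, 2.0)
    exponent = rng.randint(-1000, 1000)
    sign = rng.choice((-1.0, 1.0))
    return sign * math.ldexp(significand, exponent)

mismatches = 0
for _ in range(100_000):
    x = random_double(rng)
    # x + x is exact, so both sides are one correct rounding of 3x.
    if (x + x) + x != 3.0 * x:
        mismatches += 1

print(mismatches)
```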

  %addtmp2 = fadd double %addtmp1, %x

This is the first line that cannot be transformed into constant * x on the same grounds. 4 * x would compute exactly, without any rounding, whereas x + x + x + x incurs up to two roundings: x + x + x is rounded once, then adding x may round a second time.

  %addtmp3 = fadd double %addtmp2, %x

Ditto here; 5 * x would incur one rounding; x + x + x + x + x incurs three.
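The rounding counts quoted for these lines can be reproduced with exact rational arithmetic. This is a rough sketch: the helper names count_roundings_add and count_roundings_mul and the sample value of x are assumptions, not part of the original question. It only counts inexact steps; it does not by itself compare the final values of the two forms.

```python
from fractions import Fraction

def count_roundings_add(x, n):
    """Evaluate x + x + ... + x (n terms, left to right) and
    return how many of the additions were inexact."""
    total, inexact = x, 0
    for _ in range(n - 1):
        next_total = total + x
        # Fraction(float) is an exact conversion, so this compares the
        # rounded sum against the mathematically exact sum.
        if Fraction(next_total) != Fraction(total) + Fraction(x):
            inexact += 1
        total = next_total
    return inexact

def count_roundings_mul(x, n):
    """Return 1 if the single product n * x is inexact, else 0."""
    return int(Fraction(float(n) * x) != n * Fraction(x))

x = 1.0 + 2.0 ** -52  # sample input: the smallest double above 1.0
add_counts = {n: count_roundings_add(x, n) for n in range(2, 6)}
mul_counts = {n: count_roundings_mul(x, n) for n in range(2, 6)}
print(add_counts)
print(mul_counts)
```

For this particular x, the additions round 0, 1, 2 and 3 times for two through five terms, while the products 2 * x through 5 * x round 0, 1, 0 and 1 times, matching the counts given in the text.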

The only line that might be beneficially transformed is x + x + x, which could be replaced with 3 * x. However, the subexpression x + x is already present elsewhere, so an optimizer could easily choose not to employ this transform (since it can take advantage of the existing partial result if it does not).
