Why can't the VS 2015 compiler optimize away the branch in a float abs() implementation?

Question:

__declspec(dllexport)
float foo(float x) {
    return (x < 0) ? x * -1 : x;
}

This is a very naive implementation for calculating abs(x), where x is a float. I compiled it in Release mode with every optimisation I could find enabled. The resulting asm is:

; 4    :    return (x < 0) ? x * -1 : x;

    movss   xmm1, DWORD PTR _x$[ebp]
    xorps   xmm0, xmm0
    comiss  xmm0, xmm1
    jbe SHORT $LN3@foo
    xorps   xmm1, DWORD PTR __xmm@80000000800000008000000080000000
$LN3@foo:
    movss   DWORD PTR tv66[ebp], xmm1
    fld DWORD PTR tv66[ebp]

As you can see, this still contains the branch and the conditional jump. Yet a float is defined by IEEE 754, so I could change the implementation to simply set the sign bit to 0:

__declspec(dllexport)
float foo(float x) {
    void* bar = &x;
    __int32 y = ((*(__int32*)bar) & ~(1 << 31));
    return  *(float*)&y;
}

This version does not jump and requires fewer instructions:

; 3    :        void* bar = &x;
; 4    :        __int32 y = ((*(__int32*)bar) & ~(1 << 31));

    mov eax, DWORD PTR _x$[ebp]
    and eax, 2147483647             ; 7fffffffH
    mov DWORD PTR _y$[ebp], eax

; 5    :        return  *(float*)&y;

    fld DWORD PTR _y$[ebp]

I would have expected that a dedicated instruction even exists for this operation, but maybe only on very special architectures?

So what is the reason the compiler can't catch this optimization? Or am I making a mistake by doing this?

Answer:

Because that would yield the wrong result for negative zero!

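Negative zero is not less than zero, so the original function returns it unchanged, with the sign bit still set. Clearing the sign bit would return +0.0 instead. The two implementations therefore differ for that input, and the compiler is not allowed to replace the conditional branch with a sign-bit mask.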
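The difference can be checked directly. The following is a minimal sketch, assuming float is a 32-bit IEEE 754 value; the names naive_abs, mask_abs and check_negative_zero are made up for this illustration, and the mask version uses memcpy rather than the pointer casts from the question.

```c
#include <assert.h>
#include <math.h>
#include <stdint.h>
#include <string.h>

/* The implementation from the question: negates only when x < 0. */
static float naive_abs(float x) {
    return (x < 0) ? x * -1 : x;
}

/* Sign-bit clearing. memcpy is used here instead of pointer casts
   (an editorial choice to stay clear of strict-aliasing problems). */
static float mask_abs(float x) {
    uint32_t bits;
    memcpy(&bits, &x, sizeof bits);   /* assumes sizeof(float) == 4 */
    bits &= UINT32_C(0x7FFFFFFF);     /* clear bit 31, the sign bit */
    memcpy(&x, &bits, sizeof x);
    return x;
}

/* Hypothetical self-check: both versions agree on an ordinary input
   but return zeros of different sign for -0.0f. */
static void check_negative_zero(void) {
    assert(naive_abs(-1.5f) == mask_abs(-1.5f));
    assert(signbit(naive_abs(-0.0f)));    /* -0.0f < 0 is false, so -0.0f comes back */
    assert(!signbit(mask_abs(-0.0f)));    /* sign bit cleared, result is +0.0f */
}
```

Note that -0.0f == 0.0f compares equal, so the sign has to be inspected with signbit rather than with ==.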

Use

copysign(x, 0.0);

instead.
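For a float argument, the single-precision variants from C99 <math.h> avoid the round trip through double. This is only a sketch: the wrapper names abs_copysign, abs_fabs and check_abs_variants are invented here, and whether VS 2015 turns either call into a single mask instruction has not been verified.

```c
#include <assert.h>
#include <math.h>

/* Magnitude of x combined with the sign of +0.0f, so the result is
   never negative, including for -0.0f. */
static float abs_copysign(float x) {
    return copysignf(x, 0.0f);
}

/* fabsf is the library function dedicated to the same job. */
static float abs_fabs(float x) {
    return fabsf(x);
}

/* Hypothetical self-check covering an ordinary value and negative zero. */
static void check_abs_variants(void) {
    assert(abs_copysign(-2.5f) == 2.5f);
    assert(abs_fabs(-2.5f) == 2.5f);
    assert(!signbit(abs_copysign(-0.0f)));
    assert(!signbit(abs_fabs(-0.0f)));
}
```

Both functions state the intent explicitly, which leaves the compiler free to pick a branch-free sequence without having to prove that a hand-written conditional is equivalent.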
