Do FP operations give exactly the same results on different x86 CPUs?
Do different x86 CPUs (with built-in FPUs and reasonably recent, say launched this millennium) produce exactly the same results for their floating-point primitives, assuming the same instruction is available on the CPUs being compared, the same input, and the same operating parameters such as rounding mode? I'm not interested in differences in timing, nor in the Pentium FDIV bug (which I rule out only because that incident is ancient history).
I guess the answer is yes for addition, subtraction, negation, and round-to-integer, since these have precise definitions, and I can hardly imagine what a divergence in implementations could be (short perhaps of a bug in the detection of overflow/underflow, but that would be a disaster in some applications, so I imagine this would have been caught and fixed long ago).
Multiplication seems more likely to have diverging implementations: determining the (say) nearest representable Double-Precision Floating-Point Number (64 bits, including 52+1 bits of mantissa) to the product of two DPFPNs sometimes requires computing the product of their mantissas to (about) 106-bit precision, which, for the few LSBits, is arguably a waste of effort. I wonder if this is even attempted, and done correctly. Or perhaps IEEE-754, or some de-facto standard, prescribes something?
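(For what it's worth, IEEE 754 does prescribe something here: the basic operations, multiplication included, must be correctly rounded, so the hardware must effectively take the full, up to 106-bit, exact product into account. A minimal sketch, assuming a C99 compiler and a libm whose `fma` is correctly rounded, uses the standard error-free transformation to expose the low part of the exact product that gets discarded by rounding:)

```c
#include <stdio.h>
#include <math.h>   /* fma(): fused multiply-add, correctly rounded per C99/IEEE 754 */

int main(void) {
    /* Two doubles whose exact product does not fit in 53 bits. */
    double a = 1.0 + 0x1p-30;      /* 1 + 2^-30 */
    double b = 1.0 + 0x1p-31;      /* 1 + 2^-31 */

    double p = a * b;              /* correctly rounded product (IEEE 754) */
    double e = fma(a, b, -p);      /* exact rounding error a*b - p, computed
                                      without error (valid absent overflow/underflow) */

    /* e is nonzero here: the hardware had to look at the low half of the
       ~106-bit exact product in order to round p correctly. */
    printf("p = %.17g\ne = %.17g\n", p, e);
    return 0;
}
```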
Division seems even more delicate.
And, short of a common design, I doubt all implementations of the much more complex things (trig functions, logs..) could be exactly in sync, given the variety of mathematical methods that can be used.
I'm asking that out of a combination of pure nosiness; willingness to improve that answer of mine; and desire for a method to (sometime) allow a program running in a VM to detect a mismatch between the CPU it appears to be running on and the real one.
At the assembly level, the basic floating-point instructions (add, subtract, multiply, divide, square root, FMA, round) always produce the same result, as prescribed by the IEEE 754 standard. There are two kinds of instructions which may produce different results on different architectures: complex x87 FPU instructions for computing transcendental operations (FSIN, FCOS, F2XM1, and the like), and the approximate SSE instructions (RCPSS/RCPPS for computing an approximate reciprocal, and RSQRTSS/RSQRTPS for computing an approximate reciprocal square root).

Transcendental x87 FPU operations are implemented in microcode, and AFAIK all Intel and AMD CPUs except the AMD K5 use the same microcode, so you can't use them for detection. They might be helpful for detecting VIA, Cyrix, Transmeta, and other old CPUs, but those are too rare to consider. The approximate SSE instructions are implemented differently on Intel and AMD, and AFAIK there is also some difference between old (pre-K8) and newer AMD CPUs. You could use that difference to detect an AMD CPU pretending to be Intel and vice versa, but that is a limited use case.
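To illustrate that last point, here is a minimal sketch (assuming an x86 target with SSE and a compiler providing `<xmmintrin.h>`) that dumps the raw bit patterns returned by RCPSS and RSQRTSS for a few probe inputs. Comparing these against values recorded on known Intel and AMD parts (reference tables not included here; they would have to be collected per vendor) would reveal which implementation is actually executing, regardless of what CPUID or the hypervisor claims:

```c
#include <stdio.h>
#include <stdint.h>
#include <string.h>
#include <xmmintrin.h>   /* SSE: _mm_rcp_ss, _mm_rsqrt_ss */

/* Return the raw bit pattern of a float for exact comparison. */
static uint32_t float_bits(float f) {
    uint32_t u;
    memcpy(&u, &f, sizeof u);
    return u;
}

int main(void) {
    /* A few arbitrary probe inputs; any fixed set works, as long as the
       same set is used when recording the per-vendor reference values. */
    const float probes[] = { 1.0f, 3.0f, 7.5f, 1234.5678f };

    for (size_t i = 0; i < sizeof probes / sizeof probes[0]; i++) {
        __m128 x = _mm_set_ss(probes[i]);
        float rcp   = _mm_cvtss_f32(_mm_rcp_ss(x));    /* RCPSS:   ~1/x       */
        float rsqrt = _mm_cvtss_f32(_mm_rsqrt_ss(x));  /* RSQRTSS: ~1/sqrt(x) */

        /* The exact bit patterns of these approximations are not specified
           by IEEE 754 and differ between Intel and AMD implementations. */
        printf("x=%-12g rcpss=0x%08x rsqrtss=0x%08x\n",
               (double)probes[i],
               (unsigned)float_bits(rcp), (unsigned)float_bits(rsqrt));
    }
    return 0;
}
```

Note that this only distinguishes the approximate-instruction hardware; the correctly-rounded basic operations above will, by design, never differ between vendors.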