mirror of git://sourceware.org/git/glibc.git
As mentioned by the reporter in a pull request against gcc-mirror, the THREEp96 constant in e_expl.c is incorrect, it is actually 0x3.p+94f128 rather than 0x3.p+96f128. The algorithm uses that to compute the t2 integer (tval2), by whose delta it adjusts the x+xl pair and then in the result uses the precomputed exp value for that entry. Using 0x3.p+94f128 rather than 0x3.p+96f128 results in tval2 sometimes being one smaller, sometimes one larger than the desired value, thus can mean the x+xl pair after adjustment will be larger in absolute value than it should be. DesWursters created a test program for this https://github.com/DesWurstes/comparefloats and his results were total: 1135000000 not_equal: 4322 earlier_score: 674 later_score: 3648 I've modified this so with https://sourceware.org/bugzilla/show_bug.cgi?id=32411#c3 so that it actually tests pseudo-random _Float128 values with range (-16384.,16384) with strong bias on values larger than 0.0002 in absolute value (so that tval1/tval2 aren't zero most of the time) and that gave total: 10000000000 not_equal: 29861 earlier_score: 4606 later_score: 25255 So, in both cases, in most cases the change doesn't result in any differences, and in those rare cases where does, about 85% have smaller ulp than without the patch. Additionally I've tried https://sourceware.org/bugzilla/show_bug.cgi?id=32411#c4 and in 2 billion iterations it didn't find any case where x+xl after the adjustments without this change would be smaller in absolute value compared to x+xl after the adjustments with this change. Reviewed-by: Joseph Myers <josmyers@redhat.com> |
||
---|---|---|
.. | ||
dbl-64 | ||
float128 | ||
flt-32 | ||
ldbl-64-128 | ||
ldbl-96 | ||
ldbl-128 | ||
ldbl-128ibm | ||
ldbl-128ibm-compat | ||
ldbl-opt | ||
soft-fp | ||
Makefile | ||
ieee754.h | ||
k_standard.c | ||
k_standardf.c | ||
k_standardl.c | ||
libm-alias-finite.h | ||
s_lib_version.c | ||
s_matherr.c | ||
s_signgam.c |