On Mon, Nov 14, 2022 at 09:55:29PM +0000, Joseph Myers wrote:
> On Sun, 13 Nov 2022, Jakub Jelinek via Gcc-patches wrote:
> 
> > So, I wonder if we don't need to add a target hook where targets will be
> > able to provide upper bound on error for floating point functions for
> > different floating point modes and some way to signal unknown accuracy/can't
> > be trusted, in which case we would give up or return just the range for
> > VARYING.
> 
> Note that the figures given in the glibc manual are purely empirical 
> (largest errors observed for inputs in the glibc testsuite on a system 
> that was then used to update the libm-test-ulps files); they don't 
> constitute any kind of guarantee about either the current implementation 
> or the API, nor are they formally verified, nor do they come from 
> exhaustive testing (though worst cases from exhaustive testing for float 
> may have been added to the glibc testsuite in some cases).  (I think the 
> only functions known to give huge errors for some inputs, outside of any 
> IBM long double issues, are the Bessel functions and cpow functions.  But 
> even if other functions don't have huge errors, and some 
> architecture-specific implementations might have issues, there are 
> certainly some cases where errors can exceed the 9ulp threshold on what 
> the libm tests will accept in libm-test-ulps files, which are thus 
> considered glibc bugs.  (That's 9ulp from the correctly rounded value, 
> computed in ulp of that value.  For IBM long double it's 16ulp instead, 
> treating the format as having a fixed 106 bits of precision.  Both figures 
> are empirical ones chosen based on what bounds sufficed for most libm 
> functions some years ago; ideally, with better implementations of some 
> functions we could probably bring those numbers down.))

I know I can't get guarantees without formal proofs and even ulps from
reported errors are better than randomized testing.
But I think at least for non-glibc we want to be able to get a rough idea
of the usual error range in ulps.

This is what I came up with so far (link with
gcc -o ulp-tester{,.c} -O2 -lmpfr -lm
), it still doesn't verify that functions are always within the mathematical
range of results ([-0.0, Inf] for sqrt, [-1.0, 1.0] for sin/cos etc.), guess
that would be useful and verify the program actually does what is intended.
One can supply just one argument (number of tests, first 46 aren't really
random) or two, in the latter case the second should be upward, downward or
towardzero to use non-default rounding mode.
The idea is that we'd collect ballpark estimates for roundtonearest and
then estimates for the other 3 rounding modes, the former would be used
without -frounding-math, max over all 4 rounding modes for -frounding-math
as gcc will compute using mpfr always in round to nearest.

	Jakub