To go along with whatever magic we're gonna tack along to the
range-ops sqrt implementation, here is another revision addressing the
VARYING issue you pointed out.

A few things...

Instead of going through trees, I decided to call do_mpfr_arg1
directly.  Let's not go the wide int <-> tree rat hole in this one.

The function do_mpfr_arg1 bails on +INF, so I had to handle it manually.

There's a regression in gfortran.dg/ieee/ieee_6.f90, which I'm not
sure how to handle.  We are failing because we are calculating
sqrt(-1) and expecting certain IEEE flags set.  These flags aren't
set, presumably because we folded sqrt(-1) into a NAN directly:

    // All negatives.
    if (real_compare (LT_EXPR, &lh_ub, &dconst0))
      {
    real_nan (&lb, "", 0, TYPE_MODE (type));
    ub = lb;
    maybe_nan = true;
    return;
      }

The failing part of the test is:

  if (.not. (all(flags .eqv. [.false.,.false.,.true.,.true.,.false.]) &
             .or. all(flags .eqv. [.false.,.false.,.true.,.true.,.true.]) &
             .or. all(flags .eqv. [.false.,.false.,.true.,.false.,.false.]) &
             .or. all(flags .eqv.
[.false.,.false.,.true.,.false.,.true.]))) STOP 5

But we are generating F F F F F.  Google has informed me that that 3rd
flag is IEEE_INVALID.

So... is the optimization wrong?  Are we not allowed to substitute
that NAN if we know it's gonna happen?  Should we also allow F F F F F
in the test?  Or something else?

Thanks.
Aldy

On Wed, Nov 16, 2022 at 9:33 PM Jakub Jelinek <jakub@redhat.com> wrote:
>
> On Mon, Nov 14, 2022 at 09:55:29PM +0000, Joseph Myers wrote:
> > On Sun, 13 Nov 2022, Jakub Jelinek via Gcc-patches wrote:
> >
> > > So, I wonder if we don't need to add a target hook where targets will be
> > > able to provide upper bound on error for floating point functions for
> > > different floating point modes and some way to signal unknown accuracy/can't
> > > be trusted, in which case we would give up or return just the range for
> > > VARYING.
> >
> > Note that the figures given in the glibc manual are purely empirical
> > (largest errors observed for inputs in the glibc testsuite on a system
> > that was then used to update the libm-test-ulps files); they don't
> > constitute any kind of guarantee about either the current implementation
> > or the API, nor are they formally verified, nor do they come from
> > exhaustive testing (though worst cases from exhaustive testing for float
> > may have been added to the glibc testsuite in some cases).  (I think the
> > only functions known to give huge errors for some inputs, outside of any
> > IBM long double issues, are the Bessel functions and cpow functions.  But
> > even if other functions don't have huge errors, and some
> > architecture-specific implementations might have issues, there are
> > certainly some cases where errors can exceed the 9ulp threshold on what
> > the libm tests will accept in libm-test-ulps files, which are thus
> > considered glibc bugs.  (That's 9ulp from the correctly rounded value,
> > computed in ulp of that value.  For IBM long double it's 16ulp instead,
> > treating the format as having a fixed 106 bits of precision.  Both figures
> > are empirical ones chosen based on what bounds sufficed for most libm
> > functions some years ago; ideally, with better implementations of some
> > functions we could probably bring those numbers down.))
>
> I know I can't get guarantees without formal proofs and even ulps from
> reported errors are better than randomized testing.
> But I think at least for non-glibc we want to be able to get a rough idea
> of the usual error range in ulps.
>
> This is what I came up with so far (link with
> gcc -o ulp-tester{,.c} -O2 -lmpfr -lm
> ), it still doesn't verify that functions are always within the mathematical
> range of results ([-0.0, Inf] for sqrt, [-1.0, 1.0] for sin/cos etc.), guess
> that would be useful and verify the program actually does what is intended.
> One can supply just one argument (number of tests, first 46 aren't really
> random) or two, in the latter case the second should be upward, downward or
> towardzero to use non-default rounding mode.
> The idea is that we'd collect ballpark estimates for roundtonearest and
> then estimates for the other 3 rounding modes, the former would be used
> without -frounding-math, max over all 4 rounding modes for -frounding-math
> as gcc will compute using mpfr always in round to nearest.
>
>         Jakub