public inbox for gcc-bugs@sourceware.org
help / color / mirror / Atom feed
* [Bug tree-optimization/109008] New: [13 Regression ]Maybe wrong code in scipy package since r13-3926-gd4c2f1d376da6f
@ 2023-03-03 12:08 marxin at gcc dot gnu.org
  2023-03-03 12:11 ` [Bug tree-optimization/109008] [13 Regression] Maybe " rguenth at gcc dot gnu.org
                   ` (50 more replies)
  0 siblings, 51 replies; 52+ messages in thread
From: marxin at gcc dot gnu.org @ 2023-03-03 12:08 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109008

            Bug ID: 109008
           Summary: [13 Regression ]Maybe wrong code in scipy package
                    since r13-3926-gd4c2f1d376da6f
           Product: gcc
           Version: 13.0
            Status: UNCONFIRMED
          Keywords: wrong-code
          Severity: normal
          Priority: P3
         Component: tree-optimization
          Assignee: unassigned at gcc dot gnu.org
          Reporter: marxin at gcc dot gnu.org
                CC: aldyh at gcc dot gnu.org, jakub at gcc dot gnu.org
  Target Milestone: ---

Created attachment 54578
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=54578&action=edit
test-case

Since the revision, the scipy package fails in tests due to:
https://github.com/scipy/scipy/blob/main/scipy/integrate/odepack/vode.f#L3500-L3515

$ cat /tmp/x.f
      DOUBLE PRECISION FUNCTION D1MACH (IDUM)
      INTEGER IDUM
C-----------------------------------------------------------------------
C This routine computes the unit roundoff of the machine.
C This is defined as the smallest positive machine number
C u such that  1.0 + u .ne. 1.0
C
C Subroutines/functions called by D1MACH.. None
C-----------------------------------------------------------------------
      DOUBLE PRECISION U, COMP
      U = 1.0D0
 10   U = U*0.5D0
      COMP = 1.0D0 + U
      IF (COMP .NE. 1.0D0) GO TO 10
      D1MACH = U*2.0D0
      RETURN

      END FUNCTION D1MACH

$ gcc /tmp/x.f -c -O2 -fdump-tree-optimized=/dev/stdout
;; Function d1mach (d1mach_, funcdef_no=0, decl_uid=4259, cgraph_uid=1,
symbol_order=0)

__attribute__((fn spec (". w ")))
real(kind=8) d1mach (integer(kind=4) & restrict idum)
{
  <bb 2> [local count: 20293720]:
  return 0.0;

}

Before the revision we did:


;; Function d1mach (d1mach_, funcdef_no=0, decl_uid=4259, cgraph_uid=1,
symbol_order=0)

Removing basic block 5
__attribute__((fn spec (". w ")))
real(kind=8) d1mach (integer(kind=4) & restrict idum)
{
  real(kind=8) u;
  real(kind=8) _1;
  character(kind=4) ivtmp_7;
  character(kind=4) ivtmp_8;

  <bb 2> [local count: 20293720]:

  <bb 3> [local count: 1073741824]:
  # u_2 = PHI <1.0e+0(2), u_4(3)>
  # ivtmp_7 = PHI <53(2), ivtmp_8(3)>
  u_4 = u_2 * 5.0e-1;
  ivtmp_8 = ivtmp_7 + 4294967295;
  if (ivtmp_8 != 0)
    goto <bb 3>; [98.11%]
  else
    goto <bb 4>; [1.89%]

  <bb 4> [local count: 20293720]:
  _1 = u_4 * 2.0e+0;
  return _1;

}

^ permalink raw reply	[flat|nested] 52+ messages in thread

* [Bug tree-optimization/109008] [13 Regression] Maybe wrong code in scipy package since r13-3926-gd4c2f1d376da6f
  2023-03-03 12:08 [Bug tree-optimization/109008] New: [13 Regression ]Maybe wrong code in scipy package since r13-3926-gd4c2f1d376da6f marxin at gcc dot gnu.org
@ 2023-03-03 12:11 ` rguenth at gcc dot gnu.org
  2023-03-03 12:14 ` rguenth at gcc dot gnu.org
                   ` (49 subsequent siblings)
  50 siblings, 0 replies; 52+ messages in thread
From: rguenth at gcc dot gnu.org @ 2023-03-03 12:11 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109008

Richard Biener <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Target|                            |x86_64-linux-gnu
   Target Milestone|---                         |13.0

^ permalink raw reply	[flat|nested] 52+ messages in thread

* [Bug tree-optimization/109008] [13 Regression] Maybe wrong code in scipy package since r13-3926-gd4c2f1d376da6f
  2023-03-03 12:08 [Bug tree-optimization/109008] New: [13 Regression ]Maybe wrong code in scipy package since r13-3926-gd4c2f1d376da6f marxin at gcc dot gnu.org
  2023-03-03 12:11 ` [Bug tree-optimization/109008] [13 Regression] Maybe " rguenth at gcc dot gnu.org
@ 2023-03-03 12:14 ` rguenth at gcc dot gnu.org
  2023-03-03 12:18 ` [Bug tree-optimization/109008] [13 Regression] Wrong " rguenth at gcc dot gnu.org
                   ` (48 subsequent siblings)
  50 siblings, 0 replies; 52+ messages in thread
From: rguenth at gcc dot gnu.org @ 2023-03-03 12:14 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109008

Richard Biener <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |NEW
     Ever confirmed|0                           |1
   Last reconfirmed|                            |2023-03-03
           Priority|P3                          |P1

--- Comment #1 from Richard Biener <rguenth at gcc dot gnu.org> ---
Confirmed.  EVRP folds the return to 0

^ permalink raw reply	[flat|nested] 52+ messages in thread

* [Bug tree-optimization/109008] [13 Regression] Wrong code in scipy package since r13-3926-gd4c2f1d376da6f
  2023-03-03 12:08 [Bug tree-optimization/109008] New: [13 Regression ]Maybe wrong code in scipy package since r13-3926-gd4c2f1d376da6f marxin at gcc dot gnu.org
  2023-03-03 12:11 ` [Bug tree-optimization/109008] [13 Regression] Maybe " rguenth at gcc dot gnu.org
  2023-03-03 12:14 ` rguenth at gcc dot gnu.org
@ 2023-03-03 12:18 ` rguenth at gcc dot gnu.org
  2023-03-03 12:24 ` rguenth at gcc dot gnu.org
                   ` (47 subsequent siblings)
  50 siblings, 0 replies; 52+ messages in thread
From: rguenth at gcc dot gnu.org @ 2023-03-03 12:18 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109008

--- Comment #2 from Richard Biener <rguenth at gcc dot gnu.org> ---
So it boils down to us optimizing

 d = 1. + eps;
 if (d == 1.)
   return eps;

to return 0. which is of course wrong.

double eps_or_zero (double eps)
{
  double d = 1. + eps;
  if (d == 1.)
    return eps;
  return 0.0;
}

^ permalink raw reply	[flat|nested] 52+ messages in thread

* [Bug tree-optimization/109008] [13 Regression] Wrong code in scipy package since r13-3926-gd4c2f1d376da6f
  2023-03-03 12:08 [Bug tree-optimization/109008] New: [13 Regression ]Maybe wrong code in scipy package since r13-3926-gd4c2f1d376da6f marxin at gcc dot gnu.org
                   ` (2 preceding siblings ...)
  2023-03-03 12:18 ` [Bug tree-optimization/109008] [13 Regression] Wrong " rguenth at gcc dot gnu.org
@ 2023-03-03 12:24 ` rguenth at gcc dot gnu.org
  2023-03-03 12:27 ` rguenth at gcc dot gnu.org
                   ` (46 subsequent siblings)
  50 siblings, 0 replies; 52+ messages in thread
From: rguenth at gcc dot gnu.org @ 2023-03-03 12:24 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109008

--- Comment #3 from Richard Biener <rguenth at gcc dot gnu.org> ---
So with [1., 1.] = [1., 1.] + op2 we are calculating op2 = [1., 1.] - [1., 1.]
but with FP math we cannot apply such simplification without considering
rounding or exponent ranges.  Maybe it's enough to "fuzz" the results by
half an ulp of both(?) input ranges ...

^ permalink raw reply	[flat|nested] 52+ messages in thread

* [Bug tree-optimization/109008] [13 Regression] Wrong code in scipy package since r13-3926-gd4c2f1d376da6f
  2023-03-03 12:08 [Bug tree-optimization/109008] New: [13 Regression ]Maybe wrong code in scipy package since r13-3926-gd4c2f1d376da6f marxin at gcc dot gnu.org
                   ` (3 preceding siblings ...)
  2023-03-03 12:24 ` rguenth at gcc dot gnu.org
@ 2023-03-03 12:27 ` rguenth at gcc dot gnu.org
  2023-03-03 12:31 ` jakub at gcc dot gnu.org
                   ` (45 subsequent siblings)
  50 siblings, 0 replies; 52+ messages in thread
From: rguenth at gcc dot gnu.org @ 2023-03-03 12:27 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109008

--- Comment #4 from Richard Biener <rguenth at gcc dot gnu.org> ---
The "easiest" way would be if the range endpoints would have one bit more of
mantissa and so we can represent all endpoints with one ulp off (aka
nextafter/nextbefore).  So constants would always become non-singleton (beyond
the original precision).

^ permalink raw reply	[flat|nested] 52+ messages in thread

* [Bug tree-optimization/109008] [13 Regression] Wrong code in scipy package since r13-3926-gd4c2f1d376da6f
  2023-03-03 12:08 [Bug tree-optimization/109008] New: [13 Regression ]Maybe wrong code in scipy package since r13-3926-gd4c2f1d376da6f marxin at gcc dot gnu.org
                   ` (4 preceding siblings ...)
  2023-03-03 12:27 ` rguenth at gcc dot gnu.org
@ 2023-03-03 12:31 ` jakub at gcc dot gnu.org
  2023-03-03 12:54 ` jakub at gcc dot gnu.org
                   ` (44 subsequent siblings)
  50 siblings, 0 replies; 52+ messages in thread
From: jakub at gcc dot gnu.org @ 2023-03-03 12:31 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109008

--- Comment #5 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
Well, we already do that half ulp fuzzing in frange_arithmetic.  The thing is
that for
forward [1., 1.] - [1., -1.] it is really correctly [0., 0.], the subtraction
is exact in that case.  It is just the reverse operation which perhaps need
some flag that would
be propagated to frange_arithmetic and would make it fuzz even more.

^ permalink raw reply	[flat|nested] 52+ messages in thread

* [Bug tree-optimization/109008] [13 Regression] Wrong code in scipy package since r13-3926-gd4c2f1d376da6f
  2023-03-03 12:08 [Bug tree-optimization/109008] New: [13 Regression ]Maybe wrong code in scipy package since r13-3926-gd4c2f1d376da6f marxin at gcc dot gnu.org
                   ` (5 preceding siblings ...)
  2023-03-03 12:31 ` jakub at gcc dot gnu.org
@ 2023-03-03 12:54 ` jakub at gcc dot gnu.org
  2023-03-03 13:03 ` rguenth at gcc dot gnu.org
                   ` (43 subsequent siblings)
  50 siblings, 0 replies; 52+ messages in thread
From: jakub at gcc dot gnu.org @ 2023-03-03 12:54 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109008

--- Comment #6 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
Note, the ulps frange_arithmetic are ulps of the result, result is in this case
0.0,
so 1ulp is the smallest subnormal number.
That is something completely different from what we need here though for the
reverse
operations.
[1.0, 1.0] + x == [1.0, 1.0] is true with round to nearest for x in
[-0x1.0p-54, 0x1.0p-53], so that is ulp of 1.0, not the result.
Now, is this problem just around the 0 result or other values too?
Say [1.0, 1.0] + x == [2.0, 2.0]?

^ permalink raw reply	[flat|nested] 52+ messages in thread

* [Bug tree-optimization/109008] [13 Regression] Wrong code in scipy package since r13-3926-gd4c2f1d376da6f
  2023-03-03 12:08 [Bug tree-optimization/109008] New: [13 Regression ]Maybe wrong code in scipy package since r13-3926-gd4c2f1d376da6f marxin at gcc dot gnu.org
                   ` (6 preceding siblings ...)
  2023-03-03 12:54 ` jakub at gcc dot gnu.org
@ 2023-03-03 13:03 ` rguenth at gcc dot gnu.org
  2023-03-03 13:10 ` rguenth at gcc dot gnu.org
                   ` (42 subsequent siblings)
  50 siblings, 0 replies; 52+ messages in thread
From: rguenth at gcc dot gnu.org @ 2023-03-03 13:03 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109008

--- Comment #7 from Richard Biener <rguenth at gcc dot gnu.org> ---
(In reply to Jakub Jelinek from comment #6)
> Note, the ulps frange_arithmetic are ulps of the result, result is in this
> case 0.0,
> so 1ulp is the smallest subnormal number.
> That is something completely different from what we need here though for the
> reverse
> operations.
> [1.0, 1.0] + x == [1.0, 1.0] is true with round to nearest for x in
> [-0x1.0p-54, 0x1.0p-53], so that is ulp of 1.0, not the result.
> Now, is this problem just around the 0 result or other values too?
> Say [1.0, 1.0] + x == [2.0, 2.0]?

Consider

 [1.0, 1.0] + x == [1.00001, 1.00001]

I think that we need to adjust x here as well, it's not
just [0.00001, 0.00001] with an ulp of that result but it's a wider
range adjusted by the "biggest" ulp of the input ranges?

^ permalink raw reply	[flat|nested] 52+ messages in thread

* [Bug tree-optimization/109008] [13 Regression] Wrong code in scipy package since r13-3926-gd4c2f1d376da6f
  2023-03-03 12:08 [Bug tree-optimization/109008] New: [13 Regression ]Maybe wrong code in scipy package since r13-3926-gd4c2f1d376da6f marxin at gcc dot gnu.org
                   ` (7 preceding siblings ...)
  2023-03-03 13:03 ` rguenth at gcc dot gnu.org
@ 2023-03-03 13:10 ` rguenth at gcc dot gnu.org
  2023-03-03 13:14 ` rguenth at gcc dot gnu.org
                   ` (41 subsequent siblings)
  50 siblings, 0 replies; 52+ messages in thread
From: rguenth at gcc dot gnu.org @ 2023-03-03 13:10 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109008

--- Comment #8 from Richard Biener <rguenth at gcc dot gnu.org> ---
We basically have to consider an input range [a, b] as [a - x, b + y]
with the largest positive x and y so that correctly rounding the value
yields a and b again.

^ permalink raw reply	[flat|nested] 52+ messages in thread

* [Bug tree-optimization/109008] [13 Regression] Wrong code in scipy package since r13-3926-gd4c2f1d376da6f
  2023-03-03 12:08 [Bug tree-optimization/109008] New: [13 Regression ]Maybe wrong code in scipy package since r13-3926-gd4c2f1d376da6f marxin at gcc dot gnu.org
                   ` (8 preceding siblings ...)
  2023-03-03 13:10 ` rguenth at gcc dot gnu.org
@ 2023-03-03 13:14 ` rguenth at gcc dot gnu.org
  2023-03-03 13:34 ` jakub at gcc dot gnu.org
                   ` (40 subsequent siblings)
  50 siblings, 0 replies; 52+ messages in thread
From: rguenth at gcc dot gnu.org @ 2023-03-03 13:14 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109008

--- Comment #9 from Richard Biener <rguenth at gcc dot gnu.org> ---
(In reply to Richard Biener from comment #8)
> We basically have to consider an input range [a, b] as [a - x, b + y]
> with the largest positive x and y so that correctly rounding the value
> yields a and b again.

So when we have

 [a, b] = x + [c, d];

we have to compute

 x = [a + a', b + b'] - [c + c', d + d'];

and because we deal with FP actually do

 x = [a, b] - [c, d];
 x' = [a', b'] - [c', d'];

and yield x U x' as result.  For other operations it might get more
complicated.

^ permalink raw reply	[flat|nested] 52+ messages in thread

* [Bug tree-optimization/109008] [13 Regression] Wrong code in scipy package since r13-3926-gd4c2f1d376da6f
  2023-03-03 12:08 [Bug tree-optimization/109008] New: [13 Regression ]Maybe wrong code in scipy package since r13-3926-gd4c2f1d376da6f marxin at gcc dot gnu.org
                   ` (9 preceding siblings ...)
  2023-03-03 13:14 ` rguenth at gcc dot gnu.org
@ 2023-03-03 13:34 ` jakub at gcc dot gnu.org
  2023-03-03 13:50 ` rguenth at gcc dot gnu.org
                   ` (39 subsequent siblings)
  50 siblings, 0 replies; 52+ messages in thread
From: jakub at gcc dot gnu.org @ 2023-03-03 13:34 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109008

--- Comment #10 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
So, can't we say compute what we compute right now for the reverse operation
and then call some helper function which will try to extended that range a
little bit in both directions by performing frange_arithmetic (or variant
thereof) on the original operation and checking if the other op's range OP the
slightly extended range still yields the result range?  Of course it would be
better if we knew how to exactly compute that rather than try to iteratively
guess, but even iterative guess could say punt after a few iterations on
smallest range extension which would already result in a different range. 
Though, I bet reverse multiplication/division are even harder because there we
perform the reverse operation of all the boundaries against each other and
union.

^ permalink raw reply	[flat|nested] 52+ messages in thread

* [Bug tree-optimization/109008] [13 Regression] Wrong code in scipy package since r13-3926-gd4c2f1d376da6f
  2023-03-03 12:08 [Bug tree-optimization/109008] New: [13 Regression ]Maybe wrong code in scipy package since r13-3926-gd4c2f1d376da6f marxin at gcc dot gnu.org
                   ` (10 preceding siblings ...)
  2023-03-03 13:34 ` jakub at gcc dot gnu.org
@ 2023-03-03 13:50 ` rguenth at gcc dot gnu.org
  2023-03-03 14:28 ` jakub at gcc dot gnu.org
                   ` (38 subsequent siblings)
  50 siblings, 0 replies; 52+ messages in thread
From: rguenth at gcc dot gnu.org @ 2023-03-03 13:50 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109008

--- Comment #11 from Richard Biener <rguenth at gcc dot gnu.org> ---
I think for the reverse op I'd naiively try to compute the result as we do now
and then extend the range by one ulp of the input range with the largest
magnitude.

Does real_nextafter (0.0) result in a denormal?  For multiplication we have
to consider underflow, say, if

 0.0 = x * 1e-63;

depending on whether we represent denormals in ranges x is the range
where the operation underflows.

As said we probably have to think and document what we do for individual
operations.

We can also conservatively widen the input ranges by one
ulp, that's going to be always correct and maybe the best thing to do
for GCC 13?  It should be only necessary for the reverse operations.

^ permalink raw reply	[flat|nested] 52+ messages in thread

* [Bug tree-optimization/109008] [13 Regression] Wrong code in scipy package since r13-3926-gd4c2f1d376da6f
  2023-03-03 12:08 [Bug tree-optimization/109008] New: [13 Regression ]Maybe wrong code in scipy package since r13-3926-gd4c2f1d376da6f marxin at gcc dot gnu.org
                   ` (11 preceding siblings ...)
  2023-03-03 13:50 ` rguenth at gcc dot gnu.org
@ 2023-03-03 14:28 ` jakub at gcc dot gnu.org
  2023-03-07 12:06 ` rguenth at gcc dot gnu.org
                   ` (37 subsequent siblings)
  50 siblings, 0 replies; 52+ messages in thread
From: jakub at gcc dot gnu.org @ 2023-03-03 14:28 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109008

--- Comment #12 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
(In reply to Richard Biener from comment #11)

I think before we code something on the compiler side, it might be better
to play just with C testcases.

Quite naive

#include <math.h>

__attribute__((noipa)) void
extend_range (double result, int (*test) (double), double *result_min, double
*result_max)
{
  *result_min = result;
  *result_max = result;
  for (double eps = __DBL_DENORM_MIN__; __builtin_isfinite (eps); eps = eps *
2.0)
    if (!test (result - eps))
      {
        *result_min = result - eps;
        break;
      }
  for (double eps = __DBL_DENORM_MIN__; __builtin_isfinite (eps); eps = eps *
2.0)
    if (!test (result + eps))
      {
        *result_max = result + eps;
        break;
      }
}

__attribute__((noipa)) double
fn1 (double eps)
{
  double d = 1. + eps;
  if (d == 1.)
    return eps;
  return 0.0;
}

int
test1 (double eps)
{
  return fn1 (eps) == eps;
}

__attribute__((noipa)) double
fn2 (double eps)
{
  double d = 1. + eps;
  if (d == 0x1.0000001p0)
    return eps;
  return 0.0;
}

int
test2 (double eps)
{
  return fn2 (eps) == eps;
}

__attribute__((noipa)) double
fn3 (double eps)
{
  double d = 1. + eps;
  if (d == 2.)
    return eps;
  return 0.0;
}

int
test3 (double eps)
{
  return fn3 (eps) == eps;
}

int
main ()
{
  double min, max;
  extend_range (1. - 1., test1, &min, &max);
  __builtin_printf ("%.32a [%.32a, %.32a]\n", 1.0 - 1., min, max);
  extend_range (0x1.0000001p0 - 1., test2, &min, &max);
  __builtin_printf ("%.32a [%.32a, %.32a]\n", 0x1.0000001p0 - 1., min, max);
  extend_range (2. - 1., test3, &min, &max);
  __builtin_printf ("%.32a [%.32a, %.32a]\n", 2. - 1., min, max);
  return 0;
}

prints
0x0.00000000000000000000000000000000p+0
[-0x1.00000000000000000000000000000000p-53,
0x1.00000000000000000000000000000000p-52]
0x1.00000000000000000000000000000000p-28
[0x1.fffffe00000000000000000000000000p-29,
0x1.00000100000000000000000000000000p-28]
0x1.00000000000000000000000000000000p+0
[0x1.ffffffffffffe0000000000000000000p-1,
0x1.00000000000020000000000000000000p+0]
so yes, clearly even the x + 1. == 2. reverse is affected, but while maximum of
other_op and lhs range's ulp might be a good starting point, it would be good
to iteratively adjust it.  The above just searches for a power of two epsilon
in both directions (first one that changes the range, so conservatively
larger), wonder if for the search instead of iterations we couldn't do a binary
search between the largest and smallest exponents,
and/or whether after finding a pair of power of two epsilons where one extends
the lhs range and the other one still doesn't do up to mantissa precision
further iterations to find the exact epsilon value; though perhaps stop after
some constant number of them and consider that good enough (similarly, if the
min - eps (or max + eps) for two different values yield the same value).
All this still sounds quite brute force as opposed to trying to be clever and
do the math right.

^ permalink raw reply	[flat|nested] 52+ messages in thread

* [Bug tree-optimization/109008] [13 Regression] Wrong code in scipy package since r13-3926-gd4c2f1d376da6f
  2023-03-03 12:08 [Bug tree-optimization/109008] New: [13 Regression ]Maybe wrong code in scipy package since r13-3926-gd4c2f1d376da6f marxin at gcc dot gnu.org
                   ` (12 preceding siblings ...)
  2023-03-03 14:28 ` jakub at gcc dot gnu.org
@ 2023-03-07 12:06 ` rguenth at gcc dot gnu.org
  2023-03-07 12:12 ` jakub at gcc dot gnu.org
                   ` (36 subsequent siblings)
  50 siblings, 0 replies; 52+ messages in thread
From: rguenth at gcc dot gnu.org @ 2023-03-07 12:06 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109008

--- Comment #13 from Richard Biener <rguenth at gcc dot gnu.org> ---
The question is what we want to do for GCC 13 - I suppose iterating would work
but it'll be slow (what's the range to binary search here?).  Doing the math
"right" probably differs for each reverse operation (but how many do we have?
I see plus, minus, mult and div only).

I'd say approximate but conservatively correct math would be prefered here.
Most conservative is probably to simply extend the input ranges by 1 ulp,
that should work for all operators but inevitably leads to less precise
answers.  As said, we could also resort to mpfr and widen the ranges by
0.5 ulp only, the question is whether that's less costly than iterating.
We could always at least verify if the forward operation with the result
yields something that contains the old LHS range though (maybe that's a
good idea anyway when checking is enabled).

But the real challenge of course is to somehow have test coverage for all this.

With something like

diff --git a/gcc/range-op-float.cc b/gcc/range-op-float.cc
index ff42b95de4f..e142b83c047 100644
--- a/gcc/range-op-float.cc
+++ b/gcc/range-op-float.cc
@@ -2214,7 +2214,29 @@ public:
     range_op_handler minus (MINUS_EXPR, type);
     if (!minus)
       return false;
-    return float_binary_op_range_finish (minus.fold_range (r, type, lhs, op2),
+    /* ???  some first-class "widening" CTOR might be nicer, maybe
+       add some static function?  */
+    frange wlhs (lhs);
+    if (!lhs.known_isnan ())
+      {
+       REAL_VALUE_TYPE lhsl = lhs.lower_bound ();
+       frange_nextafter (TYPE_MODE (type), lhsl, dconstninf);
+       REAL_VALUE_TYPE lhsu = lhs.upper_bound ();
+       frange_nextafter (TYPE_MODE (type), lhsu, dconstinf);
+       wlhs.set (type, lhsl, lhsu);
+       // no way to copy NaN state?
+      }
+    frange wop2 (op2);
+    if (!op2.known_isnan ())
+      {
+       REAL_VALUE_TYPE op2l = op2.lower_bound ();
+       frange_nextafter (TYPE_MODE (type), op2l, dconstninf);
+       REAL_VALUE_TYPE op2u = op2.upper_bound ();
+       frange_nextafter (TYPE_MODE (type), op2u, dconstinf);
+       wop2.set (type, op2l, op2u);
+       // no way to copy NaN state?
+      }
+    return float_binary_op_range_finish (minus.fold_range (r, type, wlhs,
wop2),
                                         r, type, lhs);
   }
   virtual bool op2_range (frange &r, tree type,

the comment#2 testcase shows in EVRP

Imports: eps_2(D)
Exports: eps_2(D)  d_3
         d_3 : eps_2(D)(I)
eps_2(D)        [frange] double VARYING +-NAN
    <bb 2> :
    d_3 = eps_2(D) + 1.0e+0;
    if (d_3 == 1.0e+0)
      goto <bb 3>; [INV]
    else
      goto <bb 4>; [INV]

2->3  (T) eps_2(D) :    [frange] double
[-3.3306690738754696212708950042724609375e-16 (-0x0.cp-51),
3.3306690738754696212708950042724609375e-16 (0x0.cp-51)]
2->3  (T) d_3 :         [frange] double [1.0e+0 (0x0.8p+1), 1.0e+0 (0x0.8p+1)]

that's at least no longer incorrect.

As written in the comment above mangling a range like this is currently
a bit awkward, but maybe a static method in frange like
frange frange::1ulp_wider (const frange &) would be possible.

If we ever get range ops for things like exp() things will get more
interesting, the input range widening by n * ulp only works for linear
ops I think.

^ permalink raw reply	[flat|nested] 52+ messages in thread

* [Bug tree-optimization/109008] [13 Regression] Wrong code in scipy package since r13-3926-gd4c2f1d376da6f
  2023-03-03 12:08 [Bug tree-optimization/109008] New: [13 Regression ]Maybe wrong code in scipy package since r13-3926-gd4c2f1d376da6f marxin at gcc dot gnu.org
                   ` (13 preceding siblings ...)
  2023-03-07 12:06 ` rguenth at gcc dot gnu.org
@ 2023-03-07 12:12 ` jakub at gcc dot gnu.org
  2023-03-07 12:20 ` jakub at gcc dot gnu.org
                   ` (35 subsequent siblings)
  50 siblings, 0 replies; 52+ messages in thread
From: jakub at gcc dot gnu.org @ 2023-03-07 12:12 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109008

--- Comment #14 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
Created attachment 54599
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=54599&action=edit
gcc13-pr109008-wip.patch

I have now this WIP but it doesn't work correctly yet, need to debug it now,
just finished writing it and making it compile.
Note, after the binary search it is still unfinished, I'd like to then adjust
it one bit at a time to narrow it further.  But before working on that it needs
to work correctly.

^ permalink raw reply	[flat|nested] 52+ messages in thread

* [Bug tree-optimization/109008] [13 Regression] Wrong code in scipy package since r13-3926-gd4c2f1d376da6f
  2023-03-03 12:08 [Bug tree-optimization/109008] New: [13 Regression ]Maybe wrong code in scipy package since r13-3926-gd4c2f1d376da6f marxin at gcc dot gnu.org
                   ` (14 preceding siblings ...)
  2023-03-07 12:12 ` jakub at gcc dot gnu.org
@ 2023-03-07 12:20 ` jakub at gcc dot gnu.org
  2023-03-07 12:23 ` rguenth at gcc dot gnu.org
                   ` (34 subsequent siblings)
  50 siblings, 0 replies; 52+ messages in thread
From: jakub at gcc dot gnu.org @ 2023-03-07 12:20 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109008

--- Comment #15 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
I believe worst case the above would do (for IEEE quad) something like 16
checks because of 16-bit exponent, roughly one range_arithmetic, one (for
mult/div 2?) frange_arithmetic and real_equal + real_isfinite each, usually
much less.  Plus if we narrow it down further up to mantissa precision bits
more.  But at least for the latter we could have some hard limit, don't go
further than say 30 bits at most or so.
Once/if it makes correctly, we can just grab statistics across some math heavy
project (suggestions welcome).

^ permalink raw reply	[flat|nested] 52+ messages in thread

* [Bug tree-optimization/109008] [13 Regression] Wrong code in scipy package since r13-3926-gd4c2f1d376da6f
  2023-03-03 12:08 [Bug tree-optimization/109008] New: [13 Regression ]Maybe wrong code in scipy package since r13-3926-gd4c2f1d376da6f marxin at gcc dot gnu.org
                   ` (15 preceding siblings ...)
  2023-03-07 12:20 ` jakub at gcc dot gnu.org
@ 2023-03-07 12:23 ` rguenth at gcc dot gnu.org
  2023-03-07 12:25 ` rguenth at gcc dot gnu.org
                   ` (33 subsequent siblings)
  50 siblings, 0 replies; 52+ messages in thread
From: rguenth at gcc dot gnu.org @ 2023-03-07 12:23 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109008

--- Comment #16 from Richard Biener <rguenth at gcc dot gnu.org> ---
(In reply to Jakub Jelinek from comment #14)
> Created attachment 54599 [details]
> gcc13-pr109008-wip.patch
> 
> I have now this WIP but it doesn't work correctly yet, need to debug it now,
> just finished writing it and making it compile.
> Note, after the binary search it is still unfinished, I'd like to then
> adjust it one bit at a time to narrow it further.  But before working on
> that it needs to work correctly.

Looks nicer and more precise (even though the iterating function is ugly
and you also suffer from the lack of copying of alternate range state,
but using .union () is a way out here I guess ;)).

Can we ignore -frounding-math here?  The way you structured the tests
rely on doing the "correct" rounding?  Or does fold_range already
compensate for -frounding-math?

^ permalink raw reply	[flat|nested] 52+ messages in thread

* [Bug tree-optimization/109008] [13 Regression] Wrong code in scipy package since r13-3926-gd4c2f1d376da6f
  2023-03-03 12:08 [Bug tree-optimization/109008] New: [13 Regression ]Maybe wrong code in scipy package since r13-3926-gd4c2f1d376da6f marxin at gcc dot gnu.org
                   ` (16 preceding siblings ...)
  2023-03-07 12:23 ` rguenth at gcc dot gnu.org
@ 2023-03-07 12:25 ` rguenth at gcc dot gnu.org
  2023-03-07 13:16 ` jakub at gcc dot gnu.org
                   ` (32 subsequent siblings)
  50 siblings, 0 replies; 52+ messages in thread
From: rguenth at gcc dot gnu.org @ 2023-03-07 12:25 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109008

--- Comment #17 from Richard Biener <rguenth at gcc dot gnu.org> ---
(In reply to Jakub Jelinek from comment #15)
> I believe worst case the above would do (for IEEE quad) something like 16
> checks because of 16-bit exponent, roughly one range_arithmetic, one (for
> mult/div 2?) frange_arithmetic and real_equal + real_isfinite each, usually
> much less.  Plus if we narrow it down further up to mantissa precision bits
> more.  But at least for the latter we could have some hard limit, don't go
> further than say 30 bits at most or so.
> Once/if it makes correctly, we can just grab statistics across some math
> heavy project (suggestions welcome).

I think FP _compares_ are not so common, but nevertheless I'd suggest
to simply use SPEC CPU 2017 FP and/or polyhedron

^ permalink raw reply	[flat|nested] 52+ messages in thread

* [Bug tree-optimization/109008] [13 Regression] Wrong code in scipy package since r13-3926-gd4c2f1d376da6f
  2023-03-03 12:08 [Bug tree-optimization/109008] New: [13 Regression ]Maybe wrong code in scipy package since r13-3926-gd4c2f1d376da6f marxin at gcc dot gnu.org
                   ` (17 preceding siblings ...)
  2023-03-07 12:25 ` rguenth at gcc dot gnu.org
@ 2023-03-07 13:16 ` jakub at gcc dot gnu.org
  2023-03-07 13:18 ` rguenth at gcc dot gnu.org
                   ` (31 subsequent siblings)
  50 siblings, 0 replies; 52+ messages in thread
From: jakub at gcc dot gnu.org @ 2023-03-07 13:16 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109008

--- Comment #18 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
(In reply to Richard Biener from comment #16)
> (In reply to Jakub Jelinek from comment #14)
> > Created attachment 54599 [details]
> > gcc13-pr109008-wip.patch
> > 
> > I have now this WIP but it doesn't work correctly yet, need to debug it now,
> > just finished writing it and making it compile.
> > Note, after the binary search it is still unfinished, I'd like to then
> > adjust it one bit at a time to narrow it further.  But before working on
> > that it needs to work correctly.
> 
> Looks nicer and more precise (even though the iterating function is ugly
> and you also suffer from the lack of copying of alternate range state,
> but using .union () is a way out here I guess ;)).

I thought the union_ handles it.  I didn't want to simply override the lower or
upper bound on r, because frange::set has various canonicalizations and the
like.

> Can we ignore -frounding-math here?

I believe it isn't ignored, -frounding-math ought to be taken into account in
frange_arithmetic.  The intent was that the testing function/lambda would e.g.
for the
addition when extending lower bound perform rounding etc. towards +inf and when
extending upper bound towards -inf.
But as I said, the patch currently doesn't work correctly, the test on the
nextafter
value at the start of the new function fails even when I'd think it should
pass.

^ permalink raw reply	[flat|nested] 52+ messages in thread

* [Bug tree-optimization/109008] [13 Regression] Wrong code in scipy package since r13-3926-gd4c2f1d376da6f
  2023-03-03 12:08 [Bug tree-optimization/109008] New: [13 Regression ]Maybe wrong code in scipy package since r13-3926-gd4c2f1d376da6f marxin at gcc dot gnu.org
                   ` (18 preceding siblings ...)
  2023-03-07 13:16 ` jakub at gcc dot gnu.org
@ 2023-03-07 13:18 ` rguenth at gcc dot gnu.org
  2023-03-07 13:29 ` rguenth at gcc dot gnu.org
                   ` (30 subsequent siblings)
  50 siblings, 0 replies; 52+ messages in thread
From: rguenth at gcc dot gnu.org @ 2023-03-07 13:18 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109008

--- Comment #19 from Richard Biener <rguenth at gcc dot gnu.org> ---
I still think we should avoid iteration.  Looking at plus we have

 x = y + a

which is actually

 x = y + a +- 0.5ulp (y + a)

(0.5ulp of the y + a result), so we can compute a as

 a = x - y +- 0.5ulp (y + a)

note 0.5ulp (y + a) isn't 0.5ulp (x), not exactly at least, so we have to
approximate it and a conservative bound is 1ulp (x) here.  We may also have
to make sure to round toward -Inf for the lower bound of the result range
and +Inf of the upper bound of the result range I think.


For multiplication it's

 x = y * a +- 0.5ulp (y *  a)

 a = (x +- 0.5ulp (y * a)) / y

so it's again similar.

So what we can indeed do is widen the LHS range by 0.5 ulp and since we
cannot represent that and it might be imprecise if 1 ulp _after_ the
rounding is smaller than 1 ulp _before_ the rounding operation we simply
use 1 ulp widening of the LHS?

^ permalink raw reply	[flat|nested] 52+ messages in thread

* [Bug tree-optimization/109008] [13 Regression] Wrong code in scipy package since r13-3926-gd4c2f1d376da6f
  2023-03-03 12:08 [Bug tree-optimization/109008] New: [13 Regression ]Maybe wrong code in scipy package since r13-3926-gd4c2f1d376da6f marxin at gcc dot gnu.org
                   ` (19 preceding siblings ...)
  2023-03-07 13:18 ` rguenth at gcc dot gnu.org
@ 2023-03-07 13:29 ` rguenth at gcc dot gnu.org
  2023-03-07 13:58 ` rguenth at gcc dot gnu.org
                   ` (29 subsequent siblings)
  50 siblings, 0 replies; 52+ messages in thread
From: rguenth at gcc dot gnu.org @ 2023-03-07 13:29 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109008

--- Comment #20 from Richard Biener <rguenth at gcc dot gnu.org> ---
(In reply to Richard Biener from comment #19)
> I still think we should avoid iteration.  Looking at plus we have
> 
>  x = y + a
> 
> which is actually
> 
>  x = y + a +- 0.5ulp (y + a)
> 
> (0.5ulp of the y + a result), so we can compute a as
> 
>  a = x - y +- 0.5ulp (y + a)
> 
> note 0.5ulp (y + a) isn't 0.5ulp (x), not exactly at least, so we have to
> approximate it and a conservative bound is 1ulp (x) here.  We may also have
> to make sure to round toward -Inf for the lower bound of the result range
> and +Inf of the upper bound of the result range I think.
> 
> 
> For multiplication it's
> 
>  x = y * a +- 0.5ulp (y *  a)
> 
>  a = (x +- 0.5ulp (y * a)) / y
> 
> so it's again similar.
> 
> So what we can indeed do is widen the LHS range by 0.5 ulp and since we
> cannot represent that and it might be imprecise if 1 ulp _after_ the
> rounding is smaller than 1 ulp _before_ the rounding operation we simply
> use 1 ulp widening of the LHS?

So in principle REAL_VALUE_TYPE would support adjusting by half an ulp,
we just cannot use real_nextafter for that but we'd have to do this
manually somehow and the real_convert after could in principle also do
the -+INF rounding (but that's not implemented yet).

^ permalink raw reply	[flat|nested] 52+ messages in thread

* [Bug tree-optimization/109008] [13 Regression] Wrong code in scipy package since r13-3926-gd4c2f1d376da6f
  2023-03-03 12:08 [Bug tree-optimization/109008] New: [13 Regression ]Maybe wrong code in scipy package since r13-3926-gd4c2f1d376da6f marxin at gcc dot gnu.org
                   ` (20 preceding siblings ...)
  2023-03-07 13:29 ` rguenth at gcc dot gnu.org
@ 2023-03-07 13:58 ` rguenth at gcc dot gnu.org
  2023-03-07 14:10 ` rguenth at gcc dot gnu.org
                   ` (28 subsequent siblings)
  50 siblings, 0 replies; 52+ messages in thread
From: rguenth at gcc dot gnu.org @ 2023-03-07 13:58 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109008

--- Comment #21 from Richard Biener <rguenth at gcc dot gnu.org> ---
So without messing with real.cc to try exposing 0.5ulp adjustments for GCC 13
I'd simply do something like the following:

diff --git a/gcc/range-op-float.cc b/gcc/range-op-float.cc
index ff42b95de4f..1ae68503664 100644
--- a/gcc/range-op-float.cc
+++ b/gcc/range-op-float.cc
@@ -2214,7 +2214,26 @@ public:
     range_op_handler minus (MINUS_EXPR, type);
     if (!minus)
       return false;
-    return float_binary_op_range_finish (minus.fold_range (r, type, lhs, op2),
+    /* The forward operation is
+        lhs = r + op2
+       where the result is +-0.5ulp of lhs before rounding.  For the
+       reverse operation we need the lhs range before rounding, so
+       conservatively use nextafter on it.
+       ???  REAL_VALUE_TYPE could handle this more precisely if we
+       make sure to not round the inputs for the format before the
+       real_operation is carried out and when we can properly round
+       towards +-Inf for the lower/upper bounds.  */
+    frange wlhs (lhs);
+    if (!lhs.known_isnan ())
+      {
+       REAL_VALUE_TYPE lhsl = lhs.lower_bound ();
+       frange_nextafter (TYPE_MODE (type), lhsl, dconstninf);
+       REAL_VALUE_TYPE lhsu = lhs.upper_bound ();
+       frange_nextafter (TYPE_MODE (type), lhsu, dconstinf);
+       wlhs.set (type, lhsl, lhsu);
+       wlhs.union_ (lhs);  /* Copy NaN state.  */
+      }
+    return float_binary_op_range_finish (minus.fold_range (r, type, wlhs,
op2),
                                         r, type, lhs);
   }
   virtual bool op2_range (frange &r, tree type,

^ permalink raw reply	[flat|nested] 52+ messages in thread

* [Bug tree-optimization/109008] [13 Regression] Wrong code in scipy package since r13-3926-gd4c2f1d376da6f
  2023-03-03 12:08 [Bug tree-optimization/109008] New: [13 Regression ]Maybe wrong code in scipy package since r13-3926-gd4c2f1d376da6f marxin at gcc dot gnu.org
                   ` (21 preceding siblings ...)
  2023-03-07 13:58 ` rguenth at gcc dot gnu.org
@ 2023-03-07 14:10 ` rguenth at gcc dot gnu.org
  2023-03-07 18:52 ` jakub at gcc dot gnu.org
                   ` (27 subsequent siblings)
  50 siblings, 0 replies; 52+ messages in thread
From: rguenth at gcc dot gnu.org @ 2023-03-07 14:10 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109008

--- Comment #22 from Richard Biener <rguenth at gcc dot gnu.org> ---
New real.h API could be

extern void real_unround (REAL_VALUE_TYPE *, format_helper, int dir);

where for dir < 0 it produces the least significant bits so that rounding
produces the original while for dir > 0 it produces the most
significant bits so that rounding produces the original.

^ permalink raw reply	[flat|nested] 52+ messages in thread

* [Bug tree-optimization/109008] [13 Regression] Wrong code in scipy package since r13-3926-gd4c2f1d376da6f
  2023-03-03 12:08 [Bug tree-optimization/109008] New: [13 Regression ]Maybe wrong code in scipy package since r13-3926-gd4c2f1d376da6f marxin at gcc dot gnu.org
                   ` (22 preceding siblings ...)
  2023-03-07 14:10 ` rguenth at gcc dot gnu.org
@ 2023-03-07 18:52 ` jakub at gcc dot gnu.org
  2023-03-07 19:39 ` jakub at gcc dot gnu.org
                   ` (26 subsequent siblings)
  50 siblings, 0 replies; 52+ messages in thread
From: jakub at gcc dot gnu.org @ 2023-03-07 18:52 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109008

Jakub Jelinek <jakub at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
  Attachment #54599|0                           |1
        is obsolete|                            |

--- Comment #23 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
Created attachment 54601
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=54601&action=edit
gcc13-pr109008-wip.patch

Here is an updated version of the WIP, I think this one now works (but is only
hooked up in the foperator_plus::op{1,2}_range and not for -, * and /.
Also, it returns the first smaller value (for low bound) or first larger value
(for upper bound) for which the test already isn't met, while I think we should
set it to nextafter of that towards inf for the former and nextafter of that
towards -inf for the latter.
I can try to get statistics from it and also whether the loop over significand
bits is needed and helps something.  I've added another thing to the heuristics
to narrow down the range for the binary search on exponents, in particular the
maximum exponent from the lhs and op2 bounds (all 4 of them) next to the l + p
and l + 8 * p tests.  Bet detailed statistics can help to point what helps and
what doesn't.

If we get some smart math (or even the patch you've provided), I guess we can
try to compare the two implementations for the precision of the ranges as well
as compile time cost.

^ permalink raw reply	[flat|nested] 52+ messages in thread

* [Bug tree-optimization/109008] [13 Regression] Wrong code in scipy package since r13-3926-gd4c2f1d376da6f
  2023-03-03 12:08 [Bug tree-optimization/109008] New: [13 Regression ]Maybe wrong code in scipy package since r13-3926-gd4c2f1d376da6f marxin at gcc dot gnu.org
                   ` (23 preceding siblings ...)
  2023-03-07 18:52 ` jakub at gcc dot gnu.org
@ 2023-03-07 19:39 ` jakub at gcc dot gnu.org
  2023-03-07 20:49 ` jakub at gcc dot gnu.org
                   ` (25 subsequent siblings)
  50 siblings, 0 replies; 52+ messages in thread
From: jakub at gcc dot gnu.org @ 2023-03-07 19:39 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109008

Jakub Jelinek <jakub at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
  Attachment #54601|0                           |1
        is obsolete|                            |

--- Comment #24 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
Created attachment 54602
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=54602&action=edit
gcc13-pr109008-wip.patch

And this version has both variants of foperator_minus hooked up as well.

Testcase could be e.g.:
double f1 (double eps) { double d = 1. + eps; if (d == 1.) return eps; return
0.0; }
double f2 (double eps) { double d = 1. + eps; if (d == 0x1.00001p0) return eps;
return 0.0; }
double f3 (double eps) { double d = 1. + eps; if (d == 2.) return eps; return
0.0; }
double f4 (double eps) { double d = 42. - eps; if (d == 42.) return eps; return
0.0; }
double f5 (double eps) { double d = 42. - eps; if (d == 0x2a.0001p0) return
eps; return 0.0; }
double f6 (double eps) { double d = 42. - eps; if (d == 43.) return eps; return
0.0; }
double f7 (double eps) { double d = eps - 42.; if (d == 42.) return eps; return
0.0; }
double f8 (double eps) { double d = eps - 42.; if (d == 0x2a.0001p0) return
eps; return 0.0; }
double f9 (double eps) { double d = eps - 42.; if (d == 43.) return eps; return
0.0; }

^ permalink raw reply	[flat|nested] 52+ messages in thread

* [Bug tree-optimization/109008] [13 Regression] Wrong code in scipy package since r13-3926-gd4c2f1d376da6f
  2023-03-03 12:08 [Bug tree-optimization/109008] New: [13 Regression ]Maybe wrong code in scipy package since r13-3926-gd4c2f1d376da6f marxin at gcc dot gnu.org
                   ` (24 preceding siblings ...)
  2023-03-07 19:39 ` jakub at gcc dot gnu.org
@ 2023-03-07 20:49 ` jakub at gcc dot gnu.org
  2023-03-08  9:26 ` aldyh at gcc dot gnu.org
                   ` (24 subsequent siblings)
  50 siblings, 0 replies; 52+ messages in thread
From: jakub at gcc dot gnu.org @ 2023-03-07 20:49 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109008

--- Comment #25 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
I guess more debugging tomorrow.  Because with the frange_nextafter it comes up
for the f1 to range
p[frange] double
[-2.22044604925031283432823045461545143383482334912930322712e-16
(-0x0.fffffffffffff8p-52),
4.44089209850062566865646090923090286766964669825860645425e-16
(0x0.fffffffffffff8p-51)]
But as can be seen on
double f1 (double eps) { double d = 1. + eps; if (d == 1.) return eps; return
0.0; }
int
main ()
{
  __builtin_printf ("%.32a\n", f1 (-0x0.fffffffffffff8p-52));
  __builtin_printf ("%.32a\n", f1 (__builtin_nextafter
(-0x0.fffffffffffff8p-52, -42.0)));
  __builtin_printf ("%.32a\n", f1 (0x0.fffffffffffff8p-51));
  __builtin_printf ("%.32a\n", f1 (__builtin_nextafter (0x0.fffffffffffff8p-51,
42.0)));
  __builtin_printf ("%.32a\n", f1 (-0x1.0p-54));
  __builtin_printf ("%.32a\n", f1 (__builtin_nextafter (-0x1.0p-54, -42.0)));
  __builtin_printf ("%.32a\n", f1 (0x1.0p-53));
  __builtin_printf ("%.32a\n", f1 (__builtin_nextafter (0x1.0p-53, 42.0)));
  __builtin_printf ("%.32a\n", f1 (-0x0.8p-53));
  __builtin_printf ("%.32a\n", f1 (__builtin_nextafter (-0x0.8p-53, -42.0)));
  __builtin_printf ("%.32a\n", f1 (0x0.8p-52));
  __builtin_printf ("%.32a\n", f1 (__builtin_nextafter (0x0.8p-52, 42.0)));
}
the correct exact range is what is written in the comment, i.e. [-0x1.0p-54,
0x1.0p-53]
aka [-0x0.8p-53, 0x0.8p-52].

^ permalink raw reply	[flat|nested] 52+ messages in thread

* [Bug tree-optimization/109008] [13 Regression] Wrong code in scipy package since r13-3926-gd4c2f1d376da6f
  2023-03-03 12:08 [Bug tree-optimization/109008] New: [13 Regression ]Maybe wrong code in scipy package since r13-3926-gd4c2f1d376da6f marxin at gcc dot gnu.org
                   ` (25 preceding siblings ...)
  2023-03-07 20:49 ` jakub at gcc dot gnu.org
@ 2023-03-08  9:26 ` aldyh at gcc dot gnu.org
  2023-03-08  9:29 ` aldyh at gcc dot gnu.org
                   ` (23 subsequent siblings)
  50 siblings, 0 replies; 52+ messages in thread
From: aldyh at gcc dot gnu.org @ 2023-03-08  9:26 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109008

Aldy Hernandez <aldyh at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |amacleod at redhat dot com

--- Comment #26 from Aldy Hernandez <aldyh at gcc dot gnu.org> ---
(In reply to Richard Biener from comment #16)
> (In reply to Jakub Jelinek from comment #14)
> > Created attachment 54599 [details]
> > gcc13-pr109008-wip.patch
> > 
> > I have now this WIP but it doesn't work correctly yet, need to debug it now,
> > just finished writing it and making it compile.
> > Note, after the binary search it is still unfinished, I'd like to then
> > adjust it one bit at a time to narrow it further.  But before working on
> > that it needs to work correctly.
> 
> Looks nicer and more precise (even though the iterating function is ugly
> and you also suffer from the lack of copying of alternate range state,
> but using .union () is a way out here I guess ;)).

While working on LTO streaming of ranges, I also noticed that we don't have a
way of accessing the individual NAN sign bits or a way of copying NAN state. 
Why don't we get this right instead of hacking around it?

Would it help if we provided the following?

nan_state_t irange::get_nan_state ();

irange::set (tree type, const REAL_VALUE_TYPE &, const REAL_VALUE_TYPE &, const
nan_state_t &, value_range_kind = VR_RANGE);

We could implement nan_state_t to include the m_pos_nan and m_neg_nan for
minimal disruption now, and later use it for extending the NAN information
(signalling, etc).

^ permalink raw reply	[flat|nested] 52+ messages in thread

* [Bug tree-optimization/109008] [13 Regression] Wrong code in scipy package since r13-3926-gd4c2f1d376da6f
  2023-03-03 12:08 [Bug tree-optimization/109008] New: [13 Regression ]Maybe wrong code in scipy package since r13-3926-gd4c2f1d376da6f marxin at gcc dot gnu.org
                   ` (26 preceding siblings ...)
  2023-03-08  9:26 ` aldyh at gcc dot gnu.org
@ 2023-03-08  9:29 ` aldyh at gcc dot gnu.org
  2023-03-08  9:36 ` jakub at gcc dot gnu.org
                   ` (22 subsequent siblings)
  50 siblings, 0 replies; 52+ messages in thread
From: aldyh at gcc dot gnu.org @ 2023-03-08  9:29 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109008

--- Comment #27 from Aldy Hernandez <aldyh at gcc dot gnu.org> ---
(In reply to Richard Biener from comment #21)
> So without messing with real.cc to try exposing 0.5ulp adjustments for GCC
> 13 I'd simply do something like the following:
> 
> diff --git a/gcc/range-op-float.cc b/gcc/range-op-float.cc
> index ff42b95de4f..1ae68503664 100644
> --- a/gcc/range-op-float.cc
> +++ b/gcc/range-op-float.cc
> @@ -2214,7 +2214,26 @@ public:
>      range_op_handler minus (MINUS_EXPR, type);
>      if (!minus)
>        return false;
> -    return float_binary_op_range_finish (minus.fold_range (r, type, lhs,
> op2),
> +    /* The forward operation is
> +        lhs = r + op2
> +       where the result is +-0.5ulp of lhs before rounding.  For the
> +       reverse operation we need the lhs range before rounding, so
> +       conservatively use nextafter on it.
> +       ???  REAL_VALUE_TYPE could handle this more precisely if we
> +       make sure to not round the inputs for the format before the
> +       real_operation is carried out and when we can properly round
> +       towards +-Inf for the lower/upper bounds.  */
> +    frange wlhs (lhs);
> +    if (!lhs.known_isnan ())
> +      {
> +       REAL_VALUE_TYPE lhsl = lhs.lower_bound ();
> +       frange_nextafter (TYPE_MODE (type), lhsl, dconstninf);
> +       REAL_VALUE_TYPE lhsu = lhs.upper_bound ();
> +       frange_nextafter (TYPE_MODE (type), lhsu, dconstinf);
> +       wlhs.set (type, lhsl, lhsu);
> +       wlhs.union_ (lhs);  /* Copy NaN state.  */
> +      }
> +    return float_binary_op_range_finish (minus.fold_range (r, type, wlhs,
> op2),
>                                          r, type, lhs);
>    }
>    virtual bool op2_range (frange &r, tree type,

Could we abstract this into an inline function until this could be properly
implemented in real.cc?

^ permalink raw reply	[flat|nested] 52+ messages in thread

* [Bug tree-optimization/109008] [13 Regression] Wrong code in scipy package since r13-3926-gd4c2f1d376da6f
  2023-03-03 12:08 [Bug tree-optimization/109008] New: [13 Regression ]Maybe wrong code in scipy package since r13-3926-gd4c2f1d376da6f marxin at gcc dot gnu.org
                   ` (27 preceding siblings ...)
  2023-03-08  9:29 ` aldyh at gcc dot gnu.org
@ 2023-03-08  9:36 ` jakub at gcc dot gnu.org
  2023-03-08 10:09 ` aldyh at gcc dot gnu.org
                   ` (21 subsequent siblings)
  50 siblings, 0 replies; 52+ messages in thread
From: jakub at gcc dot gnu.org @ 2023-03-08  9:36 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109008

--- Comment #28 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
On IRC we've discussed this and I believe a possible fix could be before we do:
    return float_binary_op_range_finish (minus.fold_range (r, type, lhs, op2),
                                         r, type, lhs);
etc. artificially extend the lhs range by 1ulp or .5ulp or whatever works in
each direction (on a copy) and then just let it do its job.

But I want to fixup above patch first, so that we have something to compare to.
From
--- gcc/range-op-float.cc.jj    2023-03-07 21:20:49.885225381 +0100
+++ gcc/range-op-float.cc       2023-03-08 09:13:10.063608296 +0100
@@ -2319,9 +2319,15 @@ float_range_extend (tree type, frange &r
        u = m;
     }
   gcc_checking_assert (l + 1 == u);
+fprintf (stderr, "--\n");
+SET_REAL_EXP (&t, l);
+real_arithmetic (&w, PLUS_EXPR, &v, &t);
+real_convert (&w, mode, &w);
+test (w, type, lhs, op2);
   SET_REAL_EXP (&t, u);
   real_arithmetic (&w, PLUS_EXPR, &v, &t);
   real_convert (&w, mode, &w);
+test (w, type, lhs, op2);
   REAL_VALUE_TYPE lastw = w;
   for (int i = SIGNIFICAND_BITS - 2; i >= SIGNIFICAND_BITS - 2 - p; i--)
     {
@@ -2338,7 +2344,10 @@ float_range_extend (tree type, frange &r
       lastw = w;
     }
   w = lastw;
+fprintf (stderr, "---\n");
+test (w, type, lhs, op2);
   frange_nextafter (mode, w, upper ? dconstninf : dconstinf);
+test (w, type, lhs, op2);
   goto update;
 }

@@ -2367,7 +2376,15 @@ public:
                            REAL_VALUE_TYPE res;
                            frange_arithmetic (PLUS_EXPR, type, res, r,
                                               op2.lower_bound (), dconstinf);
-                           return !real_less (&res, &lhs.lower_bound ());
+bool ret = !real_less (&res, &lhs.lower_bound ());
+char br[60], bop2[60], bres[60], blhs[60];
+real_to_hexadecimal (br, &r, sizeof (br), 0, 1);
+real_to_hexadecimal (bop2, &op2.lower_bound (), sizeof (bop2), 0, 1);
+real_to_hexadecimal (bres, &res, sizeof (bres), 0, 1);
+real_to_hexadecimal (blhs, &lhs.lower_bound (), sizeof (blhs), 0, 1);
+fprintf (stderr, "float_range_extend1 %s + %s = %s %s %s\n", br, bop2, bres,
ret ? ">=" : "<", blhs);
+return ret;
+//                         return !real_less (&res, &lhs.lower_bound ());
                          });
     float_range_extend (type, r, lhs, op2, true,
                        [] (REAL_VALUE_TYPE &r, tree type,
@@ -2376,7 +2393,15 @@ public:
                            REAL_VALUE_TYPE res;
                            frange_arithmetic (PLUS_EXPR, type, res, r,
                                               op2.upper_bound (), dconstninf);
-                           return !real_less (&lhs.upper_bound (), &res);
+bool ret = !real_less (&lhs.upper_bound (), &res);
+char br[60], bop2[60], bres[60], blhs[60];
+real_to_hexadecimal (br, &r, sizeof (br), 0, 1);
+real_to_hexadecimal (bop2, &op2.lower_bound (), sizeof (bop2), 0, 1);
+real_to_hexadecimal (bres, &res, sizeof (bres), 0, 1);
+real_to_hexadecimal (blhs, &lhs.lower_bound (), sizeof (blhs), 0, 1);
+fprintf (stderr, "float_range_extend2 %s + %s = %s %s %s\n", br, bop2, bres,
ret ? "<=" : ">", blhs);
+return ret;
+//                         return !real_less (&lhs.upper_bound (), &res);
                          });
     return true;
   }
debugging hacks seems it is the loop that tries to narrow the mantissa bits
that is wrong, as immediately before it the l and u values seem correct:
float_range_extend1 -0x0.8p-53 + 0x0.8p+1 = 0x0.8p+1 >= 0x0.8p+1
float_range_extend1 -0x0.8p-52 + 0x0.8p+1 = 0x0.fffffffffffff8p+0 < 0x0.8p+1
where the first is for l where test passes and u doesn't (and the right answer
here is -0x0.8p-53).

^ permalink raw reply	[flat|nested] 52+ messages in thread

* [Bug tree-optimization/109008] [13 Regression] Wrong code in scipy package since r13-3926-gd4c2f1d376da6f
  2023-03-03 12:08 [Bug tree-optimization/109008] New: [13 Regression ]Maybe wrong code in scipy package since r13-3926-gd4c2f1d376da6f marxin at gcc dot gnu.org
                   ` (28 preceding siblings ...)
  2023-03-08  9:36 ` jakub at gcc dot gnu.org
@ 2023-03-08 10:09 ` aldyh at gcc dot gnu.org
  2023-03-08 11:29 ` jakub at gcc dot gnu.org
                   ` (20 subsequent siblings)
  50 siblings, 0 replies; 52+ messages in thread
From: aldyh at gcc dot gnu.org @ 2023-03-08 10:09 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109008

--- Comment #29 from Aldy Hernandez <aldyh at gcc dot gnu.org> ---
Created attachment 54604
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=54604&action=edit
untested patch for NAN state copying

This is what I had in mind.

Notice I haven't touched the fields in frange itself, to keep disruption to a
minimum in this release.  In the next release we could replace m_pos_nan and
m_neg_nan with nan_state.

If this helps, I could test and document it.

^ permalink raw reply	[flat|nested] 52+ messages in thread

* [Bug tree-optimization/109008] [13 Regression] Wrong code in scipy package since r13-3926-gd4c2f1d376da6f
  2023-03-03 12:08 [Bug tree-optimization/109008] New: [13 Regression ]Maybe wrong code in scipy package since r13-3926-gd4c2f1d376da6f marxin at gcc dot gnu.org
                   ` (29 preceding siblings ...)
  2023-03-08 10:09 ` aldyh at gcc dot gnu.org
@ 2023-03-08 11:29 ` jakub at gcc dot gnu.org
  2023-03-08 11:30 ` jakub at gcc dot gnu.org
                   ` (19 subsequent siblings)
  50 siblings, 0 replies; 52+ messages in thread
From: jakub at gcc dot gnu.org @ 2023-03-08 11:29 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109008

Jakub Jelinek <jakub at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
  Attachment #54602|0                           |1
        is obsolete|                            |

--- Comment #30 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
Created attachment 54605
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=54605&action=edit
gcc13-pr109008-wip.patch

Fixed iterative patch (just for archival purposes only).
Note, even this one isn't exact, because of the way the tests conservatively
try to round inexact operations (normally in a way to make the range slightly
larger, in this case the tests for e.g. low bound the normal operation towards
+inf), for f1 above with this patch one gets
[frange] double
[-1.11022302462515641716411522730772571691741167456465161356e-16
(-0x0.fffffffffffff8p-53),
2.22044604925031283432823045461545143383482334912930322712e-16
(0x0.fffffffffffff8p-52)]
instead of [-0x0.8p-53, 0x0.8p-52] which is the minimal range.

^ permalink raw reply	[flat|nested] 52+ messages in thread

* [Bug tree-optimization/109008] [13 Regression] Wrong code in scipy package since r13-3926-gd4c2f1d376da6f
  2023-03-03 12:08 [Bug tree-optimization/109008] New: [13 Regression ]Maybe wrong code in scipy package since r13-3926-gd4c2f1d376da6f marxin at gcc dot gnu.org
                   ` (30 preceding siblings ...)
  2023-03-08 11:29 ` jakub at gcc dot gnu.org
@ 2023-03-08 11:30 ` jakub at gcc dot gnu.org
  2023-03-08 14:51 ` jakub at gcc dot gnu.org
                   ` (18 subsequent siblings)
  50 siblings, 0 replies; 52+ messages in thread
From: jakub at gcc dot gnu.org @ 2023-03-08 11:30 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109008

--- Comment #31 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
Created attachment 54606
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=54606&action=edit
gcc13-pr109008-wip-debug.patch

And the incremental debugging patch for that.

^ permalink raw reply	[flat|nested] 52+ messages in thread

* [Bug tree-optimization/109008] [13 Regression] Wrong code in scipy package since r13-3926-gd4c2f1d376da6f
  2023-03-03 12:08 [Bug tree-optimization/109008] New: [13 Regression ]Maybe wrong code in scipy package since r13-3926-gd4c2f1d376da6f marxin at gcc dot gnu.org
                   ` (31 preceding siblings ...)
  2023-03-08 11:30 ` jakub at gcc dot gnu.org
@ 2023-03-08 14:51 ` jakub at gcc dot gnu.org
  2023-03-08 14:52 ` jakub at gcc dot gnu.org
                   ` (17 subsequent siblings)
  50 siblings, 0 replies; 52+ messages in thread
From: jakub at gcc dot gnu.org @ 2023-03-08 14:51 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109008

--- Comment #32 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
Created attachment 54608
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=54608&action=edit
gcc13-pr109008.patch

So far almost untested patch which does the lhs widening.

^ permalink raw reply	[flat|nested] 52+ messages in thread

* [Bug tree-optimization/109008] [13 Regression] Wrong code in scipy package since r13-3926-gd4c2f1d376da6f
  2023-03-03 12:08 [Bug tree-optimization/109008] New: [13 Regression ]Maybe wrong code in scipy package since r13-3926-gd4c2f1d376da6f marxin at gcc dot gnu.org
                   ` (32 preceding siblings ...)
  2023-03-08 14:51 ` jakub at gcc dot gnu.org
@ 2023-03-08 14:52 ` jakub at gcc dot gnu.org
  2023-03-08 15:07 ` jakub at gcc dot gnu.org
                   ` (16 subsequent siblings)
  50 siblings, 0 replies; 52+ messages in thread
From: jakub at gcc dot gnu.org @ 2023-03-08 14:52 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109008

--- Comment #33 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
Created attachment 54609
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=54609&action=edit
gcc13-pr109008-debug.patch

And debugging code.

^ permalink raw reply	[flat|nested] 52+ messages in thread

* [Bug tree-optimization/109008] [13 Regression] Wrong code in scipy package since r13-3926-gd4c2f1d376da6f
  2023-03-03 12:08 [Bug tree-optimization/109008] New: [13 Regression ]Maybe wrong code in scipy package since r13-3926-gd4c2f1d376da6f marxin at gcc dot gnu.org
                   ` (33 preceding siblings ...)
  2023-03-08 14:52 ` jakub at gcc dot gnu.org
@ 2023-03-08 15:07 ` jakub at gcc dot gnu.org
  2023-03-08 15:54 ` jakub at gcc dot gnu.org
                   ` (15 subsequent siblings)
  50 siblings, 0 replies; 52+ messages in thread
From: jakub at gcc dot gnu.org @ 2023-03-08 15:07 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109008

--- Comment #34 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
Testing I've performed so far (though on 10000 iterations rather than 300000,
that is ongoing), once with the
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109008#c33 patch alone, once with
that patch and https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109008#c32.

First step, generate a random testcase:
pr109008-gen.c:
#include <stdint.h>
#include <stdlib.h>
#include <stdio.h>
#include <string.h>

static long rand_n;
static int rand_c;

static uint32_t
get_rand32 (void)
{
  uint32_t ret = 0;
  if (rand_c == 0)
    {
      ret = random () & 0x7fffffff;
      rand_c = 31;
    }
  else
    ret = rand_n & (((uint32_t) 1 << rand_c) - 1);
  ret <<= (32 - rand_c);
  rand_n = random ();
  ret |= rand_n & (((uint32_t) 1 << (32 - rand_c)) - 1);
  rand_n >>= (32 - rand_c);
  rand_c = 31 - (32 - rand_c);
  return ret;
}

static uint64_t
get_rand64 (void)
{
  return (((uint64_t) get_rand32 ()) << 32) | get_rand32 ();
}

static float
get_randf (void)
{
  uint32_t i = get_rand32 ();
  float f;
  memcpy (&f, &i, sizeof (f));
  return f;
}

int
main ()
{
  printf ("#define nanf __builtin_nanf (\"\")\n");
  printf ("#define inf __builtin_inff ()\n");
  for (int n = 0; n < 300000; ++n)
    {
      float n1 = get_randf ();
      float n2 = get_randf ();
      uint32_t x = get_rand32 ();
      if ((x & 7) == 0)
        n2 = n1;
      x >>= 3;
      printf ("float f%d (float eps) { float f = ", n);
      switch (x % 3)
        {
        case 0: printf ("%af + eps", n1); break;
        case 1: printf ("%af - eps", n1); break;
        case 2: printf ("eps - %af", n1); break;
        }
      printf ("; if (f == %af) return eps; return __builtin_nanf (\"42\");
}\n", n2);
    }
  return 0;
}
pr109008-main.c:
#include <math.h>

#include "pr109008.c"

struct S { float (*fn) (float); float lb, ub; };
struct S arr[] = {
#include "pr109008.h"
};

int
main ()
{
  float plus_inf = __builtin_inf ();
  float minus_inf = -plus_inf;
  for (int i = 0; i < sizeof (arr) / sizeof (arr[0]); ++i)
    {
      float lb = nextafterf (arr[i].lb, minus_inf);
      float ub = nextafterf (arr[i].ub, plus_inf);
      if (!__builtin_isnan (arr[i].fn (lb)) || !__builtin_isnan (arr[i].fn
(ub)))
        __builtin_printf ("%p err\n", arr[i].fn);
    }
}
gcc -g -O2 -o pr109008-gen{,.c}; ./pr109008-gen > pr109008.c
Next, with cc1 built with just #c33 patch:
rm -f /tmp/ranges; ./cc1 -quiet -O2 pr109008.c; sort -u /tmp/ranges >
pr109008.h
gcc -g -o pr109008-main{,.c}; ./pr109008-main
For 10000 iterations this showed 872 errors.
Next, with cc1 built with both #c32 and #c33 patches:
rm -f /tmp/ranges; ./cc1 -quiet -O2 pr109008.c; sort -u /tmp/ranges >
pr109008.h
gcc -g -o pr109008-main{,.c}; ./pr109008-main
This didn't print any errors, so at least for foperator_plus and
foperator_minus seems
to be from this limited testing conservatively correct.
Want to finish now this testing also for 300000 iterations and then perhaps try
the #c30 patch with variant of #c33 to check also that implementation.
And finally compare the #c32+#c33 vs. #c30+#c33variant ranges.

Another thing which would be nice to think about is whether
float_widen_lhs_range
needs to extend even real_{min,max}_representable () bounds towards -+inf, or
whether
that case can't happen (and check that separately for + or - and * or /),
because e.g. for mult/div whether lhs is finite is quite important.

^ permalink raw reply	[flat|nested] 52+ messages in thread

* [Bug tree-optimization/109008] [13 Regression] Wrong code in scipy package since r13-3926-gd4c2f1d376da6f
  2023-03-03 12:08 [Bug tree-optimization/109008] New: [13 Regression ]Maybe wrong code in scipy package since r13-3926-gd4c2f1d376da6f marxin at gcc dot gnu.org
                   ` (34 preceding siblings ...)
  2023-03-08 15:07 ` jakub at gcc dot gnu.org
@ 2023-03-08 15:54 ` jakub at gcc dot gnu.org
  2023-03-08 16:37 ` jakub at gcc dot gnu.org
                   ` (14 subsequent siblings)
  50 siblings, 0 replies; 52+ messages in thread
From: jakub at gcc dot gnu.org @ 2023-03-08 15:54 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109008

--- Comment #35 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
gcc -g -o pr109008-main{,.c}; ./pr109008-main
should have been
gcc -g -o pr109008-main{,.c} -lm; ./pr109008-main
sorry.
Anyway, 300000 functions now finished, 25471 errors for #c33 patch, 0 errors
for #c32 + #c33 patches, now #c30 with adjustments.

^ permalink raw reply	[flat|nested] 52+ messages in thread

* [Bug tree-optimization/109008] [13 Regression] Wrong code in scipy package since r13-3926-gd4c2f1d376da6f
  2023-03-03 12:08 [Bug tree-optimization/109008] New: [13 Regression ]Maybe wrong code in scipy package since r13-3926-gd4c2f1d376da6f marxin at gcc dot gnu.org
                   ` (35 preceding siblings ...)
  2023-03-08 15:54 ` jakub at gcc dot gnu.org
@ 2023-03-08 16:37 ` jakub at gcc dot gnu.org
  2023-03-08 18:56 ` jakub at gcc dot gnu.org
                   ` (13 subsequent siblings)
  50 siblings, 0 replies; 52+ messages in thread
From: jakub at gcc dot gnu.org @ 2023-03-08 16:37 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109008

--- Comment #36 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
Ok, checked the #c30 patch with printouts from #c33 added by hand before return
true;
at the end of each of +/- op?_range, that passed 300000 tests too.
Out of the 297936 lines in the pr109008.h headers (rest are probably lines with
nans),
#include <math.h>

#include "pr109008-4.c"

struct S { float (*fn) (float); float lb, ub; };
struct S arr[] = {
#include "pr109008-4-new.h"
};
struct S arr2[] = {
#include "pr109008-4-iter.h"
};

int
main ()
{
  int stats[10] = {};
  float plus_inf = __builtin_inf ();
  float minus_inf = -plus_inf;
  for (int i = 0; i < sizeof (arr) / sizeof (arr[0]); ++i)
    {
      if (arr[i].lb == arr2[i].lb && arr[i].ub == arr2[i].ub)
        {
          stats[0]++;
          continue;
        }
      if (arr[i].lb > arr2[i].lb || arr[i].ub < arr2[i].ub)
        {
          stats[1]++;
          continue;
        }
      float lb = nextafterf (arr2[i].lb, minus_inf);
      float ub = nextafterf (arr2[i].ub, plus_inf);
      if (arr[i].lb == lb && arr[i].ub == ub)
        {
          stats[2]++;
          continue;
        }
      __builtin_printf ("%p %a %a %a %a\n", arr[i].fn, arr[i].lb, arr2[i].lb,
arr2[i].ub, arr[i].ub);
    }
  __builtin_printf ("%d %d %d\n", stats[0], stats[1], stats[2]);
}

hack shows 128572 cases where both approaches yield the same ranges, 0 cases
where the #c32 would result
in narrower ranges than #c30, 167497 where both bounds are exactly one ulp
worse with #c32 and 1868
other cases, some are 2ulps, others a little bit more.  But I'd say this is
still all acceptable.

Richi, what do you think?

I'll bootstrap/regtest the #c32 patch overnight just in case.

^ permalink raw reply	[flat|nested] 52+ messages in thread

* [Bug tree-optimization/109008] [13 Regression] Wrong code in scipy package since r13-3926-gd4c2f1d376da6f
  2023-03-03 12:08 [Bug tree-optimization/109008] New: [13 Regression ]Maybe wrong code in scipy package since r13-3926-gd4c2f1d376da6f marxin at gcc dot gnu.org
                   ` (36 preceding siblings ...)
  2023-03-08 16:37 ` jakub at gcc dot gnu.org
@ 2023-03-08 18:56 ` jakub at gcc dot gnu.org
  2023-03-08 20:02 ` jakub at gcc dot gnu.org
                   ` (12 subsequent siblings)
  50 siblings, 0 replies; 52+ messages in thread
From: jakub at gcc dot gnu.org @ 2023-03-08 18:56 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109008

--- Comment #37 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
I've additionally ran
#include <math.h>

#include "pr109008-4.c"

struct S { float (*fn) (float); float lb, ub; };
struct S arr[] = {
#include "pr109008-4.h"
};

int
main ()
{
  float plus_inf = __builtin_inf ();
  float minus_inf = -plus_inf;
  for (int i = 0; i < sizeof (arr) / sizeof (arr[0]); ++i)
    {
      int n1, n2;
      float lb = nextafterf (arr[i].lb, minus_inf);
      float ub = nextafterf (arr[i].ub, plus_inf);
      if (!__builtin_isnan (arr[i].fn (lb)) || !__builtin_isnan (arr[i].fn
(ub)))
        __builtin_printf ("%p err\n", arr[i].fn);
      lb = arr[i].lb;
      ub = arr[i].ub;
      for (n1 = 0; n1 < 50; n1++)
        if (!__builtin_isnan (arr[i].fn (lb)))
          break;
        else if (lb > ub)
          {
            n1 = -1;
            n2 = -1;
            break;
          }
        else
          lb = nextafterf (lb, plus_inf);
      if (n1 != -1)
        for (n2 = 0; n2 < 50; n2++)
          if (!__builtin_isnan (arr[i].fn (ub)))
            break;
          else
            ub = nextafterf (ub, minus_inf);
      __builtin_printf ("%d %d\n", n1, n2);
    }
}
on the header from the #c32+#c33 patches, and
./pr109008-4-main 2>&1 | sort | uniq -c | sort -n
      1 1 11
      1 11 1
      1 1 17
      1 1 25
      1 1 28
      1 1 3
      1 1 4
      1 19 1
      1 2 13
      1 2 16
      1 2 17
      1 2 36
      1 2 40
      1 3 2
      1 39 1
      1 5 1
      1 5 2
      2 1 32
      2 1 7
      2 3 1
      2 3 50
      3 4 1
     12 50 2
     16 2 50
     27 50 1
     39 1 50
     48 32 32
     53 33 33
     57 17 17
     73 16 16
     77 8 8
     78 9 9
    114 4 4
    124 5 5
    170 3 3
   8533 2 2
  15203 1 1
  24287 50 50
  59288 2 1
  59613 1 2
 130096 -1 -1

The -1 -1 case means that in reality the range is empty (UNDEFINED), there are
no values of eps for which the function returns it, as the difference in
exponents is too large for such a number to exist.  50 means actually 50 or
more ulps.

^ permalink raw reply	[flat|nested] 52+ messages in thread

* [Bug tree-optimization/109008] [13 Regression] Wrong code in scipy package since r13-3926-gd4c2f1d376da6f
  2023-03-03 12:08 [Bug tree-optimization/109008] New: [13 Regression ]Maybe wrong code in scipy package since r13-3926-gd4c2f1d376da6f marxin at gcc dot gnu.org
                   ` (37 preceding siblings ...)
  2023-03-08 18:56 ` jakub at gcc dot gnu.org
@ 2023-03-08 20:02 ` jakub at gcc dot gnu.org
  2023-03-09  8:52 ` cvs-commit at gcc dot gnu.org
                   ` (11 subsequent siblings)
  50 siblings, 0 replies; 52+ messages in thread
From: jakub at gcc dot gnu.org @ 2023-03-08 20:02 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109008

--- Comment #38 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
I've also repeated the testing with the above generator generated testcase with
" - " replaced with " / " and " + " with " * ", with #c32 + #c33 patches this
resulted
in
      1 15 50
      1 17 50
      1 18 50
      1 23 50
      1 29 50
      1 32 50
      1 50 14
      1 50 18
      1 50 33
      1 50 35
      2 11 50
      2 13 50
      2 50 12
      2 50 8
      2 7 50
      3 50 10
      3 9 50
      4 50 6
      4 50 9
      4 6 50
      5 10 50
      5 8 50
      6 50 7
      9 5 50
     10 4 50
     10 50 4
     12 50 5
     28 3 50
     45 50 3
    171 50 2
    194 2 50
    366 1 50
    411 50 1
   8744 50 50
  33627 2 2
  45607 1 2
  52776 1 1
  70436 2 1
  85436 -1 -1
so again, seems at least for the 297936 tests with finite ranges it is
conservatively correct and while it still has various 50+ulps cases, in most
cases it is within an ulp or two from the minimal range.
If we find some important issue where it is too conservative, we can deal with
it incrementally.

^ permalink raw reply	[flat|nested] 52+ messages in thread

* [Bug tree-optimization/109008] [13 Regression] Wrong code in scipy package since r13-3926-gd4c2f1d376da6f
  2023-03-03 12:08 [Bug tree-optimization/109008] New: [13 Regression ]Maybe wrong code in scipy package since r13-3926-gd4c2f1d376da6f marxin at gcc dot gnu.org
                   ` (38 preceding siblings ...)
  2023-03-08 20:02 ` jakub at gcc dot gnu.org
@ 2023-03-09  8:52 ` cvs-commit at gcc dot gnu.org
  2023-03-09 10:24 ` rguenth at gcc dot gnu.org
                   ` (10 subsequent siblings)
  50 siblings, 0 replies; 52+ messages in thread
From: cvs-commit at gcc dot gnu.org @ 2023-03-09  8:52 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109008

--- Comment #39 from CVS Commits <cvs-commit at gcc dot gnu.org> ---
The master branch has been updated by Jakub Jelinek <jakub@gcc.gnu.org>:

https://gcc.gnu.org/g:bad177e848787258070415dbe002b2c6fba1c511

commit r13-6549-gbad177e848787258070415dbe002b2c6fba1c511
Author: Jakub Jelinek <jakub@redhat.com>
Date:   Thu Mar 9 09:51:37 2023 +0100

    range-op-float: Fix up reverse binary operations [PR109008]

    The following testcase is reduced from miscompilation of scipy package.
    If we have say lhs = [1., 1.] - [1., 1.] and want to compute the range
    of lhs from it, we correctly determine it is [0., 0.] (if computations
    are exact, we generally don't try to round them further in
    frange_arithmetic).  In the testcase it is about a reverse operation,
    [1., 1.] = op1 + [1., 1.] and we want to compute range of op1 from that.
    Right now we just perform the inverse operation (there are some corner
    cases about NaN and infinities handling) and so arrive to range
    [0., 0.] as well, and because it is a singleton, optimize return eps;
    to return 0.  That is incorrect though, for the reverse ops we need to
    take into account also rounding, the right exact range is
    [-0x1.0p-54, 0x1.0p-53] in this case when rounding to nearest, i.e.
    all numbers which added to 1. with round to nearest still produce 1.

    The problem isn't solely on singleton ranges, and isn't solely on
    results around zero.  We basically need to consider also values
    where the result is up to 0.5ulp away from the lhs range boundaries
    in each direction.

    The following patch fixes it by extending the lhs range for the
    reverse operations by 1ulp in each direction.  The PR contains
    a pseudo-random test generator I've used to generate 300000 tests
    of + and - and then used the same test with * and / instead of + and -
    together with a hack to print the discovered ranges by the patch in
    a form that another test could then verify the range is conservatively
    correct and how far it is from a minimal range.

    I believe the results are good enough for now, though plan to look
    incrementally into trying to do something better on the -XXX_MAX or
    XXX_MAX boundaries (where I think frange_nextafter will use -inf or +inf)
    and also try to increase the range just by 0.5ulp rather than 1ulp
    if !flag_rounding_math.  But dunno if either of those will be doable
    and will pass the testing, so I think it is worth committing this fix
    first.

    2023-03-09  Jakub Jelinek  <jakub@redhat.com>
                Richard Biener  <rguenther@suse.de>

            PR tree-optimization/109008
            * range-op-float.cc (float_widen_lhs_range): New function.
            (foperator_plus::op1_range, foperator_minus::op1_range,
            foperator_minus::op2_range, foperator_mult::op1_range,
            foperator_div::op1_range, foperator_div::op2_range): Use it.

            * gcc.c-torture/execute/ieee/pr109008.c: New test.

^ permalink raw reply	[flat|nested] 52+ messages in thread

* [Bug tree-optimization/109008] [13 Regression] Wrong code in scipy package since r13-3926-gd4c2f1d376da6f
  2023-03-03 12:08 [Bug tree-optimization/109008] New: [13 Regression ]Maybe wrong code in scipy package since r13-3926-gd4c2f1d376da6f marxin at gcc dot gnu.org
                   ` (39 preceding siblings ...)
  2023-03-09  8:52 ` cvs-commit at gcc dot gnu.org
@ 2023-03-09 10:24 ` rguenth at gcc dot gnu.org
  2023-03-09 11:06 ` jakub at gcc dot gnu.org
                   ` (9 subsequent siblings)
  50 siblings, 0 replies; 52+ messages in thread
From: rguenth at gcc dot gnu.org @ 2023-03-09 10:24 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109008

Richard Biener <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |RESOLVED
         Resolution|---                         |FIXED

--- Comment #40 from Richard Biener <rguenth at gcc dot gnu.org> ---
Should be fixed.

^ permalink raw reply	[flat|nested] 52+ messages in thread

* [Bug tree-optimization/109008] [13 Regression] Wrong code in scipy package since r13-3926-gd4c2f1d376da6f
  2023-03-03 12:08 [Bug tree-optimization/109008] New: [13 Regression ]Maybe wrong code in scipy package since r13-3926-gd4c2f1d376da6f marxin at gcc dot gnu.org
                   ` (40 preceding siblings ...)
  2023-03-09 10:24 ` rguenth at gcc dot gnu.org
@ 2023-03-09 11:06 ` jakub at gcc dot gnu.org
  2023-03-09 11:12 ` marxin at gcc dot gnu.org
                   ` (8 subsequent siblings)
  50 siblings, 0 replies; 52+ messages in thread
From: jakub at gcc dot gnu.org @ 2023-03-09 11:06 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109008

--- Comment #41 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
So, regarding the behavior around DBL_MAX or -DBL_MAX, consider say:
__attribute__((noipa)) double
foo (double eps)
{
  double d = __DBL_MAX__ + eps;
  if (d == __DBL_MAX__)
    return eps;
  return 0.0;
}
The exact range (tested with just trying to call the function with appropriate
constants) is
[-0x0.fffffffffffff8p+970, 0x0.fffffffffffff8p+970], the function for those
boundary
values still returns that value, while for nextafter of the former towards -inf
(-0x0.8p+971) or nextafter of the latter towards +inf (0x0.8p+971) it already
returns 0.

GCC trunk before r13-6549 would compute obviously incorrectly
[frange] double [0.0 (0x0.0p+0), 0.0 (0x0.0p+0)]
and miscompile it.
r13-6549 computes
[frange] double
[-1.9958403095347198116563727130368385660674512604354575415e+292 (-0x0.8p+972),
+Inf]
which is conservatively correct, but especially the +Inf side way too large.
With an untested patch I'm going to attach momentarily we get instead
[frange] double
[-1.9958403095347198116563727130368385660674512604354575415e+292 (-0x0.8p+972),
1.9958403095347198116563727130368385660674512604354575415e+292 (
0x0.8p+972)], which is still wider than needed, but at least finite.

^ permalink raw reply	[flat|nested] 52+ messages in thread

* [Bug tree-optimization/109008] [13 Regression] Wrong code in scipy package since r13-3926-gd4c2f1d376da6f
  2023-03-03 12:08 [Bug tree-optimization/109008] New: [13 Regression ]Maybe wrong code in scipy package since r13-3926-gd4c2f1d376da6f marxin at gcc dot gnu.org
                   ` (41 preceding siblings ...)
  2023-03-09 11:06 ` jakub at gcc dot gnu.org
@ 2023-03-09 11:12 ` marxin at gcc dot gnu.org
  2023-03-09 11:50 ` jakub at gcc dot gnu.org
                   ` (7 subsequent siblings)
  50 siblings, 0 replies; 52+ messages in thread
From: marxin at gcc dot gnu.org @ 2023-03-09 11:12 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109008

--- Comment #42 from Martin Liška <marxin at gcc dot gnu.org> ---
Thank your for the hard engineering fix!

^ permalink raw reply	[flat|nested] 52+ messages in thread

* [Bug tree-optimization/109008] [13 Regression] Wrong code in scipy package since r13-3926-gd4c2f1d376da6f
  2023-03-03 12:08 [Bug tree-optimization/109008] New: [13 Regression ]Maybe wrong code in scipy package since r13-3926-gd4c2f1d376da6f marxin at gcc dot gnu.org
                   ` (42 preceding siblings ...)
  2023-03-09 11:12 ` marxin at gcc dot gnu.org
@ 2023-03-09 11:50 ` jakub at gcc dot gnu.org
  2023-03-09 12:34 ` rguenth at gcc dot gnu.org
                   ` (6 subsequent siblings)
  50 siblings, 0 replies; 52+ messages in thread
From: jakub at gcc dot gnu.org @ 2023-03-09 11:50 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109008

--- Comment #43 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
Created attachment 54622
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=54622&action=edit
gcc13-pr109008-2.patch

Above mentioned incremental patch.  It does actually 2 things.  One is not
widening to -inf or +inf, but to nextafter value in a hypothetical wider
floating point type with equal mantissa precision but wider exponent range.
And the other is, as the new test in that patch shows, that regardless whether
we do the above optimization or not, with -ffinite-math-only it would still
be miscompiled, as we limit the range to the maximum representable value and so
on the resulting range in this case we get with vanilla trunk
[frange] double
[-1.9958403095347198116563727130368385660674512604354575415e+292 (-0x0.8p+972),
0.0 (0x0.0p+0)] range.  That is incorrect, __DBL_MAX__ +
0x0.fffffffffffff8p+970 when rounding to nearest is still finite (__DBL_MAX__),
and so valid for -ffinite-math-only.

There is another issue.  For !MODE_HAS_INFINITIES (TYPE_MODE (type)) types
I'm afraid this is still broken and significantly more so.  Because the minimum
or maximum representable values in those cases (probably, haven't played with
such machines) act as saturation boundaries (like infinities normally),
WHATEVER_MAX + anything_positive is still WHATEVER_MAX etc.
So, wonder if we e.g. shouldn't just punt in float_binary_op_range_finish
for such modes if lhs range has at least one of the boundaries the maximum
representable one.  Though, maybe it isn't limited to the reverse ops, who
knows...

^ permalink raw reply	[flat|nested] 52+ messages in thread

* [Bug tree-optimization/109008] [13 Regression] Wrong code in scipy package since r13-3926-gd4c2f1d376da6f
  2023-03-03 12:08 [Bug tree-optimization/109008] New: [13 Regression ]Maybe wrong code in scipy package since r13-3926-gd4c2f1d376da6f marxin at gcc dot gnu.org
                   ` (43 preceding siblings ...)
  2023-03-09 11:50 ` jakub at gcc dot gnu.org
@ 2023-03-09 12:34 ` rguenth at gcc dot gnu.org
  2023-03-09 12:56 ` jakub at gcc dot gnu.org
                   ` (5 subsequent siblings)
  50 siblings, 0 replies; 52+ messages in thread
From: rguenth at gcc dot gnu.org @ 2023-03-09 12:34 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109008

--- Comment #44 from Richard Biener <rguenth at gcc dot gnu.org> ---
(In reply to Jakub Jelinek from comment #43)
> Created attachment 54622 [details]
> gcc13-pr109008-2.patch
> 
> Above mentioned incremental patch.  It does actually 2 things.  One is not
> widening to -inf or +inf, but to nextafter value in a hypothetical wider
> floating point type with equal mantissa precision but wider exponent range.
> And the other is, as the new test in that patch shows, that regardless
> whether
> we do the above optimization or not, with -ffinite-math-only it would still
> be miscompiled, as we limit the range to the maximum representable value and
> so
> on the resulting range in this case we get with vanilla trunk
> [frange] double
> [-1.9958403095347198116563727130368385660674512604354575415e+292
> (-0x0.8p+972), 0.0 (0x0.0p+0)] range.  That is incorrect, __DBL_MAX__ +
> 0x0.fffffffffffff8p+970 when rounding to nearest is still finite
> (__DBL_MAX__),
> and so valid for -ffinite-math-only.

Note the widening is supposed to undo a round-to-nearest operation so
if the rounded value is finite then the original unrounded value is
necessarily so as well, no?  But sure, it can be bigger than __DBL_MAX__,
but not by much.

> There is another issue.  For !MODE_HAS_INFINITIES (TYPE_MODE (type)) types
> I'm afraid this is still broken and significantly more so.  Because the
> minimum or maximum representable values in those cases (probably, haven't
> played with such machines) act as saturation boundaries (like infinities
> normally), WHATEVER_MAX + anything_positive is still WHATEVER_MAX etc.
> So, wonder if we e.g. shouldn't just punt in float_binary_op_range_finish
> for such modes if lhs range has at least one of the boundaries the maximum
> representable one.  Though, maybe it isn't limited to the reverse ops, who
> knows...

But if we avoid "rounding" our widened range to the target format then
within real.cc it should be all fine to go beyond __DBL_MAX__ (whether
the mode has infinities or not), no?

As said elsewhere it would be nice to re-compute the forward range op
and see whether the re-computed LHS range covers the original LHS fully
(so we're at least conservative here).

^ permalink raw reply	[flat|nested] 52+ messages in thread

* [Bug tree-optimization/109008] [13 Regression] Wrong code in scipy package since r13-3926-gd4c2f1d376da6f
  2023-03-03 12:08 [Bug tree-optimization/109008] New: [13 Regression ]Maybe wrong code in scipy package since r13-3926-gd4c2f1d376da6f marxin at gcc dot gnu.org
                   ` (44 preceding siblings ...)
  2023-03-09 12:34 ` rguenth at gcc dot gnu.org
@ 2023-03-09 12:56 ` jakub at gcc dot gnu.org
  2023-03-09 13:03 ` jakub at gcc dot gnu.org
                   ` (4 subsequent siblings)
  50 siblings, 0 replies; 52+ messages in thread
From: jakub at gcc dot gnu.org @ 2023-03-09 12:56 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109008

--- Comment #45 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
Created attachment 54623
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=54623&action=edit
gcc13-pr109008-3.patch

Further incremental patch to do the .5ulp widening rather than 1ulp if not
-frounding-math and not for IBM double double.
So far I've tested the + and - 300000 tests, still no errors (so conservatively
correct) and:
      1 0 50
      1 1 11
      1 11 1
      1 1 13
      1 1 16
      1 1 25
      1 1 28
      1 1 3
      1 1 36
      1 1 4
      1 1 40
      1 19 1
      1 39 1
      1 50 0
      2 1 17
      2 1 2
      2 1 32
      2 1 7
      2 2 1
      2 5 1
      3 3 1
      3 4 1
     38 50 1
     56 1 50
     71 1 0
     78 0 1
    160 50 50
  13173 0 0
 130096 -1 -1
 154232 1 1
so significantly better than the 1ulp version, over 50% of cases are 1ulp away
in each direction, over 43% of other cases have really empty range, over 4%
exact and rest is noise.
Let me finish * and / testing.

For the forward op, we'd need to do something like in the #c30 patch, which was
ugly.

And regarding !MODE_HAS_INFINITIES, the min/max values really act there like
infinities, so I think we should just detect that case and use infinities in
the
widened range even when the mode doesn't have them.  But, I'm afraid I don't
know how to test that...

^ permalink raw reply	[flat|nested] 52+ messages in thread

* [Bug tree-optimization/109008] [13 Regression] Wrong code in scipy package since r13-3926-gd4c2f1d376da6f
  2023-03-03 12:08 [Bug tree-optimization/109008] New: [13 Regression ]Maybe wrong code in scipy package since r13-3926-gd4c2f1d376da6f marxin at gcc dot gnu.org
                   ` (45 preceding siblings ...)
  2023-03-09 12:56 ` jakub at gcc dot gnu.org
@ 2023-03-09 13:03 ` jakub at gcc dot gnu.org
  2023-03-09 13:25 ` jakub at gcc dot gnu.org
                   ` (3 subsequent siblings)
  50 siblings, 0 replies; 52+ messages in thread
From: jakub at gcc dot gnu.org @ 2023-03-09 13:03 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109008

--- Comment #46 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
Multiplication/division
    638 1 50
    679 50 1
   8737 50 50
  85436 -1 -1
 202446 1 1
so much better than previously as well and no errors (conservatively correct).

^ permalink raw reply	[flat|nested] 52+ messages in thread

* [Bug tree-optimization/109008] [13 Regression] Wrong code in scipy package since r13-3926-gd4c2f1d376da6f
  2023-03-03 12:08 [Bug tree-optimization/109008] New: [13 Regression ]Maybe wrong code in scipy package since r13-3926-gd4c2f1d376da6f marxin at gcc dot gnu.org
                   ` (46 preceding siblings ...)
  2023-03-09 13:03 ` jakub at gcc dot gnu.org
@ 2023-03-09 13:25 ` jakub at gcc dot gnu.org
  2023-03-10  9:08 ` cvs-commit at gcc dot gnu.org
                   ` (2 subsequent siblings)
  50 siblings, 0 replies; 52+ messages in thread
From: jakub at gcc dot gnu.org @ 2023-03-09 13:25 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109008

--- Comment #47 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
For the !MODE_HAS_INFINITIES, perhaps
--- gcc/range-op-float.cc.jj    2023-03-09 13:12:05.248873234 +0100
+++ gcc/range-op-float.cc       2023-03-09 14:21:51.959802517 +0100
@@ -2216,6 +2216,7 @@ float_widen_lhs_range (tree type, const
     return ret;
   REAL_VALUE_TYPE lb = lhs.lower_bound ();
   REAL_VALUE_TYPE ub = lhs.upper_bound ();
+  bool lb_inf = false, ub_inf = false;
   if (real_isfinite (&lb))
     {
       frange_nextafter (TYPE_MODE (type), lb, dconstninf);
@@ -2238,6 +2239,14 @@ float_widen_lhs_range (tree type, const
          real_arithmetic (&tem, PLUS_EXPR, &lhs.lower_bound (), &lb);
          real_arithmetic (&lb, RDIV_EXPR, &tem, &dconst2);
        }
+      if (!MODE_HAS_INFINITIES (TYPE_MODE (type))
+         && frange_val_is_min (lhs.lower_bound (), type))
+       {
+         /* If type doesn't have infinities, then the minimum acts
+            as kind of infinity.  Turn it into a true infinity.  */
+         real_inf (&lb, true);
+         lb_inf = true;
+       }
     }
   if (real_isfinite (&ub))
     {
@@ -2256,6 +2265,14 @@ float_widen_lhs_range (tree type, const
          real_arithmetic (&tem, PLUS_EXPR, &lhs.upper_bound (), &ub);
          real_arithmetic (&ub, RDIV_EXPR, &tem, &dconst2);
        }
+      if (!MODE_HAS_INFINITIES (TYPE_MODE (type))
+         && frange_val_is_max (lhs.upper_bound (), type))
+       {
+         /* If type doesn't have infinities, then the maximum acts
+            as kind of infinity.  Turn it into a true infinity.  */
+         real_inf (&ub, false);
+         ub_inf = true;
+       }
     }
   /* Temporarily disable -ffinite-math-only, so that frange::set doesn't
      reduce the range back to real_min_representable (type) as lower bound
@@ -2266,6 +2283,15 @@ float_widen_lhs_range (tree type, const
   ret.clear_nan ();
   ret.union_ (lhs);
   flag_finite_math_only = save_flag_finite_math_only;
+  if (lb_inf || ub_inf)
+    {
+      if (ret.kind () == VR_VARYING)
+       ret.m_kind = VR_RANGE;
+      if (lb_inf)
+       ret.m_min = lb;
+      if (ub_inf)
+       ret.m_max = ub;
+    }
   return ret;
 }

except that the members are private and so the above will not compile.  We'd
need some kind of set_unsafe or whatever that would allow to override those
even when it is not canonical.
But given that I'm not sure if even the forward operations result in correct
ranges around these non-infinity "infinities", doing something with it is
perhaps premature.

^ permalink raw reply	[flat|nested] 52+ messages in thread

* [Bug tree-optimization/109008] [13 Regression] Wrong code in scipy package since r13-3926-gd4c2f1d376da6f
  2023-03-03 12:08 [Bug tree-optimization/109008] New: [13 Regression ]Maybe wrong code in scipy package since r13-3926-gd4c2f1d376da6f marxin at gcc dot gnu.org
                   ` (47 preceding siblings ...)
  2023-03-09 13:25 ` jakub at gcc dot gnu.org
@ 2023-03-10  9:08 ` cvs-commit at gcc dot gnu.org
  2023-03-10 11:41 ` cvs-commit at gcc dot gnu.org
  2023-03-22  9:07 ` cvs-commit at gcc dot gnu.org
  50 siblings, 0 replies; 52+ messages in thread
From: cvs-commit at gcc dot gnu.org @ 2023-03-10  9:08 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109008

--- Comment #48 from CVS Commits <cvs-commit at gcc dot gnu.org> ---
The master branch has been updated by Jakub Jelinek <jakub@gcc.gnu.org>:

https://gcc.gnu.org/g:a1d5c729ceeb112af26e3298314a0de3058f1d82

commit r13-6574-ga1d5c729ceeb112af26e3298314a0de3058f1d82
Author: Jakub Jelinek <jakub@redhat.com>
Date:   Fri Mar 10 10:07:51 2023 +0100

    range-op-float: Fix up -ffinite-math-only range extension and don't extend
into infinities [PR109008]

    The following patch does two things (both related to range extension
    around the boundaries).

    The first part (in the 2 real_isfinite blocks) is to make the ranges
    narrower when the old boundaries are minimum and/or maximum representable
    finite number.  In that case frange_nextafter gives -Inf or +Inf,
    but then the resulting computed reverse range is very far from the actually
    needed range, usually extends up to infinity or could even result in NaNs.
    While infinities are really the next representable numbers in the
    corresponding mode, REAL_VALUE_TYPE is actually a type with wider range
    for exponent and 160 bit precision, so the patch instead uses
    nextafter number in a hypothetical floating point format with the same
    mantissa precision but wider range of exponents.  This significantly
    improves the actual ranges of the reverse operations, while still making
    them conservatively correct.

    The second part is a fix for miscompilation of the new testcase below.
    For -ffinite-math-only, without this patch we extend the minimum and/or
    maximum representable finite number to -Inf or +Inf, with the patch to
    some number outside of the normal exponent range of the mode, but then
    we use set which canonicalizes it and turns the boundaries back to
    the minimum and/or maximum representable finite numbers, but because
    in say [__DBL_MAX__, __DBL_MAX__] = op1 + [__DBL_MAX__, __DBL_MAX__]
    op1 can be larger than 0, up to the largest number which rounds to even
    down back to __DBL_MAX__ and there are still no infinities involved,
    it needs to work even with -ffinite-math-only.  So, we really need to
    widen the lhs range a little bit even in that case.  The patch does
    that through temporarily clearing -ffinite-math-only, such that the
    value with infinities or the outside of bounds values passes the
    setting and verification (the VR_VARYING case is needed because
    we get ICEs otherwise, but when lhs is VR_VARYING in -ffast-math,
    i.e. minimum to maximum representable finite and both signs of NaN,
    then set does all we need, we don't need to or in a NaN range).
    We don't really later use the range in a way that would become a problem
    that it is wider than varying, we actually just perform maths on the
    two boundaries.

    As I said in the PR, this doesn't fix the !MODE_HAS_INFINITIES case,
    I believe we actually need to treat the boundary values as infinities
    in that case because they (probably) work like that, but it is unclear
    if it is just the reverse operation lhs widening that is a problem there,
    or whether it is a general problem.  I have zero experience with
    floating points without infinities (PDP11, some ARM half type?,
    what else?).

    2023-03-10  Jakub Jelinek  <jakub@redhat.com>

            PR tree-optimization/109008
            * range-op-float.cc (float_widen_lhs_range): If lb is
            minimum representable finite number or ub is maximum
            representable finite number, instead of widening it to
            -inf or inf widen it to negative or positive 0x0.8p+(EMAX+1).
            Temporarily clear flag_finite_math_only when canonicalizing
            the widened range.

            * gcc.dg/pr109008.c: New test.

^ permalink raw reply	[flat|nested] 52+ messages in thread

* [Bug tree-optimization/109008] [13 Regression] Wrong code in scipy package since r13-3926-gd4c2f1d376da6f
  2023-03-03 12:08 [Bug tree-optimization/109008] New: [13 Regression ]Maybe wrong code in scipy package since r13-3926-gd4c2f1d376da6f marxin at gcc dot gnu.org
                   ` (48 preceding siblings ...)
  2023-03-10  9:08 ` cvs-commit at gcc dot gnu.org
@ 2023-03-10 11:41 ` cvs-commit at gcc dot gnu.org
  2023-03-22  9:07 ` cvs-commit at gcc dot gnu.org
  50 siblings, 0 replies; 52+ messages in thread
From: cvs-commit at gcc dot gnu.org @ 2023-03-10 11:41 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109008

--- Comment #49 from CVS Commits <cvs-commit at gcc dot gnu.org> ---
The master branch has been updated by Jakub Jelinek <jakub@gcc.gnu.org>:

https://gcc.gnu.org/g:44f80a370b76fd1564e280f08d6640d0f641d487

commit r13-6580-g44f80a370b76fd1564e280f08d6640d0f641d487
Author: Jakub Jelinek <jakub@redhat.com>
Date:   Fri Mar 10 12:40:23 2023 +0100

    range-op-float: Extend lhs by 0.5ulp rather than 1ulp if not
-frounding-math [PR109008]

    This patch, incremental to the just posted one, improves the reverse
    operation ranges significantly by widening just by 0.5ulp in each
    direction rather than 1ulp.  Again, REAL_VALUE_TYPE has both wider
    exponent range and wider mantissa precision (160 bits) than any
    supported type, this patch uses the latter property.

    The patch doesn't do it if -frounding-math, because then the rounding
    can be +-1ulp in each direction depending on the rounding mode which
    we don't know, or for IBM double double because that type is just weird
    and we can't trust in sane properties.

    I've performed testing of these 2 patches on 300000 random tests as with
    yesterday's patch, exact numbers are in the PR, but I see very significant
    improvement in the precision of the ranges while keeping it conservatively
    correct.

    2023-03-10  Jakub Jelinek  <jakub@redhat.com>

            PR tree-optimization/109008
            * range-op-float.cc (float_widen_lhs_range): If not
            -frounding-math and not IBM double double format, extend lhs
            range just by 0.5ulp rather than 1ulp in each direction.

^ permalink raw reply	[flat|nested] 52+ messages in thread

* [Bug tree-optimization/109008] [13 Regression] Wrong code in scipy package since r13-3926-gd4c2f1d376da6f
  2023-03-03 12:08 [Bug tree-optimization/109008] New: [13 Regression ]Maybe wrong code in scipy package since r13-3926-gd4c2f1d376da6f marxin at gcc dot gnu.org
                   ` (49 preceding siblings ...)
  2023-03-10 11:41 ` cvs-commit at gcc dot gnu.org
@ 2023-03-22  9:07 ` cvs-commit at gcc dot gnu.org
  50 siblings, 0 replies; 52+ messages in thread
From: cvs-commit at gcc dot gnu.org @ 2023-03-22  9:07 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109008

--- Comment #50 from CVS Commits <cvs-commit at gcc dot gnu.org> ---
The master branch has been updated by Aldy Hernandez <aldyh@gcc.gnu.org>:

https://gcc.gnu.org/g:81295d445ec569c6413fd6f7921e27420167f8fa

commit r13-6797-g81295d445ec569c6413fd6f7921e27420167f8fa
Author: Aldy Hernandez <aldyh@redhat.com>
Date:   Wed Mar 8 10:58:01 2023 +0100

    frange: Implement nan_state class [PR109008]

    This patch implements a nan_state class, that allows us to query or
    pass around the NANness of an frange.  We can store +NAN, -NAN, +-NAN,
    or not-a-NAN with it.

    I tried to touch as little as possible, leaving other cleanups to the
    next release.  For example, we should replace the m_*_nan fields in
    frange with nan_state, and provide relevant accessors to nan_state
    (isnan, etc).

            PR tree-optimization/109008

    gcc/ChangeLog:

            * value-range.cc (frange::set): Add nan_state argument.
            * value-range.h (class nan_state): New.
            (frange::get_nan_state): New.

^ permalink raw reply	[flat|nested] 52+ messages in thread

end of thread, other threads:[~2023-03-22  9:07 UTC | newest]

Thread overview: 52+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2023-03-03 12:08 [Bug tree-optimization/109008] New: [13 Regression ]Maybe wrong code in scipy package since r13-3926-gd4c2f1d376da6f marxin at gcc dot gnu.org
2023-03-03 12:11 ` [Bug tree-optimization/109008] [13 Regression] Maybe " rguenth at gcc dot gnu.org
2023-03-03 12:14 ` rguenth at gcc dot gnu.org
2023-03-03 12:18 ` [Bug tree-optimization/109008] [13 Regression] Wrong " rguenth at gcc dot gnu.org
2023-03-03 12:24 ` rguenth at gcc dot gnu.org
2023-03-03 12:27 ` rguenth at gcc dot gnu.org
2023-03-03 12:31 ` jakub at gcc dot gnu.org
2023-03-03 12:54 ` jakub at gcc dot gnu.org
2023-03-03 13:03 ` rguenth at gcc dot gnu.org
2023-03-03 13:10 ` rguenth at gcc dot gnu.org
2023-03-03 13:14 ` rguenth at gcc dot gnu.org
2023-03-03 13:34 ` jakub at gcc dot gnu.org
2023-03-03 13:50 ` rguenth at gcc dot gnu.org
2023-03-03 14:28 ` jakub at gcc dot gnu.org
2023-03-07 12:06 ` rguenth at gcc dot gnu.org
2023-03-07 12:12 ` jakub at gcc dot gnu.org
2023-03-07 12:20 ` jakub at gcc dot gnu.org
2023-03-07 12:23 ` rguenth at gcc dot gnu.org
2023-03-07 12:25 ` rguenth at gcc dot gnu.org
2023-03-07 13:16 ` jakub at gcc dot gnu.org
2023-03-07 13:18 ` rguenth at gcc dot gnu.org
2023-03-07 13:29 ` rguenth at gcc dot gnu.org
2023-03-07 13:58 ` rguenth at gcc dot gnu.org
2023-03-07 14:10 ` rguenth at gcc dot gnu.org
2023-03-07 18:52 ` jakub at gcc dot gnu.org
2023-03-07 19:39 ` jakub at gcc dot gnu.org
2023-03-07 20:49 ` jakub at gcc dot gnu.org
2023-03-08  9:26 ` aldyh at gcc dot gnu.org
2023-03-08  9:29 ` aldyh at gcc dot gnu.org
2023-03-08  9:36 ` jakub at gcc dot gnu.org
2023-03-08 10:09 ` aldyh at gcc dot gnu.org
2023-03-08 11:29 ` jakub at gcc dot gnu.org
2023-03-08 11:30 ` jakub at gcc dot gnu.org
2023-03-08 14:51 ` jakub at gcc dot gnu.org
2023-03-08 14:52 ` jakub at gcc dot gnu.org
2023-03-08 15:07 ` jakub at gcc dot gnu.org
2023-03-08 15:54 ` jakub at gcc dot gnu.org
2023-03-08 16:37 ` jakub at gcc dot gnu.org
2023-03-08 18:56 ` jakub at gcc dot gnu.org
2023-03-08 20:02 ` jakub at gcc dot gnu.org
2023-03-09  8:52 ` cvs-commit at gcc dot gnu.org
2023-03-09 10:24 ` rguenth at gcc dot gnu.org
2023-03-09 11:06 ` jakub at gcc dot gnu.org
2023-03-09 11:12 ` marxin at gcc dot gnu.org
2023-03-09 11:50 ` jakub at gcc dot gnu.org
2023-03-09 12:34 ` rguenth at gcc dot gnu.org
2023-03-09 12:56 ` jakub at gcc dot gnu.org
2023-03-09 13:03 ` jakub at gcc dot gnu.org
2023-03-09 13:25 ` jakub at gcc dot gnu.org
2023-03-10  9:08 ` cvs-commit at gcc dot gnu.org
2023-03-10 11:41 ` cvs-commit at gcc dot gnu.org
2023-03-22  9:07 ` cvs-commit at gcc dot gnu.org

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).