[Bug tree-optimization/107591] New: range-op{,-float}.cc for x * x

public inbox for gcc-bugs@sourceware.org
help / color / mirror / Atom feed

* [Bug tree-optimization/107591] New: range-op{,-float}.cc for x * x
@ 2022-11-09 15:37 jakub at gcc dot gnu.org
  2022-11-09 15:49 ` [Bug tree-optimization/107591] " aldyh at gcc dot gnu.org
                   ` (19 more replies)
  0 siblings, 20 replies; 21+ messages in thread
From: jakub at gcc dot gnu.org @ 2022-11-09 15:37 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107591

            Bug ID: 107591
           Summary: range-op{,-float}.cc for x * x
           Product: gcc
           Version: 13.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: tree-optimization
          Assignee: unassigned at gcc dot gnu.org
          Reporter: jakub at gcc dot gnu.org
  Target Milestone: ---

int
foo (int x)
{
  if (x < -13 || x > 26)
    return -1;
  return x * x;
}

results in
x_4(D) :      [irange] int [-13, 26]
_5  : [irange] int [-338, 676]
That is unnecessarily pessimized, because it only computes [-13, 26] * [-13,
26]
range without taking into account that both operands are the same.
For the powi (x, 2) case the range is actually [0, 26 * 26], i.e. we shouldn't
do a cross product for it, just compute the -13 * -13, 0 * 0 (if 0 is in the
range) and 26 * 26 products and form from that the range (I admit I haven't
thought about unsigned or wrapping stuff).

On the PR107569 testcase it is on frange:
  _3 = u_2(D)->x;
  _6 = _3 * _3;
  _7 = u_2(D)->y;
  _8 = _7 * _7;
  _9 = _6 + _8;
  if (_9 u>= 0.0)
If we don't know anything further about u_2(D)->x and u_2(D)->y, VARYING *
VARYING is [0, +Inf] +-NAN, added twice is the same (note, unless
-fno-signed-zeros, it should be really [+0, +Inf] +-NAN, not [-0, +Inf] +-NAN,
but it doesn't matter for the u>= 0.0 comparison).
And then we can fold u>= 0.0 to true.

^ permalink raw reply	[flat|nested] 21+ messages in thread

* [Bug tree-optimization/107591] range-op{,-float}.cc for x * x
  2022-11-09 15:37 [Bug tree-optimization/107591] New: range-op{,-float}.cc for x * x jakub at gcc dot gnu.org
@ 2022-11-09 15:49 ` aldyh at gcc dot gnu.org
  2022-11-09 15:55 ` aldyh at gcc dot gnu.org
                   ` (18 subsequent siblings)
  19 siblings, 0 replies; 21+ messages in thread
From: aldyh at gcc dot gnu.org @ 2022-11-09 15:49 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107591

Aldy Hernandez <aldyh at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
   Last reconfirmed|                            |2022-11-09
     Ever confirmed|0                           |1
             Status|UNCONFIRMED                 |NEW
                 CC|                            |aldyh at gcc dot gnu.org,
                   |                            |amacleod at redhat dot com

^ permalink raw reply	[flat|nested] 21+ messages in thread

* [Bug tree-optimization/107591] range-op{,-float}.cc for x * x
  2022-11-09 15:37 [Bug tree-optimization/107591] New: range-op{,-float}.cc for x * x jakub at gcc dot gnu.org
  2022-11-09 15:49 ` [Bug tree-optimization/107591] " aldyh at gcc dot gnu.org
@ 2022-11-09 15:55 ` aldyh at gcc dot gnu.org
  2022-11-09 15:57 ` jakub at gcc dot gnu.org
                   ` (17 subsequent siblings)
  19 siblings, 0 replies; 21+ messages in thread
From: aldyh at gcc dot gnu.org @ 2022-11-09 15:55 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107591

--- Comment #1 from Aldy Hernandez <aldyh at gcc dot gnu.org> ---
pinskia is a god.  How does he keep track of all these bugs, plus the cross
reference between them?  I knew PR91645 was related, but it was specifically
something on my radar, not the 5 trillion bugs in pinskia's head :-).  Kudos!

^ permalink raw reply	[flat|nested] 21+ messages in thread

* [Bug tree-optimization/107591] range-op{,-float}.cc for x * x
  2022-11-09 15:37 [Bug tree-optimization/107591] New: range-op{,-float}.cc for x * x jakub at gcc dot gnu.org
  2022-11-09 15:49 ` [Bug tree-optimization/107591] " aldyh at gcc dot gnu.org
  2022-11-09 15:55 ` aldyh at gcc dot gnu.org
@ 2022-11-09 15:57 ` jakub at gcc dot gnu.org
  2022-11-09 15:59 ` pinskia at gcc dot gnu.org
                   ` (16 subsequent siblings)
  19 siblings, 0 replies; 21+ messages in thread
From: jakub at gcc dot gnu.org @ 2022-11-09 15:57 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107591

--- Comment #2 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
Perhaps before we try to map MULT_EXPR into some irange/frange op the usual
way,
while we still have access to gimple statement check if it is MULT_EXPR with
the same operands and use a different artificial unary op for pow2.
We could do this for PLUS_EXPR x + x as well and handle it as 2 * x if we don't
do that already or if match.pd doesn't already canonicalize x + x that way.

^ permalink raw reply	[flat|nested] 21+ messages in thread

* [Bug tree-optimization/107591] range-op{,-float}.cc for x * x
  2022-11-09 15:37 [Bug tree-optimization/107591] New: range-op{,-float}.cc for x * x jakub at gcc dot gnu.org
                   ` (2 preceding siblings ...)
  2022-11-09 15:57 ` jakub at gcc dot gnu.org
@ 2022-11-09 15:59 ` pinskia at gcc dot gnu.org
  2022-11-09 16:01 ` pinskia at gcc dot gnu.org
                   ` (15 subsequent siblings)
  19 siblings, 0 replies; 21+ messages in thread
From: pinskia at gcc dot gnu.org @ 2022-11-09 15:59 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107591

--- Comment #3 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
(In reply to Aldy Hernandez from comment #1)
> pinskia is a god.  How does he keep track of all these bugs, plus the cross
> reference between them?  I knew PR91645 was related, but it was specifically
> something on my radar, not the 5 trillion bugs in pinskia's head :-).  Kudos!

I remembered seeing a different "x * x" issue and I just did a search. There is
at least one more about "x * x" too but I need to find it still.

^ permalink raw reply	[flat|nested] 21+ messages in thread

* [Bug tree-optimization/107591] range-op{,-float}.cc for x * x
  2022-11-09 15:37 [Bug tree-optimization/107591] New: range-op{,-float}.cc for x * x jakub at gcc dot gnu.org
                   ` (3 preceding siblings ...)
  2022-11-09 15:59 ` pinskia at gcc dot gnu.org
@ 2022-11-09 16:01 ` pinskia at gcc dot gnu.org
  2022-11-09 16:02 ` aldyh at gcc dot gnu.org
                   ` (14 subsequent siblings)
  19 siblings, 0 replies; 21+ messages in thread
From: pinskia at gcc dot gnu.org @ 2022-11-09 16:01 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107591

Andrew Pinski <pinskia at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           See Also|                            |https://gcc.gnu.org/bugzill
                   |                            |a/show_bug.cgi?id=56355

--- Comment #4 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
PR 56355 is related but it is definitely more complex.

^ permalink raw reply	[flat|nested] 21+ messages in thread

* [Bug tree-optimization/107591] range-op{,-float}.cc for x * x
  2022-11-09 15:37 [Bug tree-optimization/107591] New: range-op{,-float}.cc for x * x jakub at gcc dot gnu.org
                   ` (4 preceding siblings ...)
  2022-11-09 16:01 ` pinskia at gcc dot gnu.org
@ 2022-11-09 16:02 ` aldyh at gcc dot gnu.org
  2022-11-09 16:04 ` aldyh at gcc dot gnu.org
                   ` (13 subsequent siblings)
  19 siblings, 0 replies; 21+ messages in thread
From: aldyh at gcc dot gnu.org @ 2022-11-09 16:02 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107591

--- Comment #5 from Aldy Hernandez <aldyh at gcc dot gnu.org> ---
(In reply to Andrew Pinski from comment #3)
> (In reply to Aldy Hernandez from comment #1)
> > pinskia is a god.  How does he keep track of all these bugs, plus the cross
> > reference between them?  I knew PR91645 was related, but it was specifically
> > something on my radar, not the 5 trillion bugs in pinskia's head :-).  Kudos!
> 
> I remembered seeing a different "x * x" issue and I just did a search. There
> is at least one more about "x * x" too but I need to find it still.

I think you're just being humble.  Either that or maybe you're just a bot with
superpowers :-P.

^ permalink raw reply	[flat|nested] 21+ messages in thread

* [Bug tree-optimization/107591] range-op{,-float}.cc for x * x
  2022-11-09 15:37 [Bug tree-optimization/107591] New: range-op{,-float}.cc for x * x jakub at gcc dot gnu.org
                   ` (5 preceding siblings ...)
  2022-11-09 16:02 ` aldyh at gcc dot gnu.org
@ 2022-11-09 16:04 ` aldyh at gcc dot gnu.org
  2022-11-09 16:31 ` jakub at gcc dot gnu.org
                   ` (12 subsequent siblings)
  19 siblings, 0 replies; 21+ messages in thread
From: aldyh at gcc dot gnu.org @ 2022-11-09 16:04 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107591

--- Comment #6 from Aldy Hernandez <aldyh at gcc dot gnu.org> ---
(In reply to Jakub Jelinek from comment #2)
> Perhaps before we try to map MULT_EXPR into some irange/frange op the usual
> way,
> while we still have access to gimple statement check if it is MULT_EXPR with
> the same operands and use a different artificial unary op for pow2.
> We could do this for PLUS_EXPR x + x as well and handle it as 2 * x if we
> don't do that already or if match.pd doesn't already canonicalize x + x that
> way.

FYI, the range operators have a relation field, so it should be able to tell
you that both operands are equal with VREL_EQ (?).  You could use that to
optimize things.

^ permalink raw reply	[flat|nested] 21+ messages in thread

* [Bug tree-optimization/107591] range-op{,-float}.cc for x * x
  2022-11-09 15:37 [Bug tree-optimization/107591] New: range-op{,-float}.cc for x * x jakub at gcc dot gnu.org
                   ` (6 preceding siblings ...)
  2022-11-09 16:04 ` aldyh at gcc dot gnu.org
@ 2022-11-09 16:31 ` jakub at gcc dot gnu.org
  2022-11-09 16:40 ` jakub at gcc dot gnu.org
                   ` (11 subsequent siblings)
  19 siblings, 0 replies; 21+ messages in thread
From: jakub at gcc dot gnu.org @ 2022-11-09 16:31 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107591

--- Comment #7 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
To answer my own question:
int
foo (int x)
{
  return x + x;
}

int
bar (int x)
{
  return x * x * x * x * x * x;
}

float
baz (float x)
{
  return x + x;
}

float
qux (float x)
{
  return x * x * x * x * x * x;
}

float
corge (float x)
{
  return x * x;
}
match.pd or fold-const.cc canonicalizes x + x to x * 2, so no need to special
case that, x * x isn't canonicalized for integral types nor without -ffast-math
for floats either, with -ffast-math reassoc1 canonicalizes the multiplications
into __builtin_powif (x, 6) resp. __builtin_powif (x, 2) until powcabs pass
later on splits those into multiplications again.
So, for frange, if we implement good range op for CASE_CFN_POWI, we could just
use that, perhaps for VREL_EQ on mult.

^ permalink raw reply	[flat|nested] 21+ messages in thread

* [Bug tree-optimization/107591] range-op{,-float}.cc for x * x
  2022-11-09 15:37 [Bug tree-optimization/107591] New: range-op{,-float}.cc for x * x jakub at gcc dot gnu.org
                   ` (7 preceding siblings ...)
  2022-11-09 16:31 ` jakub at gcc dot gnu.org
@ 2022-11-09 16:40 ` jakub at gcc dot gnu.org
  2022-11-09 17:45 ` amacleod at redhat dot com
                   ` (10 subsequent siblings)
  19 siblings, 0 replies; 21+ messages in thread
From: jakub at gcc dot gnu.org @ 2022-11-09 16:40 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107591

--- Comment #8 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
(In reply to Aldy Hernandez from comment #6)
> FYI, the range operators have a relation field, so it should be able to tell
> you that both operands are equal with VREL_EQ (?).  You could use that to
> optimize things.

You're right, at least for irange trio.op1_op2 () in the mult fold_range is
VREL_EQ and we could just go from that.
For frange perhaps it isn't as perfect because of signed zeros, so in
float
foo (float x, float y)
{
  if (x == y)
    return x * y;
  return 42.0f;
}
trio.op1_op2 () could be VREL_EQ even if one is signed and the other unsigned
zero, and in that case -0.0 would need to be in the range too.  But maybe that
is good enough for now.

^ permalink raw reply	[flat|nested] 21+ messages in thread

* [Bug tree-optimization/107591] range-op{,-float}.cc for x * x
  2022-11-09 15:37 [Bug tree-optimization/107591] New: range-op{,-float}.cc for x * x jakub at gcc dot gnu.org
                   ` (8 preceding siblings ...)
  2022-11-09 16:40 ` jakub at gcc dot gnu.org
@ 2022-11-09 17:45 ` amacleod at redhat dot com
  2022-11-09 17:49 ` jakub at gcc dot gnu.org
                   ` (9 subsequent siblings)
  19 siblings, 0 replies; 21+ messages in thread
From: amacleod at redhat dot com @ 2022-11-09 17:45 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107591

--- Comment #9 from Andrew Macleod <amacleod at redhat dot com> ---
(In reply to Jakub Jelinek from comment #8)
> (In reply to Aldy Hernandez from comment #6)
> > FYI, the range operators have a relation field, so it should be able to tell
> > you that both operands are equal with VREL_EQ (?).  You could use that to
> > optimize things.
> 
> You're right, at least for irange trio.op1_op2 () in the mult fold_range is
> VREL_EQ and we could just go from that.
> For frange perhaps it isn't as perfect because of signed zeros, so in
> float
> foo (float x, float y)
> {
>   if (x == y)
>     return x * y;
>   return 42.0f;
> }
> trio.op1_op2 () could be VREL_EQ even if one is signed and the other
> unsigned zero, and in that case -0.0 would need to be in the range too.  But
> maybe that is good enough for now.

you could also test whether op1_range contains + and/or - 0, as well as
op2_range.  VREL_EQ is a symbolic equality.. the ranges can still be distinct
and individually testable to see if you have a +0 and -0..   I guess you could
also test for equality of the ranges.. op1_range == op2_range

^ permalink raw reply	[flat|nested] 21+ messages in thread

* [Bug tree-optimization/107591] range-op{,-float}.cc for x * x
  2022-11-09 15:37 [Bug tree-optimization/107591] New: range-op{,-float}.cc for x * x jakub at gcc dot gnu.org
                   ` (9 preceding siblings ...)
  2022-11-09 17:45 ` amacleod at redhat dot com
@ 2022-11-09 17:49 ` jakub at gcc dot gnu.org
  2022-11-09 18:22 ` amacleod at redhat dot com
                   ` (8 subsequent siblings)
  19 siblings, 0 replies; 21+ messages in thread
From: jakub at gcc dot gnu.org @ 2022-11-09 17:49 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107591

--- Comment #10 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
(In reply to Andrew Macleod from comment #9)
> you could also test whether op1_range contains + and/or - 0, as well as
> op2_range.  VREL_EQ is a symbolic equality.. the ranges can still be
> distinct and individually testable to see if you have a +0 and -0..   I
> guess you could also test for equality of the ranges.. op1_range == op2_range

Equality of the ranges doesn't imply equality of the values.

int
foo (int x, y)
{
  if (x < -13 || x > 26)
    return -1;
  if (y < -13 || y > 26)
    return -1;
  return x * y;
}

In the above testcase, both x and y have [-13, 26] ranges, but here the range
[-338, 676] is correct, while [0, 676] is right for the #c0 testcase.

^ permalink raw reply	[flat|nested] 21+ messages in thread

* [Bug tree-optimization/107591] range-op{,-float}.cc for x * x
  2022-11-09 15:37 [Bug tree-optimization/107591] New: range-op{,-float}.cc for x * x jakub at gcc dot gnu.org
                   ` (10 preceding siblings ...)
  2022-11-09 17:49 ` jakub at gcc dot gnu.org
@ 2022-11-09 18:22 ` amacleod at redhat dot com
  2022-11-09 18:35 ` jakub at gcc dot gnu.org
                   ` (7 subsequent siblings)
  19 siblings, 0 replies; 21+ messages in thread
From: amacleod at redhat dot com @ 2022-11-09 18:22 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107591

--- Comment #11 from Andrew Macleod <amacleod at redhat dot com> ---
(In reply to Jakub Jelinek from comment #10)
> (In reply to Andrew Macleod from comment #9)
> > you could also test whether op1_range contains + and/or - 0, as well as
> > op2_range.  VREL_EQ is a symbolic equality.. the ranges can still be
> > distinct and individually testable to see if you have a +0 and -0..   I
> > guess you could also test for equality of the ranges.. op1_range == op2_range
> 
> Equality of the ranges doesn't imply equality of the values.
> 
> int
> foo (int x, y)
> {
>   if (x < -13 || x > 26)
>     return -1;
>   if (y < -13 || y > 26)
>     return -1;
>   return x * y;
> }
> 
> In the above testcase, both x and y have [-13, 26] ranges, but here the
> range [-338, 676] is correct, while [0, 676] is right for the #c0 testcase.

no, I meant in addition to the VREL_EQ.  so 
  if (rel == VREL_EQ && op1_range != op2_range)
     then you know you have something like  if (x == y) z=x*y and may have to
check for various signed zero cobinations in each range, 
whereas is op1_range == op2_range in this case, it should be perfectly safe..

^ permalink raw reply	[flat|nested] 21+ messages in thread

* [Bug tree-optimization/107591] range-op{,-float}.cc for x * x
  2022-11-09 15:37 [Bug tree-optimization/107591] New: range-op{,-float}.cc for x * x jakub at gcc dot gnu.org
                   ` (11 preceding siblings ...)
  2022-11-09 18:22 ` amacleod at redhat dot com
@ 2022-11-09 18:35 ` jakub at gcc dot gnu.org
  2022-11-09 19:20 ` amacleod at redhat dot com
                   ` (6 subsequent siblings)
  19 siblings, 0 replies; 21+ messages in thread
From: jakub at gcc dot gnu.org @ 2022-11-09 18:35 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107591

--- Comment #12 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
(In reply to Andrew Macleod from comment #11)
> no, I meant in addition to the VREL_EQ.  so 
>   if (rel == VREL_EQ && op1_range != op2_range)
>      then you know you have something like  if (x == y) z=x*y and may have
> to check for various signed zero cobinations in each range, 
> whereas is op1_range == op2_range in this case, it should be perfectly safe..

I don't understand.  Let's modify the testcase to:
float
foo (float x, y)
{
  if (x < -13.f || x > 26.f)
    return -1.0f;
  if (y < -13.f || y > 26.f)
    return -1.0f;
  if (x == y)
    return x * y;
  return 42.0f;
}
I'd expect that fold_range on the x * y should see trio.op1_op2 () == VREL_EQ
because of the guarding x == y condition.  And the ranges of both are [-13.f,
26.f] +-NAN too.  Still, x could be -0.0f and y 0.0f or vice versa, and so
x * y could be -0.0f, so we need [-0.f, 676.f] +-NAN.  While if it is x * x, we
know the result will have always sign bit of 0 (except if NAN), so [0.f, 676.f]
+-NAN.

^ permalink raw reply	[flat|nested] 21+ messages in thread

* [Bug tree-optimization/107591] range-op{,-float}.cc for x * x
  2022-11-09 15:37 [Bug tree-optimization/107591] New: range-op{,-float}.cc for x * x jakub at gcc dot gnu.org
                   ` (12 preceding siblings ...)
  2022-11-09 18:35 ` jakub at gcc dot gnu.org
@ 2022-11-09 19:20 ` amacleod at redhat dot com
  2022-11-09 20:24 ` jakub at gcc dot gnu.org
                   ` (5 subsequent siblings)
  19 siblings, 0 replies; 21+ messages in thread
From: amacleod at redhat dot com @ 2022-11-09 19:20 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107591

--- Comment #13 from Andrew Macleod <amacleod at redhat dot com> ---
(In reply to Jakub Jelinek from comment #12)
> (In reply to Andrew Macleod from comment #11)
> > no, I meant in addition to the VREL_EQ.  so 
> >   if (rel == VREL_EQ && op1_range != op2_range)
> >      then you know you have something like  if (x == y) z=x*y and may have
> > to check for various signed zero cobinations in each range, 
> > whereas is op1_range == op2_range in this case, it should be perfectly safe..
> 
> I don't understand.  Let's modify the testcase to:

> I'd expect that fold_range on the x * y should see trio.op1_op2 () == VREL_EQ
> because of the guarding x == y condition.  And the ranges of both are
> [-13.f, 26.f] +-NAN too.  Still, x could be -0.0f and y 0.0f or vice versa,
> and so
> x * y could be -0.0f, so we need [-0.f, 676.f] +-NAN.  While if it is x * x,
> we know the result will have always sign bit of 0 (except if NAN), so [0.f,
> 676.f] +-NAN.

gah. Clearly it is I who does not understand. -0.0 and +0.0 interactions remind
me of all the signed single bit crud we (ie Aldy) went thru.  The same team
must have come up with both concepts :-P  At least they deserve the same award.

Move along and ignore me.

^ permalink raw reply	[flat|nested] 21+ messages in thread

* [Bug tree-optimization/107591] range-op{,-float}.cc for x * x
  2022-11-09 15:37 [Bug tree-optimization/107591] New: range-op{,-float}.cc for x * x jakub at gcc dot gnu.org
                   ` (13 preceding siblings ...)
  2022-11-09 19:20 ` amacleod at redhat dot com
@ 2022-11-09 20:24 ` jakub at gcc dot gnu.org
  2022-11-09 20:45 ` jakub at gcc dot gnu.org
                   ` (4 subsequent siblings)
  19 siblings, 0 replies; 21+ messages in thread
From: jakub at gcc dot gnu.org @ 2022-11-09 20:24 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107591

--- Comment #14 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
Incremental patch on top of the
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107569#c18 patch which optimizes
the floating point x * x:

--- gcc/range-op-float.cc.jj    2022-11-09 19:06:11.075716000 +0100
+++ gcc/range-op-float.cc       2022-11-09 21:11:52.468256045 +0100
@@ -51,7 +51,7 @@ along with GCC; see the file COPYING3.
 bool
 range_operator_float::fold_range (frange &r, tree type,
                                  const frange &op1, const frange &op2,
-                                 relation_trio) const
+                                 relation_trio trio) const
 {
   if (empty_range_varying (r, type, op1, op2))
     return true;
@@ -65,7 +65,7 @@ range_operator_float::fold_range (frange
   bool maybe_nan;
   rv_fold (lb, ub, maybe_nan, type,
           op1.lower_bound (), op1.upper_bound (),
-          op2.lower_bound (), op2.upper_bound ());
+          op2.lower_bound (), op2.upper_bound (), trio);

   // Handle possible NANs by saturating to the appropriate INF if only
   // one end is a NAN.  If both ends are a NAN, just return a NAN.
@@ -103,8 +103,8 @@ range_operator_float::rv_fold (REAL_VALU
                               const REAL_VALUE_TYPE &lh_lb ATTRIBUTE_UNUSED,
                               const REAL_VALUE_TYPE &lh_ub ATTRIBUTE_UNUSED,
                               const REAL_VALUE_TYPE &rh_lb ATTRIBUTE_UNUSED,
-                              const REAL_VALUE_TYPE &rh_ub ATTRIBUTE_UNUSED)
-  const
+                              const REAL_VALUE_TYPE &rh_ub ATTRIBUTE_UNUSED,
+                              relation_trio) const
 {
   lb = dconstninf;
   ub = dconstinf;
@@ -1868,7 +1868,8 @@ class foperator_plus : public range_oper
                const REAL_VALUE_TYPE &lh_lb,
                const REAL_VALUE_TYPE &lh_ub,
                const REAL_VALUE_TYPE &rh_lb,
-               const REAL_VALUE_TYPE &rh_ub) const final override
+               const REAL_VALUE_TYPE &rh_ub,
+               relation_trio) const final override
   {
     frange_arithmetic (PLUS_EXPR, type, lb, lh_lb, rh_lb, dconstninf);
     frange_arithmetic (PLUS_EXPR, type, ub, lh_ub, rh_ub, dconstinf);
@@ -1892,7 +1893,8 @@ class foperator_minus : public range_ope
                const REAL_VALUE_TYPE &lh_lb,
                const REAL_VALUE_TYPE &lh_ub,
                const REAL_VALUE_TYPE &rh_lb,
-               const REAL_VALUE_TYPE &rh_ub) const final override
+               const REAL_VALUE_TYPE &rh_ub,
+               relation_trio) const final override
   {
     frange_arithmetic (MINUS_EXPR, type, lb, lh_lb, rh_ub, dconstninf);
     frange_arithmetic (MINUS_EXPR, type, ub, lh_ub, rh_lb, dconstinf);
@@ -1910,7 +1912,7 @@ class foperator_minus : public range_ope

 /* Wrapper around frange_arithmetics, that computes the result
    if inexact rounded to both directions.  Also, if one of the
-   operands is +-0.0 and another +-inf, return +-0.0 rather than
+   operands is +-0.0 and another +-INF, return +-0.0 rather than
    NAN.  */

 static void
@@ -1945,13 +1947,42 @@ class foperator_mult : public range_oper
                const REAL_VALUE_TYPE &lh_lb,
                const REAL_VALUE_TYPE &lh_ub,
                const REAL_VALUE_TYPE &rh_lb,
-               const REAL_VALUE_TYPE &rh_ub) const final override
+               const REAL_VALUE_TYPE &rh_ub,
+               relation_trio trio) const final override
   {
     REAL_VALUE_TYPE cp[8];
+    bool is_square
+      = (trio.op1_op2 () == VREL_EQ
+        && real_equal (&lh_lb, &rh_lb)
+        && real_equal (&lh_ub, &rh_ub)
+        && real_isneg (&lh_lb) == real_isneg (&rh_lb)
+        && real_isneg (&lh_ub) == real_isneg (&rh_ub));
     // Do a cross-product.
     frange_mult (type, cp[0], cp[4], lh_lb, rh_lb);
-    frange_mult (type, cp[1], cp[5], lh_lb, rh_ub);
-    frange_mult (type, cp[2], cp[6], lh_ub, rh_lb);
+    if (is_square)
+      {
+       // For x * x we can just do max (lh_lb * lh_lb, lh_ub * lh_ub)
+       // as maximum and -0.0 as minimum if 0.0 is in the range,
+       // otherwise min (lh_lb * lh_lb, lh_ub * lh_ub).
+       // -0.0 rather than 0.0 because VREL_EQ doesn't prove that
+       // x and y are bitwise equal, just that they compare equal.
+       if (real_compare (LE_EXPR, &lh_lb, &dconst0)
+           && real_compare (GE_EXPR, &lh_ub, &dconst0))
+         {
+           cp[1] = dconst0;
+           real_value_negate (&cp[1]);
+         }
+       else
+         cp[1] = cp[0];
+       cp[2] = cp[0];
+       cp[5] = cp[4];
+       cp[6] = cp[4];
+      }
+    else
+      {
+       frange_mult (type, cp[1], cp[5], lh_lb, rh_ub);
+       frange_mult (type, cp[2], cp[6], lh_ub, rh_lb);
+      }
     frange_mult (type, cp[3], cp[7], lh_ub, rh_ub);
     for (int i = 1; i < 3; ++i)
       {
@@ -1965,18 +1996,27 @@ class foperator_mult : public range_oper
     lb = cp[0];
     ub = cp[4];

-    // [+-0, +-0] * [+INF,+INF] (or [-INF,-INF] or swapped is a known NaN.
-    if ((real_iszero (&lh_lb) && real_iszero (&lh_ub)
-        && real_isinf (&rh_lb) && real_isinf (&rh_ub, real_isneg (&rh_lb)))
-       || (real_iszero (&rh_lb) && real_iszero (&rh_ub)
-           && real_isinf (&lh_lb) && real_isinf (&lh_ub, real_isneg
(&lh_lb))))
+    // If both operands are the same, then we know it can be +-0.0, or +-INF,
+    // but not both at the same time, so it will never be invalid unless
+    // operand was already NAN.
+    if (is_square)
+      maybe_nan = false;
+    // [+-0, +-0] * [+INF,+INF] (or [-INF,-INF] or swapped is a known NAN.
+    else if ((real_iszero (&lh_lb)
+             && real_iszero (&lh_ub)
+             && real_isinf (&rh_lb)
+             && real_isinf (&rh_ub, real_isneg (&rh_lb)))
+            || (real_iszero (&rh_lb)
+                && real_iszero (&rh_ub)
+                && real_isinf (&lh_lb)
+                && real_isinf (&lh_ub, real_isneg (&lh_lb))))
       {
        real_nan (&lb, NULL, 0, TYPE_MODE (type));
        ub = lb;
        maybe_nan = true;
       }
     // Otherwise, if one range includes zero and the other ends with +-INF,
-    // it is a maybe NaN.
+    // it is a maybe NAN.
     else if (real_compare (LE_EXPR, &lh_lb, &dconst0)
             && real_compare (GE_EXPR, &lh_ub, &dconst0)
             && (real_isinf (&rh_lb) || real_isinf (&rh_ub)))
--- gcc/range-op.h.jj   2022-11-09 11:22:42.867624633 +0100
+++ gcc/range-op.h      2022-11-09 20:20:02.266964633 +0100
@@ -123,7 +123,8 @@ public:
                        const REAL_VALUE_TYPE &lh_lb,
                        const REAL_VALUE_TYPE &lh_ub,
                        const REAL_VALUE_TYPE &rh_lb,
-                       const REAL_VALUE_TYPE &rh_ub) const;
+                       const REAL_VALUE_TYPE &rh_ub,
+                       relation_trio) const;
   // Unary operations have the range of the LHS as op2.
   virtual bool fold_range (irange &r, tree type,
                           const frange &lh,

We determine the right range (I think), but then it helps just to optimize away
the call to sqrt function, not the actual comparison (bet for a fear that a
sNaN 
could appear there).
If frange also tracked maybe sNaN (and cleared it say on all binops or unops
other than the operations that might not trigger exception/quiet it), perhaps
we could optimize that.  Or say without frange help just by assuming that say
result of a binary floating point operation can't ever be a sNaN.

^ permalink raw reply	[flat|nested] 21+ messages in thread

* [Bug tree-optimization/107591] range-op{,-float}.cc for x * x
  2022-11-09 15:37 [Bug tree-optimization/107591] New: range-op{,-float}.cc for x * x jakub at gcc dot gnu.org
                   ` (14 preceding siblings ...)
  2022-11-09 20:24 ` jakub at gcc dot gnu.org
@ 2022-11-09 20:45 ` jakub at gcc dot gnu.org
  2022-11-09 22:25 ` amacleod at redhat dot com
                   ` (3 subsequent siblings)
  19 siblings, 0 replies; 21+ messages in thread
From: jakub at gcc dot gnu.org @ 2022-11-09 20:45 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107591

--- Comment #15 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
I've screwed up the real_value_negate calls, they need to assign the result.

Anyway, yet another option would be for cdce to ask the ranger if the sqrt
argument can be smaller than -0.0 (and only if it can, emit the comparison).
Then we don't need to deal with removing the comparison again.
Of course, such removals might be still useful for user code.

^ permalink raw reply	[flat|nested] 21+ messages in thread

* [Bug tree-optimization/107591] range-op{,-float}.cc for x * x
  2022-11-09 15:37 [Bug tree-optimization/107591] New: range-op{,-float}.cc for x * x jakub at gcc dot gnu.org
                   ` (15 preceding siblings ...)
  2022-11-09 20:45 ` jakub at gcc dot gnu.org
@ 2022-11-09 22:25 ` amacleod at redhat dot com
  2022-11-09 23:27 ` jakub at gcc dot gnu.org
                   ` (2 subsequent siblings)
  19 siblings, 0 replies; 21+ messages in thread
From: amacleod at redhat dot com @ 2022-11-09 22:25 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107591

--- Comment #16 from Andrew Macleod <amacleod at redhat dot com> ---
(In reply to Jakub Jelinek from comment #15)
> I've screwed up the real_value_negate calls, they need to assign the result.
> 
> Anyway, yet another option would be for cdce to ask the ranger if the sqrt
> argument can be smaller than -0.0 (and only if it can, emit the comparison).
> Then we don't need to deal with removing the comparison again.
> Of course, such removals might be still useful for user code.

even checking the global range from cdce at this point should show that its >=
-0.0...

^ permalink raw reply	[flat|nested] 21+ messages in thread

* [Bug tree-optimization/107591] range-op{,-float}.cc for x * x
  2022-11-09 15:37 [Bug tree-optimization/107591] New: range-op{,-float}.cc for x * x jakub at gcc dot gnu.org
                   ` (16 preceding siblings ...)
  2022-11-09 22:25 ` amacleod at redhat dot com
@ 2022-11-09 23:27 ` jakub at gcc dot gnu.org
  2022-11-12  8:42 ` cvs-commit at gcc dot gnu.org
  2023-04-05  7:14 ` pilarlatiesa at gmail dot com
  19 siblings, 0 replies; 21+ messages in thread
From: jakub at gcc dot gnu.org @ 2022-11-09 23:27 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107591

--- Comment #17 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
(In reply to Andrew Macleod from comment #16)
> (In reply to Jakub Jelinek from comment #15)
> > I've screwed up the real_value_negate calls, they need to assign the result.
> > 
> > Anyway, yet another option would be for cdce to ask the ranger if the sqrt
> > argument can be smaller than -0.0 (and only if it can, emit the comparison).
> > Then we don't need to deal with removing the comparison again.
> > Of course, such removals might be still useful for user code.
> 
> even checking the global range from cdce at this point should show that its
> >= -0.0...

Without larger changes in tree-call-cdce.cc, perhaps we could change
gen_one_condition to check the global or even non-global frange (but in the
latter case the question is at what stmt) and if the range indicates that tcode
of arg against (float_type) lbub (aka lbub_real_cst) is always true or always
false, simply emit an always true or always false condition (though the domain
ranges aren't exact but rounded up/down to integers and I think even there is
sometimes an off by one conservatively, so probably only always true
conditions)
rather than actual arg comparison.  Even better would be not to emit any and/or
propagate a stmt to it, but am not sure about either (the code treats nconds ==
0 differently and there are separators in between sometimes; and for stmt, it
uses some code for better placement of a check, but not sure if say a range
could be true on the sqrt etc. call and then not on the placement of the
condition which it picks).
In any case, not much familiar with the range_of_expr etc. APIs.

^ permalink raw reply	[flat|nested] 21+ messages in thread

* [Bug tree-optimization/107591] range-op{,-float}.cc for x * x
  2022-11-09 15:37 [Bug tree-optimization/107591] New: range-op{,-float}.cc for x * x jakub at gcc dot gnu.org
                   ` (17 preceding siblings ...)
  2022-11-09 23:27 ` jakub at gcc dot gnu.org
@ 2022-11-12  8:42 ` cvs-commit at gcc dot gnu.org
  2023-04-05  7:14 ` pilarlatiesa at gmail dot com
  19 siblings, 0 replies; 21+ messages in thread
From: cvs-commit at gcc dot gnu.org @ 2022-11-12  8:42 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107591

--- Comment #18 from CVS Commits <cvs-commit at gcc dot gnu.org> ---
The master branch has been updated by Jakub Jelinek <jakub@gcc.gnu.org>:

https://gcc.gnu.org/g:2f7f9edd28d75a85a33599978f23811e679e443d

commit r13-3923-g2f7f9edd28d75a85a33599978f23811e679e443d
Author: Jakub Jelinek <jakub@redhat.com>
Date:   Sat Nov 12 09:33:01 2022 +0100

    range-op: Implement floating point multiplication fold_range [PR107569]

    The following patch implements frange multiplication, including the
    special case of x * x.  The callers don't tell us that it is x * x,
    just that it is either z = x * x or if (x == y) z = x * y;
    For irange that makes no difference, but for frange it can mean
    x is -0.0 and y is 0.0 if they have the same range that includes both
    signed and unsigned zeros, so we need to assume result could be -0.0.

    The patch causes one regression:
    +FAIL: gcc.dg/fold-overflow-1.c scan-assembler-times 2139095040 2
    but that is already tracked in PR107608 and affects not just the newly
    added multiplication, but addition and other floating point operations
    (and doesn't seem like a ranger bug but dce or whatever else).

    2022-11-12  Jakub Jelinek  <jakub@redhat.com>

            PR tree-optimization/107569
            PR tree-optimization/107591
            * range-op.h (range_operator_float::rv_fold): Add relation_kind
            argument.
            * range-op-float.cc (range_operator_float::fold_range): Name
            last argument trio and pass trio.op1_op2 () as last argument to
            rv_fold.
            (range_operator_float::rv_fold): Add relation_kind argument.
            (foperator_plus::rv_fold, foperator_minus::rv_fold): Likewise.
            (foperator_mult): New class.
            (floating_op_table::floating_op_table): Use foperator_mult for
            MULT_EXPR.

^ permalink raw reply	[flat|nested] 21+ messages in thread

* [Bug tree-optimization/107591] range-op{,-float}.cc for x * x
  2022-11-09 15:37 [Bug tree-optimization/107591] New: range-op{,-float}.cc for x * x jakub at gcc dot gnu.org
                   ` (18 preceding siblings ...)
  2022-11-12  8:42 ` cvs-commit at gcc dot gnu.org
@ 2023-04-05  7:14 ` pilarlatiesa at gmail dot com
  19 siblings, 0 replies; 21+ messages in thread
From: pilarlatiesa at gmail dot com @ 2023-04-05  7:14 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107591

--- Comment #19 from Pilar Latiesa <pilarlatiesa at gmail dot com> ---
The testcase:

    #include <cmath>

    struct TVec { double x, y, z; };

    double dot(TVec const &u, TVec const &v)
      { return u.x * v.x + u.y * v.y + u.y * v.y + u.z * v.z; }

    double mag(TVec const &u)
      { return std::sqrt(dot(u, u)); }

with current trunk is compiled as:

        vmovsd  8(%rdi), %xmm0
        vmovsd  (%rdi), %xmm1
        vmulsd  %xmm0, %xmm0, %xmm0
        vmovsd  %xmm1, %xmm1, %xmm2
        vfmadd132sd     %xmm1, %xmm0, %xmm2
        vmovsd  16(%rdi), %xmm1
        vaddsd  %xmm2, %xmm0, %xmm0
        vfmadd132sd     %xmm1, %xmm0, %xmm1
        vsqrtsd %xmm1, %xmm1, %xmm0
        ret

That's excellent. Thanks

^ permalink raw reply	[flat|nested] 21+ messages in thread

end of thread, other threads:[~2023-04-05  7:14 UTC | newest]

Thread overview: 21+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-11-09 15:37 [Bug tree-optimization/107591] New: range-op{,-float}.cc for x * x jakub at gcc dot gnu.org
2022-11-09 15:49 ` [Bug tree-optimization/107591] " aldyh at gcc dot gnu.org
2022-11-09 15:55 ` aldyh at gcc dot gnu.org
2022-11-09 15:57 ` jakub at gcc dot gnu.org
2022-11-09 15:59 ` pinskia at gcc dot gnu.org
2022-11-09 16:01 ` pinskia at gcc dot gnu.org
2022-11-09 16:02 ` aldyh at gcc dot gnu.org
2022-11-09 16:04 ` aldyh at gcc dot gnu.org
2022-11-09 16:31 ` jakub at gcc dot gnu.org
2022-11-09 16:40 ` jakub at gcc dot gnu.org
2022-11-09 17:45 ` amacleod at redhat dot com
2022-11-09 17:49 ` jakub at gcc dot gnu.org
2022-11-09 18:22 ` amacleod at redhat dot com
2022-11-09 18:35 ` jakub at gcc dot gnu.org
2022-11-09 19:20 ` amacleod at redhat dot com
2022-11-09 20:24 ` jakub at gcc dot gnu.org
2022-11-09 20:45 ` jakub at gcc dot gnu.org
2022-11-09 22:25 ` amacleod at redhat dot com
2022-11-09 23:27 ` jakub at gcc dot gnu.org
2022-11-12  8:42 ` cvs-commit at gcc dot gnu.org
2023-04-05  7:14 ` pilarlatiesa at gmail dot com

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).