public inbox for gcc-bugs@sourceware.org
help / color / mirror / Atom feed
* [Bug tree-optimization/96542] New: Failure to optimize simple code to a constant when storing part of the operation in a variable
@ 2020-08-09 22:10 gabravier at gmail dot com
  2020-08-10  9:49 ` [Bug tree-optimization/96542] " jakub at gcc dot gnu.org
                   ` (8 more replies)
  0 siblings, 9 replies; 10+ messages in thread
From: gabravier at gmail dot com @ 2020-08-09 22:10 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96542

            Bug ID: 96542
           Summary: Failure to optimize simple code to a constant when
                    storing part of the operation in a variable
           Product: gcc
           Version: 11.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: tree-optimization
          Assignee: unassigned at gcc dot gnu.org
          Reporter: gabravier at gmail dot com
  Target Milestone: ---

uint8_t f(uint32_t x)
{
    bool tmp = x;
    return (0xFF >> tmp) * 2;
}

This can be optimized to `return -2;`. This transformation is done by LLVM, but
not by GCC. Strangely, this only seems to happen if `(bool)x` is stored in an
intermediate variable, otherwise the transformation is done by GCC.

^ permalink raw reply	[flat|nested] 10+ messages in thread

* [Bug tree-optimization/96542] Failure to optimize simple code to a constant when storing part of the operation in a variable
  2020-08-09 22:10 [Bug tree-optimization/96542] New: Failure to optimize simple code to a constant when storing part of the operation in a variable gabravier at gmail dot com
@ 2020-08-10  9:49 ` jakub at gcc dot gnu.org
  2020-08-10 10:33 ` glisse at gcc dot gnu.org
                   ` (7 subsequent siblings)
  8 siblings, 0 replies; 10+ messages in thread
From: jakub at gcc dot gnu.org @ 2020-08-10  9:49 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96542

Jakub Jelinek <jakub at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
     Ever confirmed|0                           |1
                 CC|                            |jakub at gcc dot gnu.org
             Status|UNCONFIRMED                 |NEW
   Last reconfirmed|                            |2020-08-10

--- Comment #1 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
unsigned char
foo (unsigned int x)
{
  _Bool y = x;
  return (((unsigned char) ~0) >> y) * 2;
}

unsigned char
bar (unsigned int x)
{
  return (((unsigned char) ~0) >> (_Bool) x) * 2;
}

In bar, this is optimized, because fold_binary_op_with_conditional_arg
optimizes
255 >> (x ? 1 : 0) into x ? 127 : 255 and when multiplied by two in unsigned
char this results in x ? 254 : 254.
We don't have anything comparable in match.pd yet I believe (and should we?).
Or shall say VRP try harder if it sees [0, 1] ranges?
Though, shouldn't we optimize e.g.
unsigned
baz (unsigned int x)
{
  if (x >= 4) return 32;
  return (-1U >> x) * 16;
}
too to return x >= 4 ? 32U : -16U; ?
Not sure where and how to generalize it though.
Value range of -1U >> [0, 3] is not really useful here, nonzero bits either.
And having a specialized (const1 >> x) * const2 optimizer based on x's value
range would work, but not sure if it has a real-world benefit.

^ permalink raw reply	[flat|nested] 10+ messages in thread

* [Bug tree-optimization/96542] Failure to optimize simple code to a constant when storing part of the operation in a variable
  2020-08-09 22:10 [Bug tree-optimization/96542] New: Failure to optimize simple code to a constant when storing part of the operation in a variable gabravier at gmail dot com
  2020-08-10  9:49 ` [Bug tree-optimization/96542] " jakub at gcc dot gnu.org
@ 2020-08-10 10:33 ` glisse at gcc dot gnu.org
  2020-08-10 10:47 ` jakub at gcc dot gnu.org
                   ` (6 subsequent siblings)
  8 siblings, 0 replies; 10+ messages in thread
From: glisse at gcc dot gnu.org @ 2020-08-10 10:33 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96542

--- Comment #2 from Marc Glisse <glisse at gcc dot gnu.org> ---
(In reply to Jakub Jelinek from comment #1)
> In bar, this is optimized, because fold_binary_op_with_conditional_arg
> optimizes
> 255 >> (x ? 1 : 0) into x ? 127 : 255 and when multiplied by two in unsigned
> char this results in x ? 254 : 254.
> We don't have anything comparable in match.pd yet I believe (and should we?).

We have something for VEC_COND_EXPR to fold a op (b?c:d), but not for
COND_EXPR, which you would be unlikely to see in gimple (and the generator of
phiopt transforms from match.pd patterns hasn't appeared yet). Also, we only
have x!=0, and while fold_binary_op_with_conditional_arg tries to handle it
like x!=0?1:0, we indeed don't do anything like that for gimple. And it seems
possibly better suited to forward propagation than backward like match.pd.

> Or shall say VRP try harder if it sees [0, 1] ranges?

If a range has only 2 (or some other small number) values, try propagating each
and see if some variables end up with the same value in both cases? Or if
enough simplifications occur that it is worth introducing a conditional? I am
not sure it would be worth the trouble.

> Though, shouldn't we optimize e.g.
> unsigned
> baz (unsigned int x)
> {
>   if (x >= 4) return 32;
>   return (-1U >> x) * 16;
> }
> too to return x >= 4 ? 32U : -16U; ?
> Not sure where and how to generalize it though.
> Value range of -1U >> [0, 3] is not really useful here, nonzero bits either.
> And having a specialized (const1 >> x) * const2 optimizer based on x's value
> range would work, but not sure if it has a real-world benefit.

And here this is complicated by the fact that we do not narrow the operation,
so it is less obvious that the constant is -1.

^ permalink raw reply	[flat|nested] 10+ messages in thread

* [Bug tree-optimization/96542] Failure to optimize simple code to a constant when storing part of the operation in a variable
  2020-08-09 22:10 [Bug tree-optimization/96542] New: Failure to optimize simple code to a constant when storing part of the operation in a variable gabravier at gmail dot com
  2020-08-10  9:49 ` [Bug tree-optimization/96542] " jakub at gcc dot gnu.org
  2020-08-10 10:33 ` glisse at gcc dot gnu.org
@ 2020-08-10 10:47 ` jakub at gcc dot gnu.org
  2020-08-10 10:48 ` jakub at gcc dot gnu.org
                   ` (5 subsequent siblings)
  8 siblings, 0 replies; 10+ messages in thread
From: jakub at gcc dot gnu.org @ 2020-08-10 10:47 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96542

Jakub Jelinek <jakub at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |aldyh at gcc dot gnu.org,
                   |                            |amacleod at redhat dot com

--- Comment #3 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
(In reply to Marc Glisse from comment #2)
> > Or shall say VRP try harder if it sees [0, 1] ranges?
> 
> If a range has only 2 (or some other small number) values, try propagating
> each and see if some variables end up with the same value in both cases?

Maybe.  The question is if it should be done in forwprop, or say vrp or in the
ranger code (does that one do forward propagation)?
In any case, it should be limited to ranges with small number of values
(perhaps decided with a param) and also bound on how many immediate use stmts
it attempts to propagate it through.

^ permalink raw reply	[flat|nested] 10+ messages in thread

* [Bug tree-optimization/96542] Failure to optimize simple code to a constant when storing part of the operation in a variable
  2020-08-09 22:10 [Bug tree-optimization/96542] New: Failure to optimize simple code to a constant when storing part of the operation in a variable gabravier at gmail dot com
                   ` (2 preceding siblings ...)
  2020-08-10 10:47 ` jakub at gcc dot gnu.org
@ 2020-08-10 10:48 ` jakub at gcc dot gnu.org
  2020-08-10 14:18 ` amacleod at redhat dot com
                   ` (4 subsequent siblings)
  8 siblings, 0 replies; 10+ messages in thread
From: jakub at gcc dot gnu.org @ 2020-08-10 10:48 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96542

--- Comment #4 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
As for COND_EXPR, if we do it that way, it should be rather keyed on a range
with only two possible values in the range.

^ permalink raw reply	[flat|nested] 10+ messages in thread

* [Bug tree-optimization/96542] Failure to optimize simple code to a constant when storing part of the operation in a variable
  2020-08-09 22:10 [Bug tree-optimization/96542] New: Failure to optimize simple code to a constant when storing part of the operation in a variable gabravier at gmail dot com
                   ` (3 preceding siblings ...)
  2020-08-10 10:48 ` jakub at gcc dot gnu.org
@ 2020-08-10 14:18 ` amacleod at redhat dot com
  2020-08-10 14:53 ` amacleod at redhat dot com
                   ` (3 subsequent siblings)
  8 siblings, 0 replies; 10+ messages in thread
From: amacleod at redhat dot com @ 2020-08-10 14:18 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96542

--- Comment #5 from Andrew Macleod <amacleod at redhat dot com> ---
I think this all goes away when multi-range is enabled.

The original testcase produces:
=========== BB 2 ============
x_4(D)  unsigned int VARYING
    <bb 2> :
    tmp_5 = x_4(D) != 0;
    _1 = (int) tmp_5;
    _2 = 255 >> _1;
    _3 = (unsigned char) _2;
    _6 = _3 * 2;
    return _6;

_1 : int [0, 1]
_2 : int [127, 255]
_3 : unsigned char [127, +INF]


A number of the range-ops routines are simply inherited from the single range
codebase of value_range, and are not multi-range enabled yet.

In particular, rshift here.   with a tweak to operator_rshift, we can take

255 >> [0,1] and instead calculate 
_2 = [127,127][255,255]
which would make the results:

_1 : int [0, 1]
_2 : int [127, 127][255, 255]
_3 : unsigned char [127, 127] [255, 255]

Then when _6 is calculated as [127, 127][255, 255] * 2 , range-ops will
calculate the result to be [254, 254]


The whole thing will just fold away to return 254  (or -2 if you want to sign
it :-)

All we need to do is get the multi-range code enabled.... it's coming.

^ permalink raw reply	[flat|nested] 10+ messages in thread

* [Bug tree-optimization/96542] Failure to optimize simple code to a constant when storing part of the operation in a variable
  2020-08-09 22:10 [Bug tree-optimization/96542] New: Failure to optimize simple code to a constant when storing part of the operation in a variable gabravier at gmail dot com
                   ` (4 preceding siblings ...)
  2020-08-10 14:18 ` amacleod at redhat dot com
@ 2020-08-10 14:53 ` amacleod at redhat dot com
  2021-07-17  2:31 ` cvs-commit at gcc dot gnu.org
                   ` (2 subsequent siblings)
  8 siblings, 0 replies; 10+ messages in thread
From: amacleod at redhat dot com @ 2020-08-10 14:53 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96542

--- Comment #6 from Andrew Macleod <amacleod at redhat dot com> ---
Likewise, for

unsigned
baz (unsigned int x)
{
  if (x >= 4) return 32;
  return (-1U >> x) * 16;
}


=========== BB 2 ============
x_3(D)  unsigned int VARYING
_4      UNDEFINED
    <bb 2> :
    if (x_3(D) > 3)
      goto <bb 3>; [INV]
    else
      goto <bb 4>; [INV]

2->3  (T) x_3(D) :      unsigned int [4, +INF]
2->4  (F) x_3(D) :      unsigned int [0, 3]

=========== BB 3 ============
_4      UNDEFINED
    <bb 3> :
    // predicted unlikely by early return (on trees) predictor.
    goto <bb 5>; [INV]


=========== BB 4 ============
x_3(D)  unsigned int [0, 3]
    <bb 4> :
    _1 = 4294967295 >> x_3(D);
    _4 = _1 * 16;
_1 : unsigned int [536870911, +INF]

0xFFFFFFFF >> [0,3] is currently producing [0x2FFFFFFF, 0xFFFFFFFF]

again making operator_rshift capable of generating multi-ranges,

0xFFFFFFFF >> [0,3] should produce
[0x2FFFFFFF, 0x2FFFFFFF][0x4FFFFFFF, 0x4FFFFFFF][0x8FFFFFFF,
0x8FFFFFFF][0xFFFFFFFF, 0xFFFFFFFF] 

and then _4 = _1 *16 should automatically produce: [0xFFFFFFF0, 0xFFFFFFF0]
I think multiply does that today, if its given the proper input.




=========== BB 5 ============
_4      unsigned int VARYING
    <bb 5> :
    # _2 = PHI <32(3), _4(4)>
    return _2;

And then this PHI will have the constant propagated and become:
# _2 = PHI <32(3), 0xFFFFFFF0(4)>
return _2

ANd given that, presumably phi-opt or maybe even VRPs simplfication will turn
that into the desired conditional once the constant is calculated.

^ permalink raw reply	[flat|nested] 10+ messages in thread

* [Bug tree-optimization/96542] Failure to optimize simple code to a constant when storing part of the operation in a variable
  2020-08-09 22:10 [Bug tree-optimization/96542] New: Failure to optimize simple code to a constant when storing part of the operation in a variable gabravier at gmail dot com
                   ` (5 preceding siblings ...)
  2020-08-10 14:53 ` amacleod at redhat dot com
@ 2021-07-17  2:31 ` cvs-commit at gcc dot gnu.org
  2021-07-17  2:34 ` amacleod at redhat dot com
  2021-08-06 19:42 ` pinskia at gcc dot gnu.org
  8 siblings, 0 replies; 10+ messages in thread
From: cvs-commit at gcc dot gnu.org @ 2021-07-17  2:31 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96542

--- Comment #7 from CVS Commits <cvs-commit at gcc dot gnu.org> ---
The master branch has been updated by Andrew Macleod <amacleod@gcc.gnu.org>:

https://gcc.gnu.org/g:704e8a825c78b9a8424c291509413bbb48e602c7

commit r12-2381-g704e8a825c78b9a8424c291509413bbb48e602c7
Author: Andrew MacLeod <amacleod@redhat.com>
Date:   Fri Jul 16 11:42:14 2021 -0400

    Add wi_fold_in_parts.

    range-ops uses wi_fold to individually fold subranges one at a time and
    then combined them.  This patch first calls wi_fold_in_parts which checks
if
    one of the subranges is small, and if so, further splits that subrange
    into constants.

            gcc/
            PR tree-optimization/96542
            * range-op.cc (range_operator::wi_fold_in_parts): New.
            (range_operator::fold_range): Call wi_fold_in_parts.
            (operator_lshift::wi_fold): Fix broken lshift by [0,0].
            * range-op.h (wi_fold_in_parts): Add prototype.

            gcc/testsuite
            * gcc.dg/pr96542.c: New.

^ permalink raw reply	[flat|nested] 10+ messages in thread

* [Bug tree-optimization/96542] Failure to optimize simple code to a constant when storing part of the operation in a variable
  2020-08-09 22:10 [Bug tree-optimization/96542] New: Failure to optimize simple code to a constant when storing part of the operation in a variable gabravier at gmail dot com
                   ` (6 preceding siblings ...)
  2021-07-17  2:31 ` cvs-commit at gcc dot gnu.org
@ 2021-07-17  2:34 ` amacleod at redhat dot com
  2021-08-06 19:42 ` pinskia at gcc dot gnu.org
  8 siblings, 0 replies; 10+ messages in thread
From: amacleod at redhat dot com @ 2021-07-17  2:34 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96542

Andrew Macleod <amacleod at redhat dot com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |RESOLVED
         Resolution|---                         |FIXED

--- Comment #8 from Andrew Macleod <amacleod at redhat dot com> ---

Range-ops uses wi_fold (implemented by each opcode)  to individually 
fold subranges one at a time and then combines them. This patch first 
calls wi_fold_in_parts which checks if one of the subranges is small, 
and if so, further splits that subrange into constants.

Currently, if a subrange is 4 or less values, then we call it 
individually for each of the 4 values. 4 was chosen as a reasonable 
tradeoff between excess work vs benefit.  

this gets all 3 cases now.
fixed

^ permalink raw reply	[flat|nested] 10+ messages in thread

* [Bug tree-optimization/96542] Failure to optimize simple code to a constant when storing part of the operation in a variable
  2020-08-09 22:10 [Bug tree-optimization/96542] New: Failure to optimize simple code to a constant when storing part of the operation in a variable gabravier at gmail dot com
                   ` (7 preceding siblings ...)
  2021-07-17  2:34 ` amacleod at redhat dot com
@ 2021-08-06 19:42 ` pinskia at gcc dot gnu.org
  8 siblings, 0 replies; 10+ messages in thread
From: pinskia at gcc dot gnu.org @ 2021-08-06 19:42 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96542

Andrew Pinski <pinskia at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
   Target Milestone|---                         |12.0

^ permalink raw reply	[flat|nested] 10+ messages in thread

end of thread, other threads:[~2021-08-06 19:42 UTC | newest]

Thread overview: 10+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-08-09 22:10 [Bug tree-optimization/96542] New: Failure to optimize simple code to a constant when storing part of the operation in a variable gabravier at gmail dot com
2020-08-10  9:49 ` [Bug tree-optimization/96542] " jakub at gcc dot gnu.org
2020-08-10 10:33 ` glisse at gcc dot gnu.org
2020-08-10 10:47 ` jakub at gcc dot gnu.org
2020-08-10 10:48 ` jakub at gcc dot gnu.org
2020-08-10 14:18 ` amacleod at redhat dot com
2020-08-10 14:53 ` amacleod at redhat dot com
2021-07-17  2:31 ` cvs-commit at gcc dot gnu.org
2021-07-17  2:34 ` amacleod at redhat dot com
2021-08-06 19:42 ` pinskia at gcc dot gnu.org

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).