public inbox for gcc-bugs@sourceware.org
help / color / mirror / Atom feed
* [Bug tree-optimization/114331] New: Missed optimization: indicate knownbits from dominating condition switch(trunc(a))
@ 2024-03-14  5:50 xxs_chy at outlook dot com
  2024-03-14  8:36 ` [Bug tree-optimization/114331] " rguenth at gcc dot gnu.org
                   ` (14 more replies)
  0 siblings, 15 replies; 16+ messages in thread
From: xxs_chy at outlook dot com @ 2024-03-14  5:50 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114331

            Bug ID: 114331
           Summary: Missed optimization: indicate knownbits from
                    dominating condition switch(trunc(a))
           Product: gcc
           Version: 14.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: tree-optimization
          Assignee: unassigned at gcc dot gnu.org
          Reporter: xxs_chy at outlook dot com
  Target Milestone: ---

Godbolt link: https://godbolt.org/z/dso53ndTo
For code like:

int src(int num) {
    switch((short)num){
        case 111:
          return num & 0xfffe;
        case 267:
        case 204:
        case 263:
          return 0;
        default:
          dummy();
          return 0;
    }
}

"num & 0xfffe" can be folded to "110". But both LLVM and GCC fail to fold it.

^ permalink raw reply	[flat|nested] 16+ messages in thread

* [Bug tree-optimization/114331] Missed optimization: indicate knownbits from dominating condition switch(trunc(a))
  2024-03-14  5:50 [Bug tree-optimization/114331] New: Missed optimization: indicate knownbits from dominating condition switch(trunc(a)) xxs_chy at outlook dot com
@ 2024-03-14  8:36 ` rguenth at gcc dot gnu.org
  2024-03-14  8:44 ` jakub at gcc dot gnu.org
                   ` (13 subsequent siblings)
  14 siblings, 0 replies; 16+ messages in thread
From: rguenth at gcc dot gnu.org @ 2024-03-14  8:36 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114331

Richard Biener <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
     Ever confirmed|0                           |1
   Last reconfirmed|                            |2024-03-14
             Blocks|                            |85316
           Keywords|                            |missed-optimization
             Status|UNCONFIRMED                 |NEW

--- Comment #1 from Richard Biener <rguenth at gcc dot gnu.org> ---
ISTR we somewhat ignore switches in range analysis.


Referenced Bugs:

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85316
[Bug 85316] [meta-bug] VRP range propagation missed cases

^ permalink raw reply	[flat|nested] 16+ messages in thread

* [Bug tree-optimization/114331] Missed optimization: indicate knownbits from dominating condition switch(trunc(a))
  2024-03-14  5:50 [Bug tree-optimization/114331] New: Missed optimization: indicate knownbits from dominating condition switch(trunc(a)) xxs_chy at outlook dot com
  2024-03-14  8:36 ` [Bug tree-optimization/114331] " rguenth at gcc dot gnu.org
@ 2024-03-14  8:44 ` jakub at gcc dot gnu.org
  2024-03-14 12:45 ` amacleod at redhat dot com
                   ` (12 subsequent siblings)
  14 siblings, 0 replies; 16+ messages in thread
From: jakub at gcc dot gnu.org @ 2024-03-14  8:44 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114331

--- Comment #2 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
I thought we don't, if the testcase starts with
void dummy (void);
short int foo (void);
int src(void) {
    short int num = foo ();
    switch(num){
instead then evrp1 optimizes away the BIT_AND_EXPR.
Though, if the ranger just has a known zero mask in addition to normal ranges,
maybe it can't handle it, because the range of num with (short)num == 111 is
hard to express
(it would be a union of 65536 ranges).
Maybe bit-ccp should be able to handle that though, this case is exactly that
upper 16 bits of the value are uknown and lower 16 bits of the value are all
known,
some of them 0, some of them 1.
But maybe bit-ccp doesn't handle switches.

^ permalink raw reply	[flat|nested] 16+ messages in thread

* [Bug tree-optimization/114331] Missed optimization: indicate knownbits from dominating condition switch(trunc(a))
  2024-03-14  5:50 [Bug tree-optimization/114331] New: Missed optimization: indicate knownbits from dominating condition switch(trunc(a)) xxs_chy at outlook dot com
  2024-03-14  8:36 ` [Bug tree-optimization/114331] " rguenth at gcc dot gnu.org
  2024-03-14  8:44 ` jakub at gcc dot gnu.org
@ 2024-03-14 12:45 ` amacleod at redhat dot com
  2024-03-14 13:21 ` jakub at gcc dot gnu.org
                   ` (11 subsequent siblings)
  14 siblings, 0 replies; 16+ messages in thread
From: amacleod at redhat dot com @ 2024-03-14 12:45 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114331

--- Comment #3 from Andrew Macleod <amacleod at redhat dot com> ---
(In reply to Jakub Jelinek from comment #2)
> I thought we don't, if the testcase starts with
> void dummy (void);
> short int foo (void);
> int src(void) {
>     short int num = foo ();
>     switch(num){
> instead then evrp1 optimizes away the BIT_AND_EXPR.
> Though, if the ranger just has a known zero mask in addition to normal
> ranges, maybe it can't handle it, because the range of num with (short)num
> == 111 is hard to express
> (it would be a union of 65536 ranges).
> Maybe bit-ccp should be able to handle that though, this case is exactly that
> upper 16 bits of the value are uknown and lower 16 bits of the value are all
> known,
> some of them 0, some of them 1.
> But maybe bit-ccp doesn't handle switches.

We certainly do fully handle switches.

The problem is the case is on a cast and then  we use num in its fullsize in
the expression.  So its all the other bits:

num_5(D)        [irange] int [-INF, -65537][-65425, -65425][111, 111][65536,
+INF]
    <bb 3> :
  <L0>:
    _8 = num_5(D) & 65534;
    goto <bb 5>; [INV]
_8 : [irange] int [0, 65534] MASK 0xfffe VALUE 0x0


Given it starts with:
    _1 = (short int) num_5(D);
    _2 = (int) _1;
    switch (_1) <default: <L4> [INV], case 111: <L0> [INV], case 204: <L6>
[INV], case 263: <L6> [INV], case 267: <L6> [INV]>

Its difficult to recognize that the known bits of 'num_5' match each switch
case.

^ permalink raw reply	[flat|nested] 16+ messages in thread

* [Bug tree-optimization/114331] Missed optimization: indicate knownbits from dominating condition switch(trunc(a))
  2024-03-14  5:50 [Bug tree-optimization/114331] New: Missed optimization: indicate knownbits from dominating condition switch(trunc(a)) xxs_chy at outlook dot com
                   ` (2 preceding siblings ...)
  2024-03-14 12:45 ` amacleod at redhat dot com
@ 2024-03-14 13:21 ` jakub at gcc dot gnu.org
  2024-03-14 16:33 ` amacleod at redhat dot com
                   ` (10 subsequent siblings)
  14 siblings, 0 replies; 16+ messages in thread
From: jakub at gcc dot gnu.org @ 2024-03-14 13:21 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114331

--- Comment #4 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
Actually, looking at value-range.h, irange_bitmask doesn't have just the mask
but also value, so I wonder why it isn't able to figure this out.
I'd think m_mask should be ~0xffff and value 111.

^ permalink raw reply	[flat|nested] 16+ messages in thread

* [Bug tree-optimization/114331] Missed optimization: indicate knownbits from dominating condition switch(trunc(a))
  2024-03-14  5:50 [Bug tree-optimization/114331] New: Missed optimization: indicate knownbits from dominating condition switch(trunc(a)) xxs_chy at outlook dot com
                   ` (3 preceding siblings ...)
  2024-03-14 13:21 ` jakub at gcc dot gnu.org
@ 2024-03-14 16:33 ` amacleod at redhat dot com
  2024-03-14 16:57 ` aldyh at gcc dot gnu.org
                   ` (9 subsequent siblings)
  14 siblings, 0 replies; 16+ messages in thread
From: amacleod at redhat dot com @ 2024-03-14 16:33 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114331

--- Comment #5 from Andrew Macleod <amacleod at redhat dot com> ---
(In reply to Jakub Jelinek from comment #4)
> Actually, looking at value-range.h, irange_bitmask doesn't have just the
> mask but also value, so I wonder why it isn't able to figure this out.
> I'd think m_mask should be ~0xffff and value 111.

    _1 = (short int) num_5(D);
    _2 = (int) _1;
    switch (_1)

globally we know 
_2 : [irange] int [-32768, 32767]
and coming into bb 3:
_2 : [irange] int [-32768, 32767]
2->3      _1 :  [irange] short int [111, 111]
2->3      _2 :  [irange] int [111, 111]
2->3      num_5(D) :    [irange] int [-INF, -65537][-65425, -65425][111,
111][65536, +INF]

I guess what you are looking for is to add known bits to the range produced for
num_5 on the outgoing edge.

This would have to be done in operator_cast::op1_range  where we are reducing
it to known range of the switch.  I do not believe it makes any attempts to set
bitmasks based on casts like that currently.

so, GORI working backwards has _1 = [irange] short int [111, 111] , so it would
be satisfying the expression:
  short int [111, 111] = (short int) num_5(D);

when evaluating num_5 in operator_cast::op1_range, in the truncating_cast_p
section we could also see if there is a bitmask pattern for the LHS that could
be applied to the range..  I think that basically what you are asking for.  
Simple enough for a constant I suppose for starters.   Or maybe Aldy has a
routine that picks bitmasks and values from ranges somewhere?

^ permalink raw reply	[flat|nested] 16+ messages in thread

* [Bug tree-optimization/114331] Missed optimization: indicate knownbits from dominating condition switch(trunc(a))
  2024-03-14  5:50 [Bug tree-optimization/114331] New: Missed optimization: indicate knownbits from dominating condition switch(trunc(a)) xxs_chy at outlook dot com
                   ` (4 preceding siblings ...)
  2024-03-14 16:33 ` amacleod at redhat dot com
@ 2024-03-14 16:57 ` aldyh at gcc dot gnu.org
  2024-03-14 17:08 ` jakub at gcc dot gnu.org
                   ` (8 subsequent siblings)
  14 siblings, 0 replies; 16+ messages in thread
From: aldyh at gcc dot gnu.org @ 2024-03-14 16:57 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114331

--- Comment #6 from Aldy Hernandez <aldyh at gcc dot gnu.org> ---
(In reply to Andrew Macleod from comment #5)
> (In reply to Jakub Jelinek from comment #4)
> > Actually, looking at value-range.h, irange_bitmask doesn't have just the
> > mask but also value, so I wonder why it isn't able to figure this out.
> > I'd think m_mask should be ~0xffff and value 111.
> 
>     _1 = (short int) num_5(D);
>     _2 = (int) _1;
>     switch (_1)
> 
> globally we know 
> _2 : [irange] int [-32768, 32767]
> and coming into bb 3:
> _2 : [irange] int [-32768, 32767]
> 2->3      _1 :  [irange] short int [111, 111]
> 2->3      _2 :  [irange] int [111, 111]
> 2->3      num_5(D) :    [irange] int [-INF, -65537][-65425, -65425][111,
> 111][65536, +INF]
> 
> I guess what you are looking for is to add known bits to the range produced
> for num_5 on the outgoing edge.
> 
> This would have to be done in operator_cast::op1_range  where we are
> reducing it to known range of the switch.  I do not believe it makes any
> attempts to set bitmasks based on casts like that currently.
> 
> so, GORI working backwards has _1 = [irange] short int [111, 111] , so it
> would be satisfying the expression:
>   short int [111, 111] = (short int) num_5(D);
> 
> when evaluating num_5 in operator_cast::op1_range, in the truncating_cast_p
> section we could also see if there is a bitmask pattern for the LHS that
> could be applied to the range..  I think that basically what you are asking
> for.   Simple enough for a constant I suppose for starters.   Or maybe Aldy
> has a routine that picks bitmasks and values from ranges somewhere?

You may want to look at:

// Return the bitmask inherent in the range.

irange_bitmask
irange::get_bitmask_from_range () const
{
}

IIRC, this code was originally authored by Jakub and was embedded in the
tree-ssanames.cc code setting nonzero bits every time we set a global range.  I
just abstracted it and adapted it to work within our framework.

^ permalink raw reply	[flat|nested] 16+ messages in thread

* [Bug tree-optimization/114331] Missed optimization: indicate knownbits from dominating condition switch(trunc(a))
  2024-03-14  5:50 [Bug tree-optimization/114331] New: Missed optimization: indicate knownbits from dominating condition switch(trunc(a)) xxs_chy at outlook dot com
                   ` (5 preceding siblings ...)
  2024-03-14 16:57 ` aldyh at gcc dot gnu.org
@ 2024-03-14 17:08 ` jakub at gcc dot gnu.org
  2024-03-14 17:28 ` amacleod at redhat dot com
                   ` (7 subsequent siblings)
  14 siblings, 0 replies; 16+ messages in thread
From: jakub at gcc dot gnu.org @ 2024-03-14 17:08 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114331

--- Comment #7 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
(In reply to Aldy Hernandez from comment #6)
> You may want to look at:
> 
> // Return the bitmask inherent in the range.
> 
> irange_bitmask
> irange::get_bitmask_from_range () const
> {
> }
> 
> IIRC, this code was originally authored by Jakub and was embedded in the
> tree-ssanames.cc code setting nonzero bits every time we set a global range.
> I just abstracted it and adapted it to work within our framework.

I'm afraid on this testcase trying to derive bitmask from range is not going to
work if the range is that
[irange] int [-INF, -65537][-65425, -65425][111, 111][65536, +INF]
We really want to record that because of the bb we know that the low 16 bits
are equal to 111 and the upper 16 bits are unknown.
Because the range could help there only if we had a union of 65536 subranges
here.
tree-ssanames.cc before contained just a single mask, so could only track known
zero bits vs. unknown bits rather than known zero bits vs. known one bits vs.
unknown bits.
While MASK 0xffff0000 VALUE 0x6f can represent what supposedly the BIT_AND_EXPR
optimization needs (whether we actually use it then to optimize it away is
another question).

^ permalink raw reply	[flat|nested] 16+ messages in thread

* [Bug tree-optimization/114331] Missed optimization: indicate knownbits from dominating condition switch(trunc(a))
  2024-03-14  5:50 [Bug tree-optimization/114331] New: Missed optimization: indicate knownbits from dominating condition switch(trunc(a)) xxs_chy at outlook dot com
                   ` (6 preceding siblings ...)
  2024-03-14 17:08 ` jakub at gcc dot gnu.org
@ 2024-03-14 17:28 ` amacleod at redhat dot com
  2024-03-14 17:39 ` pinskia at gcc dot gnu.org
                   ` (6 subsequent siblings)
  14 siblings, 0 replies; 16+ messages in thread
From: amacleod at redhat dot com @ 2024-03-14 17:28 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114331

--- Comment #8 from Andrew Macleod <amacleod at redhat dot com> ---
(In reply to Jakub Jelinek from comment #7)
> (In reply to Aldy Hernandez from comment #6)
> > You may want to look at:
> > 
> > // Return the bitmask inherent in the range.
> > 
> > irange_bitmask
> > irange::get_bitmask_from_range () const
> > {
> > }
> > 
> > IIRC, this code was originally authored by Jakub and was embedded in the
> > tree-ssanames.cc code setting nonzero bits every time we set a global range.
> > I just abstracted it and adapted it to work within our framework.
> 
> I'm afraid on this testcase trying to derive bitmask from range is not going
> to work if the range is that
> [irange] int [-INF, -65537][-65425, -65425][111, 111][65536, +INF]
> We really want to record that because of the bb we know that the low 16 bits
> are equal to 111 and the upper 16 bits are unknown.
> Because the range could help there only if we had a union of 65536 subranges
> here.

Isnt that exactly what the known bits is suppose to do?  op1_range already
knows [111,111] is the lower 15 bits, but it doesnt add the bitmask.  We
shoulod also be able to set the known mask for the lower 16 bits as well?

> tree-ssanames.cc before contained just a single mask, so could only track
> known zero bits vs. unknown bits rather than known zero bits vs. known one
> bits vs. unknown bits.
> While MASK 0xffff0000 VALUE 0x6f can represent what supposedly the
> BIT_AND_EXPR optimization needs (whether we actually use it then to optimize
> it away is another question).


then when we see 
    _8 = num_5(D) & 65534;
the operator_bitwise_and::fold_range should pick up the known bits from the
num_5 range, recognize its the only relevant part of the mask, and get it right
shouldn't it? 
It can't do it without the mask for sure.     It possible that
operator_bitwise_and::fold_range needs some tweaking to recognize the known
bits in op1 and map it to the mask in op2, but maybe we alreayd do that. I
havent looked at that code either, but it seems like something it should be
doing.


So instead of 
_8 : [irange] int [0, 65534] MASK 0xfffe VALUE 0x0

we'd get [110,110] for _8

^ permalink raw reply	[flat|nested] 16+ messages in thread

* [Bug tree-optimization/114331] Missed optimization: indicate knownbits from dominating condition switch(trunc(a))
  2024-03-14  5:50 [Bug tree-optimization/114331] New: Missed optimization: indicate knownbits from dominating condition switch(trunc(a)) xxs_chy at outlook dot com
                   ` (7 preceding siblings ...)
  2024-03-14 17:28 ` amacleod at redhat dot com
@ 2024-03-14 17:39 ` pinskia at gcc dot gnu.org
  2024-03-14 17:58 ` jakub at gcc dot gnu.org
                   ` (5 subsequent siblings)
  14 siblings, 0 replies; 16+ messages in thread
From: pinskia at gcc dot gnu.org @ 2024-03-14 17:39 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114331

Andrew Pinski <pinskia at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           See Also|                            |https://gcc.gnu.org/bugzill
                   |                            |a/show_bug.cgi?id=110405

--- Comment #9 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
I suspect PR 110405 is very similar.

^ permalink raw reply	[flat|nested] 16+ messages in thread

* [Bug tree-optimization/114331] Missed optimization: indicate knownbits from dominating condition switch(trunc(a))
  2024-03-14  5:50 [Bug tree-optimization/114331] New: Missed optimization: indicate knownbits from dominating condition switch(trunc(a)) xxs_chy at outlook dot com
                   ` (8 preceding siblings ...)
  2024-03-14 17:39 ` pinskia at gcc dot gnu.org
@ 2024-03-14 17:58 ` jakub at gcc dot gnu.org
  2024-03-14 18:29 ` amacleod at redhat dot com
                   ` (4 subsequent siblings)
  14 siblings, 0 replies; 16+ messages in thread
From: jakub at gcc dot gnu.org @ 2024-03-14 17:58 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114331

--- Comment #10 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
I really don't know how GORI etc. works.
But, if when the switch handling determines that _1 (the switch controlling
expression) has [irange] [111, 111] MASK 0x0 VALUE 0x6f (does it actually? i.e.
for a singleton range all the bits here are known and equal to the value), then
when trying to derive a range for related num_9(D) which is int rather than
_1's short and
  _1 = (short int) num_5(D);
for the MASK/VALUE we should just use the same VALUE and or in 0xffff0000 into
MASK because we then don't know anything about the upper bits.
Though, looking at the evrp detailed dump there is
2->3      _1 :  [irange] short int [111, 111]
2->3      _2 :  [irange] int [111, 111]
2->3      num_5(D) :    [irange] int [-INF, -65537][-65425, -65425][111,
111][65536, +INF]
and so no MASK/VALUE for any of those ranges.
Now, from comments it seems that irange_bitmask is only computed on demand to
speed things up, unless it has been explicitly set.
Now, say for _1 or _2 above, we don't have anything recorded but we can always
compute it on demand from the value range.  But when adding the num_5(D) range
based on the related _1 range, the on-demand irange_bitmask is no longer as
precise as it would be if we when deriving that [-INF, -65537][-65425,
-65425][111, 111][65536, +INF] range
from the [111, 111] range also derived from the in that case on-demand asked
MASK 0x0 VALUE 0x6f to MASK 0xffff0000 VALUE 0x6f.

^ permalink raw reply	[flat|nested] 16+ messages in thread

* [Bug tree-optimization/114331] Missed optimization: indicate knownbits from dominating condition switch(trunc(a))
  2024-03-14  5:50 [Bug tree-optimization/114331] New: Missed optimization: indicate knownbits from dominating condition switch(trunc(a)) xxs_chy at outlook dot com
                   ` (9 preceding siblings ...)
  2024-03-14 17:58 ` jakub at gcc dot gnu.org
@ 2024-03-14 18:29 ` amacleod at redhat dot com
  2024-03-14 18:41 ` jakub at gcc dot gnu.org
                   ` (3 subsequent siblings)
  14 siblings, 0 replies; 16+ messages in thread
From: amacleod at redhat dot com @ 2024-03-14 18:29 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114331

--- Comment #11 from Andrew Macleod <amacleod at redhat dot com> ---
(In reply to Jakub Jelinek from comment #10)
> I really don't know how GORI etc. works.
> But, if when the switch handling determines that _1 (the switch controlling
> expression) has [irange] [111, 111] MASK 0x0 VALUE 0x6f (does it actually?
> i.e. for a singleton range all the bits here are known and equal to the
> value), then when trying to derive a range for related num_9(D) which is int
> rather than _1's short and
>   _1 = (short int) num_5(D);
> for the MASK/VALUE we should just use the same VALUE and or in 0xffff0000
> into MASK because we then don't know anything about the upper bits.
> Though, looking at the evrp detailed dump there is
> 2->3      _1 :  [irange] short int [111, 111]
> 2->3      _2 :  [irange] int [111, 111]
> 2->3      num_5(D) :    [irange] int [-INF, -65537][-65425, -65425][111,
> 111][65536, +INF]
> and so no MASK/VALUE for any of those ranges.

Right. No mask needed for _1 and _2 as the range fully represents the known
bits, and operator_cast::op1_range hasn't been taught to add a bitmask when
calculating num_5 yet.  It could have as mask and value dded to it because its
implied by the result of the cast being short int [111, 111]   The routine Aldy
provided should create that mask when asked I think.



> Now, from comments it seems that irange_bitmask is only computed on demand
> to speed things up, unless it has been explicitly set.
> Now, say for _1 or _2 above, we don't have anything recorded but we can
> always compute it on demand from the value range.  But when adding the

Which I think we are both on the same page so far.

> num_5(D) range based on the related _1 range, the on-demand irange_bitmask
> is no longer as precise as it would be if we when deriving that [-INF,
> -65537][-65425, -65425][111, 111][65536, +INF] range

We aren't deriving it from that range tho.  we are solving for num_5 the
equation
[111,111] = (short int) num_5

The range you list is the best we can currently produce with *just* ranges.  if
we also add that bitmask along with the range (generated from the range on the
LHS,  we can adjust that to what you specify

> from the [111, 111] range also derived from the in that case on-demand asked
> MASK 0x0 VALUE 0x6f to MASK 0xffff0000 VALUE 0x6f.

Right, so the LHS 16 bits produce MASK 0x0000 VALUE 0x6f, which means those
bits should apply tot he RHS as well.  Since we're extending that to 32 bits,
we'd have to make the upper ones unknown, so we should be able to create

num_5 : [rainge] int [-INF, -65537][-65425, -65425][111, 111][65536, +INF] MASK
0xffff0000 VALUE 0x6f

And then in bb 3 when we see
  _8 = num_5(D) & 65534;

instead of producing 
_8 : [irange] int [0, 65534] MASK 0xfffe VALUE 0x0

operator_bitwise_and::fold_range ought to be able to combine the known bits and
come up with
_8 : [irange] int [0, 65534] MASK 0x0000 VALUE 0x6e,

which if Aldy's bitmask code is working right (:-) should turn into 
_8 : [irange] int [110, 110]

It should be fairly straightforward if operator_cast::op1_range creates the
mask for the range it produces.

^ permalink raw reply	[flat|nested] 16+ messages in thread

* [Bug tree-optimization/114331] Missed optimization: indicate knownbits from dominating condition switch(trunc(a))
  2024-03-14  5:50 [Bug tree-optimization/114331] New: Missed optimization: indicate knownbits from dominating condition switch(trunc(a)) xxs_chy at outlook dot com
                   ` (10 preceding siblings ...)
  2024-03-14 18:29 ` amacleod at redhat dot com
@ 2024-03-14 18:41 ` jakub at gcc dot gnu.org
  2024-03-15  7:51 ` aldyh at gcc dot gnu.org
                   ` (2 subsequent siblings)
  14 siblings, 0 replies; 16+ messages in thread
From: jakub at gcc dot gnu.org @ 2024-03-14 18:41 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114331

--- Comment #12 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
Yeah.  So the cases where we should do it is when we are reversing a narrowing
cast, or also something for the other PRs Andrew mentioned, like when reversing
BIT_AND_EXPR (but maybe also BIT_IOR_EXPR/BIT_XOR_EXPR, haven't thought that
out; maybe only if BIT_AND_EXPR has constant second argument?).
For that
  if ((i & 7) == 6)
in there aka
  _1 = i_2(D) & 7;
  if (_1 == 6)
    ...
we get [6, 6] range on that edge (with irange_bitmask again implicit), but if
we want to ask what the range of i_2(D) is we can ask for irange_bitmask to be
computed (MASK 0x0 VALUE 0x6) and for i_2(D) reverse the mask, i.e. MASK
0xfffffff8 VALUE 0x6;

^ permalink raw reply	[flat|nested] 16+ messages in thread

* [Bug tree-optimization/114331] Missed optimization: indicate knownbits from dominating condition switch(trunc(a))
  2024-03-14  5:50 [Bug tree-optimization/114331] New: Missed optimization: indicate knownbits from dominating condition switch(trunc(a)) xxs_chy at outlook dot com
                   ` (11 preceding siblings ...)
  2024-03-14 18:41 ` jakub at gcc dot gnu.org
@ 2024-03-15  7:51 ` aldyh at gcc dot gnu.org
  2024-03-15  8:01 ` jakub at gcc dot gnu.org
  2024-03-15  8:05 ` jakub at gcc dot gnu.org
  14 siblings, 0 replies; 16+ messages in thread
From: aldyh at gcc dot gnu.org @ 2024-03-15  7:51 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114331

--- Comment #13 from Aldy Hernandez <aldyh at gcc dot gnu.org> ---
I think y'all have it all figured out.  Basically,

operator_cast::op1_range() is solving num_5 in the equation:

[111,111] = (short int) num_5

Where lhs is:
(gdb) p debug(lhs)
[irange] short int [111, 111]

And the bitmask is implicit, but correctly calculated:
(gdb) p debug(lhs.get_bitmask ())
MASK 0x0 VALUE 0x6f

And the range for num_5 is just varying.

You are looking at the true side of the truncating_cast_p() conditional in
op1_range.  Essentially, once it starts building the range it ignores known
bitmasks, which cause it to return an overly conservative range.

And yes, bitmasks are calculated on demand per the comment in get_bitmask:

  // The mask inherent in the range is calculated on-demand.  For               
  // example, [0,255] does not have known bits set by default.  This            
  // saves us considerable time, because setting it at creation incurs          
  // a large penalty for irange::set.  At the time of writing there             
  // was a 5% slowdown in VRP if we kept the mask precisely up to date          
  // at all times.  Instead, we default to -1 and set it when                   
  // explicitly requested.  However, this function will always return           
  // the correct mask.                                

We may have to revisit this.  Perhaps it doesn't matter any more, and we could
keep bitmasks up to date.  It would definitely make the dumps easier to read. 
Though, at least for this testcase the bitmask is correctly returned with
get_bitmask().

I think the analysis in comment 11 is spot on.  And no Andrew, it's not *my*
code, it all comes from the bit-ccp code :-P.

fold_range() should call update_bitmask(), which eventually calls
bit_value_binop() in the CCP code.

And yes Jakub, as you have noticed, BIT_IOR_EXPR, BIT_XOR_EXPR, and likely
other operators may need to be tweaked to take bitmasks into account.  I
wouldn't be surprised if there's a lot of low hanging fruit in this space.

^ permalink raw reply	[flat|nested] 16+ messages in thread

* [Bug tree-optimization/114331] Missed optimization: indicate knownbits from dominating condition switch(trunc(a))
  2024-03-14  5:50 [Bug tree-optimization/114331] New: Missed optimization: indicate knownbits from dominating condition switch(trunc(a)) xxs_chy at outlook dot com
                   ` (12 preceding siblings ...)
  2024-03-15  7:51 ` aldyh at gcc dot gnu.org
@ 2024-03-15  8:01 ` jakub at gcc dot gnu.org
  2024-03-15  8:05 ` jakub at gcc dot gnu.org
  14 siblings, 0 replies; 16+ messages in thread
From: jakub at gcc dot gnu.org @ 2024-03-15  8:01 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114331

--- Comment #14 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
(In reply to Aldy Hernandez from comment #13)
> And yes Jakub, as you have noticed, BIT_IOR_EXPR, BIT_XOR_EXPR, and likely
> other operators may need to be tweaked to take bitmasks into account.  I
> wouldn't be surprised if there's a lot of low hanging fruit in this space.

I think next to BIT_AND_EXPR with const second operand in the reverse direction
it certainly is BIT_IOR_EXPR with const second operand in the forward
direction, that is another thing where some bits become unknown compared to the
source bitmask.
Whether it is also BIT_XOR_EXPR/BIT_NOT_EXPR, dunno right now, we'd need to
play with testcases.  Maybe it is also BIT_AND_EXPR with const second operand
in the forward direction (and then BIT_IOR_EXPR with const second operand in
the reverse direction and BIT_XOR_EXPR with const second operand and
BIT_NOT_EXPR in both directions) which instead of making some bits unknown
(i.e. oring some bits into mask) makes some bits known (i.e. removes them from
the mask; for xor/not keeps mask as is but modifies value).

^ permalink raw reply	[flat|nested] 16+ messages in thread

* [Bug tree-optimization/114331] Missed optimization: indicate knownbits from dominating condition switch(trunc(a))
  2024-03-14  5:50 [Bug tree-optimization/114331] New: Missed optimization: indicate knownbits from dominating condition switch(trunc(a)) xxs_chy at outlook dot com
                   ` (13 preceding siblings ...)
  2024-03-15  8:01 ` jakub at gcc dot gnu.org
@ 2024-03-15  8:05 ` jakub at gcc dot gnu.org
  14 siblings, 0 replies; 16+ messages in thread
From: jakub at gcc dot gnu.org @ 2024-03-15  8:05 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114331

--- Comment #15 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
Oh, and maybe if you want for performance reasons to avoid computing
irange_bitmask unless necessary, in cases like where irange_bitmask wasn't set
before verify if it is really different from the bitmask implied for its range
and only set if it is different from that.

^ permalink raw reply	[flat|nested] 16+ messages in thread

end of thread, other threads:[~2024-03-15  8:06 UTC | newest]

Thread overview: 16+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2024-03-14  5:50 [Bug tree-optimization/114331] New: Missed optimization: indicate knownbits from dominating condition switch(trunc(a)) xxs_chy at outlook dot com
2024-03-14  8:36 ` [Bug tree-optimization/114331] " rguenth at gcc dot gnu.org
2024-03-14  8:44 ` jakub at gcc dot gnu.org
2024-03-14 12:45 ` amacleod at redhat dot com
2024-03-14 13:21 ` jakub at gcc dot gnu.org
2024-03-14 16:33 ` amacleod at redhat dot com
2024-03-14 16:57 ` aldyh at gcc dot gnu.org
2024-03-14 17:08 ` jakub at gcc dot gnu.org
2024-03-14 17:28 ` amacleod at redhat dot com
2024-03-14 17:39 ` pinskia at gcc dot gnu.org
2024-03-14 17:58 ` jakub at gcc dot gnu.org
2024-03-14 18:29 ` amacleod at redhat dot com
2024-03-14 18:41 ` jakub at gcc dot gnu.org
2024-03-15  7:51 ` aldyh at gcc dot gnu.org
2024-03-15  8:01 ` jakub at gcc dot gnu.org
2024-03-15  8:05 ` jakub at gcc dot gnu.org

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).