public inbox for gcc-bugs@sourceware.org
* [Bug tree-optimization/102650] New: Dead Code Elimination Regression at -O3 (trunk vs 11.2.0)
@ 2021-10-08 13:47 theodort at inf dot ethz.ch
  2021-10-08 16:45 ` [Bug tree-optimization/102650] " amacleod at redhat dot com
                   ` (9 more replies)
  0 siblings, 10 replies; 11+ messages in thread
From: theodort at inf dot ethz.ch @ 2021-10-08 13:47 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102650

            Bug ID: 102650
           Summary: Dead Code Elimination Regression at -O3 (trunk vs
                    11.2.0)
           Product: gcc
           Version: 12.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: tree-optimization
          Assignee: unassigned at gcc dot gnu.org
          Reporter: theodort at inf dot ethz.ch
  Target Milestone: ---

cat case.c
static int a = 2, b, c, d;
void foo(void);
int main() {
    short e;
    int f = -1;
    if (b)
        c = 0;
    c || (f = 2);
    for (; d < 1; d++)
        e = f + a;
    if (!e)
        foo();
    return 0;
}


11.2.0 at -O3 can eliminate the call to foo but trunk at -O3 cannot:

gcc-11 -v
Target: x86_64-pc-linux-gnu
Configured with: ../configure --disable-multilib --disable-bootstrap
--enable-languages=c,c++
Thread model: posix
Supported LTO compression algorithms: zlib zstd
gcc version 11.2.0 (GCC)

gcc-11 -O3 case.c -S -o /dev/stdout
...
main:
.LFB0:
        .cfi_startproc
        movl    d(%rip), %eax
        testl   %eax, %eax
        jg      .L2
        movl    $1, d(%rip)
.L2:
        xorl    %eax, %eax
        ret

gcc-trunk -v
Target: x86_64-pc-linux-gnu
Configured with: ../configure --disable-multilib --disable-bootstrap
--enable-languages=c,c++ 
Thread model: posix
Supported LTO compression algorithms: zlib zstd
gcc version 12.0.0 20211008 (experimental) (GCC)

gcc-trunk -O3 case.c -S -o /dev/stdout
...
main:
.LFB0:
        .cfi_startproc
        movl    d(%rip), %ecx
        testl   %ecx, %ecx
        jg      .L3
        movl    $1, d(%rip)
        xorl    %eax, %eax
        ret
.L3:
        pushq   %rax
        .cfi_def_cfa_offset 16
        call    foo
        xorl    %eax, %eax
        popq    %rdx
        .cfi_def_cfa_offset 8
        ret


18b88412069f51433e1b4f440d3c035bfc7b5cca
(https://gcc.gnu.org/git/?p=gcc.git;a=commit;h=18b88412069f51433e1b4f440d3c035bfc7b5cca)
introduced this regression


* [Bug tree-optimization/102650] Dead Code Elimination Regression at -O3 (trunk vs 11.2.0)
  2021-10-08 13:47 [Bug tree-optimization/102650] New: Dead Code Elimination Regression at -O3 (trunk vs 11.2.0) theodort at inf dot ethz.ch
@ 2021-10-08 16:45 ` amacleod at redhat dot com
  2021-10-11  8:34 ` [Bug tree-optimization/102650] [12 Regression] " rguenth at gcc dot gnu.org
                   ` (8 subsequent siblings)
  9 siblings, 0 replies; 11+ messages in thread
From: amacleod at redhat dot com @ 2021-10-08 16:45 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102650

Andrew Macleod <amacleod at redhat dot com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |amacleod at redhat dot com

--- Comment #1 from Andrew Macleod <amacleod at redhat dot com> ---
This is a result of the vagaries of the single subrange value-range.

VRP is seeing:
# f_11 = PHI <-1(2), 2(3)>
  goto <bb 6>; [100.00%]

  <bb 5> [local count: 955630225]:
  _3 = (unsigned short) f_11;
  _6 = _3 + 2;
  e_19 = (short int) _6;

It knows f_11 is [-1, 2], and when that is cast to a ushort it produces ~[3,
65534].

That is all we knew about it in GCC 11, so when we calculate _6 = ~[3, 65534] + 2
it comes up with [1, 4], and the e_19 == 0 test later on can then be folded away.

In GCC 12, EVRP has figured out that _3 is unsigned short [2, 2][+INF, +INF],
which, if we add 2 to it, would come up with [1, 1][4, 4], which would be perfect.

We save this to the value_range global table in EVRP, but alas it gets
transliterated to a single-pair value_range: _3 : unsigned short [2, +INF]

Now when VRP calculates ~[3, 65534] and intersects that with the known global
[2, +INF], the legacy intersect routine has to come up with a single pair and
decides to keep [2, +INF].
When you add 2 to that, 0 is no longer eliminated.


* [Bug tree-optimization/102650] [12 Regression] Dead Code Elimination Regression at -O3 (trunk vs 11.2.0)
  2021-10-08 13:47 [Bug tree-optimization/102650] New: Dead Code Elimination Regression at -O3 (trunk vs 11.2.0) theodort at inf dot ethz.ch
  2021-10-08 16:45 ` [Bug tree-optimization/102650] " amacleod at redhat dot com
@ 2021-10-11  8:34 ` rguenth at gcc dot gnu.org
  2021-11-05 20:20 ` amacleod at redhat dot com
                   ` (7 subsequent siblings)
  9 siblings, 0 replies; 11+ messages in thread
From: rguenth at gcc dot gnu.org @ 2021-10-11  8:34 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102650

Richard Biener <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Keywords|                            |missed-optimization
   Target Milestone|---                         |12.0
     Ever confirmed|0                           |1
             Status|UNCONFIRMED                 |NEW
   Last reconfirmed|                            |2021-10-11
            Summary|Dead Code Elimination       |[12 Regression] Dead Code
                   |Regression at -O3 (trunk vs |Elimination Regression at
                   |11.2.0)                     |-O3 (trunk vs 11.2.0)

--- Comment #2 from Richard Biener <rguenth at gcc dot gnu.org> ---
(In reply to Andrew Macleod from comment #1)
> This is a result of the vagaries of the single subrange value-range.
> 
> VRP is seeing:
> # f_11 = PHI <-1(2), 2(3)>
>   goto <bb 6>; [100.00%]
> 
>   <bb 5> [local count: 955630225]:
>   _3 = (unsigned short) f_11;
>   _6 = _3 + 2;
>   e_19 = (short int) _6;
> 
> It knows f_11 is [-1, 2], and when that is cast to a ushort it produces ~[3,
> 65534].
> 
> That is all we knew about it in GCC 11, so when we calculate _6 = ~[3, 65534] +
> 2 it comes up with [1, 4], and the e_19 == 0 test later on can then be folded
> away.
> 
> In GCC 12, EVRP has figured out that _3 is unsigned short [2, 2][+INF, +INF],
> which, if we add 2 to it, would come up with [1, 1][4, 4], which would be
> perfect.
> 
> We save this to the value_range global table in EVRP, but alas it gets
> transliterated to a single-pair value_range: _3 : unsigned short [2, +INF]

That translation could see whether the corresponding anti-range ~[3, 65534] is
smaller (which it is) and use that instead?


* [Bug tree-optimization/102650] [12 Regression] Dead Code Elimination Regression at -O3 (trunk vs 11.2.0)
  2021-10-08 13:47 [Bug tree-optimization/102650] New: Dead Code Elimination Regression at -O3 (trunk vs 11.2.0) theodort at inf dot ethz.ch
  2021-10-08 16:45 ` [Bug tree-optimization/102650] " amacleod at redhat dot com
  2021-10-11  8:34 ` [Bug tree-optimization/102650] [12 Regression] " rguenth at gcc dot gnu.org
@ 2021-11-05 20:20 ` amacleod at redhat dot com
  2022-01-19 14:13 ` rguenth at gcc dot gnu.org
                   ` (6 subsequent siblings)
  9 siblings, 0 replies; 11+ messages in thread
From: amacleod at redhat dot com @ 2021-11-05 20:20 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102650

--- Comment #3 from Andrew Macleod <amacleod at redhat dot com> ---
I figured running ranger as VRP2 would fix this... but alas, there is some
interference :-)

After fre5:
  <bb 4> [local count: 118111600]:
  # prephitmp_24 = PHI <1(3), 4(2)>
  d.5_16 = d;
  if (d.5_16 <= 0)
    goto <bb 5>; [89.00%]
  else
    goto <bb 6>; [11.00%]

  <bb 5> [local count: 955630225]:
  d = 1;

  <bb 6> [local count: 118111600]:
  # e_4 = PHI <prephitmp_24(5), e_17(D)(4)>
  if (e_4 == 0)
    goto <bb 7>; [33.00%]
  else
    goto <bb 8>; [67.00%]

  <bb 7> [local count: 38976828]:
  foo ();

We know prephitmp_24 is [1,1] [4,4], and e_17 is undefined on 4->6, so ranger
will evaluate that PHI as [1,1], [4,4] and fold the condition as never true.

Unfortunately, the next pass is thread2 and it decides to thread this,
producing:

 <bb 4> [local count: 118111600]:
  # prephitmp_24 = PHI <1(3), 4(2)>
  d.5_16 = d;
  if (d.5_16 <= 0)
    goto <bb 5>; [89.00%]
  else
    goto <bb 6>; [11.00%]

  <bb 5> [local count: 105119324]:
  d = 1;
  goto <bb 8>; [100.00%]

  <bb 6> [local count: 12992276]:
  # e_4 = PHI <e_17(D)(4)>
  if (e_4 == 0)
    goto <bb 7>; [66.33%]
  else
    goto <bb 8>; [33.67%]

  <bb 7> [local count: 38976828]:
  foo ();

  <bb 8> [local count: 118111600]:
  return 0;

Now we have a condition if (e_4 == 0) where e_4 is UNDEFINED, and so ranger
leaves it alone.

Is there any pass that examines branches using undefined values and decides
which way to fold them is most profitable?  Probably not.

Running this with --param=vrp1-mode=ranger resolves the problem because we
don't introduce this situation before it's taken care of, and we fold the
condition based on the PHI.


* [Bug tree-optimization/102650] [12 Regression] Dead Code Elimination Regression at -O3 (trunk vs 11.2.0)
  2021-10-08 13:47 [Bug tree-optimization/102650] New: Dead Code Elimination Regression at -O3 (trunk vs 11.2.0) theodort at inf dot ethz.ch
                   ` (2 preceding siblings ...)
  2021-11-05 20:20 ` amacleod at redhat dot com
@ 2022-01-19 14:13 ` rguenth at gcc dot gnu.org
  2022-01-19 14:18 ` rguenth at gcc dot gnu.org
                   ` (5 subsequent siblings)
  9 siblings, 0 replies; 11+ messages in thread
From: rguenth at gcc dot gnu.org @ 2022-01-19 14:13 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102650

--- Comment #4 from Richard Biener <rguenth at gcc dot gnu.org> ---
If it is undefined it should be unreachable, not switch to a random static
branch ;)  (defeating uninit diagnostics, of course)


* [Bug tree-optimization/102650] [12 Regression] Dead Code Elimination Regression at -O3 (trunk vs 11.2.0)
  2021-10-08 13:47 [Bug tree-optimization/102650] New: Dead Code Elimination Regression at -O3 (trunk vs 11.2.0) theodort at inf dot ethz.ch
                   ` (3 preceding siblings ...)
  2022-01-19 14:13 ` rguenth at gcc dot gnu.org
@ 2022-01-19 14:18 ` rguenth at gcc dot gnu.org
  2022-01-19 15:15 ` amacleod at redhat dot com
                   ` (4 subsequent siblings)
  9 siblings, 0 replies; 11+ messages in thread
From: rguenth at gcc dot gnu.org @ 2022-01-19 14:18 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102650

Richard Biener <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |hubicka at gcc dot gnu.org

--- Comment #5 from Richard Biener <rguenth at gcc dot gnu.org> ---
This is another case where IPA const promotion could see that we only ever
store zero to 'c' and thus it can be promoted R/O when eliding that store.


* [Bug tree-optimization/102650] [12 Regression] Dead Code Elimination Regression at -O3 (trunk vs 11.2.0)
  2021-10-08 13:47 [Bug tree-optimization/102650] New: Dead Code Elimination Regression at -O3 (trunk vs 11.2.0) theodort at inf dot ethz.ch
                   ` (4 preceding siblings ...)
  2022-01-19 14:18 ` rguenth at gcc dot gnu.org
@ 2022-01-19 15:15 ` amacleod at redhat dot com
  2022-01-20  9:20 ` rguenth at gcc dot gnu.org
                   ` (3 subsequent siblings)
  9 siblings, 0 replies; 11+ messages in thread
From: amacleod at redhat dot com @ 2022-01-19 15:15 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102650

--- Comment #6 from Andrew Macleod <amacleod at redhat dot com> ---
(In reply to Richard Biener from comment #4)
> If it is undefined it should be unreachable, not switch to a random static
> branch ;)  (defeating uninit diagnostics, of course)

Well, it's not unreachable; we reach that code always and have to take one of
the branches. It just uses an automatic variable which hasn't been initialized,
making it undefined.

We could choose whatever value we want for it.  If we chose non-zero, the call
would be eliminated.  It would take analysis of some sort to decide which
branch is more profitable to remove in the presence of undefined...  

This seems like something that maybe uninit could do :-)  Hey, this is
uninitialized, tell the user and then see what the most profitable value to
assume for it going forward would be and set it to that value :-)


* [Bug tree-optimization/102650] [12 Regression] Dead Code Elimination Regression at -O3 (trunk vs 11.2.0)
  2021-10-08 13:47 [Bug tree-optimization/102650] New: Dead Code Elimination Regression at -O3 (trunk vs 11.2.0) theodort at inf dot ethz.ch
                   ` (5 preceding siblings ...)
  2022-01-19 15:15 ` amacleod at redhat dot com
@ 2022-01-20  9:20 ` rguenth at gcc dot gnu.org
  2022-01-20 14:52 ` amacleod at redhat dot com
                   ` (2 subsequent siblings)
  9 siblings, 0 replies; 11+ messages in thread
From: rguenth at gcc dot gnu.org @ 2022-01-20  9:20 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102650

--- Comment #7 from Richard Biener <rguenth at gcc dot gnu.org> ---
Actually e will not be used uninitialized

    for (; d < 1; d++)
        e = f + a;

will initialize it since d is zero and its value will be 4.  But jump
threading isolates the case where we would access e uninitialized.
So yes, it does seem worth doing that but maybe only on isolated paths
(to not defeat uninit diagnostics and also to remove spurious uninit
diagnostics).  The situation isn't easily visible from the threader
itself and the question is how much GCC itself will expose unconditional
uninit uses (there are some bugs around ifcombine doing that) so it's
prone to producing wrong-code as well.

That said, we probably have to live with this regression for GCC 12 and
could look into sanitizing our undef behavior for GCC 13 somehow.


* [Bug tree-optimization/102650] [12 Regression] Dead Code Elimination Regression at -O3 (trunk vs 11.2.0)
  2021-10-08 13:47 [Bug tree-optimization/102650] New: Dead Code Elimination Regression at -O3 (trunk vs 11.2.0) theodort at inf dot ethz.ch
                   ` (6 preceding siblings ...)
  2022-01-20  9:20 ` rguenth at gcc dot gnu.org
@ 2022-01-20 14:52 ` amacleod at redhat dot com
  2022-03-23  8:45 ` rguenth at gcc dot gnu.org
  2022-11-03 19:33 ` [Bug tree-optimization/102650] [12/13 " amacleod at redhat dot com
  9 siblings, 0 replies; 11+ messages in thread
From: amacleod at redhat dot com @ 2022-01-20 14:52 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102650

--- Comment #8 from Andrew Macleod <amacleod at redhat dot com> ---
Aldy had some ideas of how to extend the threader's new path evaluation
capabilities to determine if there are any paths between bbX and bbY which meet
specific range conditions like UNDEFINED, or [0, 0] for null tracking.

This could add more robustness to some of the warnings and other passes by
isolating each path and checking it with more specificity than the generalized
range query provides.  It could also provide the specific path(s) and
conditions upon which the failure occurs.


* [Bug tree-optimization/102650] [12 Regression] Dead Code Elimination Regression at -O3 (trunk vs 11.2.0)
  2021-10-08 13:47 [Bug tree-optimization/102650] New: Dead Code Elimination Regression at -O3 (trunk vs 11.2.0) theodort at inf dot ethz.ch
                   ` (7 preceding siblings ...)
  2022-01-20 14:52 ` amacleod at redhat dot com
@ 2022-03-23  8:45 ` rguenth at gcc dot gnu.org
  2022-11-03 19:33 ` [Bug tree-optimization/102650] [12/13 " amacleod at redhat dot com
  9 siblings, 0 replies; 11+ messages in thread
From: rguenth at gcc dot gnu.org @ 2022-03-23  8:45 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102650

Richard Biener <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Priority|P3                          |P2
   Target Milestone|12.0                        |13.0


* [Bug tree-optimization/102650] [12/13 Regression] Dead Code Elimination Regression at -O3 (trunk vs 11.2.0)
  2021-10-08 13:47 [Bug tree-optimization/102650] New: Dead Code Elimination Regression at -O3 (trunk vs 11.2.0) theodort at inf dot ethz.ch
                   ` (8 preceding siblings ...)
  2022-03-23  8:45 ` rguenth at gcc dot gnu.org
@ 2022-11-03 19:33 ` amacleod at redhat dot com
  9 siblings, 0 replies; 11+ messages in thread
From: amacleod at redhat dot com @ 2022-11-03 19:33 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102650

Andrew Macleod <amacleod at redhat dot com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |RESOLVED
         Resolution|---                         |FIXED

--- Comment #9 from Andrew Macleod <amacleod at redhat dot com> ---
Fixed by
commit e7310e24b1c0ca67b1bb507c1330b2bf39e59e32
Author: Andrew MacLeod <amacleod@redhat.com>
Date:   Tue Oct 25 16:42:41 2022 -0400

    Make ranger vrp1 default.
