* [Bug tree-optimization/100112] New: missed optimization for dead code elimination at -O3, -Os (vs. -O1, -O2)
From: zhendong.su at inf dot ethz.ch @ 2021-04-16  7:22 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100112

            Bug ID: 100112
           Summary: missed optimization for dead code elimination at -O3,
                    -Os (vs. -O1, -O2)
           Product: gcc
           Version: unknown
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: tree-optimization
          Assignee: unassigned at gcc dot gnu.org
          Reporter: zhendong.su at inf dot ethz.ch
  Target Milestone: ---

[545] % gcctk -v
Using built-in specs.
COLLECT_GCC=gcctk
COLLECT_LTO_WRAPPER=/local/suz-local/software/local/gcc-trunk/libexec/gcc/x86_64-pc-linux-gnu/11.0.1/lto-wrapper
Target: x86_64-pc-linux-gnu
Configured with: ../gcc-trunk/configure --disable-bootstrap
--prefix=/local/suz-local/software/local/gcc-trunk --enable-languages=c,c++
--disable-werror --enable-multilib --with-system-zlib
Thread model: posix
Supported LTO compression algorithms: zlib
gcc version 11.0.1 20210416 (experimental) [master revision
89c863488bc:10ed13839be:76c7e7d6b003a17d183d0571bf9b34c691819d25] (GCC) 
[546] % 
[546] % gcctk -O1 -S -o O1.s small.c
[547] % gcctk -O3 -S -o O3.s small.c
[548] % 
[548] % wc O1.s O3.s
 17  38 365 O1.s
 37  78 633 O3.s
 54 116 998 total
[549] % 
[549] % grep foo O1.s
[550] % grep foo O3.s
        call    foo
[551] % 
[551] % cat small.c
extern void foo(void);
static int e, *a = &e, b, *c = &b;
static int d(int f, int i) {
  if (f ^ i)
    foo();
}
int main() {
  int **g = &c;
  (*a)++;
  *g = c;
  d(1, c != 0);
  return 0;
}

* [Bug ipa/100112] missed optimization for dead code elimination at -O3, -Os (vs. -O1, -O2)
From: rguenth at gcc dot gnu.org @ 2021-04-20  8:14 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100112

Richard Biener <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
          Component|tree-optimization           |ipa
           Keywords|                            |missed-optimization
     Ever confirmed|0                           |1
                 CC|                            |marxin at gcc dot gnu.org
             Status|UNCONFIRMED                 |NEW
            Version|unknown                     |11.0
   Last reconfirmed|                            |2021-04-20

--- Comment #1 from Richard Biener <rguenth at gcc dot gnu.org> ---
The call is eliminated only very late at -O1 (DCE7), while with -O3 we still have

  <bb 2> [local count: 1073741824]:
  _2 = e;
  _3 = _2 + 1;
  e = _3; 
  c.1_4 = c;
  if (c.1_4 == 0B)
    goto <bb 3>; [48.88%]
  else
    goto <bb 4>; [51.12%]

  <bb 3> [local count: 524845000]:
  foo ();

there.  In the end it's some IPA issue, failing to make 'c' readonly at -O3
vs. -O1.

* [Bug tree-optimization/100112] missed optimization for dead code elimination at -O3, -Os (vs. -O1, -O2)
From: pinskia at gcc dot gnu.org @ 2021-09-25  7:51 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100112

Andrew Pinski <pinskia at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Keywords|                            |alias
          Component|ipa                         |tree-optimization

--- Comment #2 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
Hmm, the difference is in fre; for some reason -fstrict-aliasing makes a
difference.

ealias at -O1:
  c.1_4 = c;
  c = c.1_4;
  c.2_5 = c;
  _6 = c.2_5 != 0B;

ealias at -O3:
  *a.0_1 = _3;
  c.1_4 = c;
  c = c.1_4;
  c.2_5 = c;
  _6 = c.2_5 != 0B;

fre1 at -O1:
  *a.0_1 = _3;
  c.1_4 = c;
  _6 = c.1_4 != 0B;

fre1 at -O3:
  *a.0_1 = _3;
  c.1_4 = c;
  c = c.1_4;
  _6 = c.1_4 != 0B;


fre1 at -O1 dump:
Value numbering stmt = c = c.1_4;
Store matched earlier value, value numbering store vdefs to matching vuses.

fre1 at -O3 dump:
Value numbering stmt = c.1_4 = c;
Setting value number of c.1_4 to c.1_4 (changed)
Making available beyond BB2 c.1_4 for value c.1_4
Value numbering stmt = c = c.1_4;
No store match
Value numbering store c to c.1_4
Setting value number of .MEM_11 to .MEM_11 (changed)

If I use -O3 -fno-strict-aliasing, I get the same result as -O1.

* [Bug tree-optimization/100112] missed optimization for dead code elimination at -O3, -Os (vs. -O1, -O2)
From: pinskia at gcc dot gnu.org @ 2021-09-25  9:40 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100112

Andrew Pinski <pinskia at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |rguenth at gcc dot gnu.org

--- Comment #3 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
In the strict aliasing case:

  # .MEM_10 = VDEF <.MEM_9(D)>
  *a.0_1 = _3;
  # VUSE <.MEM_10>
  c.1_4 = c;
  # .MEM_11 = VDEF <.MEM_10>
  c = c.1_4;
  # VUSE <.MEM_11>
  c.2_5 = c;

We use MEM_9 for the load of c (the one with VUSE of MEM_10) as the vuse for
the hashtable. In the non-strict-aliasing case, we use the VUSE of MEM_10 and
it works.

Richi,
  you know the SCCVN code best; maybe you have a better idea of how to fix
this.

* [Bug tree-optimization/100112] missed optimization for dead code elimination at -O3, -Os (vs. -O1, -O2)
From: rguenth at gcc dot gnu.org @ 2021-09-27  7:02 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100112

Richard Biener <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Assignee|unassigned at gcc dot gnu.org      |rguenth at gcc dot gnu.org
             Status|NEW                         |ASSIGNED

--- Comment #4 from Richard Biener <rguenth at gcc dot gnu.org> ---
This revolves around redundant store elimination, which is tricky with TBAA
since even a "value-redundant store" can change the dynamic type and thus needs
to be preserved.  See also PR101641 for a pending wrong-code issue here.

I _think_ we have an almost exact duplicate but let me take it, this is
related to last_vuse - we're doing the lookup w/ VN_NOWALK and thus don't
see the hashtable entry with the "optimized" VUSE from last_vuse handling.

IIRC VN_NOWALK is mainly a compile-time optimization, at elimination time
we even use VN_WALKREWRITE.  It originally changed with
g:649caaad399d6f4865a4d0015a1ac76c3cce7eb0 as a wrong-code fix though.
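
The remark above that a "value-redundant store" can change the dynamic type can be illustrated with a small sketch. It is not taken from the bug report: the function name is hypothetical, and it assumes the C effective-type rules for allocated storage (C11 6.5p6) and a platform where a null pointer and the int 0 share an all-zero bit pattern.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical illustration, not code from GCC or the bug report.
   Returns 1 if the pointer read back at the end is null, or -1 if
   the allocation fails.  */
int value_redundant_store_demo(void)
{
  /* The int store below must fit inside the pointer-sized block.  */
  assert(sizeof(int) <= sizeof(int *));

  void *block = malloc(sizeof(int *));
  if (block == NULL)
    return -1;

  int **as_ptr = block;
  int *as_int = block;

  *as_ptr = NULL;  /* effective type of the block is now int *  */
  *as_int = 0;     /* effective type of the leading bytes is now int  */
  *as_ptr = NULL;  /* same bits (assumed), but the type is int * again  */

  /* This read is valid only because the last store was an int * store.  */
  int result = (*as_ptr == NULL);
  free(block);
  return result;
}
```

The third store writes bits that are already in memory, yet removing it would leave the final read accessing storage whose effective type is int. That is why an optimizer using TBAA cannot treat every value-redundant store as dead.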

* [Bug tree-optimization/100112] missed optimization for dead code elimination at -O3, -Os (vs. -O1, -O2)
From: pinskia at gcc dot gnu.org @ 2021-09-27  7:06 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100112

--- Comment #5 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
(In reply to Richard Biener from comment #4)
> I _think_ we have an almost exact duplicate but let me take it, this is
> related to last_vuse - we're doing the lookup w/ VN_NOWALK and thus don't
> see the hashtable entry with the "optimized" VUSE from last_vuse handling.

Yes, you are thinking of PR 93891, which I already linked here in the See Also
field.

* [Bug tree-optimization/100112] missed optimization for dead code elimination at -O3, -Os (vs. -O1, -O2)
From: rguenth at gcc dot gnu.org @ 2021-09-27  7:26 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100112

--- Comment #6 from Richard Biener <rguenth at gcc dot gnu.org> ---
Oh, and walking doesn't help: we may not use TBAA for the lookup, but the
load handling did use TBAA, and only because of that did it find a better VUSE
under which to record the reference in the hashtables.

A smaller testcase is

int *c, *b;
void foo()
{
  int *tem = b;
  *tem = 0;
  tem = c;
  c = tem;
}

so it really boils down to last_vuse being good and bad at the same time.

* [Bug tree-optimization/100112] missed optimization for dead code elimination at -O3, -Os (vs. -O1, -O2)
From: rguenth at gcc dot gnu.org @ 2021-09-27  7:28 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100112

Richard Biener <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           See Also|                            |https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99793

--- Comment #7 from Richard Biener <rguenth at gcc dot gnu.org> ---
Or more like PR99793

* [Bug tree-optimization/100112] missed optimization for dead code elimination at -O3, -Os (vs. -O1, -O2)
From: rguenth at gcc dot gnu.org @ 2021-09-27  7:28 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100112

Richard Biener <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           See Also|                            |https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88854

--- Comment #8 from Richard Biener <rguenth at gcc dot gnu.org> ---
And PR88854

* [Bug tree-optimization/100112] missed optimization for dead code elimination at -O3, -Os (vs. -O1, -O2)
From: rguenth at gcc dot gnu.org @ 2021-09-27  7:31 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100112

--- Comment #9 from Richard Biener <rguenth at gcc dot gnu.org> ---
So the point is that

int *c, *b;
void foo()
{
  int *tem = b;
  *tem = 0;
  int *tem2 = c;
  c = tem2;
}

and

int *c, *b;
void foo()
{
  int *tem = b;
  int *tem2 = c;
  *tem = 0;
  c = tem2;
}

are different, but with strict aliasing we encode tem2 = c; the same way in
the hashtable.

* [Bug tree-optimization/100112] missed optimization for dead code elimination at -O3, -Os (vs. -O1, -O2)
From: cvs-commit at gcc dot gnu.org @ 2021-09-28 10:32 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100112

--- Comment #10 from CVS Commits <cvs-commit at gcc dot gnu.org> ---
The master branch has been updated by Richard Biener <rguenth@gcc.gnu.org>:

https://gcc.gnu.org/g:5b8b1522e04adc20980f396571be1929a32d148a

commit r12-3918-g5b8b1522e04adc20980f396571be1929a32d148a
Author: Richard Biener <rguenther@suse.de>
Date:   Mon Sep 27 12:01:38 2021 +0200

    tree-optimization/100112 - VN last_vuse and redundant store elimination

    This avoids the last_vuse optimization hindering redundant store
    elimination by always also recording the original VUSE that was
    in effect on the load.

    In stage3 gcc/*.o we have 3182752 times recorded a single
    entry and 903409 times two entries (that's ~20% overhead).

    With just recording a single entry the number of hashtable lookups
    done when walking the vuse->vdef links to find an earlier access
    is 28961618.  When recording the second entry this makes us find
    that earlier for downstream redundant accesses, reducing the number
    of hashtable lookups to 25401052 (that's a ~10% reduction).

    2021-09-27  Richard Biener  <rguenther@suse.de>

            PR tree-optimization/100112
            * tree-ssa-sccvn.c (visit_reference_op_load): Record the
            reference into the hashtable twice in case last_vuse is
            different from the original vuse on the stmt.

            * gcc.dg/tree-ssa/ssa-fre-95.c: New testcase.
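
The idea in the commit message, recording a load under both the last_vuse and the original VUSE, can be modelled with a toy table. This is only a sketch under stated assumptions: none of the names below are from tree-ssa-sccvn.c, the table is a flat array rather than GCC's hashtable, and memory states are plain integers (9 and 10 stand in for .MEM_9 and .MEM_10 from comment #3).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Toy stand-in for a value-numbering table that maps (reference, vuse)
   to a value number.  All names are hypothetical, not GCC's.  */
struct entry { const char *ref; int vuse; int value; };
struct table { struct entry e[16]; size_t n; };

static void record(struct table *t, const char *ref, int vuse, int value)
{
  assert(t->n < sizeof t->e / sizeof t->e[0]);
  t->e[t->n++] = (struct entry){ ref, vuse, value };
}

/* Exact-match lookup that never walks the vuse->vdef chain.  */
static bool lookup(const struct table *t, const char *ref, int vuse,
                   int *value)
{
  for (size_t i = 0; i < t->n; i++)
    if (t->e[i].vuse == vuse && strcmp(t->e[i].ref, ref) == 0) {
      *value = t->e[i].value;
      return true;
    }
  return false;
}

/* Record a load.  last_vuse is the earlier memory state the load was
   found to be valid for; orig_vuse is the one on the statement.  */
static void record_load(struct table *t, const char *ref, int orig_vuse,
                        int last_vuse, int value, bool record_both)
{
  record(t, ref, last_vuse, value);
  if (record_both && last_vuse != orig_vuse)
    record(t, ref, orig_vuse, value);
}

/* Models "c.1_4 = c" (VUSE 10, skipped back to 9) followed by the
   store "c = c.1_4", which looks up c with VUSE 10 and no walking.  */
bool store_matches(bool record_both)
{
  struct table t = { .n = 0 };
  int value;

  record_load(&t, "c", 10, 9, 4, record_both);
  return lookup(&t, "c", 10, &value) && value == 4;
}
```

In this model `store_matches(false)` misses because the only entry is keyed on state 9, while `store_matches(true)` finds the second entry keyed on state 10, which corresponds to the store being recognized as redundant.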

* [Bug tree-optimization/100112] missed optimization for dead code elimination at -O3, -Os (vs. -O1, -O2)
From: rguenth at gcc dot gnu.org @ 2021-09-28 10:32 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100112

Richard Biener <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
   Target Milestone|---                         |12.0
             Status|ASSIGNED                    |RESOLVED
         Resolution|---                         |FIXED

--- Comment #11 from Richard Biener <rguenth at gcc dot gnu.org> ---
Fixed for GCC 12.
