public inbox for gcc-bugs@sourceware.org
* [Bug tree-optimization/100162] New: missed optimization for dead code elimination at -O3 (vs. -O2)
@ 2021-04-20 19:20 zhendong.su at inf dot ethz.ch
  2021-04-21  8:35 ` [Bug tree-optimization/100162] " rguenth at gcc dot gnu.org
                   ` (2 more replies)
  0 siblings, 3 replies; 4+ messages in thread
From: zhendong.su at inf dot ethz.ch @ 2021-04-20 19:20 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100162

            Bug ID: 100162
           Summary: missed optimization for dead code elimination at -O3
                    (vs. -O2)
           Product: gcc
           Version: unknown
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: tree-optimization
          Assignee: unassigned at gcc dot gnu.org
          Reporter: zhendong.su at inf dot ethz.ch
  Target Milestone: ---

[720] % gcctk -v
Using built-in specs.
COLLECT_GCC=gcctk
COLLECT_LTO_WRAPPER=/local/suz-local/software/local/gcc-trunk/libexec/gcc/x86_64-pc-linux-gnu/11.0.1/lto-wrapper
Target: x86_64-pc-linux-gnu
Configured with: ../gcc-trunk/configure --disable-bootstrap
--prefix=/local/suz-local/software/local/gcc-trunk --enable-languages=c,c++
--disable-werror --enable-multilib --with-system-zlib
Thread model: posix
Supported LTO compression algorithms: zlib
gcc version 11.0.1 20210420 (experimental) [master revision
67378cd63d6:5e36407d599:250f234988b6231669a720c52101d3686d645072] (GCC) 
[721] % 
[721] % gcctk -O2 -S -o O2.s small.c
[722] % gcctk -O3 -S -o O3.s small.c
[723] % 
[723] % wc O2.s O3.s
  52  119  923 O2.s
  70  150 1143 O3.s
 122  269 2066 total
[724] % 
[724] % grep foo O2.s
[725] % grep foo O3.s
        call    foo
[726] % 
[726] % cat small.c
extern void foo(void);
int printf(const char *, ...);
int a, b, c[5][1];
int main() {
  for (a = 0; a < 5; a++)
    c[a][b] = 2;
  if ((b || 0) / c[0][0])
    foo();
  printf("checksum=0");
  return 0;
}
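
For reference, a minimal sketch of why the call to foo() is dead. It assumes
that b == 0 is the only in-bounds inner index for c (declared int c[5][1]), so
c[0][0] holds 2 after the loop. The helper names branch_condition and
check_branch_is_dead are hypothetical, introduced only for illustration.

```c
#include <assert.h>

/* Hypothetical helper: the value tested by the `if` in small.c,
   with the loaded c[0][0] passed in as c00. */
int branch_condition(int b, int c00)
{
    return (b || 0) / c00;
}

/* (b || 0) is either 0 or 1, and integer division of 0 or 1 by 2
   yields 0, so the condition is false for every value of b. */
void check_branch_is_dead(void)
{
    assert(branch_condition(0, 2) == 0);
    assert(branch_condition(1, 2) == 0);
    assert(branch_condition(-7, 2) == 0);
}
```

Once the compiler knows that c[0][0] is 2, the division folds to 0 and the
branch guarding foo() can be removed.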


* [Bug tree-optimization/100162] missed optimization for dead code elimination at -O3 (vs. -O2)
From: rguenth at gcc dot gnu.org @ 2021-04-21  8:35 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100162

Richard Biener <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
            Version|unknown                     |12.0
                 CC|                            |rguenth at gcc dot gnu.org
     Ever confirmed|0                           |1
   Last reconfirmed|                            |2021-04-21
             Status|UNCONFIRMED                 |NEW
           Keywords|                            |missed-optimization

--- Comment #1 from Richard Biener <rguenth at gcc dot gnu.org> ---
This is optimized by DOM3, which sees the following difference between
-O2 (-) and -O3 (+):

-  <bb 2> [local count: 118111601]:
+  <bb 2> [local count: 955630225]:
   b.1_1 = b;
-  c[0][b.1_1] = 2;
-  c[1][b.1_1] = 2;
-  c[2][b.1_1] = 2;
-  c[3][b.1_1] = 2;
+  _27 = (sizetype) b.1_1;
+  _28 = _27 * 4;
+  vectp_c.13_26 = &c + _28;
+  MEM <vector(4) int> [(int *)vectp_c.13_26] = { 2, 2, 2, 2 };
+  vectp_c.12_30 = vectp_c.13_26 + 16;
   c[4][b.1_1] = 2;
   a = 5;
   _5 = b.1_1 != 0;
   _6 = (int) _5;
-  _8 = _6 / 2;
+  _7 = c[0][0];
+  _8 = _6 / _7;
   if (_8 != 0)

Here c[0][b.1_1] takes advantage of get_ref_base_and_extent honoring the known
array size of [1], while the pointer-based access is not constrained this way.
That makes matching c[0][0] to *(&c + _28) = { 2, 2, 2, 2 } difficult.
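
A rough source-level analogue of the two GIMPLE forms, as a sketch only. The
function names are hypothetical, and memcpy stands in for the
MEM <vector(4) int> store. The sketch assumes b == 0, the only in-bounds inner
index for int c[5][1].

```c
#include <string.h>

int c[5][1];

/* Array-indexed form (-O2 side of the diff): every store names a row
   plus the inner index b, whose dimension is known to be 1. */
void store_indexed(int b)
{
    c[0][b] = 2;
    c[1][b] = 2;
    c[2][b] = 2;
    c[3][b] = 2;
    c[4][b] = 2;
}

/* Pointer-based form (-O3 side of the diff): one 16-byte store at byte
   offset b * 4 from the start of c, then the scalar store to row 4. */
void store_vectorized(int b)
{
    static const int twos[4] = { 2, 2, 2, 2 };
    memcpy((char *)c + (size_t)b * sizeof(int), twos, sizeof twos);
    c[4][b] = 2;
}

/* Returns 1 if every element of c equals 2, otherwise 0. */
int all_twos(void)
{
    for (int i = 0; i < 5; i++)
        if (c[i][0] != 2)
            return 0;
    return 1;
}
```

For b == 0 both functions leave c in the same state. The difference is that
the indexed form exposes the [1] bound on the inner dimension, while the
pointer form only exposes an address computed from b.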

The realistic chance to catch this is to improve the value-numbering done
earlier, on the loop before it is unrolled:

  <bb 3> [local count: 955630225]:
  # a.3_19 = PHI <_2(3), 0(2)>
  c[a.3_19][b.1_1] = 2;
  _2 = a.3_19 + 1;
  if (_2 <= 4)
    goto <bb 3>; [89.00%]
  else
    goto <bb 4>; [11.00%]

  <bb 4> [local count: 118111600]:
  a = _2;
  _5 = b.1_1 != 0;
  _6 = (int) _5;
  _7 = c[0][0];

where we could use SCEV & friends to look up c[0][0] at the c[a.3_19][b.1_1]
definition in vn_reference_lookup_3.

That might also help to look through loop abstraction earlier.
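
To illustrate the idea only (this is not GCC's SCEV or value-numbering API),
here is a sketch of such a lookup. It assumes the store in the loop can be
summarized as a byte offset that starts at base and advances by step for niter
iterations. The struct and function names are hypothetical.

```c
#include <stdbool.h>

/* Hypothetical summary of a store inside a counted loop: iteration i
   (0 <= i < niter) writes `size` bytes at byte offset base + i * step. */
struct affine_store {
    long base;   /* byte offset written by the first iteration */
    long step;   /* byte increment per iteration */
    long niter;  /* number of iterations */
    long size;   /* bytes written per iteration */
    int value;   /* constant stored on every iteration */
};

/* Return true and set *out if a load of `size` bytes at `offset` reads
   exactly what one iteration of the loop stored. */
bool lookup_in_affine_store(const struct affine_store *s,
                            long offset, long size, int *out)
{
    /* Only handle same-sized accesses and non-overlapping forward steps. */
    if (size != s->size || s->step < s->size)
        return false;
    long delta = offset - s->base;
    if (delta < 0 || delta % s->step != 0)
        return false;
    if (delta / s->step >= s->niter)
        return false;
    *out = s->value;
    return true;
}
```

For the loop above with b.1_1 known to be 0, the summary would be base 0,
step 4, niter 5, size 4, value 2, and a lookup of c[0][0] (offset 0, size 4)
finds the value stored by the first iteration.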


* [Bug tree-optimization/100162] missed optimization for dead code elimination at -O3 (vs. -O2)
From: pinskia at gcc dot gnu.org @ 2023-05-05  8:07 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100162

Andrew Pinski <pinskia at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Keywords|                            |needs-bisection

--- Comment #2 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
FRE is no longer able to optimize _7 to 2 in GCC 13+:
  c[0][b.1_1] = 2;
  c[1][b.1_1] = 2;
  c[2][b.1_1] = 2;
  c[3][b.1_1] = 2;
  c[4][b.1_1] = 2;
  a = 5;
  _5 = b.1_1 != 0;
  _6 = (int) _5;
  _7 = c[0][0];

Of course, the reasons for the -O3 behavior are given in comment #1.


* [Bug tree-optimization/100162] missed optimization for dead code elimination at -O3 (vs. -O2)
From: rguenth at gcc dot gnu.org @ 2023-05-05 10:16 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100162

--- Comment #3 from Richard Biener <rguenth at gcc dot gnu.org> ---
(In reply to Andrew Pinski from comment #2)
> FRE is no longer able to optimize _7 to 2 in GCC 13+:
>   c[0][b.1_1] = 2;
>   c[1][b.1_1] = 2;
>   c[2][b.1_1] = 2;
>   c[3][b.1_1] = 2;
>   c[4][b.1_1] = 2;
>   a = 5;
>   _5 = b.1_1 != 0;
>   _6 = (int) _5;
>   _7 = c[0][0];

That's PR108355.  The "magic" special-casing of single-element arrays went
away (or rather now triggers less reliably).
