public inbox for gcc-bugs@sourceware.org
* [Bug tree-optimization/103990] New: 541.leela_r slower by 4.5-6% with PGO+LTO -Ofast -march=native in the first week of January 2022
@ 2022-01-12 13:15 jamborm at gcc dot gnu.org
  2022-01-12 13:48 ` [Bug tree-optimization/103990] " rguenth at gcc dot gnu.org
                   ` (5 more replies)
  0 siblings, 6 replies; 7+ messages in thread
From: jamborm at gcc dot gnu.org @ 2022-01-12 13:15 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103990

            Bug ID: 103990
           Summary: 541.leela_r slower by 4.5-6% with PGO+LTO -Ofast
                    -march=native in the first week of January 2022
           Product: gcc
           Version: 12.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: tree-optimization
          Assignee: unassigned at gcc dot gnu.org
          Reporter: jamborm at gcc dot gnu.org
                CC: rguenth at gcc dot gnu.org
  Target Milestone: ---
              Host: x86_64-linux
            Target: x86_64-linux

LNT reports that 541.leela_r from SPEC 2017 intrate suite regressed
when compiled with both PGO and LTO with -Ofast -march=native on all
machines in the first week of January:

zen3: https://lnt.opensuse.org/db_default/v4/SPEC/graph?plot.0=477.397.0
zen2: https://lnt.opensuse.org/db_default/v4/SPEC/graph?plot.0=286.397.0
zen1: https://lnt.opensuse.org/db_default/v4/SPEC/graph?plot.0=17.397.0
kaby: https://lnt.opensuse.org/db_default/v4/SPEC/graph?plot.0=16.397.0

On my zen2 desktop I have bisected the regression, or at least most of
it, to r12-6208-gebc853deb7cc04:

  ebc853deb7cc0487de9ef6e891a007ba853d1933 is the first bad commit
  commit ebc853deb7cc0487de9ef6e891a007ba853d1933
  Author: Richard Biener <rguenther@suse.de>
  Date:   Tue Jan 4 11:59:35 2022 +0100

    tree-optimization/103690 - not up-to-date SSA and PRE DCE

    This avoids running simple_dce_from_worklist on partially not up-to-date
    SSA form (in unreachable code regions) by scheduling CFG cleanup
    manually as is done anyway when tail-merging runs.

    2022-01-04  Richard Biener  <rguenther@suse.de>

            PR tree-optimization/103690
            * tree-pass.h (tail_merge_optimize): Adjust.
            * tree-ssa-tail-merge.c (tail_merge_optimize): Pass in whether
            to re-split critical edges, move CFG cleanup ...
            * tree-ssa-pre.c (pass_pre::execute): ... here, before
            simple_dce_from_worklist and delay freeing inserted_exprs from
            ...
            (fini_pre): .. here.


* [Bug tree-optimization/103990] 541.leela_r slower by 4.5-6% with PGO+LTO -Ofast -march=native in the first week of January 2022
From: rguenth at gcc dot gnu.org @ 2022-01-12 13:48 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103990

Richard Biener <rguenth at gcc dot gnu.org> changed:

           What    |Removed                         |Added
--------------------------------------------------------------------------------
             Status|UNCONFIRMED                     |ASSIGNED
   Last reconfirmed|                                |2022-01-12
     Ever confirmed|0                               |1
           Assignee|unassigned at gcc dot gnu.org   |rguenth at gcc dot gnu.org

--- Comment #1 from Richard Biener <rguenth at gcc dot gnu.org> ---
OK, so the only effect I can think of is that simple_dce_from_worklist can end
up removing the last stmt in a BB and thus _eventually_ expose BB merging CFG
cleanup opportunities.  I also notice that while tail_merge_optimize altered
todo by clearing TODO_cleanup_cfg, PRE just did (and still does)

-  todo |= tail_merge_optimize (todo);
+  todo |= tail_merge_optimize (todo, need_crit_edge_split);

so it would have retained TODO_cleanup_cfg, something we now do not.  The
code is all somewhat of a mess due to the embedded tail-merge and I tried
to make as few changes as possible this late in the cycle.

I'll try to reproduce and see if keeping TODO_cleanup_cfg around helps.
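
The flag handling described above can be illustrated with a small
standalone model.  This is only a sketch, not GCC code: the flag values
and the helpers tail_merge_model, old_flow and new_flow are hypothetical
stand-ins, and the model assumes that tail-merge hands back its todo with
TODO_cleanup_cfg cleared.  It shows why OR-ing the result back into the
caller's todo kept the flag, while clearing it in the caller drops it.

```cpp
#include <cassert>

// Hypothetical flag values for illustration; GCC defines the real
// TODO_* flags in tree-pass.h.
constexpr unsigned TODO_cleanup_cfg = 1u << 0;
constexpr unsigned TODO_update_ssa  = 1u << 1;

// Assumed stand-in for tail-merge: it works on its own copy of todo
// and returns that copy with TODO_cleanup_cfg cleared.
unsigned tail_merge_model (unsigned todo)
{
  return todo & ~TODO_cleanup_cfg;
}

// Model of the earlier flow: the caller ORs the result into its own
// todo, so a flag that was already set in the caller stays set.
unsigned old_flow (unsigned todo)
{
  todo |= tail_merge_model (todo);
  return todo;
}

// Model of the flow after the PRE change: the caller runs CFG cleanup
// itself and clears the flag before tail-merge is called.
unsigned new_flow (unsigned todo)
{
  if (todo & TODO_cleanup_cfg)
    todo &= ~TODO_cleanup_cfg;  // manual cleanup would happen here
  todo |= tail_merge_model (todo);
  return todo;
}
```

In this model, old_flow (TODO_cleanup_cfg | TODO_update_ssa) still has
TODO_cleanup_cfg set on return, while new_flow with the same input does
not, which matches the difference suspected above.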


* [Bug tree-optimization/103990] 541.leela_r slower by 4.5-6% with PGO+LTO -Ofast -march=native in the first week of January 2022
From: rguenth at gcc dot gnu.org @ 2022-01-12 13:59 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103990

--- Comment #2 from Richard Biener <rguenth at gcc dot gnu.org> ---
diff --git a/gcc/tree-ssa-pre.c b/gcc/tree-ssa-pre.c
index ab24fa98a1f..2bdfae5482f 100644
--- a/gcc/tree-ssa-pre.c
+++ b/gcc/tree-ssa-pre.c
@@ -4442,7 +4442,6 @@ pass_pre::execute (function *fun)
   if (todo & TODO_cleanup_cfg)
     {
       cleanup_tree_cfg ();
-      todo &= ~TODO_cleanup_cfg;
       need_crit_edge_split = true;
     }

should fix that
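
As a rough illustration of why retaining the flag matters, here is a toy
model of the pass flow.  Everything in it is a hypothetical stand-in
(ToyFunction, cleanup, execute_model, run_pass and the flag value), not
the GCC implementation: a "function" is just a list of per-block
statement counts, cleanup drops empty blocks, and DCE deletes one
statement from the last block.  Only when the pass returns
TODO_cleanup_cfg does the modelled pass manager clean up after DCE.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

constexpr unsigned TODO_cleanup_cfg = 1u << 0;  // hypothetical value

// Toy function body: number of statements in each basic block.
struct ToyFunction
{
  std::vector<int> stmts_per_block;
};

// Toy CFG cleanup: remove blocks that contain no statements.
void cleanup (ToyFunction &fn)
{
  auto &b = fn.stmts_per_block;
  b.erase (std::remove (b.begin (), b.end (), 0), b.end ());
}

// Toy pass body: optional manual cleanup first, then a DCE step that
// deletes one statement from the last block, possibly emptying it.
unsigned execute_model (ToyFunction &fn, unsigned todo, bool retain_flag)
{
  if (todo & TODO_cleanup_cfg)
    {
      cleanup (fn);
      if (!retain_flag)
        todo &= ~TODO_cleanup_cfg;  // the line the patch removes
    }
  if (!fn.stmts_per_block.empty () && fn.stmts_per_block.back () > 0)
    --fn.stmts_per_block.back ();
  return todo;
}

// Toy pass manager: run the pass, then honor the returned todo flags.
// Returns the number of blocks left afterwards.
std::size_t run_pass (ToyFunction fn, bool retain_flag)
{
  unsigned todo = execute_model (fn, TODO_cleanup_cfg, retain_flag);
  if (todo & TODO_cleanup_cfg)
    cleanup (fn);
  return fn.stmts_per_block.size ();
}
```

With two blocks holding 2 and 1 statements, the model leaves an empty
block behind when the flag is cleared and removes it when the flag is
retained.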


* [Bug tree-optimization/103990] 541.leela_r slower by 4.5-6% with PGO+LTO -Ofast -march=native in the first week of January 2022
From: jamborm at gcc dot gnu.org @ 2022-01-12 14:30 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103990

--- Comment #3 from Martin Jambor <jamborm at gcc dot gnu.org> ---
(In reply to Richard Biener from comment #2)
> 
> should fix that

I can confirm that it does.


* [Bug tree-optimization/103990] 541.leela_r slower by 4.5-6% with PGO+LTO -Ofast -march=native in the first week of January 2022
From: cvs-commit at gcc dot gnu.org @ 2022-01-12 15:18 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103990

--- Comment #4 from CVS Commits <cvs-commit at gcc dot gnu.org> ---
The master branch has been updated by Richard Biener <rguenth@gcc.gnu.org>:

https://gcc.gnu.org/g:2f62294dec1f3af59dd7505c058b0af38c2d1524

commit r12-6527-g2f62294dec1f3af59dd7505c058b0af38c2d1524
Author: Richard Biener <rguenther@suse.de>
Date:   Wed Jan 12 15:25:07 2022 +0100

    tree-optimization/103990 - fix CFG cleanup regression from PRE change

    This adjusts the CFG cleanup flow back to what it was before the
    last change which fixes the observed regression of 541.leela_r with
    LTO and FDO.

    2022-01-12  Richard Biener  <rguenther@suse.de>

            PR tree-optimization/103990
            * tree-pass.h (tail_merge_optimize): Drop unused argument.
            * tree-ssa-tail-merge.c (tail_merge_optimize): Likewise.
            * tree-ssa-pre.c (pass_pre::execute): Retain TODO_cleanup_cfg
            and adjust call to tail_merge_optimize.


* [Bug tree-optimization/103990] 541.leela_r slower by 4.5-6% with PGO+LTO -Ofast -march=native in the first week of January 2022
From: rguenth at gcc dot gnu.org @ 2022-01-12 15:18 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103990

Richard Biener <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|ASSIGNED                    |RESOLVED
         Resolution|---                         |FIXED

--- Comment #5 from Richard Biener <rguenth at gcc dot gnu.org> ---
Fixed.


* [Bug tree-optimization/103990] [12 Regression] 541.leela_r slower by 4.5-6% with PGO+LTO -Ofast -march=native in the first week of January 2022
From: pinskia at gcc dot gnu.org @ 2022-01-19  9:56 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103990

Andrew Pinski <pinskia at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Keywords|                            |missed-optimization
            Summary|541.leela_r slower by       |[12 Regression] 541.leela_r
                   |4.5-6% with PGO+LTO -Ofast  |slower by 4.5-6% with
                   |-march=native in the first  |PGO+LTO -Ofast
                   |week of January 2022        |-march=native in the first
                   |                            |week of January 2022
   Target Milestone|---                         |12.0
