public inbox for gcc-bugs@sourceware.org
help / color / mirror / Atom feed
* [Bug c++/101555] New: Compile slowdown in tree PRE
@ 2021-07-21 14:56 wsnyder at wsnyder dot org
2021-07-21 14:59 ` [Bug c++/101555] " wsnyder at wsnyder dot org
` (7 more replies)
0 siblings, 8 replies; 9+ messages in thread
From: wsnyder at wsnyder dot org @ 2021-07-21 14:56 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101555
Bug ID: 101555
Summary: Compile slowdown in tree PRE
Product: gcc
Version: 9.3.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: c++
Assignee: unassigned at gcc dot gnu.org
Reporter: wsnyder at wsnyder dot org
Target Milestone: ---
Running
g++ (Ubuntu 9.3.0-17ubuntu1~20.04) 9.3.0
as
g++ -I. -c -Os -ftime-report -o /dev/null Vara___024root__114_0.cpp
takes 297 CPU seconds.
On clang version 10.0.0-4ubuntu1 it takes 37 seconds total.
In GCC, 276 of 297 seconds are in the "tree PRE" stage. Splitting the large
function into roughly three pieces (Vara___024root__114_0.cpp, _1 and _2)
the "tree PRE" stage takes 12.7, 7.8 and 3.9 seconds (totalling 24.4).
Therefore I suspect some O(n^2) issue in tree PRE.
Note removing -Os makes gcc much faster, but still signifcantly slower than
clang.
This issue was originally reported using gcc 8.2.0 by Matheus Cavalcante
at https://github.com/verilator/verilator/issues/3072
^ permalink raw reply [flat|nested] 9+ messages in thread
* [Bug c++/101555] Compile slowdown in tree PRE
2021-07-21 14:56 [Bug c++/101555] New: Compile slowdown in tree PRE wsnyder at wsnyder dot org
@ 2021-07-21 14:59 ` wsnyder at wsnyder dot org
2021-07-22 8:41 ` [Bug tree-optimization/101555] [12 Regression] " rguenth at gcc dot gnu.org
` (6 subsequent siblings)
7 siblings, 0 replies; 9+ messages in thread
From: wsnyder at wsnyder dot org @ 2021-07-21 14:59 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101555
--- Comment #1 from Wilson Snyder <wsnyder at wsnyder dot org> ---
Created attachment 51187
--> https://gcc.gnu.org/bugzilla/attachment.cgi?id=51187&action=edit
Examples and runtimes
^ permalink raw reply [flat|nested] 9+ messages in thread
* [Bug tree-optimization/101555] [12 Regression] Compile slowdown in tree PRE
2021-07-21 14:56 [Bug c++/101555] New: Compile slowdown in tree PRE wsnyder at wsnyder dot org
2021-07-21 14:59 ` [Bug c++/101555] " wsnyder at wsnyder dot org
@ 2021-07-22 8:41 ` rguenth at gcc dot gnu.org
2021-07-22 9:24 ` rguenth at gcc dot gnu.org
` (5 subsequent siblings)
7 siblings, 0 replies; 9+ messages in thread
From: rguenth at gcc dot gnu.org @ 2021-07-22 8:41 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101555
Richard Biener <rguenth at gcc dot gnu.org> changed:
What |Removed |Added
----------------------------------------------------------------------------
Target Milestone|--- |12.0
Keywords| |needs-bisection
Summary|Compile slowdown in tree |[12 Regression] Compile
|PRE |slowdown in tree PRE
Assignee|unassigned at gcc dot gnu.org |rguenth at gcc dot gnu.org
Status|UNCONFIRMED |ASSIGNED
Version|9.3.0 |12.0
Ever confirmed|0 |1
Last reconfirmed| |2021-07-22
--- Comment #2 from Richard Biener <rguenth at gcc dot gnu.org> ---
Confirmed with g++-9 (SUSE Linux) 9.3.1 20200406 [revision
6db837a5288ee3ca5ec504fbd5a765817e556ac2]
tree PRE : 645.41 ( 97%) 0.16 ( 34%) 645.60 ( 97%)
177896 kB ( 15%)
TOTAL : 665.14 0.47 665.73
1154121 kB
with -fno-code-hoisting that get's down to
tree PRE : 219.50 ( 93%) 0.04 ( 11%) 219.69 ( 93%)
87813 kB ( 11%)
TOTAL : 235.70 0.37 236.12
807306 kB
GCC 10.3.0 shows
tree PRE : 267.62 ( 93%) 0.13 ( 32%) 267.69 ( 93%)
140858 kB ( 13%)
TOTAL : 288.92 0.41 289.37
1122606 kB
GCC 11 does
tree PRE : 251.41 ( 91%) 0.10 ( 23%) 251.71 ( 91%)
137M ( 16%)
TOTAL : 275.01 0.43 275.51
837M
Note that trunk regresses again:
tree PRE : 312.12 ( 93%) 0.07 ( 15%) 312.46 ( 93%)
137M ( 16%)
TOTAL : 337.03 0.48 337.58
836M
bisection for that would be nice.
So part of this PR is a dup of PR97623, but there's still slowness of PRE
(as said elsewhere PRE is known quadratic).
I'm turning this PR into the GCC 12 regression that shows above.
^ permalink raw reply [flat|nested] 9+ messages in thread
* [Bug tree-optimization/101555] [12 Regression] Compile slowdown in tree PRE
2021-07-21 14:56 [Bug c++/101555] New: Compile slowdown in tree PRE wsnyder at wsnyder dot org
2021-07-21 14:59 ` [Bug c++/101555] " wsnyder at wsnyder dot org
2021-07-22 8:41 ` [Bug tree-optimization/101555] [12 Regression] " rguenth at gcc dot gnu.org
@ 2021-07-22 9:24 ` rguenth at gcc dot gnu.org
2021-09-06 14:44 ` [Bug tree-optimization/101555] " rguenth at gcc dot gnu.org
` (4 subsequent siblings)
7 siblings, 0 replies; 9+ messages in thread
From: rguenth at gcc dot gnu.org @ 2021-07-22 9:24 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101555
--- Comment #3 from Richard Biener <rguenth at gcc dot gnu.org> ---
Btw, it behaves as "intended":
> grep -i iteration Vara___024root__114.cpp.138t.pre
RPO iteration over 1050 blocks visited 1050 blocks in total discovering 1050
executable blocks iterating 1.0 times, a block was visited max. 1 times
Starting iteration 0
Starting iteration 1
Starting insert iteration 1
worklist iteration #1
so all "iterating" stuff proceeds optimally.
> grep 'Insert' Vara___024root__114.cpp.138t.pre | wc -l
398400
so that's a lot of inserts. GCC 11 has the same.
Compared to GCC 11 PRE and VN didn't change in any significant way, so I'm
really wondering what this is about. The IL into PRE is also quite similar.
^ permalink raw reply [flat|nested] 9+ messages in thread
* [Bug tree-optimization/101555] Compile slowdown in tree PRE
2021-07-21 14:56 [Bug c++/101555] New: Compile slowdown in tree PRE wsnyder at wsnyder dot org
` (2 preceding siblings ...)
2021-07-22 9:24 ` rguenth at gcc dot gnu.org
@ 2021-09-06 14:44 ` rguenth at gcc dot gnu.org
2021-09-07 7:04 ` rguenth at gcc dot gnu.org
` (3 subsequent siblings)
7 siblings, 0 replies; 9+ messages in thread
From: rguenth at gcc dot gnu.org @ 2021-09-06 14:44 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101555
Richard Biener <rguenth at gcc dot gnu.org> changed:
What |Removed |Added
----------------------------------------------------------------------------
Summary|[12 Regression] Compile |Compile slowdown in tree
|slowdown in tree PRE |PRE
Target Milestone|12.0 |---
Keywords|needs-bisection |
--- Comment #4 from Richard Biener <rguenth at gcc dot gnu.org> ---
Meanwhile GCC 12 seems to be as fast as GCC 11 (again).
Still PRE is slow, in this case the profile seems to be mostly alias walk
related.
^ permalink raw reply [flat|nested] 9+ messages in thread
* [Bug tree-optimization/101555] Compile slowdown in tree PRE
2021-07-21 14:56 [Bug c++/101555] New: Compile slowdown in tree PRE wsnyder at wsnyder dot org
` (3 preceding siblings ...)
2021-09-06 14:44 ` [Bug tree-optimization/101555] " rguenth at gcc dot gnu.org
@ 2021-09-07 7:04 ` rguenth at gcc dot gnu.org
2021-09-07 9:14 ` cvs-commit at gcc dot gnu.org
` (2 subsequent siblings)
7 siblings, 0 replies; 9+ messages in thread
From: rguenth at gcc dot gnu.org @ 2021-09-07 7:04 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101555
--- Comment #5 from Richard Biener <rguenth at gcc dot gnu.org> ---
OK, so most of the time is spent in ANTIC compute, specifically PHI translation
and there translate_vuse_through_block doing the (rate limited)
stmt_may_clobber_ref_p_1 query.
It's a bit fishy that we're doing these things "twice", in particular
we are supposed to translate ANTIC_IN to the predecessor ANTIC_OUT, thus
all VUSEs should be already top-of-block. But then in compute_antic_aux
we're doing
/* Prune expressions that are clobbered in block and thus become
invalid if translated from ANTIC_OUT to ANTIC_IN. */
prune_clobbered_mems (ANTIC_OUT, block);
which is supposed to do the "translation" through the block but that does
not adjust VUSEs of expressions. That uses value_dies_in_block_x, something
with a cache but also following a somewhat different logic with respect
to the VUSEs in the expression.
The main complication here is that the expression set we start from is
taken from the VN tables which may put "pre-translated" VUSEs in rather
than starting with the VUSEs as they were.
^ permalink raw reply [flat|nested] 9+ messages in thread
* [Bug tree-optimization/101555] Compile slowdown in tree PRE
2021-07-21 14:56 [Bug c++/101555] New: Compile slowdown in tree PRE wsnyder at wsnyder dot org
` (4 preceding siblings ...)
2021-09-07 7:04 ` rguenth at gcc dot gnu.org
@ 2021-09-07 9:14 ` cvs-commit at gcc dot gnu.org
2021-09-07 9:22 ` rguenth at gcc dot gnu.org
2022-07-22 9:48 ` rguenth at gcc dot gnu.org
7 siblings, 0 replies; 9+ messages in thread
From: cvs-commit at gcc dot gnu.org @ 2021-09-07 9:14 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101555
--- Comment #6 from CVS Commits <cvs-commit at gcc dot gnu.org> ---
The master branch has been updated by Richard Biener <rguenth@gcc.gnu.org>:
https://gcc.gnu.org/g:f387ff788f63c1974479644edae728047f843ec4
commit r12-3378-gf387ff788f63c1974479644edae728047f843ec4
Author: Richard Biener <rguenther@suse.de>
Date: Tue Sep 7 10:35:42 2021 +0200
tree-optimization/101555 - avoid redundant alias queries in PRE
This avoids doing redundant work during PHI translation to invalidate
mems when translating their corresponding VUSE through the blocks
virtual PHI node. All the invalidation work is already done by
prune_clobbered_mems.
This speeds up the compile of the testcase from 275s with PRE
taking 91% of the compile-time down to 43s with PRE taking 16%
of the compile-time.
2021-09-07 Richard Biener <rguenther@suse.de>
PR tree-optimization/101555
* tree-ssa-pre.c (translate_vuse_through_block): Do not
perform an alias walk to determine the validity of the
mem at the start of the block which is already guaranteed
by means of prune_clobbered_mems.
(phi_translate_1): Pass edge to translate_vuse_through_block.
^ permalink raw reply [flat|nested] 9+ messages in thread
* [Bug tree-optimization/101555] Compile slowdown in tree PRE
2021-07-21 14:56 [Bug c++/101555] New: Compile slowdown in tree PRE wsnyder at wsnyder dot org
` (5 preceding siblings ...)
2021-09-07 9:14 ` cvs-commit at gcc dot gnu.org
@ 2021-09-07 9:22 ` rguenth at gcc dot gnu.org
2022-07-22 9:48 ` rguenth at gcc dot gnu.org
7 siblings, 0 replies; 9+ messages in thread
From: rguenth at gcc dot gnu.org @ 2021-09-07 9:22 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101555
--- Comment #7 from Richard Biener <rguenth at gcc dot gnu.org> ---
The committed change improves compile-time to less than 50s, in principle it
also applies to the GCC 11 and 10 branches where the related issue was fixed.
^ permalink raw reply [flat|nested] 9+ messages in thread
* [Bug tree-optimization/101555] Compile slowdown in tree PRE
2021-07-21 14:56 [Bug c++/101555] New: Compile slowdown in tree PRE wsnyder at wsnyder dot org
` (6 preceding siblings ...)
2021-09-07 9:22 ` rguenth at gcc dot gnu.org
@ 2022-07-22 9:48 ` rguenth at gcc dot gnu.org
7 siblings, 0 replies; 9+ messages in thread
From: rguenth at gcc dot gnu.org @ 2022-07-22 9:48 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101555
Richard Biener <rguenth at gcc dot gnu.org> changed:
What |Removed |Added
----------------------------------------------------------------------------
Status|ASSIGNED |RESOLVED
Resolution|--- |FIXED
--- Comment #8 from Richard Biener <rguenth at gcc dot gnu.org> ---
Fixed in GCC 12.
^ permalink raw reply [flat|nested] 9+ messages in thread
end of thread, other threads:[~2022-07-22 9:48 UTC | newest]
Thread overview: 9+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-07-21 14:56 [Bug c++/101555] New: Compile slowdown in tree PRE wsnyder at wsnyder dot org
2021-07-21 14:59 ` [Bug c++/101555] " wsnyder at wsnyder dot org
2021-07-22 8:41 ` [Bug tree-optimization/101555] [12 Regression] " rguenth at gcc dot gnu.org
2021-07-22 9:24 ` rguenth at gcc dot gnu.org
2021-09-06 14:44 ` [Bug tree-optimization/101555] " rguenth at gcc dot gnu.org
2021-09-07 7:04 ` rguenth at gcc dot gnu.org
2021-09-07 9:14 ` cvs-commit at gcc dot gnu.org
2021-09-07 9:22 ` rguenth at gcc dot gnu.org
2022-07-22 9:48 ` rguenth at gcc dot gnu.org
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).