* [Bug c++/13776] Many C++ compile-time regression in 3.5-tree-ssa 040120 in comparison with 3.4.0 040114
2004-01-20 18:39 [Bug c++/13776] New: Many C++ compile-time regression in 3.5-tree-ssa 040120 in comparison with 3.4.0 040114 kgardas at objectsecurity dot com
@ 2004-01-20 18:53 ` bangerth at dealii dot org
2004-01-20 18:57 ` kgardas at objectsecurity dot com
` (89 subsequent siblings)
90 siblings, 0 replies; 93+ messages in thread
From: bangerth at dealii dot org @ 2004-01-20 18:53 UTC (permalink / raw)
To: gcc-bugs
------- Additional Comments From bangerth at dealii dot org 2004-01-20 18:52 -------
*** This bug has been marked as a duplicate of 13775 ***
--
What |Removed |Added
----------------------------------------------------------------------------
Status|UNCONFIRMED |RESOLVED
Resolution| |DUPLICATE
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=13776
* [Bug c++/13776] Many C++ compile-time regression in 3.5-tree-ssa 040120 in comparison with 3.4.0 040114
From: kgardas at objectsecurity dot com @ 2004-01-20 18:57 UTC (permalink / raw)
To: gcc-bugs
------- Additional Comments From kgardas at objectsecurity dot com 2004-01-20 18:57 -------
Sorry, I don't understand -- this bug report is about a regression in 3.5-tree-ssa,
while 13775 is about a regression in 3.4.0. I thought they should be separate
bug reports for different sets of people (working on different branches). Should I
reopen the bug in this case?
Thanks,
Karel
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=13776
* [Bug c++/13776] Many C++ compile-time regression in 3.5-tree-ssa 040120 in comparison with 3.4.0 040114
From: dhazeghi at yahoo dot com @ 2004-01-20 19:03 UTC (permalink / raw)
To: gcc-bugs
------- Additional Comments From dhazeghi at yahoo dot com 2004-01-20 19:03 -------
I think Wolfgang's rationale is that the problem is compilation speed, and fixing that problem will
fix both bugs. Not sure I agree though...
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=13776
* [Bug c++/13776] Many C++ compile-time regression in 3.5-tree-ssa 040120 in comparison with 3.4.0 040114
From: kgardas at objectsecurity dot com @ 2004-01-20 19:17 UTC (permalink / raw)
To: gcc-bugs
------- Additional Comments From kgardas at objectsecurity dot com 2004-01-20 19:17 -------
Hmm, well, fixing 13775 might also fix 13777, but certainly not this problem, which
is a regression in tree-ssa. So I am reopening this bug, especially since I know that
tree-ssa developers are curious to see such regressions.
--
What |Removed |Added
----------------------------------------------------------------------------
Status|RESOLVED |UNCONFIRMED
Resolution|DUPLICATE |
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=13776
* [Bug c++/13776] Many C++ compile-time regression in 3.5-tree-ssa 040120 in comparison with 3.4.0 040114
From: pinskia at gcc dot gnu dot org @ 2004-01-20 19:29 UTC (permalink / raw)
To: gcc-bugs
------- Additional Comments From pinskia at gcc dot gnu dot org 2004-01-20 19:29 -------
I am wondering how much of this is due to the work that was done after the last merge into
tree-ssa.
--
What |Removed |Added
----------------------------------------------------------------------------
Keywords| |compile-time-hog
Target Milestone|--- |tree-ssa
Version|3.5.0 |tree-ssa
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=13776
* [Bug c++/13776] Many C++ compile-time regression in 3.5-tree-ssa 040120 in comparison with 3.4.0 040114
From: bangerth at dealii dot org @ 2004-01-20 19:42 UTC (permalink / raw)
To: gcc-bugs
------- Additional Comments From bangerth at dealii dot org 2004-01-20 19:42 -------
My bad, I misread it. Sorry
W.
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=13776
* [Bug c++/13776] [tree-ssa] Many C++ compile-time regression in 3.5-tree-ssa 040120 in comparison with 3.4.0 040114
From: pinskia at gcc dot gnu dot org @ 2004-01-21 23:33 UTC (permalink / raw)
To: gcc-bugs
--
What |Removed |Added
----------------------------------------------------------------------------
Severity|normal |critical
Summary|Many C++ compile-time |[tree-ssa] Many C++ compile-
|regression in 3.5-tree-ssa |time regression in 3.5-tree-
|040120 in comparison with |ssa 040120 in comparison
|3.4.0 040114 |with 3.4.0 040114
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=13776
* [Bug c++/13776] [tree-ssa] Many C++ compile-time regression in 3.5-tree-ssa 040120
From: mmitchel at gcc dot gnu dot org @ 2004-01-25 21:03 UTC (permalink / raw)
To: gcc-bugs
------- Additional Comments From mmitchel at gcc dot gnu dot org 2004-01-25 21:03 -------
Measurements made in comparison with 3.4.0 040114.
--
What |Removed |Added
----------------------------------------------------------------------------
Summary|[tree-ssa] Many C++ compile-|[tree-ssa] Many C++ compile-
|time regression in 3.5-tree-|time regression in 3.5-tree-
|ssa 040120 in comparison |ssa 040120
|with 3.4.0 040114 |
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=13776
* [Bug c++/13776] [tree-ssa] Many C++ compile-time regression in 3.5-tree-ssa 040120
From: dnovillo at redhat dot com @ 2004-03-03 15:06 UTC (permalink / raw)
To: gcc-bugs
------- Additional Comments From dnovillo at redhat dot com 2004-03-03 15:06 -------
Subject: [Fwd: [tree-ssa] 20% compile time regression vs.
3.4]
Adding to PR notes. More related C++ compile time regressions.
Diego.
-----Forwarded Message-----
From: Richard Guenther <rguenth@tat.physik.uni-tuebingen.de>
To: gcc@gcc.gnu.org
Subject: [tree-ssa] 20% compile time regression vs. 3.4
Date: Wed, 03 Mar 2004 15:46:01 +0100
Hi!
I thought it was time for another 3.4 vs. tree-ssa compile-time
comparison. At -O2, compile time regressed quite a bit (20%); the
main problem areas are (first line 3.4, second line tree-ssa):
garbage collection : 12.19 ( 7%) usr 0.00 ( 0%) sys 12.50 ( 7%) wall
garbage collection : 17.26 ( 8%) usr 0.02 ( 0%) sys 17.45 ( 8%) wall
tree-ssa uses about double the amount of memory
parser : 14.59 ( 9%) usr 1.26 (27%) sys 16.41 ( 9%) wall
parser : 18.29 ( 8%) usr 1.42 (27%) sys 19.94 ( 9%) wall
I cannot make any sense out of this - are there significant changes to the
parser!? Maybe a much larger libstdc++?
integration : 17.86 (11%) usr 0.29 ( 6%) sys 18.34 (10%) wall
integration : 21.62 (10%) usr 0.18 ( 3%) sys 22.19 (10%) wall
probably different inlining choices
and finally some tree-ssa optimizer numbers stick out
tree gimplify : 3.39 ( 2%) usr 0.04 ( 1%) sys 3.48 ( 1%) wall
tree eh : 2.71 ( 1%) usr 0.01 ( 0%) sys 2.77 ( 1%) wall
tree CFG construction : 1.69 ( 1%) usr 0.12 ( 2%) sys 1.87 ( 1%) wall
tree CFG cleanup : 2.89 ( 1%) usr 0.02 ( 0%) sys 2.98 ( 1%) wall
tree PTA : 0.49 ( 0%) usr 0.03 ( 1%) sys 0.52 ( 0%) wall
tree alias analysis : 0.71 ( 0%) usr 0.01 ( 0%) sys 0.75 ( 0%) wall
tree PHI insertion : 2.14 ( 1%) usr 0.04 ( 1%) sys 2.25 ( 1%) wall
tree SSA rewrite : 2.94 ( 1%) usr 0.01 ( 0%) sys 3.03 ( 1%) wall
tree SSA other : 3.77 ( 2%) usr 0.33 ( 6%) sys 4.17 ( 2%) wall
tree operand scan : 2.95 ( 1%) usr 0.46 ( 8%) sys 3.51 ( 2%) wall
dominator optimization: 14.06 ( 6%) usr 0.20 ( 4%) sys 14.60 ( 6%) wall
tree SRA : 0.29 ( 0%) usr 0.00 ( 0%) sys 0.31 ( 0%) wall
tree CCP : 2.29 ( 1%) usr 0.00 ( 0%) sys 2.39 ( 1%) wall
tree split crit edges : 0.27 ( 0%) usr 0.00 ( 0%) sys 0.28 ( 0%) wall
tree PRE : 6.11 ( 3%) usr 0.06 ( 1%) sys 6.40 ( 3%) wall
tree linearize phis : 0.03 ( 0%) usr 0.00 ( 0%) sys 0.03 ( 0%) wall
tree forward propagate: 1.37 ( 1%) usr 0.00 ( 0%) sys 1.42 ( 1%) wall
tree conservative DCE : 2.71 ( 1%) usr 0.02 ( 0%) sys 2.80 ( 1%) wall
tree aggressive DCE : 1.40 ( 1%) usr 0.00 ( 0%) sys 1.45 ( 1%) wall
tree DSE : 3.30 ( 2%) usr 0.03 ( 1%) sys 3.42 ( 1%) wall
tree copy headers : 1.80 ( 1%) usr 0.01 ( 0%) sys 1.84 ( 1%) wall
tree SSA to normal : 3.18 ( 1%) usr 0.13 ( 2%) sys 3.39 ( 1%) wall
tree NRV optimization : 0.01 ( 0%) usr 0.00 ( 0%) sys 0.01 ( 0%) wall
tree rename SSA copies: 0.83 ( 0%) usr 0.03 ( 1%) sys 0.88 ( 0%) wall
namely DOM (again) and PRE.
This is with the famous tramp3d-v2.cpp testcase you can find at
http://www.tat.physik.uni-tuebingen.de/~rguenth/gcc/tramp3d-v2.cpp.gz
g++-ssa (GCC) 3.5-tree-ssa 20040303 (merged 20040227)
g++ (GCC) 3.4.0 20040301 (prerelease)
compiled with -O2 -c tramp3d-v2.cpp -Dleafify=fooblah -ftime-report to
disable leafify effects. The 3.4 compiler was profiledbootstrapped while
the ssa one was only bootstrapped. Of course checking was disabled.
Thanks,
Richard.
--
Richard Guenther <richard dot guenther at uni-tuebingen dot de>
WWW: http://www.tat.physik.uni-tuebingen.de/~rguenth/
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=13776
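Editorial sketch: the paired 3.4 / tree-ssa rows quoted in the forwarded message can be compared mechanically. The snippet below is a minimal illustration, assuming each `-ftime-report` row fits on one line in the layout shown in the message; the names `ROW`, `parse_report` and `percent_increase` are made up for this example and are not part of GCC.

```python
import re

# One -ftime-report row: "<phase> : <usr> (p%) usr <sys> (p%) sys <wall> (p%) wall"
# (layout assumed from the rows quoted in the message)
ROW = re.compile(
    r"^\s*(?P<phase>.+?)\s*:\s*"
    r"(?P<usr>\d+\.\d+)\s*\(\s*\d+%\)\s*usr\s+"
    r"(?P<sys>\d+\.\d+)\s*\(\s*\d+%\)\s*sys\s+"
    r"(?P<wall>\d+\.\d+)\s*\(\s*\d+%\)\s*wall\s*$"
)


def parse_report(text):
    """Return {phase name: user seconds} for every line that matches ROW."""
    times = {}
    for line in text.splitlines():
        match = ROW.match(line)
        if match:
            times[match.group("phase")] = float(match.group("usr"))
    return times


def percent_increase(old, new):
    """Percentage growth in user time for phases present in both reports."""
    return {p: 100.0 * (new[p] - old[p]) / old[p] for p in old if p in new}


# Rows copied from the message: 3.4 first, tree-ssa second.
GCC_34 = """
garbage collection : 12.19 ( 7%) usr 0.00 ( 0%) sys 12.50 ( 7%) wall
parser : 14.59 ( 9%) usr 1.26 (27%) sys 16.41 ( 9%) wall
integration : 17.86 (11%) usr 0.29 ( 6%) sys 18.34 (10%) wall
"""
TREE_SSA = """
garbage collection : 17.26 ( 8%) usr 0.02 ( 0%) sys 17.45 ( 8%) wall
parser : 18.29 ( 8%) usr 1.42 (27%) sys 19.94 ( 9%) wall
integration : 21.62 (10%) usr 0.18 ( 3%) sys 22.19 (10%) wall
"""

increase = percent_increase(parse_report(GCC_34), parse_report(TREE_SSA))
for phase, pct in increase.items():
    print(f"{phase}: +{pct:.1f}% user time")
```

For these three phases alone, garbage collection grows the most in relative terms, followed by the parser and then integration.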
* [Bug c++/13776] [tree-ssa] Many C++ compile-time regression in 3.5-tree-ssa 040120
From: rguenth at tat dot physik dot uni-tuebingen dot de @ 2004-03-03 15:39 UTC (permalink / raw)
To: gcc-bugs
------- Additional Comments From rguenth at tat dot physik dot uni-tuebingen dot de 2004-03-03 15:39 -------
Testcase for the last report can be found attached to PR14408.
--
What |Removed |Added
----------------------------------------------------------------------------
CC| |rguenth at tat dot physik
| |dot uni-tuebingen dot de
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=13776
* [Bug c++/13776] [tree-ssa] Many C++ compile-time regression in 3.5-tree-ssa 040120
From: pinskia at gcc dot gnu dot org @ 2004-03-03 16:27 UTC (permalink / raw)
To: gcc-bugs
------- Additional Comments From pinskia at gcc dot gnu dot org 2004-03-03 16:27 -------
*** Bug 14408 has been marked as a duplicate of this bug. ***
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=13776
* [Bug c++/13776] [tree-ssa] Many C++ compile-time regression in 3.5-tree-ssa 040120
From: rguenth at tat dot physik dot uni-tuebingen dot de @ 2004-03-12 14:21 UTC (permalink / raw)
To: gcc-bugs
------- Additional Comments From rguenth at tat dot physik dot uni-tuebingen dot de 2004-03-12 14:20 -------
Subject: Re: [tree-ssa] Many C++ compile-time regression in
3.5-tree-ssa 040120
Compilation times in mode that matters to me (leafify enabled) degraded
half an order of magnitude:
g++-ssa (GCC) 3.5-tree-ssa 20040311 (merged 20040307)
bellatrix:/tmp$ g++-ssa -O2 -o tramp3d-v2 tramp3d-v2.cpp -static -ftime-report
Execution times (seconds)
garbage collection : 38.48 ( 3%) usr 0.15 ( 1%) sys 38.90 ( 3%) wall
callgraph construction: 1.49 ( 0%) usr 0.00 ( 0%) sys 1.49 ( 0%) wall
callgraph optimization: 1.54 ( 0%) usr 0.08 ( 0%) sys 1.63 ( 0%) wall
cfg construction : 1.35 ( 0%) usr 0.17 ( 1%) sys 1.52 ( 0%) wall
cfg cleanup : 5.68 ( 0%) usr 0.03 ( 0%) sys 5.74 ( 0%) wall
trivially dead code : 5.11 ( 0%) usr 0.01 ( 0%) sys 5.16 ( 0%) wall
life analysis : 9.41 ( 1%) usr 0.05 ( 0%) sys 9.59 ( 1%) wall
life info update : 5.78 ( 0%) usr 0.00 ( 0%) sys 5.81 ( 0%) wall
alias analysis : 7.50 ( 1%) usr 0.03 ( 0%) sys 7.58 ( 1%) wall
register scan : 3.92 ( 0%) usr 0.01 ( 0%) sys 3.95 ( 0%) wall
rebuild jump labels : 1.37 ( 0%) usr 0.00 ( 0%) sys 1.41 ( 0%) wall
preprocessing : 0.51 ( 0%) usr 0.10 ( 0%) sys 0.64 ( 0%) wall
parser : 18.60 ( 2%) usr 1.47 ( 7%) sys 20.14 ( 2%) wall
name lookup : 6.55 ( 1%) usr 1.53 ( 7%) sys 8.10 ( 1%) wall
integration : 67.76 ( 6%) usr 1.46 ( 7%) sys 69.59 ( 6%) wall
tree gimplify : 3.44 ( 0%) usr 0.04 ( 0%) sys 3.50 ( 0%) wall
tree eh : 7.64 ( 1%) usr 0.25 ( 1%) sys 7.96 ( 1%) wall
tree CFG construction : 4.54 ( 0%) usr 0.53 ( 3%) sys 5.07 ( 0%) wall
tree CFG cleanup : 9.67 ( 1%) usr 0.08 ( 0%) sys 9.81 ( 1%) wall
tree PTA : 1.26 ( 0%) usr 0.05 ( 0%) sys 1.31 ( 0%) wall
tree alias analysis : 1.37 ( 0%) usr 0.01 ( 0%) sys 1.38 ( 0%) wall
tree PHI insertion : 74.91 ( 6%) usr 0.24 ( 1%) sys 75.62 ( 6%) wall
tree SSA rewrite : 7.46 ( 1%) usr 0.21 ( 1%) sys 7.72 ( 1%) wall
tree SSA other : 10.67 ( 1%) usr 0.79 ( 4%) sys 11.58 ( 1%) wall
tree operand scan : 6.77 ( 1%) usr 0.61 ( 3%) sys 7.41 ( 1%) wall
dominator optimization: 46.56 ( 4%) usr 1.54 ( 8%) sys 48.39 ( 4%) wall
tree SRA : 0.79 ( 0%) usr 0.02 ( 0%) sys 0.83 ( 0%) wall
tree CCP : 4.88 ( 0%) usr 0.02 ( 0%) sys 4.97 ( 0%) wall
tree split crit edges : 0.64 ( 0%) usr 0.06 ( 0%) sys 0.70 ( 0%) wall
tree PRE : 583.13 (49%) usr 6.18 (30%) sys 592.99 (48%) wall
tree linearize phis : 0.08 ( 0%) usr 0.00 ( 0%) sys 0.08 ( 0%) wall
tree forward propagate: 3.51 ( 0%) usr 0.00 ( 0%) sys 3.53 ( 0%) wall
tree conservative DCE : 6.95 ( 1%) usr 0.08 ( 0%) sys 7.05 ( 1%) wall
tree aggressive DCE : 2.89 ( 0%) usr 0.03 ( 0%) sys 2.93 ( 0%) wall
tree DSE : 6.33 ( 1%) usr 0.19 ( 1%) sys 6.56 ( 1%) wall
tree copy headers : 5.00 ( 0%) usr 0.05 ( 0%) sys 5.09 ( 0%) wall
tree SSA to normal : 9.60 ( 1%) usr 0.42 ( 2%) sys 10.09 ( 1%) wall
tree rename SSA copies: 2.11 ( 0%) usr 0.05 ( 0%) sys 2.17 ( 0%) wall
dominance frontiers : 0.86 ( 0%) usr 0.00 ( 0%) sys 0.89 ( 0%) wall
control dependences : 0.49 ( 0%) usr 0.00 ( 0%) sys 0.51 ( 0%) wall
expand : 42.02 ( 3%) usr 1.38 ( 7%) sys 43.63 ( 4%) wall
varconst : 0.82 ( 0%) usr 0.01 ( 0%) sys 0.83 ( 0%) wall
jump : 8.24 ( 1%) usr 0.36 ( 2%) sys 8.64 ( 1%) wall
CSE : 14.22 ( 1%) usr 0.08 ( 0%) sys 14.39 ( 1%) wall
global CSE : 67.80 ( 6%) usr 0.84 ( 4%) sys 69.06 ( 6%) wall
loop analysis : 11.90 ( 1%) usr 0.02 ( 0%) sys 12.02 ( 1%) wall
bypass jumps : 2.48 ( 0%) usr 0.12 ( 1%) sys 2.60 ( 0%) wall
CSE 2 : 6.33 ( 1%) usr 0.04 ( 0%) sys 6.38 ( 1%) wall
branch prediction : 8.78 ( 1%) usr 0.03 ( 0%) sys 8.88 ( 1%) wall
flow analysis : 0.28 ( 0%) usr 0.00 ( 0%) sys 0.29 ( 0%) wall
combiner : 6.32 ( 1%) usr 0.08 ( 0%) sys 6.46 ( 1%) wall
if-conversion : 1.72 ( 0%) usr 0.02 ( 0%) sys 1.74 ( 0%) wall
regmove : 2.47 ( 0%) usr 0.00 ( 0%) sys 2.48 ( 0%) wall
local alloc : 6.16 ( 1%) usr 0.03 ( 0%) sys 6.25 ( 1%) wall
global alloc : 13.97 ( 1%) usr 0.21 ( 1%) sys 14.22 ( 1%) wall
reload CSE regs : 5.61 ( 0%) usr 0.07 ( 0%) sys 5.75 ( 0%) wall
flow 2 : 1.31 ( 0%) usr 0.07 ( 0%) sys 1.41 ( 0%) wall
if-conversion 2 : 0.90 ( 0%) usr 0.00 ( 0%) sys 0.91 ( 0%) wall
peephole 2 : 0.94 ( 0%) usr 0.02 ( 0%) sys 0.97 ( 0%) wall
rename registers : 1.66 ( 0%) usr 0.07 ( 0%) sys 1.75 ( 0%) wall
scheduling 2 : 8.51 ( 1%) usr 0.13 ( 1%) sys 8.67 ( 1%) wall
machine dep reorg : 1.74 ( 0%) usr 0.00 ( 0%) sys 1.76 ( 0%) wall
reorder blocks : 1.06 ( 0%) usr 0.04 ( 0%) sys 1.10 ( 0%) wall
shorten branches : 1.87 ( 0%) usr 0.05 ( 0%) sys 1.94 ( 0%) wall
reg stack : 0.36 ( 0%) usr 0.00 ( 0%) sys 0.36 ( 0%) wall
final : 2.55 ( 0%) usr 0.18 ( 1%) sys 2.74 ( 0%) wall
symout : 0.04 ( 0%) usr 0.00 ( 0%) sys 0.04 ( 0%) wall
rest of compilation : 4.92 ( 0%) usr 0.07 ( 0%) sys 4.99 ( 0%) wall
TOTAL :1201.61 20.47 1229.69
Look at the PRE times!!!
Also, the resulting binary segfaults and thus is miscompiled (for both
leafify-enabled and leafify-disabled compilation). Ugh.
For reference, the leafify patch still sits at
http://www.tat.physik.uni-tuebingen.de/~rguenth/gcc/leafify-ssa-2
Building an instrumented compiler now.
Richard.
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=13776
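Editorial sketch: a quick way to see how much of the run is spent in each phase is to rank the largest user-time entries against the reported total. The figures below are copied from the report above; the names `PHASES`, `TOTAL_USR` and `share` are illustrative only.

```python
# User-time seconds copied from the -ftime-report output (largest phases only).
TOTAL_USR = 1201.61
PHASES = {
    "tree PRE": 583.13,
    "tree PHI insertion": 74.91,
    "global CSE": 67.80,
    "integration": 67.76,
    "dominator optimization": 46.56,
    "expand": 42.02,
    "garbage collection": 38.48,
}


def share(seconds, total=TOTAL_USR):
    """Percentage of the total user time taken by one phase."""
    return 100.0 * seconds / total


# Largest phase first.
ranked = sorted(PHASES.items(), key=lambda item: item[1], reverse=True)
for name, seconds in ranked:
    print(f"{name:24s} {seconds:8.2f}s {share(seconds):5.1f}%")

pre_share = share(PHASES["tree PRE"])
without_pre = TOTAL_USR - PHASES["tree PRE"]
print(f"user time without tree PRE: {without_pre:.2f}s")
```

The computed share for tree PRE is consistent with the rounded 49% printed in the report; every other phase listed here stays in single digits.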
* [Bug c++/13776] [tree-ssa] Many C++ compile-time regression in 3.5-tree-ssa 040120
From: rguenth at tat dot physik dot uni-tuebingen dot de @ 2004-03-12 23:34 UTC (permalink / raw)
To: gcc-bugs
------- Additional Comments From rguenth at tat dot physik dot uni-tuebingen dot de 2004-03-12 23:34 -------
Subject: Re: [tree-ssa] Many C++ compile-time regression in
3.5-tree-ssa 040120
Richard Guenther wrote:
> Compilation times in mode that matters to me (leafify enabled) degraded
> half an order of magnitude:
>
> g++-ssa (GCC) 3.5-tree-ssa 20040311 (merged 20040307)
>
> bellatrix:/tmp$ g++-ssa -O2 -o tramp3d-v2 tramp3d-v2.cpp -static
> -ftime-report
instrumented compiler gives:
Flat profile:
Each sample counts as 0.01 seconds.
% cumulative self self total
time seconds seconds calls Ks/call Ks/call name
15.51 183.22 183.22 726080 0.00 0.00 process_left_occs_and_kills
8.55 284.25 101.03 16644 0.00 0.00 create_and_insert_occ_in_preorder_dt_order
6.72 363.67 79.42 184430 0.00 0.00 compute_global_livein
6.54 440.93 77.26 16644 0.00 0.00 rename_1
3.13 477.86 36.93 16644 0.00 0.00 clear_all_eref_arrays
3.05 513.86 36.00 16644 0.00 0.00 compute_down_safety
2.49 543.32 29.46 152057062 0.00 0.00 expr_lexically_eq
2.09 568.00 24.68 201843 0.00 0.00 cgraph_remove_node
1.89 590.33 22.33 432753 0.00 0.00 alloc_page
1.70 610.44 20.11 eref_compare
1.70 630.47 20.03 158882 0.00 0.00 compute_transp
1.46 647.70 17.23 416131 0.00 0.00 cgraph_remove_edge
1.42 664.43 16.73 14808480 0.00 0.00 gt_ggc_mx_lang_tree_node
1.13 677.82 13.39 226466163 0.00 0.00 ggc_set_mark
1.07 690.43 12.61 1482 0.00 0.00 collect_expressions
0.90 701.05 10.62 15658237 0.00 0.00 walk_tree
i.e. PRE seems to do something very stupid?
Richard.
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=13776
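Editorial sketch: the per-call columns in the flat profile all read 0.00 because gprof scaled them to kiloseconds per call, which hides the difference between functions that are cheap but called very often and functions that are expensive per invocation. Dividing self seconds by the call count recovers that. The figures are copied from the profile; `PROFILE` and `per_call_ms` are names invented for this example.

```python
# (self seconds, number of calls) copied from the flat profile above.
PROFILE = {
    "process_left_occs_and_kills": (183.22, 726080),
    "create_and_insert_occ_in_preorder_dt_order": (101.03, 16644),
    "compute_global_livein": (79.42, 184430),
    "rename_1": (77.26, 16644),
    "expr_lexically_eq": (29.46, 152057062),
}


def per_call_ms(self_seconds, calls):
    """Average self time of one call, in milliseconds."""
    return 1000.0 * self_seconds / calls


cost = {name: per_call_ms(s, c) for name, (s, c) in PROFILE.items()}
for name, ms in sorted(cost.items(), key=lambda item: item[1], reverse=True):
    print(f"{name:45s} {ms:10.5f} ms/call")
```

Read this way, the functions called 16644 times cost several milliseconds per call, while `expr_lexically_eq` is tiny per call and only shows up because of its roughly 152 million invocations.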
* [Bug c++/13776] [tree-ssa] Many C++ compile-time regression in 3.5-tree-ssa 040120
From: dberlin at dberlin dot org @ 2004-03-13 2:02 UTC (permalink / raw)
To: gcc-bugs
------- Additional Comments From dberlin at dberlin dot org 2004-03-13 02:02 -------
Subject: Re: [tree-ssa] Many C++ compile-time regression in 3.5-tree-ssa 040120
>
> instrumented compiler gives:
>
> Flat profile:
>
> Each sample counts as 0.01 seconds.
> % cumulative self self total
> time seconds seconds calls Ks/call Ks/call name
> 15.51 183.22 183.22 726080 0.00 0.00 process_left_occs_and_kills
This one is O(n^2) or O(n^3) in the number of vuses. Known problem. A
fix is really complicated.
> 8.55 284.25 101.03 16644 0.00 0.00 create_and_insert_occ_in_preorder_dt_order
Hmmmm.
It is attempting to PRE 16644 things.
How many basic blocks do you have?
We shouldn't end up with trying to PRE that many expressions, since we
only try to PRE things that occur at least twice.
> 6.72 363.67 79.42 184430 0.00 0.00 compute_global_livein
> 6.54 440.93 77.26 16644 0.00 0.00 rename_1
> 3.13 477.86 36.93 16644 0.00 0.00 clear_all_eref_arrays
> 3.05 513.86 36.00 16644 0.00 0.00 compute_down_safety
> 2.49 543.32 29.46 152057062 0.00 0.00 expr_lexically_eq
> 2.09 568.00 24.68 201843 0.00 0.00 cgraph_remove_node
> 1.89 590.33 22.33 432753 0.00 0.00 alloc_page
> 1.70 610.44 20.11 eref_compare
> 1.70 630.47 20.03 158882 0.00 0.00 compute_transp
> 1.46 647.70 17.23 416131 0.00 0.00 cgraph_remove_edge
> 1.42 664.43 16.73 14808480 0.00 0.00 gt_ggc_mx_lang_tree_node
> 1.13 677.82 13.39 226466163 0.00 0.00 ggc_set_mark
> 1.07 690.43 12.61 1482 0.00 0.00 collect_expressions
> 0.90 701.05 10.62 15658237 0.00 0.00 walk_tree
>
> i.e. PRE seems to do something very stupid?
>
You must have an incredibly large number of basic blocks or something,
or a very weird flowgraph.
How many BB's are we talking about?
I can't fix the algorithmic properties of the SSAPRE algorithm we use,
which is what you are running into, i'm betting.
I'm working on a new PRE implementation that is O(n^2) memory usage in
the number of phi nodes, but should be a bit faster overall.
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=13776
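Editorial sketch: the remark that PRE only considers things that occur at least twice can be pictured as a simple occurrence filter. This is a toy illustration, not GCC's SSAPRE code: expressions are plain strings compared lexically, and `pre_candidates` is a hypothetical helper.

```python
from collections import Counter


def pre_candidates(expressions, min_occurrences=2):
    """Keep expressions that appear at least `min_occurrences` times.

    Results follow first-appearance order, since Counter preserves
    insertion order like a dict.
    """
    counts = Counter(expressions)
    return [expr for expr, n in counts.items() if n >= min_occurrences]


# Hypothetical expression occurrences collected from one function.
seen = ["a + b", "c * d", "a + b", "e - f", "c * d", "a + b"]
candidates = pre_candidates(seen)
print(candidates)
```

Under this model, a count of 16644 candidates would mean that many distinct expressions each appear two or more times, which is why the question about the number of basic blocks follows.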
* [Bug c++/13776] [tree-ssa] Many C++ compile-time regression in 3.5-tree-ssa 040120
From: dnovillo at redhat dot com @ 2004-03-13 2:08 UTC (permalink / raw)
To: gcc-bugs
------- Additional Comments From dnovillo at redhat dot com 2004-03-13 02:08 -------
Subject: Re: [tree-ssa] Many C++ compile-time regression in
3.5-tree-ssa 040120
On Fri, 2004-03-12 at 21:02, dberlin at dberlin dot org wrote:
> I can't fix the algorithmic properties of the SSAPRE algorithm we use,
> which is what you are running into, i'm betting.
>
Could we add thresholds to back away from overly complicated functions?
Diego.
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=13776
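Editorial sketch: one way to read the threshold suggestion is as a guard that skips an expensive pass once a function exceeds some size limits. Everything below is hypothetical: the `FunctionStats` fields, the limit values, and the function name are invented for illustration and do not correspond to any GCC parameter.

```python
from dataclasses import dataclass


@dataclass
class FunctionStats:
    name: str
    basic_blocks: int
    candidate_exprs: int


# Hypothetical limits; real values would have to be tuned by measurement.
MAX_BASIC_BLOCKS = 10_000
MAX_CANDIDATE_EXPRS = 5_000


def should_run_expensive_pass(stats):
    """Return False when the function is too large for a superlinear pass."""
    return (stats.basic_blocks <= MAX_BASIC_BLOCKS
            and stats.candidate_exprs <= MAX_CANDIDATE_EXPRS)


small = FunctionStats("small_fn", basic_blocks=120, candidate_exprs=40)
huge = FunctionStats("leafified_fn", basic_blocks=50_000, candidate_exprs=16_644)
for fn in (small, huge):
    verdict = "run" if should_run_expensive_pass(fn) else "skip"
    print(f"{fn.name}: {verdict}")
```

The trade-off is that skipped functions lose the optimization entirely, so the limits decide between compile time and generated-code quality.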
* [Bug c++/13776] [tree-ssa] Many C++ compile-time regression in 3.5-tree-ssa 040120
From: dberlin at dberlin dot org @ 2004-03-13 2:08 UTC (permalink / raw)
To: gcc-bugs
------- Additional Comments From dberlin at dberlin dot org 2004-03-13 02:08 -------
Subject: Re: [tree-ssa] Many C++ compile-time regression in 3.5-tree-ssa 040120
> You must have an incredibly large number of basic blocks or something,
> or a very weird flowgraph.
> How many BB's are we talking about?
>
> I can't fix the algorithmic properties of the SSAPRE algorithm we use,
> which is what you are running into, i'm betting.
>
> I'm working on a new PRE implementation that is O(n^2) memory usage in
> the number of phi nodes, but should be a bit faster overall.
>
Regardless, i'll see if i can find a machine with enough memory to look
at these.
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=13776
* [Bug c++/13776] [tree-ssa] Many C++ compile-time regression in 3.5-tree-ssa 040120
From: dberlin at gcc dot gnu dot org @ 2004-03-13 2:10 UTC (permalink / raw)
To: gcc-bugs
------- Additional Comments From dberlin at gcc dot gnu dot org 2004-03-13 02:10 -------
(In reply to comment #14)
> Subject: Re: [tree-ssa] Many C++ compile-time regression in
> 3.5-tree-ssa 040120
>
> On Fri, 2004-03-12 at 21:02, dberlin at dberlin dot org wrote:
>
> > I can't fix the algorithmic properties of the SSAPRE algorithm we use,
> > which is what you are running into, i'm betting.
> >
> Could we add thresholds to back away from overly complicated functions?
>
>
> Diego.
>
>
I need to know what exactly the properties of these functions are, it's unclear.
As i said, i'm working on it.
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=13776
* [Bug c++/13776] [tree-ssa] Many C++ compile-time regression in 3.5-tree-ssa 040120
From: mattyt-bugzilla at tpg dot com dot au @ 2004-03-13 11:09 UTC (permalink / raw)
To: gcc-bugs
--
What |Removed |Added
----------------------------------------------------------------------------
CC| |mattyt-bugzilla at tpg dot
| |com dot au
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=13776
* [Bug c++/13776] [tree-ssa] Many C++ compile-time regression in 3.5-tree-ssa 040120
From: rguenth at tat dot physik dot uni-tuebingen dot de @ 2004-03-13 11:43 UTC (permalink / raw)
To: gcc-bugs
------- Additional Comments From rguenth at tat dot physik dot uni-tuebingen dot de 2004-03-13 11:43 -------
Subject: Re: [tree-ssa] Many C++ compile-time regression in
3.5-tree-ssa 040120
dberlin at gcc dot gnu dot org wrote:
> ------- Additional Comments From dberlin at gcc dot gnu dot org 2004-03-13 02:10 -------
> (In reply to comment #14)
>
>>Subject: Re: [tree-ssa] Many C++ compile-time regression in
>> 3.5-tree-ssa 040120
>>
>>On Fri, 2004-03-12 at 21:02, dberlin at dberlin dot org wrote:
>>
>>
>>>I can't fix the algorithmic properties of the SSAPRE algorithm we use,
>>>which is what you are running into, i'm betting.
>>>
>>
>>Could we add thresholds to back away from overly complicated functions?
>>
>>
>>Diego.
>>
>>
>
>
>
> I need to know what exactly the properties of these functions are, it's unclear.
> As i said, i'm working on it.
Remember you need to patch the compiler to support
__attribute__((leafify)) to trigger the problem with the tramp3d-v2.cpp
testcase. I suspect the huge number of basic blocks comes from inlining
as I suspect at least one new basic block is inserted per inlined
function, no? So with a lot of C++ abstraction inside a leafified
function you get a lot of basic blocks. But I suppose a lot of them
could be eliminated easily?
Richard.
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=13776
* [Bug c++/13776] [tree-ssa] Many C++ compile-time regression in 3.5-tree-ssa 040120
From: rguenth at tat dot physik dot uni-tuebingen dot de @ 2004-03-13 11:46 UTC (permalink / raw)
To: gcc-bugs
------- Additional Comments From rguenth at tat dot physik dot uni-tuebingen dot de 2004-03-13 11:46 -------
Subject: Re: [tree-ssa] Many C++ compile-time regression in
3.5-tree-ssa 040120
dnovillo at redhat dot com wrote:
> ------- Additional Comments From dnovillo at redhat dot com 2004-03-13 02:08 -------
> Subject: Re: [tree-ssa] Many C++ compile-time regression in
> 3.5-tree-ssa 040120
>
> On Fri, 2004-03-12 at 21:02, dberlin at dberlin dot org wrote:
>
>
>>I can't fix the algorithmic properties of the SSAPRE algorithm we use,
>>which is what you are running into, i'm betting.
>>
>
> Could we add thresholds to back away from overly complicated functions?
Or just "split" them up using some sort of windowing? It looks clearly wrong
not to limit an O(n^2) or O(n^3) algorithm.
Richard.
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=13776
* [Bug c++/13776] [tree-ssa] Many C++ compile-time regression in 3.5-tree-ssa 040120
From: dberlin at dberlin dot org @ 2004-03-13 15:57 UTC (permalink / raw)
To: gcc-bugs
------- Additional Comments From dberlin at dberlin dot org 2004-03-13 15:57 -------
Subject: Re: [tree-ssa] Many C++ compile-time regression in 3.5-tree-ssa 040120
On Mar 13, 2004, at 6:46 AM, rguenth at tat dot physik dot
uni-tuebingen dot de wrote:
>>>
>>
>> Could we add thresholds to back away from overly complicated
>> functions?
>
> Or just "split" them up using sort of windowing? It looks clearly
> wrong
> to not limit a O(n^2) or O(n^3) algorithm.
>
It's only collecting expressions that is O(n^2). The other parts of the
algorithm just have a large constant.
Also, it *is* splitting up the function. It performs PRE one expression
at a time.
We can't perform it one basic block at a time or anything with the
current algorithm (and it wouldn't make sense to, because you can't
find the optimal insertion points).
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=13776
* [Bug c++/13776] [tree-ssa] Many C++ compile-time regression in 3.5-tree-ssa 040120
From: dberlin at dberlin dot org @ 2004-03-13 16:00 UTC (permalink / raw)
To: gcc-bugs
------- Additional Comments From dberlin at dberlin dot org 2004-03-13 16:00 -------
Subject: Re: [tree-ssa] Many C++ compile-time regression in 3.5-tree-ssa 040120
>
> and a lot more of int d farther away? Also, how are bb's marked? I see
> <bb 0>: but no more, and some gotos reference <bb 18> and <bb 16>
> (with a label, too)?
>
> Can I get summaries somehow here? Or just dump one interesting
> function rather than all of the program?
>
> Also, how do I dump some stuff about the PRE pass? Specifying
> -fdump-tree-pre just dumps the trees after PRE with no information
> about the PRE pass itself.
-fdump-tree-pre-stats-details. But i already know what it is going to
show in this case, based on the profile.
I just need other properties of the functions, which i'm attempting to
get.
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=13776
* [Bug c++/13776] [tree-ssa] Many C++ compile-time regression in 3.5-tree-ssa 040120
From: rguenth at tat dot physik dot uni-tuebingen dot de @ 2004-03-13 16:44 UTC (permalink / raw)
To: gcc-bugs
------- Additional Comments From rguenth at tat dot physik dot uni-tuebingen dot de 2004-03-13 16:44 -------
Subject: Re: [tree-ssa] Many C++ compile-time regression in
3.5-tree-ssa 040120
Note that the reported times are a huge regression compared to
g++ (GCC) 3.5-tree-ssa 20040209 (merged 20040126)
which shows
Execution times (seconds)
garbage collection : 26.06 ( 7%) usr 0.00 ( 0%) sys 26.06 ( 6%) wall
callgraph construction: 1.29 ( 0%) usr 0.00 ( 0%) sys 1.29 ( 0%) wall
callgraph optimization: 1.39 ( 0%) usr 0.05 ( 1%) sys 1.44 ( 0%) wall
cfg construction : 1.71 ( 0%) usr 0.02 ( 0%) sys 1.73 ( 0%) wall
cfg cleanup : 3.49 ( 1%) usr 0.02 ( 0%) sys 3.51 ( 1%) wall
trivially dead code : 2.85 ( 1%) usr 0.01 ( 0%) sys 2.86 ( 1%) wall
life analysis : 5.88 ( 1%) usr 0.00 ( 0%) sys 5.88 ( 1%) wall
life info update : 2.81 ( 1%) usr 0.00 ( 0%) sys 2.81 ( 1%) wall
alias analysis : 4.46 ( 1%) usr 0.02 ( 0%) sys 4.48 ( 1%) wall
register scan : 1.96 ( 0%) usr 0.00 ( 0%) sys 1.96 ( 0%) wall
rebuild jump labels : 0.88 ( 0%) usr 0.01 ( 0%) sys 0.89 ( 0%) wall
preprocessing : 0.61 ( 0%) usr 0.15 ( 3%) sys 0.76 ( 0%) wall
parser : 19.47 ( 5%) usr 1.10 (23%) sys 21.16 ( 5%) wall
name lookup : 12.05 ( 3%) usr 1.54 (33%) sys 13.75 ( 3%) wall
integration : 47.73 (12%) usr 0.14 ( 3%) sys 47.87 (12%) wall
tree gimplify : 3.05 ( 1%) usr 0.06 ( 1%) sys 3.19 ( 1%) wall
tree eh : 5.32 ( 1%) usr 0.01 ( 0%) sys 5.34 ( 1%) wall
tree CFG construction : 2.74 ( 1%) usr 0.08 ( 2%) sys 2.82 ( 1%) wall
tree CFG cleanup : 6.10 ( 2%) usr 0.00 ( 0%) sys 6.10 ( 2%) wall
tree alias analysis : 1.11 ( 0%) usr 0.00 ( 0%) sys 1.11 ( 0%) wall
tree PHI insertion : 17.62 ( 4%) usr 0.01 ( 0%) sys 17.63 ( 4%) wall
tree SSA rewrite : 5.90 ( 1%) usr 0.02 ( 0%) sys 5.92 ( 1%) wall
tree SSA other : 10.18 ( 3%) usr 0.04 ( 1%) sys 10.22 ( 3%) wall
dominator optimization: 31.18 ( 8%) usr 0.25 ( 5%) sys 31.43 ( 8%) wall
tree SRA : 0.42 ( 0%) usr 0.00 ( 0%) sys 0.42 ( 0%) wall
tree CCP : 6.99 ( 2%) usr 0.05 ( 1%) sys 7.04 ( 2%) wall
tree split crit edges : 0.53 ( 0%) usr 0.01 ( 0%) sys 0.54 ( 0%) wall
tree PRE : 67.53 (17%) usr 0.08 ( 2%) sys 67.92 (17%) wall
tree conservative DCE : 5.12 ( 1%) usr 0.01 ( 0%) sys 5.13 ( 1%) wall
tree aggressive DCE : 2.41 ( 1%) usr 0.00 ( 0%) sys 2.41 ( 1%) wall
tree SSA to normal : 5.69 ( 1%) usr 0.17 ( 4%) sys 5.86 ( 1%) wall
dominance frontiers : 0.65 ( 0%) usr 0.00 ( 0%) sys 0.65 ( 0%) wall
control dependences : 0.35 ( 0%) usr 0.00 ( 0%) sys 0.35 ( 0%) wall
expand : 20.80 ( 5%) usr 0.07 ( 1%) sys 20.88 ( 5%) wall
varconst : 0.81 ( 0%) usr 0.04 ( 1%) sys 0.85 ( 0%) wall
jump : 1.72 ( 0%) usr 0.13 ( 3%) sys 1.86 ( 0%) wall
CSE : 8.43 ( 2%) usr 0.00 ( 0%) sys 8.43 ( 2%) wall
global CSE : 10.58 ( 3%) usr 0.15 ( 3%) sys 10.74 ( 3%) wall
loop analysis : 2.59 ( 1%) usr 0.01 ( 0%) sys 2.60 ( 1%) wall
bypass jumps : 1.95 ( 0%) usr 0.03 ( 1%) sys 1.98 ( 0%) wall
CSE 2 : 3.57 ( 1%) usr 0.00 ( 0%) sys 3.57 ( 1%) wall
branch prediction : 4.66 ( 1%) usr 0.01 ( 0%) sys 4.69 ( 1%) wall
flow analysis : 0.18 ( 0%) usr 0.00 ( 0%) sys 0.18 ( 0%) wall
combiner : 3.53 ( 1%) usr 0.00 ( 0%) sys 3.53 ( 1%) wall
if-conversion : 0.92 ( 0%) usr 0.00 ( 0%) sys 0.92 ( 0%) wall
regmove : 1.29 ( 0%) usr 0.00 ( 0%) sys 1.29 ( 0%) wall
local alloc : 3.36 ( 1%) usr 0.01 ( 0%) sys 3.37 ( 1%) wall
global alloc : 7.97 ( 2%) usr 0.13 ( 3%) sys 8.10 ( 2%) wall
reload CSE regs : 3.79 ( 1%) usr 0.00 ( 0%) sys 3.79 ( 1%) wall
flow 2 : 0.78 ( 0%) usr 0.00 ( 0%) sys 0.78 ( 0%) wall
if-conversion 2 : 0.46 ( 0%) usr 0.00 ( 0%) sys 0.46 ( 0%) wall
peephole 2 : 0.83 ( 0%) usr 0.01 ( 0%) sys 0.84 ( 0%) wall
rename registers : 1.16 ( 0%) usr 0.05 ( 1%) sys 1.21 ( 0%) wall
scheduling 2 : 4.62 ( 1%) usr 0.06 ( 1%) sys 4.68 ( 1%) wall
reorder blocks : 0.73 ( 0%) usr 0.00 ( 0%) sys 0.73 ( 0%) wall
shorten branches : 1.16 ( 0%) usr 0.02 ( 0%) sys 1.18 ( 0%) wall
reg stack : 0.19 ( 0%) usr 0.00 ( 0%) sys 0.19 ( 0%) wall
final : 1.79 ( 0%) usr 0.13 ( 3%) sys 1.92 ( 0%) wall
symout : 0.05 ( 0%) usr 0.00 ( 0%) sys 0.05 ( 0%) wall
rest of compilation : 2.76 ( 1%) usr 0.02 ( 0%) sys 2.78 ( 1%) wall
TOTAL : 396.23 4.72 402.15
So apparently PRE got a factor of 10 slower!?
Richard.
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=13776
* [Bug c++/13776] [tree-ssa] Many C++ compile-time regression in 3.5-tree-ssa 040120
From: rguenth at tat dot physik dot uni-tuebingen dot de @ 2004-03-13 16:53 UTC (permalink / raw)
To: gcc-bugs
------- Additional Comments From rguenth at tat dot physik dot uni-tuebingen dot de 2004-03-13 16:53 -------
Subject: Re: [tree-ssa] Many C++ compile-time regression in
3.5-tree-ssa 040120
dberlin at dberlin dot org wrote:
> ------- Additional Comments From dberlin at dberlin dot org 2004-03-13 16:00 -------
> Subject: Re: [tree-ssa] Many C++ compile-time regression in 3.5-tree-ssa 040120
>
>
>>and a lot more of int d farther away? Also, how are bb's marked? I see
>><bb 0>: but no more, and some gotos reference <bb 18> and <bb 16>
>>(with a label, too)?
>>
>>Can I get summaries somehow here? Or just dump one interesting
>>function rather than all of the program?
>>
>>Also, how do I dump some stuff about the PRE pass? Specifying
>>-fdump-tree-pre just dumps the trees after PRE with no information
>>about the PRE pass itself.
>
>
> -fdump-tree-pre-stats-details. But i already know what it is going to
> show in this case, based on the profile.
> I just need other properties of the functions, which i'm attempting to
> get.
I also see we're running PRE before DCE - the functions probably contain
a lot of dead code - would it be sensible and profitable to move the
first DCE pass before PRE? Can this be specified on the command line or
where would I need to change the source to do this?
Richard.
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=13776
* [Bug c++/13776] [tree-ssa] Many C++ compile-time regression in 3.5-tree-ssa 040120
From: dberlin at dberlin dot org @ 2004-03-13 17:00 UTC (permalink / raw)
To: gcc-bugs
------- Additional Comments From dberlin at dberlin dot org 2004-03-13 17:00 -------
Subject: Re: [tree-ssa] Many C++ compile-time regression in 3.5-tree-ssa 040120
>
> So apparently PRE got a factor of 10 slower!?
>
Highly unlikely.
There haven't been any PRE changes in between the two compilers.
Something else changed, like inlining or something.
>
You are likely inlining *way* too much again or something.
--Dan
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=13776
* [Bug c++/13776] [tree-ssa] Many C++ compile-time regression in 3.5-tree-ssa 040120
From: dberlin at dberlin dot org @ 2004-03-13 17:02 UTC (permalink / raw)
To: gcc-bugs
------- Additional Comments From dberlin at dberlin dot org 2004-03-13 17:02 -------
Subject: Re: [tree-ssa] Many C++ compile-time regression in 3.5-tree-ssa 040120
>
> I also see we're running PRE before DCE - the functions probably
> contain
> a lot of dead code - would it be sensible and profitable to move the
> first DCE pass before PRE?
No we aren't.
We run 3 DCE passes before PRE.
NEXT_PASS (pass_build_cfg);
...
NEXT_PASS (pass_dce);
...
NEXT_PASS (DUP_PASS (pass_dce));
...
NEXT_PASS (DUP_PASS (pass_dce));
NEXT_PASS (pass_split_crit_edges);
NEXT_PASS (pass_pre);
> Can this be specified on the command line or
> where would I need to change the source to do this?
>
> Richard.
>
>
> --
>
>
> http://gcc.gnu.org/bugzilla/show_bug.cgi?id=13776
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=13776
* [Bug c++/13776] [tree-ssa] Many C++ compile-time regression in 3.5-tree-ssa 040120
From: dberlin at dberlin dot org @ 2004-03-13 17:07 UTC (permalink / raw)
To: gcc-bugs
------- Additional Comments From dberlin at dberlin dot org 2004-03-13 17:07 -------
Subject: Re: [tree-ssa] Many C++ compile-time regression in 3.5-tree-ssa 040120
> So apparently PRE got a factor of 10 slower!?
>
Note that the other functions got a factor of 3-5 slower too. As I
said, PRE just has a larger constant, so it's more noticeable.
This tells me something else important changed, probably in cgraph or
something.
There is little i can do, a lot of the portions wasting time are
already O(n) (compute_down_safety for example).
The only thing to do is (1) reduce the number of expressions we PRE,
(2) give up PRE entirely on such functions, or (3) change PRE algorithms.
I'm actually working on 3 and 2, rather than 1.
1 is tricky, we already give up on expressions that occur once, which
makes us lose some load motion.
Number 2 requires figuring out what properties of this function make it
such a pain in the ass, which is what i'm doing.
and #3 is being worked on in the background, i'm waiting for Steven to
get back to get more work done.
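
A minimal sketch of what option 2 (giving up on PRE for oversized
functions) could look like. Everything here is an assumption made for
illustration: struct fn_summary, pre_worthwhile_p and both limits are
hypothetical and are not taken from the GCC sources, and the limits would
need tuning against real testcases.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-function summary; not a GCC data structure.  */
struct fn_summary
{
  size_t n_basic_blocks;     /* basic blocks after inlining */
  size_t n_candidate_exprs;  /* expressions PRE would consider */
};

/* Illustrative limits only; real values would need measurement.  */
enum { MAX_PRE_BLOCKS = 4000, MAX_PRE_WORK = 2000000 };

/* Return true if PRE should be attempted on FN, false to back away.  */
static bool
pre_worthwhile_p (const struct fn_summary *fn)
{
  assert (fn != NULL);

  /* Too many blocks: every per-expression walk gets expensive.  */
  if (fn->n_basic_blocks > MAX_PRE_BLOCKS)
    return false;

  /* blocks * expressions approximates the total walking work; the
     division form avoids overflowing the product.  */
  if (fn->n_candidate_exprs != 0
      && fn->n_basic_blocks > MAX_PRE_WORK / fn->n_candidate_exprs)
    return false;

  return true;
}
```

A pass driver would call such a predicate once per function and skip PRE
when it returns false, so small functions are unaffected.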
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=13776
* [Bug c++/13776] [tree-ssa] Many C++ compile-time regression in 3.5-tree-ssa 040120
From: dberlin at gcc dot gnu dot org @ 2004-03-14 4:47 UTC (permalink / raw)
To: gcc-bugs
------- Additional Comments From dberlin at gcc dot gnu dot org 2004-03-14 04:47 -------
There are about 100 functions here with > a couple thousand bb's.
PRE takes about 2-3 seconds for each of these functions.
Which means i have to microoptimize it in order to get rid of the cumulative time effect.
A lot of it is simply iterating over large lists looking for certain types of nodes (like EPHIs), where the
lists are O(n_basic_blocks), and we only need to look at 10 entries or so. This doesn't matter when the
numbers are close, but when you have 8000 bb's to walk 20 times, instead of walking 40 entries 20
times, it matters.
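
The cost difference described above can be sketched as follows. This is
an illustration under stated assumptions, not the actual PRE code: struct
block_info and both walk functions are hypothetical, and the side list of
EPHI block indices is one possible way to avoid the full scan.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical per-block record; the real PRE structures differ.  */
struct block_info
{
  int has_ephi;  /* nonzero if an EPHI lives in this block */
  int visited;   /* how many times a walk processed this block */
};

/* Dense walk: inspects every block, so each call costs O(n_blocks)
   even when only a handful of blocks hold an EPHI.  */
static size_t
walk_ephis_dense (struct block_info *blocks, size_t n_blocks)
{
  size_t touched = 0;
  assert (blocks != NULL);
  for (size_t i = 0; i < n_blocks; i++)
    {
      touched++;
      if (blocks[i].has_ephi)
        blocks[i].visited++;
    }
  return touched;
}

/* Sparse walk: EPHI_BLOCKS lists only the indices of blocks holding an
   EPHI, so each call costs O(n_ephis).  */
static size_t
walk_ephis_sparse (struct block_info *blocks,
                   const size_t *ephi_blocks, size_t n_ephis)
{
  size_t touched = 0;
  assert (blocks != NULL && ephi_blocks != NULL);
  for (size_t i = 0; i < n_ephis; i++)
    {
      touched++;
      blocks[ephi_blocks[i]].visited++;
    }
  return touched;
}
```

With 8000 blocks, 40 EPHIs and 20 walks, the dense version inspects
160000 entries while the sparse one inspects 800, at the price of keeping
the side list up to date whenever an EPHI is added or removed.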
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=13776
* [Bug c++/13776] [tree-ssa] Many C++ compile-time regression in 3.5-tree-ssa 040120
From: rguenth at tat dot physik dot uni-tuebingen dot de @ 2004-03-14 12:10 UTC (permalink / raw)
To: gcc-bugs
------- Additional Comments From rguenth at tat dot physik dot uni-tuebingen dot de 2004-03-14 12:10 -------
Subject: Re: [tree-ssa] Many C++ compile-time regression in
3.5-tree-ssa 040120
dberlin at gcc dot gnu dot org wrote:
> ------- Additional Comments From dberlin at gcc dot gnu dot org 2004-03-14 04:47 -------
> There are about 100 functions here with > a couple thousand bb's.
> PRE takes about 2-3 seconds for each of these functions.
> Which means i have to microoptimize it in order to get rid of the cumulative time effect.
> A lot of it is simply iterating over large lists looking for certain types of nodes (like EPHIs), where the
> lists are O(n_basic_blocks), and we only need to look at 10 entries or so. This doesn't matter when the
> numbers are close, but when you have 8000 bb's to walk 20 times, instead of walking 40 entries 20
> times, it matters.
Yes. I suppose simply storing those nodes separately does not work, nor
does using a hash table for storing them, no?
Another way would be to reduce the number of bb's somehow? I cannot
think of how 8000 bb's can accumulate in one of my math kernels other
than by inlining and maybe loop header copying. Can't we merge some
bb's before doing PRE?
Richard.
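
For reference, the usual criterion for merging two basic blocks can be
sketched like this. The struct and function are hypothetical and much
simpler than GCC's CFG; whether such merging would remove a meaningful
share of the blocks in this testcase is not known from the thread.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified CFG node; GCC's basic_block is richer.  */
struct bb
{
  size_t n_preds;  /* number of incoming edges */
  size_t n_succs;  /* number of outgoing edges */
  size_t succ;     /* index of the only successor when n_succs == 1 */
};

/* Count edges A -> B along which the two blocks could be merged:
   A has exactly one successor, B has exactly one predecessor, and
   A is not B itself.  */
static size_t
count_mergeable_blocks (const struct bb *bbs, size_t n)
{
  size_t count = 0;
  assert (bbs != NULL || n == 0);
  for (size_t i = 0; i < n; i++)
    {
      if (bbs[i].n_succs != 1)
        continue;
      size_t s = bbs[i].succ;
      assert (s < n);
      if (s != i && bbs[s].n_preds == 1)
        count++;
    }
  return count;
}
```

Blocks ending in a branch, or feeding a join point, are left alone, so a
function full of small conditionals would keep most of its blocks.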
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=13776
* [Bug c++/13776] [tree-ssa] Many C++ compile-time regression in 3.5-tree-ssa 040120
From: rguenth at tat dot physik dot uni-tuebingen dot de @ 2004-03-14 13:54 UTC (permalink / raw)
To: gcc-bugs
------- Additional Comments From rguenth at tat dot physik dot uni-tuebingen dot de 2004-03-14 13:54 -------
Subject: Re: [tree-ssa] Many C++ compile-time regression in
3.5-tree-ssa 040120
dberlin at gcc dot gnu dot org wrote:
> ------- Additional Comments From dberlin at gcc dot gnu dot org 2004-03-14 04:47 -------
> There are about 100 functions here with > a couple thousand bb's.
> PRE takes about 2-3 seconds for each of these functions.
> Which means i have to microoptimize it in order to get rid of the cumulative time effect.
> A lot of it is simply iterating over large lists looking for certain types of nodes (like EPHIs), where the
> lists are O(n_basic_blocks), and we only need to look at 10 entries or so. This doesn't matter when the
> numbers are close, but when you have 8000 bb's to walk 20 times, instead of walking 40 entries 20
> times, it matters.
The nice thing is that with -fno-exceptions the results look a _lot_
better:
Execution times (seconds)
garbage collection : 21.13 ( 7%) usr 0.01 ( 0%) sys 21.20 ( 7%) wall
callgraph construction: 1.45 ( 0%) usr 0.01 ( 0%) sys 1.46 ( 0%) wall
callgraph optimization: 1.51 ( 0%) usr 0.09 ( 1%) sys 1.61 ( 1%) wall
cfg construction : 0.52 ( 0%) usr 0.05 ( 1%) sys 0.57 ( 0%) wall
cfg cleanup : 1.67 ( 1%) usr 0.00 ( 0%) sys 1.67 ( 1%) wall
trivially dead code : 2.27 ( 1%) usr 0.01 ( 0%) sys 2.28 ( 1%) wall
life analysis : 5.01 ( 2%) usr 0.00 ( 0%) sys 5.02 ( 2%) wall
life info update : 3.11 ( 1%) usr 0.00 ( 0%) sys 3.17 ( 1%) wall
alias analysis : 4.02 ( 1%) usr 0.01 ( 0%) sys 4.03 ( 1%) wall
register scan : 1.97 ( 1%) usr 0.00 ( 0%) sys 1.97 ( 1%) wall
rebuild jump labels : 0.54 ( 0%) usr 0.00 ( 0%) sys 0.54 ( 0%) wall
preprocessing : 0.69 ( 0%) usr 0.20 ( 3%) sys 1.72 ( 1%) wall
parser : 18.39 ( 6%) usr 1.03 (16%) sys 19.44 ( 6%) wall
name lookup : 6.74 ( 2%) usr 1.43 (23%) sys 8.18 ( 3%) wall
integration : 58.53 (19%) usr 0.43 ( 7%) sys 58.99 (19%) wall
tree gimplify : 3.43 ( 1%) usr 0.05 ( 1%) sys 3.48 ( 1%) wall
tree eh : 0.76 ( 0%) usr 0.00 ( 0%) sys 0.76 ( 0%) wall
tree CFG construction : 1.54 ( 1%) usr 0.13 ( 2%) sys 1.67 ( 1%) wall
tree CFG cleanup : 1.84 ( 1%) usr 0.01 ( 0%) sys 1.85 ( 1%) wall
tree PTA : 0.68 ( 0%) usr 0.00 ( 0%) sys 0.68 ( 0%) wall
tree alias analysis : 1.07 ( 0%) usr 0.01 ( 0%) sys 1.08 ( 0%) wall
tree PHI insertion : 1.37 ( 0%) usr 0.06 ( 1%) sys 1.43 ( 0%) wall
tree SSA rewrite : 3.53 ( 1%) usr 0.06 ( 1%) sys 3.59 ( 1%) wall
tree SSA other : 4.69 ( 2%) usr 0.41 ( 7%) sys 5.12 ( 2%) wall
tree operand scan : 3.57 ( 1%) usr 0.27 ( 4%) sys 3.85 ( 1%) wall
dominator optimization: 16.32 ( 5%) usr 0.52 ( 8%) sys 16.84 ( 5%) wall
tree SRA : 0.43 ( 0%) usr 0.00 ( 0%) sys 0.43 ( 0%) wall
tree CCP : 1.51 ( 0%) usr 0.01 ( 0%) sys 1.52 ( 0%) wall
tree split crit edges : 0.16 ( 0%) usr 0.00 ( 0%) sys 0.16 ( 0%) wall
tree PRE : 17.34 ( 6%) usr 0.05 ( 1%) sys 17.40 ( 6%) wall
tree linearize phis : 0.01 ( 0%) usr 0.01 ( 0%) sys 0.02 ( 0%) wall
tree forward propagate: 1.01 ( 0%) usr 0.00 ( 0%) sys 1.01 ( 0%) wall
tree conservative DCE : 2.54 ( 1%) usr 0.01 ( 0%) sys 2.55 ( 1%) wall
tree aggressive DCE : 0.83 ( 0%) usr 0.00 ( 0%) sys 0.83 ( 0%) wall
tree DSE : 1.86 ( 1%) usr 0.07 ( 1%) sys 1.93 ( 1%) wall
tree copy headers : 1.39 ( 0%) usr 0.01 ( 0%) sys 1.40 ( 0%) wall
tree SSA to normal : 3.01 ( 1%) usr 0.04 ( 1%) sys 3.05 ( 1%) wall
tree rename SSA copies: 0.69 ( 0%) usr 0.07 ( 1%) sys 0.77 ( 0%) wall
dominance frontiers : 0.18 ( 0%) usr 0.00 ( 0%) sys 0.18 ( 0%) wall
control dependences : 0.12 ( 0%) usr 0.00 ( 0%) sys 0.12 ( 0%) wall
expand : 31.02 (10%) usr 0.24 ( 4%) sys 31.41 (10%) wall
varconst : 0.94 ( 0%) usr 0.01 ( 0%) sys 0.99 ( 0%) wall
jump : 1.77 ( 1%) usr 0.14 ( 2%) sys 1.97 ( 1%) wall
CSE : 9.85 ( 3%) usr 0.03 ( 0%) sys 9.90 ( 3%) wall
global CSE : 14.32 ( 5%) usr 0.17 ( 3%) sys 14.49 ( 5%) wall
loop analysis : 4.19 ( 1%) usr 0.01 ( 0%) sys 4.21 ( 1%) wall
bypass jumps : 1.19 ( 0%) usr 0.01 ( 0%) sys 1.20 ( 0%) wall
CSE 2 : 4.24 ( 1%) usr 0.00 ( 0%) sys 4.24 ( 1%) wall
branch prediction : 1.49 ( 0%) usr 0.03 ( 0%) sys 1.54 ( 0%) wall
flow analysis : 0.12 ( 0%) usr 0.00 ( 0%) sys 0.14 ( 0%) wall
combiner : 3.80 ( 1%) usr 0.01 ( 0%) sys 3.82 ( 1%) wall
if-conversion : 0.61 ( 0%) usr 0.01 ( 0%) sys 0.63 ( 0%) wall
regmove : 2.17 ( 1%) usr 0.00 ( 0%) sys 2.20 ( 1%) wall
local alloc : 3.22 ( 1%) usr 0.03 ( 0%) sys 3.25 ( 1%) wall
global alloc : 7.58 ( 3%) usr 0.21 ( 3%) sys 7.79 ( 3%) wall
reload CSE regs : 3.25 ( 1%) usr 0.02 ( 0%) sys 3.27 ( 1%) wall
flow 2 : 0.65 ( 0%) usr 0.00 ( 0%) sys 0.65 ( 0%) wall
if-conversion 2 : 0.30 ( 0%) usr 0.00 ( 0%) sys 0.30 ( 0%) wall
peephole 2 : 0.62 ( 0%) usr 0.02 ( 0%) sys 0.64 ( 0%) wall
rename registers : 0.97 ( 0%) usr 0.04 ( 1%) sys 1.01 ( 0%) wall
scheduling 2 : 5.57 ( 2%) usr 0.04 ( 1%) sys 5.66 ( 2%) wall
machine dep reorg : 1.21 ( 0%) usr 0.00 ( 0%) sys 1.21 ( 0%) wall
reorder blocks : 0.78 ( 0%) usr 0.00 ( 0%) sys 0.80 ( 0%) wall
shorten branches : 0.82 ( 0%) usr 0.02 ( 0%) sys 0.84 ( 0%) wall
reg stack : 0.20 ( 0%) usr 0.00 ( 0%) sys 0.20 ( 0%) wall
final : 1.41 ( 0%) usr 0.14 ( 2%) sys 1.56 ( 1%) wall
rest of compilation : 2.63 ( 1%) usr 0.04 ( 1%) sys 2.67 ( 1%) wall
TOTAL : 302.38 6.29 310.24
So the question is, where is the difference and whether it needs to be
there ;)
Richard.
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=13776
* [Bug c++/13776] [tree-ssa] Many C++ compile-time regression in 3.5-tree-ssa 040120
From: dberlin at dberlin dot org @ 2004-03-14 15:38 UTC (permalink / raw)
To: gcc-bugs
------- Additional Comments From dberlin at dberlin dot org 2004-03-14 15:38 -------
Subject: Bug 13776
On Mar 14, 2004, at 8:35 AM, Richard Guenther wrote:
> Daniel Berlin wrote:
>> This adds a DOM pass in between split critical edges and PRE, and
>> works for me on i686 and powerpc
>> Tell me if it helps
>
> It made things worse in total, even PRE degraded some, but that may be
> in the noise.
>
> Richard.
I don't even get close to these numbers.
I've got your leafify patch installed (the one linked from the bug
report)
Even at -O2, on a checking enabled compiler, with tramp3d-v2 from the
bug report, with the following sizes:
[root@dberlin dberlin]# ls -trl tramp3d-v2.ii
-rw-r--r-- 1 root root 2962361 Feb 5 10:27 tramp3d-v2.ii
generated from
[root@dberlin dberlin]# ls -l tramp3d-v2.cpp
-rw-r--r-- 1 dberlin dberlin 1952077 Feb 5 10:14 tramp3d-v2.cpp
I get (without any changes to PRE):
[root@dberlin gcc]# ./cc1plus -O2 ~dberlin/tramp3d-v2.ii
...
Execution times (seconds)
garbage collection : 46.23 (15%) usr 0.27 ( 3%) sys 46.66 (15%) wall
callgraph construction: 0.68 ( 0%) usr 0.01 ( 0%) sys 0.72 ( 0%) wall
callgraph optimization: 0.80 ( 0%) usr 0.07 ( 1%) sys 0.92 ( 0%) wall
cfg construction : 0.46 ( 0%) usr 0.04 ( 0%) sys 0.50 ( 0%) wall
cfg cleanup : 1.82 ( 1%) usr 0.02 ( 0%) sys 1.84 ( 1%) wall
CFG verifier : 8.07 ( 3%) usr 0.03 ( 0%) sys 8.15 ( 3%) wall
trivially dead code : 1.28 ( 0%) usr 0.00 ( 0%) sys 1.29 ( 0%) wall
life analysis : 2.96 ( 1%) usr 0.01 ( 0%) sys 2.97 ( 1%) wall
life info update : 1.52 ( 0%) usr 0.01 ( 0%) sys 1.56 ( 0%) wall
alias analysis : 2.64 ( 1%) usr 0.01 ( 0%) sys 2.66 ( 1%) wall
register scan : 1.23 ( 0%) usr 0.02 ( 0%) sys 1.25 ( 0%) wall
rebuild jump labels : 0.38 ( 0%) usr 0.00 ( 0%) sys 0.38 ( 0%) wall
preprocessing : 0.29 ( 0%) usr 0.17 ( 2%) sys 0.46 ( 0%) wall
parser : 13.65 ( 4%) usr 1.27 (16%) sys 20.56 ( 6%) wall
name lookup : 4.99 ( 2%) usr 2.00 (25%) sys 7.07 ( 2%) wall
integration : 28.17 ( 9%) usr 0.19 ( 2%) sys 28.57 ( 9%) wall
tree gimplify : 2.08 ( 1%) usr 0.05 ( 1%) sys 2.19 ( 1%) wall
tree eh : 2.86 ( 1%) usr 0.08 ( 1%) sys 2.96 ( 1%) wall
tree CFG construction : 1.60 ( 1%) usr 0.09 ( 1%) sys 1.71 ( 1%) wall
tree CFG cleanup : 3.99 ( 1%) usr 0.04 ( 0%) sys 4.04 ( 1%) wall
tree PTA : 0.47 ( 0%) usr 0.01 ( 0%) sys 0.49 ( 0%) wall
tree alias analysis : 0.61 ( 0%) usr 0.00 ( 0%) sys 0.61 ( 0%) wall
tree PHI insertion : 9.15 ( 3%) usr 0.07 ( 1%) sys 9.26 ( 3%) wall
tree SSA rewrite : 3.30 ( 1%) usr 0.01 ( 0%) sys 3.32 ( 1%) wall
tree SSA other : 3.63 ( 1%) usr 0.51 ( 6%) sys 4.20 ( 1%) wall
tree operand scan : 3.62 ( 1%) usr 0.59 ( 7%) sys 4.22 ( 1%) wall
dominator optimization: 15.57 ( 5%) usr 0.46 ( 6%) sys 16.09 ( 5%) wall
tree SRA : 0.31 ( 0%) usr 0.01 ( 0%) sys 0.32 ( 0%) wall
tree CCP : 1.56 ( 1%) usr 0.02 ( 0%) sys 1.58 ( 0%) wall
tree split crit edges : 0.57 ( 0%) usr 0.03 ( 0%) sys 0.61 ( 0%) wall
tree PRE : 34.92 ( 9%) usr 0.14 ( 2%) sys 35.20 ( 9%) wall
tree linearize phis : 0.03 ( 0%) usr 0.02 ( 0%) sys 0.05 ( 0%) wall
tree forward propagate: 1.12 ( 0%) usr 0.02 ( 0%) sys 1.14 ( 0%) wall
tree conservative DCE : 3.02 ( 1%) usr 0.03 ( 0%) sys 3.06 ( 1%) wall
tree aggressive DCE : 0.78 ( 0%) usr 0.01 ( 0%) sys 0.79 ( 0%) wall
tree DSE : 2.18 ( 1%) usr 0.01 ( 0%) sys 2.20 ( 1%) wall
tree copy headers : 2.15 ( 1%) usr 0.02 ( 0%) sys 2.19 ( 1%) wall
tree SSA to normal : 2.42 ( 1%) usr 0.13 ( 2%) sys 2.61 ( 1%) wall
tree NRV optimization : 0.01 ( 0%) usr 0.00 ( 0%) sys 0.01 ( 0%) wall
tree rename SSA copies: 0.71 ( 0%) usr 0.04 ( 0%) sys 0.75 ( 0%) wall
tree SSA verifier : 25.23 ( 8%) usr 0.23 ( 3%) sys 25.52 ( 8%) wall
tree STMT verifier : 3.72 ( 1%) usr 0.03 ( 0%) sys 3.76 ( 1%) wall
callgraph verifier : 7.79 ( 3%) usr 0.25 ( 3%) sys 8.09 ( 3%) wall
dominance frontiers : 0.27 ( 0%) usr 0.00 ( 0%) sys 0.27 ( 0%) wall
control dependences : 0.14 ( 0%) usr 0.00 ( 0%) sys 0.14 ( 0%) wall
expand : 16.03 ( 5%) usr 0.19 ( 2%) sys 16.41 ( 5%) wall
varconst : 0.66 ( 0%) usr 0.05 ( 1%) sys 1.06 ( 0%) wall
jump : 1.17 ( 0%) usr 0.15 ( 2%) sys 1.41 ( 0%) wall
CSE : 8.76 ( 3%) usr 0.05 ( 1%) sys 8.84 ( 3%) wall
global CSE : 5.01 ( 2%) usr 0.13 ( 2%) sys 5.15 ( 2%) wall
loop analysis : 1.21 ( 0%) usr 0.01 ( 0%) sys 1.24 ( 0%) wall
bypass jumps : 0.94 ( 0%) usr 0.00 ( 0%) sys 0.94 ( 0%) wall
CSE 2 : 3.59 ( 1%) usr 0.02 ( 0%) sys 3.78 ( 1%) wall
branch prediction : 2.25 ( 1%) usr 0.01 ( 0%) sys 2.31 ( 1%) wall
flow analysis : 0.08 ( 0%) usr 0.00 ( 0%) sys 0.10 ( 0%) wall
combiner : 2.58 ( 1%) usr 0.03 ( 0%) sys 2.64 ( 1%) wall
if-conversion : 0.57 ( 0%) usr 0.00 ( 0%) sys 0.57 ( 0%) wall
regmove : 0.85 ( 0%) usr 0.00 ( 0%) sys 0.86 ( 0%) wall
local alloc : 1.80 ( 1%) usr 0.01 ( 0%) sys 1.84 ( 1%) wall
global alloc : 5.34 ( 2%) usr 0.10 ( 1%) sys 5.50 ( 2%) wall
reload CSE regs : 2.24 ( 1%) usr 0.00 ( 0%) sys 2.25 ( 1%) wall
flow 2 : 0.33 ( 0%) usr 0.00 ( 0%) sys 0.34 ( 0%) wall
if-conversion 2 : 0.35 ( 0%) usr 0.00 ( 0%) sys 0.35 ( 0%) wall
peephole 2 : 0.38 ( 0%) usr 0.00 ( 0%) sys 0.39 ( 0%) wall
rename registers : 1.43 ( 0%) usr 0.04 ( 0%) sys 1.52 ( 0%) wall
scheduling 2 : 2.28 ( 1%) usr 0.08 ( 1%) sys 2.38 ( 1%) wall
reorder blocks : 0.49 ( 0%) usr 0.01 ( 0%) sys 0.50 ( 0%) wall
shorten branches : 0.70 ( 0%) usr 0.01 ( 0%) sys 0.71 ( 0%) wall
reg stack : 0.05 ( 0%) usr 0.00 ( 0%) sys 0.05 ( 0%) wall
final : 1.03 ( 0%) usr 0.14 ( 2%) sys 1.38 ( 0%) wall
symout : 0.02 ( 0%) usr 0.03 ( 0%) sys 0.06 ( 0%) wall
rest of compilation : 1.48 ( 0%) usr 0.04 ( 0%) sys 1.54 ( 0%) wall
TOTAL : 310.60 8.12 327.62
Extra diagnostic checks enabled; compiler may run slowly.
Configure with --disable-checking to disable checks.
With my changes to PRE, I get the same numbers, except PRE is at 28
seconds instead of 36.
I certainly get *nowhere close* to 600 seconds in PRE, or the numbers
you get overall.
I can't fix a problem I can't reproduce; I can only take stabs at it.
Can someone else please verify his numbers so I know whether it's my
test setup or his?
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=13776
^ permalink raw reply [flat|nested] 93+ messages in thread
* [Bug c++/13776] [tree-ssa] Many C++ compile-time regression in 3.5-tree-ssa 040120
2004-01-20 18:39 [Bug c++/13776] New: Many C++ compile-time regression in 3.5-tree-ssa 040120 in comparison with 3.4.0 040114 kgardas at objectsecurity dot com
` (30 preceding siblings ...)
2004-03-14 15:38 ` dberlin at dberlin dot org
@ 2004-03-14 18:00 ` rguenth at tat dot physik dot uni-tuebingen dot de
2004-03-14 18:22 ` rguenth at tat dot physik dot uni-tuebingen dot de
` (58 subsequent siblings)
90 siblings, 0 replies; 93+ messages in thread
From: rguenth at tat dot physik dot uni-tuebingen dot de @ 2004-03-14 18:00 UTC (permalink / raw)
To: gcc-bugs
------- Additional Comments From rguenth at tat dot physik dot uni-tuebingen dot de 2004-03-14 18:00 -------
Subject: Re: [tree-ssa] Many C++ compile-time regression in
3.5-tree-ssa 040120
dberlin at dberlin dot org wrote:
> ------- Additional Comments From dberlin at dberlin dot org 2004-03-14 15:38 -------
> Subject: Bug 13776
>
>
> On Mar 14, 2004, at 8:35 AM, Richard Guenther wrote:
>
>
>>Daniel Berlin wrote:
>>
>>>This adds a DOM pass in between split critical edges and PRE, and
>>>works for me on i686 and powerpc
>>>Tell me if it helps
>>
>>It made things worse in total, even PRE degraded some, but that may be
>>in the noise.
>>
>>Richard.
>
>
> I don't even get close to these numbers.
> I've got your leafify patch installed (the one linked from the bug
> report)
> Even at -O2, on a checking enabled compiler, with tramp3d-v2 from the
> bug report, with the following sizes:
>
> [root@dberlin dberlin]# ls -trl tramp3d-v2.ii
> -rw-r--r-- 1 root root 2962361 Feb 5 10:27 tramp3d-v2.ii
> generated from
> [root@dberlin dberlin]# ls -l tramp3d-v2.cpp
> -rw-r--r-- 1 dberlin dberlin 1952077 Feb 5 10:14 tramp3d-v2.cpp
That's the correct one.
> I get (without any changes to PRE):
> [root@dberlin gcc]# ./cc1plus -O2 ~dberlin/tramp3d-v2.ii
> ...
> Execution times (seconds)
> garbage collection : 46.23 (15%) usr 0.27 ( 3%) sys 46.66 (15%) wall
> callgraph construction: 0.68 ( 0%) usr 0.01 ( 0%) sys 0.72 ( 0%) wall
> callgraph optimization: 0.80 ( 0%) usr 0.07 ( 1%) sys 0.92 ( 0%) wall
> cfg construction : 0.46 ( 0%) usr 0.04 ( 0%) sys 0.50 ( 0%) wall
> cfg cleanup : 1.82 ( 1%) usr 0.02 ( 0%) sys 1.84 ( 1%) wall
> CFG verifier : 8.07 ( 3%) usr 0.03 ( 0%) sys 8.15 ( 3%) wall
> trivially dead code : 1.28 ( 0%) usr 0.00 ( 0%) sys 1.29 ( 0%) wall
> life analysis : 2.96 ( 1%) usr 0.01 ( 0%) sys 2.97 ( 1%) wall
> life info update : 1.52 ( 0%) usr 0.01 ( 0%) sys 1.56 ( 0%) wall
> alias analysis : 2.64 ( 1%) usr 0.01 ( 0%) sys 2.66 ( 1%) wall
> register scan : 1.23 ( 0%) usr 0.02 ( 0%) sys 1.25 ( 0%) wall
> rebuild jump labels : 0.38 ( 0%) usr 0.00 ( 0%) sys 0.38 ( 0%) wall
> preprocessing : 0.29 ( 0%) usr 0.17 ( 2%) sys 0.46 ( 0%) wall
> parser : 13.65 ( 4%) usr 1.27 (16%) sys 20.56 ( 6%) wall
> name lookup : 4.99 ( 2%) usr 2.00 (25%) sys 7.07 ( 2%) wall
> integration : 28.17 ( 9%) usr 0.19 ( 2%) sys 28.57 ( 9%) wall
> tree gimplify : 2.08 ( 1%) usr 0.05 ( 1%) sys 2.19 ( 1%) wall
> tree eh : 2.86 ( 1%) usr 0.08 ( 1%) sys 2.96 ( 1%) wall
> tree CFG construction : 1.60 ( 1%) usr 0.09 ( 1%) sys 1.71 ( 1%) wall
> tree CFG cleanup : 3.99 ( 1%) usr 0.04 ( 0%) sys 4.04 ( 1%) wall
> tree PTA : 0.47 ( 0%) usr 0.01 ( 0%) sys 0.49 ( 0%) wall
> tree alias analysis : 0.61 ( 0%) usr 0.00 ( 0%) sys 0.61 ( 0%) wall
> tree PHI insertion : 9.15 ( 3%) usr 0.07 ( 1%) sys 9.26 ( 3%) wall
> tree SSA rewrite : 3.30 ( 1%) usr 0.01 ( 0%) sys 3.32 ( 1%) wall
> tree SSA other : 3.63 ( 1%) usr 0.51 ( 6%) sys 4.20 ( 1%) wall
> tree operand scan : 3.62 ( 1%) usr 0.59 ( 7%) sys 4.22 ( 1%) wall
> dominator optimization: 15.57 ( 5%) usr 0.46 ( 6%) sys 16.09 ( 5%) wall
> tree SRA : 0.31 ( 0%) usr 0.01 ( 0%) sys 0.32 ( 0%) wall
> tree CCP : 1.56 ( 1%) usr 0.02 ( 0%) sys 1.58 ( 0%) wall
> tree split crit edges : 0.57 ( 0%) usr 0.03 ( 0%) sys 0.61 ( 0%) wall
> tree PRE : 34.92 ( 9%) usr 0.14 ( 2%) sys 35.20 ( 9%) wall
> tree linearize phis : 0.03 ( 0%) usr 0.02 ( 0%) sys 0.05 ( 0%) wall
> tree forward propagate: 1.12 ( 0%) usr 0.02 ( 0%) sys 1.14 ( 0%) wall
> tree conservative DCE : 3.02 ( 1%) usr 0.03 ( 0%) sys 3.06 ( 1%) wall
> tree aggressive DCE : 0.78 ( 0%) usr 0.01 ( 0%) sys 0.79 ( 0%) wall
> tree DSE : 2.18 ( 1%) usr 0.01 ( 0%) sys 2.20 ( 1%) wall
> tree copy headers : 2.15 ( 1%) usr 0.02 ( 0%) sys 2.19 ( 1%) wall
> tree SSA to normal : 2.42 ( 1%) usr 0.13 ( 2%) sys 2.61 ( 1%) wall
> tree NRV optimization : 0.01 ( 0%) usr 0.00 ( 0%) sys 0.01 ( 0%) wall
> tree rename SSA copies: 0.71 ( 0%) usr 0.04 ( 0%) sys 0.75 ( 0%) wall
> tree SSA verifier : 25.23 ( 8%) usr 0.23 ( 3%) sys 25.52 ( 8%) wall
> tree STMT verifier : 3.72 ( 1%) usr 0.03 ( 0%) sys 3.76 ( 1%) wall
> callgraph verifier : 7.79 ( 3%) usr 0.25 ( 3%) sys 8.09 ( 3%) wall
> dominance frontiers : 0.27 ( 0%) usr 0.00 ( 0%) sys 0.27 ( 0%) wall
> control dependences : 0.14 ( 0%) usr 0.00 ( 0%) sys 0.14 ( 0%) wall
> expand : 16.03 ( 5%) usr 0.19 ( 2%) sys 16.41 ( 5%) wall
> varconst : 0.66 ( 0%) usr 0.05 ( 1%) sys 1.06 ( 0%) wall
> jump : 1.17 ( 0%) usr 0.15 ( 2%) sys 1.41 ( 0%) wall
> CSE : 8.76 ( 3%) usr 0.05 ( 1%) sys 8.84 ( 3%) wall
> global CSE : 5.01 ( 2%) usr 0.13 ( 2%) sys 5.15 ( 2%) wall
> loop analysis : 1.21 ( 0%) usr 0.01 ( 0%) sys 1.24 ( 0%) wall
> bypass jumps : 0.94 ( 0%) usr 0.00 ( 0%) sys 0.94 ( 0%) wall
> CSE 2 : 3.59 ( 1%) usr 0.02 ( 0%) sys 3.78 ( 1%) wall
> branch prediction : 2.25 ( 1%) usr 0.01 ( 0%) sys 2.31 ( 1%) wall
> flow analysis : 0.08 ( 0%) usr 0.00 ( 0%) sys 0.10 ( 0%) wall
> combiner : 2.58 ( 1%) usr 0.03 ( 0%) sys 2.64 ( 1%) wall
> if-conversion : 0.57 ( 0%) usr 0.00 ( 0%) sys 0.57 ( 0%) wall
> regmove : 0.85 ( 0%) usr 0.00 ( 0%) sys 0.86 ( 0%) wall
> local alloc : 1.80 ( 1%) usr 0.01 ( 0%) sys 1.84 ( 1%) wall
> global alloc : 5.34 ( 2%) usr 0.10 ( 1%) sys 5.50 ( 2%) wall
> reload CSE regs : 2.24 ( 1%) usr 0.00 ( 0%) sys 2.25 ( 1%) wall
> flow 2 : 0.33 ( 0%) usr 0.00 ( 0%) sys 0.34 ( 0%) wall
> if-conversion 2 : 0.35 ( 0%) usr 0.00 ( 0%) sys 0.35 ( 0%) wall
> peephole 2 : 0.38 ( 0%) usr 0.00 ( 0%) sys 0.39 ( 0%) wall
> rename registers : 1.43 ( 0%) usr 0.04 ( 0%) sys 1.52 ( 0%) wall
> scheduling 2 : 2.28 ( 1%) usr 0.08 ( 1%) sys 2.38 ( 1%) wall
> reorder blocks : 0.49 ( 0%) usr 0.01 ( 0%) sys 0.50 ( 0%) wall
> shorten branches : 0.70 ( 0%) usr 0.01 ( 0%) sys 0.71 ( 0%) wall
> reg stack : 0.05 ( 0%) usr 0.00 ( 0%) sys 0.05 ( 0%) wall
> final : 1.03 ( 0%) usr 0.14 ( 2%) sys 1.38 ( 0%) wall
> symout : 0.02 ( 0%) usr 0.03 ( 0%) sys 0.06 ( 0%) wall
> rest of compilation : 1.48 ( 0%) usr 0.04 ( 0%) sys 1.54 ( 0%) wall
> TOTAL : 310.60 8.12 327.62
> Extra diagnostic checks enabled; compiler may run slowly.
> Configure with --disable-checking to disable checks.
>
>
> With my changes to PRE, I get the same numbers, except PRE is at 28
> seconds instead of 36.
>
> I certainly get *nowhere close* to 600 seconds in PRE, or the numbers
> you get overall.
> I can't fix a problem I can't reproduce; I can only take stabs at it.
> Can someone else please verify his numbers so I know whether it's my
> test setup or his?
I even have checking disabled. GC time seems to be identical and parsing
is 13.5s vs. 18.4s; the first big difference is integration, which
suggests that leafifying is not enabled? Maybe the patch applied
"wrong", so I have attached a complete diff of my local changes.
Anyway, I'm running on a 1GHz Athlon with 1GB of RAM; the compiler is
bootstrapped with checking disabled.
Richard.
Index: gcc/c-common.c
===================================================================
RCS file: /cvs/gcc/gcc/gcc/c-common.c,v
retrieving revision 1.344.2.63
diff -u -u -r1.344.2.63 c-common.c
--- gcc/c-common.c 2 Mar 2004 18:41:21 -0000 1.344.2.63
+++ gcc/c-common.c 14 Mar 2004 17:51:26 -0000
@@ -746,6 +746,7 @@
static tree handle_noinline_attribute (tree *, tree, tree, int, bool *);
static tree handle_always_inline_attribute (tree *, tree, tree, int,
bool *);
+static tree handle_leafify_attribute (tree *, tree, tree, int, bool *);
static tree handle_used_attribute (tree *, tree, tree, int, bool *);
static tree handle_unused_attribute (tree *, tree, tree, int, bool *);
static tree handle_const_attribute (tree *, tree, tree, int, bool *);
@@ -807,6 +808,8 @@
handle_noinline_attribute },
{ "always_inline", 0, 0, true, false, false,
handle_always_inline_attribute },
+ { "leafify", 0, 0, true, false, false,
+ handle_leafify_attribute },
{ "used", 0, 0, true, false, false,
handle_used_attribute },
{ "unused", 0, 0, false, false, false,
@@ -4458,6 +4461,29 @@
return NULL_TREE;
}
+
+/* Handle a "leafify" attribute; arguments as in
+ struct attribute_spec.handler. */
+
+static tree
+handle_leafify_attribute (tree *node, tree name,
+ tree args ATTRIBUTE_UNUSED,
+ int flags ATTRIBUTE_UNUSED, bool *no_add_attrs)
+{
+ if (TREE_CODE (*node) == FUNCTION_DECL)
+ {
+ /* Do nothing else, just set the attribute. We'll get at
+ it later with lookup_attribute. */
+ }
+ else
+ {
+ warning ("`%s' attribute ignored", IDENTIFIER_POINTER (name));
+ *no_add_attrs = true;
+ }
+
+ return NULL_TREE;
+}
+
/* Handle a "used" attribute; arguments as in
struct attribute_spec.handler. */
Index: gcc/cgraphunit.c
===================================================================
RCS file: /cvs/gcc/gcc/gcc/cgraphunit.c,v
retrieving revision 1.1.4.39
diff -u -u -r1.1.4.39 cgraphunit.c
--- gcc/cgraphunit.c 4 Mar 2004 15:38:34 -0000 1.1.4.39
+++ gcc/cgraphunit.c 14 Mar 2004 17:51:26 -0000
@@ -1045,7 +1045,7 @@
else
e->callee->global.inlined_to = e->caller;
- /* Recursivly clone all bodies. */
+ /* Recursivly clone all inlined bodies. */
for (e = e->callee->callees; e; e = e->next_callee)
if (!e->inline_failed)
cgraph_clone_inlined_nodes (e, duplicate);
@@ -1192,7 +1192,7 @@
recursive = what->decl == to->global.inlined_to->decl;
else
recursive = what->decl == to->decl;
- /* Marking recursive function inlinine has sane semantic and thus we should
+ /* Marking recursive function inline has sane semantic and thus we should
not warn on it. */
if (recursive && reason)
*reason = (what->local.disregard_inline_limits
@@ -1440,6 +1440,67 @@
free (heap_node);
}
+/* Find callgraph nodes closing a circle in the graph. The
+ resulting hashtab can be used to avoid walking the circles.
+ Uses the cgraph nodes ->aux field which needs to be zero
+ before and will be zero after operation. */
+
+static void
+cgraph_find_cycles (struct cgraph_node *node, htab_t cycles)
+{
+ struct cgraph_edge *e;
+
+ if (node->aux)
+ {
+ void **slot;
+ slot = htab_find_slot (cycles, node, INSERT);
+ if (!*slot)
+ {
+ if (cgraph_dump_file)
+ fprintf (cgraph_dump_file, "Cycle contains %s\n", cgraph_node_name (node));
+ *slot = node;
+ }
+ return;
+ }
+
+ node->aux = node;
+ for (e = node->callees; e; e = e->next_callee)
+ {
+ cgraph_find_cycles (e->callee, cycles);
+ }
+ node->aux = 0;
+}
+
+/* Leafify the cgraph node. We have to be careful in recursing
+ as to not run endlessly in circles of the callgraph.
+ We do so by using a hashtab of cycle entering nodes as generated
+ by cgraph_find_cycles. */
+
+static void
+cgraph_leafify_node (struct cgraph_node *node, htab_t cycles)
+{
+ struct cgraph_edge *e;
+
+ for (e = node->callees; e; e = e->next_callee)
+ {
+ /* Inline call, if possible, and recurse. Be sure we are not
+ entering callgraph circles here. */
+ if (e->inline_failed
+ && e->callee->local.inlinable
+ && !cgraph_recursive_inlining_p (node, e->callee,
+ &e->inline_failed)
+ && !htab_find (cycles, e->callee))
+ {
+ if (cgraph_dump_file)
+ fprintf (cgraph_dump_file, " inlining %s", cgraph_node_name (e->callee));
+ cgraph_mark_inline_edge (e);
+ cgraph_leafify_node (e->callee, cycles);
+ }
+ else if (cgraph_dump_file)
+ fprintf (cgraph_dump_file, " !inlining %s", cgraph_node_name (e->callee));
+ }
+}
+
/* Decide on the inlining. We do so in the topological order to avoid
expenses on updating datastructures. */
@@ -1477,6 +1538,24 @@
struct cgraph_edge *e;
node = order[i];
+
+ /* Handle nodes to be leafified, but don't update overall unit size. */
+ if (lookup_attribute ("leafify", DECL_ATTRIBUTES (node->decl)) != NULL)
+ {
+ int old_overall_insns = overall_insns;
+ htab_t cycles;
+ if (cgraph_dump_file)
+ fprintf (cgraph_dump_file,
+ "Leafifying %s\n", cgraph_node_name (node));
+ cycles = htab_create (7, htab_hash_pointer, htab_eq_pointer, NULL);
+ cgraph_find_cycles (node, cycles);
+ cgraph_leafify_node (node, cycles);
+ htab_delete (cycles);
+ overall_insns = old_overall_insns;
+ /* We don't need to consider always_inline functions inside the leafified
+ function anymore. */
+ continue;
+ }
for (e = node->callees; e; e = e->next_callee)
if (e->callee->local.disregard_inline_limits)
Index: gcc/doc/extend.texi
===================================================================
RCS file: /cvs/gcc/gcc/gcc/doc/extend.texi,v
retrieving revision 1.82.2.36
diff -u -u -r1.82.2.36 extend.texi
--- gcc/doc/extend.texi 2 Mar 2004 18:42:50 -0000 1.82.2.36
+++ gcc/doc/extend.texi 14 Mar 2004 17:51:30 -0000
@@ -1893,7 +1893,7 @@
attributes when making a declaration. This keyword is followed by an
attribute specification inside double parentheses. The following
attributes are currently defined for functions on all targets:
-@code{noreturn}, @code{noinline}, @code{always_inline},
+@code{noreturn}, @code{noinline}, @code{always_inline}, @code{leafify},
@code{pure}, @code{const}, @code{nothrow},
@code{format}, @code{format_arg}, @code{no_instrument_function},
@code{section}, @code{constructor}, @code{destructor}, @code{used},
@@ -1969,6 +1969,14 @@
Generally, functions are not inlined unless optimization is specified.
For functions declared inline, this attribute inlines the function even
if no optimization level was specified.
+
+@cindex @code{leafify} function attribute
+@item leafify
+Generally, inlining into a function is limited. For a function marked with
+this attribute, every call inside this function will be inlined, if possible.
+Whether the function itself is considered for inlining depends on its size and
+the current inlining parameters. The @code{leafify} attribute only works
+reliably in unit-at-a-time mode.
@cindex @code{pure} function attribute
@item pure
Index: libstdc++-v3/include/c_std/std_cmath.h
===================================================================
RCS file: /cvs/gcc/gcc/libstdc++-v3/include/c_std/std_cmath.h,v
retrieving revision 1.5.6.7
diff -u -u -r1.5.6.7 std_cmath.h
--- libstdc++-v3/include/c_std/std_cmath.h 3 Jan 2004 23:05:32 -0000 1.5.6.7
+++ libstdc++-v3/include/c_std/std_cmath.h 14 Mar 2004 17:51:55 -0000
@@ -330,9 +330,31 @@
{ return __builtin_modfl(__x, __iptr); }
template<typename _Tp>
- inline _Tp
+ inline _Tp __attribute__((always_inline))
__pow_helper(_Tp __x, int __n)
{
+ if (__builtin_constant_p(__n))
+ switch (__n) {
+ case -1:
+ return _Tp(1)/__x;
+ case 0:
+ return _Tp(1);
+ case 1:
+ return __x;
+ case 2:
+ return __x*__x;
+#if ! __OPTIMIZE_SIZE__
+ case -2:
+ return _Tp(1)/(__x*__x);
+ case 3:
+ return __x*__x*__x;
+ case 4:
+ {
+ _Tp __y = __x*__x;
+ return __y*__y;
+ }
+#endif
+ }
return __n < 0
? _Tp(1)/__cmath_power(__x, -__n)
: __cmath_power(__x, __n);
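
The cycle detection in the cgraphunit.c hunk above can be tried outside of
GCC. What follows is only a minimal sketch, not the patch itself: struct
call_graph, add_call, find_cycles and MAX_NODES are hypothetical names, with
on_stack standing in for the node's ->aux field and closes_cycle standing in
for the hash table. Like the patch, it marks a node when a depth-first walk
reaches it again while it is still on the current path, and it clears the
path marker on the way back out.

```c
#include <assert.h>
#include <stdbool.h>

#define MAX_NODES 16   /* hypothetical fixed size, for illustration only */

/* Hypothetical stand-in for the call graph: callee[i][k] is the k-th
   function called by function i, ncallees[i] is how many there are.  */
struct call_graph
{
  int ncallees[MAX_NODES];
  int callee[MAX_NODES][MAX_NODES];
  bool on_stack[MAX_NODES];      /* plays the role of node->aux */
  bool closes_cycle[MAX_NODES];  /* plays the role of the cycles hash table */
};

/* Record that function FROM calls function TO.  */
static void
add_call (struct call_graph *g, int from, int to)
{
  assert (from >= 0 && from < MAX_NODES && to >= 0 && to < MAX_NODES);
  assert (g->ncallees[from] < MAX_NODES);
  g->callee[from][g->ncallees[from]++] = to;
}

/* Depth-first walk from NODE.  A node reached again while it is still on
   the current path closes a cycle and is recorded.  on_stack must be all
   false before the call and is all false again afterwards.  */
static void
find_cycles (struct call_graph *g, int node)
{
  if (g->on_stack[node])
    {
      g->closes_cycle[node] = true;
      return;
    }

  g->on_stack[node] = true;
  for (int k = 0; k < g->ncallees[node]; k++)
    find_cycles (g, g->callee[node][k]);
  g->on_stack[node] = false;
}
```

For the calls 0->1, 1->2, 2->1 and 0->3, find_cycles (&g, 0) on a
zero-initialized graph marks only node 1, which is the node a leafify-style
walk would then refuse to enter. Note that, as in the hunk, nothing is
memoized beyond the current path, so shared callees are walked once per
path that reaches them.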
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=13776
^ permalink raw reply [flat|nested] 93+ messages in thread
* [Bug c++/13776] [tree-ssa] Many C++ compile-time regression in 3.5-tree-ssa 040120
2004-01-20 18:39 [Bug c++/13776] New: Many C++ compile-time regression in 3.5-tree-ssa 040120 in comparison with 3.4.0 040114 kgardas at objectsecurity dot com
` (31 preceding siblings ...)
2004-03-14 18:00 ` rguenth at tat dot physik dot uni-tuebingen dot de
@ 2004-03-14 18:22 ` rguenth at tat dot physik dot uni-tuebingen dot de
2004-03-14 22:21 ` dberlin at dberlin dot org
` (57 subsequent siblings)
90 siblings, 0 replies; 93+ messages in thread
From: rguenth at tat dot physik dot uni-tuebingen dot de @ 2004-03-14 18:22 UTC (permalink / raw)
To: gcc-bugs
------- Additional Comments From rguenth at tat dot physik dot uni-tuebingen dot de 2004-03-14 18:22 -------
Subject: Re: [tree-ssa] Many C++ compile-time regression in
3.5-tree-ssa 040120
dberlin at dberlin dot org wrote:
> ------- Additional Comments From dberlin at dberlin dot org 2004-03-14 15:38 -------
> Subject: Bug 13776
> With my changes to PRE, I get the same numbers, except PRE is at 28
> seconds instead of 36.
>
> I certainly get *nowhere close* to 600 seconds in PRE, or the numbers
> you get overall.
> I can't fix a problem I can't reproduce; I can only take stabs at it.
> Can someone else please verify his numbers so I know whether it's my
> test setup or his?
One way to check whether leafify is working correctly is to look at the
assembly generated for, e.g.,
_ZN14MultiArgKernelI9MultiArg5I5FieldI22UniformRectilinearMeshI10MeshTraitsILi3Ed21UniformRectilinearTag12CartesianTagLi3EEEd9BrickViewES9_S9_S9_S9_E15EvaluateLocLoopIN6Forgas5VXUpdILi3EEELi3EEE3runEv
It should be straight-line code without calls. Note that without
-funroll-loops or -fpeel-loops the code contains a lot of explicit
loops that iterate three times, so it is easier to read with
-funroll-loops enabled.
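
For anyone who wants to repeat this check on something smaller than
tramp3d-v2, here is a minimal sketch of a leafified function. It assumes a
compiler with the leafify patch applied; HAVE_LEAFIFY_ATTRIBUTE, LEAFIFY,
square, norm2 and kernel are made-up names, and when the macro is not
defined the attribute is simply omitted so the file still builds with an
unpatched compiler.

```c
/* Assumption: HAVE_LEAFIFY_ATTRIBUTE is a hypothetical macro that you
   define yourself when building with a compiler that has the leafify
   patch applied.  Without it, LEAFIFY expands to nothing.  */
#ifdef HAVE_LEAFIFY_ATTRIBUTE
# define LEAFIFY __attribute__ ((leafify))
#else
# define LEAFIFY
#endif

/* Two levels of helper calls underneath the kernel.  */
static double
square (double x)
{
  return x * x;
}

static double
norm2 (double x, double y, double z)
{
  return square (x) + square (y) + square (z);
}

/* With the attribute in effect, every call below this function is
   supposed to be inlined into it, if possible.  Sums the squared
   lengths of the N/3 three-component vectors stored in V.  */
LEAFIFY double
kernel (const double *v, int n)
{
  double sum = 0.0;
  for (int i = 0; i + 2 < n; i += 3)
    sum += norm2 (v[i], v[i + 1], v[i + 2]);
  return sum;
}
```

Compiling this with -O2 -S and reading the assembly for kernel should show
no call instructions if the attribute took effect. Helpers this small are
likely to be inlined at -O2 even without it, so the attribute only makes a
visible difference for deep call chains that exceed the normal inlining
limits, as in the function named above.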
Richard.
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=13776
^ permalink raw reply [flat|nested] 93+ messages in thread
* [Bug c++/13776] [tree-ssa] Many C++ compile-time regression in 3.5-tree-ssa 040120
2004-01-20 18:39 [Bug c++/13776] New: Many C++ compile-time regression in 3.5-tree-ssa 040120 in comparison with 3.4.0 040114 kgardas at objectsecurity dot com
` (32 preceding siblings ...)
2004-03-14 18:22 ` rguenth at tat dot physik dot uni-tuebingen dot de
@ 2004-03-14 22:21 ` dberlin at dberlin dot org
2004-03-14 22:23 ` dberlin at dberlin dot org
` (56 subsequent siblings)
90 siblings, 0 replies; 93+ messages in thread
From: dberlin at dberlin dot org @ 2004-03-14 22:21 UTC (permalink / raw)
To: gcc-bugs
------- Additional Comments From dberlin at dberlin dot org 2004-03-14 22:21 -------
Subject: Re: [tree-ssa] Many C++ compile-time regression in 3.5-tree-ssa 040120
>>
>
> A way to check if leafify is working correctly is to look at the
> assembler generated for f.i.
>
> _ZN14MultiArgKernelI9MultiArg5I5FieldI22UniformRectilinearMeshI10MeshTr
> aitsILi3Ed21UniformRectilinearTag12CartesianTagLi3EEEd9BrickViewES9_S9_
> S9_S9_E15EvaluateLocLoopIN6Forgas5VXUpdILi3EEELi3EEE3runEv
>
> it should be straight-line code without calls.
It is.
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=13776
^ permalink raw reply [flat|nested] 93+ messages in thread
* [Bug c++/13776] [tree-ssa] Many C++ compile-time regression in 3.5-tree-ssa 040120
2004-01-20 18:39 [Bug c++/13776] New: Many C++ compile-time regression in 3.5-tree-ssa 040120 in comparison with 3.4.0 040114 kgardas at objectsecurity dot com
` (33 preceding siblings ...)
2004-03-14 22:21 ` dberlin at dberlin dot org
@ 2004-03-14 22:23 ` dberlin at dberlin dot org
2004-03-14 22:28 ` rguenth at tat dot physik dot uni-tuebingen dot de
` (55 subsequent siblings)
90 siblings, 0 replies; 93+ messages in thread
From: dberlin at dberlin dot org @ 2004-03-14 22:23 UTC (permalink / raw)
To: gcc-bugs
------- Additional Comments From dberlin at dberlin dot org 2004-03-14 22:23 -------
Subject: Re: [tree-ssa] Many C++ compile-time regression in 3.5-tree-ssa 040120
>
> I even have checking disabled. GC time seems to be identical, parsing
> is 13.5s vs 18.4s - the first big difference is integration, which
> suggests that leafifying is not enabled?
As I showed in the next comment, the leafified functions have no
function calls.
> Maybe the patch applied
> "wrong", I attached a complete diff of my local changes.
I have exactly these changes installed.
(I verified it by hand and by comparing the applied diffs).
What platform are you doing this on?
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=13776
^ permalink raw reply [flat|nested] 93+ messages in thread
* [Bug c++/13776] [tree-ssa] Many C++ compile-time regression in 3.5-tree-ssa 040120
2004-01-20 18:39 [Bug c++/13776] New: Many C++ compile-time regression in 3.5-tree-ssa 040120 in comparison with 3.4.0 040114 kgardas at objectsecurity dot com
` (34 preceding siblings ...)
2004-03-14 22:23 ` dberlin at dberlin dot org
@ 2004-03-14 22:28 ` rguenth at tat dot physik dot uni-tuebingen dot de
2004-03-14 22:52 ` dberlin at gcc dot gnu dot org
` (54 subsequent siblings)
90 siblings, 0 replies; 93+ messages in thread
From: rguenth at tat dot physik dot uni-tuebingen dot de @ 2004-03-14 22:28 UTC (permalink / raw)
To: gcc-bugs
------- Additional Comments From rguenth at tat dot physik dot uni-tuebingen dot de 2004-03-14 22:28 -------
Subject: Re: [tree-ssa] Many C++ compile-time regression in
3.5-tree-ssa 040120
dberlin at dberlin dot org wrote:
> ------- Additional Comments From dberlin at dberlin dot org 2004-03-14 22:23 -------
> Subject: Re: [tree-ssa] Many C++ compile-time regression in 3.5-tree-ssa 040120
>
>
>>I even have checking disabled. GC time seems to be identical, parsing
>>is 13.5s vs 18.4s - the first big difference is integration, which
>>suggests that leafifying is not enabled?
>
>
> As I showed in the next comment, the leafified functions have no
> function calls.
>
>
>> Maybe the patch applied
>>"wrong", I attached a complete diff of my local changes.
>
>
> I have exactly these changes installed.
> (I verified it by hand and by comparing the applied diffs).
Ok.
>
> What platform are you doing this on?
On ia32, I'm trying to bootstrap on ia64 now. I'm configuring with
--enable-languages="c,c++" --enable-threads=posix --enable-__cxa_atexit
--disable-libunwind-exceptions --disable-mudflap --disable-checking
Richard.
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=13776
^ permalink raw reply [flat|nested] 93+ messages in thread
* [Bug c++/13776] [tree-ssa] Many C++ compile-time regression in 3.5-tree-ssa 040120
2004-01-20 18:39 [Bug c++/13776] New: Many C++ compile-time regression in 3.5-tree-ssa 040120 in comparison with 3.4.0 040114 kgardas at objectsecurity dot com
` (35 preceding siblings ...)
2004-03-14 22:28 ` rguenth at tat dot physik dot uni-tuebingen dot de
@ 2004-03-14 22:52 ` dberlin at gcc dot gnu dot org
2004-03-14 23:14 ` dberlin at gcc dot gnu dot org
` (53 subsequent siblings)
90 siblings, 0 replies; 93+ messages in thread
From: dberlin at gcc dot gnu dot org @ 2004-03-14 22:52 UTC (permalink / raw)
To: gcc-bugs
------- Additional Comments From dberlin at gcc dot gnu dot org 2004-03-14 22:52 -------
(In reply to comment #34)
> > What platform are you doing this on?
>
> On ia32, I'm trying to bootstrap on ia64 now. I'm configuring with
> --enable-languages="c,c++" --enable-threads=posix --enable-__cxa_atexit
> --disable-libunwind-exceptions --disable-mudflap --disable-checking
>
Hmmm.
I reconfigured with exactly those flags and re-bootstrapped, and now I get the same numbers you do.
Memory usage was also way up.
However, after that, I ran configure with no options, bootstrapped, and got the numbers I posted.
Can you run configure without any options at all, bootstrap, and see what numbers you get?
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=13776
^ permalink raw reply [flat|nested] 93+ messages in thread
* [Bug c++/13776] [tree-ssa] Many C++ compile-time regression in 3.5-tree-ssa 040120
2004-01-20 18:39 [Bug c++/13776] New: Many C++ compile-time regression in 3.5-tree-ssa 040120 in comparison with 3.4.0 040114 kgardas at objectsecurity dot com
` (36 preceding siblings ...)
2004-03-14 22:52 ` dberlin at gcc dot gnu dot org
@ 2004-03-14 23:14 ` dberlin at gcc dot gnu dot org
2004-03-14 23:45 ` pinskia at gcc dot gnu dot org
` (52 subsequent siblings)
90 siblings, 0 replies; 93+ messages in thread
From: dberlin at gcc dot gnu dot org @ 2004-03-14 23:14 UTC (permalink / raw)
To: gcc-bugs
------- Additional Comments From dberlin at gcc dot gnu dot org 2004-03-14 23:14 -------
These are my numbers when configured with just --disable-checking (with the leafify patch, etc.):
Execution times (seconds)
garbage collection : 21.30 ( 9%) usr 0.12 ( 1%) sys 22.05 ( 8%) wall
callgraph construction: 0.73 ( 0%) usr 0.00 ( 0%) sys 0.76 ( 0%) wall
callgraph optimization: 0.73 ( 0%) usr 0.03 ( 0%) sys 0.78 ( 0%) wall
cfg construction : 0.54 ( 0%) usr 0.04 ( 0%) sys 0.58 ( 0%) wall
cfg cleanup : 2.08 ( 1%) usr 0.05 ( 1%) sys 2.17 ( 1%) wall
trivially dead code : 1.45 ( 1%) usr 0.01 ( 0%) sys 1.48 ( 1%) wall
life analysis : 4.52 ( 2%) usr 0.01 ( 0%) sys 4.64 ( 2%) wall
life info update : 2.23 ( 1%) usr 0.01 ( 0%) sys 2.26 ( 1%) wall
alias analysis : 2.66 ( 1%) usr 0.03 ( 0%) sys 2.86 ( 1%) wall
register scan : 1.73 ( 1%) usr 0.00 ( 0%) sys 1.73 ( 1%) wall
rebuild jump labels : 0.52 ( 0%) usr 0.00 ( 0%) sys 0.52 ( 0%) wall
preprocessing : 0.63 ( 0%) usr 0.16 ( 2%) sys 0.80 ( 0%) wall
parser : 13.73 ( 6%) usr 1.55 (19%) sys 20.68 ( 8%) wall
name lookup : 5.70 ( 2%) usr 2.05 (25%) sys 7.89 ( 3%) wall
integration : 27.48 (11%) usr 0.21 ( 3%) sys 28.53 (11%) wall
tree gimplify : 1.96 ( 1%) usr 0.02 ( 0%) sys 2.02 ( 1%) wall
tree eh : 3.06 ( 1%) usr 0.13 ( 2%) sys 3.35 ( 1%) wall
tree CFG construction : 1.65 ( 1%) usr 0.07 ( 1%) sys 1.80 ( 1%) wall
tree CFG cleanup : 3.53 ( 1%) usr 0.03 ( 0%) sys 3.76 ( 1%) wall
tree PTA : 0.64 ( 0%) usr 0.00 ( 0%) sys 0.64 ( 0%) wall
tree alias analysis : 0.70 ( 0%) usr 0.00 ( 0%) sys 0.72 ( 0%) wall
tree PHI insertion : 11.00 ( 5%) usr 0.07 ( 1%) sys 11.31 ( 4%) wall
tree SSA rewrite : 3.34 ( 1%) usr 0.06 ( 1%) sys 3.55 ( 1%) wall
tree SSA other : 4.79 ( 2%) usr 0.64 ( 8%) sys 5.57 ( 2%) wall
tree operand scan : 4.10 ( 2%) usr 0.63 ( 8%) sys 4.80 ( 2%) wall
dominator optimization: 14.61 ( 6%) usr 0.54 ( 7%) sys 15.46 ( 6%) wall
tree SRA : 0.27 ( 0%) usr 0.02 ( 0%) sys 0.29 ( 0%) wall
tree CCP : 1.58 ( 1%) usr 0.02 ( 0%) sys 1.65 ( 1%) wall
tree split crit edges : 0.22 ( 0%) usr 0.00 ( 0%) sys 0.22 ( 0%) wall
tree PRE : 26.66 (11%) usr 0.17 ( 2%) sys 27.40 (10%) wall
tree linearize phis : 0.00 ( 0%) usr 0.01 ( 0%) sys 0.01 ( 0%) wall
tree forward propagate: 1.25 ( 1%) usr 0.01 ( 0%) sys 1.28 ( 0%) wall
tree conservative DCE : 2.54 ( 1%) usr 0.05 ( 1%) sys 2.70 ( 1%) wall
tree aggressive DCE : 1.09 ( 0%) usr 0.01 ( 0%) sys 1.10 ( 0%) wall
tree DSE : 2.52 ( 1%) usr 0.01 ( 0%) sys 2.64 ( 1%) wall
tree copy headers : 2.22 ( 1%) usr 0.06 ( 1%) sys 2.32 ( 1%) wall
tree SSA to normal : 2.74 ( 1%) usr 0.15 ( 2%) sys 2.90 ( 1%) wall
tree rename SSA copies: 0.59 ( 0%) usr 0.03 ( 0%) sys 0.66 ( 0%) wall
dominance frontiers : 0.42 ( 0%) usr 0.00 ( 0%) sys 0.42 ( 0%) wall
control dependences : 0.15 ( 0%) usr 0.00 ( 0%) sys 0.15 ( 0%) wall
expand : 15.77 ( 6%) usr 0.26 ( 3%) sys 16.61 ( 6%) wall
varconst : 0.54 ( 0%) usr 0.03 ( 0%) sys 0.89 ( 0%) wall
jump : 1.16 ( 0%) usr 0.14 ( 2%) sys 1.37 ( 1%) wall
CSE : 7.87 ( 3%) usr 0.04 ( 0%) sys 8.19 ( 3%) wall
global CSE : 6.11 ( 3%) usr 0.09 ( 1%) sys 6.30 ( 2%) wall
loop analysis : 1.41 ( 1%) usr 0.00 ( 0%) sys 1.41 ( 1%) wall
bypass jumps : 1.10 ( 0%) usr 0.00 ( 0%) sys 1.12 ( 0%) wall
CSE 2 : 3.16 ( 1%) usr 0.02 ( 0%) sys 3.20 ( 1%) wall
branch prediction : 2.52 ( 1%) usr 0.08 ( 1%) sys 2.73 ( 1%) wall
flow analysis : 0.10 ( 0%) usr 0.00 ( 0%) sys 0.10 ( 0%) wall
combiner : 3.49 ( 1%) usr 0.01 ( 0%) sys 3.62 ( 1%) wall
if-conversion : 0.70 ( 0%) usr 0.01 ( 0%) sys 0.74 ( 0%) wall
regmove : 1.01 ( 0%) usr 0.01 ( 0%) sys 1.04 ( 0%) wall
mode switching : 0.01 ( 0%) usr 0.00 ( 0%) sys 0.01 ( 0%) wall
local alloc : 2.88 ( 1%) usr 0.02 ( 0%) sys 2.97 ( 1%) wall
global alloc : 6.36 ( 3%) usr 0.17 ( 2%) sys 6.91 ( 3%) wall
reload CSE regs : 2.86 ( 1%) usr 0.00 ( 0%) sys 3.21 ( 1%) wall
flow 2 : 0.52 ( 0%) usr 0.00 ( 0%) sys 0.54 ( 0%) wall
if-conversion 2 : 0.39 ( 0%) usr 0.00 ( 0%) sys 0.40 ( 0%) wall
peephole 2 : 0.51 ( 0%) usr 0.02 ( 0%) sys 0.54 ( 0%) wall
rename registers : 0.73 ( 0%) usr 0.05 ( 1%) sys 0.79 ( 0%) wall
scheduling 2 : 2.85 ( 1%) usr 0.05 ( 1%) sys 3.02 ( 1%) wall
reorder blocks : 0.28 ( 0%) usr 0.01 ( 0%) sys 0.30 ( 0%) wall
shorten branches : 0.54 ( 0%) usr 0.02 ( 0%) sys 0.56 ( 0%) wall
reg stack : 0.08 ( 0%) usr 0.00 ( 0%) sys 0.08 ( 0%) wall
final : 1.10 ( 0%) usr 0.13 ( 2%) sys 1.43 ( 1%) wall
symout : 0.03 ( 0%) usr 0.01 ( 0%) sys 0.04 ( 0%) wall
rest of compilation : 1.83 ( 1%) usr 0.01 ( 0%) sys 1.87 ( 1%) wall
TOTAL : 243.59 8.18 264.59
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=13776
^ permalink raw reply [flat|nested] 93+ messages in thread
* [Bug c++/13776] [tree-ssa] Many C++ compile-time regression in 3.5-tree-ssa 040120
2004-01-20 18:39 [Bug c++/13776] New: Many C++ compile-time regression in 3.5-tree-ssa 040120 in comparison with 3.4.0 040114 kgardas at objectsecurity dot com
` (37 preceding siblings ...)
2004-03-14 23:14 ` dberlin at gcc dot gnu dot org
@ 2004-03-14 23:45 ` pinskia at gcc dot gnu dot org
2004-03-15 9:03 ` rguenth at tat dot physik dot uni-tuebingen dot de
` (51 subsequent siblings)
90 siblings, 0 replies; 93+ messages in thread
From: pinskia at gcc dot gnu dot org @ 2004-03-14 23:45 UTC (permalink / raw)
To: gcc-bugs
------- Additional Comments From pinskia at gcc dot gnu dot org 2004-03-14 23:45 -------
I think this one:
integration : 27.48 (11%) usr 0.21 ( 3%) sys 28.53 (11%) wall
is caused by GIMPLE having more trees to copy, so doing inlining later on (i.e. after the
first DCE happens) may help, but the inliner would then need to be a BB inliner.
--
What |Removed |Added
----------------------------------------------------------------------------
Status|UNCONFIRMED |NEW
Ever Confirmed| |1
Last reconfirmed|0000-00-00 00:00:00 |2004-03-14 23:45:33
date| |
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=13776
^ permalink raw reply [flat|nested] 93+ messages in thread
* [Bug c++/13776] [tree-ssa] Many C++ compile-time regression in 3.5-tree-ssa 040120
2004-01-20 18:39 [Bug c++/13776] New: Many C++ compile-time regression in 3.5-tree-ssa 040120 in comparison with 3.4.0 040114 kgardas at objectsecurity dot com
` (38 preceding siblings ...)
2004-03-14 23:45 ` pinskia at gcc dot gnu dot org
@ 2004-03-15 9:03 ` rguenth at tat dot physik dot uni-tuebingen dot de
2004-03-24 17:16 ` giovannibajo at libero dot it
` (50 subsequent siblings)
90 siblings, 0 replies; 93+ messages in thread
From: rguenth at tat dot physik dot uni-tuebingen dot de @ 2004-03-15 9:03 UTC (permalink / raw)
To: gcc-bugs
------- Additional Comments From rguenth at tat dot physik dot uni-tuebingen dot de 2004-03-15 09:03 -------
Subject: Re: [tree-ssa] Many C++ compile-time regression in
3.5-tree-ssa 040120
On Mon, 14 Mar 2004, dberlin at gcc dot gnu dot org wrote:
> ------- Additional Comments From dberlin at gcc dot gnu dot org 2004-03-14 23:14 -------
> these are my numbers when configured with just --disable-checking (with the leafify patch, etc)
The results with just --disable-checking are the same. Humm.
--disable-libunwind-exceptions should make no difference for me, as I
don't have libunwind installed - maybe it's making the difference for you?
Confused,
Richard.
--
Richard Guenther <richard dot guenther at uni-tuebingen dot de>
WWW: http://www.tat.physik.uni-tuebingen.de/~rguenth/
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=13776
^ permalink raw reply [flat|nested] 93+ messages in thread
* [Bug c++/13776] [tree-ssa] Many C++ compile-time regression in 3.5-tree-ssa 040120
2004-01-20 18:39 [Bug c++/13776] New: Many C++ compile-time regression in 3.5-tree-ssa 040120 in comparison with 3.4.0 040114 kgardas at objectsecurity dot com
` (39 preceding siblings ...)
2004-03-15 9:03 ` rguenth at tat dot physik dot uni-tuebingen dot de
@ 2004-03-24 17:16 ` giovannibajo at libero dot it
2004-03-24 18:32 ` giovannibajo at libero dot it
` (49 subsequent siblings)
90 siblings, 0 replies; 93+ messages in thread
From: giovannibajo at libero dot it @ 2004-03-24 17:16 UTC (permalink / raw)
To: gcc-bugs
--
What |Removed |Added
----------------------------------------------------------------------------
BugsThisDependsOn| |14719
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=13776
^ permalink raw reply [flat|nested] 93+ messages in thread
* [Bug c++/13776] [tree-ssa] Many C++ compile-time regression in 3.5-tree-ssa 040120
2004-01-20 18:39 [Bug c++/13776] New: Many C++ compile-time regression in 3.5-tree-ssa 040120 in comparison with 3.4.0 040114 kgardas at objectsecurity dot com
` (40 preceding siblings ...)
2004-03-24 17:16 ` giovannibajo at libero dot it
@ 2004-03-24 18:32 ` giovannibajo at libero dot it
2004-03-28 7:38 ` pinskia at gcc dot gnu dot org
` (48 subsequent siblings)
90 siblings, 0 replies; 93+ messages in thread
From: giovannibajo at libero dot it @ 2004-03-24 18:32 UTC (permalink / raw)
To: gcc-bugs
--
Bug 13776 depends on bug 14719, which changed state.
Bug 14719 Summary: [tree-ssa] Excessive memory trashing on tree-ssa
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=14719
What |Old Value |New Value
----------------------------------------------------------------------------
Status|UNCONFIRMED |RESOLVED
Resolution| |INVALID
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=13776
^ permalink raw reply [flat|nested] 93+ messages in thread
* [Bug c++/13776] [tree-ssa] Many C++ compile-time regression in 3.5-tree-ssa 040120
2004-01-20 18:39 [Bug c++/13776] New: Many C++ compile-time regression in 3.5-tree-ssa 040120 in comparison with 3.4.0 040114 kgardas at objectsecurity dot com
` (41 preceding siblings ...)
2004-03-24 18:32 ` giovannibajo at libero dot it
@ 2004-03-28 7:38 ` pinskia at gcc dot gnu dot org
2004-03-28 18:45 ` pinskia at gcc dot gnu dot org
` (47 subsequent siblings)
90 siblings, 0 replies; 93+ messages in thread
From: pinskia at gcc dot gnu dot org @ 2004-03-28 7:38 UTC (permalink / raw)
To: gcc-bugs
------- Additional Comments From pinskia at gcc dot gnu dot org 2004-03-28 07:38 -------
While profiling the build of libstdc++, I noticed that comptypes was not being
tail-called/sibcalled because its return type is bool, so this depends on PR 14440.
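For illustration only, here is a minimal sketch of the call shape in question. The names and the "type id" encoding are hypothetical and are not taken from comptypes: a function returning bool ends with a call to another function returning bool, so the call is a candidate for sibling-call optimization. Whether a given compiler actually performs the sibcall depends on target and optimization level.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical slow path: two type ids are "compatible" when they
   differ only in the low (qualifier) bit.  */
static bool
compare_slow (int a, int b)
{
  return (a >> 1) == (b >> 1);
}

/* The final call is in tail position and its bool result is returned
   unchanged, which is the pattern a sibcall optimization looks for.  */
bool
compare_types (int a, int b)
{
  /* Type ids are non-negative in this sketch.  */
  assert (a >= 0 && b >= 0);
  if (a == b)
    return true;
  return compare_slow (a, b);
}
```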
--
What |Removed |Added
----------------------------------------------------------------------------
BugsThisDependsOn| |14440
Last reconfirmed|2004-03-14 23:45:33 |2004-03-28 07:38:40
date| |
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=13776
^ permalink raw reply [flat|nested] 93+ messages in thread
* [Bug c++/13776] [tree-ssa] Many C++ compile-time regression in 3.5-tree-ssa 040120
2004-01-20 18:39 [Bug c++/13776] New: Many C++ compile-time regression in 3.5-tree-ssa 040120 in comparison with 3.4.0 040114 kgardas at objectsecurity dot com
` (42 preceding siblings ...)
2004-03-28 7:38 ` pinskia at gcc dot gnu dot org
@ 2004-03-28 18:45 ` pinskia at gcc dot gnu dot org
2004-03-29 2:32 ` pinskia at gcc dot gnu dot org
` (46 subsequent siblings)
90 siblings, 0 replies; 93+ messages in thread
From: pinskia at gcc dot gnu dot org @ 2004-03-28 18:45 UTC (permalink / raw)
To: gcc-bugs
------- Additional Comments From pinskia at gcc dot gnu dot org 2004-03-28 18:45 -------
I noticed that the bsi functions were not being optimized well because bsi is a struct
that contains other structs, so I am marking this as depending on PR 13953, which covers
SRA optimization of structs containing structs.
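A small sketch of the kind of code this affects, using hypothetical types rather than the real bsi definitions: an iterator struct that embeds another struct and is passed by value through small inline helpers. If scalar replacement can split both levels into scalars, the loop below reduces to plain pointer arithmetic; if it cannot see through the nesting, the iterator tends to stay in memory.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical inner iterator over an int array.  */
struct inner_iter { const int *ptr; const int *end; };

/* Hypothetical outer iterator: a struct containing a struct.  */
struct outer_iter { struct inner_iter it; int block_id; };

static inline struct outer_iter
outer_start (const int *data, size_t n, int block_id)
{
  struct outer_iter i;
  i.it.ptr = data;
  i.it.end = data + n;
  i.block_id = block_id;
  return i;
}

static inline int
outer_end_p (struct outer_iter i)
{
  return i.it.ptr == i.it.end;
}

static inline void
outer_next (struct outer_iter *i)
{
  i->it.ptr++;
}

static inline int
outer_value (struct outer_iter i)
{
  return *i.it.ptr;
}

/* Sum N ints through the nested iterator.  */
int
sum_block (const int *data, size_t n)
{
  struct outer_iter i;
  int sum = 0;

  assert (data != NULL || n == 0);
  for (i = outer_start (data, n, 0); !outer_end_p (i); outer_next (&i))
    sum += outer_value (i);
  return sum;
}
```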
--
What |Removed |Added
----------------------------------------------------------------------------
BugsThisDependsOn| |13953
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=13776
^ permalink raw reply [flat|nested] 93+ messages in thread
* [Bug c++/13776] [tree-ssa] Many C++ compile-time regression in 3.5-tree-ssa 040120
2004-01-20 18:39 [Bug c++/13776] New: Many C++ compile-time regression in 3.5-tree-ssa 040120 in comparison with 3.4.0 040114 kgardas at objectsecurity dot com
` (43 preceding siblings ...)
2004-03-28 18:45 ` pinskia at gcc dot gnu dot org
@ 2004-03-29 2:32 ` pinskia at gcc dot gnu dot org
2004-03-29 12:14 ` rguenth at tat dot physik dot uni-tuebingen dot de
` (45 subsequent siblings)
90 siblings, 0 replies; 93+ messages in thread
From: pinskia at gcc dot gnu dot org @ 2004-03-29 2:32 UTC (permalink / raw)
To: gcc-bugs
------- Additional Comments From pinskia at gcc dot gnu dot org 2004-03-29 02:32 -------
I am attaching a C example where tree-ssa is slower:
[zhivago2:~/src/testspeed] pinskia% time ~/gcc-tree-ssa/bin/gcc fold-const.i -S
18.640u 1.480s 0:21.38 94.1% 0+0k 0+5io 0pf+0w
[zhivago2:~/src/testspeed] pinskia% time ~/fsf-clean-nocheck/bin/gcc fold-const.i -S
9.060u 0.540s 0:09.93 96.6% 0+0k 0+4io 0pf+0w
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=13776
^ permalink raw reply [flat|nested] 93+ messages in thread
* [Bug c++/13776] [tree-ssa] Many C++ compile-time regression in 3.5-tree-ssa 040120
2004-01-20 18:39 [Bug c++/13776] New: Many C++ compile-time regression in 3.5-tree-ssa 040120 in comparison with 3.4.0 040114 kgardas at objectsecurity dot com
` (44 preceding siblings ...)
2004-03-29 2:32 ` pinskia at gcc dot gnu dot org
@ 2004-03-29 12:14 ` rguenth at tat dot physik dot uni-tuebingen dot de
2004-03-31 19:33 ` steven at gcc dot gnu dot org
` (44 subsequent siblings)
90 siblings, 0 replies; 93+ messages in thread
From: rguenth at tat dot physik dot uni-tuebingen dot de @ 2004-03-29 12:14 UTC (permalink / raw)
To: gcc-bugs
------- Additional Comments From rguenth at tat dot physik dot uni-tuebingen dot de 2004-03-29 12:14 -------
I set up a nightly tester on ia64-linux that does a bootstrap for c,c++ and
builds the tramp3d-v3.cpp testcase and does a performance check on the resulting
binary. Stats can be viewed at
http://www.tat.physik.uni-tuebingen.de/~rguenth/gcc/monitor-summary.html
Testing is done with an unpatched tree-ssa branch (i.e. w/o leafify).
The summary plot is updated manually, so it can lag behind if I forget to update it.
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=13776
^ permalink raw reply [flat|nested] 93+ messages in thread
* [Bug c++/13776] [tree-ssa] Many C++ compile-time regression in 3.5-tree-ssa 040120
2004-01-20 18:39 [Bug c++/13776] New: Many C++ compile-time regression in 3.5-tree-ssa 040120 in comparison with 3.4.0 040114 kgardas at objectsecurity dot com
` (45 preceding siblings ...)
2004-03-29 12:14 ` rguenth at tat dot physik dot uni-tuebingen dot de
@ 2004-03-31 19:33 ` steven at gcc dot gnu dot org
2004-03-31 19:37 ` rguenth at tat dot physik dot uni-tuebingen dot de
` (43 subsequent siblings)
90 siblings, 0 replies; 93+ messages in thread
From: steven at gcc dot gnu dot org @ 2004-03-31 19:33 UTC (permalink / raw)
To: gcc-bugs
------- Additional Comments From steven at gcc dot gnu dot org 2004-03-31 19:32 -------
C is also slower, here's the top of the oprofile on amd64 for
"-fno-tree-pre -O3" on a subset of Diego Novillo's cc1-i-files.
vma samples % symbol name
00730fa0 117920 10.1391 htab_find_slot_with_hash
00731350 53286 4.5817 iterative_hash
004802b0 22184 1.9074 bitmap_bit_p
006a3e20 20801 1.7885 ggc_alloc_stat
006717e0 19669 1.6912 for_each_rtx
006c5590 18536 1.5938 walk_tree
00730d00 16933 1.4559 find_empty_slot_for_expand
0064d5d0 16794 1.4440 constrain_operands
006579f0 16467 1.4159 reg_scan_mark_refs
00701db0 13922 1.1971 reg_is_remote_constant_p
00402b60 12999 1.1177 yyparse
004af330 12958 1.1142 cse_insn
00501050 12339 1.0609 mark_set_1
00671d00 12320 1.0593 note_stores
00523570 11714 1.0072 compute_transp
004a9270 10726 0.9223 count_reg_usage
--
What |Removed |Added
----------------------------------------------------------------------------
Last reconfirmed|2004-03-28 07:38:40 |2004-03-31 19:33:00
date| |
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=13776
^ permalink raw reply [flat|nested] 93+ messages in thread
* [Bug c++/13776] [tree-ssa] Many C++ compile-time regression in 3.5-tree-ssa 040120
2004-01-20 18:39 [Bug c++/13776] New: Many C++ compile-time regression in 3.5-tree-ssa 040120 in comparison with 3.4.0 040114 kgardas at objectsecurity dot com
` (46 preceding siblings ...)
2004-03-31 19:33 ` steven at gcc dot gnu dot org
@ 2004-03-31 19:37 ` rguenth at tat dot physik dot uni-tuebingen dot de
2004-03-31 19:53 ` zack at codesourcery dot com
` (42 subsequent siblings)
90 siblings, 0 replies; 93+ messages in thread
From: rguenth at tat dot physik dot uni-tuebingen dot de @ 2004-03-31 19:37 UTC (permalink / raw)
To: gcc-bugs
------- Additional Comments From rguenth at tat dot physik dot uni-tuebingen dot de 2004-03-31 19:37 -------
Subject: Re: [tree-ssa] Many C++ compile-time regression in
3.5-tree-ssa 040120
steven at gcc dot gnu dot org wrote:
> ------- Additional Comments From steven at gcc dot gnu dot org 2004-03-31 19:32 -------
> C is also slower, here's the top of the oprofile on amd64 for
> "-fno-tree-pre -O3" on a subset of Diego Novillo's cc1-i-files.
>
> vma samples % symbol name
> 00730fa0 117920 10.1391 htab_find_slot_with_hash
We have a lot of pointer hashing in gcc now and I see the above, too.
We can possibly micro-optimize the pointer hashing by introducing a
"specialization" of the libiberty hashfn for pointers where we can
inline both the hashing function and the comparison function. It will
introduce some code duplication, though (if this only was using C++ and
templates...).
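As a rough sketch of what such a pointer-only specialization could look like (all names here are hypothetical and this is not libiberty's API): a fixed-size open-addressing set where the hash and the equality test are written inline, so the compiler sees them directly instead of calling through function pointers. The fixed capacity and linear probing are simplifying assumptions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define PSET_SIZE 64   /* power of two; hypothetical fixed capacity */

struct ptr_set
{
  const void *slots[PSET_SIZE];
  size_t n_elements;
};

/* Inline pointer hash: drop the low bits, which are usually zero
   because of alignment, then mask to the table size.  */
static inline size_t
ptr_set_hash (const void *p)
{
  return (size_t) ((uintptr_t) p >> 3) & (PSET_SIZE - 1);
}

static inline void
ptr_set_init (struct ptr_set *s)
{
  size_t i;
  for (i = 0; i < PSET_SIZE; i++)
    s->slots[i] = NULL;
  s->n_elements = 0;
}

/* Return the slot holding P, or the empty slot where P would go.
   Terminates because insertion always leaves one slot empty.  */
static inline const void **
ptr_set_find_slot (struct ptr_set *s, const void *p)
{
  size_t i = ptr_set_hash (p);
  while (s->slots[i] != NULL && s->slots[i] != p)
    i = (i + 1) & (PSET_SIZE - 1);
  return &s->slots[i];
}

static inline int
ptr_set_contains (struct ptr_set *s, const void *p)
{
  return p != NULL && *ptr_set_find_slot (s, p) == p;
}

/* Insert P; return 1 if newly added, 0 if already present.  */
static inline int
ptr_set_insert (struct ptr_set *s, const void *p)
{
  const void **slot;

  assert (p != NULL);
  slot = ptr_set_find_slot (s, p);
  if (*slot == p)
    return 0;
  assert (s->n_elements < PSET_SIZE - 1);
  *slot = p;
  s->n_elements++;
  return 1;
}
```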
Richard.
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=13776
^ permalink raw reply [flat|nested] 93+ messages in thread
* [Bug c++/13776] [tree-ssa] Many C++ compile-time regression in 3.5-tree-ssa 040120
2004-01-20 18:39 [Bug c++/13776] New: Many C++ compile-time regression in 3.5-tree-ssa 040120 in comparison with 3.4.0 040114 kgardas at objectsecurity dot com
` (47 preceding siblings ...)
2004-03-31 19:37 ` rguenth at tat dot physik dot uni-tuebingen dot de
@ 2004-03-31 19:53 ` zack at codesourcery dot com
2004-03-31 20:01 ` rguenth at tat dot physik dot uni-tuebingen dot de
` (41 subsequent siblings)
90 siblings, 0 replies; 93+ messages in thread
From: zack at codesourcery dot com @ 2004-03-31 19:53 UTC (permalink / raw)
To: gcc-bugs
------- Additional Comments From zack at codesourcery dot com 2004-03-31 19:53 -------
Subject: Re: [tree-ssa] Many C++ compile-time regression in
3.5-tree-ssa 040120
"rguenth at tat dot physik dot uni-tuebingen dot de" <gcc-bugzilla@gcc.gnu.org> writes:
> We have a lot of pointer hashing in gcc now and I see the above, too.
> We can possibly micro-optimize the pointer hashing by introducing a
> "specialization" of the libiberty hashfn for pointers where we can
> inline both the hashing function and the comparison function. It will
> introduce some code duplication, though (if this only was using C++ and
> templates...).
Something I've wanted to do for a long time is do poor-man's templates
on hashtab.[ch] with macros. But I never seem to get sufficient round
tuits.
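To make the idea concrete, a hedged sketch of "templates via macros" follows. DEFINE_HASH_SET and everything it generates are hypothetical names, not anything from hashtab.[ch]: one macro stamps out a typed table whose hash and equality operations are expanded in place. The fixed size and the reserved EMPTY key are simplifying assumptions.

```c
#include <assert.h>
#include <stddef.h>

/* Generate a fixed-size, linear-probing set of KEY_T named NAME.
   HASH and EQ are macros or functions expanded at each use, and
   EMPTY is a key value reserved to mark unused slots.  */
#define DEFINE_HASH_SET(NAME, KEY_T, EMPTY, HASH, EQ, SIZE)             \
  struct NAME { KEY_T slots[SIZE]; size_t n_elements; };                \
  static inline void NAME##_init (struct NAME *s)                       \
  {                                                                     \
    size_t i;                                                           \
    for (i = 0; i < (SIZE); i++)                                        \
      s->slots[i] = (EMPTY);                                            \
    s->n_elements = 0;                                                  \
  }                                                                     \
  static inline KEY_T *NAME##_find_slot (struct NAME *s, KEY_T key)     \
  {                                                                     \
    size_t i = (size_t) HASH (key) % (SIZE);                            \
    while (!EQ (s->slots[i], (EMPTY)) && !EQ (s->slots[i], key))        \
      i = (i + 1) % (SIZE);                                             \
    return &s->slots[i];                                                \
  }                                                                     \
  static inline int NAME##_insert (struct NAME *s, KEY_T key)           \
  {                                                                     \
    KEY_T *slot;                                                        \
    assert (!EQ (key, (EMPTY)));                                        \
    slot = NAME##_find_slot (s, key);                                   \
    if (EQ (*slot, key))                                                \
      return 0;                                                         \
    assert (s->n_elements + 1 < (SIZE));                                \
    *slot = key;                                                        \
    s->n_elements++;                                                    \
    return 1;                                                           \
  }

/* One instantiation: a set of ints with a multiplicative hash.  */
#define INT_HASH(k) ((unsigned) (k) * 2654435761u)
#define INT_EQ(a, b) ((a) == (b))
DEFINE_HASH_SET (int_set, int, -1, INT_HASH, INT_EQ, 32)
```

A second instantiation for pointers would only need different KEY_T, EMPTY, HASH and EQ arguments, which is where the code duplication mentioned above ends up: in the expansion rather than in the source.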
zw
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=13776
^ permalink raw reply [flat|nested] 93+ messages in thread
* [Bug c++/13776] [tree-ssa] Many C++ compile-time regression in 3.5-tree-ssa 040120
2004-01-20 18:39 [Bug c++/13776] New: Many C++ compile-time regression in 3.5-tree-ssa 040120 in comparison with 3.4.0 040114 kgardas at objectsecurity dot com
` (48 preceding siblings ...)
2004-03-31 19:53 ` zack at codesourcery dot com
@ 2004-03-31 20:01 ` rguenth at tat dot physik dot uni-tuebingen dot de
2004-03-31 20:44 ` steven at gcc dot gnu dot org
` (40 subsequent siblings)
90 siblings, 0 replies; 93+ messages in thread
From: rguenth at tat dot physik dot uni-tuebingen dot de @ 2004-03-31 20:01 UTC (permalink / raw)
To: gcc-bugs
------- Additional Comments From rguenth at tat dot physik dot uni-tuebingen dot de 2004-03-31 20:01 -------
Subject: Re: [tree-ssa] Many C++ compile-time regression in
3.5-tree-ssa 040120
zack at codesourcery dot com wrote:
> ------- Additional Comments From zack at codesourcery dot com 2004-03-31 19:53 -------
> Subject: Re: [tree-ssa] Many C++ compile-time regression in
> 3.5-tree-ssa 040120
>
> "rguenth at tat dot physik dot uni-tuebingen dot de" <gcc-bugzilla@gcc.gnu.org> writes:
>
>
>>We have a lot of pointer hashing in gcc now and I see the above, too.
>>We can possibly micro-optimize the pointer hashing by introducing a
>>"specialization" of the libiberty hashfn for pointers where we can
>>inline both the hashing function and the comparison function. It will
>>introduce some code duplication, though (if this only was using C++ and
>>templates...).
>
>
> Something I've wanted to do for a long time is do poor-man's templates
> on hashtab.[ch] with macros. But I never seem to get sufficient round
> tuits.
I think it would pay for pointer hashing only, as this is the main use.
I did some experiments some time ago with a stripped down pointer-only
hash just replacing the walk_tree hashtab and it still was #1 in the
profile with little change in time (but I didn't measure overall
performance change).
Richard.
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=13776
^ permalink raw reply [flat|nested] 93+ messages in thread
* [Bug c++/13776] [tree-ssa] Many C++ compile-time regression in 3.5-tree-ssa 040120
2004-01-20 18:39 [Bug c++/13776] New: Many C++ compile-time regression in 3.5-tree-ssa 040120 in comparison with 3.4.0 040114 kgardas at objectsecurity dot com
` (49 preceding siblings ...)
2004-03-31 20:01 ` rguenth at tat dot physik dot uni-tuebingen dot de
@ 2004-03-31 20:44 ` steven at gcc dot gnu dot org
2004-04-04 12:45 ` steven at gcc dot gnu dot org
` (39 subsequent siblings)
90 siblings, 0 replies; 93+ messages in thread
From: steven at gcc dot gnu dot org @ 2004-03-31 20:44 UTC (permalink / raw)
To: gcc-bugs
------- Additional Comments From steven at gcc dot gnu dot org 2004-03-31 20:44 -------
I agree that a special pointer hasher would be nice. Should be easy,
just duplicate the code of iterative_hash in hashtab.c and specialize
it for void *.
But that doesn't reduce the number of find_slot calls. It's not like
the tables are sparse and we're getting tons of collisions. We just
use the hash table that much, and we should be looking into ways for
speeding it up.
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=13776
^ permalink raw reply [flat|nested] 93+ messages in thread
* [Bug c++/13776] [tree-ssa] Many C++ compile-time regression in 3.5-tree-ssa 040120
2004-01-20 18:39 [Bug c++/13776] New: Many C++ compile-time regression in 3.5-tree-ssa 040120 in comparison with 3.4.0 040114 kgardas at objectsecurity dot com
` (50 preceding siblings ...)
2004-03-31 20:44 ` steven at gcc dot gnu dot org
@ 2004-04-04 12:45 ` steven at gcc dot gnu dot org
2004-04-10 14:58 ` rguenth at tat dot physik dot uni-tuebingen dot de
` (38 subsequent siblings)
90 siblings, 0 replies; 93+ messages in thread
From: steven at gcc dot gnu dot org @ 2004-04-04 12:45 UTC (permalink / raw)
To: gcc-bugs
------- Additional Comments From steven at gcc dot gnu dot org 2004-04-04 12:45 -------
I did some profiling of iterative_hash on tree-ssa. Not
immediately related to this PR, perhaps, but part of the problem.
% cumulative self self total
time seconds seconds calls s/call s/call name
2.75 1.66 1.66 2329935 0.00 0.00 iterative_hash
2.29 3.04 1.38 235027 0.00 0.00 walk_tree
2.04 4.27 1.23 1419091 0.00 0.00 ggc_alloc_stat
1.87 5.40 1.13 1020397 0.00 0.00 htab_find_slot_with_hash
1.74 6.45 1.05 1295674 0.00 0.00 mark_set_1
1.67 7.46 1.01 396445 0.00 0.00 iterative_hash_expr
1.64 8.45 0.99 2947490 0.00 0.00 bitmap_bit_p
1.59 9.41 0.96 321482 0.00 0.00 for_each_rtx
1.54 10.34 0.93 1566242 0.00 0.00 bitmap_set_bit
1.42 11.20 0.86 770792 0.00 0.00 et_splay
Right now, this function seems to be used only on the tree-ssa
branch, and mostly in the tree optimizers via iterative_hash_expr:
-----------------------------------------------
1423028 iterative_hash_expr [35]
0.00 0.00 40/396445 pre_expression [433]
0.00 0.00 162/396445 process_delayed_rename [971]
0.03 0.04 10126/396445 gimple_tree_hash [516]
0.39 0.67 151915/396445 avail_expr_hash [71]
0.60 1.03 234202/396445 true_false_expr_hash [52]
[35] 4.6 1.01 1.74 396445+1423028 iterative_hash_expr [35]
1.65 0.00 2308918/2329935 iterative_hash [53]
0.06 0.00 383567/1028690 first_rtl_op [321]
0.03 0.00 546018/635717 commutative_tree_code [699]
1423028 iterative_hash_expr [35]
-----------------------------------------------
0.00 0.00 919/2329935 build_type_attribute_variant <cycle 12> [1420]
0.00 0.00 940/2329935 build_array_type [1299]
0.00 0.00 4814/2329935 build_function_type <cycle 12> [671]
0.01 0.00 14344/2329935 type_hash_list [900]
1.65 0.00 2308918/2329935 iterative_hash_expr [35]
[53] 2.8 1.66 0.00 2329935 iterative_hash [53]
-----------------------------------------------
So ~95% of all iterative_hash_expr calls are from DOM, which could use
a little help in terms of compilation speed: ~12% for this particular
test case pt.i.
I also did some coverage testing on iterative_hash:
-: 794:hashval_t iterative_hash (k_in, length, initval)
-: 795: const PTR k_in; /* the key */
-: 796: register size_t length; /* the length of the key */
-: 797: register hashval_t initval; /* the previous hash, or an arbitrary value */
13721488: 798:{
13721488: 799: register const unsigned char *k = (const unsigned char *)k_in;
13721488: 800: register hashval_t a,b,c,len;
-: 801:
-: 802: /* Set up the internal state */
13721488: 803: len = length;
13721488: 804: a = b = 0x9e3779b9; /* the golden ratio; an arbitrary value */
13721488: 805: c = initval; /* the previous hash value */
-: 806:
-: 807: /*---------------------------------------- handle most of the key */
-: 808:#ifndef WORDS_BIGENDIAN
-: 809: /* On a little-endian machine, if the data is 4-byte aligned we can hash
-: 810: by word for better speed. This gives nondeterministic results on
-: 811: big-endian machines. */
13721488: 812: if (sizeof (hashval_t) == 4 && (((size_t)k)&3) == 0)
branch 0 taken 0%
13724520: 813: while (len >= 12) /* aligned */
branch 0 taken 1%
branch 1 taken 100%
-: 814: {
3032: 815: a += *(hashval_t *)(k+0);
3032: 816: b += *(hashval_t *)(k+4);
3032: 817: c += *(hashval_t *)(k+8);
3032: 818: mix(a,b,c);
3032: 819: k += 12; len -= 12;
branch 0 taken 100%
-: 820: }
-: 821: else /* unaligned */
-: 822:#endif
#####: 823: while (len >= 12)
branch 0 never executed
branch 1 never executed
-: 824: {
#####: 825: a += (k[0] +((hashval_t)k[1]<<8) +((hashval_t)k[2]<<16) +((hashval_t)k[3]<<24));
#####: 826: b += (k[4] +((hashval_t)k[5]<<8) +((hashval_t)k[6]<<16) +((hashval_t)k[7]<<24));
#####: 827: c += (k[8] +((hashval_t)k[9]<<8) +((hashval_t)k[10]<<16)+((hashval_t)k[11]<<24));
#####: 828: mix(a,b,c);
#####: 829: k += 12; len -= 12;
branch 0 never executed
-: 830: }
-: 831:
-: 832: /*------------------------------------- handle the last 11 bytes */
13721488: 833: c += length;
13721488: 834: switch(len) /* all the case statements fall through */
branch 0 taken 0%
branch 1 taken 0%
branch 2 taken 0%
branch 3 taken 0%
branch 4 taken 0%
branch 5 taken 1%
branch 6 taken 0%
branch 7 taken 1%
branch 8 taken 99%
branch 9 taken 1%
branch 10 taken 1%
branch 11 taken 0%
branch 12 taken 1%
-: 835: {
#####: 836: case 11: c+=((hashval_t)k[10]<<24);
#####: 837: case 10: c+=((hashval_t)k[9]<<16);
#####: 838: case 9 : c+=((hashval_t)k[8]<<8);
-: 839: /* the first byte of c is reserved for the length */
#####: 840: case 8 : b+=((hashval_t)k[7]<<24);
129: 841: case 7 : b+=((hashval_t)k[6]<<16);
129: 842: case 6 : b+=((hashval_t)k[5]<<8);
181: 843: case 5 : b+=k[4];
13719971: 844: case 4 : a+=((hashval_t)k[3]<<24);
13719977: 845: case 3 : a+=((hashval_t)k[2]<<16);
13719979: 846: case 2 : a+=((hashval_t)k[1]<<8);
13719979: 847: case 1 : a+=k[0];
-: 848: /* case 0: nothing left to add */
-: 849: }
13721488: 850: mix(a,b,c);
-: 851: /*-------------------------------------------- report the result */
13721488: 852: return c;
-: 853:}
So it seems that a specialized version for 4 byte objects really would
help here.
(Xeon is 32bit, so the 8 byte case is important for 64bit targets??)
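A sketch of what a 4-byte specialization might look like, under stated assumptions: hashval_t is taken to be 32 bits, the mixing step is the standard Bob Jenkins lookup2 mix that the listing above appears to use, and hash_short_key and hash_word are hypothetical names. hash_short_key reproduces only the "last 11 bytes" path of the listing as a reference; hash_word is the straight-line version for a single 32-bit value with no loop and no switch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint32_t hashval_t;   /* assumption: 32-bit hash values */

/* Bob Jenkins' lookup2 mixing step on 32-bit values.  */
#define mix(a, b, c)                    \
  do {                                  \
    a -= b; a -= c; a ^= (c >> 13);     \
    b -= c; b -= a; b ^= (a << 8);      \
    c -= a; c -= b; c ^= (b >> 13);     \
    a -= b; a -= c; a ^= (c >> 12);     \
    b -= c; b -= a; b ^= (a << 16);     \
    c -= a; c -= b; c ^= (b >> 5);      \
    a -= b; a -= c; a ^= (c >> 3);      \
    b -= c; b -= a; b ^= (a << 10);     \
    c -= a; c -= b; c ^= (b >> 15);     \
  } while (0)

/* Reference: the tail path only, for keys shorter than 12 bytes.  */
static hashval_t
hash_short_key (const unsigned char *k, size_t length, hashval_t initval)
{
  hashval_t a, b, c;

  assert (length < 12);
  a = b = 0x9e3779b9;
  c = initval + (hashval_t) length;
  switch (length)
    {
    case 11: c += (hashval_t) k[10] << 24; /* fall through */
    case 10: c += (hashval_t) k[9] << 16;  /* fall through */
    case 9:  c += (hashval_t) k[8] << 8;   /* fall through */
    case 8:  b += (hashval_t) k[7] << 24;  /* fall through */
    case 7:  b += (hashval_t) k[6] << 16;  /* fall through */
    case 6:  b += (hashval_t) k[5] << 8;   /* fall through */
    case 5:  b += k[4];                    /* fall through */
    case 4:  a += (hashval_t) k[3] << 24;  /* fall through */
    case 3:  a += (hashval_t) k[2] << 16;  /* fall through */
    case 2:  a += (hashval_t) k[1] << 8;   /* fall through */
    case 1:  a += k[0];
    }
  mix (a, b, c);
  return c;
}

/* Specialized: W is the key's four bytes assembled little-endian,
   which is what the case 4..1 chain above computes.  */
static inline hashval_t
hash_word (uint32_t w, hashval_t initval)
{
  hashval_t a = 0x9e3779b9 + w;
  hashval_t b = 0x9e3779b9;
  hashval_t c = initval + 4;

  mix (a, b, c);
  return c;
}
```

For an 8-byte key on a 64-bit target, the same approach would feed the second word into b and add 8 instead of 4 to c.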
--
What |Removed |Added
----------------------------------------------------------------------------
CC| |law at redhat dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=13776
^ permalink raw reply [flat|nested] 93+ messages in thread
* [Bug c++/13776] [tree-ssa] Many C++ compile-time regression in 3.5-tree-ssa 040120
2004-01-20 18:39 [Bug c++/13776] New: Many C++ compile-time regression in 3.5-tree-ssa 040120 in comparison with 3.4.0 040114 kgardas at objectsecurity dot com
` (51 preceding siblings ...)
2004-04-04 12:45 ` steven at gcc dot gnu dot org
@ 2004-04-10 14:58 ` rguenth at tat dot physik dot uni-tuebingen dot de
2004-04-10 15:32 ` Diego Novillo
2004-04-10 15:33 ` dnovillo at redhat dot com
` (37 subsequent siblings)
90 siblings, 1 reply; 93+ messages in thread
From: rguenth at tat dot physik dot uni-tuebingen dot de @ 2004-04-10 14:58 UTC (permalink / raw)
To: gcc-bugs
------- Additional Comments From rguenth at tat dot physik dot uni-tuebingen dot de 2004-04-10 14:04 -------
Subject: Re: [tree-ssa] Many C++ compile-time regression in
3.5-tree-ssa 040120
Again the automatic tester at
http://www.tat.physik.uni-tuebingen.de/~rguenth/gcc/monitor-summary.html
caught some compile time regressions for tree-ssa.
While bootstrap time didn't change (much), tramp3d-v3 compile time got a
hit between Wednesday and Thursday, same for runtime. You'll also note
that mainline runtime was improving a lot yesterday.
There aren't that many changes on tree-ssa right now, so I suspect the
change causing the regression is
2004-04-07 Diego Novillo <dnovillo@redhat.com>
* gimplify.c (gimplify_call_expr): Remove argument POST_P.
Update all callers.
Don't use POST_P when gimplifying the call expression.
(the tree is updated at 3am CEST, incident happened with the update
on Thursday)
Richard.
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=13776
^ permalink raw reply [flat|nested] 93+ messages in thread
* Re: [Bug c++/13776] [tree-ssa] Many C++ compile-time regression in 3.5-tree-ssa 040120
2004-04-10 14:58 ` rguenth at tat dot physik dot uni-tuebingen dot de
@ 2004-04-10 15:32 ` Diego Novillo
0 siblings, 0 replies; 93+ messages in thread
From: Diego Novillo @ 2004-04-10 15:32 UTC (permalink / raw)
To: gcc-bugzilla; +Cc: gcc-bugs
On Sat, 2004-04-10 at 10:04, rguenth at tat dot physik dot uni-tuebingen
dot de wrote:
> There aren't that many changes on tree-ssa right now, so I suspect the
> change causing the regression is
>
> 2004-04-07 Diego Novillo <dnovillo@redhat.com>
>
> * gimplify.c (gimplify_call_expr): Remove argument POST_P.
> Update all callers.
> Don't use POST_P when gimplifying the call expression.
>
Hmm, odd. This is a correctness fix. Side effects in function call
arguments must occur before the actual call takes place.
What may be happening here is that we are getting fewer commoning
opportunities for call-clobbered variables. Before, foo (a++) would
expand to:
foo (a);
a = a + 1;
But now, it expands to:
t = a;
a = a + 1;
foo (t);
If 'a' is call-clobbered, the second form will not allow us to common
out 'a + 1' because of the clobbering of 'a' by the call to foo.
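A small self-contained illustration of why the new expansion is the correct one (function names other than foo and a are made up for this sketch): when 'a' is visible to the callee, the callee must already observe the incremented value, because the side effects of evaluating the arguments are complete before the call takes place.

```c
#include <assert.h>
#include <limits.h>

static int a;               /* global, so the callee can observe it */
static int seen_in_callee;  /* value of 'a' as seen inside foo */

static void
foo (int arg)
{
  (void) arg;
  seen_in_callee = a;
}

/* The source-level form.  */
static int
call_direct (int start)
{
  assert (start < INT_MAX);
  a = start;
  foo (a++);
  return seen_in_callee;
}

/* The expansion with the side effect placed before the call.  */
static int
call_lowered (int start)
{
  int t;

  assert (start < INT_MAX);
  a = start;
  t = a;
  a = a + 1;
  foo (t);
  return seen_in_callee;
}
```

Both functions report start + 1 as the value seen by the callee; the older expansion, with the increment after the call, would have reported start.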
However, it is a bit surprising that this would cause a significant
decline in compile time. Would you have a pre-patched cc1plus binary to
compare dump files?
Thanks. Diego.
^ permalink raw reply [flat|nested] 93+ messages in thread
* [Bug c++/13776] [tree-ssa] Many C++ compile-time regression in 3.5-tree-ssa 040120
2004-01-20 18:39 [Bug c++/13776] New: Many C++ compile-time regression in 3.5-tree-ssa 040120 in comparison with 3.4.0 040114 kgardas at objectsecurity dot com
` (52 preceding siblings ...)
2004-04-10 14:58 ` rguenth at tat dot physik dot uni-tuebingen dot de
@ 2004-04-10 15:33 ` dnovillo at redhat dot com
2004-04-10 15:36 ` rguenth at tat dot physik dot uni-tuebingen dot de
` (36 subsequent siblings)
90 siblings, 0 replies; 93+ messages in thread
From: dnovillo at redhat dot com @ 2004-04-10 15:33 UTC (permalink / raw)
To: gcc-bugs
------- Additional Comments From dnovillo at redhat dot com 2004-04-10 14:58 -------
Subject: Re: [tree-ssa] Many C++ compile-time regression in
3.5-tree-ssa 040120
On Sat, 2004-04-10 at 10:04, rguenth at tat dot physik dot uni-tuebingen
dot de wrote:
> There aren't that many changes on tree-ssa right now, so I suspect the
> change causing the regression is
>
> 2004-04-07 Diego Novillo <dnovillo@redhat.com>
>
> * gimplify.c (gimplify_call_expr): Remove argument POST_P.
> Update all callers.
> Don't use POST_P when gimplifying the call expression.
>
Hmm, odd. This is a correctness fix. Side effects in function call
arguments must occur before the actual call takes place.
What may be happening here is that we are getting fewer commoning
opportunities for call-clobbered variables. Before, foo (a++) would
expand to:
foo (a);
a = a + 1;
But now, it expands to:
t = a;
a = a + 1;
foo (t);
If 'a' is call-clobbered, the second form will not allow us to common
out 'a + 1' because of the clobbering of 'a' by the call to foo.
However, it is a bit surprising that this would cause a significant
decline in compile time. Would you have a pre-patched cc1plus binary to
compare dump files?
Thanks. Diego.
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=13776
^ permalink raw reply [flat|nested] 93+ messages in thread
* [Bug c++/13776] [tree-ssa] Many C++ compile-time regression in 3.5-tree-ssa 040120
2004-01-20 18:39 [Bug c++/13776] New: Many C++ compile-time regression in 3.5-tree-ssa 040120 in comparison with 3.4.0 040114 kgardas at objectsecurity dot com
` (53 preceding siblings ...)
2004-04-10 15:33 ` dnovillo at redhat dot com
@ 2004-04-10 15:36 ` rguenth at tat dot physik dot uni-tuebingen dot de
2004-04-10 16:10 ` rguenth at tat dot physik dot uni-tuebingen dot de
` (35 subsequent siblings)
90 siblings, 0 replies; 93+ messages in thread
From: rguenth at tat dot physik dot uni-tuebingen dot de @ 2004-04-10 15:36 UTC (permalink / raw)
To: gcc-bugs
------- Additional Comments From rguenth at tat dot physik dot uni-tuebingen dot de 2004-04-10 15:20 -------
Subject: Re: [tree-ssa] Many C++ compile-time regression in
3.5-tree-ssa 040120
dnovillo at redhat dot com wrote:
> ------- Additional Comments From dnovillo at redhat dot com 2004-04-10 14:58 -------
> Subject: Re: [tree-ssa] Many C++ compile-time regression in
> 3.5-tree-ssa 040120
>
> On Sat, 2004-04-10 at 10:04, rguenth at tat dot physik dot uni-tuebingen
> dot de wrote:
>
>
>>There aren't that many changes on tree-ssa right now, so I suspect the
>>change causing the regression is
>>
>>2004-04-07 Diego Novillo <dnovillo@redhat.com>
>>
>> * gimplify.c (gimplify_call_expr): Remove argument POST_P.
>> Update all callers.
>> Don't use POST_P when gimplifying the call expression.
>>
>
> Hmm, odd. This is a correctness fix. Side effects in function call
> arguments must occur before the actual call takes place.
>
> What may be happening here is that we are getting fewer commoning
> opportunities for call-clobbered variables. Before, foo (a++) would
> expand to:
>
> foo (a);
> a = a + 1;
>
> But now, it expands to:
>
> t = a;
> a = a + 1;
> foo (t);
>
> If 'a' is call-clobbered, the second form will not allow us to common
> out 'a + 1' because of the clobbering of 'a' by the call to foo.
>
> However, it is a bit surprising that this would cause a significant
> decline in compile time. Would you have a pre-patched cc1plus binary to
> compare dump files?
Yes, I have cc1plus binaries from all days lying around (though with
checking disabled). Just tell me what to do.
Richard.
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=13776
^ permalink raw reply [flat|nested] 93+ messages in thread
* [Bug c++/13776] [tree-ssa] Many C++ compile-time regression in 3.5-tree-ssa 040120
2004-01-20 18:39 [Bug c++/13776] New: Many C++ compile-time regression in 3.5-tree-ssa 040120 in comparison with 3.4.0 040114 kgardas at objectsecurity dot com
` (54 preceding siblings ...)
2004-04-10 15:36 ` rguenth at tat dot physik dot uni-tuebingen dot de
@ 2004-04-10 16:10 ` rguenth at tat dot physik dot uni-tuebingen dot de
2004-04-10 16:20 ` rguenth at tat dot physik dot uni-tuebingen dot de
` (34 subsequent siblings)
90 siblings, 0 replies; 93+ messages in thread
From: rguenth at tat dot physik dot uni-tuebingen dot de @ 2004-04-10 16:10 UTC (permalink / raw)
To: gcc-bugs
------- Additional Comments From rguenth at tat dot physik dot uni-tuebingen dot de 2004-04-10 15:36 -------
Subject: Re: [tree-ssa] Many C++ compile-time regression in
3.5-tree-ssa 040120
> However, it is a bit surprising that this would cause a significant
> decline in compile time. Would you have a pre-patched cc1plus binary to
> compare dump files?
OK, I tried simply diffing the tree-optimized dumps, but noise hides the real
differences (temporaries are numbered differently).
Still, the dump had 1003736 lines before the compile-time increase and
has 1048682 lines after it. So there is a difference.
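One way to suppress that numbering noise before diffing, sketched in C. The function name is hypothetical, and it assumes that temporaries are printed like T.123 and anonymous declarations like <D1234>: it strips the digits after the dot and drops the <D...> tokens, so two dumps differ only where the code differs.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Copy IN to OUT (at most OUTSZ - 1 characters plus a terminator),
   dropping "<D123>" tokens and any digits that directly follow a '.'
   preceded by a letter or underscore.  Returns the output length.  */
static size_t
normalize_dump_line (const char *in, char *out, size_t outsz)
{
  size_t o = 0;
  const char *p = in;

  assert (outsz > 0);
  while (*p != '\0' && o + 1 < outsz)
    {
      /* Skip a complete "<D" digits ">" token.  */
      if (p[0] == '<' && p[1] == 'D' && isdigit ((unsigned char) p[2]))
        {
          const char *q = p + 2;
          while (isdigit ((unsigned char) *q))
            q++;
          if (*q == '>')
            {
              p = q + 1;
              continue;
            }
        }

      out[o++] = *p;

      /* After "name." drop the numeric suffix but keep the dot.  */
      if (*p == '.' && p != in
          && (isalpha ((unsigned char) p[-1]) || p[-1] == '_'))
        {
          p++;
          while (isdigit ((unsigned char) *p))
            p++;
          continue;
        }
      p++;
    }
  out[o] = '\0';
  return o;
}
```

Requiring a letter or underscore before the dot keeps floating-point constants such as 1.5 intact. Label and basic-block numbers are not handled here and would still show up as noise.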
Richard.
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=13776
^ permalink raw reply [flat|nested] 93+ messages in thread
* [Bug c++/13776] [tree-ssa] Many C++ compile-time regression in 3.5-tree-ssa 040120
2004-01-20 18:39 [Bug c++/13776] New: Many C++ compile-time regression in 3.5-tree-ssa 040120 in comparison with 3.4.0 040114 kgardas at objectsecurity dot com
` (55 preceding siblings ...)
2004-04-10 16:10 ` rguenth at tat dot physik dot uni-tuebingen dot de
@ 2004-04-10 16:20 ` rguenth at tat dot physik dot uni-tuebingen dot de
2004-05-17 15:59 ` [Bug tree-optimization/13776] [3.5 Regression] " pinskia at gcc dot gnu dot org
` (33 subsequent siblings)
90 siblings, 0 replies; 93+ messages in thread
From: rguenth at tat dot physik dot uni-tuebingen dot de @ 2004-04-10 16:20 UTC (permalink / raw)
To: gcc-bugs
------- Additional Comments From rguenth at tat dot physik dot uni-tuebingen dot de 2004-04-10 15:48 -------
Subject: Re: [tree-ssa] Many C++ compile-time regression in
3.5-tree-ssa 040120
> However, it is a bit surprising that this would cause a significant
> decline in compile time. Would you have a pre-patched cc1plus binary to
> compare dump files?
Cutting off the numbers from the vars and killing <Dxxx> reveals:
@@ -256,6 +256,7 @@ virtual Smarts::Runnable::~Runnable() (t
{
bool T.;
int T.;
+ int (*__vtbl_ptr_type) () * T.;
<bb 0>:
this->_vptr.Runnable = &_ZTVN6Smarts8RunnableE[2];
(and similar in all destructors)
int fillLocStorage(int, Loc<Dim>&, constT1&) [with int Dim = 3, T1 =
Loc<3>] (currIndex, loc, a)
{
+ int currIndex.;
int ;
int d;
int T.;
@@ -1581,6 +1503,7 @@ int fillLocStorage(int, Loc<Dim>&, const
struct Domain<1,DomainTraits<Loc<1> > > * T.;
struct Loc<1> * T.;
struct Loc<1> & T.;
+ int currIndex.;
struct Domain<3,DomainTraits<Loc<3> > > * loc.;
int retval.;
int retval.;
@@ -1595,13 +1518,17 @@ int fillLocStorage(int, Loc<Dim>&, const
i = 0;
<L0>:;
+ currIndex. = currIndex + 1;
*(int &)(struct Domain<1,DomainTraits<Loc<1> > > *)(struct Loc<1>
*)(struct Loc<1> &)((struct Loc<1> *)((long unsigned int)currIndex * 4)
+ (struct Loc<1> *)(struct UninitializedVector<Loc<1>,3,int> *)(struct
Domain<3,DomainTraits<Loc<3> > > *)loc) = ((struct
DomainBase<DomainTraits<Loc<1> > > *)(struct
Domain<1,DomainTraits<Loc<1> > > *)(struct Loc<1> &)((struct Loc<1>
*)((long unsigned int)i * 4) + (struct Loc<1> *)(struct
UninitializedVector<Loc<1>,3,int> *)(struct Domain<3,DomainTraits<Loc<3>
> > *)a))->domain_m;
- currIndex = currIndex + 1;
i = i + 1;
- if (i <= 2) goto <L0>; else goto <L10>;
+ if (i <= 2) goto <L13>; else goto <L10>;
+
+<L13>:;
+ currIndex = currIndex.;
+ goto <bb 1> (<L0>);
<L10>:;
- return currIndex;
+ return currIndex.;
}
It looks like DOM is now missing some optimization.
Beyond that, the diff shows lots of function re-ordering and noise (label
number changes, bb number changes). The dump files are huge (both
around 50MB uncompressed); if you want to download them, I can put them
in an accessible location.
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=13776
^ permalink raw reply [flat|nested] 93+ messages in thread
* [Bug tree-optimization/13776] [3.5 Regression] [tree-ssa] Many C++ compile-time regression in 3.5-tree-ssa 040120
2004-01-20 18:39 [Bug c++/13776] New: Many C++ compile-time regression in 3.5-tree-ssa 040120 in comparison with 3.4.0 040114 kgardas at objectsecurity dot com
` (56 preceding siblings ...)
2004-04-10 16:20 ` rguenth at tat dot physik dot uni-tuebingen dot de
@ 2004-05-17 15:59 ` pinskia at gcc dot gnu dot org
2004-06-07 18:13 ` pinskia at gcc dot gnu dot org
` (32 subsequent siblings)
90 siblings, 0 replies; 93+ messages in thread
From: pinskia at gcc dot gnu dot org @ 2004-05-17 15:59 UTC (permalink / raw)
To: gcc-bugs
--
Bug 13776 depends on bug 14440, which changed state.
Bug 14440 Summary: [3.5 regression] no sib calling with _Bool types
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=14440
What |Old Value |New Value
----------------------------------------------------------------------------
Status|ASSIGNED |RESOLVED
Resolution| |FIXED
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=13776
^ permalink raw reply [flat|nested] 93+ messages in thread
* [Bug tree-optimization/13776] [3.5 Regression] [tree-ssa] Many C++ compile-time regression in 3.5-tree-ssa 040120
2004-01-20 18:39 [Bug c++/13776] New: Many C++ compile-time regression in 3.5-tree-ssa 040120 in comparison with 3.4.0 040114 kgardas at objectsecurity dot com
` (57 preceding siblings ...)
2004-05-17 15:59 ` [Bug tree-optimization/13776] [3.5 Regression] " pinskia at gcc dot gnu dot org
@ 2004-06-07 18:13 ` pinskia at gcc dot gnu dot org
2004-06-09 23:55 ` pinskia at gcc dot gnu dot org
` (31 subsequent siblings)
90 siblings, 0 replies; 93+ messages in thread
From: pinskia at gcc dot gnu dot org @ 2004-06-07 18:13 UTC (permalink / raw)
To: gcc-bugs
--
Bug 13776 depends on bug 14440, which changed state.
Bug 14440 Summary: [3.5 regression] no sib calling with _Bool types
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=14440
What |Old Value |New Value
----------------------------------------------------------------------------
Status|RESOLVED |REOPENED
Resolution|FIXED |
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=13776
^ permalink raw reply [flat|nested] 93+ messages in thread
* [Bug tree-optimization/13776] [3.5 Regression] [tree-ssa] Many C++ compile-time regression in 3.5-tree-ssa 040120
2004-01-20 18:39 [Bug c++/13776] New: Many C++ compile-time regression in 3.5-tree-ssa 040120 in comparison with 3.4.0 040114 kgardas at objectsecurity dot com
` (58 preceding siblings ...)
2004-06-07 18:13 ` pinskia at gcc dot gnu dot org
@ 2004-06-09 23:55 ` pinskia at gcc dot gnu dot org
2004-06-29 21:12 ` rth at gcc dot gnu dot org
` (30 subsequent siblings)
90 siblings, 0 replies; 93+ messages in thread
From: pinskia at gcc dot gnu dot org @ 2004-06-09 23:55 UTC (permalink / raw)
To: gcc-bugs
--
Bug 13776 depends on bug 14440, which changed state.
Bug 14440 Summary: [3.5 regression] no sib calling with _Bool types
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=14440
What |Old Value |New Value
----------------------------------------------------------------------------
Status|REOPENED |RESOLVED
Resolution| |FIXED
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=13776
^ permalink raw reply [flat|nested] 93+ messages in thread
* [Bug tree-optimization/13776] [3.5 Regression] [tree-ssa] Many C++ compile-time regression in 3.5-tree-ssa 040120
2004-01-20 18:39 [Bug c++/13776] New: Many C++ compile-time regression in 3.5-tree-ssa 040120 in comparison with 3.4.0 040114 kgardas at objectsecurity dot com
` (59 preceding siblings ...)
2004-06-09 23:55 ` pinskia at gcc dot gnu dot org
@ 2004-06-29 21:12 ` rth at gcc dot gnu dot org
2004-06-30 3:15 ` giovannibajo at libero dot it
` (29 subsequent siblings)
90 siblings, 0 replies; 93+ messages in thread
From: rth at gcc dot gnu dot org @ 2004-06-29 21:12 UTC (permalink / raw)
To: gcc-bugs
--
Bug 13776 depends on bug 13953, which changed state.
Bug 13953 Summary: [tree-ssa] SRA does not work for structs that contain a struct as a member
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=13953
What |Old Value |New Value
----------------------------------------------------------------------------
Status|NEW |RESOLVED
Resolution| |FIXED
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=13776
^ permalink raw reply [flat|nested] 93+ messages in thread
* [Bug tree-optimization/13776] [3.5 Regression] [tree-ssa] Many C++ compile-time regression in 3.5-tree-ssa 040120
2004-01-20 18:39 [Bug c++/13776] New: Many C++ compile-time regression in 3.5-tree-ssa 040120 in comparison with 3.4.0 040114 kgardas at objectsecurity dot com
` (60 preceding siblings ...)
2004-06-29 21:12 ` rth at gcc dot gnu dot org
@ 2004-06-30 3:15 ` giovannibajo at libero dot it
2004-07-08 18:16 ` kgardas at objectsecurity dot com
` (28 subsequent siblings)
90 siblings, 0 replies; 93+ messages in thread
From: giovannibajo at libero dot it @ 2004-06-30 3:15 UTC (permalink / raw)
To: gcc-bugs
------- Additional Comments From giovannibajo at libero dot it 2004-06-30 03:06 -------
Karel,
all the main optimization issues that we spotted looking at the MICO
regressions are supposed to be fixed now. It would be very cool if you could
prepare an updated performance comparison table between 3.4.0 and today's
mainline, so that we can check how mainline is doing now.
Thanks
--
What |Removed |Added
----------------------------------------------------------------------------
CC| |giovannibajo at libero dot
| |it
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=13776
^ permalink raw reply [flat|nested] 93+ messages in thread
* [Bug tree-optimization/13776] [3.5 Regression] [tree-ssa] Many C++ compile-time regression in 3.5-tree-ssa 040120
2004-01-20 18:39 [Bug c++/13776] New: Many C++ compile-time regression in 3.5-tree-ssa 040120 in comparison with 3.4.0 040114 kgardas at objectsecurity dot com
` (61 preceding siblings ...)
2004-06-30 3:15 ` giovannibajo at libero dot it
@ 2004-07-08 18:16 ` kgardas at objectsecurity dot com
2004-08-30 1:40 ` giovannibajo at libero dot it
` (27 subsequent siblings)
90 siblings, 0 replies; 93+ messages in thread
From: kgardas at objectsecurity dot com @ 2004-07-08 18:16 UTC (permalink / raw)
To: gcc-bugs
------- Additional Comments From kgardas at objectsecurity dot com 2004-07-08 18:16 -------
Subject: Re: [3.5 Regression] [tree-ssa] Many
C++ compile-time regression in 3.5-tree-ssa 040120
Giovanni,
I have done a comparison of 3.4.0, 3.4.1RC1 and trunk from 2004-06-30 and
posted all results here: http://gcc.gnu.org/ml/gcc/2004-07/msg00391.html
Cheers,
Karel
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=13776
^ permalink raw reply [flat|nested] 93+ messages in thread
* [Bug tree-optimization/13776] [3.5 Regression] [tree-ssa] Many C++ compile-time regression in 3.5-tree-ssa 040120
2004-01-20 18:39 [Bug c++/13776] New: Many C++ compile-time regression in 3.5-tree-ssa 040120 in comparison with 3.4.0 040114 kgardas at objectsecurity dot com
` (62 preceding siblings ...)
2004-07-08 18:16 ` kgardas at objectsecurity dot com
@ 2004-08-30 1:40 ` giovannibajo at libero dot it
2004-08-31 9:15 ` kgardas at objectsecurity dot com
` (26 subsequent siblings)
90 siblings, 0 replies; 93+ messages in thread
From: giovannibajo at libero dot it @ 2004-08-30 1:40 UTC (permalink / raw)
To: gcc-bugs
------- Additional Comments From giovannibajo at libero dot it 2004-08-30 01:40 -------
Karel, would you mind posting an updated table using a recent mainline? Thanks.
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=13776
^ permalink raw reply [flat|nested] 93+ messages in thread
* [Bug tree-optimization/13776] [3.5 Regression] [tree-ssa] Many C++ compile-time regression in 3.5-tree-ssa 040120
2004-01-20 18:39 [Bug c++/13776] New: Many C++ compile-time regression in 3.5-tree-ssa 040120 in comparison with 3.4.0 040114 kgardas at objectsecurity dot com
` (63 preceding siblings ...)
2004-08-30 1:40 ` giovannibajo at libero dot it
@ 2004-08-31 9:15 ` kgardas at objectsecurity dot com
2004-10-23 21:26 ` [Bug tree-optimization/13776] [4.0 Regression] [tree-ssa] Many C++ compile-time regression in 4.0-tree-ssa 040120 pinskia at gcc dot gnu dot org
` (25 subsequent siblings)
90 siblings, 0 replies; 93+ messages in thread
From: kgardas at objectsecurity dot com @ 2004-08-31 9:15 UTC (permalink / raw)
To: gcc-bugs
------- Additional Comments From kgardas at objectsecurity dot com 2004-08-31 09:15 -------
Subject: Re: [3.5 Regression] [tree-ssa] Many
C++ compile-time regression in 3.5-tree-ssa 040120
Hi,
updated table for gcc3.4.1 and main trunk 2004-08-30 is here:
http://gcc.gnu.org/ml/gcc/2004-08/msg01594.html
Cheers,
Karel
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=13776
^ permalink raw reply [flat|nested] 93+ messages in thread
* [Bug tree-optimization/13776] [4.0 Regression] [tree-ssa] Many C++ compile-time regression in 4.0-tree-ssa 040120
2004-01-20 18:39 [Bug c++/13776] New: Many C++ compile-time regression in 3.5-tree-ssa 040120 in comparison with 3.4.0 040114 kgardas at objectsecurity dot com
` (64 preceding siblings ...)
2004-08-31 9:15 ` kgardas at objectsecurity dot com
@ 2004-10-23 21:26 ` pinskia at gcc dot gnu dot org
2004-10-25 12:03 ` kgardas at objectsecurity dot com
` (24 subsequent siblings)
90 siblings, 0 replies; 93+ messages in thread
From: pinskia at gcc dot gnu dot org @ 2004-10-23 21:26 UTC (permalink / raw)
To: gcc-bugs
------- Additional Comments From pinskia at gcc dot gnu dot org 2004-10-23 21:26 -------
Can you post the new results again? A huge amount has changed since August 31, and there have
been some compile-time improvements in that time.
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=13776
^ permalink raw reply [flat|nested] 93+ messages in thread
* [Bug tree-optimization/13776] [4.0 Regression] [tree-ssa] Many C++ compile-time regression in 4.0-tree-ssa 040120
2004-01-20 18:39 [Bug c++/13776] New: Many C++ compile-time regression in 3.5-tree-ssa 040120 in comparison with 3.4.0 040114 kgardas at objectsecurity dot com
` (65 preceding siblings ...)
2004-10-23 21:26 ` [Bug tree-optimization/13776] [4.0 Regression] [tree-ssa] Many C++ compile-time regression in 4.0-tree-ssa 040120 pinskia at gcc dot gnu dot org
@ 2004-10-25 12:03 ` kgardas at objectsecurity dot com
2004-10-25 13:02 ` rguenth at tat dot physik dot uni-tuebingen dot de
` (23 subsequent siblings)
90 siblings, 0 replies; 93+ messages in thread
From: kgardas at objectsecurity dot com @ 2004-10-25 12:03 UTC (permalink / raw)
To: gcc-bugs
------- Additional Comments From kgardas at objectsecurity dot com 2004-10-25 12:03 -------
Subject: Re: [4.0 Regression] [tree-ssa] Many
C++ compile-time regression in 4.0-tree-ssa 040120
Sure! Here we go: http://gcc.gnu.org/ml/gcc/2004-10/msg00952.html
and the results are really promising, although some interesting regressions
are still present.
Cheers,
Karel
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=13776
^ permalink raw reply [flat|nested] 93+ messages in thread
* [Bug tree-optimization/13776] [4.0 Regression] [tree-ssa] Many C++ compile-time regression in 4.0-tree-ssa 040120
2004-01-20 18:39 [Bug c++/13776] New: Many C++ compile-time regression in 3.5-tree-ssa 040120 in comparison with 3.4.0 040114 kgardas at objectsecurity dot com
` (66 preceding siblings ...)
2004-10-25 12:03 ` kgardas at objectsecurity dot com
@ 2004-10-25 13:02 ` rguenth at tat dot physik dot uni-tuebingen dot de
2004-10-25 13:20 ` kgardas at objectsecurity dot com
` (22 subsequent siblings)
90 siblings, 0 replies; 93+ messages in thread
From: rguenth at tat dot physik dot uni-tuebingen dot de @ 2004-10-25 13:02 UTC (permalink / raw)
To: gcc-bugs
------- Additional Comments From rguenth at tat dot physik dot uni-tuebingen dot de 2004-10-25 13:02 -------
Subject: Re: [4.0 Regression] [tree-ssa] Many
C++ compile-time regression in 4.0-tree-ssa 040120
And
http://gcc.gnu.org/ml/gcc/2004-10/msg00955.html
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=13776
^ permalink raw reply [flat|nested] 93+ messages in thread
* [Bug tree-optimization/13776] [4.0 Regression] [tree-ssa] Many C++ compile-time regression in 4.0-tree-ssa 040120
2004-01-20 18:39 [Bug c++/13776] New: Many C++ compile-time regression in 3.5-tree-ssa 040120 in comparison with 3.4.0 040114 kgardas at objectsecurity dot com
` (67 preceding siblings ...)
2004-10-25 13:02 ` rguenth at tat dot physik dot uni-tuebingen dot de
@ 2004-10-25 13:20 ` kgardas at objectsecurity dot com
2004-11-16 1:52 ` pinskia at gcc dot gnu dot org
` (21 subsequent siblings)
90 siblings, 0 replies; 93+ messages in thread
From: kgardas at objectsecurity dot com @ 2004-10-25 13:20 UTC (permalink / raw)
To: gcc-bugs
------- Additional Comments From kgardas at objectsecurity dot com 2004-10-25 13:20 -------
Subject: Re: [4.0 Regression] [tree-ssa] Many
C++ compile-time regression in 4.0-tree-ssa 040120
Updated table with GCC 3.4.2 and 4.0.0-041024 results is available here:
http://gcc.gnu.org/ml/gcc/2004-10/msg00952.html -- still some regressions
mainly on -O1 and -O2.
Cheers,
Karel
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=13776
^ permalink raw reply [flat|nested] 93+ messages in thread
* [Bug tree-optimization/13776] [4.0 Regression] [tree-ssa] Many C++ compile-time regression in 4.0-tree-ssa 040120
2004-01-20 18:39 [Bug c++/13776] New: Many C++ compile-time regression in 3.5-tree-ssa 040120 in comparison with 3.4.0 040114 kgardas at objectsecurity dot com
` (68 preceding siblings ...)
2004-10-25 13:20 ` kgardas at objectsecurity dot com
@ 2004-11-16 1:52 ` pinskia at gcc dot gnu dot org
2004-11-18 21:12 ` pinskia at gcc dot gnu dot org
` (20 subsequent siblings)
90 siblings, 0 replies; 93+ messages in thread
From: pinskia at gcc dot gnu dot org @ 2004-11-16 1:52 UTC (permalink / raw)
To: gcc-bugs
------- Additional Comments From pinskia at gcc dot gnu dot org 2004-11-16 01:51 -------
ir.cc 47.17 69.26 -31.89 72.42 129.49 -44.07 100.1 165.27 -39.43
I just sped up ir.cc a little with my patch to cp-gimplify.c (which was committed)
Reference: http://gcc.gnu.org/ml/gcc-patches/2004-11/msg01247.html
Also, my patch to remove a number of calls to is_gimple_reg speeds up optimizations:
http://gcc.gnu.org/ml/gcc-patches/2004-11/msg01284.html
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=13776
^ permalink raw reply [flat|nested] 93+ messages in thread
* [Bug tree-optimization/13776] [4.0 Regression] [tree-ssa] Many C++ compile-time regression in 4.0-tree-ssa 040120
2004-01-20 18:39 [Bug c++/13776] New: Many C++ compile-time regression in 3.5-tree-ssa 040120 in comparison with 3.4.0 040114 kgardas at objectsecurity dot com
` (69 preceding siblings ...)
2004-11-16 1:52 ` pinskia at gcc dot gnu dot org
@ 2004-11-18 21:12 ` pinskia at gcc dot gnu dot org
2004-11-19 11:15 ` kgardas at objectsecurity dot com
` (19 subsequent siblings)
90 siblings, 0 replies; 93+ messages in thread
From: pinskia at gcc dot gnu dot org @ 2004-11-18 21:12 UTC (permalink / raw)
To: gcc-bugs
------- Additional Comments From pinskia at gcc dot gnu dot org 2004-11-18 21:12 -------
Hmm, with the mainline on PPC-darwin for ir.ii at -O0 we are faster than both 3.3 and 3.1.
3.1:
51.260u 2.110s 0:56.27 94.8% 0+0k 0+7io 0pf+0w
3.3:
46.000u 3.600s 0:50.91 97.4% 0+0k 0+7io 0pf+0w
mainline:
39.730u 5.270s 0:48.27 93.2% 0+0k 0+8io 0pf+0w
Even at -O1 we are faster than 3.3:
mainline:
70.860u 5.010s 1:18.76 96.3% 0+0k 0+11io 0pf+0w
3.3:
72.650u 13.250s 1:29.99 95.4% 0+0k 0+7io 0pf+0w
For -O2 we are only 1 second slower than 3.3:
mainline:
99.720u 5.510s 1:54.78 91.6% 0+0k 0+13io 0pf+0w
3.3:
98.610u 38.800s 2:25.59 94.3% 0+0k 0+15io 0pf+0w
Could you check again on your platform?
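A minimal sketch of the arithmetic behind the comparison above, using only the
user-time figures from the pasted `time` output. The table layout and the name
`user_time_delta` are illustrative assumptions, not part of any GCC tool.

```python
# User CPU seconds copied from the `time` output above (PPC-darwin, ir.ii).
USER_SECONDS = {
    "-O0": {"3.1": 51.260, "3.3": 46.000, "mainline": 39.730},
    "-O1": {"3.3": 72.650, "mainline": 70.860},
    "-O2": {"3.3": 98.610, "mainline": 99.720},
}


def user_time_delta(level, baseline="3.3", candidate="mainline"):
    """Return candidate minus baseline user time in seconds.

    A negative value means the candidate compiler used less user time.
    """
    times = USER_SECONDS[level]
    return round(times[candidate] - times[baseline], 3)


if __name__ == "__main__":
    for level in USER_SECONDS:
        print(level, user_time_delta(level))
```

This looks at user time only; the system and elapsed columns in the same
output are not considered here.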
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=13776
^ permalink raw reply [flat|nested] 93+ messages in thread
* [Bug tree-optimization/13776] [4.0 Regression] [tree-ssa] Many C++ compile-time regression in 4.0-tree-ssa 040120
2004-01-20 18:39 [Bug c++/13776] New: Many C++ compile-time regression in 3.5-tree-ssa 040120 in comparison with 3.4.0 040114 kgardas at objectsecurity dot com
` (70 preceding siblings ...)
2004-11-18 21:12 ` pinskia at gcc dot gnu dot org
@ 2004-11-19 11:15 ` kgardas at objectsecurity dot com
2004-11-19 18:22 ` pinskia at gcc dot gnu dot org
` (18 subsequent siblings)
90 siblings, 0 replies; 93+ messages in thread
From: kgardas at objectsecurity dot com @ 2004-11-19 11:15 UTC (permalink / raw)
To: gcc-bugs
------- Additional Comments From kgardas at objectsecurity dot com 2004-11-19 11:14 -------
Subject: Re: [4.0 Regression] [tree-ssa] Many
C++ compile-time regression in 4.0-tree-ssa 040120
I've tested 3.4.2, 4.0.0 (20041026) and 4.0.0 (20041118) with the following
results:
3.4.2:
c++ -I../include -time -O0 -Wall -DPIC -fPIC -c ir.cc -o ir.pic.o
# cc1plus 46.98 0.53
# as 4.62 0.22
peak memory consumed: 99MB
4.0.0 (20041026):
c++ -I../include -time -O0 -Wall -DPIC -fPIC -c ir.cc -o ir.pic.o
# cc1plus 67.13 2.05
# as 5.98 0.30
peak memory consumed: 243MB
4.0.0 (20041118):
c++ -I../include -time -O0 -Wall -DPIC -fPIC -c ir.cc -o ir.pic.o
# cc1plus 66.47 1.97
# as 5.84 0.27
peak memory consumed: 243MB
so there are still both compile-time and memory usage regressions present
on mainline.
The reason why you see a speed-up in comparison with 3.1/3.3 is that
3.4.2 is a really faster compiler (at least from the MICO sources point of
view).
Cheers,
Karel
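To put the figures above side by side, here is a small sketch that computes
each 4.0.0 snapshot relative to 3.4.2. It assumes the first number after
"# cc1plus" in the -time output is user seconds; the names `RUNS` and
`ratio_vs_baseline` are made up for this illustration.

```python
# cc1plus time (assumed to be user seconds) and peak memory in MB,
# copied from the results above for ir.cc at -O0.
RUNS = {
    "3.4.2": {"cc1plus_user": 46.98, "peak_mb": 99},
    "4.0.0 (20041026)": {"cc1plus_user": 67.13, "peak_mb": 243},
    "4.0.0 (20041118)": {"cc1plus_user": 66.47, "peak_mb": 243},
}


def ratio_vs_baseline(metric, version, baseline="3.4.2"):
    """Return how many times larger `metric` is for `version` than for baseline."""
    return RUNS[version][metric] / RUNS[baseline][metric]


if __name__ == "__main__":
    for version in RUNS:
        if version == "3.4.2":
            continue
        time_ratio = ratio_vs_baseline("cc1plus_user", version)
        mem_ratio = ratio_vs_baseline("peak_mb", version)
        print(f"{version}: time x{time_ratio:.2f}, memory x{mem_ratio:.2f}")
```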
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=13776
^ permalink raw reply [flat|nested] 93+ messages in thread
* [Bug tree-optimization/13776] [4.0 Regression] [tree-ssa] Many C++ compile-time regression in 4.0-tree-ssa 040120
2004-01-20 18:39 [Bug c++/13776] New: Many C++ compile-time regression in 3.5-tree-ssa 040120 in comparison with 3.4.0 040114 kgardas at objectsecurity dot com
` (71 preceding siblings ...)
2004-11-19 11:15 ` kgardas at objectsecurity dot com
@ 2004-11-19 18:22 ` pinskia at gcc dot gnu dot org
2004-11-29 19:56 ` kgardas at objectsecurity dot com
` (17 subsequent siblings)
90 siblings, 0 replies; 93+ messages in thread
From: pinskia at gcc dot gnu dot org @ 2004-11-19 18:22 UTC (permalink / raw)
To: gcc-bugs
--
Bug 13776 depends on bug 18507, which changed state.
Bug 18507 Summary: block_defs_stack varrray should not be GC'ed
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=18507
What |Old Value |New Value
----------------------------------------------------------------------------
Status|ASSIGNED |RESOLVED
Resolution| |FIXED
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=13776
^ permalink raw reply [flat|nested] 93+ messages in thread
* [Bug tree-optimization/13776] [4.0 Regression] [tree-ssa] Many C++ compile-time regression in 4.0-tree-ssa 040120
2004-01-20 18:39 [Bug c++/13776] New: Many C++ compile-time regression in 3.5-tree-ssa 040120 in comparison with 3.4.0 040114 kgardas at objectsecurity dot com
` (72 preceding siblings ...)
2004-11-19 18:22 ` pinskia at gcc dot gnu dot org
@ 2004-11-29 19:56 ` kgardas at objectsecurity dot com
2004-11-29 20:05 ` law at redhat dot com
` (16 subsequent siblings)
90 siblings, 0 replies; 93+ messages in thread
From: kgardas at objectsecurity dot com @ 2004-11-29 19:56 UTC (permalink / raw)
To: gcc-bugs
------- Additional Comments From kgardas at objectsecurity dot com 2004-11-29 19:56 -------
Subject: Re: [4.0 Regression] [tree-ssa] Many
C++ compile-time regression in 4.0-tree-ssa 040120
I've updated comparison table for 4.0.0 20041126 compiler version. You can
find it here: http://gcc.gnu.org/ml/gcc/2004-11/msg01157.html
Cheers,
Karel
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=13776
^ permalink raw reply [flat|nested] 93+ messages in thread
* [Bug tree-optimization/13776] [4.0 Regression] [tree-ssa] Many C++ compile-time regression in 4.0-tree-ssa 040120
2004-01-20 18:39 [Bug c++/13776] New: Many C++ compile-time regression in 3.5-tree-ssa 040120 in comparison with 3.4.0 040114 kgardas at objectsecurity dot com
` (73 preceding siblings ...)
2004-11-29 19:56 ` kgardas at objectsecurity dot com
@ 2004-11-29 20:05 ` law at redhat dot com
2004-11-29 21:04 ` kgardas at objectsecurity dot com
` (15 subsequent siblings)
90 siblings, 0 replies; 93+ messages in thread
From: law at redhat dot com @ 2004-11-29 20:05 UTC (permalink / raw)
To: gcc-bugs
------- Additional Comments From law at redhat dot com 2004-11-29 20:05 -------
Subject: Re: [4.0 Regression] [tree-ssa] Many
C++ compile-time regression in 4.0-tree-ssa 040120
On Mon, 2004-11-29 at 19:56 +0000, kgardas at objectsecurity dot com
wrote:
> ------- Additional Comments From kgardas at objectsecurity dot com 2004-11-29 19:56 -------
> Subject: Re: [4.0 Regression] [tree-ssa] Many
> C++ compile-time regression in 4.0-tree-ssa 040120
>
>
> I've updated comparison table for 4.0.0 20041126 compiler version. You can
> find it here: http://gcc.gnu.org/ml/gcc/2004-11/msg01157.html
BTW, if I'm reading that table correctly, overall the compile time
performance of mainline is actually on-par or better than 3.4 at
-O0, -O1 and -O2 for this test. That's not to diminish the need to
work on ir.cc, but things appear to be heading the right direction.
jeff
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=13776
^ permalink raw reply [flat|nested] 93+ messages in thread
* [Bug tree-optimization/13776] [4.0 Regression] [tree-ssa] Many C++ compile-time regression in 4.0-tree-ssa 040120
2004-01-20 18:39 [Bug c++/13776] New: Many C++ compile-time regression in 3.5-tree-ssa 040120 in comparison with 3.4.0 040114 kgardas at objectsecurity dot com
` (74 preceding siblings ...)
2004-11-29 20:05 ` law at redhat dot com
@ 2004-11-29 21:04 ` kgardas at objectsecurity dot com
2004-12-13 3:03 ` pinskia at gcc dot gnu dot org
` (14 subsequent siblings)
90 siblings, 0 replies; 93+ messages in thread
From: kgardas at objectsecurity dot com @ 2004-11-29 21:04 UTC (permalink / raw)
To: gcc-bugs
------- Additional Comments From kgardas at objectsecurity dot com 2004-11-29 21:04 -------
Subject: Re: [4.0 Regression] [tree-ssa] Many
C++ compile-time regression in 4.0-tree-ssa 040120
On Mon, 29 Nov 2004, law at redhat dot com wrote:
> > I've updated comparison table for 4.0.0 20041126 compiler version. You can
> > find it here: http://gcc.gnu.org/ml/gcc/2004-11/msg01157.html
> BTW, if I'm reading that table correctly, overall the compile time
> performance of mainline is actually on-par or better than 3.4 at
> -O0, -O1 and -O2 for this test.
Yes, you are 100% right.
Karel
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=13776
^ permalink raw reply [flat|nested] 93+ messages in thread
* [Bug tree-optimization/13776] [4.0 Regression] [tree-ssa] Many C++ compile-time regression in 4.0-tree-ssa 040120
2004-01-20 18:39 [Bug c++/13776] New: Many C++ compile-time regression in 3.5-tree-ssa 040120 in comparison with 3.4.0 040114 kgardas at objectsecurity dot com
` (75 preceding siblings ...)
2004-11-29 21:04 ` kgardas at objectsecurity dot com
@ 2004-12-13 3:03 ` pinskia at gcc dot gnu dot org
2004-12-13 6:38 ` pinskia at gcc dot gnu dot org
` (13 subsequent siblings)
90 siblings, 0 replies; 93+ messages in thread
From: pinskia at gcc dot gnu dot org @ 2004-12-13 3:03 UTC (permalink / raw)
To: gcc-bugs
------- Additional Comments From pinskia at gcc dot gnu dot org 2004-12-13 03:03 -------
I noticed that for ir.ii, there is some compile time spent in GC, which means we have a memory problem.
I have a patch which should help a little with the memory problem, but not that much.
--
What |Removed |Added
----------------------------------------------------------------------------
Keywords| |memory-hog
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=13776
^ permalink raw reply [flat|nested] 93+ messages in thread
* [Bug tree-optimization/13776] [4.0 Regression] [tree-ssa] Many C++ compile-time regression in 4.0-tree-ssa 040120
2004-01-20 18:39 [Bug c++/13776] New: Many C++ compile-time regression in 3.5-tree-ssa 040120 in comparison with 3.4.0 040114 kgardas at objectsecurity dot com
` (76 preceding siblings ...)
2004-12-13 3:03 ` pinskia at gcc dot gnu dot org
@ 2004-12-13 6:38 ` pinskia at gcc dot gnu dot org
2004-12-13 6:59 ` pinskia at gcc dot gnu dot org
` (12 subsequent siblings)
90 siblings, 0 replies; 93+ messages in thread
From: pinskia at gcc dot gnu dot org @ 2004-12-13 6:38 UTC (permalink / raw)
To: gcc-bugs
------- Additional Comments From pinskia at gcc dot gnu dot org 2004-12-13 06:38 -------
Note that for ir.ii at -O0, we spend more time in local alloc and global alloc with the mainline than with 3.3.2:
2.41 vs 3.86 and 3.74 vs 6.07, so someone who knows local alloc and global alloc might want to look
into this. This is on powerpc-darwin, by the way; on x86 there might be a different problem. Someone
should do a -ftime-report with both the mainline and 3.4.x to see if this is also true on x86.
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=13776
^ permalink raw reply [flat|nested] 93+ messages in thread
* [Bug tree-optimization/13776] [4.0 Regression] [tree-ssa] Many C++ compile-time regression in 4.0-tree-ssa 040120
2004-01-20 18:39 [Bug c++/13776] New: Many C++ compile-time regression in 3.5-tree-ssa 040120 in comparison with 3.4.0 040114 kgardas at objectsecurity dot com
` (77 preceding siblings ...)
2004-12-13 6:38 ` pinskia at gcc dot gnu dot org
@ 2004-12-13 6:59 ` pinskia at gcc dot gnu dot org
2004-12-28 21:03 ` [Bug middle-end/13776] [4.0 Regression] " kgardas at objectsecurity dot com
` (11 subsequent siblings)
90 siblings, 0 replies; 93+ messages in thread
From: pinskia at gcc dot gnu dot org @ 2004-12-13 6:59 UTC (permalink / raw)
To: gcc-bugs
------- Additional Comments From pinskia at gcc dot gnu dot org 2004-12-13 06:59 -------
For -O1, integration is slower in the mainline compared with 3.3.2, 2.46 vs 1.51.
global alloc is also slower: 3.21 vs 2.38.
Speeding those up will help.
This is again on powerpc-darwin. The reason why I thought 3.3.2 was much slower than the mainline was
that the GC limits were low for 3.3.2 on darwin.
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=13776
^ permalink raw reply [flat|nested] 93+ messages in thread
* [Bug middle-end/13776] [4.0 Regression] Many C++ compile-time regression in 4.0-tree-ssa 040120
2004-01-20 18:39 [Bug c++/13776] New: Many C++ compile-time regression in 3.5-tree-ssa 040120 in comparison with 3.4.0 040114 kgardas at objectsecurity dot com
` (78 preceding siblings ...)
2004-12-13 6:59 ` pinskia at gcc dot gnu dot org
@ 2004-12-28 21:03 ` kgardas at objectsecurity dot com
2005-01-26 10:21 ` [Bug middle-end/13776] [4.0 Regression] Many C++ compile-time regressions for MICO's ORB code steven at gcc dot gnu dot org
` (10 subsequent siblings)
90 siblings, 0 replies; 93+ messages in thread
From: kgardas at objectsecurity dot com @ 2004-12-28 21:03 UTC (permalink / raw)
To: gcc-bugs
------- Additional Comments From kgardas at objectsecurity dot com 2004-12-28 21:03 -------
Hello,
New comparison is here:
http://gcc.gnu.org/ml/gcc/2004-12/msg01157.html
Cheers,
Karel
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=13776
^ permalink raw reply [flat|nested] 93+ messages in thread
* [Bug middle-end/13776] [4.0 Regression] Many C++ compile-time regressions for MICO's ORB code
2004-01-20 18:39 [Bug c++/13776] New: Many C++ compile-time regression in 3.5-tree-ssa 040120 in comparison with 3.4.0 040114 kgardas at objectsecurity dot com
` (79 preceding siblings ...)
2004-12-28 21:03 ` [Bug middle-end/13776] [4.0 Regression] " kgardas at objectsecurity dot com
@ 2005-01-26 10:21 ` steven at gcc dot gnu dot org
2005-01-26 10:25 ` rguenth at tat dot physik dot uni-tuebingen dot de
` (9 subsequent siblings)
90 siblings, 0 replies; 93+ messages in thread
From: steven at gcc dot gnu dot org @ 2005-01-26 10:21 UTC (permalink / raw)
To: gcc-bugs
------- Additional Comments From steven at gcc dot gnu dot org 2005-01-26 10:20 -------
Bah, I hate profiles for "cc1plus -O2 ir.ii" without peaks:
CPU: P4 / Xeon with 2 hyper-threads, speed 3194.17 MHz (estimated)
Counted GLOBAL_POWER_EVENTS events (time during which processor is not
stopped) with a unit mask of 0x01 (mandatory) count 100000
samples % symbol name
78641 5.2991 ggc_alloc_stat
28267 1.9047 ggc_set_mark
26230 1.7675 splay_tree_splay_helper
25018 1.6858 walk_tree
24322 1.6389 cgraph_node_for_asm
20428 1.3765 gt_ggc_mx_lang_tree_node
19586 1.3198 htab_find_slot_with_hash
16006 1.0785 compute_immediate_uses
15133 1.0197 get_stmt_operands
14481 0.9758 constrain_operands
13414 0.9039 insert_aux
13308 0.8967 decl_assembler_name_equal
12795 0.8622 find_reloads
12052 0.8121 decl_assembler_name
11986 0.8077 cse_insn
11743 0.7913 record_reg_classes
11707 0.7889 bitmap_set_bit
11630 0.7837 ix86_decompose_address
11610 0.7823 mark_set_1
11538 0.7775 optimize_stmt
11201 0.7548 iterative_hash_expr
10615 0.7153 cp_walk_subtrees
10235 0.6897 rewrite_stmt
9892 0.6666 for_each_rtx_1
9816 0.6614 get_expr_operands
9813 0.6612 invalidate
9302 0.6268 pointer_set_insert
9293 0.6262 mark_def_sites
8570 0.5775 reg_scan_mark_refs
8503 0.5730 propagate_necessity
8424 0.5676 is_gimple_reg
8322 0.5608 compute_may_aliases
No single problem to focus on...
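As a rough illustration of how flat this profile is, the sketch below adds up
the share of the symbols that look garbage-collector related. Grouping by the
"ggc_" and "gt_ggc_" name prefixes is an assumption made for this example,
and only the top rows of the table above are included.

```python
# Percent-of-samples for the top symbols in the profile above.
PROFILE = {
    "ggc_alloc_stat": 5.2991,
    "ggc_set_mark": 1.9047,
    "splay_tree_splay_helper": 1.7675,
    "walk_tree": 1.6858,
    "cgraph_node_for_asm": 1.6389,
    "gt_ggc_mx_lang_tree_node": 1.3765,
    "htab_find_slot_with_hash": 1.3198,
    "compute_immediate_uses": 1.0785,
    "get_stmt_operands": 1.0197,
}

# Assumption: these prefixes identify garbage-collector routines.
GC_PREFIXES = ("ggc_", "gt_ggc_")


def gc_share(profile):
    """Sum the percentages of symbols whose names start with a GC prefix."""
    return sum(pct for name, pct in profile.items()
               if name.startswith(GC_PREFIXES))


def hottest(profile):
    """Return the (symbol, percent) pair with the largest share."""
    return max(profile.items(), key=lambda item: item[1])


if __name__ == "__main__":
    print("GC-looking share: %.4f%%" % gc_share(PROFILE))
    print("hottest symbol: %s (%.4f%%)" % hottest(PROFILE))
```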
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=13776
^ permalink raw reply [flat|nested] 93+ messages in thread
* [Bug middle-end/13776] [4.0 Regression] Many C++ compile-time regressions for MICO's ORB code
2004-01-20 18:39 [Bug c++/13776] New: Many C++ compile-time regression in 3.5-tree-ssa 040120 in comparison with 3.4.0 040114 kgardas at objectsecurity dot com
` (80 preceding siblings ...)
2005-01-26 10:21 ` [Bug middle-end/13776] [4.0 Regression] Many C++ compile-time regressions for MICO's ORB code steven at gcc dot gnu dot org
@ 2005-01-26 10:25 ` rguenth at tat dot physik dot uni-tuebingen dot de
2005-01-26 10:25 ` kgardas at objectsecurity dot com
` (8 subsequent siblings)
90 siblings, 0 replies; 93+ messages in thread
From: rguenth at tat dot physik dot uni-tuebingen dot de @ 2005-01-26 10:25 UTC (permalink / raw)
To: gcc-bugs
------- Additional Comments From rguenth at tat dot physik dot uni-tuebingen dot de 2005-01-26 10:24 -------
Subject: Re: [4.0 Regression] Many C++ compile-time
regressions for MICO's ORB code
> Bah, I hate profiles for "cc1plus -O2 ir.ii" without peaks:
>
> CPU: P4 / Xeon with 2 hyper-threads, speed 3194.17 MHz (estimated)
> Counted GLOBAL_POWER_EVENTS events (time during which processor is not
> stopped) with a unit mask of 0x01 (mandatory) count 100000
> samples % symbol name
> 25018 1.6858 walk_tree
> 24322 1.6389 cgraph_node_for_asm
> 19586 1.3198 htab_find_slot_with_hash
Do you have numbers on whether we are memory-bandwidth limited here? If
not, we might micro-optimize hash table access somewhat more.
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=13776
^ permalink raw reply [flat|nested] 93+ messages in thread
* [Bug middle-end/13776] [4.0 Regression] Many C++ compile-time regressions for MICO's ORB code
2004-01-20 18:39 [Bug c++/13776] New: Many C++ compile-time regression in 3.5-tree-ssa 040120 in comparison with 3.4.0 040114 kgardas at objectsecurity dot com
` (81 preceding siblings ...)
2005-01-26 10:25 ` rguenth at tat dot physik dot uni-tuebingen dot de
@ 2005-01-26 10:25 ` kgardas at objectsecurity dot com
2005-01-26 10:46 ` kgardas at objectsecurity dot com
` (7 subsequent siblings)
90 siblings, 0 replies; 93+ messages in thread
From: kgardas at objectsecurity dot com @ 2005-01-26 10:25 UTC (permalink / raw)
To: gcc-bugs
------- Additional Comments From kgardas at objectsecurity dot com 2005-01-26 10:24 -------
Subject: Re: [4.0 Regression] Many C++ compile-time
regressions for MICO's ORB code
On Wed, 26 Jan 2005, steven at gcc dot gnu dot org wrote:
>
> ------- Additional Comments From steven at gcc dot gnu dot org 2005-01-26 10:20 -------
> Bah, I hate profiles for "cc1plus -O2 ir.ii" without peaks:
True. If I may add something, I would recommend looking at why ir.cc
regresses so much in memory consumption compared with 3.4.x. If you
solve this, perhaps the compile-time regressions will go away too.
Thanks,
Karel
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=13776
^ permalink raw reply [flat|nested] 93+ messages in thread
* [Bug middle-end/13776] [4.0 Regression] Many C++ compile-time regressions for MICO's ORB code
2004-01-20 18:39 [Bug c++/13776] New: Many C++ compile-time regression in 3.5-tree-ssa 040120 in comparison with 3.4.0 040114 kgardas at objectsecurity dot com
` (82 preceding siblings ...)
2005-01-26 10:25 ` kgardas at objectsecurity dot com
@ 2005-01-26 10:46 ` kgardas at objectsecurity dot com
2005-01-26 11:36 ` steven at gcc dot gnu dot org
` (6 subsequent siblings)
90 siblings, 0 replies; 93+ messages in thread
From: kgardas at objectsecurity dot com @ 2005-01-26 10:46 UTC (permalink / raw)
To: gcc-bugs
------- Additional Comments From kgardas at objectsecurity dot com 2005-01-26 10:46 -------
Subject: Re: [4.0 Regression] Many C++ compile-time
regressions for MICO's ORB code
A note on 4.0.0 versus 3.4.2 memory usage while compiling
ir.cc.
3.4.2: it quickly grows to 90MB RAM, stays there for a long
time, and then goes quickly to 99MB RAM, where it finishes -- i.e. the majority
of the time is spent with ~90MB or less of memory consumed.
4.0.0: in comparison with 3.4.2, it grows to 243MB RAM, stays
there for some time (not the majority, but let's say 1/3 of the compilation
certainly), then goes back down to 200MB RAM consumed and then finishes.
It is hard to tell the average memory usage here, but I think it is about 200MB.
My 4.0.0 here is quite old (20041228), but I guess the picture is still the
same.
Thanks,
Karel
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=13776
^ permalink raw reply [flat|nested] 93+ messages in thread
* [Bug middle-end/13776] [4.0 Regression] Many C++ compile-time regressions for MICO's ORB code
2004-01-20 18:39 [Bug c++/13776] New: Many C++ compile-time regression in 3.5-tree-ssa 040120 in comparison with 3.4.0 040114 kgardas at objectsecurity dot com
` (83 preceding siblings ...)
2005-01-26 10:46 ` kgardas at objectsecurity dot com
@ 2005-01-26 11:36 ` steven at gcc dot gnu dot org
2005-01-27 5:03 ` pinskia at gcc dot gnu dot org
` (5 subsequent siblings)
90 siblings, 0 replies; 93+ messages in thread
From: steven at gcc dot gnu dot org @ 2005-01-26 11:36 UTC (permalink / raw)
To: gcc-bugs
------- Additional Comments From steven at gcc dot gnu dot org 2005-01-26 11:36 -------
It would be a Good Thing to look at the hash function. The number of
collisions per search is extremely high:
String pool
entries 128928
identifiers 128928 (100.00%)
slots 262144
bytes 1846k (142k overhead)
table size 2048k
coll/search 0.8518
ins/search 0.2747
avg. entry 14.66 bytes (+/- 17.60)
longest entry 830
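For readers who want to see where ratios like coll/search and ins/search come from, here is a minimal sketch. It assumes the two figures are simply collisions divided by searches and insertions divided by searches. The toy table uses open addressing with linear probing and never resizes; `ProbeTable`, `byte_sum`, `fnv1a`, `measure` and the `ident_N` keys are all made up for illustration. This is not GCC's string pool code or its hash function.

```python
def byte_sum(key: str) -> int:
    """Deliberately weak hash: many identifiers share the same byte sum."""
    return sum(key.encode())


def fnv1a(key: str) -> int:
    """32-bit FNV-1a, used here only as an example of a better-mixing hash."""
    h = 0x811C9DC5
    for byte in key.encode():
        h ^= byte
        h = (h * 0x01000193) & 0xFFFFFFFF
    return h


class ProbeTable:
    """Toy open-addressed table that counts searches, collisions, inserts."""

    def __init__(self, slots, hash_fn):
        assert slots & (slots - 1) == 0, "slot count must be a power of two"
        self.slots = [None] * slots
        self.mask = slots - 1
        self.hash_fn = hash_fn
        self.searches = 0
        self.collisions = 0
        self.inserts = 0

    def lookup(self, key):
        """Find key, inserting it if absent; count each occupied slot skipped."""
        self.searches += 1
        i = self.hash_fn(key) & self.mask
        while self.slots[i] is not None and self.slots[i] != key:
            self.collisions += 1
            i = (i + 1) & self.mask
        if self.slots[i] is None:
            # Keep at least one slot empty so probing always terminates.
            assert self.inserts < self.mask, "toy table does not resize"
            self.slots[i] = key
            self.inserts += 1
        return i

    def stats(self):
        """Return (coll/search, ins/search)."""
        return (self.collisions / self.searches, self.inserts / self.searches)


def measure(hash_fn, n_idents=1000, lookups_each=4, slots=2048):
    """Look up every hypothetical identifier several times and report ratios."""
    table = ProbeTable(slots, hash_fn)
    for _ in range(lookups_each):
        for n in range(n_idents):
            table.lookup(f"ident_{n}")
    return table.stats()


if __name__ == "__main__":
    for fn in (byte_sum, fnv1a):
        coll, ins = measure(fn)
        print(f"{fn.__name__:9s} coll/search {coll:.4f}  ins/search {ins:.4f}")
```

Running both hash functions over the same keys at the same table size gives a like-for-like comparison of coll/search, which is the kind of experiment that would show whether a different hash function helps.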
There is also still a lot of memory allocated at the end of the compilation:
Memory still allocated at the end of the compilation process
Size      Allocated  Used    Overhead
8         4096       200     120
16        4264k      1211k   91k
64        29M        10M     476k
128       3920k      1472k   53k
256       1240k      519k    16k
512       4084k      2026k   55k
1024      488k       390k    6832
2048      2628k      1998k   35k
4096      1160k      1160k   15k
8192      376k       368k    2632
16384     304k       288k    1064
32768     160k       128k    280
65536     704k       640k    616
131072    384k       384k    168
262144    512k       512k    112
524288    512k       512k    56
112       26M        19M     373k
208       63M        43M     883k
48        27M        14M     443k
32        18M        10M     337k
80        13M        13M     186k
Total     199M       122M    2982k
Note especially the 43MB (the Used column of the 208-byte row). All of that
is in the et-forest alloc-pools. Perhaps we should allocate/free them per function.
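A rough model of why per-function pools could help, under stated assumptions: a pool hands out fixed-size elements, freed elements go back to the pool's own free list, and the underlying blocks only return to the system when the whole pool is released. `AllocPool`, `bytes_held_after`, the element size and the node counts are all hypothetical; this is bookkeeping only, not GCC's alloc-pool implementation.

```python
class AllocPool:
    """Toy fixed-size-object pool: freed elements return to the pool's free
    list, and memory goes back to the system only when the pool is released."""

    def __init__(self, elt_size, elts_per_block):
        self.elt_size = elt_size
        self.elts_per_block = elts_per_block
        self.blocks = 0      # blocks obtained from the system
        self.free_elts = 0   # elements currently on the free list

    def alloc(self):
        if self.free_elts == 0:
            # Free list is empty: grab one more block from the system.
            self.blocks += 1
            self.free_elts = self.elts_per_block
        self.free_elts -= 1

    def free(self):
        # The element is reusable, but its block stays with the pool.
        self.free_elts += 1

    def release(self):
        # Hand every block back to the system.
        self.blocks = 0
        self.free_elts = 0

    def bytes_held(self):
        return self.blocks * self.elts_per_block * self.elt_size


def bytes_held_after(node_counts, per_function, elt_size=32, elts_per_block=64):
    """Process each 'function', allocating and freeing its nodes, and return
    how many bytes the pool still holds once the last function is done."""
    pool = AllocPool(elt_size, elts_per_block)
    for nodes in node_counts:
        for _ in range(nodes):
            pool.alloc()
        for _ in range(nodes):
            pool.free()
        if per_function:
            # Stands in for destroying the pool and creating a fresh one.
            pool.release()
    return pool.bytes_held()


if __name__ == "__main__":
    counts = [100, 5000, 200]  # hypothetical node counts per function
    print("one long-lived pool:", bytes_held_after(counts, per_function=False))
    print("per-function pools: ", bytes_held_after(counts, per_function=True))
```

In this model a long-lived pool keeps holding the high-water mark set by the largest function, while a per-function pool holds nothing between functions.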
Finally, we allocate a lot of SSA_NAMEs, and varrays are problematic as
always:
source location                       Garbage          Freed             Leak          Overhead         Times
varray.c:170 (varray_grow)            39485908: 3.3%   280747780:47.6%   229448: 0.2%  80866528:32.0%   552682
tree-ssanames.c:197 (make_ssa_name)   94292264: 7.9%   0: 0.0%           0: 0.0%       8572024: 3.4%    1071503
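One reason growable arrays tend to show up in reports like this can be sketched with simple arithmetic. The assumption here is that the array grows by doubling and that each grow moves the data to a new buffer, leaving the old one behind. `grow_by_doubling` and its parameters are hypothetical, and how an abandoned buffer is booked (garbage, freed or overhead) depends on the allocator, which this sketch does not model.

```python
def grow_by_doubling(final_elements, elt_size=8, initial_elements=1):
    """Return (final_bytes, abandoned_bytes, reallocs) for an array that
    doubles its capacity until it can hold final_elements elements."""
    capacity = initial_elements
    abandoned = 0
    reallocs = 0
    while capacity < final_elements:
        abandoned += capacity * elt_size  # old buffer is left behind
        capacity *= 2
        reallocs += 1
    return capacity * elt_size, abandoned, reallocs


if __name__ == "__main__":
    final_bytes, abandoned, reallocs = grow_by_doubling(1000)
    print(f"final buffer {final_bytes} bytes, "
          f"abandoned {abandoned} bytes over {reallocs} reallocations")
```

Under these assumptions the abandoned buffers add up to almost the size of the final buffer, so an array that starts small and grows large roughly doubles its allocation traffic.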
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=13776
^ permalink raw reply [flat|nested] 93+ messages in thread
* [Bug middle-end/13776] [4.0 Regression] Many C++ compile-time regressions for MICO's ORB code
2004-01-20 18:39 [Bug c++/13776] New: Many C++ compile-time regression in 3.5-tree-ssa 040120 in comparison with 3.4.0 040114 kgardas at objectsecurity dot com
` (84 preceding siblings ...)
2005-01-26 11:36 ` steven at gcc dot gnu dot org
@ 2005-01-27 5:03 ` pinskia at gcc dot gnu dot org
2005-01-31 9:31 ` kgardas at objectsecurity dot com
` (4 subsequent siblings)
90 siblings, 0 replies; 93+ messages in thread
From: pinskia at gcc dot gnu dot org @ 2005-01-27 5:03 UTC (permalink / raw)
To: gcc-bugs
--
What |Removed |Added
----------------------------------------------------------------------------
Severity|critical |normal
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=13776
^ permalink raw reply [flat|nested] 93+ messages in thread
* [Bug middle-end/13776] [4.0 Regression] Many C++ compile-time regressions for MICO's ORB code
2004-01-20 18:39 [Bug c++/13776] New: Many C++ compile-time regression in 3.5-tree-ssa 040120 in comparison with 3.4.0 040114 kgardas at objectsecurity dot com
` (85 preceding siblings ...)
2005-01-27 5:03 ` pinskia at gcc dot gnu dot org
@ 2005-01-31 9:31 ` kgardas at objectsecurity dot com
2005-02-01 13:39 ` arend dot bayer at web dot de
` (3 subsequent siblings)
90 siblings, 0 replies; 93+ messages in thread
From: kgardas at objectsecurity dot com @ 2005-01-31 9:31 UTC (permalink / raw)
To: gcc-bugs
------- Additional Comments From kgardas at objectsecurity dot com 2005-01-31 09:31 -------
Hello,
new timings for the MICO ORB sources are here:
http://gcc.gnu.org/ml/gcc/2005-01/msg01714.html
Cheers,
Karel
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=13776
^ permalink raw reply [flat|nested] 93+ messages in thread
* [Bug middle-end/13776] [4.0 Regression] Many C++ compile-time regressions for MICO's ORB code
2004-01-20 18:39 [Bug c++/13776] New: Many C++ compile-time regression in 3.5-tree-ssa 040120 in comparison with 3.4.0 040114 kgardas at objectsecurity dot com
` (86 preceding siblings ...)
2005-01-31 9:31 ` kgardas at objectsecurity dot com
@ 2005-02-01 13:39 ` arend dot bayer at web dot de
2005-03-02 20:09 ` [Bug middle-end/13776] [4.0/4.1 " kgardas at objectsecurity dot com
` (2 subsequent siblings)
90 siblings, 0 replies; 93+ messages in thread
From: arend dot bayer at web dot de @ 2005-02-01 13:39 UTC (permalink / raw)
To: gcc-bugs
------- Additional Comments From arend dot bayer at web dot de 2005-02-01 13:39 -------
Karel, ir.ii no longer compiles since Mark Mitchell's patch to disallow floating
point literals in constant expressions went in. I think if you could
regenerate the preprocessed source, it should work again.
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=13776
^ permalink raw reply [flat|nested] 93+ messages in thread
* [Bug middle-end/13776] [4.0/4.1 Regression] Many C++ compile-time regressions for MICO's ORB code
2004-01-20 18:39 [Bug c++/13776] New: Many C++ compile-time regression in 3.5-tree-ssa 040120 in comparison with 3.4.0 040114 kgardas at objectsecurity dot com
` (87 preceding siblings ...)
2005-02-01 13:39 ` arend dot bayer at web dot de
@ 2005-03-02 20:09 ` kgardas at objectsecurity dot com
2005-03-02 21:28 ` pinskia at gcc dot gnu dot org
2005-03-02 21:32 ` giovannibajo at libero dot it
90 siblings, 0 replies; 93+ messages in thread
From: kgardas at objectsecurity dot com @ 2005-03-02 20:09 UTC (permalink / raw)
To: gcc-bugs
------- Additional Comments From kgardas at objectsecurity dot com 2005-03-02 20:09 -------
New results measured for MICO compiled with 4.0.0 20050301 are posted here:
http://gcc.gnu.org/ml/gcc/2005-03/msg00132.html
Cheers,
Karel
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=13776
^ permalink raw reply [flat|nested] 93+ messages in thread
* [Bug middle-end/13776] [4.0/4.1 Regression] Many C++ compile-time regressions for MICO's ORB code
2004-01-20 18:39 [Bug c++/13776] New: Many C++ compile-time regression in 3.5-tree-ssa 040120 in comparison with 3.4.0 040114 kgardas at objectsecurity dot com
` (88 preceding siblings ...)
2005-03-02 20:09 ` [Bug middle-end/13776] [4.0/4.1 " kgardas at objectsecurity dot com
@ 2005-03-02 21:28 ` pinskia at gcc dot gnu dot org
2005-03-02 21:32 ` giovannibajo at libero dot it
90 siblings, 0 replies; 93+ messages in thread
From: pinskia at gcc dot gnu dot org @ 2005-03-02 21:28 UTC (permalink / raw)
To: gcc-bugs
--
Bug 13776 depends on bug 17278, which changed state.
Bug 17278 Summary: [4.0/4.1 Regression] 8% C++ compile-time regression in comparison with 3.4.1 at -O1 optimization level
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=17278
What |Old Value |New Value
----------------------------------------------------------------------------
Status|NEW |RESOLVED
Resolution| |FIXED
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=13776
^ permalink raw reply [flat|nested] 93+ messages in thread
* [Bug middle-end/13776] [4.0/4.1 Regression] Many C++ compile-time regressions for MICO's ORB code
2004-01-20 18:39 [Bug c++/13776] New: Many C++ compile-time regression in 3.5-tree-ssa 040120 in comparison with 3.4.0 040114 kgardas at objectsecurity dot com
` (89 preceding siblings ...)
2005-03-02 21:28 ` pinskia at gcc dot gnu dot org
@ 2005-03-02 21:32 ` giovannibajo at libero dot it
90 siblings, 0 replies; 93+ messages in thread
From: giovannibajo at libero dot it @ 2005-03-02 21:32 UTC (permalink / raw)
To: gcc-bugs
------- Additional Comments From giovannibajo at libero dot it 2005-03-02 21:32 -------
I took a quick look at this and I can't find anything that is not already
fixed, especially after Karel's last results. Also, having a bug with 85
comments is a good way to make developers run away, so let's close this as fixed as
well. If anyone on the CC list believes something mentioned here still needs
fixing, it is better to create a new bug.
--
What |Removed |Added
----------------------------------------------------------------------------
Status|NEW |RESOLVED
Resolution| |FIXED
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=13776
^ permalink raw reply [flat|nested] 93+ messages in thread