* [Bug tree-optimization/26854] Inordinate compile times on large routines
2006-03-24 20:25 [Bug c/26854] New: Inordinate compile times on large routines lucier at math dot purdue dot edu
@ 2006-03-25 16:21 ` rguenth at gcc dot gnu dot org
2006-03-25 22:22 ` lucier at math dot purdue dot edu
` (119 subsequent siblings)
120 siblings, 0 replies; 144+ messages in thread
From: rguenth at gcc dot gnu dot org @ 2006-03-25 16:21 UTC (permalink / raw)
To: gcc-bugs
------- Comment #1 from rguenth at gcc dot gnu dot org 2006-03-25 16:21 -------
Can you do a comparison to 4.0.3?
--
rguenth at gcc dot gnu dot org changed:
What |Removed |Added
----------------------------------------------------------------------------
Keywords| |compile-time-hog, memory-hog
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=26854
* [Bug tree-optimization/26854] Inordinate compile times on large routines
From: lucier at math dot purdue dot edu @ 2006-03-25 22:22 UTC (permalink / raw)
To: gcc-bugs
------- Comment #2 from lucier at math dot purdue dot edu 2006-03-25 22:22 -------
Subject: Re: Inordinate compile times on large routines
[lindv2:~/Desktop] lucier% /pkgs/gcc-4.0.3/bin/gcc -mcpu=970 -m64 -no-cpp-precomp
-Wall -W -Wno-unused -O1 -fno-math-errno -fschedule-insns2 -fno-trapping-math
-fno-strict-aliasing -fwrapv -fomit-frame-pointer -fPIC -fno-common -bundle
-flat_namespace -undefined suppress -I/usr/local/Gambit-C/include/
-ftime-report -fmem-report all.i
gcc: unrecognized option '-no-cpp-precomp'
Memory still allocated at the end of the compilation process
Size Allocated Used Overhead
8 16k 11k 480
16 52k 12k 1144
64 10M 1841k 167k
256 4096 512 56
512 12k 4608 168
1024 96k 95k 1344
2048 4096 2048 56
4096 64k 64k 896
8192 16k 16k 112
32768 288k 288k 504
131072 128k 128k 56
1048576 3072k 3072k 168
2097152 4096k 4096k 112
112 19M 16M 272k
208 6360k 4213k 86k
48 7344k 4315k 114k
32 148k 74k 2664
80 16M 1336k 232k
Total 67M 35M 881k
String pool
entries 155812
identifiers 155812 (100.00%)
slots 262144
bytes 1952k (167k overhead)
table size 2048k
coll/search 0.8640
ins/search 0.1923
avg. entry 12.83 bytes (+/- 7.87)
longest entry 67
??? tree nodes created
(No per-node statistics)
Type hash: size 1021, 551 elements, 0.816291 collisions
Execution times (seconds)
garbage collection : 2.11 ( 0%) usr 0.04 ( 0%) sys 2.71 ( 0%) wall
cfg construction : 0.68 ( 0%) usr 1.22 ( 0%) sys 2.29 ( 0%) wall
cfg cleanup : 94.99 ( 9%) usr 0.54 ( 0%) sys 120.62 ( 7%) wall
trivially dead code : 2.87 ( 0%) usr 0.06 ( 0%) sys 3.83 ( 0%) wall
life analysis : 6.78 ( 1%) usr 3.26 ( 1%) sys 12.56 ( 1%) wall
life info update : 1.09 ( 0%) usr 0.01 ( 0%) sys 1.34 ( 0%) wall
alias analysis : 1.89 ( 0%) usr 0.04 ( 0%) sys 2.55 ( 0%) wall
register scan : 1.25 ( 0%) usr 0.02 ( 0%) sys 1.62 ( 0%) wall
rebuild jump labels : 0.34 ( 0%) usr 0.01 ( 0%) sys 0.42 ( 0%) wall
preprocessing : 7.70 ( 1%) usr 12.37 ( 4%) sys 25.83 ( 2%) wall
lexical analysis : 13.19 ( 1%) usr 25.54 ( 9%) sys 48.16 ( 3%) wall
parser : 11.06 ( 1%) usr 13.13 ( 5%) sys 30.20 ( 2%) wall
tree gimplify : 1.61 ( 0%) usr 0.07 ( 0%) sys 2.14 ( 0%) wall
tree eh : 0.18 ( 0%) usr 0.01 ( 0%) sys 0.21 ( 0%) wall
tree CFG construction : 0.63 ( 0%) usr 0.16 ( 0%) sys 0.97 ( 0%) wall
tree CFG cleanup : 2.09 ( 0%) usr 0.02 ( 0%) sys 2.62 ( 0%) wall
tree find referenced vars: 0.25 ( 0%) usr 0.01 ( 0%) sys 0.37 ( 0%) wall
tree PTA : 615.45 (59%) usr 155.84 (55%) sys 967.56 (58%) wall
tree alias analysis : 0.63 ( 0%) usr 0.00 ( 0%) sys 0.73 ( 0%) wall
tree PHI insertion : 4.27 ( 0%) usr 5.94 ( 2%) sys 12.63 ( 1%) wall
tree SSA rewrite : 3.35 ( 0%) usr 0.10 ( 0%) sys 4.61 ( 0%) wall
tree SSA other : 8.35 ( 1%) usr 7.78 ( 3%) sys 19.75 ( 1%) wall
tree operand scan : 5.80 ( 1%) usr 7.91 ( 3%) sys 17.53 ( 1%) wall
dominator optimization: 5.62 ( 1%) usr 0.45 ( 0%) sys 7.42 ( 0%) wall
tree CCP : 1.78 ( 0%) usr 0.02 ( 0%) sys 2.18 ( 0%) wall
tree split crit edges : 0.30 ( 0%) usr 0.04 ( 0%) sys 0.41 ( 0%) wall
tree remove redundant PHIs: 3.92 ( 0%) usr 0.14 ( 0%) sys 4.96 ( 0%) wall
tree linearize phis : 0.03 ( 0%) usr 0.00 ( 0%) sys 0.03 ( 0%) wall
tree forward propagate: 1.22 ( 0%) usr 0.01 ( 0%) sys 1.51 ( 0%) wall
tree conservative DCE : 1.94 ( 0%) usr 0.01 ( 0%) sys 2.51 ( 0%) wall
tree aggressive DCE : 0.82 ( 0%) usr 0.06 ( 0%) sys 1.05 ( 0%) wall
tree DSE : 1.35 ( 0%) usr 0.05 ( 0%) sys 1.74 ( 0%) wall
PHI merge : 0.11 ( 0%) usr 0.01 ( 0%) sys 0.16 ( 0%) wall
tree record loop bounds: 0.29 ( 0%) usr 0.01 ( 0%) sys 0.37 ( 0%) wall
loop invariant motion : 1.25 ( 0%) usr 0.02 ( 0%) sys 1.58 ( 0%) wall
tree canonical iv creation: 0.26 ( 0%) usr 0.01 ( 0%) sys 0.34 ( 0%) wall
tree loop init : 8.65 ( 1%) usr 2.11 ( 1%) sys 13.35 ( 1%) wall
tree copy headers : 3.03 ( 0%) usr 1.35 ( 0%) sys 5.42 ( 0%) wall
tree SSA to normal : 139.82 (13%) usr 1.01 ( 0%) sys 176.26 (11%) wall
tree rename SSA copies: 0.72 ( 0%) usr 0.10 ( 0%) sys 0.97 ( 0%) wall
dominance frontiers : 0.76 ( 0%) usr 0.01 ( 0%) sys 0.94 ( 0%) wall
expand : 5.16 ( 0%) usr 1.32 ( 0%) sys 8.31 ( 0%) wall
varconst : 0.13 ( 0%) usr 0.02 ( 0%) sys 0.25 ( 0%) wall
jump : 0.80 ( 0%) usr 0.03 ( 0%) sys 1.00 ( 0%) wall
CSE : 2.27 ( 0%) usr 1.09 ( 0%) sys 4.26 ( 0%) wall
loop analysis : 2.00 ( 0%) usr 0.15 ( 0%) sys 2.60 ( 0%) wall
branch prediction : 3.36 ( 0%) usr 0.21 ( 0%) sys 4.44 ( 0%) wall
flow analysis : 0.28 ( 0%) usr 0.01 ( 0%) sys 0.33 ( 0%) wall
combiner : 3.82 ( 0%) usr 0.09 ( 0%) sys 4.97 ( 0%) wall
if-conversion : 2.49 ( 0%) usr 0.08 ( 0%) sys 3.27 ( 0%) wall
local alloc : 2.85 ( 0%) usr 0.11 ( 0%) sys 3.78 ( 0%) wall
global alloc : 27.34 ( 3%) usr 23.42 ( 8%) sys 61.76 ( 4%) wall
reload CSE regs : 27.92 ( 3%) usr 0.77 ( 0%) sys 36.15 ( 2%) wall
flow 2 : 1.80 ( 0%) usr 2.14 ( 1%) sys 4.79 ( 0%) wall
if-conversion 2 : 1.00 ( 0%) usr 0.06 ( 0%) sys 1.26 ( 0%) wall
rename registers : 0.94 ( 0%) usr 0.19 ( 0%) sys 1.41 ( 0%) wall
scheduling 2 : 3.49 ( 0%) usr 0.19 ( 0%) sys 4.40 ( 0%) wall
shorten branches : 0.90 ( 0%) usr 0.03 ( 0%) sys 1.28 ( 0%) wall
final : 1.68 ( 0%) usr 0.10 ( 0%) sys 2.08 ( 0%) wall
rest of compilation : 1.52 ( 0%) usr 1.26 ( 0%) sys 3.34 ( 0%) wall
TOTAL :1048.39 280.86 1665.66
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=26854
* [Bug tree-optimization/26854] Inordinate compile times on large routines
From: law at redhat dot com @ 2006-04-19 6:43 UTC (permalink / raw)
To: gcc-bugs
------- Comment #3 from law at redhat dot com 2006-04-19 06:43 -------
I'm peeking at DOM.
jeff
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=26854
* [Bug tree-optimization/26854] Inordinate compile times on large routines
From: law at redhat dot com @ 2006-04-19 15:32 UTC (permalink / raw)
To: gcc-bugs
------- Comment #4 from law at redhat dot com 2006-04-19 15:32 -------
OK, as expected, DOM was doing something totally stupid with immediate uses.
On my x86 box I've got a patch which takes us from ~250 seconds in DOM to
around 5.
I'm going to get this fix bootstrapped and regression tested, then port it to
mainline (where things are slightly different/rearranged, but the same core
problem exists).
Unfortunately, those gains are dwarfed by the wall-clock time burned
swapping/paging due to memory usage in other passes.
The worst memory offenders (in pain order) are:
reorder blocks (possible given the number of blocks/edges in this code)
expand (??? possibly being charged for some other passes' time)
global-alloc
Mainline has a different memory pain profile -- the new RTL invariant code
motion pass goes absolutely nuts memory-wise.
I'm not planning to work on any of the memory consumption issues.
--
law at redhat dot com changed:
What |Removed |Added
----------------------------------------------------------------------------
CC| |rakdver at gcc dot gnu dot
| |org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=26854
* [Bug tree-optimization/26854] Inordinate compile times on large routines
From: law at gcc dot gnu dot org @ 2006-04-19 22:34 UTC (permalink / raw)
To: gcc-bugs
------- Comment #5 from law at gcc dot gnu dot org 2006-04-19 22:34 -------
Subject: Bug 26854
Author: law
Date: Wed Apr 19 22:34:41 2006
New Revision: 113099
URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=113099
Log:
PR tree-optimization/26854
* tree-ssa-dse.c (dse_optimize_stmt): Use has_single_use rather
than num_imm_uses.
* tree-ssa-dom.c (simplify_rhs_and_lookup_avail_expr): Similarly.
Modified:
branches/gcc-4_1-branch/gcc/ChangeLog
branches/gcc-4_1-branch/gcc/tree-ssa-dom.c
branches/gcc-4_1-branch/gcc/tree-ssa-dse.c
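
For readers who want the idea behind this change, here is a minimal sketch, not GCC's code. It assumes the immediate uses of a name sit on a circular doubly linked list with a sentinel head. All names in it (use_node, count_uses, single_use_p, zero_uses_p) are hypothetical stand-ins. Counting the uses walks the whole list, while asking "is there exactly one use?" needs only two pointer comparisons.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for an immediate-use list: a circular, doubly
   linked list whose head is a sentinel owned by the SSA name.  */
struct use_node {
  struct use_node *prev;
  struct use_node *next;
};

/* Make HEAD an empty list (it points at itself in both directions).  */
static void use_list_init(struct use_node *head) {
  assert(head != NULL);
  head->prev = head;
  head->next = head;
}

/* Link USE in right after the sentinel HEAD.  */
static void use_list_add(struct use_node *head, struct use_node *use) {
  assert(head != NULL && use != NULL);
  use->next = head->next;
  use->prev = head;
  head->next->prev = use;
  head->next = use;
}

/* O(n): visits every use just to produce a count.  */
static size_t count_uses(const struct use_node *head) {
  size_t n = 0;
  for (const struct use_node *p = head->next; p != head; p = p->next)
    n++;
  return n;
}

/* O(1): the list is empty when the sentinel points at itself.  */
static bool zero_uses_p(const struct use_node *head) {
  return head->next == head;
}

/* O(1): exactly one use when the first and last nodes are the same
   non-sentinel node.  */
static bool single_use_p(const struct use_node *head) {
  return head->next != head && head->next == head->prev;
}
```

On a name with very many uses, a caller that only needs to know whether the count is zero or one pays for the full walk in count_uses every time, which is the kind of cost the log entry above avoids.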
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=26854
* [Bug tree-optimization/26854] Inordinate compile times on large routines
From: lucier at math dot purdue dot edu @ 2006-04-20 3:18 UTC (permalink / raw)
To: gcc-bugs
------- Comment #6 from lucier at math dot purdue dot edu 2006-04-20 03:18 -------
Subject: Re: Inordinate compile times on large routines
Thanks a lot. Here are the timing statistics (with --disable-checking)
after your patch.
PS: I'm sorry it took 9 hours to compile on your box.
Memory still allocated at the end of the compilation process
Size Allocated Used Overhead
8 16k 14k 480
16 52k 12k 1144
64 1276k 1239k 19k
256 484k 452k 6776
512 36k 25k 504
1024 220k 216k 3080
2048 24k 20k 336
4096 68k 68k 952
8192 56k 56k 392
16384 16k 16k 56
32768 288k 288k 504
65536 64k 64k 56
131072 128k 128k 56
262144 512k 512k 112
524288 512k 512k 56
1048576 1024k 1024k 56
2097152 4096k 4096k 112
112 34M 16M 484k
208 40k 38k 560
192 3344k 3287k 45k
160 28k 6240 392
176 564k 261k 7896
48 2088k 1165k 32k
32 144k 68k 2592
80 35M 2063k 499k
Total 85M 32M 1107k
String pool
entries 158128
identifiers 158128 (100.00%)
slots 262144
bytes 1981k (169k overhead)
table size 2048k
coll/search 1.1434
ins/search 0.1946
avg. entry 12.83 bytes (+/- 7.82)
longest entry 67
??? tree nodes created
(No per-node statistics)
Type hash: size 1021, 598 elements, 0.900368 collisions
DECL_DEBUG_EXPR hash: size 8191, 0 elements, 1.140991 collisions
DECL_VALUE_EXPR hash: size 1021, 0 elements, 0.000000 collisions
Execution times (seconds)
garbage collection : 1.84 ( 0%) usr 0.04 ( 0%) sys 2.47 ( 0%) wall 0 kB ( 0%) ggc
callgraph construction: 1.79 ( 0%) usr 0.35 ( 0%) sys 2.67 ( 0%) wall 21241 kB ( 2%) ggc
callgraph optimization: 0.05 ( 0%) usr 0.00 ( 0%) sys 0.05 ( 0%) wall 0 kB ( 0%) ggc
ipa reference : 0.42 ( 0%) usr 0.14 ( 0%) sys 0.71 ( 0%) wall 7 kB ( 0%) ggc
cfg construction : 0.31 ( 0%) usr 0.00 ( 0%) sys 0.48 ( 0%) wall 7224 kB ( 1%) ggc
cfg cleanup : 95.98 ( 9%) usr 0.62 ( 0%) sys 125.14 ( 8%) wall 2098 kB ( 0%) ggc
trivially dead code : 2.49 ( 0%) usr 0.06 ( 0%) sys 3.46 ( 0%) wall 0 kB ( 0%) ggc
life analysis : 5.86 ( 1%) usr 3.35 ( 3%) sys 11.86 ( 1%) wall 18686 kB ( 2%) ggc
life info update : 0.95 ( 0%) usr 0.02 ( 0%) sys 1.18 ( 0%) wall 526 kB ( 0%) ggc
alias analysis : 1.67 ( 0%) usr 0.03 ( 0%) sys 2.07 ( 0%) wall 16385 kB ( 2%) ggc
register scan : 0.93 ( 0%) usr 0.01 ( 0%) sys 1.29 ( 0%) wall 4 kB ( 0%) ggc
rebuild jump labels : 0.30 ( 0%) usr 0.00 ( 0%) sys 0.37 ( 0%) wall 0 kB ( 0%) ggc
preprocessing : 7.27 ( 1%) usr 13.04 (10%) sys 25.28 ( 2%) wall 2197 kB ( 0%) ggc
lexical analysis : 13.10 ( 1%) usr 25.59 (20%) sys 47.58 ( 3%) wall 0 kB ( 0%) ggc
parser : 9.44 ( 1%) usr 12.84 (10%) sys 28.21 ( 2%) wall 72677 kB ( 7%) ggc
tree gimplify : 1.51 ( 0%) usr 0.08 ( 0%) sys 2.02 ( 0%) wall 30969 kB ( 3%) ggc
tree eh : 0.17 ( 0%) usr 0.01 ( 0%) sys 0.22 ( 0%) wall 0 kB ( 0%) ggc
tree CFG construction : 0.56 ( 0%) usr 0.14 ( 0%) sys 1.02 ( 0%) wall 76077 kB ( 8%) ggc
tree CFG cleanup : 5.77 ( 1%) usr 0.06 ( 0%) sys 7.60 ( 0%) wall 955 kB ( 0%) ggc
tree copy propagation : 5.43 ( 0%) usr 0.39 ( 0%) sys 7.83 ( 0%) wall 10484 kB ( 1%) ggc
tree store copy prop : 0.73 ( 0%) usr 0.04 ( 0%) sys 0.96 ( 0%) wall 1088 kB ( 0%) ggc
tree find ref. vars : 0.21 ( 0%) usr 0.00 ( 0%) sys 0.23 ( 0%) wall 2502 kB ( 0%) ggc
tree PTA : 5.49 ( 0%) usr 0.57 ( 0%) sys 7.86 ( 0%) wall 16435 kB ( 2%) ggc
tree alias analysis : 6.82 ( 1%) usr 10.23 ( 8%) sys 18.62 ( 1%) wall 12810 kB ( 1%) ggc
tree PHI insertion : 1.05 ( 0%) usr 0.21 ( 0%) sys 1.62 ( 0%) wall 24377 kB ( 2%) ggc
tree SSA rewrite : 2.50 ( 0%) usr 0.16 ( 0%) sys 3.34 ( 0%) wall 39166 kB ( 4%) ggc
tree SSA other : 1.10 ( 0%) usr 1.49 ( 1%) sys 3.69 ( 0%) wall 0 kB ( 0%) ggc
tree SSA incremental : 13.99 ( 1%) usr 3.74 ( 3%) sys 22.60 ( 1%) wall 19165 kB ( 2%) ggc
tree operand scan : 626.32 (57%) usr 12.24 (10%) sys 833.21 (52%) wall 23910 kB ( 2%) ggc
dominator optimization: 6.09 ( 1%) usr 0.35 ( 0%) sys 8.22 ( 1%) wall 63874 kB ( 7%) ggc
tree STORE-CCP : 0.67 ( 0%) usr 0.02 ( 0%) sys 0.87 ( 0%) wall 513 kB ( 0%) ggc
tree CCP : 0.74 ( 0%) usr 0.02 ( 0%) sys 1.03 ( 0%) wall 514 kB ( 0%) ggc
tree split crit edges : 0.37 ( 0%) usr 0.21 ( 0%) sys 0.85 ( 0%) wall 40362 kB ( 4%) ggc
tree reassociation : 0.56 ( 0%) usr 0.02 ( 0%) sys 0.69 ( 0%) wall 0 kB ( 0%) ggc
tree FRE : 12.83 ( 1%) usr 0.67 ( 1%) sys 17.70 ( 1%) wall 40945 kB ( 4%) ggc
tree code sinking : 0.98 ( 0%) usr 0.06 ( 0%) sys 1.45 ( 0%) wall 0 kB ( 0%) ggc
tree linearize phis : 0.15 ( 0%) usr 0.00 ( 0%) sys 0.30 ( 0%) wall 0 kB ( 0%) ggc
tree forward propagate: 0.16 ( 0%) usr 0.00 ( 0%) sys 0.20 ( 0%) wall 0 kB ( 0%) ggc
tree conservative DCE : 1.87 ( 0%) usr 0.03 ( 0%) sys 2.54 ( 0%) wall 0 kB ( 0%) ggc
tree aggressive DCE : 0.87 ( 0%) usr 0.01 ( 0%) sys 1.17 ( 0%) wall 0 kB ( 0%) ggc
tree DSE : 0.73 ( 0%) usr 0.04 ( 0%) sys 0.91 ( 0%) wall 0 kB ( 0%) ggc
PHI merge : 0.02 ( 0%) usr 0.00 ( 0%) sys 0.02 ( 0%) wall 49 kB ( 0%) ggc
tree loop bounds : 0.35 ( 0%) usr 0.01 ( 0%) sys 0.41 ( 0%) wall 0 kB ( 0%) ggc
loop invariant motion : 0.56 ( 0%) usr 0.01 ( 0%) sys 0.72 ( 0%) wall 0 kB ( 0%) ggc
tree canonical iv : 0.14 ( 0%) usr 0.00 ( 0%) sys 0.21 ( 0%) wall 0 kB ( 0%) ggc
scev constant prop : 1.47 ( 0%) usr 0.04 ( 0%) sys 2.02 ( 0%) wall 1973 kB ( 0%) ggc
complete unrolling : 0.08 ( 0%) usr 0.01 ( 0%) sys 0.08 ( 0%) wall 0 kB ( 0%) ggc
tree loop init : 5.15 ( 0%) usr 5.40 ( 4%) sys 14.04 ( 1%) wall 58726 kB ( 6%) ggc
tree loop fini : 0.00 ( 0%) usr 0.00 ( 0%) sys 0.01 ( 0%) wall 0 kB ( 0%) ggc
tree copy headers : 0.24 ( 0%) usr 0.01 ( 0%) sys 0.31 ( 0%) wall 0 kB ( 0%) ggc
tree SSA uncprop : 0.44 ( 0%) usr 0.01 ( 0%) sys 0.52 ( 0%) wall 0 kB ( 0%) ggc
tree SSA to normal : 171.82 (16%) usr 1.31 ( 1%) sys 224.37 (14%) wall 101554 kB (10%) ggc
tree rename SSA copies: 0.60 ( 0%) usr 0.07 ( 0%) sys 1.05 ( 0%) wall 0 kB ( 0%) ggc
dominance frontiers : 0.54 ( 0%) usr 0.02 ( 0%) sys 0.67 ( 0%) wall 0 kB ( 0%) ggc
expand : 7.37 ( 1%) usr 4.05 ( 3%) sys 14.98 ( 1%) wall 122832 kB (13%) ggc
varconst : 0.01 ( 0%) usr 0.01 ( 0%) sys 0.00 ( 0%) wall 7 kB ( 0%) ggc
jump : 0.66 ( 0%) usr 0.04 ( 0%) sys 0.88 ( 0%) wall 0 kB ( 0%) ggc
CSE : 1.98 ( 0%) usr 1.16 ( 1%) sys 4.00 ( 0%) wall 2442 kB ( 0%) ggc
loop analysis : 1.55 ( 0%) usr 0.13 ( 0%) sys 2.17 ( 0%) wall 7001 kB ( 1%) ggc
branch prediction : 2.97 ( 0%) usr 0.18 ( 0%) sys 4.00 ( 0%) wall 7022 kB ( 1%) ggc
flow analysis : 0.20 ( 0%) usr 0.00 ( 0%) sys 0.38 ( 0%) wall 0 kB ( 0%) ggc
combiner : 3.85 ( 0%) usr 0.10 ( 0%) sys 5.14 ( 0%) wall 31575 kB ( 3%) ggc
if-conversion : 1.82 ( 0%) usr 0.10 ( 0%) sys 2.31 ( 0%) wall 325 kB ( 0%) ggc
local alloc : 2.72 ( 0%) usr 0.11 ( 0%) sys 3.65 ( 0%) wall 13500 kB ( 1%) ggc
global alloc : 25.23 ( 2%) usr 21.24 (17%) sys 58.29 ( 4%) wall 30563 kB ( 3%) ggc
reload CSE regs : 28.86 ( 3%) usr 0.35 ( 0%) sys 37.86 ( 2%) wall 12947 kB ( 1%) ggc
flow 2 : 0.61 ( 0%) usr 0.00 ( 0%) sys 0.91 ( 0%) wall 19 kB ( 0%) ggc
if-conversion 2 : 0.68 ( 0%) usr 0.06 ( 0%) sys 0.86 ( 0%) wall 14 kB ( 0%) ggc
rename registers : 0.89 ( 0%) usr 0.17 ( 0%) sys 1.46 ( 0%) wall 24 kB ( 0%) ggc
scheduling 2 : 3.47 ( 0%) usr 0.18 ( 0%) sys 4.52 ( 0%) wall 35672 kB ( 4%) ggc
shorten branches : 0.16 ( 0%) usr 0.00 ( 0%) sys 0.17 ( 0%) wall 0 kB ( 0%) ggc
final : 2.08 ( 0%) usr 0.13 ( 0%) sys 2.74 ( 0%) wall 4096 kB ( 0%) ggc
symout : 0.00 ( 0%) usr 0.00 ( 0%) sys 0.01 ( 0%) wall 0 kB ( 0%) ggc
TOTAL :1106.93 125.15 1593.33 977727 kB
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=26854
* [Bug tree-optimization/26854] Inordinate compile times on large routines
From: law at redhat dot com @ 2006-04-20 3:28 UTC (permalink / raw)
To: gcc-bugs
------- Comment #7 from law at redhat dot com 2006-04-20 03:28 -------
Subject: Re: Inordinate compile times on large routines
On Thu, 2006-04-20 at 03:18 +0000, lucier at math dot purdue dot edu wrote:
>
> ------- Comment #6 from lucier at math dot purdue dot edu 2006-04-20 03:18 -------
> Subject: Re: Inordinate compile times on large routines
>
> Thanks a lot. Here are the timing statistics (with --disable-checking)
> after your patch.
>
> PS: I'm sorry it took 9 hours to compile on your box.
No worries. I've got several boxes and having one busy overnight
isn't that big of a deal.
Clearly there's something different between PPC and x86 for
your testcase as you're not getting hit in bb-reorder or
expand.
Operand scanning is clearly the #1 issue when run on PPC (> 50% of
compile time, OUCH). This may actually be an indication of a pass
going nuts and marking too many things for rescanning though.
You'll likely get radically different pain points with mainline
as well. The RTL loop invariant code goes crazy memory-wise
for me, tree PRE and FRE also suck up large amounts of time.
Jeff
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=26854
* [Bug tree-optimization/26854] Inordinate compile times on large routines
From: lucier at math dot purdue dot edu @ 2006-04-20 3:39 UTC (permalink / raw)
To: gcc-bugs
------- Comment #8 from lucier at math dot purdue dot edu 2006-04-20 03:39 -------
Subject: Re: Inordinate compile times on large routines
On Apr 19, 2006, at 10:28 PM, law at redhat dot com wrote:
> You'll likely get radically different pain points with mainline
> as well. The RTL loop invariant code goes crazy memory-wise
> for me, tree PRE and FRE also suck up large amounts of time.
Mainline doesn't build with -m64 -mcpu=970; this was reported as
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=26892
which is still marked as UNCONFIRMED; I just realized today that this
could be listed as a 4.1 regression. In my limited understanding, I
suspect it's a configure problem, as I mentioned in
http://gcc.gnu.org/ml/gcc/2006-04/msg00265.html
Brad
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=26854
* [Bug tree-optimization/26854] Inordinate compile times on large routines
From: law at gcc dot gnu dot org @ 2006-04-20 16:13 UTC (permalink / raw)
To: gcc-bugs
------- Comment #9 from law at gcc dot gnu dot org 2006-04-20 16:13 -------
Subject: Bug 26854
Author: law
Date: Thu Apr 20 16:13:12 2006
New Revision: 113120
URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=113120
Log:
PR tree-optimization/26854
* tree-ssa-dse.c (dse_optimize_stmt): Avoid num_imm_uses when
checking for zero or one use.
* tree-ssa-dom.c (propagate_rhs_into_lhs): Similarly.
* tree-cfgcleanup.c (merge_phi_nodes): Similarly.
* tree-ssa-reassoc.c (negate_value): Similarly.
(reassociate_bb): Similarly.
Modified:
trunk/gcc/ChangeLog
trunk/gcc/tree-cfgcleanup.c
trunk/gcc/tree-ssa-dom.c
trunk/gcc/tree-ssa-dse.c
trunk/gcc/tree-ssa-reassoc.c
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=26854
* [Bug tree-optimization/26854] Inordinate compile times on large routines
From: law at redhat dot com @ 2006-04-20 16:17 UTC (permalink / raw)
To: gcc-bugs
------- Comment #10 from law at redhat dot com 2006-04-20 16:17 -------
PRE/FRE for mainline need some TLC on their compile-time performance as
indicated by this PR as well. They're #3 & #4 respectively behind the operator
scanning code and store-ccp and way out of line when compared with the rest of
the tree optimization passes.
--
law at redhat dot com changed:
What |Removed |Added
----------------------------------------------------------------------------
CC| |dberlin at gcc dot gnu dot
| |org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=26854
* [Bug tree-optimization/26854] Inordinate compile times on large routines
From: dberlin at gcc dot gnu dot org @ 2006-04-20 16:21 UTC (permalink / raw)
To: gcc-bugs
------- Comment #11 from dberlin at gcc dot gnu dot org 2006-04-20 16:21 -------
(In reply to comment #10)
> PRE/FRE for mainline need some TLC on their compile-time performance as
> indicated by this PR as well. They're #3 & #4 respectively behind the operator
> scanning code and store-ccp and way out of line when compared with the rest of
> the tree optimization passes.
>
I'll look into this in the next few weeks.
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=26854
* [Bug tree-optimization/26854] Inordinate compile times on large routines
From: amacleod at redhat dot com @ 2006-04-26 18:59 UTC (permalink / raw)
To: gcc-bugs
------- Comment #12 from amacleod at redhat dot com 2006-04-26 18:59 -------
I have a patch to change the implementation of immediate uses forthcoming
which, as a side effect, cleans up the operand scanner time in this file:
on my x86 cross powerpc64:
before patch:
tree operand scan : 366.20 (31%) usr 2.59 (18%) sys 371.20 (31%) wall
TOTAL :1177.57 14.10 1200.53
after patch:
tree operand scan : 3.07 ( 0%) usr 1.72 (12%) sys 4.69 ( 1%) wall
TOTAL : 829.50 14.13 866.35
I will also take a look at the out-of-ssa time and see what can be done. Part
of the problem there is that a conflict graph is being built with 650,000,000
conflicts... that's not conducive to fast compile times! That's a lot of
SSA_NAME versions of a base variable!
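
To put the 650,000,000 figure in perspective, here is a small back-of-the-envelope sketch. It assumes (this is not stated above) the densest case, where every version of the base variable conflicts with every other, so n versions give n * (n - 1) / 2 conflicts. Under that assumption the result is a lower bound on the number of versions; a sparser graph would need more names to reach the same conflict count. The function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Number of unordered pairs among N items, i.e. the edge count of a
   complete conflict graph on N nodes.  */
static uint64_t pairwise_conflicts(uint64_t n) {
  /* Keep n * (n - 1) within 64 bits.  */
  assert(n <= UINT32_MAX);
  return n < 2 ? 0 : n * (n - 1) / 2;
}

/* Smallest N whose complete conflict graph has at least TARGET edges.
   A plain linear search is enough for a one-off estimate.  */
static uint64_t min_versions_for(uint64_t target) {
  uint64_t n = 0;
  while (pairwise_conflicts(n) < target)
    n++;
  return n;
}
```

Calling min_versions_for(650000000) yields 36057, so under the all-pairs assumption the graph involves at least roughly 36,000 names.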
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=26854
* [Bug tree-optimization/26854] Inordinate compile times on large routines
From: amacleod at redhat dot com @ 2006-04-27 2:29 UTC (permalink / raw)
To: gcc-bugs
------- Comment #13 from amacleod at redhat dot com 2006-04-27 02:29 -------
The patch for speeding up the operand cache has been posted to gcc-patches:
http://gcc.gnu.org/ml/gcc-patches/2006-04/msg01017.html
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=26854
* [Bug tree-optimization/26854] Inordinate compile times on large routines
From: amacleod at redhat dot com @ 2006-04-27 2:30 UTC (permalink / raw)
To: gcc-bugs
------- Comment #14 from amacleod at redhat dot com 2006-04-27 02:30 -------
I should point out that it's a patch for mainline. Conversion to 4.1 requires
some minor tweaking.
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=26854
* [Bug tree-optimization/26854] Inordinate compile times on large routines
From: amacleod at gcc dot gnu dot org @ 2006-04-27 20:22 UTC (permalink / raw)
To: gcc-bugs
------- Comment #15 from amacleod at redhat dot com 2006-04-27 20:22 -------
Subject: Bug 26854
Author: amacleod
Date: Thu Apr 27 20:22:17 2006
New Revision: 113321
URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=113321
Log:
Implement new immediate use iterators.
2006-04-27 Andrew MacLeod <amacleod@redhat.com>
PR tree-optimization/26854
* tree-vrp.c (remove_range_assertions): Use new Immuse iterator.
* doc/tree-ssa.texi: Update immuse iterator documentation.
* tree-ssa-math-opts.c (execute_cse_reciprocals_1): Use new iterator.
* tree-ssa-dom.c (propagate_rhs_into_lhs): Use new iterator.
* tree-flow-inline.h (end_safe_imm_use_traverse, end_safe_imm_use_p,
first_safe_imm_use, next_safe_imm_use): Remove.
(end_imm_use_stmt_p): New. Check for end of immuse stmt traversal.
(end_imm_use_stmt_traverse): New. Terminate immuse stmt traversal.
(move_use_after_head): New. Helper function to sort immuses in a stmt.
(link_use_stmts_after): New. Link all immuses in a stmt
consecutively.
(first_imm_use_stmt): New. Get first stmt in an immuse list.
(next_imm_use_stmt): New. Get next stmt in an immuse list.
(first_imm_use_on_stmt): New. Get first immuse on a stmt.
(end_imm_use_on_stmt_p): New. Check for end of immuses on a stmt.
(next_imm_use_on_stmt): New. Move to next immuse on a stmt.
* tree-ssa-forwprop.c (forward_propagate_addr_expr): Use new iterator.
* lambda-code.c (lambda_loopnest_to_gcc_loopnest): Use new iterator.
(perfect_nestify): Use new iterator.
* tree-vect-transform.c (vect_create_epilog_for_reduction): Use new
iterator.
* tree-flow.h (struct immediate_use_iterator_d): Add comments.
(next_imm_name): New field in struct immediate_use_iterator_d.
(FOR_EACH_IMM_USE_SAFE, BREAK_FROM_SAFE_IMM_USE): Remove.
(FOR_EACH_IMM_USE_STMT, BREAK_FROM_IMM_USE_STMT,
FOR_EACH_IMM_USE_ON_STMT): New immediate use iterator macros.
* tree-cfg.c (replace_uses_by): Use new iterator.
* tree-ssa-threadedge.c (lhs_of_dominating_assert): Use new iterator.
* tree-ssa-operands.c (correct_use_link): Remove.
(finalize_ssa_use_ops): No longer call correct_use_link.
Modified:
trunk/gcc/ChangeLog
trunk/gcc/doc/tree-ssa.texi
trunk/gcc/lambda-code.c
trunk/gcc/tree-cfg.c
trunk/gcc/tree-flow-inline.h
trunk/gcc/tree-flow.h
trunk/gcc/tree-ssa-dom.c
trunk/gcc/tree-ssa-forwprop.c
trunk/gcc/tree-ssa-math-opts.c
trunk/gcc/tree-ssa-operands.c
trunk/gcc/tree-ssa-threadedge.c
trunk/gcc/tree-vect-transform.c
trunk/gcc/tree-vrp.c
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=26854
* [Bug tree-optimization/26854] Inordinate compile times on large routines
From: lucier at math dot purdue dot edu @ 2006-11-30 4:36 UTC (permalink / raw)
To: gcc-bugs
------- Comment #16 from lucier at math dot purdue dot edu 2006-11-30 04:36 -------
I now get a segfault when trying this with the current 4.2.0 branch:
[descartes:~/Desktop] lucier% time /pkgs/gcc-4.2.0-64-test/bin/gcc -mcpu=970
-m64 -no-cpp-precomp -Wall -W -Wno-unused -O1 -fno-math-errno -fschedule-insns2
-fno-trapping-math -fno-strict-aliasing -fwrapv -fomit-frame-pointer -fPIC
-fno-common -bundle -flat_namespace -undefined suppress
-I/usr/local/Gambit-C/include/ -ftime-report -fmem-report all.i
gcc: unrecognized option '-no-cpp-precomp'
all.c: In function '___H__20_all_2e_o1':
all.c:132856: internal compiler error: Bus error
Please submit a full bug report,
with preprocessed source if appropriate.
See <URL:http://gcc.gnu.org/bugs.html> for instructions.
2100.522u 139.425s 49:12.72 75.8% 0+0k 0+13io 0pf+0w
Running gdb on it gives no more information.
Some details:
The STAGE1 compiler was host=powerpc64-darwin and target=powerpc64-darwin:
[descartes:~/programs/gcc/4.2.0] gcc-test% /pkgs/gcc-4.2.0/bin/gcc -v
Using built-in specs.
Target: powerpc-apple-darwin8.8.0
Configured with: ../configure --with-gmp=/pkgs/gmp-4.2.1
--with-mpfr=/pkgs/gmp-4.2.1 --prefix=/pkgs/gcc-4.2.0 --enable-languages=c
--disable-checking
Thread model: posix
gcc version 4.2.0 20061021 (prerelease)
This was the compiler that segfaulted:
(gdb) [descartes:~/Desktop] lucier% /pkgs/gcc-4.2.0-64-test/bin/gcc -v
Using built-in specs.
Target: powerpc64-apple-darwin8.8.0
Configured with: ../configure --build=powerpc64-apple-darwin8.8.0
--host=powerpc64-apple-darwin8.8.0 --target=powerpc64-apple-darwin8.8.0
--enable-languages=c --prefix=/pkgs/gcc-4.2.0-64-test
--with-gmp=/pkgs/gmp-4.2.1-64/ --with-mpfr=/pkgs/gmp-4.2.1-64/
Thread model: posix
gcc version 4.2.0 20061129 (prerelease)
[descartes:~/programs/gcc/4.2.0] gcc-test% cat gcc/BASE-VER
4.2.0
[descartes:~/programs/gcc/4.2.0] gcc-test% cat gcc/DATESTAMP
20061129
[descartes:~/programs/gcc/4.2.0] gcc-test% cat LAST_UPDATED
Wed Nov 29 17:51:48 EST 2006
Wed Nov 29 22:51:48 UTC 2006 (revision 119334M)
[descartes:~/programs/gcc/4.2.0] gcc-test% gdb -v
GNU gdb 6.3.50-20050815 (Apple version gdb-573) (Fri Oct 20 15:54:33 GMT 2006)
Copyright 2004 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB. Type "show warranty" for details.
This GDB was configured as "powerpc-apple-darwin".
A full bootstrap was done.
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=26854
^ permalink raw reply [flat|nested] 144+ messages in thread
* [Bug tree-optimization/26854] Inordinate compile times on large routines
2006-03-24 20:25 [Bug c/26854] New: Inordinate compile times on large routines lucier at math dot purdue dot edu
` (15 preceding siblings ...)
2006-11-30 4:36 ` lucier at math dot purdue dot edu
@ 2006-11-30 4:54 ` dberlin at dberlin dot org
2006-12-07 17:33 ` lucier at math dot purdue dot edu
` (103 subsequent siblings)
120 siblings, 0 replies; 144+ messages in thread
From: dberlin at dberlin dot org @ 2006-11-30 4:54 UTC (permalink / raw)
To: gcc-bugs
------- Comment #17 from dberlin at gcc dot gnu dot org 2006-11-30 04:54 -------
Subject: Re: Inordinate compile times on large routines
On 30 Nov 2006 04:36:05 -0000, lucier at math dot purdue dot edu
<gcc-bugzilla@gcc.gnu.org> wrote:
>
>
> ------- Comment #16 from lucier at math dot purdue dot edu 2006-11-30 04:36 -------
> I now get a segfault when trying this with the current 4.2.0 branch:
>
> [descartes:~/Desktop] lucier% time /pkgs/gcc-4.2.0-64-test/bin/gcc -mcpu=970
> -m64 -no-cpp-precomp -Wall -W -Wno-unused -O1 -fno-math-errno -fschedule-insns2
> -fno-trapping-math -fno-strict-aliasing -fwrapv -fomit-frame-pointer -fPIC
> -fno-common -bundle -flat_namespace -undefined suppress
> -I/usr/local/Gambit-C/include/ -ftime-report -fmem-report all.i
> gcc: unrecognized option '-no-cpp-precomp'
> all.c: In function '___H__20_all_2e_o1':
> all.c:132856: internal compiler error: Bus error
> Please submit a full bug report,
> with preprocessed source if appropriate
It shouldn't crash, but I'm still finishing the patch that keeps it from
taking a ridiculous amount of time; that patch will need to be applied to
the 4.2 branch.
It should be done this week (I sent a preview to the mailing list).
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=26854
^ permalink raw reply [flat|nested] 144+ messages in thread
* [Bug tree-optimization/26854] Inordinate compile times on large routines
2006-03-24 20:25 [Bug c/26854] New: Inordinate compile times on large routines lucier at math dot purdue dot edu
` (16 preceding siblings ...)
2006-11-30 4:54 ` dberlin at dberlin dot org
@ 2006-12-07 17:33 ` lucier at math dot purdue dot edu
2006-12-07 17:54 ` dberlin at dberlin dot org
` (102 subsequent siblings)
120 siblings, 0 replies; 144+ messages in thread
From: lucier at math dot purdue dot edu @ 2006-12-07 17:33 UTC (permalink / raw)
To: gcc-bugs
------- Comment #18 from lucier at math dot purdue dot edu 2006-12-07 17:32 -------
Well, I decided to try it with 4.3.0 on powerpc64-apple-darwin8.8.0 and didn't
get any better results:
[descartes:~/Desktop] lucier% time /pkgs/gcc-4.3.0-64/bin/gcc -mcpu=970 -m64
-no-cpp-precomp -Wall -W -Wno-unused -O1 -fno-math-errno -fschedule-insns2
-fno-trapping-math -fno-strict-aliasing -fwrapv -fomit-frame-pointer -fPIC
-fno-common -bundle -flat_namespace -undefined suppress
-I/usr/local/Gambit-C/include/ -ftime-report -fmem-report all.i
gcc: unrecognized option '-no-cpp-precomp'
all.c: In function '___H__20_all_2e_o1':
all.c:132856: internal compiler error: Bus error
Please submit a full bug report,
with preprocessed source if appropriate.
See <URL:http://gcc.gnu.org/bugs.html> for instructions.
923.482u 110.120s 22:51.89 75.3% 0+0k 0+12io 0pf+0w
[descartes:~/Desktop] lucier% /pkgs/gcc-4.3.0-64/bin/gcc -v
Using built-in specs.
Target: powerpc64-apple-darwin8.8.0
Configured with: ../configure --build=powerpc64-apple-darwin8.8.0
--host=powerpc64-apple-darwin8.8.0 --target=powerpc64-apple-darwin8.8.0
--with-gmp=/pkgs/gmp-4.2.1-64/ --with-mpfr=/pkgs/gmp-4.2.1-64/
--prefix=/pkgs/gcc-4.3.0-64 --enable-languages=c --enable-checking=no
Thread model: posix
gcc version 4.3.0 20061206 (experimental)
This is the branch that you installed your changes on, right Dan?
I suppose I should try it on another architecture to see whether the problem
might be darwin-specific, or ppc-specific, or 64-bit specific, or ...
Who knows?
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=26854
^ permalink raw reply [flat|nested] 144+ messages in thread
* [Bug tree-optimization/26854] Inordinate compile times on large routines
2006-03-24 20:25 [Bug c/26854] New: Inordinate compile times on large routines lucier at math dot purdue dot edu
` (17 preceding siblings ...)
2006-12-07 17:33 ` lucier at math dot purdue dot edu
@ 2006-12-07 17:54 ` dberlin at dberlin dot org
2006-12-07 17:54 ` dberlin at dberlin dot org
` (101 subsequent siblings)
120 siblings, 0 replies; 144+ messages in thread
From: dberlin at dberlin dot org @ 2006-12-07 17:54 UTC (permalink / raw)
To: gcc-bugs
------- Comment #19 from dberlin at gcc dot gnu dot org 2006-12-07 17:54 -------
Subject: Re: Inordinate compile times on large routines
> This is the branch that you installed your changes on, right Dan?
yes
>
> I suppose I should try it on another architecture to see whether the problem
> might be darwin-specific, or ppc-specific, or 64-bit specific, or ...
>
> Who knows?
>
We now spend basically no time in PTA, and about 800 seconds in
remove_ssa_form.
Sometime later on, we run out of memory and crash.
(IE it's somewhere other than the PTA, alias analysis, or tree-ssa
that we run out of memory).
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=26854
^ permalink raw reply [flat|nested] 144+ messages in thread
* [Bug tree-optimization/26854] Inordinate compile times on large routines
2006-03-24 20:25 [Bug c/26854] New: Inordinate compile times on large routines lucier at math dot purdue dot edu
` (18 preceding siblings ...)
2006-12-07 17:54 ` dberlin at dberlin dot org
@ 2006-12-07 17:54 ` dberlin at dberlin dot org
2006-12-07 21:51 ` lucier at math dot purdue dot edu
` (100 subsequent siblings)
120 siblings, 0 replies; 144+ messages in thread
From: dberlin at dberlin dot org @ 2006-12-07 17:54 UTC (permalink / raw)
To: gcc-bugs
------- Comment #20 from dberlin at gcc dot gnu dot org 2006-12-07 17:54 -------
Subject: Re: Inordinate compile times on large routines
>
> We now spend basically no time in PTA, and about 800 seconds in
> remove_ssa_form.
>
> Sometime later on, we run out of memory and crash.
> (IE it's somewhere other than the PTA, alias analysis, or tree-ssa
> that we run out of memory).
>
Sorry, forgot to mention this is on darwin.
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=26854
^ permalink raw reply [flat|nested] 144+ messages in thread
* [Bug tree-optimization/26854] Inordinate compile times on large routines
2006-03-24 20:25 [Bug c/26854] New: Inordinate compile times on large routines lucier at math dot purdue dot edu
` (19 preceding siblings ...)
2006-12-07 17:54 ` dberlin at dberlin dot org
@ 2006-12-07 21:51 ` lucier at math dot purdue dot edu
2006-12-08 1:24 ` lucier at math dot purdue dot edu
` (99 subsequent siblings)
120 siblings, 0 replies; 144+ messages in thread
From: lucier at math dot purdue dot edu @ 2006-12-07 21:51 UTC (permalink / raw)
To: gcc-bugs
------- Comment #21 from lucier at math dot purdue dot edu 2006-12-07 21:51 -------
Subject: Re: Inordinate compile times on large routines
I reran things on mainline on my patched RHEL box. It took almost
7GB of memory, peak, to compile this routine (this was very near the
end of cc1).
All things considered, on mainline the CPU time for this routine is
not so bad (alias analysis and FRE are two obvious hot-spots), but
the memory required is very large.
Back to 4.2.0 branch for more testing ...
euler-157% time /pkgs/gcc-mainline/bin/gcc -no-cpp-precomp -Wall -W
-Wno-unused -O1 -fno-math-errno -fschedule-insns2 -fno-trapping-math
-fno-strict-aliasing -fwrapv -fomit-frame-pointer -fPIC -fno-common
-ftime-report -fmem-report -c all.i
gcc: unrecognized option '-no-cpp-precomp'
Memory still allocated at the end of the compilation process
Size Allocated Used Overhead
8 16k 13k 480
16 8472k 8083k 182k
64 39M 22M 625k
256 4096 2816 56
512 4096 1024 56
1024 236k 236k 3304
2048 40k 26k 560
4096 80k 80k 1120
8192 64k 64k 448
16384 16k 16k 56
32768 64k 64k 112
65536 960k 960k 840
131072 512k 512k 224
262144 768k 768k 168
1048576 7168k 5120k 392
2097152 2048k 2048k 56
112 144k 92k 2016
208 44k 41k 616
192 14M 10094k 198k
160 40k 37k 560
176 7972k 4786k 108k
96 18M 16M 263k
448 28k 27k 392
128 9696k 6860k 132k
48 30M 13M 484k
224 424k 385k 5936
32 72M 72M 1296k
80 65M 38M 923k
Total 278M 201M 4231k
String pool
entries 125055
identifiers 125055 (100.00%)
slots 262144
bytes 1675k (137k overhead)
table size 2048k
coll/search 0.8888
ins/search 0.1979
avg. entry 13.72 bytes (+/- 8.99)
longest entry 71
??? tree nodes created
(No per-node statistics)
Type hash: size 1021, 577 elements, 0.695294 collisions
DECL_DEBUG_EXPR hash: size 8191, 2893 elements, 1.005820 collisions
DECL_VALUE_EXPR hash: size 1021, 0 elements, 0.000000 collisions
Execution times (seconds)
garbage collection : 1.01 ( 0%) usr 0.00 ( 0%) sys 1.01 ( 0%) wall 0 kB ( 0%) ggc
callgraph construction: 0.61 ( 0%) usr 0.09 ( 1%) sys 0.72 ( 0%) wall 17017 kB ( 2%) ggc
callgraph optimization: 0.03 ( 0%) usr 0.00 ( 0%) sys 0.03 ( 0%) wall 0 kB ( 0%) ggc
ipa reference : 0.17 ( 0%) usr 0.05 ( 0%) sys 0.23 ( 0%) wall 7 kB ( 0%) ggc
cfg cleanup : 7.69 ( 2%) usr 0.00 ( 0%) sys 8.03 ( 2%) wall 37 kB ( 0%) ggc
trivially dead code : 0.96 ( 0%) usr 0.00 ( 0%) sys 1.00 ( 0%) wall 0 kB ( 0%) ggc
life analysis : 19.95 ( 4%) usr 0.01 ( 0%) sys 20.77 ( 4%) wall 12767 kB ( 2%) ggc
life info update : 0.57 ( 0%) usr 0.00 ( 0%) sys 0.57 ( 0%) wall 0 kB ( 0%) ggc
alias analysis : 0.80 ( 0%) usr 0.00 ( 0%) sys 0.80 ( 0%) wall 7174 kB ( 1%) ggc
register scan : 0.42 ( 0%) usr 0.00 ( 0%) sys 0.46 ( 0%) wall 1 kB ( 0%) ggc
rebuild jump labels : 0.10 ( 0%) usr 0.00 ( 0%) sys 0.10 ( 0%) wall 0 kB ( 0%) ggc
preprocessing : 0.51 ( 0%) usr 0.90 ( 8%) sys 1.26 ( 0%) wall 1794 kB ( 0%) ggc
lexical analysis : 0.57 ( 0%) usr 1.42 (12%) sys 2.33 ( 0%) wall 0 kB ( 0%) ggc
parser : 1.29 ( 0%) usr 0.95 ( 8%) sys 2.29 ( 0%) wall 59589 kB ( 8%) ggc
integration : 0.24 ( 0%) usr 0.09 ( 1%) sys 0.35 ( 0%) wall 0 kB ( 0%) ggc
tree gimplify : 0.74 ( 0%) usr 0.04 ( 0%) sys 0.83 ( 0%) wall 42732 kB ( 6%) ggc
tree eh : 0.07 ( 0%) usr 0.00 ( 0%) sys 0.08 ( 0%) wall 0 kB ( 0%) ggc
tree CFG construction : 0.30 ( 0%) usr 0.05 ( 0%) sys 0.35 ( 0%) wall 59312 kB ( 8%) ggc
tree CFG cleanup : 3.53 ( 1%) usr 0.00 ( 0%) sys 3.51 ( 1%) wall 3716 kB ( 1%) ggc
tree copy propagation : 1.21 ( 0%) usr 0.00 ( 0%) sys 1.22 ( 0%) wall 2220 kB ( 0%) ggc
tree store copy prop : 0.41 ( 0%) usr 0.00 ( 0%) sys 0.41 ( 0%) wall 576 kB ( 0%) ggc
tree find ref. vars : 0.11 ( 0%) usr 0.00 ( 0%) sys 0.11 ( 0%) wall 1186 kB ( 0%) ggc
tree PTA : 2.52 ( 1%) usr 0.03 ( 0%) sys 2.57 ( 1%) wall 2280 kB ( 0%) ggc
tree alias analysis : 121.67 (27%) usr 0.50 ( 4%) sys 123.10 (26%) wall 18481 kB ( 3%) ggc
tree PHI insertion : 1.40 ( 0%) usr 0.07 ( 1%) sys 1.54 ( 0%) wall 69532 kB ( 9%) ggc
tree SSA rewrite : 2.40 ( 1%) usr 0.02 ( 0%) sys 2.47 ( 1%) wall 28127 kB ( 4%) ggc
tree SSA other : 0.09 ( 0%) usr 0.10 ( 1%) sys 0.23 ( 0%) wall 0 kB ( 0%) ggc
tree SSA incremental : 8.18 ( 2%) usr 0.09 ( 1%) sys 8.20 ( 2%) wall 19181 kB ( 3%) ggc
tree operand scan : 1.47 ( 0%) usr 0.58 ( 5%) sys 2.01 ( 0%) wall 26491 kB ( 4%) ggc
dominator optimization: 2.51 ( 1%) usr 0.01 ( 0%) sys 2.55 ( 1%) wall 46004 kB ( 6%) ggc
tree STORE-CCP : 0.58 ( 0%) usr 0.00 ( 0%) sys 0.58 ( 0%) wall 1024 kB ( 0%) ggc
tree CCP : 0.61 ( 0%) usr 0.00 ( 0%) sys 0.61 ( 0%) wall 1024 kB ( 0%) ggc
tree PHI const/copy prop: 0.19 ( 0%) usr 0.00 ( 0%) sys 0.20 ( 0%) wall 9 kB ( 0%) ggc
tree split crit edges : 0.09 ( 0%) usr 0.02 ( 0%) sys 0.12 ( 0%) wall 27005 kB ( 4%) ggc
tree reassociation : 0.45 ( 0%) usr 0.01 ( 0%) sys 0.45 ( 0%) wall 0 kB ( 0%) ggc
tree FRE : 194.08 (42%) usr 0.18 ( 2%) sys 202.72 (42%) wall 23470 kB ( 3%) ggc
tree code sinking : 0.46 ( 0%) usr 0.00 ( 0%) sys 0.48 ( 0%) wall 0 kB ( 0%) ggc
tree linearize phis : 0.12 ( 0%) usr 0.00 ( 0%) sys 0.13 ( 0%) wall 0 kB ( 0%) ggc
tree forward propagate: 0.11 ( 0%) usr 0.00 ( 0%) sys 0.11 ( 0%) wall 0 kB ( 0%) ggc
tree conservative DCE : 1.14 ( 0%) usr 0.00 ( 0%) sys 1.15 ( 0%) wall 0 kB ( 0%) ggc
tree aggressive DCE : 0.40 ( 0%) usr 0.00 ( 0%) sys 0.41 ( 0%) wall 0 kB ( 0%) ggc
tree DSE : 0.32 ( 0%) usr 0.00 ( 0%) sys 0.31 ( 0%) wall 0 kB ( 0%) ggc
PHI merge : 0.01 ( 0%) usr 0.00 ( 0%) sys 0.00 ( 0%) wall 2 kB ( 0%) ggc
tree loop bounds : 0.14 ( 0%) usr 0.00 ( 0%) sys 0.14 ( 0%) wall 0 kB ( 0%) ggc
loop invariant motion : 0.29 ( 0%) usr 0.00 ( 0%) sys 0.30 ( 0%) wall 0 kB ( 0%) ggc
tree canonical iv : 0.07 ( 0%) usr 0.00 ( 0%) sys 0.08 ( 0%) wall 0 kB ( 0%) ggc
scev constant prop : 0.58 ( 0%) usr 0.00 ( 0%) sys 0.57 ( 0%) wall 1756 kB ( 0%) ggc
complete unrolling : 0.04 ( 0%) usr 0.00 ( 0%) sys 0.04 ( 0%) wall 0 kB ( 0%) ggc
tree iv optimization : 0.01 ( 0%) usr 0.00 ( 0%) sys 0.00 ( 0%) wall 0 kB ( 0%) ggc
tree loop init : 2.07 ( 0%) usr 0.07 ( 1%) sys 2.17 ( 0%) wall 41825 kB ( 6%) ggc
tree loop fini : 0.00 ( 0%) usr 0.00 ( 0%) sys 0.01 ( 0%) wall 0 kB ( 0%) ggc
tree copy headers : 0.11 ( 0%) usr 0.00 ( 0%) sys 0.11 ( 0%) wall 1 kB ( 0%) ggc
tree SSA uncprop : 0.21 ( 0%) usr 0.00 ( 0%) sys 0.21 ( 0%) wall 0 kB ( 0%) ggc
tree SSA to normal : 30.22 ( 7%) usr 0.07 ( 1%) sys 30.74 ( 6%) wall 54480 kB ( 7%) ggc
tree rename SSA copies: 0.33 ( 0%) usr 0.00 ( 0%) sys 0.34 ( 0%) wall 0 kB ( 0%) ggc
dominance frontiers : 0.41 ( 0%) usr 0.00 ( 0%) sys 0.40 ( 0%) wall 0 kB ( 0%) ggc
dominance computation : 2.01 ( 0%) usr 0.01 ( 0%) sys 2.04 ( 0%) wall 0 kB ( 0%) ggc
expand : 5.02 ( 1%) usr 0.12 ( 1%) sys 5.29 ( 1%) wall 93938 kB (13%) ggc
varconst : 0.00 ( 0%) usr 0.00 ( 0%) sys 0.01 ( 0%) wall 6 kB ( 0%) ggc
jump : 0.06 ( 0%) usr 0.00 ( 0%) sys 0.06 ( 0%) wall 0 kB ( 0%) ggc
CSE : 0.55 ( 0%) usr 0.01 ( 0%) sys 0.58 ( 0%) wall 130 kB ( 0%) ggc
loop analysis : 17.46 ( 4%) usr 5.59 (48%) sys 24.20 ( 5%) wall 5483 kB ( 1%) ggc
branch prediction : 0.75 ( 0%) usr 0.00 ( 0%) sys 0.74 ( 0%) wall 1532 kB ( 0%) ggc
flow analysis : 0.14 ( 0%) usr 0.00 ( 0%) sys 0.15 ( 0%) wall 0 kB ( 0%) ggc
combiner : 1.66 ( 0%) usr 0.01 ( 0%) sys 1.68 ( 0%) wall 21082 kB ( 3%) ggc
if-conversion : 0.53 ( 0%) usr 0.00 ( 0%) sys 0.54 ( 0%) wall 350 kB ( 0%) ggc
mode switching : 0.01 ( 0%) usr 0.00 ( 0%) sys 0.01 ( 0%) wall 0 kB ( 0%) ggc
local alloc : 1.29 ( 0%) usr 0.00 ( 0%) sys 1.30 ( 0%) wall 7039 kB ( 1%) ggc
global alloc : 7.10 ( 2%) usr 0.41 ( 3%) sys 7.52 ( 2%) wall 6574 kB ( 1%) ggc
reload CSE regs : 0.75 ( 0%) usr 0.00 ( 0%) sys 0.73 ( 0%) wall 10842 kB ( 1%) ggc
flow 2 : 0.37 ( 0%) usr 0.00 ( 0%) sys 0.38 ( 0%) wall 3114 kB ( 0%) ggc
if-conversion 2 : 0.21 ( 0%) usr 0.00 ( 0%) sys 0.21 ( 0%) wall 9 kB ( 0%) ggc
rename registers : 0.54 ( 0%) usr 0.04 ( 0%) sys 0.58 ( 0%) wall 24 kB ( 0%) ggc
scheduling 2 : 2.36 ( 1%) usr 0.04 ( 0%) sys 2.40 ( 0%) wall 10954 kB ( 1%) ggc
machine dep reorg : 0.44 ( 0%) usr 0.00 ( 0%) sys 0.44 ( 0%) wall 135 kB ( 0%) ggc
final : 1.06 ( 0%) usr 0.01 ( 0%) sys 1.08 ( 0%) wall 2050 kB ( 0%) ggc
TOTAL : 456.98 11.72 481.71 734771 kB
457.718u 12.529s 8:03.35 97.2% 0+0k 0+0io 0pf+0w
euler-158% /pkgs/gcc-mainline/bin/gcc -v
Using built-in specs.
Target: x86_64-unknown-linux-gnu
Configured with: ../configure --prefix=/pkgs/gcc-mainline
--with-gmp=/pkgs/gmp-4.2.1 --with-mpfr=/pkgs/gmp-4.2.1 --enable-checking=no
--enable-languages=c
Thread model: posix
gcc version 4.3.0 20061207 (experimental)
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=26854
^ permalink raw reply [flat|nested] 144+ messages in thread
* [Bug tree-optimization/26854] Inordinate compile times on large routines
2006-03-24 20:25 [Bug c/26854] New: Inordinate compile times on large routines lucier at math dot purdue dot edu
` (20 preceding siblings ...)
2006-12-07 21:51 ` lucier at math dot purdue dot edu
@ 2006-12-08 1:24 ` lucier at math dot purdue dot edu
2006-12-11 6:28 ` lucier at math dot purdue dot edu
` (98 subsequent siblings)
120 siblings, 0 replies; 144+ messages in thread
From: lucier at math dot purdue dot edu @ 2006-12-08 1:24 UTC (permalink / raw)
To: gcc-bugs
------- Comment #22 from lucier at math dot purdue dot edu 2006-12-08 01:24 -------
Subject: Re: Inordinate compile times on large routines
And here's the same data for 4.2.0 branch; Dan, your changes have
clearly helped a lot.
It seems to take about 5% more memory at the maximum, though, on
4.3.0 (6.9GB vs 6.6GB); but both these numbers are just from visual
inspection of "top" as things were running, so they are likely not
accurate.
Brad
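The peak-memory figures above were read off "top" by eye. As a rough alternative, the sketch below records the peak resident set size of a child process after it exits. It assumes a Unix host with Python's standard `resource` module; the helper names `peak_rss_of` and `to_gib` are hypothetical, and the unit of `ru_maxrss` differs by platform (kilobytes on Linux, bytes on macOS), so treat the conversion as an assumption to verify locally.

```python
import resource
import subprocess
import sys


def peak_rss_of(cmd):
    """Run cmd to completion and return (exit status, ru_maxrss).

    RUSAGE_CHILDREN reports the largest peak RSS among all children
    waited for so far by this process, so call this once per fresh
    interpreter if the number has to belong to a single command.
    """
    status = subprocess.call(cmd)
    usage = resource.getrusage(resource.RUSAGE_CHILDREN)
    return status, usage.ru_maxrss


def to_gib(maxrss):
    """Convert ru_maxrss to GiB (assumes kB on Linux, bytes on macOS)."""
    bytes_per_unit = 1 if sys.platform == "darwin" else 1024
    return maxrss * bytes_per_unit / 2**30
```

A call such as `peak_rss_of(["/pkgs/gcc-mainline/bin/gcc", "-O1", "-c", "all.i"])` would then give a number to compare between branches; the flags shown are abbreviated from the command line in this message.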
euler-56% time /pkgs/gcc-4.2.0-test/bin/gcc -no-cpp-precomp -Wall -W
-Wno-unused -O1 -fno-math-errno -fschedule-insns2 -fno-trapping-math
-fno-strict-aliasing -fwrapv -fomit-frame-pointer -fPIC -fno-common
-ftime-report -fmem-report -c all.i
gcc: unrecognized option '-no-cpp-precomp'
Memory still allocated at the end of the compilation process
Size Allocated Used Overhead
8 16k 13k 480
16 8556k 7998k 183k
64 36M 22M 587k
256 4096 2816 56
512 4096 1536 56
1024 224k 222k 3136
2048 40k 26k 560
4096 76k 76k 1064
8192 64k 64k 448
16384 16k 16k 56
32768 64k 64k 112
65536 704k 704k 616
131072 512k 512k 224
262144 768k 768k 168
1048576 6144k 5120k 336
2097152 2048k 2048k 56
112 144k 92k 2016
208 44k 40k 616
192 14M 10M 209k
160 40k 37k 560
176 7996k 4802k 109k
96 19M 16M 267k
416 28k 23k 392
128 10M 7187k 142k
48 18M 9508k 302k
224 420k 384k 5880
32 71M 71M 1279k
80 62M 46M 880k
Total 261M 205M 3979k
String pool
entries 125047
identifiers 125047 (100.00%)
slots 262144
bytes 1675k (137k overhead)
table size 2048k
coll/search 0.8967
ins/search 0.1974
avg. entry 13.72 bytes (+/- 8.99)
longest entry 71
??? tree nodes created
(No per-node statistics)
Type hash: size 1021, 573 elements, 0.687728 collisions
DECL_DEBUG_EXPR hash: size 8191, 2981 elements, 0.983076 collisions
DECL_VALUE_EXPR hash: size 1021, 0 elements, 0.000000 collisions
Execution times (seconds)
garbage collection : 1.10 ( 0%) usr 0.00 ( 0%) sys 1.10 ( 0%) wall 0 kB ( 0%) ggc
callgraph construction: 0.60 ( 0%) usr 0.12 ( 1%) sys 0.74 ( 0%) wall 17017 kB ( 2%) ggc
callgraph optimization: 0.04 ( 0%) usr 0.00 ( 0%) sys 0.04 ( 0%) wall 0 kB ( 0%) ggc
ipa reference : 0.17 ( 0%) usr 0.05 ( 0%) sys 0.25 ( 0%) wall 7 kB ( 0%) ggc
cfg cleanup : 7.68 ( 0%) usr 0.00 ( 0%) sys 7.69 ( 0%) wall 38 kB ( 0%) ggc
trivially dead code : 1.13 ( 0%) usr 0.00 ( 0%) sys 1.10 ( 0%) wall 0 kB ( 0%) ggc
life analysis : 20.05 ( 1%) usr 0.01 ( 0%) sys 20.10 ( 1%) wall 12032 kB ( 2%) ggc
life info update : 0.61 ( 0%) usr 0.00 ( 0%) sys 0.61 ( 0%) wall 0 kB ( 0%) ggc
alias analysis : 0.92 ( 0%) usr 0.00 ( 0%) sys 0.89 ( 0%) wall 7174 kB ( 1%) ggc
register scan : 0.51 ( 0%) usr 0.00 ( 0%) sys 0.50 ( 0%) wall 1 kB ( 0%) ggc
rebuild jump labels : 0.21 ( 0%) usr 0.00 ( 0%) sys 0.20 ( 0%) wall 0 kB ( 0%) ggc
preprocessing : 0.63 ( 0%) usr 0.92 ( 8%) sys 1.50 ( 0%) wall 1794 kB ( 0%) ggc
lexical analysis : 0.61 ( 0%) usr 1.67 (14%) sys 2.39 ( 0%) wall 0 kB ( 0%) ggc
parser : 1.27 ( 0%) usr 0.82 ( 7%) sys 2.30 ( 0%) wall 59584 kB ( 8%) ggc
integration : 0.27 ( 0%) usr 0.10 ( 1%) sys 0.39 ( 0%) wall 0 kB ( 0%) ggc
tree gimplify : 0.68 ( 0%) usr 0.02 ( 0%) sys 0.75 ( 0%) wall 23041 kB ( 3%) ggc
tree eh : 0.09 ( 0%) usr 0.00 ( 0%) sys 0.10 ( 0%) wall 0 kB ( 0%) ggc
tree CFG construction : 0.32 ( 0%) usr 0.06 ( 0%) sys 0.40 ( 0%) wall 59313 kB ( 8%) ggc
tree CFG cleanup : 4.18 ( 0%) usr 0.00 ( 0%) sys 4.20 ( 0%) wall 3716 kB ( 1%) ggc
tree copy propagation : 1.51 ( 0%) usr 0.01 ( 0%) sys 1.55 ( 0%) wall 2219 kB ( 0%) ggc
tree store copy prop : 0.50 ( 0%) usr 0.00 ( 0%) sys 0.51 ( 0%) wall 576 kB ( 0%) ggc
tree find ref. vars : 0.12 ( 0%) usr 0.00 ( 0%) sys 0.13 ( 0%) wall 1186 kB ( 0%) ggc
tree PTA : 857.74 (53%) usr 0.47 ( 4%) sys 859.20 (53%) wall 2331 kB ( 0%) ggc
tree alias analysis : 383.35 (24%) usr 0.64 ( 5%) sys 385.19 (24%) wall 15963 kB ( 2%) ggc
tree PHI insertion : 1.57 ( 0%) usr 0.10 ( 1%) sys 1.75 ( 0%) wall 69532 kB (10%) ggc
tree SSA rewrite : 3.05 ( 0%) usr 0.03 ( 0%) sys 3.07 ( 0%) wall 28127 kB ( 4%) ggc
tree SSA other : 0.18 ( 0%) usr 0.10 ( 1%) sys 0.26 ( 0%) wall 0 kB ( 0%) ggc
tree SSA incremental : 9.45 ( 1%) usr 0.07 ( 1%) sys 9.53 ( 1%) wall 20443 kB ( 3%) ggc
tree operand scan : 6.22 ( 0%) usr 0.53 ( 4%) sys 7.09 ( 0%) wall 26490 kB ( 4%) ggc
dominator optimization: 3.86 ( 0%) usr 0.05 ( 0%) sys 3.92 ( 0%) wall 48855 kB ( 7%) ggc
tree STORE-CCP : 0.47 ( 0%) usr 0.00 ( 0%) sys 0.47 ( 0%) wall 8 kB ( 0%) ggc
tree CCP : 0.63 ( 0%) usr 0.00 ( 0%) sys 0.63 ( 0%) wall 16 kB ( 0%) ggc
tree PHI const/copy prop: 0.25 ( 0%) usr 0.00 ( 0%) sys 0.26 ( 0%) wall 9 kB ( 0%) ggc
tree split crit edges : 0.12 ( 0%) usr 0.04 ( 0%) sys 0.17 ( 0%) wall 27005 kB ( 4%) ggc
tree reassociation : 0.53 ( 0%) usr 0.00 ( 0%) sys 0.53 ( 0%) wall 0 kB ( 0%) ggc
tree FRE : 181.71 (11%) usr 0.02 ( 0%) sys 181.71 (11%) wall 18940 kB ( 3%) ggc
tree code sinking : 0.57 ( 0%) usr 0.00 ( 0%) sys 0.58 ( 0%) wall 0 kB ( 0%) ggc
tree linearize phis : 0.13 ( 0%) usr 0.00 ( 0%) sys 0.13 ( 0%) wall 0 kB ( 0%) ggc
tree forward propagate: 0.14 ( 0%) usr 0.00 ( 0%) sys 0.15 ( 0%) wall 0 kB ( 0%) ggc
tree conservative DCE : 1.32 ( 0%) usr 0.00 ( 0%) sys 1.31 ( 0%) wall 0 kB ( 0%) ggc
tree aggressive DCE : 0.52 ( 0%) usr 0.00 ( 0%) sys 0.51 ( 0%) wall 0 kB ( 0%) ggc
tree DSE : 0.41 ( 0%) usr 0.00 ( 0%) sys 0.40 ( 0%) wall 1 kB ( 0%) ggc
PHI merge : 0.01 ( 0%) usr 0.00 ( 0%) sys 0.01 ( 0%) wall 2 kB ( 0%) ggc
tree loop bounds : 0.24 ( 0%) usr 0.00 ( 0%) sys 0.23 ( 0%) wall 0 kB ( 0%) ggc
loop invariant motion : 0.37 ( 0%) usr 0.00 ( 0%) sys 0.37 ( 0%) wall 0 kB ( 0%) ggc
tree canonical iv : 0.12 ( 0%) usr 0.00 ( 0%) sys 0.12 ( 0%) wall 0 kB ( 0%) ggc
scev constant prop : 0.67 ( 0%) usr 0.00 ( 0%) sys 0.66 ( 0%) wall 1756 kB ( 0%) ggc
complete unrolling : 0.06 ( 0%) usr 0.00 ( 0%) sys 0.06 ( 0%) wall 0 kB ( 0%) ggc
tree iv optimization : 0.01 ( 0%) usr 0.00 ( 0%) sys 0.01 ( 0%) wall 0 kB ( 0%) ggc
tree loop init : 2.35 ( 0%) usr 0.08 ( 1%) sys 2.47 ( 0%) wall 45903 kB ( 6%) ggc
tree copy headers : 0.13 ( 0%) usr 0.00 ( 0%) sys 0.13 ( 0%) wall 1 kB ( 0%) ggc
tree SSA uncprop : 0.27 ( 0%) usr 0.00 ( 0%) sys 0.26 ( 0%) wall 0 kB ( 0%) ggc
tree SSA to normal : 56.18 ( 4%) usr 0.05 ( 0%) sys 56.26 ( 3%) wall 55617 kB ( 8%) ggc
tree rename SSA copies: 0.45 ( 0%) usr 0.00 ( 0%) sys 0.45 ( 0%) wall 0 kB ( 0%) ggc
dominance frontiers : 0.49 ( 0%) usr 0.00 ( 0%) sys 0.51 ( 0%) wall 0 kB ( 0%) ggc
dominance computation : 2.43 ( 0%) usr 0.01 ( 0%) sys 2.39 ( 0%) wall 0 kB ( 0%) ggc
expand : 7.24 ( 0%) usr 0.08 ( 1%) sys 7.44 ( 0%) wall 95504 kB (13%) ggc
jump : 0.11 ( 0%) usr 0.00 ( 0%) sys 0.11 ( 0%) wall 0 kB ( 0%) ggc
CSE : 0.82 ( 0%) usr 0.00 ( 0%) sys 0.83 ( 0%) wall 1108 kB ( 0%) ggc
loop analysis : 18.66 ( 1%) usr 5.45 (45%) sys 24.87 ( 2%) wall 5844 kB ( 1%) ggc
branch prediction : 0.92 ( 0%) usr 0.01 ( 0%) sys 0.93 ( 0%) wall 1532 kB ( 0%) ggc
flow analysis : 0.17 ( 0%) usr 0.00 ( 0%) sys 0.15 ( 0%) wall 0 kB ( 0%) ggc
combiner : 1.60 ( 0%) usr 0.02 ( 0%) sys 1.63 ( 0%) wall 19153 kB ( 3%) ggc
if-conversion : 0.63 ( 0%) usr 0.01 ( 0%) sys 0.65 ( 0%) wall 365 kB ( 0%) ggc
mode switching : 0.01 ( 0%) usr 0.00 ( 0%) sys 0.01 ( 0%) wall 0 kB ( 0%) ggc
local alloc : 1.32 ( 0%) usr 0.01 ( 0%) sys 1.32 ( 0%) wall 5154 kB ( 1%) ggc
global alloc : 7.14 ( 0%) usr 0.37 ( 3%) sys 7.57 ( 0%) wall 9514 kB ( 1%) ggc
reload CSE regs : 0.82 ( 0%) usr 0.00 ( 0%) sys 0.83 ( 0%) wall 11516 kB ( 2%) ggc
flow 2 : 0.40 ( 0%) usr 0.00 ( 0%) sys 0.41 ( 0%) wall 2940 kB ( 0%) ggc
if-conversion 2 : 0.23 ( 0%) usr 0.00 ( 0%) sys 0.25 ( 0%) wall 1 kB ( 0%) ggc
rename registers : 0.58 ( 0%) usr 0.03 ( 0%) sys 0.60 ( 0%) wall 15 kB ( 0%) ggc
scheduling 2 : 2.33 ( 0%) usr 0.03 ( 0%) sys 2.38 ( 0%) wall 10644 kB ( 1%) ggc
machine dep reorg : 0.49 ( 0%) usr 0.00 ( 0%) sys 0.49 ( 0%) wall 91 kB ( 0%) ggc
final : 1.23 ( 0%) usr 0.02 ( 0%) sys 1.25 ( 0%) wall 2050 kB ( 0%) ggc
TOTAL :1604.05 12.16 1620.25 716501 kB
1604.799u 13.117s 27:05.13 99.5% 0+0k 0+0io 0pf+0w
euler-57% /pkgs/gcc-4.2.0-test/bin/gcc -v
Using built-in specs.
Target: x86_64-unknown-linux-gnu
Configured with: ../configure --prefix=/pkgs/gcc-4.2.0-test
--with-gmp=/pkgs/gmp-4.2.1 --with-mpfr=/pkgs/gmp-4.2.1 --enable-checking=no
--enable-languages=c
Thread model: posix
gcc version 4.2.0 20061207 (prerelease)
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=26854
^ permalink raw reply [flat|nested] 144+ messages in thread
* [Bug tree-optimization/26854] Inordinate compile times on large routines
2006-03-24 20:25 [Bug c/26854] New: Inordinate compile times on large routines lucier at math dot purdue dot edu
` (21 preceding siblings ...)
2006-12-08 1:24 ` lucier at math dot purdue dot edu
@ 2006-12-11 6:28 ` lucier at math dot purdue dot edu
2007-01-10 18:49 ` lucier at math dot purdue dot edu
` (97 subsequent siblings)
120 siblings, 0 replies; 144+ messages in thread
From: lucier at math dot purdue dot edu @ 2006-12-11 6:28 UTC (permalink / raw)
To: gcc-bugs
------- Comment #23 from lucier at math dot purdue dot edu 2006-12-11 06:27 -------
Subject: Re: Inordinate compile times on large routines
After Andrew MacLeod's changes here
http://gcc.gnu.org/ml/gcc-patches/2006-12/msg00691.html
I see
tree SSA to normal : 5.23 ( 1%) usr 0.06 ( 0%) sys 5.30 ( 1%) wall 52594 kB ( 7%) ggc
instead of
tree SSA to normal : 30.22 ( 7%) usr 0.07 ( 1%) sys 30.74 ( 6%) wall 54480 kB ( 7%) ggc
Very nice.
Other passes with noticeable run times remaining are
tree alias analysis : 125.17 (28%) usr 0.52 ( 4%) sys 126.90 (27%) wall 18481 kB ( 3%) ggc
tree FRE : 207.84 (46%) usr 0.19 ( 1%) sys 208.56 (45%) wall 23470 kB ( 3%) ggc
Brad
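Picking the hot passes out of a long -ftime-report listing by eye is error-prone, so here is a minimal sketch that ranks passes by user time. It assumes the row layout shown in this thread ("name : usr (NN%) usr sys (NN%) sys wall (NN%) wall ..."); the names `ROW`, `unwrap`, and `hot_passes` are hypothetical, and rows that a mail client wrapped before the "(NN%) wall" column are rejoined first.

```python
import re

# One -ftime-report row as it appears in this thread, e.g.
#   tree FRE : 207.84 (46%) usr 0.19 ( 1%) sys 208.56 (45%) wall ...
ROW = re.compile(
    r"^\s*(?P<name>.+?)\s*:\s*"
    r"(?P<usr>\d+\.\d+)\s*\(\s*(?P<usr_pct>\d+)%\)\s*usr\s+"
    r"(?P<sys>\d+\.\d+)\s*\(\s*\d+%\)\s*sys\s+"
    r"(?P<wall>\d+\.\d+)\s*\(\s*\d+%\)\s*wall"
)


def unwrap(text):
    """Rejoin rows whose tail, starting at '(NN%) wall', was wrapped."""
    rows = []
    for line in text.splitlines():
        stripped = line.strip()
        if rows and stripped.startswith("("):
            rows[-1] = rows[-1].rstrip() + " " + stripped
        else:
            rows.append(line)
    return rows


def hot_passes(text, min_usr_pct=5):
    """Return (pass name, usr seconds, usr %) sorted by usr time, descending."""
    found = []
    for row in unwrap(text):
        m = ROW.match(row)
        if m and int(m.group("usr_pct")) >= min_usr_pct:
            found.append((m.group("name"), float(m.group("usr")),
                          int(m.group("usr_pct"))))
    return sorted(found, key=lambda item: item[1], reverse=True)
```

Fed the three rows quoted in this message with the default 5% threshold, it keeps tree FRE and tree alias analysis and drops tree SSA to normal.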
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=26854
^ permalink raw reply [flat|nested] 144+ messages in thread
* [Bug tree-optimization/26854] Inordinate compile times on large routines
2006-03-24 20:25 [Bug c/26854] New: Inordinate compile times on large routines lucier at math dot purdue dot edu
` (22 preceding siblings ...)
2006-12-11 6:28 ` lucier at math dot purdue dot edu
@ 2007-01-10 18:49 ` lucier at math dot purdue dot edu
2007-01-10 19:48 ` amacleod at redhat dot com
` (96 subsequent siblings)
120 siblings, 0 replies; 144+ messages in thread
From: lucier at math dot purdue dot edu @ 2007-01-10 18:49 UTC (permalink / raw)
To: gcc-bugs
------- Comment #24 from lucier at math dot purdue dot edu 2007-01-10 18:49 -------
Tried it again with today's 4.2.0:
euler-34% /pkgs/gcc-4.2.0-test/bin/gcc -v
Using built-in specs.
Target: x86_64-unknown-linux-gnu
Configured with: ../configure --prefix=/pkgs/gcc-4.2.0-test
--with-gmp=/pkgs/gmp-4.2.1 --with-mpfr=/pkgs/gmp-4.2.1
Thread model: posix
gcc version 4.2.0 20070110 (prerelease)
The two hot spots were
tree SSA to normal : 52.63 (16%) usr 0.04 ( 0%) sys 52.69 (15%) wall 55617 kB ( 8%) ggc
tree FRE : 150.81 (46%) usr 0.20 ( 2%) sys 154.00 (45%) wall 18940 kB ( 3%) ggc
while
tree alias analysis : 1.94 ( 1%) usr 0.58 ( 5%) sys 2.18 ( 1%) wall 387 kB ( 0%) ggc
is now very low.
Is there a patch that can be back-ported from mainline to fix tree SSA to
normal? On mainline there is the tremendous result of
tree SSA to normal : 5.23 ( 1%) usr 0.06 ( 0%) sys 5.30 ( 1%) wall 52594 kB ( 7%) ggc
On another note, can I change this to be reported against 4.2.0?
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=26854
^ permalink raw reply [flat|nested] 144+ messages in thread
* [Bug tree-optimization/26854] Inordinate compile times on large routines
2006-03-24 20:25 [Bug c/26854] New: Inordinate compile times on large routines lucier at math dot purdue dot edu
` (23 preceding siblings ...)
2007-01-10 18:49 ` lucier at math dot purdue dot edu
@ 2007-01-10 19:48 ` amacleod at redhat dot com
2007-11-14 9:56 ` steven at gcc dot gnu dot org
` (95 subsequent siblings)
120 siblings, 0 replies; 144+ messages in thread
From: amacleod at redhat dot com @ 2007-01-10 19:48 UTC (permalink / raw)
To: gcc-bugs
------- Comment #25 from amacleod at redhat dot com 2007-01-10 19:47 -------
There were numerous factors in the mainline speedup of SSA->normal, including a
massive rewrite, but a couple of the big wins are backportable and were in fact
considered. They simply came too late in stage 3 at the time.
I don't remember which one(s) affected this test case the most.
live range speedup:
http://gcc.gnu.org/ml/gcc-patches/2006-08/msg00895.html
TER speedup
http://gcc.gnu.org/ml/gcc-patches/2006-08/msg00896.html
This one was applied later, and may or may not make any difference whatsoever.
It changes the coalesce list from a linked list to a hash table. I seem to
recall only one test case it affected.
http://gcc.gnu.org/ml/gcc-patches/2006-11/msg01515.html
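To illustrate why swapping a linked list for a hash table can matter on a very large routine, here is a minimal sketch; it is not GCC's actual code. The functions `add_cost_list` and `add_cost_hash` and the (partition, partition) key are hypothetical stand-ins for recording a coalesce candidate and its cost.

```python
def add_cost_list(pairs, a, b, cost):
    """List version: scan every existing entry for the pair, O(n) per add."""
    key = (min(a, b), max(a, b))
    for entry in pairs:
        if entry[0] == key:
            entry[1] += cost
            return
    pairs.append([key, cost])


def add_cost_hash(table, a, b, cost):
    """Hash version: one dictionary lookup per add, O(1) on average."""
    key = (min(a, b), max(a, b))
    table[key] = table.get(key, 0) + cost
```

Both versions accumulate the same totals. The difference is that the list version does work proportional to the number of pairs already recorded on every add, which becomes quadratic overall when a single function produces a very large number of candidates.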
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=26854
^ permalink raw reply [flat|nested] 144+ messages in thread
* [Bug tree-optimization/26854] Inordinate compile times on large routines
2006-03-24 20:25 [Bug c/26854] New: Inordinate compile times on large routines lucier at math dot purdue dot edu
` (24 preceding siblings ...)
2007-01-10 19:48 ` amacleod at redhat dot com
@ 2007-11-14 9:56 ` steven at gcc dot gnu dot org
2007-11-14 10:07 ` rguenth at gcc dot gnu dot org
` (94 subsequent siblings)
120 siblings, 0 replies; 144+ messages in thread
From: steven at gcc dot gnu dot org @ 2007-11-14 9:56 UTC (permalink / raw)
To: gcc-bugs
------- Comment #26 from steven at gcc dot gnu dot org 2007-11-14 09:56 -------
Could someone test this with GCC 4.3, and report the results here?
--
steven at gcc dot gnu dot org changed:
What |Removed |Added
----------------------------------------------------------------------------
Status|UNCONFIRMED |WAITING
* [Bug tree-optimization/26854] Inordinate compile times on large routines
From: rguenth at gcc dot gnu dot org @ 2007-11-14 10:07 UTC (permalink / raw)
To: gcc-bugs
------- Comment #27 from rguenth at gcc dot gnu dot org 2007-11-14 10:07 -------
http://www.suse.de/~gcctest/c++bench/random/ tracks this testcase (on x86_64
that is).
* [Bug tree-optimization/26854] Inordinate compile times on large routines
From: steven at gcc dot gnu dot org @ 2007-11-14 12:04 UTC (permalink / raw)
To: gcc-bugs
------- Comment #28 from steven at gcc dot gnu dot org 2007-11-14 12:04 -------
Then I suggest we close this bug report.
* [Bug tree-optimization/26854] Inordinate compile times on large routines
From: lucier at math dot purdue dot edu @ 2007-11-14 12:40 UTC (permalink / raw)
To: gcc-bugs
------- Comment #29 from lucier at math dot purdue dot edu 2007-11-14 12:40 -------
Subject: Re: Inordinate compile times on large routines
It appears to me from the raw logs at
http://www.suse.de/~gcctest/c++bench/random/
that every run except the -O0 one fails with an out-of-memory error,
so I don't know what this is really testing.
The relevant excerpt from the logs follows.
> TEST: pr26854.c
> total: 782967 kB
>
> Execution times (seconds)
> garbage collection : 0.88 ( 2%) usr 0.01 ( 0%) sys 0.90 ( 2%) wall 0 kB ( 0%) ggc
> callgraph construction: 0.01 ( 0%) usr 0.00 ( 0%) sys 0.02 ( 0%) wall 0 kB ( 0%) ggc
> cfg cleanup : 3.13 ( 7%) usr 0.00 ( 0%) sys 3.14 ( 7%) wall 186 kB ( 0%) ggc
> trivially dead code : 0.16 ( 0%) usr 0.00 ( 0%) sys 0.15 ( 0%) wall 0 kB ( 0%) ggc
> df live regs : 0.30 ( 1%) usr 0.01 ( 0%) sys 0.31 ( 1%) wall 0 kB ( 0%) ggc
> df reg dead/unused notes: 0.26 ( 1%) usr 0.01 ( 0%) sys 0.27 ( 1%) wall 12048 kB ( 2%) ggc
> register information : 0.31 ( 1%) usr 0.00 ( 0%) sys 0.31 ( 1%) wall 0 kB ( 0%) ggc
> alias analysis : 0.24 ( 1%) usr 0.00 ( 0%) sys 0.24 ( 1%) wall 4096 kB ( 1%) ggc
> rebuild jump labels : 0.27 ( 1%) usr 0.00 ( 0%) sys 0.28 ( 1%) wall 0 kB ( 0%) ggc
> preprocessing : 0.83 ( 2%) usr 0.98 (19%) sys 1.87 ( 4%) wall 2978 kB ( 1%) ggc
> lexical analysis : 0.64 ( 2%) usr 1.80 (35%) sys 2.57 ( 5%) wall 0 kB ( 0%) ggc
> parser : 2.57 ( 6%) usr 1.31 (26%) sys 3.70 ( 8%) wall 106641 kB (22%) ggc
> inline heuristics : 0.78 ( 2%) usr 0.18 ( 4%) sys 0.97 ( 2%) wall 0 kB ( 0%) ggc
> tree gimplify : 1.21 ( 3%) usr 0.11 ( 2%) sys 1.32 ( 3%) wall 90819 kB (19%) ggc
> tree eh : 0.11 ( 0%) usr 0.00 ( 0%) sys 0.10 ( 0%) wall 0 kB ( 0%) ggc
> tree CFG construction : 0.48 ( 1%) usr 0.05 ( 1%) sys 0.54 ( 1%) wall 68530 kB (14%) ggc
> tree CFG cleanup : 0.17 ( 0%) usr 0.00 ( 0%) sys 0.15 ( 0%) wall 0 kB ( 0%) ggc
> dominance computation : 0.16 ( 0%) usr 0.03 ( 1%) sys 0.20 ( 0%) wall 0 kB ( 0%) ggc
> expand : 3.52 ( 8%) usr 0.30 ( 6%) sys 3.82 ( 8%) wall 130942 kB (27%) ggc
> varconst : 0.07 ( 0%) usr 0.00 ( 0%) sys 0.06 ( 0%) wall 1571 kB ( 0%) ggc
> jump : 0.04 ( 0%) usr 0.00 ( 0%) sys 0.04 ( 0%) wall 0 kB ( 0%) ggc
> local alloc : 2.49 ( 6%) usr 0.04 ( 1%) sys 2.53 ( 5%) wall 4099 kB ( 1%) ggc
> global alloc : 21.22 (50%) usr 0.13 ( 3%) sys 21.36 (45%) wall 48602 kB (10%) ggc
> thread pro- & epilogue: 0.08 ( 0%) usr 0.00 ( 0%) sys 0.08 ( 0%) wall 3 kB ( 0%) ggc
> final : 2.14 ( 5%) usr 0.09 ( 2%) sys 2.21 ( 5%) wall 763 kB ( 0%) ggc
> symout : 0.19 ( 0%) usr 0.02 ( 0%) sys 0.22 ( 0%) wall 16699 kB ( 3%) ggc
> TOTAL : 42.26 5.08 47.39 488944 kB
> TIME: 44.33
> FILESIZE: text data bss dec hex filename 2899378 423808 3040 3326226 32c112 ./out.o
>
> cc1: out of memory allocating 4064 bytes after a total of 1020792832 bytes
> total: 1884587 kB
>
> cc1: out of memory allocating 4064 bytes after a total of 1020895232 bytes
> Command exited with non-zero status 1
> TIME: 79.98
> FILESIZE: text data bss dec hex filename 12492 872 336 13700 3584 ./out.o
>
> cc1: out of memory allocating 4064 bytes after a total of 993718272 bytes
> total: 1884827 kB
>
> cc1: out of memory allocating 4064 bytes after a total of 993726464 bytes
> Command exited with non-zero status 1
> TIME: 132.93
> FILESIZE: text data bss dec hex filename 12492 872 336 13700 3584 ./out.o
>
> cc1: out of memory allocating 4064 bytes after a total of 916152320 bytes
> total: 1884835 kB
>
> cc1: out of memory allocating 4064 bytes after a total of 916217856 bytes
> Command exited with non-zero status 1
> TIME: 143.76
> FILESIZE: text data bss dec hex filename 12492 872 336 13700 3584 ./out.o
* [Bug tree-optimization/26854] Inordinate compile times on large routines
From: rguenth at gcc dot gnu dot org @ 2007-11-14 13:14 UTC (permalink / raw)
To: gcc-bugs
------- Comment #30 from rguenth at gcc dot gnu dot org 2007-11-14 13:13 -------
Right - the tester is artificially limited to using 1GB of RAM. I probably
need to fix the setup to report errors instead of "so far" numbers in the
OOM cases.
* [Bug tree-optimization/26854] Inordinate compile times on large routines
From: lucier at math dot purdue dot edu @ 2007-11-14 13:38 UTC (permalink / raw)
To: gcc-bugs
------- Comment #31 from lucier at math dot purdue dot edu 2007-11-14 13:37 -------
Subject: Re: Inordinate compile times on large routines
To answer Steven's original question, here is a run with
euler-20% /pkgs/gcc-mainline/bin/gcc -v
Using built-in specs.
Target: x86_64-unknown-linux-gnu
Configured with: ../../mainline/configure --prefix=/pkgs/gcc-mainline --enable-languages=c --enable-checking=release --with-gmp=/pkgs/gmp-4.2.2 --with-mpfr=/pkgs/gmp-4.2.2
Thread model: posix
gcc version 4.3.0 20071026 (experimental) [trunk revision 129664] (GCC)
Memory usage peaked at 10.3GB (just from monitoring top).
Brad
euler-19% time /pkgs/gcc-mainline/bin/gcc -Wall -W -Wno-unused -O1 -fno-math-errno -fschedule-insns2 -fno-trapping-math -fno-strict-aliasing -fwrapv -fomit-frame-pointer -fPIC -I/usr/local/Gambit-C/include/ -ftime-report -fmem-report -c all.i
Memory still allocated at the end of the compilation process
Size Allocated Used Overhead
8 4096 16 120
16 108k 18k 2376
128 8192 2816 112
256 504k 464k 7056
512 4096 1024 56
1024 112k 110k 1568
2048 28k 22k 392
4096 76k 76k 1064
8192 48k 48k 336
16384 32k 32k 112
32768 32k 32k 56
131072 256k 256k 112
262144 512k 512k 112
524288 1024k 1024k 112
1048576 2048k 2048k 112
160 2764k 2669k 37k
176 144k 126k 2016
432 28k 21k 392
96 65M 14M 918k
48 2100k 1171k 32k
208 688k 325k 9632
64 1288k 1237k 20k
32 172k 64k 3096
80 30M 2060k 421k
Total 107M 26M 1459k
String pool
entries 159078
identifiers 159078 (100.00%)
slots 262144
bytes 1992k (170k overhead)
table size 2048k
coll/search 0.8632
ins/search 0.2065
avg. entry 12.83 bytes (+/- 7.80)
longest entry 67
??? tree nodes created
(No per-node statistics)
Type hash: size 2039, 919 elements, 0.860792 collisions
DECL_DEBUG_EXPR hash: size 16381, 0 elements, 1.211012 collisions
DECL_VALUE_EXPR hash: size 1021, 0 elements, 0.000000 collisions
Execution times (seconds)
garbage collection : 1.19 ( 0%) usr 0.00 ( 0%) sys 1.19 ( 0%) wall 0 kB ( 0%) ggc
callgraph construction: 0.76 ( 0%) usr 0.11 ( 1%) sys 0.88 ( 0%) wall 33780 kB ( 4%) ggc
callgraph optimization: 1.23 ( 1%) usr 0.00 ( 0%) sys 1.23 ( 0%) wall 6 kB ( 0%) ggc
ipa reference : 0.22 ( 0%) usr 0.03 ( 0%) sys 0.25 ( 0%) wall 7 kB ( 0%) ggc
cfg cleanup : 2.17 ( 1%) usr 0.01 ( 0%) sys 2.17 ( 1%) wall 162 kB ( 0%) ggc
trivially dead code : 0.36 ( 0%) usr 0.00 ( 0%) sys 0.37 ( 0%) wall 0 kB ( 0%) ggc
df reaching defs : 10.08 ( 4%) usr 4.09 (24%) sys 14.18 ( 6%) wall 0 kB ( 0%) ggc
df live regs : 7.77 ( 3%) usr 0.01 ( 0%) sys 7.77 ( 3%) wall 0 kB ( 0%) ggc
df live&initialized regs: 82.60 (35%) usr 2.60 (15%) sys 85.23 (33%) wall 0 kB ( 0%) ggc
df use-def / def-use chains: 8.23 ( 3%) usr 2.51 (14%) sys 10.73 ( 4%) wall 0 kB ( 0%) ggc
df reg dead/unused notes: 0.97 ( 0%) usr 0.00 ( 0%) sys 0.97 ( 0%) wall 10939 kB ( 1%) ggc
register information : 0.52 ( 0%) usr 0.00 ( 0%) sys 0.55 ( 0%) wall 0 kB ( 0%) ggc
alias analysis : 0.90 ( 0%) usr 0.00 ( 0%) sys 0.89 ( 0%) wall 7168 kB ( 1%) ggc
register scan : 0.10 ( 0%) usr 0.00 ( 0%) sys 0.10 ( 0%) wall 4 kB ( 0%) ggc
rebuild jump labels : 0.34 ( 0%) usr 0.00 ( 0%) sys 0.34 ( 0%) wall 0 kB ( 0%) ggc
preprocessing : 0.62 ( 0%) usr 0.96 ( 6%) sys 1.66 ( 1%) wall 2932 kB ( 0%) ggc
lexical analysis : 0.62 ( 0%) usr 1.98 (11%) sys 2.31 ( 1%) wall 0 kB ( 0%) ggc
parser : 1.29 ( 1%) usr 0.86 ( 5%) sys 2.37 ( 1%) wall 68897 kB ( 8%) ggc
inline heuristics : 0.67 ( 0%) usr 0.17 ( 1%) sys 0.84 ( 0%) wall 0 kB ( 0%) ggc
tree gimplify : 1.11 ( 0%) usr 0.06 ( 0%) sys 1.16 ( 0%) wall 63192 kB ( 8%) ggc
tree eh : 0.11 ( 0%) usr 0.00 ( 0%) sys 0.12 ( 0%) wall 0 kB ( 0%) ggc
tree CFG construction : 0.51 ( 0%) usr 0.06 ( 0%) sys 0.57 ( 0%) wall 68527 kB ( 8%) ggc
tree CFG cleanup : 7.12 ( 3%) usr 0.00 ( 0%) sys 7.10 ( 3%) wall 3525 kB ( 0%) ggc
tree copy propagation : 2.01 ( 1%) usr 0.05 ( 0%) sys 2.06 ( 1%) wall 5125 kB ( 1%) ggc
tree store copy prop : 0.49 ( 0%) usr 0.00 ( 0%) sys 0.49 ( 0%) wall 576 kB ( 0%) ggc
tree find ref. vars : 0.14 ( 0%) usr 0.00 ( 0%) sys 0.15 ( 0%) wall 1826 kB ( 0%) ggc
tree PTA : 1.93 ( 1%) usr 0.13 ( 1%) sys 2.06 ( 1%) wall 3734 kB ( 0%) ggc
tree alias analysis : 0.11 ( 0%) usr 0.08 ( 0%) sys 0.20 ( 0%) wall 0 kB ( 0%) ggc
tree call clobbering : 0.02 ( 0%) usr 0.00 ( 0%) sys 0.02 ( 0%) wall 0 kB ( 0%) ggc
tree flow sensitive alias: 0.17 ( 0%) usr 0.00 ( 0%) sys 0.17 ( 0%) wall 2146 kB ( 0%) ggc
tree memory partitioning: 1.24 ( 1%) usr 0.00 ( 0%) sys 1.25 ( 0%) wall 0 kB ( 0%) ggc
tree PHI insertion : 0.61 ( 0%) usr 0.04 ( 0%) sys 0.65 ( 0%) wall 18541 kB ( 2%) ggc
tree SSA rewrite : 1.94 ( 1%) usr 0.03 ( 0%) sys 1.95 ( 1%) wall 35021 kB ( 4%) ggc
tree SSA other : 0.17 ( 0%) usr 0.12 ( 1%) sys 0.30 ( 0%) wall 0 kB ( 0%) ggc
tree SSA incremental : 8.55 ( 4%) usr 0.08 ( 0%) sys 8.64 ( 3%) wall 14256 kB ( 2%) ggc
tree operand scan : 0.71 ( 0%) usr 0.22 ( 1%) sys 0.91 ( 0%) wall 28110 kB ( 3%) ggc
dominator optimization: 2.73 ( 1%) usr 0.02 ( 0%) sys 2.75 ( 1%) wall 42635 kB ( 5%) ggc
tree SRA : 0.00 ( 0%) usr 0.00 ( 0%) sys 0.01 ( 0%) wall 0 kB ( 0%) ggc
tree STORE-CCP : 0.57 ( 0%) usr 0.00 ( 0%) sys 0.57 ( 0%) wall 1024 kB ( 0%) ggc
tree CCP : 1.18 ( 0%) usr 0.01 ( 0%) sys 1.19 ( 0%) wall 1537 kB ( 0%) ggc
tree PHI const/copy prop: 0.24 ( 0%) usr 0.00 ( 0%) sys 0.23 ( 0%) wall 11 kB ( 0%) ggc
tree split crit edges : 0.11 ( 0%) usr 0.02 ( 0%) sys 0.13 ( 0%) wall 33706 kB ( 4%) ggc
tree reassociation : 0.61 ( 0%) usr 0.00 ( 0%) sys 0.62 ( 0%) wall 1 kB ( 0%) ggc
tree FRE : 2.72 ( 1%) usr 0.06 ( 0%) sys 2.77 ( 1%) wall 49006 kB ( 6%) ggc
tree code sinking : 0.47 ( 0%) usr 0.00 ( 0%) sys 0.48 ( 0%) wall 6 kB ( 0%) ggc
tree linearize phis : 0.29 ( 0%) usr 0.00 ( 0%) sys 0.27 ( 0%) wall 0 kB ( 0%) ggc
tree forward propagate: 0.32 ( 0%) usr 0.00 ( 0%) sys 0.33 ( 0%) wall 426 kB ( 0%) ggc
tree conservative DCE : 1.60 ( 1%) usr 0.00 ( 0%) sys 1.61 ( 1%) wall 0 kB ( 0%) ggc
tree aggressive DCE : 0.35 ( 0%) usr 0.00 ( 0%) sys 0.35 ( 0%) wall 0 kB ( 0%) ggc
tree DSE : 0.35 ( 0%) usr 0.00 ( 0%) sys 0.36 ( 0%) wall 1 kB ( 0%) ggc
PHI merge : 0.06 ( 0%) usr 0.00 ( 0%) sys 0.06 ( 0%) wall 7192 kB ( 1%) ggc
tree loop bounds : 0.16 ( 0%) usr 0.00 ( 0%) sys 0.17 ( 0%) wall 2 kB ( 0%) ggc
loop invariant motion : 0.32 ( 0%) usr 0.00 ( 0%) sys 0.32 ( 0%) wall 0 kB ( 0%) ggc
tree canonical iv : 0.03 ( 0%) usr 0.00 ( 0%) sys 0.03 ( 0%) wall 0 kB ( 0%) ggc
scev constant prop : 0.63 ( 0%) usr 0.01 ( 0%) sys 0.64 ( 0%) wall 17787 kB ( 2%) ggc
complete unrolling : 0.01 ( 0%) usr 0.00 ( 0%) sys 0.01 ( 0%) wall 0 kB ( 0%) ggc
tree loop init : 3.12 ( 1%) usr 0.08 ( 0%) sys 3.22 ( 1%) wall 45438 kB ( 6%) ggc
tree loop fini : 0.01 ( 0%) usr 0.00 ( 0%) sys 0.01 ( 0%) wall 0 kB ( 0%) ggc
tree copy headers : 0.06 ( 0%) usr 0.00 ( 0%) sys 0.06 ( 0%) wall 0 kB ( 0%) ggc
tree SSA uncprop : 0.26 ( 0%) usr 0.00 ( 0%) sys 0.26 ( 0%) wall 0 kB ( 0%) ggc
tree SSA to normal : 11.49 ( 5%) usr 0.08 ( 0%) sys 11.56 ( 5%) wall 83279 kB (10%) ggc
tree rename SSA copies: 0.53 ( 0%) usr 0.02 ( 0%) sys 0.56 ( 0%) wall 0 kB ( 0%) ggc
dominance frontiers : 0.46 ( 0%) usr 0.00 ( 0%) sys 0.47 ( 0%) wall 0 kB ( 0%) ggc
dominance computation : 2.40 ( 1%) usr 0.03 ( 0%) sys 2.40 ( 1%) wall 0 kB ( 0%) ggc
expand : 14.26 ( 6%) usr 1.89 (11%) sys 16.13 ( 6%) wall 92077 kB (11%) ggc
lower subreg : 0.24 ( 0%) usr 0.00 ( 0%) sys 0.24 ( 0%) wall 0 kB ( 0%) ggc
jump : 0.04 ( 0%) usr 0.00 ( 0%) sys 0.04 ( 0%) wall 0 kB ( 0%) ggc
CSE : 0.78 ( 0%) usr 0.00 ( 0%) sys 0.77 ( 0%) wall 1426 kB ( 0%) ggc
dead code elimination : 0.51 ( 0%) usr 0.00 ( 0%) sys 0.51 ( 0%) wall 0 kB ( 0%) ggc
dead store elim1 : 0.42 ( 0%) usr 0.06 ( 0%) sys 0.48 ( 0%) wall 7944 kB ( 1%) ggc
dead store elim2 : 0.48 ( 0%) usr 0.02 ( 0%) sys 0.49 ( 0%) wall 8878 kB ( 1%) ggc
loop analysis : 0.60 ( 0%) usr 0.01 ( 0%) sys 0.63 ( 0%) wall 70 kB ( 0%) ggc
branch prediction : 0.96 ( 0%) usr 0.02 ( 0%) sys 0.97 ( 0%) wall 1541 kB ( 0%) ggc
combiner : 2.64 ( 1%) usr 0.04 ( 0%) sys 2.67 ( 1%) wall 27876 kB ( 3%) ggc
if-conversion : 1.36 ( 1%) usr 0.01 ( 0%) sys 1.37 ( 1%) wall 667 kB ( 0%) ggc
local alloc : 4.09 ( 2%) usr 0.02 ( 0%) sys 4.11 ( 2%) wall 7074 kB ( 1%) ggc
global alloc : 26.15 (11%) usr 0.38 ( 2%) sys 26.54 (10%) wall 5112 kB ( 1%) ggc
reload CSE regs : 1.20 ( 1%) usr 0.01 ( 0%) sys 1.21 ( 0%) wall 12243 kB ( 1%) ggc
thread pro- & epilogue: 0.10 ( 0%) usr 0.00 ( 0%) sys 0.10 ( 0%) wall 4 kB ( 0%) ggc
if-conversion 2 : 0.38 ( 0%) usr 0.00 ( 0%) sys 0.38 ( 0%) wall 82 kB ( 0%) ggc
rename registers : 0.61 ( 0%) usr 0.04 ( 0%) sys 0.65 ( 0%) wall 31 kB ( 0%) ggc
scheduling 2 : 2.61 ( 1%) usr 0.07 ( 0%) sys 2.70 ( 1%) wall 0 kB ( 0%) ggc
machine dep reorg : 0.51 ( 0%) usr 0.00 ( 0%) sys 0.51 ( 0%) wall 146 kB ( 0%) ggc
reorder blocks : 0.26 ( 0%) usr 0.01 ( 0%) sys 0.26 ( 0%) wall 6770 kB ( 1%) ggc
final : 1.20 ( 1%) usr 0.03 ( 0%) sys 1.22 ( 0%) wall 0 kB ( 0%) ggc
tree if-combine : 0.05 ( 0%) usr 0.00 ( 0%) sys 0.06 ( 0%) wall 228 kB ( 0%) ggc
TOTAL : 238.24 17.40 255.72 824659 kB
239.030u 17.901s 4:17.09 99.9% 0+0k 0+0io 0pf+0w
euler-20%
* [Bug tree-optimization/26854] Inordinate compile times on large routines
From: rguenth at gcc dot gnu dot org @ 2007-11-14 14:08 UTC (permalink / raw)
To: gcc-bugs
------- Comment #32 from rguenth at gcc dot gnu dot org 2007-11-14 14:08 -------
So, re-confirmed then.
--
rguenth at gcc dot gnu dot org changed:
What |Removed |Added
----------------------------------------------------------------------------
Status|WAITING |NEW
Ever Confirmed|0 |1
Last reconfirmed|0000-00-00 00:00:00 |2007-11-14 14:08:44
date| |
* [Bug tree-optimization/26854] Inordinate compile times on large routines
From: dberlin at dberlin dot org @ 2007-11-14 16:57 UTC (permalink / raw)
To: gcc-bugs
------- Comment #33 from dberlin at gcc dot gnu dot org 2007-11-14 16:57 -------
Subject: Re: Inordinate compile times on large routines
On 14 Nov 2007 13:37:54 -0000, lucier at math dot purdue dot edu
<gcc-bugzilla@gcc.gnu.org> wrote:
>
>
> ------- Comment #31 from lucier at math dot purdue dot edu 2007-11-14 13:37 -------
> Subject: Re: Inordinate compile times on large routines
>
> To answer Steven's original question, here is a run with
>
> euler-20% /pkgs/gcc-mainline/bin/gcc -v
> Using built-in specs.
> Target: x86_64-unknown-linux-gnu
> Configured with: ../../mainline/configure --prefix=/pkgs/gcc-mainline --enable-languages=c --enable-checking=release --with-gmp=/pkgs/gmp-4.2.2 --with-mpfr=/pkgs/gmp-4.2.2
> Thread model: posix
> gcc version 4.3.0 20071026 (experimental) [trunk revision 129664] (GCC)
>
> Memory usage peaked at 10.3GB (just from monitoring top).
>
Any idea where?
None of the numbers below give any interesting suspects, IMHO.
* [Bug tree-optimization/26854] Inordinate compile times on large routines
From: lucier at math dot purdue dot edu @ 2007-11-14 19:05 UTC (permalink / raw)
To: gcc-bugs
------- Comment #34 from lucier at math dot purdue dot edu 2007-11-14 19:04 -------
Subject: Re: Inordinate compile times on large routines
On Nov 14, 2007, at 11:57 AM, dberlin at dberlin dot org wrote:
>> Memory usage peaked at 10.3GB (just from monitoring top).
>
> Any idea where?
Not really, but I ran cc1 through gdb to generate the following data;
I hope it's helpful.
The first interrupt was when top was reporting:
30359 lucier 25 0 9935m 9.6g 4128 T 0 61.7 2:19.65 cc1
At the second point in the compile (relatively stable top reports of
memory usage):
30359 lucier 25 0 4121m 4.0g 4352 T 21 25.4 2:58.86 cc1
This is with
euler-24% /pkgs/gcc-mainline/bin/gcc -v
Using built-in specs.
Target: x86_64-unknown-linux-gnu
Configured with: ../../mainline/configure --prefix=/pkgs/gcc-mainline --enable-languages=c --enable-checking=release --with-gmp=/pkgs/gmp-4.2.2 --with-mpfr=/pkgs/gmp-4.2.2
Thread model: posix
gcc version 4.3.0 20071113 (experimental) [trunk revision 130159] (GCC)
Brad
euler-23% !gdb
gdb /pkgs/gcc-mainline/libexec/gcc/x86_64-unknown-linux-gnu/4.3.0/cc1
GNU gdb Red Hat Linux (6.3.0.0-1.143.el4rh)
Copyright 2004 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB. Type "show warranty" for details.
This GDB was configured as "x86_64-redhat-linux-gnu"...Using host libthread_db library "/lib64/tls/libthread_db.so.1".
(gdb) run -Wall -W -Wno-unused -O1 -fno-math-errno -fschedule-insns2 -fno-trapping-math -fno-strict-aliasing -fwrapv -fomit-frame-pointer -fPIC -ftime-report -fmem-report all.i
Starting program: /export/pkgs/gcc-mainline/libexec/gcc/x86_64-unknown-linux-gnu/4.3.0/cc1 -Wall -W -Wno-unused -O1 -fno-math-errno -fschedule-insns2 -fno-trapping-math -fno-strict-aliasing -fwrapv -fomit-frame-pointer -fPIC -ftime-report -fmem-report all.i
__sputc __istype __isctype __wcwidth ___H__20_all_2e_o1 ___init_proc
____20_all_2e_o1
Analyzing compilation unit
Performing interprocedural optimizations
<visibility> <early_local_cleanups> {GC 294991k -> 188566k} <inline>
<static-var> <pure-const>Assembling functions:
___H__20_all_2e_o1 {GC 382279k -> 277065k}
Program received signal SIGINT, Interrupt.
free_alloc_pool (pool=0xe7d8f60) at ../../../mainline/gcc/alloc-pool.c:199
199 free (block);
(gdb) where
#0 free_alloc_pool (pool=0xe7d8f60) at ../../../mainline/gcc/alloc-pool.c:199
#1 0x00000000004b12d3 in df_chain_remove_problem () at ../../../mainline/gcc/df-problems.c:1935
#2 0x00000000004b1569 in df_chain_fully_remove_problem () at ../../../mainline/gcc/df-problems.c:1981
#3 0x00000000004ad1a0 in df_finish_pass (verify=Variable "verify" is not available.
) at ../../../mainline/gcc/df-core.c:663
#4 0x000000000058791a in execute_one_pass (pass=0xc46960) at ../../../mainline/gcc/passes.c:1140
#5 0x0000000000587a60 in execute_pass_list (pass=0xc46960) at ../../../mainline/gcc/passes.c:1171
#6 0x0000000000587a75 in execute_pass_list (pass=0xc46840) at ../../../mainline/gcc/passes.c:1172
#7 0x0000000000587a75 in execute_pass_list (pass=0xc46d60) at ../../../mainline/gcc/passes.c:1172
#8 0x000000000062fae4 in tree_rest_of_compilation (fndecl=0x2a990e84e0) at ../../../mainline/gcc/tree-optimize.c:404
#9 0x000000000073a232 in cgraph_expand_function (node=0x2a9865da00) at ../../../mainline/gcc/cgraphunit.c:1151
#10 0x000000000073bc64 in cgraph_optimize () at ../../../mainline/gcc/cgraphunit.c:1214
#11 0x000000000041225b in c_write_global_declarations () at ../../../mainline/gcc/c-decl.c:8081
#12 0x00000000005fcfac in toplev_main (argc=Variable "argc" is not available.
) at ../../../mainline/gcc/toplev.c:1055
#13 0x00000030fd11c3fb in __libc_start_main () from /lib64/tls/libc.so.6
#14 0x000000000040423a in _start ()
#15 0x0000007fbffff4e8 in ?? ()
#16 0x000000000000001c in ?? ()
#17 0x000000000000000f in ?? ()
#18 0x0000007fbffff7b2 in ?? ()
#19 0x0000007fbffff7fb in ?? ()
#20 0x0000007fbffff801 in ?? ()
#21 0x0000007fbffff804 in ?? ()
#22 0x0000007fbffff810 in ?? ()
#23 0x0000007fbffff814 in ?? ()
#24 0x0000007fbffff824 in ?? ()
#25 0x0000007fbffff836 in ?? ()
#26 0x0000007fbffff849 in ?? ()
#27 0x0000007fbffff85e in ?? ()
#28 0x0000007fbffff866 in ?? ()
#29 0x0000007fbffff87b in ?? ()
#30 0x0000007fbffff881 in ?? ()
#31 0x0000007fbffff88f in ?? ()
#32 0x0000007fbffff89c in ?? ()
#33 0x0000000000000000 in ?? ()
(gdb) c
Continuing.
Program received signal SIGINT, Interrupt.
0x00000000004687c9 in bitmap_elt_insert_after (head=0x963b0f0, elt=0xd30a7a70, indx=561) at ../../../mainline/gcc/bitmap.c:203
203 if (element->next)
(gdb) where
#0 0x00000000004687c9 in bitmap_elt_insert_after (head=0x963b0f0, elt=0xd30a7a70, indx=561) at ../../../mainline/gcc/bitmap.c:203
#1 0x000000000046a19b in bitmap_ior_into (a=0x963b0f0, b=Variable "b" is not available.
) at ../../../mainline/gcc/bitmap.c:913
#2 0x00000000004adce6 in df_worklist_dataflow (dataflow=0x7829f20, blocks_to_consider=0x9c1f250, blocks_in_postorder=0x2ab81c6010, n_blocks=Variable "n_blocks" is not available.
) at ../../../mainline/gcc/df-core.c:875
#3 0x00000000004acd7e in df_analyze_problem (dflow=0x7829f20, blocks_to_consider=0x9c1f250, postorder=0x2ab81c6010, n_blocks=59465) at ../../../mainline/gcc/df-core.c:1060
#4 0x00000000004ad00a in df_analyze () at ../../../mainline/gcc/df-core.c:1150
#5 0x00000000008faee7 in if_convert () at ../../../mainline/gcc/ifcvt.c:4045
#6 0x00000000008fb429 in rest_of_handle_if_after_combine () at ../../../mainline/gcc/ifcvt.c:4161
#7 0x00000000005878c2 in execute_one_pass (pass=0xc4b620) at ../../../mainline/gcc/passes.c:1118
#8 0x0000000000587a60 in execute_pass_list (pass=0xc4b620) at ../../../mainline/gcc/passes.c:1171
#9 0x0000000000587a75 in execute_pass_list (pass=0xc46d60) at ../../../mainline/gcc/passes.c:1172
#10 0x000000000062fae4 in tree_rest_of_compilation (fndecl=0x2a990e84e0) at ../../../mainline/gcc/tree-optimize.c:404
#11 0x000000000073a232 in cgraph_expand_function (node=0x2a9865da00) at ../../../mainline/gcc/cgraphunit.c:1151
#12 0x000000000073bc64 in cgraph_optimize () at ../../../mainline/gcc/cgraphunit.c:1214
#13 0x000000000041225b in c_write_global_declarations () at ../../../mainline/gcc/c-decl.c:8081
#14 0x00000000005fcfac in toplev_main (argc=Variable "argc" is not available.
) at ../../../mainline/gcc/toplev.c:1055
#15 0x00000030fd11c3fb in __libc_start_main () from /lib64/tls/libc.so.6
#16 0x000000000040423a in _start ()
#17 0x0000007fbffff4e8 in ?? ()
#18 0x000000000000001c in ?? ()
#19 0x000000000000000f in ?? ()
#20 0x0000007fbffff7b2 in ?? ()
#21 0x0000007fbffff7fb in ?? ()
#22 0x0000007fbffff801 in ?? ()
#23 0x0000007fbffff804 in ?? ()
#24 0x0000007fbffff810 in ?? ()
#25 0x0000007fbffff814 in ?? ()
#26 0x0000007fbffff824 in ?? ()
#27 0x0000007fbffff836 in ?? ()
#28 0x0000007fbffff849 in ?? ()
#29 0x0000007fbffff85e in ?? ()
#30 0x0000007fbffff866 in ?? ()
#31 0x0000007fbffff87b in ?? ()
#32 0x0000007fbffff881 in ?? ()
#33 0x0000007fbffff88f in ?? ()
#34 0x0000007fbffff89c in ?? ()
#35 0x0000000000000000 in ?? ()
(gdb) c
Continuing.
___init_proc ____20_all_2e_o1 {GC 466968k -> 26603k}
Memory still allocated at the end of the compilation process
Size Allocated Used Overhead
8 4096 32 120
16 72k 18k 1584
128 2144k 2135k 29k
256 8192 1536 112
512 4096 1024 56
1024 112k 110k 1568
2048 28k 22k 392
4096 76k 76k 1064
8192 48k 48k 336
16384 32k 32k 112
32768 32k 32k 56
131072 256k 256k 112
262144 512k 512k 112
524288 1024k 1024k 112
1048576 2048k 2048k 112
192 616k 300k 8624
144 20k 3024 280
160 132k 115k 1848
432 28k 21k 392
96 66M 14M 925k
48 2100k 1172k 32k
208 420k 375k 5880
64 1288k 1237k 20k
32 176k 72k 3168
80 30M 2060k 422k
Total 107M 25M 1455k
String pool
entries 159225
identifiers 159225 (100.00%)
slots 262144
bytes 1995k (172k overhead)
table size 2048k
coll/search 0.8692
ins/search 0.2066
avg. entry 12.83 bytes (+/- 7.80)
longest entry 67
??? tree nodes created
(No per-node statistics)
Type hash: size 2039, 920 elements, 0.860000 collisions
DECL_DEBUG_EXPR hash: size 16381, 0 elements, 1.303078 collisions
DECL_VALUE_EXPR hash: size 1021, 0 elements, 0.000000 collisions
Execution times (seconds)
garbage collection : 1.17 ( 0%) usr 0.00 ( 0%) sys 1.17 ( 0%) wall 0 kB ( 0%) ggc
callgraph construction: 0.79 ( 0%) usr 0.11 ( 1%) sys 0.92 ( 0%) wall 31928 kB ( 4%) ggc
callgraph optimization: 1.18 ( 0%) usr 0.00 ( 0%) sys 1.16 ( 0%) wall 6 kB ( 0%) ggc
ipa reference : 0.22 ( 0%) usr 0.03 ( 0%) sys 0.25 ( 0%) wall 7 kB ( 0%) ggc
cfg cleanup : 2.16 ( 1%) usr 0.00 ( 0%) sys 2.16 ( 0%) wall 162 kB ( 0%) ggc
trivially dead code : 0.36 ( 0%) usr 0.00 ( 0%) sys 0.37 ( 0%) wall 0 kB ( 0%) ggc
df reaching defs : 10.01 ( 4%) usr 3.74 (22%) sys 13.81 ( 3%) wall 0 kB ( 0%) ggc
df live regs : 8.10 ( 3%) usr 0.01 ( 0%) sys 8.13 ( 2%) wall 0 kB ( 0%) ggc
df live&initialized regs: 93.27 (37%) usr 2.67 (16%) sys 204.59 (41%) wall 0 kB ( 0%) ggc
df use-def / def-use chains: 8.56 ( 3%) usr 2.67 (16%) sys 11.27 ( 2%) wall 0 kB ( 0%) ggc
df reg dead/unused notes: 1.00 ( 0%) usr 0.01 ( 0%) sys 1.00 ( 0%) wall 10937 kB ( 1%) ggc
register information : 0.52 ( 0%) usr 0.00 ( 0%) sys 0.52 ( 0%) wall 0 kB ( 0%) ggc
alias analysis : 0.93 ( 0%) usr 0.01 ( 0%) sys 0.91 ( 0%) wall 7168 kB ( 1%) ggc
register scan : 0.10 ( 0%) usr 0.00 ( 0%) sys 0.10 ( 0%) wall 4 kB ( 0%) ggc
rebuild jump labels : 0.35 ( 0%) usr 0.00 ( 0%) sys 0.35 ( 0%) wall 0 kB ( 0%) ggc
preprocessing : 0.71 ( 0%) usr 1.05 ( 6%) sys 1.69 ( 0%) wall 2918 kB ( 0%) ggc
lexical analysis : 0.54 ( 0%) usr 1.82 (11%) sys 2.39 ( 0%) wall 0 kB ( 0%) ggc
parser : 1.34 ( 1%) usr 1.00 ( 6%) sys 2.37 ( 0%) wall 66046 kB ( 8%) ggc
inline heuristics : 0.67 ( 0%) usr 0.16 ( 1%) sys 0.83 ( 0%) wall 0 kB ( 0%) ggc
tree gimplify : 1.07 ( 0%) usr 0.04 ( 0%) sys 1.13 ( 0%) wall 62339 kB ( 8%) ggc
tree eh : 0.11 ( 0%) usr 0.00 ( 0%) sys 0.11 ( 0%) wall 0 kB ( 0%) ggc
tree CFG construction : 0.51 ( 0%) usr 0.07 ( 0%) sys 0.57 ( 0%) wall 68526 kB ( 8%) ggc
tree CFG cleanup : 7.11 ( 3%) usr 0.00 ( 0%) sys 7.16 ( 1%) wall 3524 kB ( 0%) ggc
tree copy propagation : 2.52 ( 1%) usr 0.06 ( 0%) sys 2.61 ( 1%) wall 5702 kB ( 1%) ggc
tree find ref. vars : 0.15 ( 0%) usr 0.00 ( 0%) sys 0.15 ( 0%) wall 1819 kB ( 0%) ggc
tree PTA : 1.96 ( 1%) usr 0.12 ( 1%) sys 2.08 ( 0%) wall 3734 kB ( 0%) ggc
tree alias analysis : 0.06 ( 0%) usr 0.12 ( 1%) sys 0.19 ( 0%) wall 0 kB ( 0%) ggc
tree call clobbering : 0.02 ( 0%) usr 0.00 ( 0%) sys 0.03 ( 0%) wall 0 kB ( 0%) ggc
tree flow sensitive alias: 0.17 ( 0%) usr 0.00 ( 0%) sys 0.16 ( 0%) wall 2146 kB ( 0%) ggc
tree memory partitioning: 1.30 ( 1%) usr 0.00 ( 0%) sys 1.30 ( 0%) wall 0 kB ( 0%) ggc
tree PHI insertion : 0.64 ( 0%) usr 0.04 ( 0%) sys 0.69 ( 0%) wall 18541 kB ( 2%) ggc
tree SSA rewrite : 1.93 ( 1%) usr 0.03 ( 0%) sys 1.95 ( 0%) wall 35021 kB ( 4%) ggc
tree SSA other : 0.23 ( 0%) usr 0.09 ( 1%) sys 0.24 ( 0%) wall 0 kB ( 0%) ggc
tree SSA incremental : 8.52 ( 3%) usr 0.08 ( 0%) sys 8.57 ( 2%) wall 14256 kB ( 2%) ggc
tree operand scan : 0.73 ( 0%) usr 0.24 ( 1%) sys 0.97 ( 0%) wall 28110 kB ( 3%) ggc
dominator optimization: 2.75 ( 1%) usr 0.03 ( 0%) sys 2.80 ( 1%) wall 42635 kB ( 5%) ggc
tree STORE-CCP : 0.59 ( 0%) usr 0.00 ( 0%) sys 0.59 ( 0%) wall 1024 kB ( 0%) ggc
tree CCP : 1.20 ( 0%) usr 0.01 ( 0%) sys 1.21 ( 0%) wall 1537 kB ( 0%) ggc
tree PHI const/copy prop: 0.24 ( 0%) usr 0.00 ( 0%) sys 0.25 ( 0%) wall 11 kB ( 0%) ggc
tree split crit edges : 0.11 ( 0%) usr 0.03 ( 0%) sys 0.13 ( 0%) wall 33706 kB ( 4%) ggc
tree reassociation : 0.63 ( 0%) usr 0.00 ( 0%) sys 0.64 ( 0%) wall 1 kB ( 0%) ggc
tree FRE : 2.66 ( 1%) usr 0.05 ( 0%) sys 2.77 ( 1%) wall 49006 kB ( 6%) ggc
tree code sinking : 0.49 ( 0%) usr 0.00 ( 0%) sys 0.48 ( 0%) wall 6 kB ( 0%) ggc
tree linearize phis : 0.28 ( 0%) usr 0.00 ( 0%) sys 0.28 ( 0%) wall 0 kB ( 0%) ggc
tree forward propagate: 0.34 ( 0%) usr 0.00 ( 0%) sys 0.32 ( 0%) wall 426 kB ( 0%) ggc
tree conservative DCE : 1.60 ( 1%) usr 0.00 ( 0%) sys 1.62 ( 0%) wall 0 kB ( 0%) ggc
tree aggressive DCE : 0.35 ( 0%) usr 0.00 ( 0%) sys 0.35 ( 0%) wall 0 kB ( 0%) ggc
tree DSE : 0.35 ( 0%) usr 0.00 ( 0%) sys 0.36 ( 0%) wall 1 kB ( 0%) ggc
PHI merge : 0.06 ( 0%) usr 0.00 ( 0%) sys 0.06 ( 0%) wall 7192 kB ( 1%) ggc
tree loop bounds : 0.17 ( 0%) usr 0.00 ( 0%) sys 0.17 ( 0%) wall 2 kB ( 0%) ggc
loop invariant motion : 0.32 ( 0%) usr 0.00 ( 0%) sys 0.32 ( 0%) wall 0 kB ( 0%) ggc
tree canonical iv : 0.03 ( 0%) usr 0.00 ( 0%) sys 0.03 ( 0%) wall 0 kB ( 0%) ggc
scev constant prop : 0.63 ( 0%) usr 0.00 ( 0%) sys 0.64 ( 0%) wall 17787 kB ( 2%) ggc
complete unrolling : 0.01 ( 0%) usr 0.00 ( 0%) sys 0.01 ( 0%) wall 0 kB ( 0%) ggc
tree loop init : 3.07 ( 1%) usr 0.09 ( 1%) sys 3.21 ( 1%) wall 45438 kB ( 6%) ggc
tree loop fini : 0.01 ( 0%) usr 0.00 ( 0%) sys 0.01 ( 0%) wall 0 kB ( 0%) ggc
tree copy headers : 0.06 ( 0%) usr 0.00 ( 0%) sys 0.07 ( 0%) wall 0 kB ( 0%) ggc
tree SSA uncprop : 0.24 ( 0%) usr 0.00 ( 0%) sys 0.26 ( 0%) wall 0 kB ( 0%) ggc
tree SSA to normal : 11.15 ( 4%) usr 0.07 ( 0%) sys 11.26 ( 2%) wall 81126 kB (10%) ggc
tree rename SSA copies: 0.55 ( 0%) usr 0.01 ( 0%) sys 0.56 ( 0%) wall 0 kB ( 0%) ggc
dominance frontiers : 0.44 ( 0%) usr 0.00 ( 0%) sys 0.48 ( 0%) wall 0 kB ( 0%) ggc
dominance computation : 2.49 ( 1%) usr 0.05 ( 0%) sys 2.52 ( 1%) wall 0 kB ( 0%) ggc
expand : 14.26 ( 6%) usr 1.80 (10%) sys 144.06 (29%) wall 92074 kB (11%) ggc
lower subreg : 0.23 ( 0%) usr 0.00 ( 0%) sys 0.24 ( 0%) wall 0 kB ( 0%) ggc
jump : 0.05 ( 0%) usr 0.00 ( 0%) sys 0.05 ( 0%) wall 0 kB ( 0%) ggc
CSE : 0.77 ( 0%) usr 0.00 ( 0%) sys 0.78 ( 0%) wall 1426 kB ( 0%) ggc
dead code elimination : 0.50 ( 0%) usr 0.00 ( 0%) sys 0.52 ( 0%) wall 0 kB ( 0%) ggc
dead store elim1 : 0.43 ( 0%) usr 0.06 ( 0%) sys 0.49 ( 0%) wall 7944 kB ( 1%) ggc
dead store elim2 : 0.49 ( 0%) usr 0.01 ( 0%) sys 0.51 ( 0%) wall 8877 kB ( 1%) ggc
loop analysis : 0.60 ( 0%) usr 0.01 ( 0%) sys 0.61 ( 0%) wall 70 kB ( 0%) ggc
branch prediction : 0.95 ( 0%) usr 0.02 ( 0%) sys 0.98 ( 0%) wall 1541 kB ( 0%) ggc
combiner : 2.65 ( 1%) usr 0.04 ( 0%) sys 2.70 ( 1%) wall 27893 kB ( 3%) ggc
if-conversion : 1.55 ( 1%) usr 0.00 ( 0%) sys 1.55 ( 0%) wall 655 kB ( 0%) ggc
local alloc : 4.01 ( 2%) usr 0.02 ( 0%) sys 4.05 ( 1%) wall 7074 kB ( 1%) ggc
global alloc : 25.75 (10%) usr 0.36 ( 2%) sys 26.20 ( 5%) wall 5111 kB ( 1%) ggc
reload CSE regs : 1.21 ( 0%) usr 0.01 ( 0%) sys 1.24 ( 0%) wall 12243 kB ( 1%) ggc
thread pro- & epilogue: 0.10 ( 0%) usr 0.00 ( 0%) sys 0.10 ( 0%) wall 4 kB ( 0%) ggc
if-conversion 2 : 0.39 ( 0%) usr 0.00 ( 0%) sys 0.36
( 0%) wall 82 kB ( 0%) ggc
rename registers : 0.62 ( 0%) usr 0.04 ( 0%) sys 0.65
( 0%) wall 31 kB ( 0%) ggc
scheduling 2 : 2.69 ( 1%) usr 0.05 ( 0%) sys 2.77
( 1%) wall 0 kB ( 0%) ggc
machine dep reorg : 0.52 ( 0%) usr 0.00 ( 0%) sys 0.52
( 0%) wall 149 kB ( 0%) ggc
reorder blocks : 0.25 ( 0%) usr 0.01 ( 0%) sys 0.26
( 0%) wall 6758 kB ( 1%) ggc
final : 1.26 ( 1%) usr 0.01 ( 0%) sys 1.27
( 0%) wall 0 kB ( 0%) ggc
tree if-combine : 0.06 ( 0%) usr 0.00 ( 0%) sys 0.06
( 0%) wall 223 kB ( 0%) ggc
TOTAL : 249.32 17.19
503.80 816827 kB
Program exited normally.
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=26854
^ permalink raw reply [flat|nested] 144+ messages in thread
* [Bug tree-optimization/26854] Inordinate compile times on large routines
2006-03-24 20:25 [Bug c/26854] New: Inordinate compile times on large routines lucier at math dot purdue dot edu
` (33 preceding siblings ...)
2007-11-14 19:05 ` lucier at math dot purdue dot edu
@ 2007-11-14 19:06 ` lucier at math dot purdue dot edu
2007-12-19 21:49 ` lucier at math dot purdue dot edu
` (85 subsequent siblings)
120 siblings, 0 replies; 144+ messages in thread
From: lucier at math dot purdue dot edu @ 2007-11-14 19:06 UTC (permalink / raw)
To: gcc-bugs
------- Comment #35 from lucier at math dot purdue dot edu 2007-11-14 19:06 -------
Subject: Re: Inordinate compile times on large routines
PS: Should the "Reported against" field in bugzilla be changed to
4.3.0?
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=26854
^ permalink raw reply [flat|nested] 144+ messages in thread
* [Bug tree-optimization/26854] Inordinate compile times on large routines
2006-03-24 20:25 [Bug c/26854] New: Inordinate compile times on large routines lucier at math dot purdue dot edu
` (34 preceding siblings ...)
2007-11-14 19:06 ` lucier at math dot purdue dot edu
@ 2007-12-19 21:49 ` lucier at math dot purdue dot edu
2007-12-19 22:13 ` steven at gcc dot gnu dot org
` (84 subsequent siblings)
120 siblings, 0 replies; 144+ messages in thread
From: lucier at math dot purdue dot edu @ 2007-12-19 21:49 UTC (permalink / raw)
To: gcc-bugs
------- Comment #36 from lucier at math dot purdue dot edu 2007-12-19 21:48 -------
I changed the "reported against" field to 4.3.0 (see my previous comments).
--
lucier at math dot purdue dot edu changed:
What |Removed |Added
----------------------------------------------------------------------------
Version|4.1.0 |4.3.0
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=26854
^ permalink raw reply [flat|nested] 144+ messages in thread
* [Bug tree-optimization/26854] Inordinate compile times on large routines
2006-03-24 20:25 [Bug c/26854] New: Inordinate compile times on large routines lucier at math dot purdue dot edu
` (35 preceding siblings ...)
2007-12-19 21:49 ` lucier at math dot purdue dot edu
@ 2007-12-19 22:13 ` steven at gcc dot gnu dot org
2007-12-19 23:31 ` lucier at math dot purdue dot edu
` (83 subsequent siblings)
120 siblings, 0 replies; 144+ messages in thread
From: steven at gcc dot gnu dot org @ 2007-12-19 22:13 UTC (permalink / raw)
To: gcc-bugs
------- Comment #37 from steven at gcc dot gnu dot org 2007-12-19 22:13 -------
Brad,
I am looking at your dump and your backtraces (many many thanks!!!) and I think
I have an idea how to improve the situation a bit here:
> Program received signal SIGINT, Interrupt.
> 0x00000000004687c9 in bitmap_elt_insert_after (head=0x963b0f0,
> elt=0xd30a7a70, indx=561) at ../../../mainline/gcc/bitmap.c:203
> 203 if (element->next)
> (gdb) where
> #0 0x00000000004687c9 in bitmap_elt_insert_after (head=0x963b0f0,
> elt=0xd30a7a70, indx=561) at ../../../mainline/gcc/bitmap.c:203
> #1 0x000000000046a19b in bitmap_ior_into (a=0x963b0f0, b=Variable
> "b" is not available.
> ) at ../../../mainline/gcc/bitmap.c:913
> #2 0x00000000004adce6 in df_worklist_dataflow (dataflow=0x7829f20,
> blocks_to_consider=0x9c1f250, blocks_in_postorder=0x2ab81c6010,
> n_blocks=Variable "n_blocks" is not available.
> )
> at ../../../mainline/gcc/df-core.c:875
> #3 0x00000000004acd7e in df_analyze_problem (dflow=0x7829f20,
> blocks_to_consider=0x9c1f250, postorder=0x2ab81c6010, n_blocks=59465)
> at ../../../mainline/gcc/df-core.c:1060
... and ...
> df live&initialized regs: 93.27 (37%) usr 2.67 (16%) sys 204.59 (41%) wall 0 kB ( 0%) ggc
I have seen this before :-) In fact, I already attached a patch implementing
this idea in another bug report, bug 34400.
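For context, the backtrace above sits in a worklist dataflow solver (df_worklist_dataflow) whose merge step is a bitmap OR (bitmap_ior_into). A minimal sketch of that general technique follows. It assumes a made-up four-block CFG and plain 64-bit words instead of GCC's linked-list bitmaps; toy_worklist_dataflow and the tables are hypothetical names, and this is not the code in df-core.c or in the patches attached to bug 34400.

```c
#include <assert.h>
#include <stdint.h>

#define N_BLOCKS 4
#define MAX_EDGES 2

/* Toy CFG (an assumption for illustration): 0->1, 1->2, 2->1, 2->3.  */
static const int n_preds[N_BLOCKS] = { 0, 2, 1, 1 };
static const int preds[N_BLOCKS][MAX_EDGES] = { { 0, 0 }, { 0, 2 }, { 1, 0 }, { 2, 0 } };
static const int n_succs[N_BLOCKS] = { 1, 1, 2, 0 };
static const int succs[N_BLOCKS][MAX_EDGES] = { { 1, 0 }, { 2, 0 }, { 1, 3 }, { 0, 0 } };

/* Forward "may" problem: in = OR of predecessor outs,
   out = gen | (in & ~kill).  Iterates until nothing changes.  */
void
toy_worklist_dataflow (const uint64_t gen[N_BLOCKS], const uint64_t kill[N_BLOCKS],
                       uint64_t in[N_BLOCKS], uint64_t out[N_BLOCKS])
{
  int queue[N_BLOCKS];   /* circular queue of pending blocks */
  int queued[N_BLOCKS];  /* 1 if the block is already in the queue */
  int head = 0, count = 0;

  for (int b = 0; b < N_BLOCKS; b++)
    {
      in[b] = out[b] = 0;
      queue[count++] = b;
      queued[b] = 1;
    }

  while (count > 0)
    {
      int b = queue[head];
      head = (head + 1) % N_BLOCKS;
      count--;
      queued[b] = 0;

      /* Confluence: the step that corresponds to a bitmap OR.  */
      uint64_t new_in = 0;
      for (int i = 0; i < n_preds[b]; i++)
        new_in |= out[preds[b][i]];
      in[b] = new_in;

      /* Transfer function; only requeue successors if out changed.  */
      uint64_t new_out = gen[b] | (new_in & ~kill[b]);
      if (new_out == out[b])
        continue;
      out[b] = new_out;

      for (int i = 0; i < n_succs[b]; i++)
        {
          int s = succs[b][i];
          if (!queued[s])
            {
              assert (count < N_BLOCKS);
              queue[(head + count) % N_BLOCKS] = s;
              count++;
              queued[s] = 1;
            }
        }
    }
}
```

With tens of thousands of blocks (n_blocks=59465 in the backtrace) and sets far wider than one word, the cost of each OR and the number of times a block is revisited both matter, which is presumably why the visiting order of the worklist is worth tuning.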
This may be asking a lot, but could you do something for me please? Could you
install the patches df_hack2.diff and df_double_queue_worklist.diff, and redo
the timings? Both patches are attached to bug 34400.
If adding the patches I mentioned does not help, could you interrupt the
compiler under gdb a few more times, and each time check in df_analyze_problem
which problem dflow is? I.e. "p dflow" or "p (timevar_id_t) dflow->tv_id", or
whatever works to see which problem we are in. I suspect we may be creating
dataflow problems that are too large to handle.
Many thanks for your help!
--
steven at gcc dot gnu dot org changed:
What |Removed |Added
----------------------------------------------------------------------------
CC| |steven at gcc dot gnu dot
| |org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=26854
^ permalink raw reply [flat|nested] 144+ messages in thread
* [Bug tree-optimization/26854] Inordinate compile times on large routines
2006-03-24 20:25 [Bug c/26854] New: Inordinate compile times on large routines lucier at math dot purdue dot edu
` (36 preceding siblings ...)
2007-12-19 22:13 ` steven at gcc dot gnu dot org
@ 2007-12-19 23:31 ` lucier at math dot purdue dot edu
2007-12-20 0:02 ` steven at gcc dot gnu dot org
` (82 subsequent siblings)
120 siblings, 0 replies; 144+ messages in thread
From: lucier at math dot purdue dot edu @ 2007-12-19 23:31 UTC (permalink / raw)
To: gcc-bugs
------- Comment #38 from lucier at math dot purdue dot edu 2007-12-19 23:31 -------
Subject: Re: Inordinate compile times on large routines
On Dec 19, 2007, at 5:13 PM, steven at gcc dot gnu dot org wrote:
> This may be asking a lot, but could you do something for me
> please? Could you
> install the patches df_hack2.diff and
> df_double_queue_worklist.diff, and redo
> the timings? Both patches are attached to bug 34400.
Your patches definitely help, for some value of "help". The top
memory usage (just from watching "top") went from 9998 MB to 6803 MB
(of course I could have missed the peak memory usage of both jobs),
and the CPU time went down, too. Here are details.
Before your patches:
euler-32% /pkgs/gcc-mainline/bin/gcc -v
Using built-in specs.
Target: x86_64-unknown-linux-gnu
Configured with: ../../mainline/configure --prefix=/pkgs/gcc-mainline --enable-languages=c --enable-checking=release --with-gmp=/pkgs/gmp-4.2.2 --with-mpfr=/pkgs/gmp-4.2.2
Thread model: posix
gcc version 4.3.0 20071219 (experimental) [trunk revision 131091] (GCC)
euler-33% /pkgs/gcc-mainline/bin/gcc -O1 -fno-math-errno -fschedule-insns2 -fno-trapping-math -fno-strict-aliasing -fwrapv -fomit-frame-pointer -fPIC -fno-common -ftime-report -fmem-report -c all.i
Memory still allocated at the end of the compilation process
Size Allocated Used Overhead
8 4096 16 120
16 72k 18k 1584
128 2144k 2135k 29k
256 4096 1536 56
512 4096 1024 56
1024 112k 110k 1568
2048 28k 22k 392
4096 76k 76k 1064
8192 48k 48k 336
16384 32k 32k 112
32768 32k 32k 56
131072 256k 256k 112
262144 512k 512k 112
524288 1024k 1024k 112
1048576 2048k 2048k 112
192 616k 300k 8624
144 20k 3024 280
160 132k 115k 1848
432 28k 21k 392
96 15M 14M 215k
48 2136k 1171k 33k
208 420k 375k 5880
64 1288k 1237k 20k
32 164k 64k 2952
80 29M 2060k 417k
Total 56M 25M 741k
String pool
entries 159286
identifiers 159286 (100.00%)
slots 262144
bytes 1995k (171k overhead)
table size 2048k
coll/search 0.9209
ins/search 0.2067
avg. entry 12.83 bytes (+/- 7.80)
longest entry 67
??? tree nodes created
(No per-node statistics)
Type hash: size 2039, 920 elements, 0.860000 collisions
DECL_DEBUG_EXPR hash: size 16381, 0 elements, 1.332565 collisions
DECL_VALUE_EXPR hash: size 1021, 0 elements, 0.000000 collisions
Execution times (seconds)
garbage collection : 1.05 ( 0%) usr 0.00 ( 0%) sys 1.06
( 0%) wall 0 kB ( 0%) ggc
callgraph construction: 0.79 ( 0%) usr 0.09 ( 1%) sys 0.89
( 0%) wall 31928 kB ( 4%) ggc
callgraph optimization: 1.02 ( 0%) usr 0.00 ( 0%) sys 1.03
( 0%) wall 6 kB ( 0%) ggc
ipa reference : 0.21 ( 0%) usr 0.03 ( 0%) sys 0.24
( 0%) wall 7 kB ( 0%) ggc
cfg cleanup : 2.16 ( 1%) usr 0.00 ( 0%) sys 2.16
( 1%) wall 164 kB ( 0%) ggc
trivially dead code : 0.35 ( 0%) usr 0.01 ( 0%) sys 0.35
( 0%) wall 0 kB ( 0%) ggc
df reaching defs : 9.53 ( 4%) usr 3.29 (20%) sys 12.83
( 5%) wall 0 kB ( 0%) ggc
df live regs : 8.09 ( 3%) usr 0.01 ( 0%) sys 8.11
( 3%) wall 0 kB ( 0%) ggc
df live&initialized regs: 98.09 (41%) usr 2.81 (17%) sys 100.95
(39%) wall 0 kB ( 0%) ggc
df use-def / def-use chains: 8.16 ( 3%) usr 2.38 (15%) sys
10.53 ( 4%) wall 0 kB ( 0%) ggc
df reg dead/unused notes: 0.95 ( 0%) usr 0.00 ( 0%) sys 0.95
( 0%) wall 10801 kB ( 1%) ggc
register information : 0.52 ( 0%) usr 0.01 ( 0%) sys 0.51
( 0%) wall 0 kB ( 0%) ggc
alias analysis : 0.85 ( 0%) usr 0.01 ( 0%) sys 0.87
( 0%) wall 7168 kB ( 1%) ggc
register scan : 0.10 ( 0%) usr 0.00 ( 0%) sys 0.10
( 0%) wall 4 kB ( 0%) ggc
rebuild jump labels : 0.33 ( 0%) usr 0.00 ( 0%) sys 0.33
( 0%) wall 0 kB ( 0%) ggc
preprocessing : 0.68 ( 0%) usr 0.90 ( 6%) sys 1.66
( 1%) wall 2918 kB ( 0%) ggc
lexical analysis : 0.55 ( 0%) usr 1.97 (12%) sys 2.18
( 1%) wall 0 kB ( 0%) ggc
parser : 1.29 ( 1%) usr 0.90 ( 6%) sys 2.45
( 1%) wall 66023 kB ( 8%) ggc
inline heuristics : 0.66 ( 0%) usr 0.15 ( 1%) sys 0.82
( 0%) wall 0 kB ( 0%) ggc
tree gimplify : 1.08 ( 0%) usr 0.06 ( 0%) sys 1.14
( 0%) wall 62339 kB ( 8%) ggc
tree eh : 0.10 ( 0%) usr 0.00 ( 0%) sys 0.11
( 0%) wall 0 kB ( 0%) ggc
tree CFG construction : 0.49 ( 0%) usr 0.05 ( 0%) sys 0.55
( 0%) wall 68526 kB ( 9%) ggc
tree CFG cleanup : 6.94 ( 3%) usr 0.01 ( 0%) sys 6.94
( 3%) wall 3575 kB ( 0%) ggc
tree copy propagation : 2.41 ( 1%) usr 0.06 ( 0%) sys 2.47
( 1%) wall 4818 kB ( 1%) ggc
tree find ref. vars : 0.14 ( 0%) usr 0.00 ( 0%) sys 0.15
( 0%) wall 1819 kB ( 0%) ggc
tree PTA : 1.93 ( 1%) usr 0.10 ( 1%) sys 2.03
( 1%) wall 3734 kB ( 0%) ggc
tree alias analysis : 0.11 ( 0%) usr 0.08 ( 0%) sys 0.11
( 0%) wall 0 kB ( 0%) ggc
tree call clobbering : 0.03 ( 0%) usr 0.00 ( 0%) sys 0.02
( 0%) wall 0 kB ( 0%) ggc
tree flow sensitive alias: 0.16 ( 0%) usr 0.00 ( 0%) sys 0.17
( 0%) wall 2146 kB ( 0%) ggc
tree memory partitioning: 1.25 ( 1%) usr 0.00 ( 0%) sys 1.25
( 0%) wall 0 kB ( 0%) ggc
tree PHI insertion : 0.59 ( 0%) usr 0.03 ( 0%) sys 0.64
( 0%) wall 18541 kB ( 2%) ggc
tree SSA rewrite : 1.94 ( 1%) usr 0.03 ( 0%) sys 1.97
( 1%) wall 35021 kB ( 5%) ggc
tree SSA other : 0.18 ( 0%) usr 0.08 ( 0%) sys 0.26
( 0%) wall 0 kB ( 0%) ggc
tree SSA incremental : 9.06 ( 4%) usr 0.34 ( 2%) sys 9.43
( 4%) wall 14359 kB ( 2%) ggc
tree operand scan : 0.69 ( 0%) usr 0.28 ( 2%) sys 0.98
( 0%) wall 27918 kB ( 4%) ggc
dominator optimization: 2.86 ( 1%) usr 0.02 ( 0%) sys 2.96
( 1%) wall 44597 kB ( 6%) ggc
tree SRA : 0.01 ( 0%) usr 0.00 ( 0%) sys 0.00
( 0%) wall 0 kB ( 0%) ggc
tree STORE-CCP : 0.57 ( 0%) usr 0.00 ( 0%) sys 0.57
( 0%) wall 1024 kB ( 0%) ggc
tree CCP : 1.14 ( 0%) usr 0.00 ( 0%) sys 1.16
( 0%) wall 1537 kB ( 0%) ggc
tree PHI const/copy prop: 0.23 ( 0%) usr 0.00 ( 0%) sys 0.23
( 0%) wall 11 kB ( 0%) ggc
tree split crit edges : 0.12 ( 0%) usr 0.00 ( 0%) sys 0.12
( 0%) wall 33698 kB ( 4%) ggc
tree reassociation : 0.64 ( 0%) usr 0.00 ( 0%) sys 0.62
( 0%) wall 1 kB ( 0%) ggc
tree FRE : 0.26 ( 0%) usr 0.00 ( 0%) sys 0.25
( 0%) wall 5 kB ( 0%) ggc
tree code sinking : 0.47 ( 0%) usr 0.00 ( 0%) sys 0.47
( 0%) wall 6 kB ( 0%) ggc
tree linearize phis : 0.27 ( 0%) usr 0.00 ( 0%) sys 0.27
( 0%) wall 0 kB ( 0%) ggc
tree forward propagate: 0.33 ( 0%) usr 0.00 ( 0%) sys 0.35
( 0%) wall 426 kB ( 0%) ggc
tree conservative DCE : 1.59 ( 1%) usr 0.00 ( 0%) sys 1.59
( 1%) wall 0 kB ( 0%) ggc
tree aggressive DCE : 0.34 ( 0%) usr 0.00 ( 0%) sys 0.34
( 0%) wall 0 kB ( 0%) ggc
tree DSE : 0.36 ( 0%) usr 0.00 ( 0%) sys 0.36
( 0%) wall 1 kB ( 0%) ggc
PHI merge : 0.07 ( 0%) usr 0.00 ( 0%) sys 0.07
( 0%) wall 7192 kB ( 1%) ggc
tree loop bounds : 0.16 ( 0%) usr 0.00 ( 0%) sys 0.16
( 0%) wall 2 kB ( 0%) ggc
loop invariant motion : 0.31 ( 0%) usr 0.00 ( 0%) sys 0.31
( 0%) wall 0 kB ( 0%) ggc
tree canonical iv : 0.03 ( 0%) usr 0.00 ( 0%) sys 0.02
( 0%) wall 0 kB ( 0%) ggc
scev constant prop : 0.66 ( 0%) usr 0.01 ( 0%) sys 0.67
( 0%) wall 17793 kB ( 2%) ggc
complete unrolling : 0.01 ( 0%) usr 0.00 ( 0%) sys 0.02
( 0%) wall 0 kB ( 0%) ggc
tree loop init : 3.15 ( 1%) usr 0.10 ( 1%) sys 3.17
( 1%) wall 45121 kB ( 6%) ggc
tree loop fini : 0.01 ( 0%) usr 0.00 ( 0%) sys 0.01
( 0%) wall 0 kB ( 0%) ggc
tree copy headers : 0.07 ( 0%) usr 0.00 ( 0%) sys 0.07
( 0%) wall 0 kB ( 0%) ggc
tree SSA uncprop : 0.26 ( 0%) usr 0.00 ( 0%) sys 0.26
( 0%) wall 0 kB ( 0%) ggc
tree SSA to normal : 11.37 ( 5%) usr 0.10 ( 1%) sys 11.47
( 4%) wall 90617 kB (12%) ggc
tree rename SSA copies: 0.55 ( 0%) usr 0.02 ( 0%) sys 0.56
( 0%) wall 0 kB ( 0%) ggc
dominance frontiers : 0.44 ( 0%) usr 0.00 ( 0%) sys 0.44
( 0%) wall 0 kB ( 0%) ggc
dominance computation : 2.38 ( 1%) usr 0.04 ( 0%) sys 2.42
( 1%) wall 0 kB ( 0%) ggc
expand : 13.82 ( 6%) usr 1.53 ( 9%) sys 15.43
( 6%) wall 91541 kB (12%) ggc
lower subreg : 0.22 ( 0%) usr 0.00 ( 0%) sys 0.22
( 0%) wall 0 kB ( 0%) ggc
jump : 0.04 ( 0%) usr 0.00 ( 0%) sys 0.04
( 0%) wall 0 kB ( 0%) ggc
CSE : 0.80 ( 0%) usr 0.00 ( 0%) sys 0.78
( 0%) wall 1403 kB ( 0%) ggc
dead code elimination : 0.48 ( 0%) usr 0.00 ( 0%) sys 0.47
( 0%) wall 0 kB ( 0%) ggc
dead store elim1 : 0.41 ( 0%) usr 0.03 ( 0%) sys 0.44
( 0%) wall 7973 kB ( 1%) ggc
dead store elim2 : 0.47 ( 0%) usr 0.01 ( 0%) sys 0.48
( 0%) wall 8688 kB ( 1%) ggc
loop analysis : 0.57 ( 0%) usr 0.01 ( 0%) sys 0.58
( 0%) wall 70 kB ( 0%) ggc
branch prediction : 0.93 ( 0%) usr 0.01 ( 0%) sys 0.94
( 0%) wall 1541 kB ( 0%) ggc
combiner : 2.62 ( 1%) usr 0.04 ( 0%) sys 2.67
( 1%) wall 28000 kB ( 4%) ggc
if-conversion : 1.55 ( 1%) usr 0.03 ( 0%) sys 1.54
( 1%) wall 586 kB ( 0%) ggc
local alloc : 4.00 ( 2%) usr 0.01 ( 0%) sys 4.01
( 2%) wall 7070 kB ( 1%) ggc
global alloc : 17.58 ( 7%) usr 0.30 ( 2%) sys 17.89
( 7%) wall 4961 kB ( 1%) ggc
reload CSE regs : 1.17 ( 0%) usr 0.02 ( 0%) sys 1.18
( 0%) wall 12069 kB ( 2%) ggc
thread pro- & epilogue: 0.09 ( 0%) usr 0.00 ( 0%) sys 0.09
( 0%) wall 4 kB ( 0%) ggc
if-conversion 2 : 0.38 ( 0%) usr 0.00 ( 0%) sys 0.37
( 0%) wall 119 kB ( 0%) ggc
rename registers : 0.61 ( 0%) usr 0.02 ( 0%) sys 0.63
( 0%) wall 29 kB ( 0%) ggc
scheduling 2 : 2.52 ( 1%) usr 0.04 ( 0%) sys 2.55
( 1%) wall 0 kB ( 0%) ggc
machine dep reorg : 0.50 ( 0%) usr 0.00 ( 0%) sys 0.50
( 0%) wall 148 kB ( 0%) ggc
reorder blocks : 0.28 ( 0%) usr 0.01 ( 0%) sys 0.27
( 0%) wall 6727 kB ( 1%) ggc
final : 1.19 ( 0%) usr 0.03 ( 0%) sys 1.25
( 0%) wall 0 kB ( 0%) ggc
tree if-combine : 0.05 ( 0%) usr 0.01 ( 0%) sys 0.06
( 0%) wall 224 kB ( 0%) ggc
TOTAL : 241.56 16.30
257.94 776880 kB
euler-34%
After your patches:
euler-43% patch < df-prob.patch
patching file df-problems.c
Hunk #1 succeeded at 1329 (offset 6 lines).
Hunk #3 succeeded at 1411 (offset 6 lines).
Hunk #5 succeeded at 1470 (offset 6 lines).
Hunk #7 succeeded at 1536 (offset 6 lines).
(The other one applied cleanly.)
euler-62% /pkgs/gcc-mainline/bin/gcc -O1 -fno-math-errno -fschedule-insns2 -fno-trapping-math -fno-strict-aliasing -fwrapv -fomit-frame-pointer -fPIC -fno-common -ftime-report -fmem-report -c all.i
Memory still allocated at the end of the compilation process
Size Allocated Used Overhead
8 4096 16 120
16 72k 18k 1584
128 2144k 2135k 29k
256 4096 1536 56
512 4096 1024 56
1024 112k 110k 1568
2048 28k 22k 392
4096 76k 76k 1064
8192 48k 48k 336
16384 32k 32k 112
32768 32k 32k 56
131072 256k 256k 112
262144 512k 512k 112
524288 1024k 1024k 112
1048576 2048k 2048k 112
192 616k 300k 8624
144 20k 3024 280
160 132k 115k 1848
432 28k 21k 392
96 15M 14M 215k
48 2136k 1171k 33k
208 420k 375k 5880
64 1288k 1237k 20k
32 164k 64k 2952
80 29M 2060k 417k
Total 56M 25M 741k
String pool
entries 159286
identifiers 159286 (100.00%)
slots 262144
bytes 1995k (171k overhead)
table size 2048k
coll/search 0.9209
ins/search 0.2067
avg. entry 12.83 bytes (+/- 7.80)
longest entry 67
??? tree nodes created
(No per-node statistics)
Type hash: size 2039, 920 elements, 0.860000 collisions
DECL_DEBUG_EXPR hash: size 16381, 0 elements, 1.332565 collisions
DECL_VALUE_EXPR hash: size 1021, 0 elements, 0.000000 collisions
Execution times (seconds)
garbage collection : 1.03 ( 1%) usr 0.00 ( 0%) sys 1.03
( 1%) wall 0 kB ( 0%) ggc
callgraph construction: 0.77 ( 0%) usr 0.09 ( 1%) sys 0.88
( 0%) wall 31928 kB ( 4%) ggc
callgraph optimization: 1.04 ( 1%) usr 0.00 ( 0%) sys 1.03
( 1%) wall 6 kB ( 0%) ggc
ipa reference : 0.21 ( 0%) usr 0.04 ( 0%) sys 0.24
( 0%) wall 7 kB ( 0%) ggc
cfg cleanup : 2.20 ( 1%) usr 0.00 ( 0%) sys 2.21
( 1%) wall 164 kB ( 0%) ggc
trivially dead code : 0.36 ( 0%) usr 0.00 ( 0%) sys 0.35
( 0%) wall 0 kB ( 0%) ggc
df reaching defs : 18.18 (10%) usr 3.25 (24%) sys 21.44
(11%) wall 0 kB ( 0%) ggc
df live regs : 11.56 ( 7%) usr 0.00 ( 0%) sys 11.53
( 6%) wall 0 kB ( 0%) ggc
df live&initialized regs: 15.71 ( 9%) usr 0.02 ( 0%) sys 15.77
( 8%) wall 0 kB ( 0%) ggc
df use-def / def-use chains: 8.02 ( 5%) usr 2.28 (17%) sys
10.30 ( 5%) wall 0 kB ( 0%) ggc
df reg dead/unused notes: 0.95 ( 1%) usr 0.00 ( 0%) sys 0.95
( 1%) wall 10801 kB ( 1%) ggc
register information : 0.50 ( 0%) usr 0.00 ( 0%) sys 0.52
( 0%) wall 0 kB ( 0%) ggc
alias analysis : 0.87 ( 0%) usr 0.00 ( 0%) sys 0.87
( 0%) wall 7168 kB ( 1%) ggc
register scan : 0.10 ( 0%) usr 0.00 ( 0%) sys 0.10
( 0%) wall 4 kB ( 0%) ggc
rebuild jump labels : 0.33 ( 0%) usr 0.00 ( 0%) sys 0.34
( 0%) wall 0 kB ( 0%) ggc
preprocessing : 0.71 ( 0%) usr 1.05 ( 8%) sys 1.61
( 1%) wall 2918 kB ( 0%) ggc
lexical analysis : 0.45 ( 0%) usr 1.86 (14%) sys 2.36
( 1%) wall 0 kB ( 0%) ggc
parser : 1.37 ( 1%) usr 0.90 ( 7%) sys 2.38
( 1%) wall 66023 kB ( 8%) ggc
inline heuristics : 0.69 ( 0%) usr 0.15 ( 1%) sys 0.82
( 0%) wall 0 kB ( 0%) ggc
tree gimplify : 1.08 ( 1%) usr 0.05 ( 0%) sys 1.13
( 1%) wall 62339 kB ( 8%) ggc
tree eh : 0.11 ( 0%) usr 0.00 ( 0%) sys 0.11
( 0%) wall 0 kB ( 0%) ggc
tree CFG construction : 0.50 ( 0%) usr 0.05 ( 0%) sys 0.54
( 0%) wall 68526 kB ( 9%) ggc
tree CFG cleanup : 6.94 ( 4%) usr 0.00 ( 0%) sys 6.90
( 4%) wall 3575 kB ( 0%) ggc
tree copy propagation : 2.39 ( 1%) usr 0.05 ( 0%) sys 2.44
( 1%) wall 4818 kB ( 1%) ggc
tree find ref. vars : 0.14 ( 0%) usr 0.00 ( 0%) sys 0.15
( 0%) wall 1819 kB ( 0%) ggc
tree PTA : 1.93 ( 1%) usr 0.10 ( 1%) sys 2.04
( 1%) wall 3734 kB ( 0%) ggc
tree alias analysis : 0.07 ( 0%) usr 0.10 ( 1%) sys 0.14
( 0%) wall 0 kB ( 0%) ggc
tree call clobbering : 0.03 ( 0%) usr 0.00 ( 0%) sys 0.02
( 0%) wall 0 kB ( 0%) ggc
tree flow sensitive alias: 0.16 ( 0%) usr 0.00 ( 0%) sys 0.17
( 0%) wall 2146 kB ( 0%) ggc
tree memory partitioning: 1.25 ( 1%) usr 0.00 ( 0%) sys 1.25
( 1%) wall 0 kB ( 0%) ggc
tree PHI insertion : 0.60 ( 0%) usr 0.03 ( 0%) sys 0.64
( 0%) wall 18541 kB ( 2%) ggc
tree SSA rewrite : 1.92 ( 1%) usr 0.03 ( 0%) sys 1.98
( 1%) wall 35021 kB ( 5%) ggc
tree SSA other : 0.19 ( 0%) usr 0.12 ( 1%) sys 0.29
( 0%) wall 0 kB ( 0%) ggc
tree SSA incremental : 9.05 ( 5%) usr 0.40 ( 3%) sys 9.35
( 5%) wall 14359 kB ( 2%) ggc
tree operand scan : 0.69 ( 0%) usr 0.20 ( 1%) sys 0.90
( 0%) wall 27918 kB ( 4%) ggc
dominator optimization: 2.86 ( 2%) usr 0.04 ( 0%) sys 2.93
( 2%) wall 44597 kB ( 6%) ggc
tree SRA : 0.00 ( 0%) usr 0.00 ( 0%) sys 0.01
( 0%) wall 0 kB ( 0%) ggc
tree STORE-CCP : 0.57 ( 0%) usr 0.00 ( 0%) sys 0.57
( 0%) wall 1024 kB ( 0%) ggc
tree CCP : 1.14 ( 1%) usr 0.01 ( 0%) sys 1.15
( 1%) wall 1537 kB ( 0%) ggc
tree PHI const/copy prop: 0.24 ( 0%) usr 0.00 ( 0%) sys 0.22
( 0%) wall 11 kB ( 0%) ggc
tree split crit edges : 0.12 ( 0%) usr 0.00 ( 0%) sys 0.11
( 0%) wall 33698 kB ( 4%) ggc
tree reassociation : 0.63 ( 0%) usr 0.01 ( 0%) sys 0.62
( 0%) wall 1 kB ( 0%) ggc
tree FRE : 0.26 ( 0%) usr 0.00 ( 0%) sys 0.26
( 0%) wall 5 kB ( 0%) ggc
tree code sinking : 0.46 ( 0%) usr 0.00 ( 0%) sys 0.47
( 0%) wall 6 kB ( 0%) ggc
tree linearize phis : 0.27 ( 0%) usr 0.00 ( 0%) sys 0.26
( 0%) wall 0 kB ( 0%) ggc
tree forward propagate: 0.32 ( 0%) usr 0.00 ( 0%) sys 0.33
( 0%) wall 426 kB ( 0%) ggc
tree conservative DCE : 1.58 ( 1%) usr 0.00 ( 0%) sys 1.59
( 1%) wall 0 kB ( 0%) ggc
tree aggressive DCE : 0.34 ( 0%) usr 0.00 ( 0%) sys 0.34
( 0%) wall 0 kB ( 0%) ggc
tree DSE : 0.36 ( 0%) usr 0.00 ( 0%) sys 0.37
( 0%) wall 1 kB ( 0%) ggc
PHI merge : 0.07 ( 0%) usr 0.01 ( 0%) sys 0.07
( 0%) wall 7192 kB ( 1%) ggc
tree loop bounds : 0.16 ( 0%) usr 0.00 ( 0%) sys 0.16
( 0%) wall 2 kB ( 0%) ggc
loop invariant motion : 0.31 ( 0%) usr 0.00 ( 0%) sys 0.31
( 0%) wall 0 kB ( 0%) ggc
tree canonical iv : 0.02 ( 0%) usr 0.00 ( 0%) sys 0.03
( 0%) wall 0 kB ( 0%) ggc
scev constant prop : 0.61 ( 0%) usr 0.00 ( 0%) sys 0.62
( 0%) wall 17793 kB ( 2%) ggc
complete unrolling : 0.02 ( 0%) usr 0.00 ( 0%) sys 0.01
( 0%) wall 0 kB ( 0%) ggc
tree loop init : 3.13 ( 2%) usr 0.08 ( 1%) sys 3.26
( 2%) wall 45121 kB ( 6%) ggc
tree loop fini : 0.01 ( 0%) usr 0.00 ( 0%) sys 0.01
( 0%) wall 0 kB ( 0%) ggc
tree copy headers : 0.06 ( 0%) usr 0.00 ( 0%) sys 0.07
( 0%) wall 0 kB ( 0%) ggc
tree SSA uncprop : 0.25 ( 0%) usr 0.00 ( 0%) sys 0.26
( 0%) wall 0 kB ( 0%) ggc
tree SSA to normal : 11.37 ( 7%) usr 0.09 ( 1%) sys 11.48
( 6%) wall 90617 kB (12%) ggc
tree NRV optimization : 0.00 ( 0%) usr 0.00 ( 0%) sys 0.01
( 0%) wall 0 kB ( 0%) ggc
tree rename SSA copies: 0.54 ( 0%) usr 0.02 ( 0%) sys 0.55
( 0%) wall 0 kB ( 0%) ggc
dominance frontiers : 0.43 ( 0%) usr 0.00 ( 0%) sys 0.45
( 0%) wall 0 kB ( 0%) ggc
dominance computation : 2.37 ( 1%) usr 0.05 ( 0%) sys 2.44
( 1%) wall 0 kB ( 0%) ggc
expand : 13.62 ( 8%) usr 1.64 (12%) sys 15.22
( 8%) wall 91541 kB (12%) ggc
lower subreg : 0.21 ( 0%) usr 0.01 ( 0%) sys 0.23
( 0%) wall 0 kB ( 0%) ggc
jump : 0.04 ( 0%) usr 0.00 ( 0%) sys 0.04
( 0%) wall 0 kB ( 0%) ggc
CSE : 0.76 ( 0%) usr 0.01 ( 0%) sys 0.77
( 0%) wall 1403 kB ( 0%) ggc
dead code elimination : 0.47 ( 0%) usr 0.00 ( 0%) sys 0.47
( 0%) wall 0 kB ( 0%) ggc
dead store elim1 : 0.42 ( 0%) usr 0.04 ( 0%) sys 0.46
( 0%) wall 7973 kB ( 1%) ggc
dead store elim2 : 0.47 ( 0%) usr 0.01 ( 0%) sys 0.48
( 0%) wall 8688 kB ( 1%) ggc
loop analysis : 0.57 ( 0%) usr 0.02 ( 0%) sys 0.57
( 0%) wall 70 kB ( 0%) ggc
branch prediction : 0.93 ( 1%) usr 0.00 ( 0%) sys 0.95
( 1%) wall 1541 kB ( 0%) ggc
combiner : 2.61 ( 1%) usr 0.03 ( 0%) sys 2.64
( 1%) wall 28000 kB ( 4%) ggc
if-conversion : 1.49 ( 1%) usr 0.00 ( 0%) sys 1.51
( 1%) wall 586 kB ( 0%) ggc
local alloc : 7.94 ( 5%) usr 0.02 ( 0%) sys 7.97
( 4%) wall 7070 kB ( 1%) ggc
global alloc : 17.58 (10%) usr 0.29 ( 2%) sys 17.88
(10%) wall 4961 kB ( 1%) ggc
reload CSE regs : 1.18 ( 1%) usr 0.02 ( 0%) sys 1.18
( 1%) wall 12069 kB ( 2%) ggc
thread pro- & epilogue: 0.09 ( 0%) usr 0.00 ( 0%) sys 0.09
( 0%) wall 4 kB ( 0%) ggc
if-conversion 2 : 0.36 ( 0%) usr 0.00 ( 0%) sys 0.34
( 0%) wall 119 kB ( 0%) ggc
rename registers : 0.61 ( 0%) usr 0.03 ( 0%) sys 0.64
( 0%) wall 29 kB ( 0%) ggc
scheduling 2 : 2.51 ( 1%) usr 0.05 ( 0%) sys 2.55
( 1%) wall 0 kB ( 0%) ggc
machine dep reorg : 0.50 ( 0%) usr 0.00 ( 0%) sys 0.50
( 0%) wall 148 kB ( 0%) ggc
reorder blocks : 0.24 ( 0%) usr 0.00 ( 0%) sys 0.25
( 0%) wall 6727 kB ( 1%) ggc
final : 1.17 ( 1%) usr 0.02 ( 0%) sys 1.19
( 1%) wall 0 kB ( 0%) ggc
tree if-combine : 0.05 ( 0%) usr 0.00 ( 0%) sys 0.05
( 0%) wall 224 kB ( 0%) ggc
TOTAL : 174.55 13.49
188.09 776880 kB
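The two TOTAL lines can be compared directly. The helper below is only a sketch for doing that arithmetic; relative_reduction is a hypothetical name, and the numbers fed to it are the ones printed in the two reports above.

```c
#include <assert.h>

/* Fraction by which a measurement dropped, e.g. 0.25 for a 25% reduction.
   Hypothetical helper for illustration; not part of GCC.  */
double
relative_reduction (double before, double after)
{
  assert (before > 0.0);
  return (before - after) / before;
}
```

For example, relative_reduction (241.56, 174.55) gives roughly 0.28 for total user time, relative_reduction (257.94, 188.09) roughly 0.27 for total wall time, and relative_reduction (98.09, 15.71) roughly 0.84 for the "df live&initialized regs" user time alone. The ggc total is 776880 kB in both runs, so the drop seen in "top" is not visible in the garbage-collected memory figures.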
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=26854
^ permalink raw reply [flat|nested] 144+ messages in thread
* [Bug tree-optimization/26854] Inordinate compile times on large routines
2006-03-24 20:25 [Bug c/26854] New: Inordinate compile times on large routines lucier at math dot purdue dot edu
` (37 preceding siblings ...)
2007-12-19 23:31 ` lucier at math dot purdue dot edu
@ 2007-12-20 0:02 ` steven at gcc dot gnu dot org
2007-12-20 2:29 ` lucier at math dot purdue dot edu
` (81 subsequent siblings)
120 siblings, 0 replies; 144+ messages in thread
From: steven at gcc dot gnu dot org @ 2007-12-20 0:02 UTC (permalink / raw)
To: gcc-bugs
------- Comment #39 from steven at gcc dot gnu dot org 2007-12-20 00:02 -------
We badly need a way to track memory in DF. Because DF uses alloc_pools for
almost all its data structures, the memory statistics are only recorded if you
configure with --enable-gather-detailed-mem-stats. I think it would be good if
the DF problems reported an estimate of their memory usage in the dump files,
or if a function were available that you could call from GDB to get such an
estimate.
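As a rough illustration of what such an estimate function could look like, here is a minimal sketch. Everything in it is an assumption: struct toy_pool and its fields are hypothetical stand-ins for a pool allocator's bookkeeping, not GCC's alloc_pool, and the report format is made up.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical per-pool bookkeeping; not GCC's alloc_pool structure.  */
struct toy_pool
{
  const char *name;       /* label used in the report */
  size_t elt_size;        /* bytes per element */
  size_t elts_per_block;  /* elements carved out of each block */
  size_t blocks;          /* blocks obtained from the system so far */
  size_t elts_in_use;     /* elements currently handed out */
};

/* Bytes the pool has reserved, whether or not they are in use.  */
size_t
toy_pool_reserved (const struct toy_pool *p)
{
  return p->blocks * p->elts_per_block * p->elt_size;
}

/* Print one line per pool to F and return the total bytes reserved.
   Taking a FILE * lets the same code write to a dump file or to stderr.  */
size_t
toy_pool_report (FILE *f, const struct toy_pool *pools, size_t n)
{
  size_t total = 0;
  for (size_t i = 0; i < n; i++)
    {
      size_t reserved = toy_pool_reserved (&pools[i]);
      assert (pools[i].elts_in_use <= pools[i].blocks * pools[i].elts_per_block);
      fprintf (f, "%-12s %zu bytes reserved, %zu in use\n", pools[i].name,
               reserved, pools[i].elts_in_use * pools[i].elt_size);
      total += reserved;
    }
  return total;
}
```

A function of this shape can be invoked from a debugger session with a "call" command, which is the kind of use described above.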
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=26854
^ permalink raw reply [flat|nested] 144+ messages in thread
* [Bug tree-optimization/26854] Inordinate compile times on large routines
2006-03-24 20:25 [Bug c/26854] New: Inordinate compile times on large routines lucier at math dot purdue dot edu
` (38 preceding siblings ...)
2007-12-20 0:02 ` steven at gcc dot gnu dot org
@ 2007-12-20 2:29 ` lucier at math dot purdue dot edu
2007-12-20 3:07 ` zadeck at naturalbridge dot com
` (80 subsequent siblings)
120 siblings, 0 replies; 144+ messages in thread
From: lucier at math dot purdue dot edu @ 2007-12-20 2:29 UTC (permalink / raw)
To: gcc-bugs
------- Comment #40 from lucier at math dot purdue dot edu 2007-12-20 02:29 -------
Created an attachment (id=14798)
--> (http://gcc.gnu.org/bugzilla/attachment.cgi?id=14798&action=view)
detailed memory usage report
I rebuilt mainline with --enable-gather-detailed-mem-stats and this is the
output for the run in comment 38.
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=26854
^ permalink raw reply [flat|nested] 144+ messages in thread
* [Bug tree-optimization/26854] Inordinate compile times on large routines
2006-03-24 20:25 [Bug c/26854] New: Inordinate compile times on large routines lucier at math dot purdue dot edu
` (39 preceding siblings ...)
2007-12-20 2:29 ` lucier at math dot purdue dot edu
@ 2007-12-20 3:07 ` zadeck at naturalbridge dot com
2007-12-20 3:52 ` lucier at math dot purdue dot edu
` (79 subsequent siblings)
120 siblings, 0 replies; 144+ messages in thread
From: zadeck at naturalbridge dot com @ 2007-12-20 3:07 UTC (permalink / raw)
To: gcc-bugs
------- Comment #41 from zadeck at naturalbridge dot com 2007-12-20 03:06 -------
Subject: Re: Inordinate compile times on large routines
lucier at math dot purdue dot edu wrote:
> ------- Comment #40 from lucier at math dot purdue dot edu 2007-12-20 02:29 -------
> Created an attachment (id=14798)
> --> (http://gcc.gnu.org/bugzilla/attachment.cgi?id=14798&action=view)
> detailed memory usage report
>
> I rebuilt mainline with --enable-gather-detailed-mem-stats and this is the
> output for the run in comment 38.
>
>
>
you should look at http://gcc.gnu.org/bugzilla/show_bug.cgi?id=34400#c42
kenny
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=26854
^ permalink raw reply [flat|nested] 144+ messages in thread
* [Bug tree-optimization/26854] Inordinate compile times on large routines
2006-03-24 20:25 [Bug c/26854] New: Inordinate compile times on large routines lucier at math dot purdue dot edu
` (40 preceding siblings ...)
2007-12-20 3:07 ` zadeck at naturalbridge dot com
@ 2007-12-20 3:52 ` lucier at math dot purdue dot edu
2007-12-20 14:49 ` zadeck at naturalbridge dot com
` (78 subsequent siblings)
120 siblings, 0 replies; 144+ messages in thread
From: lucier at math dot purdue dot edu @ 2007-12-20 3:52 UTC (permalink / raw)
To: gcc-bugs
------- Comment #42 from lucier at math dot purdue dot edu 2007-12-20 03:52 -------
Created an attachment (id=14799)
--> (http://gcc.gnu.org/bugzilla/attachment.cgi?id=14799&action=view)
memory details for an unpatched mainline
Here is the same information without Steven's two patches for mainline.
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=26854
^ permalink raw reply [flat|nested] 144+ messages in thread
* [Bug tree-optimization/26854] Inordinate compile times on large routines
2006-03-24 20:25 [Bug c/26854] New: Inordinate compile times on large routines lucier at math dot purdue dot edu
` (41 preceding siblings ...)
2007-12-20 3:52 ` lucier at math dot purdue dot edu
@ 2007-12-20 14:49 ` zadeck at naturalbridge dot com
2007-12-20 15:08 ` stevenb dot gcc at gmail dot com
` (77 subsequent siblings)
120 siblings, 0 replies; 144+ messages in thread
From: zadeck at naturalbridge dot com @ 2007-12-20 14:49 UTC (permalink / raw)
To: gcc-bugs
------- Comment #43 from zadeck at naturalbridge dot com 2007-12-20 14:49 -------
Subject: Re: Inordinate compile times on large routines
lucier at math dot purdue dot edu wrote:
> ------- Comment #42 from lucier at math dot purdue dot edu 2007-12-20 03:52 -------
> Created an attachment (id=14799)
> --> (http://gcc.gnu.org/bugzilla/attachment.cgi?id=14799&action=view)
> memory details for an unpatched mainline
>
> Here is the same information without Steven's two patches for mainline.
>
>
>
Could you add the attached patch in and rerun your example?
It will add 4 lines to indicate what kinds of def-use and use-def chains
are being created.
A lot of the space is being used by these chains and I want to find out
how many of those chains are for artificial uses and defs.
thanks
kenny
Index: df-problems.c
===================================================================
--- df-problems.c (revision 131096)
+++ df-problems.c (working copy)
@@ -1855,13 +1855,23 @@ df_live_verify_transfer_functions (void)
#define df_chain_problem_p(FLAG) (((enum df_chain_flags)df_chain->local_flags)&(FLAG))
+static long df_chain_counters[4];
+
/* Create a du or ud chain from SRC to DST and link it into SRC. */
struct df_link *
df_chain_create (struct df_ref *src, struct df_ref *dst)
{
struct df_link *head = DF_REF_CHAIN (src);
- struct df_link *link = pool_alloc (df_chain->block_pool);;
+ struct df_link *link = pool_alloc (df_chain->block_pool);
+ int index = 0;
+
+ if (!src->insn)
+ index += (src->type == DF_REF_REG_DEF) ? 2 : 1;
+ if (!dst->insn)
+ index += (src->type == DF_REF_REG_DEF) ? 2 : 1;
+
+ df_chain_counters[index]++;
DF_REF_CHAIN (src) = link;
link->next = head;
@@ -2156,11 +2166,18 @@ df_chain_finalize (bitmap all_blocks)
{
unsigned int bb_index;
bitmap_iterator bi;
-
+
+ memset (df_chain_counters, 0, 4*sizeof(long));
+
EXECUTE_IF_SET_IN_BITMAP (all_blocks, 0, bb_index, bi)
{
df_chain_create_bb (bb_index);
}
+
+ fprintf (stderr, "real -> real = %ld\n", df_chain_counters[0]);
+ fprintf (stderr, "real -> art = %ld\n", df_chain_counters[1]);
+ fprintf (stderr, "art -> real = %ld\n", df_chain_counters[2]);
+ fprintf (stderr, "art -> art = %ld\n", df_chain_counters[3]);
}
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=26854
* [Bug tree-optimization/26854] Inordinate compile times on large routines
From: stevenb dot gcc at gmail dot com @ 2007-12-20 15:08 UTC (permalink / raw)
To: gcc-bugs
------- Comment #44 from stevenb dot gcc at gmail dot com 2007-12-20 15:08 -------
Subject: Re: Inordinate compile times on large routines
On 20 Dec 2007 14:49:12 -0000, zadeck at naturalbridge dot com
<gcc-bugzilla@gcc.gnu.org> wrote:
>
>
> ------- Comment #43 from zadeck at naturalbridge dot com 2007-12-20 14:49 -------
> Subject: Re: Inordinate compile times on large routines
>
> lucier at math dot purdue dot edu wrote:
> > ------- Comment #42 from lucier at math dot purdue dot edu 2007-12-20 03:52 -------
> > Created an attachment (id=14799)
> > --> (http://gcc.gnu.org/bugzilla/attachment.cgi?id=14799&action=view)
> > memory details for an unpatched mainline
> >
> > Here is the same information without Steven's two patches for mainline.
> >
> >
> >
> Could you add the attached patch in and rerun your example?
>
> It will add 4 lines to indicate what kinds of def-use and use-def chains
> are being created.
> A lot of the space is being used by these chains and I want to find out
> how many of those chains are for artificial uses and defs.
>
> thanks
>
> kenny
> struct df_link *
> df_chain_create (struct df_ref *src, struct df_ref *dst)
> {
> struct df_link *head = DF_REF_CHAIN (src);
> - struct df_link *link = pool_alloc (df_chain->block_pool);;
> + struct df_link *link = pool_alloc (df_chain->block_pool);
> + int index = 0;
> +
> + if (!src->insn)
> + index += (src->type == DF_REF_REG_DEF) ? 2 : 1;
> + if (!dst->insn)
> + index += (src->type == DF_REF_REG_DEF) ? 2 : 1;
> +
> + df_chain_counters[index]++;
Watch for segfaults. Index will be 1, 2, 3, or 4.
df_chain_counters[4] does not exist.
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=26854
* [Bug tree-optimization/26854] Inordinate compile times on large routines
From: zadeck at naturalbridge dot com @ 2007-12-20 15:31 UTC (permalink / raw)
To: gcc-bugs
------- Comment #45 from zadeck at naturalbridge dot com 2007-12-20 15:31 -------
Subject: Re: Inordinate compile times on large routines
stevenb dot gcc at gmail dot com wrote:
> ------- Comment #44 from stevenb dot gcc at gmail dot com 2007-12-20 15:08 -------
> Subject: Re: Inordinate compile times on large routines
>
> On 20 Dec 2007 14:49:12 -0000, zadeck at naturalbridge dot com
> <gcc-bugzilla@gcc.gnu.org> wrote:
>
>> ------- Comment #43 from zadeck at naturalbridge dot com 2007-12-20 14:49 -------
>> Subject: Re: Inordinate compile times on large routines
>>
>> lucier at math dot purdue dot edu wrote:
>>
>>> ------- Comment #42 from lucier at math dot purdue dot edu 2007-12-20 03:52 -------
>>> Created an attachment (id=14799)
>>> --> (http://gcc.gnu.org/bugzilla/attachment.cgi?id=14799&action=view)
>>> memory details for an unpatched mainline
>>>
>>> Here is the same information without Steven's two patches for mainline.
>>>
>>>
>>>
>>>
>> Could you add the attached patch in and rerun your example?
>>
>> It will add 4 lines to indicate what kinds of def-use and use-def chains
>> are being created.
>> A lot of the space is being used by these chains and I want to find out
>> how many of those chains are for artificial uses and defs.
>>
>> thanks
>>
>> kenny
>> struct df_link *
>> df_chain_create (struct df_ref *src, struct df_ref *dst)
>> {
>> struct df_link *head = DF_REF_CHAIN (src);
>> - struct df_link *link = pool_alloc (df_chain->block_pool);;
>> + struct df_link *link = pool_alloc (df_chain->block_pool);
>> + int index = 0;
>> +
>> + if (!src->insn)
>> + index += (src->type == DF_REF_REG_DEF) ? 2 : 1;
>> + if (!dst->insn)
>> + index += (src->type == DF_REF_REG_DEF) ? 2 : 1;
>> +
>> + df_chain_counters[index]++;
>>
>
> Watch for segfaults. Index will be 1, 2, 3, or 4.
> df_chain_counters[4] does not exist.
>
>
>
indexes will be 0, 1, 2, 3.
there are no def-def chains, and in particular there are no artificial
def to artificial def chains. those increments only happen for
artificial defs or uses. Regular uses or defs have an insn. a normal
def-use chain will have index 0.
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=26854
* [Bug tree-optimization/26854] Inordinate compile times on large routines
From: zadeck at naturalbridge dot com @ 2007-12-20 16:06 UTC (permalink / raw)
To: gcc-bugs
------- Comment #46 from zadeck at naturalbridge dot com 2007-12-20 16:06 -------
Subject: Re: Inordinate compile times on large routines
> indexes will be 0, 1, 2, 3.
>
> there are no def-def chains, and in particular there are no artificial
> def to artificial def chains. those increments only happen for
> artificial defs or uses. Regular uses or defs have an insn. a normal
> def-use chain will have index 0.
>
>
>
However, there is a bug in the patch that Steven did not notice; try
this one instead.
Index: df-problems.c
===================================================================
--- df-problems.c (revision 131096)
+++ df-problems.c (working copy)
@@ -1855,13 +1855,23 @@ df_live_verify_transfer_functions (void)
#define df_chain_problem_p(FLAG) (((enum
df_chain_flags)df_chain->local_flags)&(FLAG))
+static long df_chain_counters[4];
+
/* Create a du or ud chain from SRC to DST and link it into SRC. */
struct df_link *
df_chain_create (struct df_ref *src, struct df_ref *dst)
{
struct df_link *head = DF_REF_CHAIN (src);
- struct df_link *link = pool_alloc (df_chain->block_pool);;
+ struct df_link *link = pool_alloc (df_chain->block_pool);
+ int index = 0;
+
+ if (!src->insn)
+ index += (src->type == DF_REF_REG_DEF) ? 2 : 1;
+ if (!dst->insn)
+ index += (dst->type == DF_REF_REG_DEF) ? 2 : 1;
+
+ df_chain_counters[index]++;
DF_REF_CHAIN (src) = link;
link->next = head;
@@ -2156,11 +2166,18 @@ df_chain_finalize (bitmap all_blocks)
{
unsigned int bb_index;
bitmap_iterator bi;
-
+
+ memset (df_chain_counters, 0, 4*sizeof(long));
+
EXECUTE_IF_SET_IN_BITMAP (all_blocks, 0, bb_index, bi)
{
df_chain_create_bb (bb_index);
}
+
+ fprintf (stderr, "real -> real = %ld\n", df_chain_counters[0]);
+ fprintf (stderr, "real -> art = %ld\n", df_chain_counters[1]);
+ fprintf (stderr, "art -> real = %ld\n", df_chain_counters[2]);
+ fprintf (stderr, "art -> art = %ld\n", df_chain_counters[3]);
}
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=26854
* [Bug tree-optimization/26854] Inordinate compile times on large routines
From: lucier at math dot purdue dot edu @ 2007-12-20 16:11 UTC (permalink / raw)
To: gcc-bugs
------- Comment #47 from lucier at math dot purdue dot edu 2007-12-20 16:11 -------
Subject: Re: Inordinate compile times on large routines
I don't know what's happening here, the patch doesn't apply; first I get
euler-13% patch < zadeck2.patch
patching file df-problems.c
patch: **** malformed patch at line 8: df_chain_flags)df_chain->local_flags)&(FLAG))
and then after I join this line to the previous one (I think bugzilla
reformatted those lines), I get
euler-15% !pa
patch < zadeck2.patch
patching file df-problems.c
Hunk #1 FAILED at 1855.
1 out of 2 hunks FAILED -- saving rejects to file df-problems.c.rej
euler-16% cat df-problems.c.rej
***************
*** 1855,1867 ****
#define df_chain_problem_p(FLAG) (((enum df_chain_flags)df_chain->local_flags)&(FLAG))
/* Create a du or ud chain from SRC to DST and link it into SRC. */
struct df_link *
df_chain_create (struct df_ref *src, struct df_ref *dst)
{
struct df_link *head = DF_REF_CHAIN (src);
- struct df_link *link = pool_alloc (df_chain->block_pool);;
DF_REF_CHAIN (src) = link;
link->next = head;
--- 1855,1877 ----
#define df_chain_problem_p(FLAG) (((enum df_chain_flags)df_chain->local_flags)&(FLAG))
+ static long df_chain_counters[4];
+
/* Create a du or ud chain from SRC to DST and link it into SRC. */
struct df_link *
df_chain_create (struct df_ref *src, struct df_ref *dst)
{
struct df_link *head = DF_REF_CHAIN (src);
+ struct df_link *link = pool_alloc (df_chain->block_pool);
+ int index = 0;
+
+ if (!src->insn)
+ index += (src->type == DF_REF_REG_DEF) ? 2 : 1;
+ if (!dst->insn)
+ index += (dst->type == DF_REF_REG_DEF) ? 2 : 1;
+
+ df_chain_counters[index]++;
DF_REF_CHAIN (src) = link;
link->next = head;
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=26854
* [Bug tree-optimization/26854] Inordinate compile times on large routines
From: zadeck at naturalbridge dot com @ 2007-12-20 17:28 UTC (permalink / raw)
To: gcc-bugs
------- Comment #48 from zadeck at naturalbridge dot com 2007-12-20 17:28 -------
Created an attachment (id=14801)
--> (http://gcc.gnu.org/bugzilla/attachment.cgi?id=14801&action=view)
patch to count different types of def-use chains
this patch replaces the one munged by bugzilla
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=26854
* [Bug tree-optimization/26854] Inordinate compile times on large routines
From: lucier at math dot purdue dot edu @ 2007-12-20 18:56 UTC (permalink / raw)
To: gcc-bugs
------- Comment #49 from lucier at math dot purdue dot edu 2007-12-20 18:56 -------
Subject: Re: Inordinate compile times on large routines
I think this is the extra information you wanted:
> real -> real = 163962912
> real -> art = 0
> art -> real = 0
> art -> art = 0
Brad
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=26854
* [Bug tree-optimization/26854] Inordinate compile times on large routines
From: zadeck at naturalbridge dot com @ 2008-01-17 21:41 UTC (permalink / raw)
To: gcc-bugs
------- Comment #50 from zadeck at naturalbridge dot com 2008-01-17 21:20 -------
Subject:
Mark,
Am I allowed to set the target milestone for a patch or is that your job?
26854 is not going to get fixed for 4.3. We made a lot of progress on it
with the patches to 34400, but the largest remaining problem is the space
that the current representation of def-use and use-def chains requires.
I should be able to almost cut this in half if we move to something like
a vec rather than a linked list.
But this is a big patch and I do not want to start this until stage I.
kenny
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=26854
* [Bug tree-optimization/26854] Inordinate compile times on large routines
From: rguenth at gcc dot gnu dot org @ 2008-01-17 21:55 UTC (permalink / raw)
To: gcc-bugs
------- Comment #51 from rguenth at gcc dot gnu dot org 2008-01-17 21:43 -------
As this isn't even marked as a regression, you can fix it whenever you like ;)
Only regressions have a target milestone before they are actually fixed,
though.
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=26854
* [Bug tree-optimization/26854] Inordinate compile times on large routines
From: zadeck at naturalbridge dot com @ 2008-01-17 22:07 UTC (permalink / raw)
To: gcc-bugs
------- Comment #52 from zadeck at naturalbridge dot com 2008-01-17 21:46 -------
Subject: Re: Inordinate compile times on large routines
rguenth at gcc dot gnu dot org wrote:
> ------- Comment #51 from rguenth at gcc dot gnu dot org 2008-01-17 21:43 -------
> As this isn't even marked as a regression, you can fix it whenever you like ;)
>
> Only regressions have a target milestone before they are actually fixed,
> though.
>
>
>
just between you and me this is most likely a regression, on the other
hand, i think that people who write functions this large should be
thrown into a pit.
kenny
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=26854
* [Bug tree-optimization/26854] Inordinate compile times on large routines
From: lucier at math dot purdue dot edu @ 2008-01-17 22:20 UTC (permalink / raw)
To: gcc-bugs
------- Comment #53 from lucier at math dot purdue dot edu 2008-01-17 21:53 -------
Subject: Re: Inordinate compile times on large routines
On Jan 17, 2008, at 4:46 PM, zadeck at naturalbridge dot com wrote:
> just between you and me this is most likely a regression,
I, too, believe it is a regression; if you like I can come up with
results from older compilers
> on the other
> hand, i think that people who write functions this large should be
> thrown into a pit.
Luckily, it was written by a code-generator, and not by hand. ;-)
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=26854
* [Bug tree-optimization/26854] Inordinate compile times on large routines
From: lucier at math dot purdue dot edu @ 2008-01-17 22:54 UTC (permalink / raw)
To: gcc-bugs
------- Comment #54 from lucier at math dot purdue dot edu 2008-01-17 22:39 -------
Created an attachment (id=14963)
--> (http://gcc.gnu.org/bugzilla/attachment.cgi?id=14963&action=view)
memory details for 131610
This is the detailed memory usage for the compiler
euler-5% /pkgs/gcc-mainline/bin/gcc -v
Using built-in specs.
Target: x86_64-unknown-linux-gnu
Configured with: ../../mainline/configure --prefix=/pkgs/gcc-mainline
--enable-languages=c --enable-checking=release --with-gmp=/pkgs/gmp-4.2.2
--with-mpfr=/pkgs/gmp-4.2.2 --enable-gather-detailed-mem-stats
Thread model: posix
gcc version 4.3.0 20080117 (experimental) [trunk revision 131610] (GCC)
The maximum memory I observed in top was 10.2 GB.
Kenny, I can't tell whether your patch from
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=34400#c50
has been committed; will that improve the situation, too?
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=26854
* [Bug tree-optimization/26854] Inordinate compile times on large routines
From: zadeck at naturalbridge dot com @ 2008-01-17 23:58 UTC (permalink / raw)
To: gcc-bugs
------- Comment #55 from zadeck at naturalbridge dot com 2008-01-17 22:57 -------
Subject: Re: Inordinate compile times on large routines
lucier at math dot purdue dot edu wrote:
> ------- Comment #54 from lucier at math dot purdue dot edu 2008-01-17 22:39 -------
> Created an attachment (id=14963)
> --> (http://gcc.gnu.org/bugzilla/attachment.cgi?id=14963&action=view)
> memory details for 131610
>
> This is the detailed memory usage for the compiler
>
> euler-5% /pkgs/gcc-mainline/bin/gcc -v
> Using built-in specs.
> Target: x86_64-unknown-linux-gnu
> Configured with: ../../mainline/configure --prefix=/pkgs/gcc-mainline
> --enable-languages=c --enable-checking=release --with-gmp=/pkgs/gmp-4.2.2
> --with-mpfr=/pkgs/gmp-4.2.2 --enable-gather-detailed-mem-stats
> Thread model: posix
> gcc version 4.3.0 20080117 (experimental) [trunk revision 131610] (GCC)
>
> The maximum memory I observed in top was 10.2 GB.
>
> Kenny, I can't tell whether your patch from
>
> http://gcc.gnu.org/bugzilla/show_bug.cgi?id=34400#c50
>
> has been committed; will that improve the situation, too?
>
>
>
It could, but it is not the big issue here; the big issue is the size of
the def-use chains.
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=26854
* [Bug tree-optimization/26854] Inordinate compile times on large routines
From: lucier at math dot purdue dot edu @ 2008-01-18 1:46 UTC (permalink / raw)
To: gcc-bugs
------- Comment #56 from lucier at math dot purdue dot edu 2008-01-18 01:38 -------
gcc is now 5-6 times faster than it was nearly two years ago when this was
first reported; many changes have made significant improvements in cpu time.
But Steven Bosscher's patch from December still improved things more on this
test case.
In particular, on 12/20/2007, without the patch, CPU time from
http://gcc.gnu.org/bugzilla/attachment.cgi?id=14799
was
TOTAL : 300.21 19.16 319.52 778432 kB
After Steven Bosscher's patch
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=34400#c28
it was
TOTAL : 210.97 15.80 226.88 778432 kB
and today it's
TOTAL : 281.08 18.03 299.41 776514 kB
Would it still be a good idea to apply Steven's patch?
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=26854
* [Bug tree-optimization/26854] Inordinate compile times on large routines
From: zadeck at naturalbridge dot com @ 2008-01-18 2:18 UTC (permalink / raw)
To: gcc-bugs
------- Comment #57 from zadeck at naturalbridge dot com 2008-01-18 02:10 -------
Subject: Re: Inordinate compile times on large routines
lucier at math dot purdue dot edu wrote:
> ------- Comment #56 from lucier at math dot purdue dot edu 2008-01-18 01:38 -------
> gcc is now 5-6 times faster than it was nearly two years ago when this was
> first reported; many changes have made significant improvements in cpu time.
>
> But Steven Bosscher's patch from December still improved things more on this
> test case.
>
> In particular, on 12/20/2007, without the patch, CPU time from
>
> http://gcc.gnu.org/bugzilla/attachment.cgi?id=14799
>
> was
>
> TOTAL : 300.21 19.16 319.52
> 778432 kB
>
> After Steven Bosscher's patch
>
> http://gcc.gnu.org/bugzilla/show_bug.cgi?id=34400#c28
>
> it was
>
> TOTAL : 210.97 15.80 226.88
> 778432 kB
>
> and today it's
>
> TOTAL : 281.08 18.03 299.41
> 776514 kB
>
> Would it still be a good idea to apply Steven's patch?
>
>
>
The plan is to apply all of the patches; they each deal with a
different problem and the improvement should be additive.
kenny
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=26854
* [Bug tree-optimization/26854] Inordinate compile times on large routines
From: zadeck at gcc dot gnu dot org @ 2008-01-19 0:51 UTC (permalink / raw)
To: gcc-bugs
------- Comment #58 from zadeck at gcc dot gnu dot org 2008-01-19 00:39 -------
Subject: Bug 26854
Author: zadeck
Date: Sat Jan 19 00:38:34 2008
New Revision: 131649
URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=131649
Log:
2008-01-18 Kenneth Zadeck <zadeck@naturalbridge.com>
Steven Bosscher <stevenb.gcc@gmail.com>
PR rtl-optimization/26854
PR rtl-optimization/34400
* df-problems.c (df_live_scratch): New scratch bitmap.
(df_live_alloc): Allocate df_live_scratch when doing df_live.
(df_live_reset): Clear the proper bitmaps.
(df_live_bb_local_compute): Only process the artificial defs once
since the order is not important.
(df_live_init): Init the df_live sets only with the variables
found live by df_lr.
(df_live_transfer_function): Use the df_lr sets to prune the
df_live sets as they are being computed.
(df_live_free): Free df_live_scratch.
Modified:
trunk/gcc/ChangeLog
trunk/gcc/df-problems.c
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=26854
* [Bug tree-optimization/26854] Inordinate compile times on large routines
From: zadeck at gcc dot gnu dot org @ 2008-01-20 2:21 UTC (permalink / raw)
To: gcc-bugs
------- Comment #59 from zadeck at gcc dot gnu dot org 2008-01-20 01:49 -------
Subject: Bug 26854
Author: zadeck
Date: Sun Jan 20 01:48:25 2008
New Revision: 131670
URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=131670
Log:
2008-01-19 Kenneth Zadeck <zadeck@naturalbridge.com>
PR rtl-optimization/26854
PR rtl-optimization/34400
* ddg.c (create_ddg_dep_from_intra_loop_link): Do not use
DF_RD->gen.
* df.h (df_changeable_flags.DF_RD_NO_TRIM): New.
(df_rd_bb_info.expanded_lr_out): New.
* loop-invariant.c (find_defs): Added DF_RD_NO_TRIM flag.
* loop-iv.c (iv_analysis_loop_init): Ditto.
* df-problems.c (df_rd_free_bb_info, df_rd_alloc, df_rd_confluence_n,
df_rd_bb_local_compute, df_rd_transfer_function, df_rd_free):
Added code to allocate, initialize or free expanded_lr_out.
(df_rd_bb_local_compute_process_def): Restructured to make
more understandable.
(df_rd_confluence_n): Add code to do nothing with fake edges and
code to not apply invalidate_by_call sets if the sets are being trimmed.
(df_lr_local_finalize): Renamed to df_lr_finalize.
(df_live_local_finalize): Renamed to df_live_finalize.
Modified:
trunk/gcc/ChangeLog
trunk/gcc/ddg.c
trunk/gcc/df-problems.c
trunk/gcc/df.h
trunk/gcc/loop-invariant.c
trunk/gcc/loop-iv.c
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=26854
* [Bug tree-optimization/26854] Inordinate compile times on large routines
From: zadeck at gcc dot gnu dot org @ 2008-01-22 13:59 UTC (permalink / raw)
To: gcc-bugs
------- Comment #60 from zadeck at gcc dot gnu dot org 2008-01-22 13:57 -------
Subject: Bug 26854
Author: zadeck
Date: Tue Jan 22 13:57:01 2008
New Revision: 131719
URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=131719
Log:
2008-01-22 Kenneth Zadeck <zadeck@naturalbridge.com>
PR rtl-optimization/26854
PR rtl-optimization/34400
PR rtl-optimization/34884
* ddg.c (create_ddg_dep_from_intra_loop_link): Use
DF_RD->gen.
* df.h (df_changeable_flags.DF_RD_NO_TRIM): Deleted.
(df_rd_bb_info.expanded_lr_out): Deleted.
* loop-invariant.c (find_defs): Deleted DF_RD_NO_TRIM flag.
* loop-iv.c (iv_analysis_loop_init): Ditto.
* df-problems.c (df_rd_free_bb_info, df_rd_alloc, df_rd_confluence_n,
df_rd_bb_local_compute, df_rd_transfer_function, df_rd_free):
Removed code to allocate, initialize or free expanded_lr_out.
(df_rd_bb_local_compute_process_def): Restructured to make more
understandable.
(df_rd_confluence_n): Removed code to not apply invalidate_by_call
sets if the sets are being trimmed.
Modified:
trunk/gcc/ChangeLog
trunk/gcc/ddg.c
trunk/gcc/df-problems.c
trunk/gcc/df.h
trunk/gcc/loop-invariant.c
trunk/gcc/loop-iv.c
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=26854
* [Bug tree-optimization/26854] Inordinate compile times on large routines
From: lucier at math dot purdue dot edu @ 2008-01-23 15:45 UTC (permalink / raw)
To: gcc-bugs
------- Comment #61 from lucier at math dot purdue dot edu 2008-01-23 15:03 -------
Subject: Re: Inordinate compile times on large routines
Kenny:
Even after you backed out this latest patch the CPU usage was down
(to 203 seconds from 280 seconds on my machine) and the maximum
memory usage was down (to 7.3 GB from 10.2 GB). That's a big
improvement.
Brad
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=26854
* [Bug tree-optimization/26854] Inordinate compile times on large routines
From: lucier at math dot purdue dot edu @ 2008-05-15 2:49 UTC (permalink / raw)
To: gcc-bugs
------- Comment #62 from lucier at math dot purdue dot edu 2008-05-15 02:48 -------
I thought I might test the ira branch with
euler-3% /pkgs/gcc-ira/bin/gcc -v
Using built-in specs.
Target: x86_64-unknown-linux-gnu
Configured with: ../../ira/configure --enable-checking=release
--with-gmp=/pkgs/gmp-4.2.2/ --with-mpfr=/pkgs/gmp-4.2.2/ --prefix=/pkgs/gcc-ira
--enable-languages=c --enable-gather-detailed-mem-stats
Thread model: posix
gcc version 4.4.0 20080328 (experimental) [ira revision 135280] (GCC)
The command line was
/pkgs/gcc-ira/bin/gcc -fno-ira -Wall -W -Wno-unused -O1 -fno-math-errno
-fschedule-insns2 -fno-trapping-math -fno-strict-aliasing -fwrapv
-fomit-frame-pointer -fPIC -ftime-report -fmem-report -c all.i
with -fira and with -fno-ira.
The ira branch takes a lot longer to compile this code with -fira than without
it; the relevant lines seem to be:
for -fira:
 integrated RA : 373.36 (66%) usr   0.33 ( 2%) sys 375.87 (64%) wall   12064 kB ( 2%) ggc
 TOTAL         : 563.85            15.94           582.98             763565 kB
for -fno-ira:
 local alloc   :   8.42 ( 4%) usr   0.03 ( 0%) sys   8.43 ( 4%) wall    7073 kB ( 1%) ggc
 global alloc  :  20.91 (11%) usr   0.30 ( 2%) sys  21.23 (10%) wall    4961 kB ( 1%) ggc
 TOTAL         : 196.25            17.55           213.84             766052 kB
I'll add the complete reports as the next two attachments.
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=26854
^ permalink raw reply [flat|nested] 144+ messages in thread
* [Bug tree-optimization/26854] Inordinate compile times on large routines
2006-03-24 20:25 [Bug c/26854] New: Inordinate compile times on large routines lucier at math dot purdue dot edu
` (61 preceding siblings ...)
2008-05-15 2:49 ` lucier at math dot purdue dot edu
@ 2008-05-15 2:51 ` lucier at math dot purdue dot edu
2008-05-15 2:52 ` lucier at math dot purdue dot edu
` (57 subsequent siblings)
120 siblings, 0 replies; 144+ messages in thread
From: lucier at math dot purdue dot edu @ 2008-05-15 2:51 UTC (permalink / raw)
To: gcc-bugs
------- Comment #63 from lucier at math dot purdue dot edu 2008-05-15 02:50 -------
Created an attachment (id=15639)
--> (http://gcc.gnu.org/bugzilla/attachment.cgi?id=15639&action=view)
statistics for ira branch with -fno-ira
This is the output of the command in the previous comment with -fno-ira
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=26854
^ permalink raw reply [flat|nested] 144+ messages in thread
* [Bug tree-optimization/26854] Inordinate compile times on large routines
2006-03-24 20:25 [Bug c/26854] New: Inordinate compile times on large routines lucier at math dot purdue dot edu
` (62 preceding siblings ...)
2008-05-15 2:51 ` lucier at math dot purdue dot edu
@ 2008-05-15 2:52 ` lucier at math dot purdue dot edu
2008-05-15 5:59 ` steven at gcc dot gnu dot org
` (56 subsequent siblings)
120 siblings, 0 replies; 144+ messages in thread
From: lucier at math dot purdue dot edu @ 2008-05-15 2:52 UTC (permalink / raw)
To: gcc-bugs
------- Comment #64 from lucier at math dot purdue dot edu 2008-05-15 02:51 -------
Created an attachment (id=15640)
--> (http://gcc.gnu.org/bugzilla/attachment.cgi?id=15640&action=view)
statistics for ira branch with -fira
This is the output of the command in the previous comment with -fira
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=26854
^ permalink raw reply [flat|nested] 144+ messages in thread
* [Bug tree-optimization/26854] Inordinate compile times on large routines
2006-03-24 20:25 [Bug c/26854] New: Inordinate compile times on large routines lucier at math dot purdue dot edu
` (63 preceding siblings ...)
2008-05-15 2:52 ` lucier at math dot purdue dot edu
@ 2008-05-15 5:59 ` steven at gcc dot gnu dot org
2008-05-19 2:00 ` vmakarov at redhat dot com
` (55 subsequent siblings)
120 siblings, 0 replies; 144+ messages in thread
From: steven at gcc dot gnu dot org @ 2008-05-15 5:59 UTC (permalink / raw)
To: gcc-bugs
------- Comment #65 from steven at gcc dot gnu dot org 2008-05-15 05:59 -------
 integrated RA : 373.36 (66%) usr   0.33 ( 2%) sys 375.87 (64%) wall   12064 kB ( 2%) ggc
'nuff said.
Oh, not entirely yet: IRA should have more than one timevar.
--
steven at gcc dot gnu dot org changed:
What |Removed |Added
----------------------------------------------------------------------------
CC| |vmakarov at gcc dot gnu dot
| |org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=26854
^ permalink raw reply [flat|nested] 144+ messages in thread
* [Bug tree-optimization/26854] Inordinate compile times on large routines
2006-03-24 20:25 [Bug c/26854] New: Inordinate compile times on large routines lucier at math dot purdue dot edu
` (64 preceding siblings ...)
2008-05-15 5:59 ` steven at gcc dot gnu dot org
@ 2008-05-19 2:00 ` vmakarov at redhat dot com
2008-05-19 2:04 ` vmakarov at gcc dot gnu dot org
` (54 subsequent siblings)
120 siblings, 0 replies; 144+ messages in thread
From: vmakarov at redhat dot com @ 2008-05-19 2:00 UTC (permalink / raw)
To: gcc-bugs
------- Comment #66 from vmakarov at redhat dot com 2008-05-19 02:00 -------
The problem with IRA was that there were too many allocnos to choose from for
spilling. Most of the time was spent choosing the best allocno to spill. A
patch solving the problem is coming.
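
To illustrate why repeatedly choosing the best spill candidate can dominate
compile time, here is a minimal C sketch. It is not the IRA code: struct cand
and both functions are hypothetical, "lower priority value = spill first" is an
assumption, and a binary heap stands in for the splay tree that the ChangeLog
in the commit for this PR names. The point is only the growth rate: rescanning
all remaining candidates costs about n*n/2 comparisons, while an ordered
structure costs about n*log(n).

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for an allocno; only what the sketch needs.  */
struct cand {
  int id;
  int priority;   /* assumption: lower value = better spill candidate */
};

/* Naive selection: rescan every remaining candidate for each spill.
   Stores ids in spill order in OUT and returns the number of priority
   comparisons, which is n*(n-1)/2.  */
long spill_order_scan(struct cand *c, size_t n, int *out)
{
  long compares = 0;
  size_t left = n, k = 0;
  assert(n == 0 || (c != NULL && out != NULL));
  while (left > 0) {
    size_t best = 0;
    for (size_t i = 1; i < left; i++) {
      compares++;
      if (c[i].priority < c[best].priority)
        best = i;
    }
    out[k++] = c[best].id;
    c[best] = c[left - 1];   /* remove by overwriting with the last entry */
    left--;
  }
  return compares;
}

/* Restore the min-heap property below index I, counting comparisons.  */
static void sift_down(struct cand *h, size_t n, size_t i, long *compares)
{
  for (;;) {
    size_t l = 2 * i + 1, r = l + 1, m = i;
    if (l < n) { (*compares)++; if (h[l].priority < h[m].priority) m = l; }
    if (r < n) { (*compares)++; if (h[r].priority < h[m].priority) m = r; }
    if (m == i)
      return;
    struct cand t = h[i]; h[i] = h[m]; h[m] = t;
    i = m;
  }
}

/* Ordered selection: build a min-heap once, then pop the minimum for
   each spill.  Same OUT contract as spill_order_scan.  */
long spill_order_heap(struct cand *h, size_t n, int *out)
{
  long compares = 0;
  size_t k = 0;
  assert(n == 0 || (h != NULL && out != NULL));
  for (size_t i = n / 2; i-- > 0; )
    sift_down(h, n, i, &compares);
  while (n > 0) {
    out[k++] = h[0].id;
    h[0] = h[n - 1];
    n--;
    sift_down(h, n, 0, &compares);
  }
  return compares;
}
```

Both functions reorder the array they are given, so pass each one its own copy
when comparing them. With distinct priorities they produce the same spill
order; only the comparison counts differ.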
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=26854
^ permalink raw reply [flat|nested] 144+ messages in thread
* [Bug tree-optimization/26854] Inordinate compile times on large routines
2006-03-24 20:25 [Bug c/26854] New: Inordinate compile times on large routines lucier at math dot purdue dot edu
` (65 preceding siblings ...)
2008-05-19 2:00 ` vmakarov at redhat dot com
@ 2008-05-19 2:04 ` vmakarov at gcc dot gnu dot org
2008-05-19 2:09 ` vmakarov at redhat dot com
` (53 subsequent siblings)
120 siblings, 0 replies; 144+ messages in thread
From: vmakarov at gcc dot gnu dot org @ 2008-05-19 2:04 UTC (permalink / raw)
To: gcc-bugs
------- Comment #67 from vmakarov at gcc dot gnu dot org 2008-05-19 02:03 -------
Subject: Bug 26854
Author: vmakarov
Date: Mon May 19 02:02:52 2008
New Revision: 135523
URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=135523
Log:
2008-05-18 Vladimir Makarov <vmakarov@redhat.com>
PR tree-optimization/26854
* timevar.def (TV_RELOAD): New timer.
* ira.c (ira): Use TV_IRA and TV_RELOAD.
(pass_ira): Remove TV_IRA.
* Makefile.in (ira-color.o): Add SPLAY_TREE_H.
* ira-conflicts.c (DEF_VEC_P, DEF_ALLOCC_P): Move to ira-int.h.
* ira-int.h (DEF_VEC_P, DEF_ALLOCC_P): Move from ira-conflicts.c and
ira-color.c.
(struct allocno): New bitfield splay_removed_p.
(ALLOCNO_MAY_BE_SPILLED_P): New macro.
* ira-color.c (splay-tree.h): Add the header.
(allocno_spill_priority_compare, splay_tree_allocate,
splay_tree_free): New functions.
(DEF_VEC_P, DEF_ALLOCC_P): Move to ira-int.h.
(sorted_allocnos_for_spilling): Rename to allocnos_for_spilling.
(splay_tree_node_pool, removed_splay_allocno_vec,
uncolorable_allocnos_num, uncolorable_allocnos_splay_tree): New
global variables.
(add_allocno_to_bucket, add_allocno_to_ordered_bucket,
delete_allocno_from_bucket): Update uncolorable_allocnos_num.
(USE_SPLAY_P): New macro.
(push_allocno_to_stack): Remove allocno from the splay tree.
(push_allocnos_to_stack): Use the splay trees.
(do_coloring): Create and finish splay_tree_node_pool.
Move allocation/deallocation of allocnos_for_spilling to here...
(initiate_ira_assign, finish_ira_assign): Move
allocnos_for_spilling from here...
(ira_color): Allocate/deallocate removed_splay_allocno_vec.
* ira-build.c (DEF_VEC_P, DEF_ALLOCC_P): Move to ira-int.h.
(create_allocno): Initiate ALLOCNO_SPLAY_REMOVED_P.
Modified:
branches/ira/gcc/ChangeLog
branches/ira/gcc/Makefile.in
branches/ira/gcc/ira-build.c
branches/ira/gcc/ira-color.c
branches/ira/gcc/ira-conflicts.c
branches/ira/gcc/ira-int.h
branches/ira/gcc/ira.c
branches/ira/gcc/timevar.def
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=26854
^ permalink raw reply [flat|nested] 144+ messages in thread
* [Bug tree-optimization/26854] Inordinate compile times on large routines
2006-03-24 20:25 [Bug c/26854] New: Inordinate compile times on large routines lucier at math dot purdue dot edu
` (66 preceding siblings ...)
2008-05-19 2:04 ` vmakarov at gcc dot gnu dot org
@ 2008-05-19 2:09 ` vmakarov at redhat dot com
2008-05-19 17:55 ` lucier at math dot purdue dot edu
` (52 subsequent siblings)
120 siblings, 0 replies; 144+ messages in thread
From: vmakarov at redhat dot com @ 2008-05-19 2:09 UTC (permalink / raw)
To: gcc-bugs
------- Comment #68 from vmakarov at redhat dot com 2008-05-19 02:08 -------
The patch solving the IRA problem is described in
http://gcc.gnu.org/ml/gcc-patches/2008-05/msg01093.html
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=26854
^ permalink raw reply [flat|nested] 144+ messages in thread
* [Bug tree-optimization/26854] Inordinate compile times on large routines
2006-03-24 20:25 [Bug c/26854] New: Inordinate compile times on large routines lucier at math dot purdue dot edu
` (67 preceding siblings ...)
2008-05-19 2:09 ` vmakarov at redhat dot com
@ 2008-05-19 17:55 ` lucier at math dot purdue dot edu
2008-07-10 17:37 ` lucier at math dot purdue dot edu
` (51 subsequent siblings)
120 siblings, 0 replies; 144+ messages in thread
From: lucier at math dot purdue dot edu @ 2008-05-19 17:55 UTC (permalink / raw)
To: gcc-bugs
------- Comment #69 from lucier at math dot purdue dot edu 2008-05-19 17:54 -------
That really smashed the problem. I find the following timings without IRA:
 local alloc   :   8.53 ( 4%) usr   0.01 ( 0%) sys   8.59 ( 3%) wall    7073 kB ( 1%) ggc
 global alloc  :  30.44 (14%) usr   0.33 ( 2%) sys  30.83 (12%) wall    4961 kB ( 1%) ggc
 TOTAL         : 211.48            17.00           261.74             766052 kB
and with IRA:
 integrated RA :  10.58 ( 5%) usr   0.37 ( 2%) sys  11.05 ( 5%) wall    7138 kB ( 1%) ggc
 reload        :  11.89 ( 6%) usr   0.01 ( 0%) sys  11.96 ( 5%) wall    4925 kB ( 1%) ggc
 TOTAL         : 200.18            16.10           221.53             763565 kB
Thanks!
Brad
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=26854
^ permalink raw reply [flat|nested] 144+ messages in thread
* [Bug tree-optimization/26854] Inordinate compile times on large routines
2006-03-24 20:25 [Bug c/26854] New: Inordinate compile times on large routines lucier at math dot purdue dot edu
` (68 preceding siblings ...)
2008-05-19 17:55 ` lucier at math dot purdue dot edu
@ 2008-07-10 17:37 ` lucier at math dot purdue dot edu
2008-07-10 17:45 ` lucier at math dot purdue dot edu
` (50 subsequent siblings)
120 siblings, 0 replies; 144+ messages in thread
From: lucier at math dot purdue dot edu @ 2008-07-10 17:37 UTC (permalink / raw)
To: gcc-bugs
------- Comment #70 from lucier at math dot purdue dot edu 2008-07-10 17:36 -------
Created an attachment (id=15893)
--> (http://gcc.gnu.org/bugzilla/attachment.cgi?id=15893&action=view)
detailed memory stats for trunk revision 137644
These are the detailed memory stats for
euler-11% /pkgs/gcc-mainline/bin/gcc -v
Using built-in specs.
Target: x86_64-unknown-linux-gnu
Configured with: ../../mainline/configure --enable-checking=release
--with-gmp=/pkgs/gmp-4.2.2/ --with-mpfr=/pkgs/gmp-4.2.2/
--prefix=/pkgs/gcc-mainline --enable-languages=c
--enable-gather-detailed-mem-stats
Thread model: posix
gcc version 4.4.0 20080708 (experimental) [trunk revision 137644] (GCC)
applied to this problem, with command line
/pkgs/gcc-mainline/bin/gcc -Wall -W -Wno-unused -O1 -fno-math-errno
-fschedule-insns2 -fno-trapping-math -fno-strict-aliasing -fwrapv
-fomit-frame-pointer -fPIC -ftime-report -fmem-report -c all.i >&
mainline-stats-O3
The run time isn't so bad, but the memory usage still peaks at 7.3 gigs.
Now that distributions have started shipping 4.2.whatever (Ubuntu 8.04 ships
4.2.3), this problem is showing up more and more as a regression against
previous releases of gcc.
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=26854
^ permalink raw reply [flat|nested] 144+ messages in thread
* [Bug tree-optimization/26854] Inordinate compile times on large routines
2006-03-24 20:25 [Bug c/26854] New: Inordinate compile times on large routines lucier at math dot purdue dot edu
` (69 preceding siblings ...)
2008-07-10 17:37 ` lucier at math dot purdue dot edu
@ 2008-07-10 17:45 ` lucier at math dot purdue dot edu
2008-07-10 19:38 ` rguenth at gcc dot gnu dot org
` (49 subsequent siblings)
120 siblings, 0 replies; 144+ messages in thread
From: lucier at math dot purdue dot edu @ 2008-07-10 17:45 UTC (permalink / raw)
To: gcc-bugs
------- Comment #71 from lucier at math dot purdue dot edu 2008-07-10 17:44 -------
Here are additional informal comparisons of 4.2.3 with Apple's 4.0.1 and gcc
3.4.5 on mingw:
https://webmail.iro.umontreal.ca/pipermail/gambit-list/2008-July/002450.html
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=26854
^ permalink raw reply [flat|nested] 144+ messages in thread
* [Bug tree-optimization/26854] Inordinate compile times on large routines
2006-03-24 20:25 [Bug c/26854] New: Inordinate compile times on large routines lucier at math dot purdue dot edu
` (70 preceding siblings ...)
2008-07-10 17:45 ` lucier at math dot purdue dot edu
@ 2008-07-10 19:38 ` rguenth at gcc dot gnu dot org
2008-07-10 19:40 ` zadeck at naturalbridge dot com
` (48 subsequent siblings)
120 siblings, 0 replies; 144+ messages in thread
From: rguenth at gcc dot gnu dot org @ 2008-07-10 19:38 UTC (permalink / raw)
To: gcc-bugs
------- Comment #72 from rguenth at gcc dot gnu dot org 2008-07-10 19:37 -------
The memory counters for DF even overflow ;)
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=26854
^ permalink raw reply [flat|nested] 144+ messages in thread
* [Bug tree-optimization/26854] Inordinate compile times on large routines
2006-03-24 20:25 [Bug c/26854] New: Inordinate compile times on large routines lucier at math dot purdue dot edu
` (71 preceding siblings ...)
2008-07-10 19:38 ` rguenth at gcc dot gnu dot org
@ 2008-07-10 19:40 ` zadeck at naturalbridge dot com
2008-09-10 13:40 ` lucier at math dot purdue dot edu
` (47 subsequent siblings)
120 siblings, 0 replies; 144+ messages in thread
From: zadeck at naturalbridge dot com @ 2008-07-10 19:40 UTC (permalink / raw)
To: gcc-bugs
------- Comment #73 from zadeck at naturalbridge dot com 2008-07-10 19:40 -------
Subject: Re: Inordinate compile times on large
routines
rguenth at gcc dot gnu dot org wrote:
> ------- Comment #72 from rguenth at gcc dot gnu dot org 2008-07-10 19:37 -------
> The memory counters for DF even overflow ;)
>
>
>
we have our best people working on it. this is what fuds are supposed
to fix.
kenny
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=26854
^ permalink raw reply [flat|nested] 144+ messages in thread
* [Bug tree-optimization/26854] Inordinate compile times on large routines
2006-03-24 20:25 [Bug c/26854] New: Inordinate compile times on large routines lucier at math dot purdue dot edu
` (72 preceding siblings ...)
2008-07-10 19:40 ` zadeck at naturalbridge dot com
@ 2008-09-10 13:40 ` lucier at math dot purdue dot edu
2008-09-17 19:39 ` [Bug tree-optimization/26854] [4.3/4.4 Regression] " rguenth at gcc dot gnu dot org
` (46 subsequent siblings)
120 siblings, 0 replies; 144+ messages in thread
From: lucier at math dot purdue dot edu @ 2008-09-10 13:40 UTC (permalink / raw)
To: gcc-bugs
------- Comment #74 from lucier at math dot purdue dot edu 2008-09-10 13:39 -------
This need for more memory is a regression from earlier versions of gcc.
Can this bug be marked with
[4.3/4.4 Regression]
in the subject line?
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=26854
^ permalink raw reply [flat|nested] 144+ messages in thread
* [Bug tree-optimization/26854] [4.3/4.4 Regression] Inordinate compile times on large routines
2006-03-24 20:25 [Bug c/26854] New: Inordinate compile times on large routines lucier at math dot purdue dot edu
` (73 preceding siblings ...)
2008-09-10 13:40 ` lucier at math dot purdue dot edu
@ 2008-09-17 19:39 ` rguenth at gcc dot gnu dot org
2008-09-18 1:20 ` lucier at math dot purdue dot edu
` (45 subsequent siblings)
120 siblings, 0 replies; 144+ messages in thread
From: rguenth at gcc dot gnu dot org @ 2008-09-17 19:39 UTC (permalink / raw)
To: gcc-bugs
--
rguenth at gcc dot gnu dot org changed:
What |Removed |Added
----------------------------------------------------------------------------
Priority|P3 |P2
Target Milestone|--- |4.3.3
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=26854
^ permalink raw reply [flat|nested] 144+ messages in thread
* [Bug tree-optimization/26854] [4.3/4.4 Regression] Inordinate compile times on large routines
2006-03-24 20:25 [Bug c/26854] New: Inordinate compile times on large routines lucier at math dot purdue dot edu
` (74 preceding siblings ...)
2008-09-17 19:39 ` [Bug tree-optimization/26854] [4.3/4.4 Regression] " rguenth at gcc dot gnu dot org
@ 2008-09-18 1:20 ` lucier at math dot purdue dot edu
2008-09-26 15:45 ` lucier at math dot purdue dot edu
` (44 subsequent siblings)
120 siblings, 0 replies; 144+ messages in thread
From: lucier at math dot purdue dot edu @ 2008-09-18 1:20 UTC (permalink / raw)
To: gcc-bugs
------- Comment #75 from lucier at math dot purdue dot edu 2008-09-18 01:19 -------
Created an attachment (id=16350)
--> (http://gcc.gnu.org/bugzilla/attachment.cgi?id=16350&action=view)
statistics with checking enabled and using longs to count bytes
Using the patch from
http://gcc.gnu.org/ml/gcc-patches/2008-09/msg01270.html
I gathered statistics using 64-bit longs for this test case. Using it, one
finds that 10,292,897,120 bytes of bitmaps and 6,449,831,120 bytes in
alloc-pools are allocated with mainline for this test case (at least when
checking is enabled).
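
As a rough illustration of why the counter width matters for totals this
large, here is a small C sketch. It is not the statistics code from bitmap.c
or alloc-pool.c: the function names are hypothetical, and the use of an
unsigned 32-bit counter is an assumption made so that the wraparound is well
defined in C. The 10,292,897,120-byte bitmap total is the figure from this
comment.

```c
#include <stddef.h>
#include <stdint.h>

/* Sum allocation sizes into a 32-bit counter.  Unsigned arithmetic
   wraps modulo 2^32, so totals above 4 GiB are silently truncated.  */
uint32_t total_bytes_32(const uint32_t *sizes, size_t n)
{
  uint32_t total = 0;
  for (size_t i = 0; i < n; i++)
    total += sizes[i];
  return total;
}

/* Same sum in a 64-bit counter, which holds the full total.  */
uint64_t total_bytes_64(const uint32_t *sizes, size_t n)
{
  uint64_t total = 0;
  for (size_t i = 0; i < n; i++)
    total += sizes[i];
  return total;
}
```

Feeding both functions five allocations of 2,058,579,424 bytes each (which
add up to 10,292,897,120) gives the full total from the 64-bit version and
1,702,962,528 from the 32-bit one, since 10,292,897,120 mod 2^32 is
1,702,962,528. A wrapped figure like that looks plausible but is meaningless.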
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=26854
^ permalink raw reply [flat|nested] 144+ messages in thread
* [Bug tree-optimization/26854] [4.3/4.4 Regression] Inordinate compile times on large routines
2006-03-24 20:25 [Bug c/26854] New: Inordinate compile times on large routines lucier at math dot purdue dot edu
` (75 preceding siblings ...)
2008-09-18 1:20 ` lucier at math dot purdue dot edu
@ 2008-09-26 15:45 ` lucier at math dot purdue dot edu
2008-09-26 15:45 ` lucier at math dot purdue dot edu
` (43 subsequent siblings)
120 siblings, 0 replies; 144+ messages in thread
From: lucier at math dot purdue dot edu @ 2008-09-26 15:45 UTC (permalink / raw)
To: gcc-bugs
------- Comment #77 from lucier at math dot purdue dot edu 2008-09-26 15:44 -------
Created an attachment (id=16412)
--> (http://gcc.gnu.org/bugzilla/attachment.cgi?id=16412&action=view)
memory and cpu statistics for 9/25
Here is a timing report from today.
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=26854
^ permalink raw reply [flat|nested] 144+ messages in thread
* [Bug tree-optimization/26854] [4.3/4.4 Regression] Inordinate compile times on large routines
2006-03-24 20:25 [Bug c/26854] New: Inordinate compile times on large routines lucier at math dot purdue dot edu
` (76 preceding siblings ...)
2008-09-26 15:45 ` lucier at math dot purdue dot edu
@ 2008-09-26 15:45 ` lucier at math dot purdue dot edu
2008-09-26 15:47 ` lucier at math dot purdue dot edu
` (42 subsequent siblings)
120 siblings, 0 replies; 144+ messages in thread
From: lucier at math dot purdue dot edu @ 2008-09-26 15:45 UTC (permalink / raw)
To: gcc-bugs
------- Comment #76 from lucier at math dot purdue dot edu 2008-09-26 15:43 -------
Created an attachment (id=16411)
--> (http://gcc.gnu.org/bugzilla/attachment.cgi?id=16411&action=view)
memory and cpu time statistics for 2008-09-19
There has been a 13% compile-time regression on this PR since September 19.
Looking at the statistics, it appears that there is a general increase in cpu
time in things that deal with df-chains.
These are the timings from 9/18; I'll include last night's timings next.
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=26854
^ permalink raw reply [flat|nested] 144+ messages in thread
* [Bug tree-optimization/26854] [4.3/4.4 Regression] Inordinate compile times on large routines
2006-03-24 20:25 [Bug c/26854] New: Inordinate compile times on large routines lucier at math dot purdue dot edu
` (77 preceding siblings ...)
2008-09-26 15:45 ` lucier at math dot purdue dot edu
@ 2008-09-26 15:47 ` lucier at math dot purdue dot edu
2009-01-24 10:28 ` rguenth at gcc dot gnu dot org
` (41 subsequent siblings)
120 siblings, 0 replies; 144+ messages in thread
From: lucier at math dot purdue dot edu @ 2008-09-26 15:47 UTC (permalink / raw)
To: gcc-bugs
------- Comment #78 from lucier at math dot purdue dot edu 2008-09-26 15:45 -------
Created an attachment (id=16413)
--> (http://gcc.gnu.org/bugzilla/attachment.cgi?id=16413&action=view)
Memory and cpu statistics from 9/16
Sorry, I included the wrong file; this should be the correct one from 9/16.
--
lucier at math dot purdue dot edu changed:
What |Removed |Added
----------------------------------------------------------------------------
Attachment #16411|0 |1
is obsolete| |
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=26854
^ permalink raw reply [flat|nested] 144+ messages in thread
* [Bug tree-optimization/26854] [4.3/4.4 Regression] Inordinate compile times on large routines
2006-03-24 20:25 [Bug c/26854] New: Inordinate compile times on large routines lucier at math dot purdue dot edu
` (78 preceding siblings ...)
2008-09-26 15:47 ` lucier at math dot purdue dot edu
@ 2009-01-24 10:28 ` rguenth at gcc dot gnu dot org
2009-02-04 12:45 ` bonzini at gnu dot org
` (40 subsequent siblings)
120 siblings, 0 replies; 144+ messages in thread
From: rguenth at gcc dot gnu dot org @ 2009-01-24 10:28 UTC (permalink / raw)
To: gcc-bugs
------- Comment #79 from rguenth at gcc dot gnu dot org 2009-01-24 10:19 -------
GCC 4.3.3 is being released, adjusting target milestone.
--
rguenth at gcc dot gnu dot org changed:
What |Removed |Added
----------------------------------------------------------------------------
Target Milestone|4.3.3 |4.3.4
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=26854
^ permalink raw reply [flat|nested] 144+ messages in thread
* [Bug tree-optimization/26854] [4.3/4.4 Regression] Inordinate compile times on large routines
2006-03-24 20:25 [Bug c/26854] New: Inordinate compile times on large routines lucier at math dot purdue dot edu
` (79 preceding siblings ...)
2009-01-24 10:28 ` rguenth at gcc dot gnu dot org
@ 2009-02-04 12:45 ` bonzini at gnu dot org
2009-02-04 17:27 ` lucier at math dot purdue dot edu
` (39 subsequent siblings)
120 siblings, 0 replies; 144+ messages in thread
From: bonzini at gnu dot org @ 2009-02-04 12:45 UTC (permalink / raw)
To: gcc-bugs
------- Comment #80 from bonzini at gnu dot org 2009-02-04 12:45 -------
Brad, can you produce new stats?
--
bonzini at gnu dot org changed:
What |Removed |Added
----------------------------------------------------------------------------
CC| |bonzini at gnu dot org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=26854
^ permalink raw reply [flat|nested] 144+ messages in thread
* [Bug tree-optimization/26854] [4.3/4.4 Regression] Inordinate compile times on large routines
2006-03-24 20:25 [Bug c/26854] New: Inordinate compile times on large routines lucier at math dot purdue dot edu
` (80 preceding siblings ...)
2009-02-04 12:45 ` bonzini at gnu dot org
@ 2009-02-04 17:27 ` lucier at math dot purdue dot edu
2009-02-04 17:28 ` lucier at math dot purdue dot edu
` (38 subsequent siblings)
120 siblings, 0 replies; 144+ messages in thread
From: lucier at math dot purdue dot edu @ 2009-02-04 17:27 UTC (permalink / raw)
To: gcc-bugs
------- Comment #81 from lucier at math dot purdue dot edu 2009-02-04 17:27 -------
Created an attachment (id=17243)
--> (http://gcc.gnu.org/bugzilla/attachment.cgi?id=17243&action=view)
Memory and CPU statistics for 2009/02/04
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=26854
^ permalink raw reply [flat|nested] 144+ messages in thread
* [Bug tree-optimization/26854] [4.3/4.4 Regression] Inordinate compile times on large routines
2006-03-24 20:25 [Bug c/26854] New: Inordinate compile times on large routines lucier at math dot purdue dot edu
` (81 preceding siblings ...)
2009-02-04 17:27 ` lucier at math dot purdue dot edu
@ 2009-02-04 17:28 ` lucier at math dot purdue dot edu
2009-02-04 18:24 ` dberlin at dberlin dot org
` (37 subsequent siblings)
120 siblings, 0 replies; 144+ messages in thread
From: lucier at math dot purdue dot edu @ 2009-02-04 17:28 UTC (permalink / raw)
To: gcc-bugs
------- Comment #82 from lucier at math dot purdue dot edu 2009-02-04 17:28 -------
I still have the bitmap.c patch from
http://gcc.gnu.org/ml/gcc-patches/2008-09/msg01270.html
in my tree so I don't get meaningless statistics for bitmaps. (Kenny installed
in the trunk something like the patch above for alloc-pool.c.)
There are more bitmaps allocated than on 2008-09-26 (13GB instead of 12GB).
3GB was allocated in alloc-pool.
Execution time was worse, 228.17 user seconds versus 168 seconds.
I didn't watch top to estimate the maximum memory usage.
This is with
euler-8% /pkgs/gcc-mainline/bin/gcc -v
Using built-in specs.
Target: x86_64-unknown-linux-gnu
Configured with: ../../mainline/configure --enable-checking=release
--prefix=/pkgs/gcc-mainline --enable-languages=c
--enable-gather-detailed-mem-stats
Thread model: posix
gcc version 4.4.0 20090204 (experimental) [trunk revision 143922] (GCC)
Brad
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=26854
^ permalink raw reply [flat|nested] 144+ messages in thread
* [Bug tree-optimization/26854] [4.3/4.4 Regression] Inordinate compile times on large routines
2006-03-24 20:25 [Bug c/26854] New: Inordinate compile times on large routines lucier at math dot purdue dot edu
` (82 preceding siblings ...)
2009-02-04 17:28 ` lucier at math dot purdue dot edu
@ 2009-02-04 18:24 ` dberlin at dberlin dot org
2009-02-11 13:22 ` rguenth at gcc dot gnu dot org
` (36 subsequent siblings)
120 siblings, 0 replies; 144+ messages in thread
From: dberlin at dberlin dot org @ 2009-02-04 18:24 UTC (permalink / raw)
To: gcc-bugs
------- Comment #83 from dberlin at gcc dot gnu dot org 2009-02-04 18:24 -------
Subject: Re: [4.3/4.4 Regression] Inordinate
compile times on large routines
These numbers claim a leak of the graph->preds bitmap (and related
bitmaps) which are quite clearly freed all the time.
These bitmaps are allocated onto the predbitmap obstack, which is
released through remove_preds_and_fake_succs.
It always executes, so I have trouble understanding why it considers
this a leak.
On Wed, Feb 4, 2009 at 12:28 PM, lucier at math dot purdue dot edu
<gcc-bugzilla@gcc.gnu.org> wrote:
>
>
> ------- Comment #82 from lucier at math dot purdue dot edu 2009-02-04 17:28 -------
> I still have the bitmap.c patch from
>
> http://gcc.gnu.org/ml/gcc-patches/2008-09/msg01270.html
>
> in my tree so I don't get meaningless statistics for bitmaps. (Kenny installed
> in the trunk something like the patch above for alloc-pool.c.)
>
> There are more bitmaps allocated than on 2008-09-26 (13GB instead of 12GB).
>
> 3GB was allocated in alloc-pool.
>
> Execution time was worse, 228.17 user seconds versus 168 seconds.
>
> I didn't watch top to estimate the maximum memory usage.
>
> This is with
>
> euler-8% /pkgs/gcc-mainline/bin/gcc -v
> Using built-in specs.
> Target: x86_64-unknown-linux-gnu
> Configured with: ../../mainline/configure --enable-checking=release
> --prefix=/pkgs/gcc-mainline --enable-languages=c
> --enable-gather-detailed-mem-stats
> Thread model: posix
> gcc version 4.4.0 20090204 (experimental) [trunk revision 143922] (GCC)
>
> Brad
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=26854
^ permalink raw reply [flat|nested] 144+ messages in thread
* [Bug tree-optimization/26854] [4.3/4.4 Regression] Inordinate compile times on large routines
2006-03-24 20:25 [Bug c/26854] New: Inordinate compile times on large routines lucier at math dot purdue dot edu
` (83 preceding siblings ...)
2009-02-04 18:24 ` dberlin at dberlin dot org
@ 2009-02-11 13:22 ` rguenth at gcc dot gnu dot org
2009-02-13 11:08 ` rguenth at gcc dot gnu dot org
` (35 subsequent siblings)
120 siblings, 0 replies; 144+ messages in thread
From: rguenth at gcc dot gnu dot org @ 2009-02-11 13:22 UTC (permalink / raw)
To: gcc-bugs
------- Comment #84 from rguenth at gcc dot gnu dot org 2009-02-11 13:22 -------
Btw, for further analysis it would be nice to have a "smaller" testcase.
Smaller meaning an order of magnitude fewer states in ___H__20_all_2e_o1()
(an order of magnitude fewer label addresses in ___hlbl_tbl).
The source looks sort-of autogenerated, so is it possible to produce such
a smaller testcase? Thanks!
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=26854
^ permalink raw reply [flat|nested] 144+ messages in thread
* [Bug tree-optimization/26854] [4.3/4.4 Regression] Inordinate compile times on large routines
2006-03-24 20:25 [Bug c/26854] New: Inordinate compile times on large routines lucier at math dot purdue dot edu
` (84 preceding siblings ...)
2009-02-11 13:22 ` rguenth at gcc dot gnu dot org
@ 2009-02-13 11:08 ` rguenth at gcc dot gnu dot org
2009-02-13 15:40 ` lucier at math dot purdue dot edu
` (34 subsequent siblings)
120 siblings, 0 replies; 144+ messages in thread
From: rguenth at gcc dot gnu dot org @ 2009-02-13 11:08 UTC (permalink / raw)
To: gcc-bugs
------- Comment #85 from rguenth at gcc dot gnu dot org 2009-02-13 11:08 -------
*** Bug 39157 has been marked as a duplicate of this bug. ***
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=26854
^ permalink raw reply [flat|nested] 144+ messages in thread
* [Bug tree-optimization/26854] [4.3/4.4 Regression] Inordinate compile times on large routines
2006-03-24 20:25 [Bug c/26854] New: Inordinate compile times on large routines lucier at math dot purdue dot edu
` (85 preceding siblings ...)
2009-02-13 11:08 ` rguenth at gcc dot gnu dot org
@ 2009-02-13 15:40 ` lucier at math dot purdue dot edu
2009-02-13 16:55 ` bonzini at gnu dot org
` (33 subsequent siblings)
120 siblings, 0 replies; 144+ messages in thread
From: lucier at math dot purdue dot edu @ 2009-02-13 15:40 UTC (permalink / raw)
To: gcc-bugs
------- Comment #86 from lucier at math dot purdue dot edu 2009-02-13 15:40 -------
Subject: Re: [4.3/4.4 Regression] Inordinate
compile times on large routines
It's unfortunate that the discussion from 39157 will be somewhat hard to
find now that that bug is closed.
Steven wrote in a comment for 39157:
It's not like there will not be any loop invariant code motion
(LICM) at all anymore if the RTL LICM pass is disabled. There
is an LICM pass on GIMPLE, and there is also PRE for GIMPLE (and
lazy code motion for RTL but I think it disables itself for your
test case).
The RTL LICM pass mostly cleans up after expand, i.e. moves
things that are not exposed in GIMPLE. This is mostly just
address calculations.
The loop in _num.i that I mentioned in
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=39157#c19
is the loop in PR 33928 that is no longer fully optimized after Paolo
(and you, I guess, your name is on the patch) added PRE and disabled
some optimizations in CSE, and what is no longer optimized in that loop
are address calculations. I don't know whether those address
calculations fall under LICM; the only point I'm trying to make right
now is that address calculations are no longer optimized as much as they
were before
http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=118475
and address calculations are an important class of calculations to
optimize.
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=26854
^ permalink raw reply [flat|nested] 144+ messages in thread
* [Bug tree-optimization/26854] [4.3/4.4 Regression] Inordinate compile times on large routines
2006-03-24 20:25 [Bug c/26854] New: Inordinate compile times on large routines lucier at math dot purdue dot edu
` (86 preceding siblings ...)
2009-02-13 15:40 ` lucier at math dot purdue dot edu
@ 2009-02-13 16:55 ` bonzini at gnu dot org
2009-02-13 17:06 ` jakub at gcc dot gnu dot org
` (32 subsequent siblings)
120 siblings, 0 replies; 144+ messages in thread
From: bonzini at gnu dot org @ 2009-02-13 16:55 UTC (permalink / raw)
To: gcc-bugs
------- Comment #87 from bonzini at gnu dot org 2009-02-13 16:54 -------
Subject: Re: [4.3/4.4 Regression] Inordinate
compile times on large routines
> It's unfortunate that the discussion from 39157 will be somewhat hard to
> find now that that bug is closed.
Well, the patch there is not lost, I suppose Jakub will finish it and
post it.
The problem is that -O1 was never meant to give "very fast" code. You
are using it only because our throttling of expensive passes is
insufficient. Fixing that has two sides, as done in PR39157's
discussion: 1) disabling more passes at -O1, 2) establishing some
parameters to throttle down passes at -O2.
Ultimately, the goal should be that you can use -O2.
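
A minimal sketch of what "parameters to throttle down passes" could look
like, in C. Everything here is hypothetical: struct fn_size, struct
pass_limits, and pass_allowed are illustration only and do not correspond to
GCC's pass manager or to any existing --param. The idea is just a size gate:
an expensive pass runs only when the function is below tunable limits.

```c
#include <stdbool.h>

/* Hypothetical per-function size summary; not a GCC data structure.  */
struct fn_size {
  long n_blocks;   /* basic blocks in the function */
  long n_insns;    /* instructions in the function */
};

/* Hypothetical tunable limits for one expensive pass.  */
struct pass_limits {
  long max_blocks;
  long max_insns;
};

/* Return true if the pass should run on this function, false if the
   function is large enough that the pass should be skipped.  */
bool pass_allowed(const struct fn_size *f, const struct pass_limits *lim)
{
  return f->n_blocks <= lim->max_blocks && f->n_insns <= lim->max_insns;
}
```

With limits like these, ordinary functions are optimized as usual, and only
outliers such as the huge routine in this PR lose the expensive pass.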
Paolo
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=26854
^ permalink raw reply [flat|nested] 144+ messages in thread
* [Bug tree-optimization/26854] [4.3/4.4 Regression] Inordinate compile times on large routines
2006-03-24 20:25 [Bug c/26854] New: Inordinate compile times on large routines lucier at math dot purdue dot edu
` (87 preceding siblings ...)
2009-02-13 16:55 ` bonzini at gnu dot org
@ 2009-02-13 17:06 ` jakub at gcc dot gnu dot org
2009-02-13 17:30 ` lucier at math dot purdue dot edu
` (31 subsequent siblings)
120 siblings, 0 replies; 144+ messages in thread
From: jakub at gcc dot gnu dot org @ 2009-02-13 17:06 UTC (permalink / raw)
To: gcc-bugs
------- Comment #88 from jakub at gcc dot gnu dot org 2009-02-13 17:06 -------
The patch in PR39157 is IMHO finished and has been bootstrapped/regtested on
x86_64-linux and i686-linux. I haven't posted it because it looked like Richard, Zdenek
and Steven prefer some other solution for it. If this isn't solved for 4.4
soon, I'm going to post that patch.
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=26854
^ permalink raw reply [flat|nested] 144+ messages in thread
* [Bug tree-optimization/26854] [4.3/4.4 Regression] Inordinate compile times on large routines
2006-03-24 20:25 [Bug c/26854] New: Inordinate compile times on large routines lucier at math dot purdue dot edu
` (88 preceding siblings ...)
2009-02-13 17:06 ` jakub at gcc dot gnu dot org
@ 2009-02-13 17:30 ` lucier at math dot purdue dot edu
2009-02-13 17:37 ` lucier at math dot purdue dot edu
` (30 subsequent siblings)
120 siblings, 0 replies; 144+ messages in thread
From: lucier at math dot purdue dot edu @ 2009-02-13 17:30 UTC (permalink / raw)
To: gcc-bugs
------- Comment #89 from lucier at math dot purdue dot edu 2009-02-13 17:30 -------
Subject: Re: [4.3/4.4 Regression] Inordinate
compile times on large routines
On Fri, 2009-02-13 at 17:06 +0000, jakub at gcc dot gnu dot org wrote:
>
>
> ------- Comment #88 from jakub at gcc dot gnu dot org 2009-02-13 17:06 -------
> The patch in PR39157 is IMHO finished and has been bootstrapped/regtested on
> x86_64-linux and i686-linux. I haven't posted it because it looked like Richard, Zdenek
> and Steven prefer some other solution for it. If this isn't solved for 4.4
> soon, I'm going to post that patch.
I have to leave town within the hour and I may not be able to look at
this properly until Wednesday or so, but it would be interesting to me
to know how large (how many nodes?) are the 139 loops in _num.i referred
to in
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=39157#c19
This information may suggest how large the default parameters should be
for -O1 and -O2. (For example, if all the non-whole-function loops have
< 2000 instructions, then 5000 might be a reasonable limit for -O1
loops.)
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=26854
^ permalink raw reply [flat|nested] 144+ messages in thread
* [Bug tree-optimization/26854] [4.3/4.4 Regression] Inordinate compile times on large routines
2006-03-24 20:25 [Bug c/26854] New: Inordinate compile times on large routines lucier at math dot purdue dot edu
` (89 preceding siblings ...)
2009-02-13 17:30 ` lucier at math dot purdue dot edu
@ 2009-02-13 17:37 ` lucier at math dot purdue dot edu
2009-02-13 17:44 ` lucier at math dot purdue dot edu
` (29 subsequent siblings)
120 siblings, 0 replies; 144+ messages in thread
From: lucier at math dot purdue dot edu @ 2009-02-13 17:37 UTC (permalink / raw)
To: gcc-bugs
------- Comment #90 from lucier at math dot purdue dot edu 2009-02-13 17:37 -------
Subject: Re: [4.3/4.4 Regression] Inordinate
compile times on large routines
On Fri, 2009-02-13 at 16:54 +0000, bonzini at gnu dot org wrote:
>
>
> ------- Comment #87 from bonzini at gnu dot org 2009-02-13 16:54 -------
> The problem is that -O1 was never meant to give "very fast" code.
I'm not looking for "very fast" code, I'm looking for code that doesn't
get > 30% slower from one SVN revision number to the next.
> You
> are using it only because our throttling of expensive passes is
> insufficient.
I am using -O1 because code of this type compiled with -O2 runs
significantly more slowly than code of this type compiled with -O1. I
have never used -O2 on this type of code.
> Fixing that has two sides, as done in PR39157's
> discussion: 1) disabling more passes at -O1, 2) establishing some
> parameters to throttle down passes at -O2.
I don't see that (1) and (2) form the main strategy to fix "that"; it
seems that understanding the existing optimizations that are being
disabled in preference for new ones is a good start, as is generally
ensuring that -O1 code doesn't get significantly slower while compile
times get significantly higher.
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=26854
^ permalink raw reply [flat|nested] 144+ messages in thread
* [Bug tree-optimization/26854] [4.3/4.4 Regression] Inordinate compile times on large routines
2006-03-24 20:25 [Bug c/26854] New: Inordinate compile times on large routines lucier at math dot purdue dot edu
` (90 preceding siblings ...)
2009-02-13 17:37 ` lucier at math dot purdue dot edu
@ 2009-02-13 17:44 ` lucier at math dot purdue dot edu
2009-02-14 14:42 ` stevenb dot gcc at gmail dot com
` (28 subsequent siblings)
120 siblings, 0 replies; 144+ messages in thread
From: lucier at math dot purdue dot edu @ 2009-02-13 17:44 UTC (permalink / raw)
To: gcc-bugs
------- Comment #91 from lucier at math dot purdue dot edu 2009-02-13 17:43 -------
Subject: Re: [4.3/4.4 Regression] Inordinate
compile times on large routines
On Fri, 2009-02-13 at 17:37 +0000, lucier at math dot purdue dot edu
wrote:
> ------- Comment #90 from lucier at math dot purdue dot edu 2009-02-13 17:37 -------
> Subject: Re: [4.3/4.4 Regression] Inordinate
> compile times on large routines
>
> On Fri, 2009-02-13 at 16:54 +0000, bonzini at gnu dot org wrote:
> >
> >
> > ------- Comment #87 from bonzini at gnu dot org 2009-02-13 16:54 -------
>
> > The problem is that -O1 was never meant to give "very fast" code.
>
> I'm not looking for "very fast" code, I'm looking for code that doesn't
> get > 30% slower from one SVN revision number to the next.
Sorry, this comment refers to PR 33928, not this PR.
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=26854
^ permalink raw reply [flat|nested] 144+ messages in thread
* [Bug tree-optimization/26854] [4.3/4.4 Regression] Inordinate compile times on large routines
2006-03-24 20:25 [Bug c/26854] New: Inordinate compile times on large routines lucier at math dot purdue dot edu
` (91 preceding siblings ...)
2009-02-13 17:44 ` lucier at math dot purdue dot edu
@ 2009-02-14 14:42 ` stevenb dot gcc at gmail dot com
2009-02-14 21:58 ` lucier at math dot purdue dot edu
` (27 subsequent siblings)
120 siblings, 0 replies; 144+ messages in thread
From: stevenb dot gcc at gmail dot com @ 2009-02-14 14:42 UTC (permalink / raw)
To: gcc-bugs
------- Comment #92 from stevenb dot gcc at gmail dot com 2009-02-14 14:42 -------
Subject: Re: [4.3/4.4 Regression] Inordinate
compile times on large routines
Re: Comment #88
I think the patch is perfectly acceptable as a stop-gap solution. I
don't think we have anything better for 4.4. Maybe you can add a
FIXME, though...
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=26854
^ permalink raw reply [flat|nested] 144+ messages in thread
* [Bug tree-optimization/26854] [4.3/4.4 Regression] Inordinate compile times on large routines
2006-03-24 20:25 [Bug c/26854] New: Inordinate compile times on large routines lucier at math dot purdue dot edu
` (92 preceding siblings ...)
2009-02-14 14:42 ` stevenb dot gcc at gmail dot com
@ 2009-02-14 21:58 ` lucier at math dot purdue dot edu
2009-02-14 23:07 ` dberlin at dberlin dot org
` (26 subsequent siblings)
120 siblings, 0 replies; 144+ messages in thread
From: lucier at math dot purdue dot edu @ 2009-02-14 21:58 UTC (permalink / raw)
To: gcc-bugs
------- Comment #93 from lucier at math dot purdue dot edu 2009-02-14 21:58 -------
Subject: Re: [4.3/4.4 Regression] Inordinate compile times on large routines
I instrumented the compiler and looked at how many nodes were in each
loop processed by LICM for the Gambit runtime and compiler.
For generated code, except for the "loop" that contained the entire
function, the greatest number of nodes was 30. (Because computed
gotos are used in the code that checks for heap and stack overflows
after allocations and for waiting interrupts, it's hard to go long in
Scheme code without hitting the "big loop".) For hand-written code,
the greatest number of nodes in a loop was 123.
When bootstrapping gcc with --enable-languages=c, the largest number
of nodes in a loop was 803, and there were 12 loops detected that had
over 500 nodes. 548 loops had 100 nodes or greater. (This is a
bootstrap, so some files were compiled twice with the instrumented
compiler.)
So perhaps an -O1 default for LICM of 100 nodes is reasonable, or
perhaps one might up it to 1000 just to catch everything "reasonable".
Brad
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=26854
^ permalink raw reply [flat|nested] 144+ messages in thread
* [Bug tree-optimization/26854] [4.3/4.4 Regression] Inordinate compile times on large routines
2006-03-24 20:25 [Bug c/26854] New: Inordinate compile times on large routines lucier at math dot purdue dot edu
` (93 preceding siblings ...)
2009-02-14 21:58 ` lucier at math dot purdue dot edu
@ 2009-02-14 23:07 ` dberlin at dberlin dot org
2009-02-15 11:26 ` stevenb dot gcc at gmail dot com
` (25 subsequent siblings)
120 siblings, 0 replies; 144+ messages in thread
From: dberlin at dberlin dot org @ 2009-02-14 23:07 UTC (permalink / raw)
To: gcc-bugs
------- Comment #94 from dberlin at gcc dot gnu dot org 2009-02-14 23:06 -------
Subject: Re: [4.3/4.4 Regression] Inordinate
compile times on large routines
One of the reasons LCM in RTL is so slow is because it uses a crappy
iteration order.
With the right iteration order, it should be fast enough to turn it
back on and remove the address calculations in these testcases.
If it was block based, it would have been converted to use the DF
solver and gotten this automatically, but because it's edge based,
pretty much nobody has touched it since it was created :)
Even adding qsorts in the right place that sort the worklists into the
right order on each iteration would probably help orders of magnitude
here (though moving to the two worklist solver that DF now uses would
be even better).
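
To illustrate the iteration-order point, here is a minimal sketch (not GCC's lcm.c, and block based rather than edge based) of a forward worklist solver in which the worklist is kept sorted by a caller-supplied priority. The function names and the toy dataflow problem (a plain union over predecessors) are assumptions made for the example only.

```python
import heapq


def reverse_postorder(succs, entry):
    """Return the blocks reachable from entry in reverse postorder."""
    seen = {entry}
    postorder = []
    stack = [(entry, iter(succs[entry]))]
    while stack:
        node, it = stack[-1]
        for nxt in it:
            if nxt not in seen:
                seen.add(nxt)
                stack.append((nxt, iter(succs[nxt])))
                break
        else:
            # All successors handled: the block is finished.
            stack.pop()
            postorder.append(node)
    return postorder[::-1]


def solve_forward(succs, gen, priority):
    """Solve out[b] = gen[b] | union(out[p] for each predecessor p).

    Blocks are taken from the worklist in increasing priority[b].
    Returns (out, visits), where visits counts processed blocks.
    """
    preds = {b: [] for b in succs}
    for b, ss in succs.items():
        for s in ss:
            preds[s].append(b)

    out = {b: set() for b in succs}
    heap = [(priority[b], b) for b in succs]
    heapq.heapify(heap)
    queued = set(succs)
    visits = 0
    while heap:
        _, b = heapq.heappop(heap)
        queued.discard(b)
        visits += 1
        new = set(gen[b])
        for p in preds[b]:
            new |= out[p]
        if new != out[b]:
            out[b] = new
            # Only successors can be affected by a change in out[b].
            for s in succs[b]:
                if s not in queued:
                    queued.add(s)
                    heapq.heappush(heap, (priority[s], s))
    return out, visits
```

On a straight-line chain of n blocks where each block generates one fact, using the reverse-postorder index as the priority processes each block once (n visits), while using the negated index reaches the same solution only after n(n+1)/2 visits.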
On Sat, Feb 14, 2009 at 4:58 PM, lucier at math dot purdue dot edu
<gcc-bugzilla@gcc.gnu.org> wrote:
>
>
> ------- Comment #93 from lucier at math dot purdue dot edu 2009-02-14 21:58 -------
> Subject: Re: [4.3/4.4 Regression] Inordinate compile times on large routines
>
> I instrumented the compiler and looked at how many nodes were in each
> loop processed by LICM for the Gambit runtime and compiler.
>
> For generated code, except for the "loop" that contained the entire
> function, the greatest number of nodes was 30. (Because computed
> gotos are used in the code that checks for heap and stack overflows
> after allocations and for waiting interrupts, it's hard to go long in
> Scheme code without hitting the "big loop".) For hand-written code,
> the greatest number of nodes in a loop was 123.
>
> When bootstrapping gcc with --enable-languages=c, the largest number
> of nodes in a loop was 803, and there were 12 loops detected that had
> over 500 nodes. 548 loops had 100 nodes or greater. (This is a
> bootstrap, so some files were compiled twice with the instrumented
> compiler.)
>
> So perhaps an -O1 default for LICM of 100 nodes is reasonable, or
> perhaps one might up it to 1000 just to catch everything "reasonable".
>
> Brad
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=26854
^ permalink raw reply [flat|nested] 144+ messages in thread
* [Bug tree-optimization/26854] [4.3/4.4 Regression] Inordinate compile times on large routines
2006-03-24 20:25 [Bug c/26854] New: Inordinate compile times on large routines lucier at math dot purdue dot edu
` (94 preceding siblings ...)
2009-02-14 23:07 ` dberlin at dberlin dot org
@ 2009-02-15 11:26 ` stevenb dot gcc at gmail dot com
2009-02-16 2:08 ` dberlin at dberlin dot org
` (24 subsequent siblings)
120 siblings, 0 replies; 144+ messages in thread
From: stevenb dot gcc at gmail dot com @ 2009-02-15 11:26 UTC (permalink / raw)
To: gcc-bugs
------- Comment #95 from stevenb dot gcc at gmail dot com 2009-02-15 11:26 -------
Subject: Re: [4.3/4.4 Regression] Inordinate
compile times on large routines
Re: Comment #94
The trouble with LCM in RTL (i.e. GCSE-PRE) is not that it is slow (or
that it is disabled -- istr it is enabled at -O2), and also not that
it is edge based. The problem is that it doesn't handle cascading
expressions, because that just doesn't fit in the LCM framework. You
have to iterate RTL GCSE-PRE to move the same invariants as what RTL
LICM (i.e. loop-invariant.c) can achieve.
(GCSE-PRE is old code from a time when GCC didn't really have a proper
CFG. It is edge based because for block based you need critical edge
splitting, which was prohibitively expensive in the Old Days.
Nowadays, gcse.c+lcm.c works in cfglayout mode and pre-splitting
critical edges would be cheap, so it would be a good idea to
experiment with a block based GCSE-PRE rewrite...)
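
A toy model of the cascading problem described above, under the simplifying assumption that a single pass can only move a statement whose operands are all defined outside the loop at the moment the pass starts. The statement representation and function names are hypothetical; this is not how gcse.c or loop-invariant.c represent anything.

```python
def hoist_one_pass(loop_body):
    """Split loop_body into (hoisted, remaining).

    Each statement is (dest, operands). A statement is hoisted only if
    none of its operands is defined in the loop as it stood when the
    pass began, so dependent invariants are not picked up in the same
    pass.
    """
    defined_in_loop = {dest for dest, _ in loop_body}
    hoisted, remaining = [], []
    for stmt in loop_body:
        if set(stmt[1]) & defined_in_loop:
            remaining.append(stmt)
        else:
            hoisted.append(stmt)
    return hoisted, remaining


def hoist_to_fixpoint(loop_body):
    """Repeat hoist_one_pass until nothing moves.

    Returns (preheader, loop_body, passes), where passes counts the
    passes that actually moved something.
    """
    preheader, passes = [], 0
    while True:
        hoisted, loop_body = hoist_one_pass(loop_body)
        if not hoisted:
            return preheader, loop_body, passes
        preheader.extend(hoisted)
        passes += 1
```

With a chain such as a = x + y, b = a * 2, c = b + a inside the loop, this model needs one pass per level of the chain, which is the sense in which a one-shot pass has to be iterated to match what a dedicated invariant-motion pass moves.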
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=26854
^ permalink raw reply [flat|nested] 144+ messages in thread
* [Bug tree-optimization/26854] [4.3/4.4 Regression] Inordinate compile times on large routines
2006-03-24 20:25 [Bug c/26854] New: Inordinate compile times on large routines lucier at math dot purdue dot edu
` (95 preceding siblings ...)
2009-02-15 11:26 ` stevenb dot gcc at gmail dot com
@ 2009-02-16 2:08 ` dberlin at dberlin dot org
2009-02-20 13:03 ` jakub at gcc dot gnu dot org
` (23 subsequent siblings)
120 siblings, 0 replies; 144+ messages in thread
From: dberlin at dberlin dot org @ 2009-02-16 2:08 UTC (permalink / raw)
To: gcc-bugs
------- Comment #96 from dberlin at gcc dot gnu dot org 2009-02-16 02:07 -------
Subject: Re: [4.3/4.4 Regression] Inordinate
compile times on large routines
Uh, it's most certainly disabled on testcases like his.
Look at is_too_expensive in gcse.c.
This is in fact done because LCM iteration takes too long on
flowgraphs like that, because of its iteration order.
On Sun, Feb 15, 2009 at 6:26 AM, stevenb dot gcc at gmail dot com
<gcc-bugzilla@gcc.gnu.org> wrote:
>
>
> ------- Comment #95 from stevenb dot gcc at gmail dot com 2009-02-15 11:26 -------
> Subject: Re: [4.3/4.4 Regression] Inordinate
> compile times on large routines
>
> Re: Comment #94
> The trouble with LCM in RTL (i.e. GCSE-PRE) is not that it is slow (or
> that it is disabled -- istr it is enabled at -O2), and also not that
> it is edge based. The problem is that it doesn't handle cascading
> expressions, because that just doesn't fit in the LCM framework. You
> have to iterate RTL GCSE-PRE to move the same invariants as what RTL
> LICM (i.e. loop-invariant.c) can achieve.
>
> (GCSE-PRE is old code from a time when GCC didn't really have a proper
> CFG. It is edge based because for block based you need critical edge
> splitting, which was prohibitively expensive in the Old Days.
> Nowadays, gcse.c+lcm.c works in cfglayout mode and pre-splitting
> critical edges would be cheap, so it would be a good idea to
> experiment with a block based GCSE-PRE rewrite...)
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=26854
^ permalink raw reply [flat|nested] 144+ messages in thread
* [Bug tree-optimization/26854] [4.3/4.4 Regression] Inordinate compile times on large routines
2006-03-24 20:25 [Bug c/26854] New: Inordinate compile times on large routines lucier at math dot purdue dot edu
` (96 preceding siblings ...)
2009-02-16 2:08 ` dberlin at dberlin dot org
@ 2009-02-20 13:03 ` jakub at gcc dot gnu dot org
2009-02-20 19:52 ` lucier at math dot purdue dot edu
` (22 subsequent siblings)
120 siblings, 0 replies; 144+ messages in thread
From: jakub at gcc dot gnu dot org @ 2009-02-20 13:03 UTC (permalink / raw)
To: gcc-bugs
------- Comment #97 from jakub at gcc dot gnu dot org 2009-02-20 13:03 -------
http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=144320
now limits RTL LICM to loops with fewer than 10000 bbs at -O{2,3,s} and fewer
than 1000 bbs at -O1.
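
As a rough sketch of the gating rule just described (the table and function names are hypothetical, and only the thresholds come from the comment above):

```python
# Thresholds as stated in the comment; the names are made up for the sketch.
MAX_LOOP_BBS = {"-O1": 1000, "-O2": 10000, "-O3": 10000, "-Os": 10000}


def should_run_rtl_licm(opt_level, loop_bb_count):
    """Return True if a loop is small enough for RTL LICM at opt_level."""
    return loop_bb_count < MAX_LOOP_BBS[opt_level]
```

A whole-function "loop" with tens of thousands of basic blocks is skipped at every level, while ordinary loops stay well under either limit.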
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=26854
^ permalink raw reply [flat|nested] 144+ messages in thread
* [Bug tree-optimization/26854] [4.3/4.4 Regression] Inordinate compile times on large routines
2006-03-24 20:25 [Bug c/26854] New: Inordinate compile times on large routines lucier at math dot purdue dot edu
` (97 preceding siblings ...)
2009-02-20 13:03 ` jakub at gcc dot gnu dot org
@ 2009-02-20 19:52 ` lucier at math dot purdue dot edu
2009-02-20 19:54 ` lucier at math dot purdue dot edu
` (21 subsequent siblings)
120 siblings, 0 replies; 144+ messages in thread
From: lucier at math dot purdue dot edu @ 2009-02-20 19:52 UTC (permalink / raw)
To: gcc-bugs
------- Comment #98 from lucier at math dot purdue dot edu 2009-02-20 19:52 -------
Thank you, that indeed "fixes" the LICM problem.
Based on some comments for this PR and for PR 39157 I thought that a similar
patch might apply to PRE. So with
euler-14% /pkgs/gcc-mainline/bin/gcc -v
Using built-in specs.
Target: x86_64-unknown-linux-gnu
Configured with: ../../mainline/configure --enable-checking=release
--prefix=/pkgs/gcc-mainline --enable-languages=c
--enable-gather-detailed-mem-stats
Thread model: posix
gcc version 4.4.0 20090220 (experimental) [trunk revision 144328] (GCC)
I ran this command
/pkgs/gcc-mainline/bin/gcc -v -c -O2 -fmem-report -ftime-report compiler.i
-save-temps > & ! report-compiler
where compiler.i is found at
http://www.math.purdue.edu/~lucier/bugzilla/8/
and I killed the job after it required 17GB of RAM. This job compiles just
fine with
euler-15% /pkgs/gcc-4.1.2/bin/gcc -v
Using built-in specs.
Target: x86_64-unknown-linux-gnu
Configured with: ../configure --prefix=/pkgs/gcc-4.1.2
Thread model: posix
gcc version 4.1.2
in about 1.5 GB of RAM.
To derive some statistics I ran
/pkgs/gcc-mainline/bin/gcc -v -c -O2 -fmem-report -ftime-report _num.i
-save-temps > & ! report-num
where the smaller file _num.i is also found at
http://www.math.purdue.edu/~lucier/bugzilla/8/
I'll attach report-num to this PR. The highlights are
 PRE           :  23.28 (24%) usr   0.01 ( 0%) sys  23.51 (24%) wall     681 kB ( 0%) ggc
 integrated RA :  12.70 (13%) usr   0.00 ( 0%) sys  12.83 (13%) wall    3709 kB ( 2%) ggc
 TOTAL         :  95.93             2.73            99.72             227422 kB
and that's about it, nothing else above 5%. There are also accurate memory
statistics, as I've added a patch to my local sources so that memory statistics
don't overflow 32-bit counters.
I think the -O1 and -O2 limits for LICM are quite reasonable; would it be
possible to limit PRE similarly so that one could compile compiler.i with -O2
in a reasonable amount of memory?
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=26854
^ permalink raw reply [flat|nested] 144+ messages in thread
* [Bug tree-optimization/26854] [4.3/4.4 Regression] Inordinate compile times on large routines
2006-03-24 20:25 [Bug c/26854] New: Inordinate compile times on large routines lucier at math dot purdue dot edu
` (98 preceding siblings ...)
2009-02-20 19:52 ` lucier at math dot purdue dot edu
@ 2009-02-20 19:54 ` lucier at math dot purdue dot edu
2009-02-20 19:56 ` lucier at math dot purdue dot edu
` (20 subsequent siblings)
120 siblings, 0 replies; 144+ messages in thread
From: lucier at math dot purdue dot edu @ 2009-02-20 19:54 UTC (permalink / raw)
To: gcc-bugs
------- Comment #99 from lucier at math dot purdue dot edu 2009-02-20 19:54 -------
Created an attachment (id=17336)
--> (http://gcc.gnu.org/bugzilla/attachment.cgi?id=17336&action=view)
Memory and CPU statistics when compiling _num.i with -O2
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=26854
^ permalink raw reply [flat|nested] 144+ messages in thread
* [Bug tree-optimization/26854] [4.3/4.4 Regression] Inordinate compile times on large routines
2006-03-24 20:25 [Bug c/26854] New: Inordinate compile times on large routines lucier at math dot purdue dot edu
` (99 preceding siblings ...)
2009-02-20 19:54 ` lucier at math dot purdue dot edu
@ 2009-02-20 19:56 ` lucier at math dot purdue dot edu
2009-02-21 4:14 ` dberlin at dberlin dot org
` (19 subsequent siblings)
120 siblings, 0 replies; 144+ messages in thread
From: lucier at math dot purdue dot edu @ 2009-02-20 19:56 UTC (permalink / raw)
To: gcc-bugs
------- Comment #100 from lucier at math dot purdue dot edu 2009-02-20 19:56 -------
The large memory requirements for LICM at -O1 and -O2 are still a regression for
the 4.2 and 4.3 branches. Jakub's patch is short and elegant; do you think it
would be a good idea to backport it to the other open branches?
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=26854
^ permalink raw reply [flat|nested] 144+ messages in thread
* [Bug tree-optimization/26854] [4.3/4.4 Regression] Inordinate compile times on large routines
2006-03-24 20:25 [Bug c/26854] New: Inordinate compile times on large routines lucier at math dot purdue dot edu
` (100 preceding siblings ...)
2009-02-20 19:56 ` lucier at math dot purdue dot edu
@ 2009-02-21 4:14 ` dberlin at dberlin dot org
2009-02-21 18:31 ` lucier at math dot purdue dot edu
` (18 subsequent siblings)
120 siblings, 0 replies; 144+ messages in thread
From: dberlin at dberlin dot org @ 2009-02-21 4:14 UTC (permalink / raw)
To: gcc-bugs
------- Comment #101 from dberlin at gcc dot gnu dot org 2009-02-21 04:13 -------
Subject: Re: [4.3/4.4 Regression] Inordinate
compile times on large routines
PRE already gives up on this testcase, at least on my computer, and
takes no memory.
All of the memory here is being eaten by IRA and DF.
The actual time sink is SCCVN's DFS, which builds a large SCC then
counts its size and gives up (which in turn causes PRE to give up).
It's not clear you can really modify this to give up earlier than it
does (since you don't know the size of the SCC until it's already done
all the work anyway) without a ton of work.
I'm replacing this algorithm with a non-SCC based one in 4.5.
On Fri, Feb 20, 2009 at 2:52 PM, lucier at math dot purdue dot edu
<gcc-bugzilla@gcc.gnu.org> wrote:
>
>
> ------- Comment #98 from lucier at math dot purdue dot edu 2009-02-20 19:52 -------
> Thank you, that indeed "fixes" the LICM problem.
>
> Based on some comments for this PR and for PR 39157 I thought that a similar
> patch might apply to PRE.
..
>I think the -O1 and -O2 limits for LICM are quite reasonable; would it be
>possible to limit PRE similarly so that one could compile compiler.i with -O2
>in a reasonable amount of memory?
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=26854
^ permalink raw reply [flat|nested] 144+ messages in thread
* [Bug tree-optimization/26854] [4.3/4.4 Regression] Inordinate compile times on large routines
2006-03-24 20:25 [Bug c/26854] New: Inordinate compile times on large routines lucier at math dot purdue dot edu
` (101 preceding siblings ...)
2009-02-21 4:14 ` dberlin at dberlin dot org
@ 2009-02-21 18:31 ` lucier at math dot purdue dot edu
2009-02-21 18:42 ` rguenther at suse dot de
` (17 subsequent siblings)
120 siblings, 0 replies; 144+ messages in thread
From: lucier at math dot purdue dot edu @ 2009-02-21 18:31 UTC (permalink / raw)
To: gcc-bugs
------- Comment #102 from lucier at math dot purdue dot edu 2009-02-21 18:30 -------
Please humor me:
PRE = Partial Redundancy Elimination
IRA = Integrated Register Allocator
DF = ???
SCCVN = ??? Value Numbering?
DFS = ???
SCC = ? Conflict ?
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=26854
^ permalink raw reply [flat|nested] 144+ messages in thread
* [Bug tree-optimization/26854] [4.3/4.4 Regression] Inordinate compile times on large routines
2006-03-24 20:25 [Bug c/26854] New: Inordinate compile times on large routines lucier at math dot purdue dot edu
` (102 preceding siblings ...)
2009-02-21 18:31 ` lucier at math dot purdue dot edu
@ 2009-02-21 18:42 ` rguenther at suse dot de
2009-02-21 18:56 ` lucier at math dot purdue dot edu
` (16 subsequent siblings)
120 siblings, 0 replies; 144+ messages in thread
From: rguenther at suse dot de @ 2009-02-21 18:42 UTC (permalink / raw)
To: gcc-bugs
------- Comment #103 from rguenther at suse dot de 2009-02-21 18:42 -------
Subject: Re: [4.3/4.4 Regression] Inordinate
compile times on large routines
On Sat, 21 Feb 2009, lucier at math dot purdue dot edu wrote:
> ------- Comment #102 from lucier at math dot purdue dot edu 2009-02-21 18:30 -------
> Please humor me:
>
> PRE = Partial Redundancy Elimination
> IRA = Integrated Register Allocator
> DF = ???
> SCCVN = ??? Value Numbering?
> DFS = ???
> SCC = ? Conflict ?
http://gcc.gnu.org/wiki/abbreviations_and_acronyms
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=26854
^ permalink raw reply [flat|nested] 144+ messages in thread
* [Bug tree-optimization/26854] [4.3/4.4 Regression] Inordinate compile times on large routines
2006-03-24 20:25 [Bug c/26854] New: Inordinate compile times on large routines lucier at math dot purdue dot edu
` (103 preceding siblings ...)
2009-02-21 18:42 ` rguenther at suse dot de
@ 2009-02-21 18:56 ` lucier at math dot purdue dot edu
2009-02-21 19:04 ` steven at gcc dot gnu dot org
` (15 subsequent siblings)
120 siblings, 0 replies; 144+ messages in thread
From: lucier at math dot purdue dot edu @ 2009-02-21 18:56 UTC (permalink / raw)
To: gcc-bugs
------- Comment #104 from lucier at math dot purdue dot edu 2009-02-21 18:56 -------
Subject: Re: [4.3/4.4 Regression] Inordinate
compile times on large routines
Cool, that leaves me with
> > DFS = ???
> > SCC = ? Conflict ?
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=26854
^ permalink raw reply [flat|nested] 144+ messages in thread
* [Bug tree-optimization/26854] [4.3/4.4 Regression] Inordinate compile times on large routines
2006-03-24 20:25 [Bug c/26854] New: Inordinate compile times on large routines lucier at math dot purdue dot edu
` (104 preceding siblings ...)
2009-02-21 18:56 ` lucier at math dot purdue dot edu
@ 2009-02-21 19:04 ` steven at gcc dot gnu dot org
2009-02-21 22:35 ` dberlin at dberlin dot org
` (14 subsequent siblings)
120 siblings, 0 replies; 144+ messages in thread
From: steven at gcc dot gnu dot org @ 2009-02-21 19:04 UTC (permalink / raw)
To: gcc-bugs
------- Comment #105 from steven at gcc dot gnu dot org 2009-02-21 19:04 -------
SCC as in SCCVN
DFS = Depth First Search
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=26854
^ permalink raw reply [flat|nested] 144+ messages in thread
* [Bug tree-optimization/26854] [4.3/4.4 Regression] Inordinate compile times on large routines
2006-03-24 20:25 [Bug c/26854] New: Inordinate compile times on large routines lucier at math dot purdue dot edu
` (105 preceding siblings ...)
2009-02-21 19:04 ` steven at gcc dot gnu dot org
@ 2009-02-21 22:35 ` dberlin at dberlin dot org
2009-05-08 12:23 ` [Bug tree-optimization/26854] [4.3/4.4/4.5 " bonzini at gcc dot gnu dot org
` (13 subsequent siblings)
120 siblings, 0 replies; 144+ messages in thread
From: dberlin at dberlin dot org @ 2009-02-21 22:35 UTC (permalink / raw)
To: gcc-bugs
------- Comment #106 from dberlin at gcc dot gnu dot org 2009-02-21 22:34 -------
Subject: Re: [4.3/4.4 Regression] Inordinate
compile times on large routines
Right.
Basically, the value numbering PRE uses as a pre-pass is known as SCCVN.
It value numbers by doing a depth first search over the SSA variables,
iterating only over cycles (which end up forming Strongly Connected
Components in this graph).
In your case, you end up with a strongly connected component
containing 46000 variables. Value numbering gives up at that point
(once value numbering gives up, PRE gives up as well).
The SCC finding algorithm is linear (the value numbering algorithm is
not) but the constant can be large sometimes.
My guess is that in this case, we are wasting time in the vec pushing
or something.
I haven't profiled it.
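
For readers following along, a minimal sketch of the "find the SCC, then check its size" pattern, using an iterative form of Tarjan's algorithm on a plain adjacency dict. This is not the tree-ssa-sccvn.c code; the function names and the max_scc_size parameter are assumptions for illustration.

```python
def largest_scc_size(graph):
    """Return the size of the largest strongly connected component.

    graph maps each node to a list of successor nodes; every successor
    must also be a key. Uses Tarjan's algorithm with an explicit stack
    so that very large components do not hit the recursion limit.
    """
    index, low = {}, {}
    on_stack, stack = set(), []
    counter = 0
    largest = 0
    for root in graph:
        if root in index:
            continue
        index[root] = low[root] = counter
        counter += 1
        stack.append(root)
        on_stack.add(root)
        work = [(root, iter(graph[root]))]
        while work:
            node, it = work[-1]
            descended = False
            for nxt in it:
                if nxt not in index:
                    index[nxt] = low[nxt] = counter
                    counter += 1
                    stack.append(nxt)
                    on_stack.add(nxt)
                    work.append((nxt, iter(graph[nxt])))
                    descended = True
                    break
                if nxt in on_stack:
                    low[node] = min(low[node], index[nxt])
            if descended:
                continue
            work.pop()
            if work:
                parent = work[-1][0]
                low[parent] = min(low[parent], low[node])
            if low[node] == index[node]:
                # node is the root of a component; only now is its size known.
                size = 0
                while True:
                    w = stack.pop()
                    on_stack.discard(w)
                    size += 1
                    if w == node:
                        break
                largest = max(largest, size)
    return largest


def should_value_number(graph, max_scc_size):
    """Return False (give up) if any component exceeds max_scc_size."""
    return largest_scc_size(graph) <= max_scc_size
```

The size of a component is only available when its root is popped, after the search has already walked every member, which matches the observation that giving up earlier is not straightforward.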
On Sat, Feb 21, 2009 at 2:04 PM, steven at gcc dot gnu dot org
<gcc-bugzilla@gcc.gnu.org> wrote:
>
>
> ------- Comment #105 from steven at gcc dot gnu dot org 2009-02-21 19:04 -------
> SCC as in SCCVN
> DFS = Depth First Search
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=26854
^ permalink raw reply [flat|nested] 144+ messages in thread
* [Bug tree-optimization/26854] [4.3/4.4/4.5 Regression] Inordinate compile times on large routines
2006-03-24 20:25 [Bug c/26854] New: Inordinate compile times on large routines lucier at math dot purdue dot edu
` (106 preceding siblings ...)
2009-02-21 22:35 ` dberlin at dberlin dot org
@ 2009-05-08 12:23 ` bonzini at gcc dot gnu dot org
2009-06-15 16:30 ` bonzini at gnu dot org
` (12 subsequent siblings)
120 siblings, 0 replies; 144+ messages in thread
From: bonzini at gcc dot gnu dot org @ 2009-05-08 12:23 UTC (permalink / raw)
To: gcc-bugs
------- Comment #107 from bonzini at gnu dot org 2009-05-08 12:22 -------
Subject: Bug 26854
Author: bonzini
Date: Fri May 8 12:22:30 2009
New Revision: 147282
URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=147282
Log:
2009-05-08 Paolo Bonzini <bonzini@gnu.org>
PR rtl-optimization/33928
PR 26854
* fwprop.c (use_def_ref, get_def_for_use, bitmap_only_bit_between,
process_uses, build_single_def_use_links): New.
(update_df): Update use_def_ref.
(forward_propagate_into): Use get_def_for_use instead of use-def
chains.
(fwprop_init): Call build_single_def_use_links and let it initialize
dataflow.
(fwprop_done): Free use_def_ref.
(fwprop_addr): Eliminate duplicate call to df_set_flags.
* df-problems.c (df_rd_simulate_artificial_defs_at_top,
df_rd_simulate_one_insn): New.
(df_rd_bb_local_compute_process_def): Update head comment.
(df_chain_create_bb): Use the new RD simulation functions.
* df.h (df_rd_simulate_artificial_defs_at_top,
df_rd_simulate_one_insn): New.
* opts.c (decode_options): Enable fwprop at -O1.
* doc/invoke.texi (-fforward-propagate): Document this.
Modified:
trunk/gcc/ChangeLog
trunk/gcc/df-problems.c
trunk/gcc/df.h
trunk/gcc/doc/invoke.texi
trunk/gcc/fwprop.c
trunk/gcc/opts.c
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=26854
^ permalink raw reply [flat|nested] 144+ messages in thread
* [Bug tree-optimization/26854] [4.3/4.4/4.5 Regression] Inordinate compile times on large routines
2006-03-24 20:25 [Bug c/26854] New: Inordinate compile times on large routines lucier at math dot purdue dot edu
` (107 preceding siblings ...)
2009-05-08 12:23 ` [Bug tree-optimization/26854] [4.3/4.4/4.5 " bonzini at gcc dot gnu dot org
@ 2009-06-15 16:30 ` bonzini at gnu dot org
2009-06-27 14:49 ` bonzini at gcc dot gnu dot org
` (11 subsequent siblings)
120 siblings, 0 replies; 144+ messages in thread
From: bonzini at gnu dot org @ 2009-06-15 16:30 UTC (permalink / raw)
To: gcc-bugs
------- Comment #108 from bonzini at gnu dot org 2009-06-15 16:30 -------
http://gcc.gnu.org/bugzilla/attachment.cgi?id=17968
This is the current state of -ftime-report/-fmem-report after the proposed
reimplementation of fwprop's dataflow.
Remaining hogs are:
1) Accounting for TV_ALIAS_STMT_WALKING is expensive. Waiting for info from
Brad as to how much of the cost is paid without -ftime-report. If it turns out
to be noticeable, Richi preapproved the trivial patch to remove the timevar.
2) CFG cleanup makes heavy use of the iterative fixing of dominators in
remove_edge_and_dominated_blocks, which is very expensive on this testcase.
We should probably make sure no dominators are available in some key
cfgcleanup passes, or just kill dominators at the beginning of CFG cleanup if
the testcase is particularly large.
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=26854
* [Bug tree-optimization/26854] [4.3/4.4/4.5 Regression] Inordinate compile times on large routines
From: bonzini at gcc dot gnu dot org @ 2009-06-27 14:49 UTC (permalink / raw)
To: gcc-bugs
------- Comment #109 from bonzini at gnu dot org 2009-06-27 14:48 -------
Subject: Bug 26854
Author: bonzini
Date: Sat Jun 27 14:48:34 2009
New Revision: 149010
URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=149010
Log:
2009-06-07 Paolo Bonzini <bonzini@gnu.org>
PR rtl-optimization/26854
* timevar.def: Remove TV_DF_RU, add TV_DF_MD.
* df-problems.c (df_rd_add_problem): Fix comment.
(df_md_set_bb_info, df_md_free_bb_info, df_md_alloc,
df_md_simulate_artificial_defs_at_top,
df_md_simulate_one_insn, df_md_bb_local_compute_process_def,
df_md_bb_local_compute, df_md_local_compute, df_md_reset,
df_md_transfer_function, df_md_init, df_md_confluence_0,
df_md_confluence_n, df_md_free, df_md_top_dump, df_md_bottom_dump,
problem_MD, df_md_add_problem): New.
* df.h (DF_MD, DF_MD_BB_INFO, struct df_md_bb_info, df_md,
df_md_get_bb_info): New.
(DF_LAST_PROBLEM_PLUS1): Adjust.
* Makefile.in (fwprop.o): Include domwalk.h.
* fwprop.c: Include domwalk.h.
(reg_defs, reg_defs_stack): New.
(bitmap_only_bit_between): Remove.
(process_defs): New.
(process_uses): Use reg_defs and local_md instead of
bitmap_only_bit_between and local_rd.
(single_def_use_enter_block): New, from build_single_def_use_links.
(single_def_use_leave_block): New.
(build_single_def_use_links): Remove code moved to
single_def_use_enter_block, invoke domwalk.
(use_killed_between): Adjust comment.
Modified:
trunk/gcc/ChangeLog
trunk/gcc/Makefile.in
trunk/gcc/df-problems.c
trunk/gcc/df.h
trunk/gcc/fwprop.c
trunk/gcc/timevar.def
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=26854
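The log above says build_single_def_use_links now invokes a dominator walk, with enter/leave callbacks and a reg_defs stack. The toy Python sketch below illustrates that general technique only: walk the dominator tree, remember the definition currently visible for each register, and restore the previous state when leaving a block. It is not the code in fwprop.c. The Block class, the single_def_links function and the coarse per-register `multi` set (a stand-in for the information the MD problem provides) are hypothetical names and simplifications.

```python
from dataclasses import dataclass, field


@dataclass
class Block:
    name: str
    # Statements in program order: ("def", reg) or ("use", reg).
    stmts: list
    # Children in the dominator tree (blocks this block immediately dominates).
    children: list = field(default_factory=list)


def single_def_links(root, multi):
    """Map each use (block, index) to the (block, index) of its dominating def.

    `multi` is the set of registers that may have several reaching
    definitions somewhere; uses of those registers are left unlinked.
    This is coarser than a per-program-point analysis.
    """
    current = {}  # reg -> location of the def visible at this point of the walk
    links = {}

    def walk(block):
        saved = []  # (reg, previous def) pairs to restore when leaving
        for i, (kind, reg) in enumerate(block.stmts):
            if kind == "def":
                saved.append((reg, current.get(reg)))
                current[reg] = (block.name, i)
            elif reg not in multi and reg in current:
                links[(block.name, i)] = current[reg]
        for child in block.children:
            walk(child)
        # Leaving the block: undo its definitions in reverse order.
        for reg, prev in reversed(saved):
            if prev is None:
                current.pop(reg, None)
            else:
                current[reg] = prev

    walk(root)
    return links


# A diamond: "entry" dominates "then", "else" and "join".
entry = Block("entry", [("def", "r1")])
then = Block("then", [("use", "r1"), ("def", "r2"), ("use", "r2")])
other = Block("else", [("def", "r2")])
join = Block("join", [("use", "r1"), ("use", "r2")])
entry.children = [then, other, join]
links = single_def_links(entry, multi={"r2"})
```

In this example both uses of r1 are linked to the definition in "entry", while the use of r2 in "join" is never linked because neither definition of r2 dominates it. The walk is recursive for brevity; a routine as large as the ones in this bug would need an explicit stack.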
* [Bug tree-optimization/26854] [4.3/4.4/4.5 Regression] Inordinate compile times on large routines
From: rguenth at gcc dot gnu dot org @ 2009-08-04 12:33 UTC (permalink / raw)
To: gcc-bugs
------- Comment #110 from rguenth at gcc dot gnu dot org 2009-08-04 12:27 -------
GCC 4.3.4 is being released, adjusting target milestone.
--
rguenth at gcc dot gnu dot org changed:
What |Removed |Added
----------------------------------------------------------------------------
Target Milestone|4.3.4 |4.3.5
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=26854
* [Bug tree-optimization/26854] [4.3/4.4/4.5 Regression] Inordinate compile times on large routines
From: bergner at gcc dot gnu dot org @ 2009-10-03 1:39 UTC (permalink / raw)
To: gcc-bugs
------- Comment #111 from bergner at gcc dot gnu dot org 2009-10-03 01:39 -------
Subject: Bug 26854
Author: bergner
Date: Sat Oct 3 01:39:14 2009
New Revision: 152430
URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=152430
Log:
Backport from mainline.
2009-08-30 Alan Modra <amodra@bigpond.net.au>
PR target/41081
* fwprop.c (get_reg_use_in): Delete.
(free_load_extend): New function.
(forward_propagate_subreg): Use it.
2009-08-23 Alan Modra <amodra@bigpond.net.au>
PR target/41081
* fwprop.c (try_fwprop_subst): Allow multiple sets.
(get_reg_use_in): New function.
(forward_propagate_subreg): Propagate through subreg of zero_extend
or sign_extend.
2009-05-08 Paolo Bonzini <bonzini@gnu.org>
PR rtl-optimization/33928
PR 26854
* fwprop.c (use_def_ref, get_def_for_use, bitmap_only_bit_between,
process_uses, build_single_def_use_links): New.
(update_df): Update use_def_ref.
(forward_propagate_into): Use get_def_for_use instead of use-def
chains.
(fwprop_init): Call build_single_def_use_links and let it initialize
dataflow.
(fwprop_done): Free use_def_ref.
(fwprop_addr): Eliminate duplicate call to df_set_flags.
* df-problems.c (df_rd_simulate_artificial_defs_at_top,
df_rd_simulate_one_insn): New.
(df_rd_bb_local_compute_process_def): Update head comment.
(df_chain_create_bb): Use the new RD simulation functions.
* df.h (df_rd_simulate_artificial_defs_at_top,
df_rd_simulate_one_insn): New.
* opts.c (decode_options): Enable fwprop at -O1.
* doc/invoke.texi (-fforward-propagate): Document this.
Modified:
branches/ibm/gcc-4_3-branch/gcc/ChangeLog.ibm
branches/ibm/gcc-4_3-branch/gcc/REVISION
branches/ibm/gcc-4_3-branch/gcc/df-problems.c
branches/ibm/gcc-4_3-branch/gcc/df.h
branches/ibm/gcc-4_3-branch/gcc/doc/invoke.texi
branches/ibm/gcc-4_3-branch/gcc/fwprop.c
branches/ibm/gcc-4_3-branch/gcc/opts.c
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=26854
* [Bug tree-optimization/26854] [4.3/4.4/4.5 Regression] Inordinate compile times on large routines
From: howarth at nitro dot med dot uc dot edu @ 2010-03-26 17:44 UTC (permalink / raw)
To: gcc-bugs
------- Comment #112 from howarth at nitro dot med dot uc dot edu 2010-03-26 17:44 -------
What is the status of this bug?
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=26854
* [Bug tree-optimization/26854] [4.3/4.4/4.5 Regression] Inordinate compile times on large routines
From: lucier at math dot purdue dot edu @ 2010-03-27 4:28 UTC (permalink / raw)
To: gcc-bugs
------- Comment #113 from lucier at math dot purdue dot edu 2010-03-27 04:27 -------
Created an attachment (id=20220)
--> (http://gcc.gnu.org/bugzilla/attachment.cgi?id=20220&action=view)
time/mem report compiling compiler.i
This is the time and detailed memory report for 20100302 compiling compiler.i
above with main optimization options -O1 -fschedule-insns2 (precise command
line and configuration options are given at the top of the file).
With these optimization options, CPU time and memory use don't look too bad to
me. The most expensive passes are:
parser           : 320.93 (59%) usr   1.40 (27%) sys 322.62 (59%) wall  103143 kB (15%) ggc
tree CFG cleanup :  73.43 (14%) usr   0.01 ( 0%) sys  73.46 (13%) wall    1388 kB ( 0%) ggc
Nothing else is above 3%.
I'm building today's gcc on an X86-64 RHEL5 machine with more memory to test
with -O3 -fschedule-insns, as this set of options now gives about 20% speedup
on some of my codes of this type.
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=26854
* [Bug tree-optimization/26854] [4.3/4.4/4.5 Regression] Inordinate compile times on large routines
From: lucier at math dot purdue dot edu @ 2010-03-27 4:59 UTC (permalink / raw)
To: gcc-bugs
------- Comment #114 from lucier at math dot purdue dot edu 2010-03-27 04:59 -------
Created an attachment (id=20221)
--> (http://gcc.gnu.org/bugzilla/attachment.cgi?id=20221&action=view)
time/mem report compiling compiler.i
This is the time and detailed memory report for compiling compiler.i with
today's gcc and optimization level -O3 -fschedule-insns. Again, the detailed
configuration information and command line are contained at the beginning of
the file.
Except for taking > 20GB of RAM, this doesn't look too bad, either. The passes
taking the most time are:
parser           :  222.18 (21%) usr   2.95 (11%) sys  225.37 (21%) wall  103148 kB (11%) ggc
tree CFG cleanup :   63.67 ( 6%) usr   0.00 ( 0%) sys   63.60 ( 6%) wall    2467 kB ( 0%) ggc
scheduling       :  394.04 (37%) usr   0.00 ( 0%) sys  394.04 (36%) wall    5824 kB ( 1%) ggc
TOTAL            : 1056.69            26.47           1083.41             916872 kB
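The percentages in the report above can be cross-checked against the TOTAL line. The following is a minimal sketch, assuming the line layout quoted in this message; the regular expression and the usr_share helper are illustrative and not part of GCC.

```python
import re

# One pass line as quoted above: name, usr seconds, then the usr percentage.
LINE = re.compile(
    r"^\s*(?P<name>.+?)\s*:\s*(?P<usr>\d+\.\d+)\s*\(\s*(?P<pct>\d+)%\)\s*usr"
)


def usr_share(line, total_usr):
    """Return (pass name, usr seconds, recomputed %, reported %)."""
    m = LINE.match(line)
    if m is None:
        raise ValueError(f"unrecognized report line: {line!r}")
    usr = float(m["usr"])
    return m["name"], usr, round(100 * usr / total_usr), int(m["pct"])


# Figures copied from the -O3 -fschedule-insns report in this message.
TOTAL_USR = 1056.69
report = [
    "parser : 222.18 (21%) usr 2.95 (11%) sys 225.37 (21%) wall",
    "tree CFG cleanup : 63.67 ( 6%) usr 0.00 ( 0%) sys 63.60 ( 6%) wall",
    "scheduling : 394.04 (37%) usr 0.00 ( 0%) sys 394.04 (36%) wall",
]
for line in report:
    name, usr, computed, reported = usr_share(line, TOTAL_USR)
    print(f"{name}: {usr:.2f}s usr, computed {computed}%, reported {reported}%")
```

For these three lines the recomputed shares agree with the reported ones: 21%, 6% and 37% of user time.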
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=26854
* [Bug tree-optimization/26854] [4.3/4.4/4.5 Regression] Inordinate compile times on large routines
From: lucier at math dot purdue dot edu @ 2010-03-27 5:20 UTC (permalink / raw)
To: gcc-bugs
------- Comment #115 from lucier at math dot purdue dot edu 2010-03-27 05:20 -------
Created an attachment (id=20222)
--> (http://gcc.gnu.org/bugzilla/attachment.cgi?id=20222&action=view)
time/mem report compiling compiler.i with -O1
Here is the time and memory report with -O1 -fschedule-insns2 on the same
machine as the -O3 -fschedule-insns report.
The biggest times are:
parser           : 224.89 (54%) usr   2.61 (24%) sys 226.97 (53%) wall  103148 kB (15%) ggc
tree CFG cleanup :  60.61 (15%) usr   0.00 ( 0%) sys  60.58 (14%) wall    1388 kB ( 0%) ggc
reload           :  19.17 ( 5%) usr   0.00 ( 0%) sys  19.17 ( 5%) wall    4694 kB ( 1%) ggc
TOTAL            : 413.29           10.95          424.28             709657 kB
--
lucier at math dot purdue dot edu changed:
           What              |Removed                     |Added
----------------------------------------------------------------------------
Attachment #20220 is obsolete|0                           |1
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=26854
* [Bug tree-optimization/26854] [4.3/4.4/4.5 Regression] Inordinate compile times on large routines
From: rguenth at gcc dot gnu dot org @ 2010-03-27 11:15 UTC (permalink / raw)
To: gcc-bugs
------- Comment #116 from rguenth at gcc dot gnu dot org 2010-03-27 11:14 -------
Given that parsing takes most of the time, the compile time indeed looks
reasonable. That DF uses >20GB of RAM at -O3 is still unfortunate, but the
-O1 numbers indeed look good.
I wonder if the parsing numbers are accurate as the initial report has
like 9s parsing while the current ones are >200s. Can you explain that
difference? (like, were you testing different source?)
As is the testcase(s) are an interesting source of information - maybe we
should gather those up on a page in the wiki just in case we end up closing
this bug at some point (I suggest not to at the moment, the parsing times
look odd and >20GB memory use doesn't sound reasonable). Did you ever
test other compilers and see how they perform with respect to memory usage
and compile time?
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=26854
* [Bug tree-optimization/26854] [4.3/4.4/4.5 Regression] Inordinate compile times on large routines
From: lucier at math dot purdue dot edu @ 2010-03-27 16:38 UTC (permalink / raw)
To: gcc-bugs
------- Comment #117 from lucier at math dot purdue dot edu 2010-03-27 16:38 -------
Subject: Re: [4.3/4.4/4.5 Regression] Inordinate compile times on large routines
On Mar 27, 2010, at 7:14 AM, rguenth at gcc dot gnu dot org wrote:
> I wonder if the parsing numbers are accurate as the initial report has
> like 9s parsing while the current ones are >200s. Can you explain that
> difference? (like, were you testing different source?)
Yes, different source (compiler.i instead of all.i), different
(faster) machine. Perhaps gathering the detailed memory stats affects
the parser time.
Here are times for the original source file all.i using the same
machine and compiler as in the immediately previous report for
compiler.i:
df live&initialized regs:  45.00 ( 8%) usr   0.00 ( 0%) sys  45.04 ( 8%) wall       0 kB ( 0%) ggc
parser                  :  19.60 ( 3%) usr   1.22 ( 7%) sys  21.25 ( 4%) wall   70217 kB ( 2%) ggc
scheduling              : 301.86 (52%) usr   0.00 ( 0%) sys 301.87 (51%) wall    8739 kB ( 0%) ggc
TOTAL                   : 579.88           17.55          597.65            3393985 kB
Glancing at top, the maximum reported memory usage was > 13GB. I'll
attach the detailed results for all.i next.
> As is the testcase(s) are an interesting source of information - maybe we
> should gather those up on a page in the wiki just in case we end up closing
> this bug at some point (I suggest not to at the moment, the parsing times
> look odd and >20GB memory use doesn't sound reasonable). Did you ever
> test other compilers and see how they perform with respect to memory usage
> and compile time?
No, none that were not a gcc derivative.
Brad
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=26854
* [Bug tree-optimization/26854] [4.3/4.4/4.5 Regression] Inordinate compile times on large routines
From: lucier at math dot purdue dot edu @ 2010-03-27 16:45 UTC (permalink / raw)
To: gcc-bugs
------- Comment #118 from lucier at math dot purdue dot edu 2010-03-27 16:44 -------
Created an attachment (id=20224)
--> (http://gcc.gnu.org/bugzilla/attachment.cgi?id=20224&action=view)
time/memory report compiling all.i with -O3
These are the detailed time and memory statistics reported when compiling all.i
with -O3 -fschedule-insns on x86-64.
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=26854
* [Bug tree-optimization/26854] [4.3/4.4/4.5/4.6 Regression] Inordinate compile times on large routines
From: bergner at gcc dot gnu dot org @ 2010-04-29 14:35 UTC (permalink / raw)
To: gcc-bugs
------- Comment #119 from bergner at gcc dot gnu dot org 2010-04-29 14:34 -------
Subject: Bug 26854
Author: bergner
Date: Thu Apr 29 14:34:35 2010
New Revision: 158902
URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=158902
Log:
Backport from mainline.
2009-08-30 Alan Modra <amodra@bigpond.net.au>
PR target/41081
* fwprop.c (get_reg_use_in): Delete.
(free_load_extend): New function.
(forward_propagate_subreg): Use it.
2009-08-23 Alan Modra <amodra@bigpond.net.au>
PR target/41081
* fwprop.c (try_fwprop_subst): Allow multiple sets.
(get_reg_use_in): New function.
(forward_propagate_subreg): Propagate through subreg of zero_extend
or sign_extend.
2009-05-08 Paolo Bonzini <bonzini@gnu.org>
PR rtl-optimization/33928
PR 26854
* fwprop.c (use_def_ref, get_def_for_use, bitmap_only_bit_between,
process_uses, build_single_def_use_links): New.
(update_df): Update use_def_ref.
(forward_propagate_into): Use get_def_for_use instead of use-def
chains.
(fwprop_init): Call build_single_def_use_links and let it initialize
dataflow.
(fwprop_done): Free use_def_ref.
(fwprop_addr): Eliminate duplicate call to df_set_flags.
* df-problems.c (df_rd_simulate_artificial_defs_at_top,
df_rd_simulate_one_insn): New.
(df_rd_bb_local_compute_process_def): Update head comment.
(df_chain_create_bb): Use the new RD simulation functions.
* df.h (df_rd_simulate_artificial_defs_at_top,
df_rd_simulate_one_insn): New.
* opts.c (decode_options): Enable fwprop at -O1.
* doc/invoke.texi (-fforward-propagate): Document this.
Modified:
branches/ibm/gcc-4_4-branch/gcc/ChangeLog.ibm
branches/ibm/gcc-4_4-branch/gcc/df-problems.c
branches/ibm/gcc-4_4-branch/gcc/df.h
branches/ibm/gcc-4_4-branch/gcc/doc/invoke.texi
branches/ibm/gcc-4_4-branch/gcc/fwprop.c
branches/ibm/gcc-4_4-branch/gcc/opts.c
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=26854
* [Bug tree-optimization/26854] [4.3/4.4/4.5/4.6 Regression] Inordinate compile times on large routines
From: rguenth at gcc dot gnu dot org @ 2010-05-22 18:16 UTC (permalink / raw)
To: gcc-bugs
------- Comment #120 from rguenth at gcc dot gnu dot org 2010-05-22 18:10 -------
GCC 4.3.5 is being released, adjusting target milestone.
--
rguenth at gcc dot gnu dot org changed:
What |Removed |Added
----------------------------------------------------------------------------
Target Milestone|4.3.5 |4.3.6
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=26854