public inbox for gcc-bugs@sourceware.org
* [Bug lto/49700] New: LTO compile time hog
From: Joost.VandeVondele at pci dot uzh.ch @ 2011-07-11 6:29 UTC
To: gcc-bugs
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=49700
Summary: LTO compile time hog
Product: gcc
Version: 4.7.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: lto
AssignedTo: unassigned@gcc.gnu.org
ReportedBy: Joost.VandeVondele@pci.uzh.ch
With LTO, a CP2K compile can take several hours and a lot of memory. I have
been able to extract the following time report, but what is the best way to
make a testcase? -save-temps yielded the cp2k.sopt.ltrans29.ltrans.o file, but
is anything more needed?
gfortran @/dev/shm/vondele/ccGyXn6S.args
Using built-in specs.
COLLECT_GCC=gfortran
COLLECT_LTO_WRAPPER=/data03/vondele/gnu/gcc_trunk/install/libexec/gcc/x86_64-unknown-linux-gnu/4.7.0/lto-wrapper
Target: x86_64-unknown-linux-gnu
Configured with: ../gcc/configure
--prefix=/data03/vondele/gnu/gcc_trunk/install --enable-languages=c,c++,fortran
--disable-multilib --enable-plugins --enable-cloog-backend=isl
--with-ppl=/data03/vondele/gnu/ppl-0.11/install
--with-cloog=/data03/vondele/gnu/cloog-0.16.1/install/
--with-libelf=/data03/vondele/gnu/libelf-0.8.13/install
--with-plugin-ld=ld.gold
Thread model: posix
gcc version 4.7.0 20110709 (experimental) [trunk revision 176072] (GCC)
COLLECT_GCC_OPTIONS='-c' '-v' '-save-temps' '-ftime-report' '-u'
'se-linker-plugin' '-O3' '-march=native' '-funroll-loops' '-ffast-math'
'-ffree-form' '-D' '__GFORTRAN' '-D' '__FFTSG' '-D' '__FFTW3' '-D' '__LIBINT'
'-I' '/ext/software/64/gfortran-suite/include' '-D'
'__COMPILE_ARCH="gfortran-test13"' '-D' '__COMPILE_DATE="Sun Jul 10 14:22:33
CEST 2011"' '-D' '__COMPILE_HOST="pcihopt3"' '-D'
'__COMPILE_LASTCVS="/qs_scf.F/1.527/Sat Jul 9 07:18:08 2011//"'
'-L/users/vondele/LAPACK/' '-L/ext/software/64/gfortran-suite/lib'
'-shared-libgcc' '-dumpdir'
'/data03/vondele/cp2k_gcc/cp2k/makefiles/../exe/gfortran-test13/' '-dumpbase'
'cp2k.sopt.ltrans29' '-fltrans' '-o' 'cp2k.sopt.ltrans29.ltrans.o'
/data03/vondele/gnu/gcc_trunk/install/libexec/gcc/x86_64-unknown-linux-gnu/4.7.0/lto1
-march=amdfam10 -mcx16 -msahf -mno-movbe -mno-aes -mno-pclmul -mpopcnt -mabm
-mno-lwp -mno-fma -mno-fma4 -mno-xop -mno-bmi -mno-tbm -mno-avx -mno-sse4.2
-mno-sse4.1 --param l1-cache-size=64 --param l1-cache-line-size=64 --param
l2-cache-size=512 -mtune=amdfam10 -quiet -dumpdir
/data03/vondele/cp2k_gcc/cp2k/makefiles/../exe/gfortran-test13/ -dumpbase
cp2k.sopt.ltrans29 -auxbase-strip cp2k.sopt.ltrans29.ltrans.o -O3 -version
-ftime-report -funroll-loops -ffast-math -ffree-form -fltrans
@/dev/shm/vondele/ccuRF1W6 -o cp2k.sopt.ltrans29.s
GNU GIMPLE (GCC) version 4.7.0 20110709 (experimental) [trunk revision 176072]
(x86_64-unknown-linux-gnu)
compiled by GNU C version 4.7.0 20110709 (experimental) [trunk revision
176072], GMP version 4.3.2, MPFR version 2.4.2, MPC version 0.8.2
GGC heuristics: --param ggc-min-expand=30 --param ggc-min-heapsize=4096
GNU GIMPLE (GCC) version 4.7.0 20110709 (experimental) [trunk revision 176072]
(x86_64-unknown-linux-gnu)
compiled by GNU C version 4.7.0 20110709 (experimental) [trunk revision
176072], GMP version 4.3.2, MPFR version 2.4.2, MPC version 0.8.2
GGC heuristics: --param ggc-min-expand=30 --param ggc-min-heapsize=4096
Execution times (seconds)
phase setup : 0.00 ( 0%) usr 0.00 ( 0%) sys 0.02 ( 0%) wall 1250 kB ( 0%) ggc
phase parsing :5086.93 (100%) usr 9.45 (100%) sys5213.35 (100%) wall 1072513 kB (100%) ggc
phase generate : 0.00 ( 0%) usr 0.00 ( 0%) sys 0.01 ( 0%) wall 0 kB ( 0%) ggc
garbage collection : 4.16 ( 0%) usr 0.02 ( 0%) sys 4.22 ( 0%) wall 0 kB ( 0%) ggc
ipa lto gimple in : 0.13 ( 0%) usr 0.01 ( 0%) sys 0.16 ( 0%) wall 26100 kB ( 2%) ggc
ipa lto decl in : 0.86 ( 0%) usr 0.01 ( 0%) sys 0.89 ( 0%) wall 7027 kB ( 1%) ggc
ipa pure const : 0.03 ( 0%) usr 0.00 ( 0%) sys 0.03 ( 0%) wall 11 kB ( 0%) ggc
cfg construction : 0.62 ( 0%) usr 0.00 ( 0%) sys 0.63 ( 0%) wall 3820 kB ( 0%) ggc
cfg cleanup : 2.89 ( 0%) usr 0.00 ( 0%) sys 2.91 ( 0%) wall 4416 kB ( 0%) ggc
CFG verifier : 49.95 ( 1%) usr 0.00 ( 0%) sys 50.28 ( 1%) wall 0 kB ( 0%) ggc
trivially dead code : 0.88 ( 0%) usr 0.00 ( 0%) sys 0.90 ( 0%) wall 0 kB ( 0%) ggc
df scan insns : 0.20 ( 0%) usr 0.00 ( 0%) sys 0.23 ( 0%) wall 20 kB ( 0%) ggc
df multiple defs : 0.34 ( 0%) usr 0.00 ( 0%) sys 0.34 ( 0%) wall 0 kB ( 0%) ggc
df reaching defs : 60.58 ( 1%) usr 1.10 (12%) sys 62.35 ( 1%) wall 0 kB ( 0%) ggc
df live regs : 13.28 ( 0%) usr 0.04 ( 0%) sys 13.56 ( 0%) wall 0 kB ( 0%) ggc
df live&initialized regs: 5.43 ( 0%) usr 0.00 ( 0%) sys 5.59 ( 0%) wall 0 kB ( 0%) ggc
df use-def / def-use chains: 1.16 ( 0%) usr 0.01 ( 0%) sys 1.25 ( 0%) wall 0 kB ( 0%) ggc
df live reg subwords : 0.47 ( 0%) usr 0.00 ( 0%) sys 0.46 ( 0%) wall 0 kB ( 0%) ggc
df reg dead/unused notes: 4.22 ( 0%) usr 0.00 ( 0%) sys 4.20 ( 0%) wall 8542 kB ( 1%) ggc
register information : 1.72 ( 0%) usr 0.00 ( 0%) sys 1.77 ( 0%) wall 0 kB ( 0%) ggc
alias analysis : 2.40 ( 0%) usr 0.00 ( 0%) sys 2.40 ( 0%) wall 29964 kB ( 3%) ggc
alias stmt walking : 1.06 ( 0%) usr 0.03 ( 0%) sys 1.11 ( 0%) wall 6438 kB ( 1%) ggc
register scan : 0.17 ( 0%) usr 0.00 ( 0%) sys 0.17 ( 0%) wall 35 kB ( 0%) ggc
rebuild jump labels : 0.85 ( 0%) usr 0.00 ( 0%) sys 0.87 ( 0%) wall 7 kB ( 0%) ggc
inline heuristics : 0.01 ( 0%) usr 0.00 ( 0%) sys 0.00 ( 0%) wall 0 kB ( 0%) ggc
integration : 0.10 ( 0%) usr 0.00 ( 0%) sys 0.10 ( 0%) wall 14586 kB ( 1%) ggc
tree CFG cleanup : 3.58 ( 0%) usr 0.01 ( 0%) sys 3.52 ( 0%) wall 2188 kB ( 0%) ggc
tree VRP : 1.49 ( 0%) usr 0.01 ( 0%) sys 1.56 ( 0%) wall 17581 kB ( 2%) ggc
tree copy propagation : 0.27 ( 0%) usr 0.01 ( 0%) sys 0.29 ( 0%) wall 858 kB ( 0%) ggc
tree PTA : 0.57 ( 0%) usr 0.00 ( 0%) sys 0.63 ( 0%) wall 670 kB ( 0%) ggc
tree SSA rewrite : 8.75 ( 0%) usr 0.01 ( 0%) sys 8.72 ( 0%) wall 1215 kB ( 0%) ggc
tree SSA incremental : 26.73 ( 1%) usr 0.00 ( 0%) sys 27.36 ( 1%) wall 3046 kB ( 0%) ggc
tree operand scan : 0.78 ( 0%) usr 0.26 ( 3%) sys 0.83 ( 0%) wall 39556 kB ( 4%) ggc
dominator optimization : 0.29 ( 0%) usr 0.00 ( 0%) sys 0.38 ( 0%) wall 7238 kB ( 1%) ggc
tree CCP : 0.34 ( 0%) usr 0.00 ( 0%) sys 0.42 ( 0%) wall 1098 kB ( 0%) ggc
tree PHI const/copy prop: 0.01 ( 0%) usr 0.00 ( 0%) sys 0.01 ( 0%) wall 126 kB ( 0%) ggc
tree split crit edges : 0.03 ( 0%) usr 0.00 ( 0%) sys 0.01 ( 0%) wall 1301 kB ( 0%) ggc
tree reassociation : 0.19 ( 0%) usr 0.01 ( 0%) sys 0.26 ( 0%) wall 5052 kB ( 0%) ggc
tree PRE : 1.17 ( 0%) usr 0.01 ( 0%) sys 1.14 ( 0%) wall 13070 kB ( 1%) ggc
tree FRE : 0.54 ( 0%) usr 0.01 ( 0%) sys 0.57 ( 0%) wall 6904 kB ( 1%) ggc
tree code sinking : 0.04 ( 0%) usr 0.00 ( 0%) sys 0.03 ( 0%) wall 427 kB ( 0%) ggc
tree linearize phis : 0.01 ( 0%) usr 0.00 ( 0%) sys 0.02 ( 0%) wall 30 kB ( 0%) ggc
tree forward propagate : 0.05 ( 0%) usr 0.00 ( 0%) sys 0.10 ( 0%) wall 1803 kB ( 0%) ggc
tree phiprop : 0.01 ( 0%) usr 0.00 ( 0%) sys 0.00 ( 0%) wall 0 kB ( 0%) ggc
tree conservative DCE : 0.14 ( 0%) usr 0.01 ( 0%) sys 0.15 ( 0%) wall 155 kB ( 0%) ggc
tree aggressive DCE : 0.36 ( 0%) usr 0.00 ( 0%) sys 0.43 ( 0%) wall 11259 kB ( 1%) ggc
tree DSE : 0.19 ( 0%) usr 0.00 ( 0%) sys 0.18 ( 0%) wall 0 kB ( 0%) ggc
tree loop bounds : 0.31 ( 0%) usr 0.00 ( 0%) sys 0.25 ( 0%) wall 11761 kB ( 1%) ggc
tree loop invariant motion: 0.22 ( 0%) usr 0.02 ( 0%) sys 0.23 ( 0%) wall 4266 kB ( 0%) ggc
tree canonical iv : 0.03 ( 0%) usr 0.00 ( 0%) sys 0.02 ( 0%) wall 875 kB ( 0%) ggc
scev constant prop : 0.04 ( 0%) usr 0.00 ( 0%) sys 0.04 ( 0%) wall 2182 kB ( 0%) ggc
complete unrolling : 0.70 ( 0%) usr 0.00 ( 0%) sys 0.77 ( 0%) wall 37727 kB ( 4%) ggc
tree vectorization : 0.79 ( 0%) usr 0.04 ( 0%) sys 0.84 ( 0%) wall 36624 kB ( 3%) ggc
tree slp vectorization : 0.45 ( 0%) usr 0.00 ( 0%) sys 0.44 ( 0%) wall 11293 kB ( 1%) ggc
tree loop distribution : 3.31 ( 0%) usr 0.00 ( 0%) sys 3.32 ( 0%) wall 857 kB ( 0%) ggc
tree prefetching : 51.24 ( 1%) usr 0.07 ( 1%) sys 52.02 ( 1%) wall 56594 kB ( 5%) ggc
tree iv optimization : 7.47 ( 0%) usr 0.09 ( 1%) sys 7.64 ( 0%) wall 200051 kB (19%) ggc
predictive commoning : 0.25 ( 0%) usr 0.00 ( 0%) sys 0.20 ( 0%) wall 6379 kB ( 1%) ggc
tree loop init : 0.10 ( 0%) usr 0.00 ( 0%) sys 0.08 ( 0%) wall 856 kB ( 0%) ggc
tree copy headers : 0.02 ( 0%) usr 0.00 ( 0%) sys 0.01 ( 0%) wall 1560 kB ( 0%) ggc
tree SSA uncprop : 0.02 ( 0%) usr 0.00 ( 0%) sys 0.03 ( 0%) wall 0 kB ( 0%) ggc
tree rename SSA copies : 0.07 ( 0%) usr 0.00 ( 0%) sys 0.05 ( 0%) wall 0 kB ( 0%) ggc
tree SSA verifier : 376.24 ( 7%) usr 0.04 ( 0%) sys 379.72 ( 7%) wall 0 kB ( 0%) ggc
tree STMT verifier : 439.42 ( 9%) usr 0.16 ( 2%) sys 443.29 ( 9%) wall 0 kB ( 0%) ggc
callgraph verifier : 0.03 ( 0%) usr 0.00 ( 0%) sys 0.06 ( 0%) wall 63 kB ( 0%) ggc
dominance frontiers : 16.93 ( 0%) usr 0.03 ( 0%) sys 16.81 ( 0%) wall 0 kB ( 0%) ggc
dominance computation : 10.78 ( 0%) usr 0.00 ( 0%) sys 10.50 ( 0%) wall 0 kB ( 0%) ggc
control dependences : 0.01 ( 0%) usr 0.00 ( 0%) sys 0.01 ( 0%) wall 0 kB ( 0%) ggc
out of ssa : 0.63 ( 0%) usr 0.00 ( 0%) sys 0.68 ( 0%) wall 207 kB ( 0%) ggc
expand vars : 0.08 ( 0%) usr 0.00 ( 0%) sys 0.08 ( 0%) wall 12686 kB ( 1%) ggc
expand : 0.83 ( 0%) usr 0.00 ( 0%) sys 0.76 ( 0%) wall 50753 kB ( 5%) ggc
post expand cleanups : 0.05 ( 0%) usr 0.00 ( 0%) sys 0.06 ( 0%) wall 3897 kB ( 0%) ggc
lower subreg : 0.28 ( 0%) usr 0.00 ( 0%) sys 0.29 ( 0%) wall 0 kB ( 0%) ggc
forward prop : 1.07 ( 0%) usr 0.00 ( 0%) sys 1.10 ( 0%) wall 11010 kB ( 1%) ggc
CSE : 0.71 ( 0%) usr 0.00 ( 0%) sys 0.77 ( 0%) wall 446 kB ( 0%) ggc
dead code elimination : 1.73 ( 0%) usr 0.00 ( 0%) sys 1.74 ( 0%) wall 0 kB ( 0%) ggc
dead store elim1 : 1.33 ( 0%) usr 0.00 ( 0%) sys 1.35 ( 0%) wall 15295 kB ( 1%) ggc
dead store elim2 : 1.50 ( 0%) usr 0.00 ( 0%) sys 1.54 ( 0%) wall 15743 kB ( 1%) ggc
loop analysis : 0.11 ( 0%) usr 0.00 ( 0%) sys 0.11 ( 0%) wall 1864 kB ( 0%) ggc
loop invariant motion : 24.13 ( 0%) usr 0.02 ( 0%) sys 23.97 ( 0%) wall 1275 kB ( 0%) ggc
loop unswitching : 872.57 (17%) usr 0.01 ( 0%) sys 873.51 (17%) wall 111 kB ( 0%) ggc
loop unrolling :2802.10 (55%) usr 4.15 (44%) sys2910.38 (56%) wall 99541 kB ( 9%) ggc
CPROP : 0.40 ( 0%) usr 0.00 ( 0%) sys 0.26 ( 0%) wall 3209 kB ( 0%) ggc
PRE : 0.22 ( 0%) usr 0.00 ( 0%) sys 0.20 ( 0%) wall 113 kB ( 0%) ggc
code hoisting : 0.09 ( 0%) usr 0.00 ( 0%) sys 0.08 ( 0%) wall 4 kB ( 0%) ggc
web : 1.60 ( 0%) usr 0.02 ( 0%) sys 1.63 ( 0%) wall 6999 kB ( 1%) ggc
CSE 2 : 1.49 ( 0%) usr 0.00 ( 0%) sys 1.53 ( 0%) wall 1326 kB ( 0%) ggc
combiner : 23.55 ( 0%) usr 0.04 ( 0%) sys 23.84 ( 0%) wall 53172 kB ( 5%) ggc
if-conversion : 13.53 ( 0%) usr 0.01 ( 0%) sys 13.69 ( 0%) wall 1910 kB ( 0%) ggc
regmove : 0.92 ( 0%) usr 0.00 ( 0%) sys 0.94 ( 0%) wall 9 kB ( 0%) ggc
integrated RA : 19.24 ( 0%) usr 3.00 (32%) sys 22.40 ( 0%) wall 151112 kB (14%) ggc
reload : 16.17 ( 0%) usr 0.00 ( 0%) sys 16.37 ( 0%) wall 12120 kB ( 1%) ggc
reload CSE regs : 3.43 ( 0%) usr 0.00 ( 0%) sys 3.52 ( 0%) wall 25345 kB ( 2%) ggc
load CSE after reload : 1.71 ( 0%) usr 0.00 ( 0%) sys 1.74 ( 0%) wall 919 kB ( 0%) ggc
zee : 0.47 ( 0%) usr 0.00 ( 0%) sys 0.48 ( 0%) wall 42 kB ( 0%) ggc
thread pro- & epilogue : 0.10 ( 0%) usr 0.00 ( 0%) sys 0.11 ( 0%) wall 618 kB ( 0%) ggc
if-conversion 2 : 0.20 ( 0%) usr 0.01 ( 0%) sys 0.21 ( 0%) wall 776 kB ( 0%) ggc
combine stack adjustments: 0.10 ( 0%) usr 0.00 ( 0%) sys 0.11 ( 0%) wall 0 kB ( 0%) ggc
peephole 2 : 0.32 ( 0%) usr 0.00 ( 0%) sys 0.33 ( 0%) wall 2368 kB ( 0%) ggc
rename registers : 1.13 ( 0%) usr 0.00 ( 0%) sys 1.16 ( 0%) wall 3536 kB ( 0%) ggc
hard reg cprop : 0.82 ( 0%) usr 0.00 ( 0%) sys 0.78 ( 0%) wall 15 kB ( 0%) ggc
scheduling 2 : 4.34 ( 0%) usr 0.00 ( 0%) sys 4.35 ( 0%) wall 351 kB ( 0%) ggc
machine dep reorg : 0.76 ( 0%) usr 0.00 ( 0%) sys 0.73 ( 0%) wall 15 kB ( 0%) ggc
reorder blocks : 0.93 ( 0%) usr 0.00 ( 0%) sys 0.92 ( 0%) wall 2694 kB ( 0%) ggc
final : 1.68 ( 0%) usr 0.09 ( 1%) sys 1.76 ( 0%) wall 423 kB ( 0%) ggc
rest of compilation : 1.73 ( 0%) usr 0.02 ( 0%) sys 1.73 ( 0%) wall 5241 kB ( 0%) ggc
remove unused locals : 0.96 ( 0%) usr 0.00 ( 0%) sys 0.87 ( 0%) wall 0 kB ( 0%) ggc
address taken : 0.05 ( 0%) usr 0.00 ( 0%) sys 0.06 ( 0%) wall 0 kB ( 0%) ggc
unaccounted todo : 1.15 ( 0%) usr 0.02 ( 0%) sys 1.16 ( 0%) wall 9 kB ( 0%) ggc
verify loop closed : 168.68 ( 3%) usr 0.05 ( 1%) sys 169.90 ( 3%) wall 0 kB ( 0%) ggc
verify RTL sharing : 8.07 ( 0%) usr 0.00 ( 0%) sys 8.30 ( 0%) wall 0 kB ( 0%) ggc
rebuild frequencies : 0.01 ( 0%) usr 0.00 ( 0%) sys 0.01 ( 0%) wall 249 kB ( 0%) ggc
repair loop structures : 0.48 ( 0%) usr 0.00 ( 0%) sys 0.48 ( 0%) wall 961 kB ( 0%) ggc
TOTAL :5086.93 9.45 5213.38 1073764 kB
Extra diagnostic checks enabled; compiler may run slowly.
Configure with --enable-checking=release to disable checks.
COLLECT_GCC_OPTIONS='-c' '-v' '-save-temps' '-ftime-report' '-u'
'se-linker-plugin' '-O3' '-march=native' '-funroll-loops' '-ffast-math'
'-ffree-form' '-D' '__GFORTRAN' '-D' '__FFTSG' '-D' '__FFTW3' '-D' '__LIBINT'
'-I' '/ext/software/64/gfortran-suite/include' '-D'
'__COMPILE_ARCH="gfortran-test13"' '-D' '__COMPILE_DATE="Sun Jul 10 14:22:33
CEST 2011"' '-D' '__COMPILE_HOST="pcihopt3"' '-D'
'__COMPILE_LASTCVS="/qs_scf.F/1.527/Sat Jul 9 07:18:08 2011//"'
'-L/users/vondele/LAPACK/' '-L/ext/software/64/gfortran-suite/lib'
'-shared-libgcc' '-dumpdir'
'/data03/vondele/cp2k_gcc/cp2k/makefiles/../exe/gfortran-test13/' '-dumpbase'
'cp2k.sopt.ltrans29' '-fltrans' '-o' 'cp2k.sopt.ltrans29.ltrans.o'
as --64 -o cp2k.sopt.ltrans29.ltrans.o cp2k.sopt.ltrans29.s
COMPILER_PATH=/data03/vondele/gnu/gcc_trunk/install/libexec/gcc/x86_64-unknown-linux-gnu/4.7.0/:/data03/vondele/gnu/gcc_trunk/install/libexec/gcc/x86_64-unknown-linux-gnu/4.7.0/:/data03/vondele/gnu/gcc_trunk/install/libexec/gcc/x86_64-unknown-linux-gnu/:/data03/vondele/gnu/gcc_trunk/install/lib/gcc/x86_64-unknown-linux-gnu/4.7.0/:/data03/vondele/gnu/gcc_trunk/install/lib/gcc/x86_64-unknown-linux-gnu/:/data03/vondele/gnu/gcc_trunk/install/libexec/gcc/x86_64-unknown-linux-gnu/4.7.0/:/data03/vondele/gnu/gcc_trunk/install/libexec/gcc/x86_64-unknown-linux-gnu/4.7.0/:/data03/vondele/gnu/gcc_trunk/install/libexec/gcc/x86_64-unknown-linux-gnu/:/data03/vondele/gnu/gcc_trunk/install/lib/gcc/x86_64-unknown-linux-gnu/4.7.0/:/data03/vondele/gnu/gcc_trunk/install/lib/gcc/x86_64-unknown-linux-gnu/
LIBRARY_PATH=/data03/vondele/gnu/gcc_trunk/install/lib/gcc/x86_64-unknown-linux-gnu/4.7.0/../../../../lib64/../lib64/:/lib/../lib64/../lib64/:/usr/lib/../lib64/../lib64/:/data03/vondele/gnu/gcc_trunk/install/lib/gcc/x86_64-unknown-linux-gnu/4.7.0/../../../../lib64/:/lib/../lib64/:/usr/lib/../lib64/:/data03/vondele/gnu/gcc_trunk/install/lib/gcc/x86_64-unknown-linux-gnu/4.7.0/:/data03/vondele/gnu/gcc_trunk/install/lib/gcc/x86_64-unknown-linux-gnu/4.7.0/../../../../lib64/:/lib/../lib64/:/usr/lib/../lib64/:/data03/vondele/gnu/gcc_trunk/install/lib/gcc/x86_64-unknown-linux-gnu/4.7.0/:/data03/vondele/gnu/gcc_trunk/install/lib/gcc/x86_64-unknown-linux-gnu/4.7.0/../../../../lib64/:/lib/../lib64/:/usr/lib/../lib64/:/data03/vondele/gnu/gcc_trunk/install/lib/gcc/x86_64-unknown-linux-gnu/4.7.0/../../../:/lib/:/usr/lib/:/data03/vondele/gnu/gcc_trunk/install/lib/gcc/x86_64-unknown-linux-gnu/4.7.0/../../../:/lib/:/usr/lib/
COLLECT_GCC_OPTIONS='-c' '-v' '-save-temps' '-ftime-report' '-u'
'se-linker-plugin' '-O3' '-march=native' '-funroll-loops' '-ffast-math'
'-ffree-form' '-D' '__GFORTRAN' '-D' '__FFTSG' '-D' '__FFTW3' '-D' '__LIBINT'
'-I' '/ext/software/64/gfortran-suite/include' '-D'
'__COMPILE_ARCH="gfortran-test13"' '-D' '__COMPILE_DATE="Sun Jul 10 14:22:33
CEST 2011"' '-D' '__COMPILE_HOST="pcihopt3"' '-D'
'__COMPILE_LASTCVS="/qs_scf.F/1.527/Sat Jul 9 07:18:08 2011//"'
'-L/users/vondele/LAPACK/' '-L/ext/software/64/gfortran-suite/lib'
'-shared-libgcc' '-dumpdir'
'/data03/vondele/cp2k_gcc/cp2k/makefiles/../exe/gfortran-test13/' '-dumpbase'
'cp2k.sopt.ltrans29' '-fltrans' '-o' 'cp2k.sopt.ltrans29.ltrans.o'
[Leaving LTRANS /dev/shm/vondele/ccGyXn6S.args]
* [Bug lto/49700] LTO compile time hog
From: rguenth at gcc dot gnu.org @ 2011-07-11 9:21 UTC
To: gcc-bugs
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=49700
Richard Guenther <rguenth at gcc dot gnu.org> changed:
What |Removed |Added
----------------------------------------------------------------------------
CC| |hubicka at gcc dot gnu.org,
| |rguenth at gcc dot gnu.org
--- Comment #1 from Richard Guenther <rguenth at gcc dot gnu.org> 2011-07-11 09:20:37 UTC ---
The
phase parsing :5086.93 (100%) usr 9.45 (100%) sys5213.35 (100%) wall 1072513 kB (100%) ggc
line looks completely odd ;) It would be interesting to know what broke the
LTO I/O type/merging accounting.
Honza? The above is from LTRANS phase, so maybe there's some timevar
pushing missing? It looks like materialize_cgraph is suspicious.
No, the ltrans object file is of no use.
Joost, can you point us to a source tarball? Does the issue reproduce
with simpler flags (plain -O2 or plain -O3? Esp. w/o -march=native?)
* [Bug lto/49700] LTO compile time hog
From: jakub at gcc dot gnu.org @ 2011-07-11 9:29 UTC
To: gcc-bugs
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=49700
Jakub Jelinek <jakub at gcc dot gnu.org> changed:
What |Removed |Added
----------------------------------------------------------------------------
CC| |jakub at gcc dot gnu.org
--- Comment #2 from Jakub Jelinek <jakub at gcc dot gnu.org> 2011-07-11 09:27:50 UTC ---
The phase parsing numbers are uninteresting, as the lto1 FE does almost all the
work in its lto_main (aka the parse_file langhook), including cgraph_optimize. So
that time includes basically the whole compilation; you can see that the
generate phase is very short.
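To see where the time actually goes, the per-pass rows of the -ftime-report output are more useful than the phase rows. The following is a minimal Python sketch, not part of GCC: the regular expression and the helper name `top_passes` are assumptions written for the report layout shown in this thread (including the fused columns and a mail-wrapped `wall` field), and the sample rows are copied from the original report.

```python
import re

# One report row: "<name> : <usr> (p%) usr <sys> (p%) sys <wall> (p%) wall ...".
# \s* between fields tolerates fused columns ("sys5213.35") and a "wall"
# keyword that the mailer wrapped onto the next line.
ROW = re.compile(
    r"^(?P<name>[^:\n]+?)[ \t]*:\s*"
    r"(?P<usr>\d+\.\d+)\s*\(\s*\d+%\)\s*usr\s*"
    r"(?P<sys>\d+\.\d+)\s*\(\s*\d+%\)\s*sys\s*"
    r"(?P<wall>\d+\.\d+)\s*\(\s*\d+%\)\s*wall",
    re.MULTILINE,
)


def top_passes(report, n=3):
    """Return the n passes with the largest usr time as (name, usr, wall)."""
    rows = [
        (m["name"].strip(), float(m["usr"]), float(m["wall"]))
        for m in ROW.finditer(report)
    ]
    # "phase ..." rows aggregate whole phases, so they are not individual passes.
    passes = [r for r in rows if not r[0].startswith("phase ")]
    return sorted(passes, key=lambda r: r[1], reverse=True)[:n]


# Four rows copied from the report in the first message (mail wrapping kept).
SAMPLE = """\
phase parsing :5086.93 (100%) usr 9.45 (100%) sys5213.35 (100%)
wall 1072513 kB (100%) ggc
tree prefetching : 51.24 ( 1%) usr 0.07 ( 1%) sys 52.02 ( 1%) wall
56594 kB ( 5%) ggc
loop unswitching : 872.57 (17%) usr 0.01 ( 0%) sys 873.51 (17%) wall
111 kB ( 0%) ggc
loop unrolling :2802.10 (55%) usr 4.15 (44%) sys2910.38 (56%) wall
99541 kB ( 9%) ggc
"""

for name, usr, wall in top_passes(SAMPLE):
    print(f"{name:20s} usr {usr:8.2f} s  wall {wall:8.2f} s")
```

On these four sample rows the sketch skips the aggregate "phase parsing" row and ranks loop unrolling first, followed by loop unswitching and tree prefetching.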
* [Bug lto/49700] LTO compile time hog
From: rguenth at gcc dot gnu.org @ 2011-07-11 9:33 UTC
To: gcc-bugs
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=49700
--- Comment #3 from Richard Guenther <rguenth at gcc dot gnu.org> 2011-07-11 09:33:36 UTC ---
Oh, so it's a sum ...
Well, then I suppose you ran into the usual array-prefetching compile-time hog.
Try -fno-prefetch-loop-arrays.
* [Bug lto/49700] LTO compile time hog
From: pinskia at gcc dot gnu.org @ 2011-07-11 19:57 UTC
To: gcc-bugs
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=49700
--- Comment #4 from Andrew Pinski <pinskia at gcc dot gnu.org> 2011-07-11 19:56:55 UTC ---
> Extra diagnostic checks enabled; compiler may run slowly.
> Configure with --enable-checking=release to disable checks.
Also try building the compiler with that option passed to configure. Trunk
has always enabled extra checking by default.
> CFG verifier : 49.95 ( 1%) usr
> tree SSA verifier : 376.24 ( 7%) usr
> tree STMT verifier : 439.42 ( 9%) usr
> verify loop closed : 168.68 ( 3%) usr
As the lines above show, the verifiers alone account for at least 20% of the
total time.
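As a rough check of that estimate, the four verifier rows quoted above can be summed against the TOTAL usr time from the original report. This is a minimal Python sketch; the dictionary and variable names are made up for illustration, and only the numbers come from the report.

```python
# usr times in seconds, copied from the verifier rows quoted above.
verifier_usr = {
    "CFG verifier": 49.95,
    "tree SSA verifier": 376.24,
    "tree STMT verifier": 439.42,
    "verify loop closed": 168.68,
}
total_usr = 5086.93  # TOTAL usr time from the original report

verifier_total = sum(verifier_usr.values())
verifier_share = verifier_total / total_usr

print(f"verifiers: {verifier_total:.2f} s of {total_usr:.2f} s "
      f"({verifier_share:.1%} of usr time)")
```

The sum comes to about 1034 s, slightly more than 20% of the usr total, which matches the estimate in this comment.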
* [Bug lto/49700] LTO compile time hog
From: Joost.VandeVondele at pci dot uzh.ch @ 2011-07-12 10:18 UTC
To: gcc-bugs
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=49700
--- Comment #5 from Joost VandeVondele <Joost.VandeVondele at pci dot uzh.ch> 2011-07-12 10:17:54 UTC ---
(In reply to comment #3)
> Oh, so it's a sum ...
>
> Well, then I suppose you ran into the usual array-prefetching compile-time hog.
> Try -fno-prefetch-loop-arrays.
This seems to reduce the time by 5x. On a non-LTO run the flag makes no
difference, but I assume LTO makes more information available, which triggers
extra work in that pass.
* [Bug lto/49700] LTO compile time hog
From: Joost.VandeVondele at pci dot uzh.ch @ 2011-07-12 14:41 UTC
To: gcc-bugs
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=49700
--- Comment #6 from Joost VandeVondele <Joost.VandeVondele at pci dot uzh.ch> 2011-07-12 14:40:29 UTC ---
(In reply to comment #1)
> Joost, can you point us to a source tarball? Does the issue reproduce
> with simpler flags (plain -O2 or plain -O3? Esp. w/o -march=native?)
The CP2K sources are easy to get, but building a version that links requires
additional non-standard dependencies (blas, lapack, libint).
* [Bug lto/49700] LTO compile time hog
From: rguenth at gcc dot gnu.org @ 2012-05-07 12:41 UTC
To: gcc-bugs
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=49700
Richard Guenther <rguenth at gcc dot gnu.org> changed:
What |Removed |Added
----------------------------------------------------------------------------
Status|UNCONFIRMED |WAITING
Last reconfirmed| |2012-05-07
Ever Confirmed|0 |1
--- Comment #7 from Richard Guenther <rguenth at gcc dot gnu.org> 2012-05-07 12:38:43 UTC ---
Has the situation improved?
* [Bug lto/49700] LTO compile time hog
From: Joost.VandeVondele at mat dot ethz.ch @ 2012-05-07 19:18 UTC
To: gcc-bugs
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=49700
--- Comment #8 from Joost VandeVondele <Joost.VandeVondele at mat dot ethz.ch> 2012-05-07 19:04:29 UTC ---
(In reply to comment #7)
> Has the situation improved?
Current trunk LTO seems to fail on CP2K with:
/data/vjoost/clean/cp2k/cp2k/src/../src/rt_propagation_methods.F: In function
‘propagate_cn_or_em’:
/data/vjoost/clean/cp2k/cp2k/src/../src/rt_propagation_methods.F:217:0: error:
type mismatch in component reference
SUBROUTINE propagate_cn_or_em(qs_env, error)
^
struct array2_integer(kind=4)
struct array2_integer(kind=4)
# VUSE <.MEM_805>
D.79093_629 = D.79094_628->orders.data;
/data/vjoost/clean/cp2k/cp2k/src/../src/rt_propagation_methods.F:217:0: error:
type mismatch in component reference
SUBROUTINE propagate_cn_or_em(qs_env, error)
^
struct array2_integer(kind=4)
struct array2_integer(kind=4)
# VUSE <.MEM_805>
D.79092_630 = D.79094_628->orders.offset;
/data/vjoost/clean/cp2k/cp2k/src/../src/rt_propagation_methods.F:217:0: error:
type mismatch in component reference
SUBROUTINE propagate_cn_or_em(qs_env, error)
^
struct array2_integer(kind=4)
struct array2_integer(kind=4)
# VUSE <.MEM_805>
D.79090_632 = D.79094_628->orders.dim[1].stride;
/data/vjoost/clean/cp2k/cp2k/src/../src/rt_propagation_methods.F:217:0: error:
type mismatch in component reference
SUBROUTINE propagate_cn_or_em(qs_env, error)
^
struct array2_integer(kind=4)
struct array2_integer(kind=4)
# VUSE <.MEM_816>
D.79093_652 = D.79094_651->orders.data;
/data/vjoost/clean/cp2k/cp2k/src/../src/rt_propagation_methods.F:217:0: error:
type mismatch in component reference
SUBROUTINE propagate_cn_or_em(qs_env, error)
^
struct array2_integer(kind=4)
struct array2_integer(kind=4)
# VUSE <.MEM_816>
D.79092_653 = D.79094_651->orders.offset;
/data/vjoost/clean/cp2k/cp2k/src/../src/rt_propagation_methods.F:217:0: error:
type mismatch in component reference
SUBROUTINE propagate_cn_or_em(qs_env, error)
^
struct array2_integer(kind=4)
struct array2_integer(kind=4)
# VUSE <.MEM_816>
D.79090_655 = D.79094_651->orders.dim[1].stride;
/data/vjoost/clean/cp2k/cp2k/src/../src/rt_propagation_methods.F:217:0: error:
type mismatch in component reference
SUBROUTINE propagate_cn_or_em(qs_env, error)
^
struct array2_integer(kind=4)
struct array2_integer(kind=4)
# VUSE <.MEM_827>
D.79093_675 = D.79094_674->orders.data;
/data/vjoost/clean/cp2k/cp2k/src/../src/rt_propagation_methods.F:217:0: error:
type mismatch in component reference
SUBROUTINE propagate_cn_or_em(qs_env, error)
^
struct array2_integer(kind=4)
struct array2_integer(kind=4)
# VUSE <.MEM_827>
D.79092_676 = D.79094_674->orders.offset;
/data/vjoost/clean/cp2k/cp2k/src/../src/rt_propagation_methods.F:217:0: error:
type mismatch in component reference
SUBROUTINE propagate_cn_or_em(qs_env, error)
^
struct array2_integer(kind=4)
struct array2_integer(kind=4)
# VUSE <.MEM_827>
D.79090_678 = D.79094_674->orders.dim[1].stride;
/data/vjoost/clean/cp2k/cp2k/src/../src/rt_propagation_methods.F:217:0: error:
type mismatch in component reference
SUBROUTINE propagate_cn_or_em(qs_env, error)
^
struct array2_integer(kind=4)
struct array2_integer(kind=4)
# VUSE <.MEM_838>
D.79093_700 = D.79094_699->orders.data;
/data/vjoost/clean/cp2k/cp2k/src/../src/rt_propagation_methods.F:217:0: error:
type mismatch in component reference
SUBROUTINE propagate_cn_or_em(qs_env, error)
^
struct array2_integer(kind=4)
struct array2_integer(kind=4)
# VUSE <.MEM_838>
D.79092_701 = D.79094_699->orders.offset;
/data/vjoost/clean/cp2k/cp2k/src/../src/rt_propagation_methods.F:217:0: error:
type mismatch in component reference
SUBROUTINE propagate_cn_or_em(qs_env, error)
^
struct array2_integer(kind=4)
struct array2_integer(kind=4)
# VUSE <.MEM_838>
D.79090_703 = D.79094_699->orders.dim[1].stride;
/data/vjoost/clean/cp2k/cp2k/src/../src/rt_propagation_methods.F:217:0:
internal compiler error: verify_gimple failed
SUBROUTINE propagate_cn_or_em(qs_env, error)
^
Please submit a full bug report,
with preprocessed source if appropriate.
See <http://gcc.gnu.org/bugs.html> for instructions.
lto-wrapper: gfortran returned 1 exit status
/data/vjoost/gnu/binutils-2.22/install/bin/ld: lto-wrapper failed
collect2: error: ld returned 1 exit status
* [Bug lto/49700] LTO compile time hog
From: Joost.VandeVondele at mat dot ethz.ch @ 2012-05-08 18:59 UTC
To: gcc-bugs
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=49700
Joost VandeVondele <Joost.VandeVondele at mat dot ethz.ch> changed:
What |Removed |Added
----------------------------------------------------------------------------
Status|WAITING |RESOLVED
Resolution| |FIXED
--- Comment #9 from Joost VandeVondele <Joost.VandeVondele at mat dot ethz.ch> 2012-05-08 18:52:12 UTC ---
Trying 4.7.X instead, it actually looks very reasonable now.
Using -flto=jobserver -fuse-linker-plugin -ftime-report -O3 -march=native
-ffast-math -g -ffree-form
I get CP2K to build in 4 min on a 32-core server. The time report also looks
OK. I'll close this PR as fixed (the issue with 4.8 is tracked in PR 45586).