public inbox for gcc-bugs@sourceware.org
* [Bug lto/49700] New: LTO compile time hog
From: Joost.VandeVondele at pci dot uzh.ch @ 2011-07-11 6:29 UTC
To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=49700

           Summary: LTO compile time hog
           Product: gcc
           Version: 4.7.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: lto
        AssignedTo: unassigned@gcc.gnu.org
        ReportedBy: Joost.VandeVondele@pci.uzh.ch

Using LTO a CP2K compile can take several hours and a lot of memory. I have been able to extract the following time report, but what is the best way to make a testcase? -save-temps yielded the cp2k.sopt.ltrans29.ltrans.o file, but is anything more needed?

gfortran @/dev/shm/vondele/ccGyXn6S.args
Using built-in specs.
COLLECT_GCC=gfortran
COLLECT_LTO_WRAPPER=/data03/vondele/gnu/gcc_trunk/install/libexec/gcc/x86_64-unknown-linux-gnu/4.7.0/lto-wrapper
Target: x86_64-unknown-linux-gnu
Configured with: ../gcc/configure --prefix=/data03/vondele/gnu/gcc_trunk/install --enable-languages=c,c++,fortran --disable-multilib --enable-plugins --enable-cloog-backend=isl --with-ppl=/data03/vondele/gnu/ppl-0.11/install --with-cloog=/data03/vondele/gnu/cloog-0.16.1/install/ --with-libelf=/data03/vondele/gnu/libelf-0.8.13/install --with-plugin-ld=ld.gold
Thread model: posix
gcc version 4.7.0 20110709 (experimental) [trunk revision 176072] (GCC)
COLLECT_GCC_OPTIONS='-c' '-v' '-save-temps' '-ftime-report' '-u' 'se-linker-plugin' '-O3' '-march=native' '-funroll-loops' '-ffast-math' '-ffree-form' '-D' '__GFORTRAN' '-D' '__FFTSG' '-D' '__FFTW3' '-D' '__LIBINT' '-I' '/ext/software/64/gfortran-suite/include' '-D' '__COMPILE_ARCH="gfortran-test13"' '-D' '__COMPILE_DATE="Sun Jul 10 14:22:33 CEST 2011"' '-D' '__COMPILE_HOST="pcihopt3"' '-D' '__COMPILE_LASTCVS="/qs_scf.F/1.527/Sat Jul 9 07:18:08 2011//"' '-L/users/vondele/LAPACK/'
'-L/ext/software/64/gfortran-suite/lib' '-shared-libgcc' '-dumpdir' '/data03/vondele/cp2k_gcc/cp2k/makefiles/../exe/gfortran-test13/' '-dumpbase' 'cp2k.sopt.ltrans29' '-fltrans' '-o' 'cp2k.sopt.ltrans29.ltrans.o'
 /data03/vondele/gnu/gcc_trunk/install/libexec/gcc/x86_64-unknown-linux-gnu/4.7.0/lto1 -march=amdfam10 -mcx16 -msahf -mno-movbe -mno-aes -mno-pclmul -mpopcnt -mabm -mno-lwp -mno-fma -mno-fma4 -mno-xop -mno-bmi -mno-tbm -mno-avx -mno-sse4.2 -mno-sse4.1 --param l1-cache-size=64 --param l1-cache-line-size=64 --param l2-cache-size=512 -mtune=amdfam10 -quiet -dumpdir /data03/vondele/cp2k_gcc/cp2k/makefiles/../exe/gfortran-test13/ -dumpbase cp2k.sopt.ltrans29 -auxbase-strip cp2k.sopt.ltrans29.ltrans.o -O3 -version -ftime-report -funroll-loops -ffast-math -ffree-form -fltrans @/dev/shm/vondele/ccuRF1W6 -o cp2k.sopt.ltrans29.s
GNU GIMPLE (GCC) version 4.7.0 20110709 (experimental) [trunk revision 176072] (x86_64-unknown-linux-gnu)
    compiled by GNU C version 4.7.0 20110709 (experimental) [trunk revision 176072], GMP version 4.3.2, MPFR version 2.4.2, MPC version 0.8.2
GGC heuristics: --param ggc-min-expand=30 --param ggc-min-heapsize=4096
GNU GIMPLE (GCC) version 4.7.0 20110709 (experimental) [trunk revision 176072] (x86_64-unknown-linux-gnu)
    compiled by GNU C version 4.7.0 20110709 (experimental) [trunk revision 176072], GMP version 4.3.2, MPFR version 2.4.2, MPC version 0.8.2
GGC heuristics: --param ggc-min-expand=30 --param ggc-min-heapsize=4096

Execution times (seconds)
 phase setup : 0.00 ( 0%) usr 0.00 ( 0%) sys 0.02 ( 0%) wall 1250 kB ( 0%) ggc
 phase parsing :5086.93 (100%) usr 9.45 (100%) sys5213.35 (100%) wall 1072513 kB (100%) ggc
 phase generate : 0.00 ( 0%) usr 0.00 ( 0%) sys 0.01 ( 0%) wall 0 kB ( 0%) ggc
 garbage collection : 4.16 ( 0%) usr 0.02 ( 0%) sys 4.22 ( 0%) wall 0 kB ( 0%) ggc
 ipa lto gimple in : 0.13 ( 0%) usr 0.01 ( 0%) sys 0.16 ( 0%) wall 26100 kB ( 2%) ggc
 ipa lto decl in : 0.86 ( 0%) usr 0.01 ( 0%) sys 0.89 ( 0%) wall 7027 kB ( 1%) ggc
 ipa pure const : 0.03 ( 0%) usr 0.00 ( 0%) sys 0.03 ( 0%) wall 11 kB ( 0%) ggc
 cfg construction : 0.62 ( 0%) usr 0.00 ( 0%) sys 0.63 ( 0%) wall 3820 kB ( 0%) ggc
 cfg cleanup : 2.89 ( 0%) usr 0.00 ( 0%) sys 2.91 ( 0%) wall 4416 kB ( 0%) ggc
 CFG verifier : 49.95 ( 1%) usr 0.00 ( 0%) sys 50.28 ( 1%) wall 0 kB ( 0%) ggc
 trivially dead code : 0.88 ( 0%) usr 0.00 ( 0%) sys 0.90 ( 0%) wall 0 kB ( 0%) ggc
 df scan insns : 0.20 ( 0%) usr 0.00 ( 0%) sys 0.23 ( 0%) wall 20 kB ( 0%) ggc
 df multiple defs : 0.34 ( 0%) usr 0.00 ( 0%) sys 0.34 ( 0%) wall 0 kB ( 0%) ggc
 df reaching defs : 60.58 ( 1%) usr 1.10 (12%) sys 62.35 ( 1%) wall 0 kB ( 0%) ggc
 df live regs : 13.28 ( 0%) usr 0.04 ( 0%) sys 13.56 ( 0%) wall 0 kB ( 0%) ggc
 df live&initialized regs: 5.43 ( 0%) usr 0.00 ( 0%) sys 5.59 ( 0%) wall 0 kB ( 0%) ggc
 df use-def / def-use chains: 1.16 ( 0%) usr 0.01 ( 0%) sys 1.25 ( 0%) wall 0 kB ( 0%) ggc
 df live reg subwords : 0.47 ( 0%) usr 0.00 ( 0%) sys 0.46 ( 0%) wall 0 kB ( 0%) ggc
 df reg dead/unused notes: 4.22 ( 0%) usr 0.00 ( 0%) sys 4.20 ( 0%) wall 8542 kB ( 1%) ggc
 register information : 1.72 ( 0%) usr 0.00 ( 0%) sys 1.77 ( 0%) wall 0 kB ( 0%) ggc
 alias analysis : 2.40 ( 0%) usr 0.00 ( 0%) sys 2.40 ( 0%) wall 29964 kB ( 3%) ggc
 alias stmt walking : 1.06 ( 0%) usr 0.03 ( 0%) sys 1.11 ( 0%) wall 6438 kB ( 1%) ggc
 register scan : 0.17 ( 0%) usr 0.00 ( 0%) sys 0.17 ( 0%) wall 35 kB ( 0%) ggc
 rebuild jump labels : 0.85 ( 0%) usr 0.00 ( 0%) sys 0.87 ( 0%) wall 7 kB ( 0%) ggc
 inline heuristics : 0.01 ( 0%) usr 0.00 ( 0%) sys 0.00 ( 0%) wall 0 kB ( 0%) ggc
 integration : 0.10 ( 0%) usr 0.00 ( 0%) sys 0.10 ( 0%) wall 14586 kB ( 1%) ggc
 tree CFG cleanup : 3.58 ( 0%) usr 0.01 ( 0%) sys 3.52 ( 0%) wall 2188 kB ( 0%) ggc
 tree VRP : 1.49 ( 0%) usr 0.01 ( 0%) sys 1.56 ( 0%) wall 17581 kB ( 2%) ggc
 tree copy propagation : 0.27 ( 0%) usr 0.01 ( 0%) sys 0.29 ( 0%) wall 858 kB ( 0%) ggc
 tree PTA : 0.57 ( 0%) usr 0.00 ( 0%) sys 0.63 ( 0%) wall 670 kB ( 0%) ggc
 tree SSA rewrite : 8.75 ( 0%) usr 0.01 ( 0%) sys 8.72 ( 0%) wall 1215 kB ( 0%) ggc
 tree SSA incremental : 26.73 ( 1%) usr 0.00 ( 0%) sys 27.36 ( 1%) wall 3046 kB ( 0%) ggc
 tree operand scan : 0.78 ( 0%) usr 0.26 ( 3%) sys 0.83 ( 0%) wall 39556 kB ( 4%) ggc
 dominator optimization : 0.29 ( 0%) usr 0.00 ( 0%) sys 0.38 ( 0%) wall 7238 kB ( 1%) ggc
 tree CCP : 0.34 ( 0%) usr 0.00 ( 0%) sys 0.42 ( 0%) wall 1098 kB ( 0%) ggc
 tree PHI const/copy prop: 0.01 ( 0%) usr 0.00 ( 0%) sys 0.01 ( 0%) wall 126 kB ( 0%) ggc
 tree split crit edges : 0.03 ( 0%) usr 0.00 ( 0%) sys 0.01 ( 0%) wall 1301 kB ( 0%) ggc
 tree reassociation : 0.19 ( 0%) usr 0.01 ( 0%) sys 0.26 ( 0%) wall 5052 kB ( 0%) ggc
 tree PRE : 1.17 ( 0%) usr 0.01 ( 0%) sys 1.14 ( 0%) wall 13070 kB ( 1%) ggc
 tree FRE : 0.54 ( 0%) usr 0.01 ( 0%) sys 0.57 ( 0%) wall 6904 kB ( 1%) ggc
 tree code sinking : 0.04 ( 0%) usr 0.00 ( 0%) sys 0.03 ( 0%) wall 427 kB ( 0%) ggc
 tree linearize phis : 0.01 ( 0%) usr 0.00 ( 0%) sys 0.02 ( 0%) wall 30 kB ( 0%) ggc
 tree forward propagate : 0.05 ( 0%) usr 0.00 ( 0%) sys 0.10 ( 0%) wall 1803 kB ( 0%) ggc
 tree phiprop : 0.01 ( 0%) usr 0.00 ( 0%) sys 0.00 ( 0%) wall 0 kB ( 0%) ggc
 tree conservative DCE : 0.14 ( 0%) usr 0.01 ( 0%) sys 0.15 ( 0%) wall 155 kB ( 0%) ggc
 tree aggressive DCE : 0.36 ( 0%) usr 0.00 ( 0%) sys 0.43 ( 0%) wall 11259 kB ( 1%) ggc
 tree DSE : 0.19 ( 0%) usr 0.00 ( 0%) sys 0.18 ( 0%) wall 0 kB ( 0%) ggc
 tree loop bounds : 0.31 ( 0%) usr 0.00 ( 0%) sys 0.25 ( 0%) wall 11761 kB ( 1%) ggc
 tree loop invariant motion: 0.22 ( 0%) usr 0.02 ( 0%) sys 0.23 ( 0%) wall 4266 kB ( 0%) ggc
 tree canonical iv : 0.03 ( 0%) usr 0.00 ( 0%) sys 0.02 ( 0%) wall 875 kB ( 0%) ggc
 scev constant prop : 0.04 ( 0%) usr 0.00 ( 0%) sys 0.04 ( 0%) wall 2182 kB ( 0%) ggc
 complete unrolling : 0.70 ( 0%) usr 0.00 ( 0%) sys 0.77 ( 0%) wall 37727 kB ( 4%) ggc
 tree vectorization : 0.79 ( 0%) usr 0.04 ( 0%) sys 0.84 ( 0%) wall 36624 kB ( 3%) ggc
 tree slp vectorization : 0.45 ( 0%) usr 0.00 ( 0%) sys 0.44 ( 0%) wall 11293 kB ( 1%) ggc
 tree loop distribution : 3.31 ( 0%) usr 0.00 ( 0%) sys 3.32 ( 0%) wall 857 kB ( 0%) ggc
 tree prefetching : 51.24 ( 1%) usr 0.07 ( 1%) sys 52.02 ( 1%) wall 56594 kB ( 5%) ggc
 tree iv optimization : 7.47 ( 0%) usr 0.09 ( 1%) sys 7.64 ( 0%) wall 200051 kB (19%) ggc
 predictive commoning : 0.25 ( 0%) usr 0.00 ( 0%) sys 0.20 ( 0%) wall 6379 kB ( 1%) ggc
 tree loop init : 0.10 ( 0%) usr 0.00 ( 0%) sys 0.08 ( 0%) wall 856 kB ( 0%) ggc
 tree copy headers : 0.02 ( 0%) usr 0.00 ( 0%) sys 0.01 ( 0%) wall 1560 kB ( 0%) ggc
 tree SSA uncprop : 0.02 ( 0%) usr 0.00 ( 0%) sys 0.03 ( 0%) wall 0 kB ( 0%) ggc
 tree rename SSA copies : 0.07 ( 0%) usr 0.00 ( 0%) sys 0.05 ( 0%) wall 0 kB ( 0%) ggc
 tree SSA verifier : 376.24 ( 7%) usr 0.04 ( 0%) sys 379.72 ( 7%) wall 0 kB ( 0%) ggc
 tree STMT verifier : 439.42 ( 9%) usr 0.16 ( 2%) sys 443.29 ( 9%) wall 0 kB ( 0%) ggc
 callgraph verifier : 0.03 ( 0%) usr 0.00 ( 0%) sys 0.06 ( 0%) wall 63 kB ( 0%) ggc
 dominance frontiers : 16.93 ( 0%) usr 0.03 ( 0%) sys 16.81 ( 0%) wall 0 kB ( 0%) ggc
 dominance computation : 10.78 ( 0%) usr 0.00 ( 0%) sys 10.50 ( 0%) wall 0 kB ( 0%) ggc
 control dependences : 0.01 ( 0%) usr 0.00 ( 0%) sys 0.01 ( 0%) wall 0 kB ( 0%) ggc
 out of ssa : 0.63 ( 0%) usr 0.00 ( 0%) sys 0.68 ( 0%) wall 207 kB ( 0%) ggc
 expand vars : 0.08 ( 0%) usr 0.00 ( 0%) sys 0.08 ( 0%) wall 12686 kB ( 1%) ggc
 expand : 0.83 ( 0%) usr 0.00 ( 0%) sys 0.76 ( 0%) wall 50753 kB ( 5%) ggc
 post expand cleanups : 0.05 ( 0%) usr 0.00 ( 0%) sys 0.06 ( 0%) wall 3897 kB ( 0%) ggc
 lower subreg : 0.28 ( 0%) usr 0.00 ( 0%) sys 0.29 ( 0%) wall 0 kB ( 0%) ggc
 forward prop : 1.07 ( 0%) usr 0.00 ( 0%) sys 1.10 ( 0%) wall 11010 kB ( 1%) ggc
 CSE : 0.71 ( 0%) usr 0.00 ( 0%) sys 0.77 ( 0%) wall 446 kB ( 0%) ggc
 dead code elimination : 1.73 ( 0%) usr 0.00 ( 0%) sys 1.74 ( 0%) wall 0 kB ( 0%) ggc
 dead store elim1 : 1.33 ( 0%) usr 0.00 ( 0%) sys 1.35 ( 0%) wall 15295 kB ( 1%) ggc
 dead store elim2 : 1.50 ( 0%) usr 0.00 ( 0%) sys 1.54 ( 0%) wall 15743 kB ( 1%) ggc
 loop analysis : 0.11 ( 0%) usr 0.00 ( 0%) sys 0.11 ( 0%) wall 1864 kB ( 0%) ggc
 loop invariant motion : 24.13 ( 0%) usr 0.02 ( 0%) sys 23.97 ( 0%) wall 1275 kB ( 0%) ggc
 loop unswitching : 872.57 (17%) usr 0.01 ( 0%) sys 873.51 (17%) wall 111 kB ( 0%) ggc
 loop unrolling :2802.10 (55%) usr 4.15 (44%) sys2910.38 (56%) wall 99541 kB ( 9%) ggc
 CPROP : 0.40 ( 0%) usr 0.00 ( 0%) sys 0.26 ( 0%) wall 3209 kB ( 0%) ggc
 PRE : 0.22 ( 0%) usr 0.00 ( 0%) sys 0.20 ( 0%) wall 113 kB ( 0%) ggc
 code hoisting : 0.09 ( 0%) usr 0.00 ( 0%) sys 0.08 ( 0%) wall 4 kB ( 0%) ggc
 web : 1.60 ( 0%) usr 0.02 ( 0%) sys 1.63 ( 0%) wall 6999 kB ( 1%) ggc
 CSE 2 : 1.49 ( 0%) usr 0.00 ( 0%) sys 1.53 ( 0%) wall 1326 kB ( 0%) ggc
 combiner : 23.55 ( 0%) usr 0.04 ( 0%) sys 23.84 ( 0%) wall 53172 kB ( 5%) ggc
 if-conversion : 13.53 ( 0%) usr 0.01 ( 0%) sys 13.69 ( 0%) wall 1910 kB ( 0%) ggc
 regmove : 0.92 ( 0%) usr 0.00 ( 0%) sys 0.94 ( 0%) wall 9 kB ( 0%) ggc
 integrated RA : 19.24 ( 0%) usr 3.00 (32%) sys 22.40 ( 0%) wall 151112 kB (14%) ggc
 reload : 16.17 ( 0%) usr 0.00 ( 0%) sys 16.37 ( 0%) wall 12120 kB ( 1%) ggc
 reload CSE regs : 3.43 ( 0%) usr 0.00 ( 0%) sys 3.52 ( 0%) wall 25345 kB ( 2%) ggc
 load CSE after reload : 1.71 ( 0%) usr 0.00 ( 0%) sys 1.74 ( 0%) wall 919 kB ( 0%) ggc
 zee : 0.47 ( 0%) usr 0.00 ( 0%) sys 0.48 ( 0%) wall 42 kB ( 0%) ggc
 thread pro- & epilogue : 0.10 ( 0%) usr 0.00 ( 0%) sys 0.11 ( 0%) wall 618 kB ( 0%) ggc
 if-conversion 2 : 0.20 ( 0%) usr 0.01 ( 0%) sys 0.21 ( 0%) wall 776 kB ( 0%) ggc
 combine stack adjustments: 0.10 ( 0%) usr 0.00 ( 0%) sys 0.11 ( 0%) wall 0 kB ( 0%) ggc
 peephole 2 : 0.32 ( 0%) usr 0.00 ( 0%) sys 0.33 ( 0%) wall 2368 kB ( 0%) ggc
 rename registers : 1.13 ( 0%) usr 0.00 ( 0%) sys 1.16 ( 0%) wall 3536 kB ( 0%) ggc
 hard reg cprop : 0.82 ( 0%) usr 0.00 ( 0%) sys 0.78 ( 0%) wall 15 kB ( 0%) ggc
 scheduling 2 : 4.34 ( 0%) usr 0.00 ( 0%) sys 4.35 ( 0%) wall 351 kB ( 0%) ggc
 machine dep reorg : 0.76 ( 0%) usr 0.00 ( 0%) sys 0.73 ( 0%) wall 15 kB ( 0%) ggc
 reorder blocks : 0.93 ( 0%) usr 0.00 ( 0%) sys 0.92 ( 0%) wall 2694 kB ( 0%) ggc
 final : 1.68 ( 0%) usr 0.09 ( 1%) sys 1.76 ( 0%) wall 423 kB ( 0%) ggc
 rest of compilation : 1.73 ( 0%) usr 0.02 ( 0%) sys 1.73 ( 0%) wall 5241 kB ( 0%) ggc
 remove unused locals : 0.96 ( 0%) usr 0.00 ( 0%) sys 0.87 ( 0%) wall 0 kB ( 0%) ggc
 address taken : 0.05 ( 0%) usr 0.00 ( 0%) sys 0.06 ( 0%) wall 0 kB ( 0%) ggc
 unaccounted todo : 1.15 ( 0%) usr 0.02 ( 0%) sys 1.16 ( 0%) wall 9 kB ( 0%) ggc
 verify loop closed : 168.68 ( 3%) usr 0.05 ( 1%) sys 169.90 ( 3%) wall 0 kB ( 0%) ggc
 verify RTL sharing : 8.07 ( 0%) usr 0.00 ( 0%) sys 8.30 ( 0%) wall 0 kB ( 0%) ggc
 rebuild frequencies : 0.01 ( 0%) usr 0.00 ( 0%) sys 0.01 ( 0%) wall 249 kB ( 0%) ggc
 repair loop structures : 0.48 ( 0%) usr 0.00 ( 0%) sys 0.48 ( 0%) wall 961 kB ( 0%) ggc
 TOTAL :5086.93 9.45 5213.38 1073764 kB
Extra diagnostic checks enabled; compiler may run slowly.
Configure with --enable-checking=release to disable checks.
COLLECT_GCC_OPTIONS='-c' '-v' '-save-temps' '-ftime-report' '-u' 'se-linker-plugin' '-O3' '-march=native' '-funroll-loops' '-ffast-math' '-ffree-form' '-D' '__GFORTRAN' '-D' '__FFTSG' '-D' '__FFTW3' '-D' '__LIBINT' '-I' '/ext/software/64/gfortran-suite/include' '-D' '__COMPILE_ARCH="gfortran-test13"' '-D' '__COMPILE_DATE="Sun Jul 10 14:22:33 CEST 2011"' '-D' '__COMPILE_HOST="pcihopt3"' '-D' '__COMPILE_LASTCVS="/qs_scf.F/1.527/Sat Jul 9 07:18:08 2011//"' '-L/users/vondele/LAPACK/' '-L/ext/software/64/gfortran-suite/lib' '-shared-libgcc' '-dumpdir' '/data03/vondele/cp2k_gcc/cp2k/makefiles/../exe/gfortran-test13/' '-dumpbase' 'cp2k.sopt.ltrans29' '-fltrans' '-o' 'cp2k.sopt.ltrans29.ltrans.o'
 as --64 -o cp2k.sopt.ltrans29.ltrans.o cp2k.sopt.ltrans29.s
COMPILER_PATH=/data03/vondele/gnu/gcc_trunk/install/libexec/gcc/x86_64-unknown-linux-gnu/4.7.0/:/data03/vondele/gnu/gcc_trunk/install/libexec/gcc/x86_64-unknown-linux-gnu/4.7.0/:/data03/vondele/gnu/gcc_trunk/install/libexec/gcc/x86_64-unknown-linux-gnu/:/data03/vondele/gnu/gcc_trunk/install/lib/gcc/x86_64-unknown-linux-gnu/4.7.0/:/data03/vondele/gnu/gcc_trunk/install/lib/gcc/x86_64-unknown-linux-gnu/:/data03/vondele/gnu/gcc_trunk/install/libexec/gcc/x86_64-unknown-linux-gnu/4.7.0/:/data03/vondele/gnu/gcc_trunk/install/libexec/gcc/x86_64-unknown-linux-gnu/4.7.0/:/data03/vondele/gnu/gcc_trunk/install/libexec/gcc/x86_64-unknown-linux-gnu/:/data03/vondele/gnu/gcc_trunk/install/lib/gcc/x86_64-unknown-linux-gnu/4.7.0/:/data03/vondele/gnu/gcc_trunk/install/lib/gcc/x86_64-unknown-linux-gnu/
LIBRARY_PATH=/data03/vondele/gnu/gcc_trunk/install/lib/gcc/x86_64-unknown-linux-gnu/4.7.0/../../../../lib64/../lib64/:/lib/../lib64/../lib64/:/usr/lib/../lib64/../lib64/:/data03/vondele/gnu/gcc_trunk/install/lib/gcc/x86_64-unknown-linux-gnu/4.7.0/../../../../lib64/:/lib/../lib64/:/usr/lib/../lib64/:/data03/vondele/gnu/gcc_trunk/install/lib/gcc/x86_64-unknown-linux-gnu/4.7.0/:/data03/vondele/gnu/gcc_trunk/install/lib/gcc/x86_64-unknown-linux-gnu/4.7.0/../../../../lib64/:/lib/../lib64/:/usr/lib/../lib64/:/data03/vondele/gnu/gcc_trunk/install/lib/gcc/x86_64-unknown-linux-gnu/4.7.0/:/data03/vondele/gnu/gcc_trunk/install/lib/gcc/x86_64-unknown-linux-gnu/4.7.0/../../../../lib64/:/lib/../lib64/:/usr/lib/../lib64/:/data03/vondele/gnu/gcc_trunk/install/lib/gcc/x86_64-unknown-linux-gnu/4.7.0/../../../:/lib/:/usr/lib/:/data03/vondele/gnu/gcc_trunk/install/lib/gcc/x86_64-unknown-linux-gnu/4.7.0/../../../:/lib/:/usr/lib/
COLLECT_GCC_OPTIONS='-c' '-v' '-save-temps' '-ftime-report' '-u' 'se-linker-plugin' '-O3' '-march=native' '-funroll-loops' '-ffast-math' '-ffree-form' '-D' '__GFORTRAN' '-D' '__FFTSG' '-D' '__FFTW3' '-D' '__LIBINT' '-I' '/ext/software/64/gfortran-suite/include' '-D' '__COMPILE_ARCH="gfortran-test13"' '-D' '__COMPILE_DATE="Sun Jul 10 14:22:33 CEST 2011"' '-D' '__COMPILE_HOST="pcihopt3"' '-D' '__COMPILE_LASTCVS="/qs_scf.F/1.527/Sat Jul 9 07:18:08 2011//"' '-L/users/vondele/LAPACK/' '-L/ext/software/64/gfortran-suite/lib' '-shared-libgcc' '-dumpdir' '/data03/vondele/cp2k_gcc/cp2k/makefiles/../exe/gfortran-test13/' '-dumpbase' 'cp2k.sopt.ltrans29' '-fltrans' '-o' 'cp2k.sopt.ltrans29.ltrans.o'
[Leaving LTRANS /dev/shm/vondele/ccGyXn6S.args]
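A minimal sketch, assuming the -ftime-report text has been captured as a string, of how the rows in the report above could be ranked by user time. The function names (parse_time_report, top_passes) and the regular expression are illustrative assumptions, not part of GCC; the pattern is written only against the row layout seen in this report, including the overflowed columns such as "sys5213.35". The sample rows are copied from the report.

```python
import re

# One -ftime-report row as it appears in this report, e.g.
#  loop unrolling :2802.10 (55%) usr 4.15 (44%) sys2910.38 (56%) wall 99541 kB ( 9%) ggc
# Hypothetical pattern: it tolerates missing spaces after ':' and 'sys'.
ROW = re.compile(
    r"^\s*(?P<name>.+?)\s*:\s*(?P<usr>\d+\.\d+)\s*\(\s*\d+%\)\s*usr"
    r"\s*(?P<sys>\d+\.\d+)\s*\(\s*\d+%\)\s*sys"
    r"\s*(?P<wall>\d+\.\d+)\s*\(\s*\d+%\)\s*wall"
)


def parse_time_report(text):
    """Return (name, usr, sys, wall) tuples for every row that matches ROW."""
    rows = []
    for line in text.splitlines():
        m = ROW.match(line)
        if m:
            rows.append(
                (m["name"], float(m["usr"]), float(m["sys"]), float(m["wall"]))
            )
    return rows


def top_passes(rows, n=3):
    """Rank rows by user time, skipping the 'phase ...' rows, which are sums."""
    passes = [r for r in rows if not r[0].startswith("phase ")]
    return sorted(passes, key=lambda r: r[1], reverse=True)[:n]


SAMPLE = """\
 phase parsing :5086.93 (100%) usr 9.45 (100%) sys5213.35 (100%) wall 1072513 kB (100%) ggc
 tree prefetching : 51.24 ( 1%) usr 0.07 ( 1%) sys 52.02 ( 1%) wall 56594 kB ( 5%) ggc
 tree SSA verifier : 376.24 ( 7%) usr 0.04 ( 0%) sys 379.72 ( 7%) wall 0 kB ( 0%) ggc
 tree STMT verifier : 439.42 ( 9%) usr 0.16 ( 2%) sys 443.29 ( 9%) wall 0 kB ( 0%) ggc
 loop unswitching : 872.57 (17%) usr 0.01 ( 0%) sys 873.51 (17%) wall 111 kB ( 0%) ggc
 loop unrolling :2802.10 (55%) usr 4.15 (44%) sys2910.38 (56%) wall 99541 kB ( 9%) ggc
"""

for name, usr, _sys, wall in top_passes(parse_time_report(SAMPLE)):
    print(f"{name:20s} {usr:9.2f} usr {wall:9.2f} wall")
```

On these sample rows the ranking is loop unrolling, loop unswitching, tree STMT verifier. The TOTAL row has no percentage columns, so this pattern deliberately does not match it.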
* [Bug lto/49700] LTO compile time hog
From: rguenth at gcc dot gnu.org @ 2011-07-11 9:21 UTC
To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=49700

Richard Guenther <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |hubicka at gcc dot gnu.org,
                   |                            |rguenth at gcc dot gnu.org

--- Comment #1 from Richard Guenther <rguenth at gcc dot gnu.org> 2011-07-11 09:20:37 UTC ---
The

 phase parsing :5086.93 (100%) usr 9.45 (100%) sys5213.35 (100%) wall 1072513 kB (100%) ggc

line looks completely odd ;) It would be interesting to know what broke the LTO I/O type/merging accounting. Honza?

The above is from LTRANS phase, so maybe there's some timevar pushing missing? It looks like materialize_cgraph is suspicious.

No, the ltrans object file is of no use. Joost, can you point us to a source tarball? Does the issue reproduce with simpler flags (plain -O2 or plain -O3? Esp. w/o -march=native?)
* [Bug lto/49700] LTO compile time hog
From: jakub at gcc dot gnu.org @ 2011-07-11 9:29 UTC
To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=49700

Jakub Jelinek <jakub at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |jakub at gcc dot gnu.org

--- Comment #2 from Jakub Jelinek <jakub at gcc dot gnu.org> 2011-07-11 09:27:50 UTC ---
The phase parsing numbers are uninteresting, as lto1 FE does almost all the work in its lto_main (aka parse_file langhook), including cgraph_optimize. So that time includes basically the whole compilation, you can see that the generate phase is very short.
* [Bug lto/49700] LTO compile time hog
From: rguenth at gcc dot gnu.org @ 2011-07-11 9:33 UTC
To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=49700

--- Comment #3 from Richard Guenther <rguenth at gcc dot gnu.org> 2011-07-11 09:33:36 UTC ---
Oh, so it's a sum ...

Well, then I suppose you run into the usual array-prefetching compile-time hog. Try -fno-prefetch-loop-arrays.
* [Bug lto/49700] LTO compile time hog
From: pinskia at gcc dot gnu.org @ 2011-07-11 19:57 UTC
To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=49700

--- Comment #4 from Andrew Pinski <pinskia at gcc dot gnu.org> 2011-07-11 19:56:55 UTC ---
> Extra diagnostic checks enabled; compiler may run slowly.
> Configure with --enable-checking=release to disable checks.

Also try to build the compiler with that option passed to configure. The default for the trunk has always been to enable extra checking.

> CFG verifier : 49.95 ( 1%) usr
> tree SSA verifier : 376.24 ( 7%) usr
> tree STMT verifier : 439.42 ( 9%) usr
> verify loop closed : 168.68 ( 3%) usr

As you can see from the above, some verifying takes time (at least 20% of the total time).
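The "at least 20%" figure in comment #4 can be checked with a few lines of arithmetic. This is only a sketch: the four usr values are the ones quoted in that comment, the total is the TOTAL usr time from the original report, and the variable names are made up for illustration.

```python
# usr seconds for the four verifier rows quoted in comment #4
verifier_usr = {
    "CFG verifier": 49.95,
    "tree SSA verifier": 376.24,
    "tree STMT verifier": 439.42,
    "verify loop closed": 168.68,
}

# TOTAL usr seconds from the time report in the original bug description
total_usr = 5086.93

verifier_total = sum(verifier_usr.values())
share = verifier_total / total_usr

# Prints: 1034.29 s of 5086.93 s = 20.3%
print(f"{verifier_total:.2f} s of {total_usr:.2f} s = {share:.1%}")
```

The report also lists "verify RTL sharing" (8.07 s) and "callgraph verifier" (0.03 s), which are not part of the quoted list; adding them would raise the share only slightly.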
* [Bug lto/49700] LTO compile time hog
From: Joost.VandeVondele at pci dot uzh.ch @ 2011-07-12 10:18 UTC
To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=49700

--- Comment #5 from Joost VandeVondele <Joost.VandeVondele at pci dot uzh.ch> 2011-07-12 10:17:54 UTC ---
(In reply to comment #3)
> Oh, so it's a sum ...
>
> Well, then I suppose you run into the usual array-prefetching compile-time hog.
> Try -fno-prefetch-loop-arrays.

This seems to reduce the time by 5x. On a non-LTO run, this doesn't matter, but I assume LTO makes more info available that triggers some action by that pass.
* [Bug lto/49700] LTO compile time hog
From: Joost.VandeVondele at pci dot uzh.ch @ 2011-07-12 14:41 UTC
To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=49700

--- Comment #6 from Joost VandeVondele <Joost.VandeVondele at pci dot uzh.ch> 2011-07-12 14:40:29 UTC ---
(In reply to comment #1)
> Joost, can you point us to a source tarball? Does the issue reproduce
> with simpler flags (plain -O2 or plain -O3? Esp. w/o -march=native?)

CP2K sources are easy, but to get a version that links, there are additional non-standard dependencies (blas, lapack, libint).
* [Bug lto/49700] LTO compile time hog
From: rguenth at gcc dot gnu.org @ 2012-05-07 12:41 UTC
To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=49700

Richard Guenther <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |WAITING
   Last reconfirmed|                            |2012-05-07
     Ever Confirmed|0                           |1

--- Comment #7 from Richard Guenther <rguenth at gcc dot gnu.org> 2012-05-07 12:38:43 UTC ---
Has the situation improved?
* [Bug lto/49700] LTO compile time hog
From: Joost.VandeVondele at mat dot ethz.ch @ 2012-05-07 19:18 UTC
To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=49700

--- Comment #8 from Joost VandeVondele <Joost.VandeVondele at mat dot ethz.ch> 2012-05-07 19:04:29 UTC ---
(In reply to comment #7)
> Has the situation improved?

current trunk LTO seems to fail on CP2K with:

/data/vjoost/clean/cp2k/cp2k/src/../src/rt_propagation_methods.F: In function ‘propagate_cn_or_em’:
/data/vjoost/clean/cp2k/cp2k/src/../src/rt_propagation_methods.F:217:0: error: type mismatch in component reference
   SUBROUTINE propagate_cn_or_em(qs_env, error)
 ^
struct array2_integer(kind=4)
struct array2_integer(kind=4)
# VUSE <.MEM_805>
D.79093_629 = D.79094_628->orders.data;
/data/vjoost/clean/cp2k/cp2k/src/../src/rt_propagation_methods.F:217:0: error: type mismatch in component reference
   SUBROUTINE propagate_cn_or_em(qs_env, error)
 ^
struct array2_integer(kind=4)
struct array2_integer(kind=4)
# VUSE <.MEM_805>
D.79092_630 = D.79094_628->orders.offset;
/data/vjoost/clean/cp2k/cp2k/src/../src/rt_propagation_methods.F:217:0: error: type mismatch in component reference
   SUBROUTINE propagate_cn_or_em(qs_env, error)
 ^
struct array2_integer(kind=4)
struct array2_integer(kind=4)
# VUSE <.MEM_805>
D.79090_632 = D.79094_628->orders.dim[1].stride;
/data/vjoost/clean/cp2k/cp2k/src/../src/rt_propagation_methods.F:217:0: error: type mismatch in component reference
   SUBROUTINE propagate_cn_or_em(qs_env, error)
 ^
struct array2_integer(kind=4)
struct array2_integer(kind=4)
# VUSE <.MEM_816>
D.79093_652 = D.79094_651->orders.data;
/data/vjoost/clean/cp2k/cp2k/src/../src/rt_propagation_methods.F:217:0: error: type mismatch in component reference
   SUBROUTINE propagate_cn_or_em(qs_env, error)
 ^
struct array2_integer(kind=4)
struct array2_integer(kind=4)
# VUSE <.MEM_816>
D.79092_653 = D.79094_651->orders.offset;
/data/vjoost/clean/cp2k/cp2k/src/../src/rt_propagation_methods.F:217:0: error: type mismatch in component reference
   SUBROUTINE propagate_cn_or_em(qs_env, error)
 ^
struct array2_integer(kind=4)
struct array2_integer(kind=4)
# VUSE <.MEM_816>
D.79090_655 = D.79094_651->orders.dim[1].stride;
/data/vjoost/clean/cp2k/cp2k/src/../src/rt_propagation_methods.F:217:0: error: type mismatch in component reference
   SUBROUTINE propagate_cn_or_em(qs_env, error)
 ^
struct array2_integer(kind=4)
struct array2_integer(kind=4)
# VUSE <.MEM_827>
D.79093_675 = D.79094_674->orders.data;
/data/vjoost/clean/cp2k/cp2k/src/../src/rt_propagation_methods.F:217:0: error: type mismatch in component reference
   SUBROUTINE propagate_cn_or_em(qs_env, error)
 ^
struct array2_integer(kind=4)
struct array2_integer(kind=4)
# VUSE <.MEM_827>
D.79092_676 = D.79094_674->orders.offset;
/data/vjoost/clean/cp2k/cp2k/src/../src/rt_propagation_methods.F:217:0: error: type mismatch in component reference
   SUBROUTINE propagate_cn_or_em(qs_env, error)
 ^
struct array2_integer(kind=4)
struct array2_integer(kind=4)
# VUSE <.MEM_827>
D.79090_678 = D.79094_674->orders.dim[1].stride;
/data/vjoost/clean/cp2k/cp2k/src/../src/rt_propagation_methods.F:217:0: error: type mismatch in component reference
   SUBROUTINE propagate_cn_or_em(qs_env, error)
 ^
struct array2_integer(kind=4)
struct array2_integer(kind=4)
# VUSE <.MEM_838>
D.79093_700 = D.79094_699->orders.data;
/data/vjoost/clean/cp2k/cp2k/src/../src/rt_propagation_methods.F:217:0: error: type mismatch in component reference
   SUBROUTINE propagate_cn_or_em(qs_env, error)
 ^
struct array2_integer(kind=4)
struct array2_integer(kind=4)
# VUSE <.MEM_838>
D.79092_701 = D.79094_699->orders.offset;
/data/vjoost/clean/cp2k/cp2k/src/../src/rt_propagation_methods.F:217:0: error: type mismatch in component reference
   SUBROUTINE propagate_cn_or_em(qs_env, error)
 ^
struct array2_integer(kind=4)
struct array2_integer(kind=4)
# VUSE <.MEM_838>
D.79090_703 = D.79094_699->orders.dim[1].stride;
/data/vjoost/clean/cp2k/cp2k/src/../src/rt_propagation_methods.F:217:0: internal compiler error: verify_gimple failed
   SUBROUTINE propagate_cn_or_em(qs_env, error)
 ^
Please submit a full bug report,
with preprocessed source if appropriate.
See <http://gcc.gnu.org/bugs.html> for instructions.
lto-wrapper: gfortran returned 1 exit status
/data/vjoost/gnu/binutils-2.22/install/bin/ld: lto-wrapper failed
collect2: error: ld returned 1 exit status
* [Bug lto/49700] LTO compile time hog
From: Joost.VandeVondele at mat dot ethz.ch @ 2012-05-08 18:59 UTC
To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=49700

Joost VandeVondele <Joost.VandeVondele at mat dot ethz.ch> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|WAITING                     |RESOLVED
         Resolution|                            |FIXED

--- Comment #9 from Joost VandeVondele <Joost.VandeVondele at mat dot ethz.ch> 2012-05-08 18:52:12 UTC ---
Trying 4.7.X instead, it actually looks very reasonable now. Using

-flto=jobserver -fuse-linker-plugin -ftime-report -O3 -march=native -ffast-math -g -ffree-form

I get CP2K to build in 4 min on a 32-core server. The time report also looks OK. I'll close this PR as fixed (the issue with 4.8 is tracked in PR 45586).