public inbox for gcc-bugs@sourceware.org
* [Bug lto/49700] New: LTO compile time hog
@ 2011-07-11  6:29 Joost.VandeVondele at pci dot uzh.ch
  2011-07-11  9:21 ` [Bug lto/49700] " rguenth at gcc dot gnu.org
                   ` (8 more replies)
  0 siblings, 9 replies; 10+ messages in thread
From: Joost.VandeVondele at pci dot uzh.ch @ 2011-07-11  6:29 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=49700

           Summary: LTO compile time hog
           Product: gcc
           Version: 4.7.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: lto
        AssignedTo: unassigned@gcc.gnu.org
        ReportedBy: Joost.VandeVondele@pci.uzh.ch


Using LTO, a CP2K compile can take several hours and a lot of memory. I have
been able to extract the following time report, but what is the best way to
make a testcase? -save-temps yielded the cp2k.sopt.ltrans29.ltrans.o file, but
is anything more needed?


 gfortran @/dev/shm/vondele/ccGyXn6S.args
Using built-in specs.
COLLECT_GCC=gfortran
COLLECT_LTO_WRAPPER=/data03/vondele/gnu/gcc_trunk/install/libexec/gcc/x86_64-unknown-linux-gnu/4.7.0/lto-wrapper
Target: x86_64-unknown-linux-gnu
Configured with: ../gcc/configure
--prefix=/data03/vondele/gnu/gcc_trunk/install --enable-languages=c,c++,fortran
--disable-multilib --enable-plugins --enable-cloog-backend=isl
--with-ppl=/data03/vondele/gnu/ppl-0.11/install
--with-cloog=/data03/vondele/gnu/cloog-0.16.1/install/
--with-libelf=/data03/vondele/gnu/libelf-0.8.13/install
--with-plugin-ld=ld.gold
Thread model: posix
gcc version 4.7.0 20110709 (experimental) [trunk revision 176072] (GCC)
COLLECT_GCC_OPTIONS='-c' '-v' '-save-temps' '-ftime-report' '-u'
'se-linker-plugin' '-O3' '-march=native' '-funroll-loops' '-ffast-math'
'-ffree-form' '-D' '__GFORTRAN' '-D' '__FFTSG' '-D' '__FFTW3' '-D' '__LIBINT'
'-I' '/ext/software/64/gfortran-suite/include' '-D'
'__COMPILE_ARCH="gfortran-test13"' '-D' '__COMPILE_DATE="Sun Jul 10 14:22:33
CEST 2011"' '-D' '__COMPILE_HOST="pcihopt3"' '-D'
'__COMPILE_LASTCVS="/qs_scf.F/1.527/Sat Jul  9 07:18:08 2011//"'
'-L/users/vondele/LAPACK/' '-L/ext/software/64/gfortran-suite/lib'
'-shared-libgcc' '-dumpdir'
'/data03/vondele/cp2k_gcc/cp2k/makefiles/../exe/gfortran-test13/' '-dumpbase'
'cp2k.sopt.ltrans29' '-fltrans' '-o' 'cp2k.sopt.ltrans29.ltrans.o'

/data03/vondele/gnu/gcc_trunk/install/libexec/gcc/x86_64-unknown-linux-gnu/4.7.0/lto1
-march=amdfam10 -mcx16 -msahf -mno-movbe -mno-aes -mno-pclmul -mpopcnt -mabm
-mno-lwp -mno-fma -mno-fma4 -mno-xop -mno-bmi -mno-tbm -mno-avx -mno-sse4.2
-mno-sse4.1 --param l1-cache-size=64 --param l1-cache-line-size=64 --param
l2-cache-size=512 -mtune=amdfam10 -quiet -dumpdir
/data03/vondele/cp2k_gcc/cp2k/makefiles/../exe/gfortran-test13/ -dumpbase
cp2k.sopt.ltrans29 -auxbase-strip cp2k.sopt.ltrans29.ltrans.o -O3 -version
-ftime-report -funroll-loops -ffast-math -ffree-form -fltrans
@/dev/shm/vondele/ccuRF1W6 -o cp2k.sopt.ltrans29.s
GNU GIMPLE (GCC) version 4.7.0 20110709 (experimental) [trunk revision 176072]
(x86_64-unknown-linux-gnu)
        compiled by GNU C version 4.7.0 20110709 (experimental) [trunk revision
176072], GMP version 4.3.2, MPFR version 2.4.2, MPC version 0.8.2
GGC heuristics: --param ggc-min-expand=30 --param ggc-min-heapsize=4096
GNU GIMPLE (GCC) version 4.7.0 20110709 (experimental) [trunk revision 176072]
(x86_64-unknown-linux-gnu)
        compiled by GNU C version 4.7.0 20110709 (experimental) [trunk revision
176072], GMP version 4.3.2, MPFR version 2.4.2, MPC version 0.8.2
GGC heuristics: --param ggc-min-expand=30 --param ggc-min-heapsize=4096

Execution times (seconds)
 phase setup             :   0.00 ( 0%) usr   0.00 ( 0%) sys   0.02 ( 0%) wall 
  1250 kB ( 0%) ggc
 phase parsing           :5086.93 (100%) usr   9.45 (100%) sys5213.35 (100%)
wall 1072513 kB (100%) ggc
 phase generate          :   0.00 ( 0%) usr   0.00 ( 0%) sys   0.01 ( 0%) wall 
     0 kB ( 0%) ggc
 garbage collection      :   4.16 ( 0%) usr   0.02 ( 0%) sys   4.22 ( 0%) wall 
     0 kB ( 0%) ggc
 ipa lto gimple in       :   0.13 ( 0%) usr   0.01 ( 0%) sys   0.16 ( 0%) wall 
 26100 kB ( 2%) ggc
 ipa lto decl in         :   0.86 ( 0%) usr   0.01 ( 0%) sys   0.89 ( 0%) wall 
  7027 kB ( 1%) ggc
 ipa pure const          :   0.03 ( 0%) usr   0.00 ( 0%) sys   0.03 ( 0%) wall 
    11 kB ( 0%) ggc
 cfg construction        :   0.62 ( 0%) usr   0.00 ( 0%) sys   0.63 ( 0%) wall 
  3820 kB ( 0%) ggc
 cfg cleanup             :   2.89 ( 0%) usr   0.00 ( 0%) sys   2.91 ( 0%) wall 
  4416 kB ( 0%) ggc
 CFG verifier            :  49.95 ( 1%) usr   0.00 ( 0%) sys  50.28 ( 1%) wall 
     0 kB ( 0%) ggc
 trivially dead code     :   0.88 ( 0%) usr   0.00 ( 0%) sys   0.90 ( 0%) wall 
     0 kB ( 0%) ggc
 df scan insns           :   0.20 ( 0%) usr   0.00 ( 0%) sys   0.23 ( 0%) wall 
    20 kB ( 0%) ggc
 df multiple defs        :   0.34 ( 0%) usr   0.00 ( 0%) sys   0.34 ( 0%) wall 
     0 kB ( 0%) ggc
 df reaching defs        :  60.58 ( 1%) usr   1.10 (12%) sys  62.35 ( 1%) wall 
     0 kB ( 0%) ggc
 df live regs            :  13.28 ( 0%) usr   0.04 ( 0%) sys  13.56 ( 0%) wall 
     0 kB ( 0%) ggc
 df live&initialized regs:   5.43 ( 0%) usr   0.00 ( 0%) sys   5.59 ( 0%) wall 
     0 kB ( 0%) ggc
 df use-def / def-use chains:   1.16 ( 0%) usr   0.01 ( 0%) sys   1.25 ( 0%)
wall       0 kB ( 0%) ggc
 df live reg subwords    :   0.47 ( 0%) usr   0.00 ( 0%) sys   0.46 ( 0%) wall 
     0 kB ( 0%) ggc
 df reg dead/unused notes:   4.22 ( 0%) usr   0.00 ( 0%) sys   4.20 ( 0%) wall 
  8542 kB ( 1%) ggc
 register information    :   1.72 ( 0%) usr   0.00 ( 0%) sys   1.77 ( 0%) wall 
     0 kB ( 0%) ggc
 alias analysis          :   2.40 ( 0%) usr   0.00 ( 0%) sys   2.40 ( 0%) wall 
 29964 kB ( 3%) ggc
 alias stmt walking      :   1.06 ( 0%) usr   0.03 ( 0%) sys   1.11 ( 0%) wall 
  6438 kB ( 1%) ggc
 register scan           :   0.17 ( 0%) usr   0.00 ( 0%) sys   0.17 ( 0%) wall 
    35 kB ( 0%) ggc
 rebuild jump labels     :   0.85 ( 0%) usr   0.00 ( 0%) sys   0.87 ( 0%) wall 
     7 kB ( 0%) ggc
 inline heuristics       :   0.01 ( 0%) usr   0.00 ( 0%) sys   0.00 ( 0%) wall 
     0 kB ( 0%) ggc
 integration             :   0.10 ( 0%) usr   0.00 ( 0%) sys   0.10 ( 0%) wall 
 14586 kB ( 1%) ggc
 tree CFG cleanup        :   3.58 ( 0%) usr   0.01 ( 0%) sys   3.52 ( 0%) wall 
  2188 kB ( 0%) ggc
 tree VRP                :   1.49 ( 0%) usr   0.01 ( 0%) sys   1.56 ( 0%) wall 
 17581 kB ( 2%) ggc
 tree copy propagation   :   0.27 ( 0%) usr   0.01 ( 0%) sys   0.29 ( 0%) wall 
   858 kB ( 0%) ggc
 tree PTA                :   0.57 ( 0%) usr   0.00 ( 0%) sys   0.63 ( 0%) wall 
   670 kB ( 0%) ggc
 tree SSA rewrite        :   8.75 ( 0%) usr   0.01 ( 0%) sys   8.72 ( 0%) wall 
  1215 kB ( 0%) ggc
 tree SSA incremental    :  26.73 ( 1%) usr   0.00 ( 0%) sys  27.36 ( 1%) wall 
  3046 kB ( 0%) ggc
 tree operand scan       :   0.78 ( 0%) usr   0.26 ( 3%) sys   0.83 ( 0%) wall 
 39556 kB ( 4%) ggc
 dominator optimization  :   0.29 ( 0%) usr   0.00 ( 0%) sys   0.38 ( 0%) wall 
  7238 kB ( 1%) ggc
 tree CCP                :   0.34 ( 0%) usr   0.00 ( 0%) sys   0.42 ( 0%) wall 
  1098 kB ( 0%) ggc
 tree PHI const/copy prop:   0.01 ( 0%) usr   0.00 ( 0%) sys   0.01 ( 0%) wall 
   126 kB ( 0%) ggc
 tree split crit edges   :   0.03 ( 0%) usr   0.00 ( 0%) sys   0.01 ( 0%) wall 
  1301 kB ( 0%) ggc
 tree reassociation      :   0.19 ( 0%) usr   0.01 ( 0%) sys   0.26 ( 0%) wall 
  5052 kB ( 0%) ggc
 tree PRE                :   1.17 ( 0%) usr   0.01 ( 0%) sys   1.14 ( 0%) wall 
 13070 kB ( 1%) ggc
 tree FRE                :   0.54 ( 0%) usr   0.01 ( 0%) sys   0.57 ( 0%) wall 
  6904 kB ( 1%) ggc
 tree code sinking       :   0.04 ( 0%) usr   0.00 ( 0%) sys   0.03 ( 0%) wall 
   427 kB ( 0%) ggc
 tree linearize phis     :   0.01 ( 0%) usr   0.00 ( 0%) sys   0.02 ( 0%) wall 
    30 kB ( 0%) ggc
 tree forward propagate  :   0.05 ( 0%) usr   0.00 ( 0%) sys   0.10 ( 0%) wall 
  1803 kB ( 0%) ggc
 tree phiprop            :   0.01 ( 0%) usr   0.00 ( 0%) sys   0.00 ( 0%) wall 
     0 kB ( 0%) ggc
 tree conservative DCE   :   0.14 ( 0%) usr   0.01 ( 0%) sys   0.15 ( 0%) wall 
   155 kB ( 0%) ggc
 tree aggressive DCE     :   0.36 ( 0%) usr   0.00 ( 0%) sys   0.43 ( 0%) wall 
 11259 kB ( 1%) ggc
 tree DSE                :   0.19 ( 0%) usr   0.00 ( 0%) sys   0.18 ( 0%) wall 
     0 kB ( 0%) ggc
 tree loop bounds        :   0.31 ( 0%) usr   0.00 ( 0%) sys   0.25 ( 0%) wall 
 11761 kB ( 1%) ggc
 tree loop invariant motion:   0.22 ( 0%) usr   0.02 ( 0%) sys   0.23 ( 0%)
wall    4266 kB ( 0%) ggc
 tree canonical iv       :   0.03 ( 0%) usr   0.00 ( 0%) sys   0.02 ( 0%) wall 
   875 kB ( 0%) ggc
 scev constant prop      :   0.04 ( 0%) usr   0.00 ( 0%) sys   0.04 ( 0%) wall 
  2182 kB ( 0%) ggc
 complete unrolling      :   0.70 ( 0%) usr   0.00 ( 0%) sys   0.77 ( 0%) wall 
 37727 kB ( 4%) ggc
 tree vectorization      :   0.79 ( 0%) usr   0.04 ( 0%) sys   0.84 ( 0%) wall 
 36624 kB ( 3%) ggc
 tree slp vectorization  :   0.45 ( 0%) usr   0.00 ( 0%) sys   0.44 ( 0%) wall 
 11293 kB ( 1%) ggc
 tree loop distribution  :   3.31 ( 0%) usr   0.00 ( 0%) sys   3.32 ( 0%) wall 
   857 kB ( 0%) ggc
 tree prefetching        :  51.24 ( 1%) usr   0.07 ( 1%) sys  52.02 ( 1%) wall 
 56594 kB ( 5%) ggc
 tree iv optimization    :   7.47 ( 0%) usr   0.09 ( 1%) sys   7.64 ( 0%) wall 
200051 kB (19%) ggc
 predictive commoning    :   0.25 ( 0%) usr   0.00 ( 0%) sys   0.20 ( 0%) wall 
  6379 kB ( 1%) ggc
 tree loop init          :   0.10 ( 0%) usr   0.00 ( 0%) sys   0.08 ( 0%) wall 
   856 kB ( 0%) ggc
 tree copy headers       :   0.02 ( 0%) usr   0.00 ( 0%) sys   0.01 ( 0%) wall 
  1560 kB ( 0%) ggc
 tree SSA uncprop        :   0.02 ( 0%) usr   0.00 ( 0%) sys   0.03 ( 0%) wall 
     0 kB ( 0%) ggc
 tree rename SSA copies  :   0.07 ( 0%) usr   0.00 ( 0%) sys   0.05 ( 0%) wall 
     0 kB ( 0%) ggc
 tree SSA verifier       : 376.24 ( 7%) usr   0.04 ( 0%) sys 379.72 ( 7%) wall 
     0 kB ( 0%) ggc
 tree STMT verifier      : 439.42 ( 9%) usr   0.16 ( 2%) sys 443.29 ( 9%) wall 
     0 kB ( 0%) ggc
 callgraph verifier      :   0.03 ( 0%) usr   0.00 ( 0%) sys   0.06 ( 0%) wall 
    63 kB ( 0%) ggc
 dominance frontiers     :  16.93 ( 0%) usr   0.03 ( 0%) sys  16.81 ( 0%) wall 
     0 kB ( 0%) ggc
 dominance computation   :  10.78 ( 0%) usr   0.00 ( 0%) sys  10.50 ( 0%) wall 
     0 kB ( 0%) ggc
 control dependences     :   0.01 ( 0%) usr   0.00 ( 0%) sys   0.01 ( 0%) wall 
     0 kB ( 0%) ggc
 out of ssa              :   0.63 ( 0%) usr   0.00 ( 0%) sys   0.68 ( 0%) wall 
   207 kB ( 0%) ggc
 expand vars             :   0.08 ( 0%) usr   0.00 ( 0%) sys   0.08 ( 0%) wall 
 12686 kB ( 1%) ggc
 expand                  :   0.83 ( 0%) usr   0.00 ( 0%) sys   0.76 ( 0%) wall 
 50753 kB ( 5%) ggc
 post expand cleanups    :   0.05 ( 0%) usr   0.00 ( 0%) sys   0.06 ( 0%) wall 
  3897 kB ( 0%) ggc
 lower subreg            :   0.28 ( 0%) usr   0.00 ( 0%) sys   0.29 ( 0%) wall 
     0 kB ( 0%) ggc
 forward prop            :   1.07 ( 0%) usr   0.00 ( 0%) sys   1.10 ( 0%) wall 
 11010 kB ( 1%) ggc
 CSE                     :   0.71 ( 0%) usr   0.00 ( 0%) sys   0.77 ( 0%) wall 
   446 kB ( 0%) ggc
 dead code elimination   :   1.73 ( 0%) usr   0.00 ( 0%) sys   1.74 ( 0%) wall 
     0 kB ( 0%) ggc
 dead store elim1        :   1.33 ( 0%) usr   0.00 ( 0%) sys   1.35 ( 0%) wall 
 15295 kB ( 1%) ggc
 dead store elim2        :   1.50 ( 0%) usr   0.00 ( 0%) sys   1.54 ( 0%) wall 
 15743 kB ( 1%) ggc
 loop analysis           :   0.11 ( 0%) usr   0.00 ( 0%) sys   0.11 ( 0%) wall 
  1864 kB ( 0%) ggc
 loop invariant motion   :  24.13 ( 0%) usr   0.02 ( 0%) sys  23.97 ( 0%) wall 
  1275 kB ( 0%) ggc
 loop unswitching        : 872.57 (17%) usr   0.01 ( 0%) sys 873.51 (17%) wall 
   111 kB ( 0%) ggc
 loop unrolling          :2802.10 (55%) usr   4.15 (44%) sys2910.38 (56%) wall 
 99541 kB ( 9%) ggc
 CPROP                   :   0.40 ( 0%) usr   0.00 ( 0%) sys   0.26 ( 0%) wall 
  3209 kB ( 0%) ggc
 PRE                     :   0.22 ( 0%) usr   0.00 ( 0%) sys   0.20 ( 0%) wall 
   113 kB ( 0%) ggc
 code hoisting           :   0.09 ( 0%) usr   0.00 ( 0%) sys   0.08 ( 0%) wall 
     4 kB ( 0%) ggc
 web                     :   1.60 ( 0%) usr   0.02 ( 0%) sys   1.63 ( 0%) wall 
  6999 kB ( 1%) ggc
 CSE 2                   :   1.49 ( 0%) usr   0.00 ( 0%) sys   1.53 ( 0%) wall 
  1326 kB ( 0%) ggc
 combiner                :  23.55 ( 0%) usr   0.04 ( 0%) sys  23.84 ( 0%) wall 
 53172 kB ( 5%) ggc
 if-conversion           :  13.53 ( 0%) usr   0.01 ( 0%) sys  13.69 ( 0%) wall 
  1910 kB ( 0%) ggc
 regmove                 :   0.92 ( 0%) usr   0.00 ( 0%) sys   0.94 ( 0%) wall 
     9 kB ( 0%) ggc
 integrated RA           :  19.24 ( 0%) usr   3.00 (32%) sys  22.40 ( 0%) wall 
151112 kB (14%) ggc
 reload                  :  16.17 ( 0%) usr   0.00 ( 0%) sys  16.37 ( 0%) wall 
 12120 kB ( 1%) ggc
 reload CSE regs         :   3.43 ( 0%) usr   0.00 ( 0%) sys   3.52 ( 0%) wall 
 25345 kB ( 2%) ggc
 load CSE after reload   :   1.71 ( 0%) usr   0.00 ( 0%) sys   1.74 ( 0%) wall 
   919 kB ( 0%) ggc
 zee                     :   0.47 ( 0%) usr   0.00 ( 0%) sys   0.48 ( 0%) wall 
    42 kB ( 0%) ggc
 thread pro- & epilogue  :   0.10 ( 0%) usr   0.00 ( 0%) sys   0.11 ( 0%) wall 
   618 kB ( 0%) ggc
 if-conversion 2         :   0.20 ( 0%) usr   0.01 ( 0%) sys   0.21 ( 0%) wall 
   776 kB ( 0%) ggc
 combine stack adjustments:   0.10 ( 0%) usr   0.00 ( 0%) sys   0.11 ( 0%) wall
      0 kB ( 0%) ggc
 peephole 2              :   0.32 ( 0%) usr   0.00 ( 0%) sys   0.33 ( 0%) wall 
  2368 kB ( 0%) ggc
 rename registers        :   1.13 ( 0%) usr   0.00 ( 0%) sys   1.16 ( 0%) wall 
  3536 kB ( 0%) ggc
 hard reg cprop          :   0.82 ( 0%) usr   0.00 ( 0%) sys   0.78 ( 0%) wall 
    15 kB ( 0%) ggc
 scheduling 2            :   4.34 ( 0%) usr   0.00 ( 0%) sys   4.35 ( 0%) wall 
   351 kB ( 0%) ggc
 machine dep reorg       :   0.76 ( 0%) usr   0.00 ( 0%) sys   0.73 ( 0%) wall 
    15 kB ( 0%) ggc
 reorder blocks          :   0.93 ( 0%) usr   0.00 ( 0%) sys   0.92 ( 0%) wall 
  2694 kB ( 0%) ggc
 final                   :   1.68 ( 0%) usr   0.09 ( 1%) sys   1.76 ( 0%) wall 
   423 kB ( 0%) ggc
 rest of compilation     :   1.73 ( 0%) usr   0.02 ( 0%) sys   1.73 ( 0%) wall 
  5241 kB ( 0%) ggc
 remove unused locals    :   0.96 ( 0%) usr   0.00 ( 0%) sys   0.87 ( 0%) wall 
     0 kB ( 0%) ggc
 address taken           :   0.05 ( 0%) usr   0.00 ( 0%) sys   0.06 ( 0%) wall 
     0 kB ( 0%) ggc
 unaccounted todo        :   1.15 ( 0%) usr   0.02 ( 0%) sys   1.16 ( 0%) wall 
     9 kB ( 0%) ggc
 verify loop closed      : 168.68 ( 3%) usr   0.05 ( 1%) sys 169.90 ( 3%) wall 
     0 kB ( 0%) ggc
 verify RTL sharing      :   8.07 ( 0%) usr   0.00 ( 0%) sys   8.30 ( 0%) wall 
     0 kB ( 0%) ggc
 rebuild frequencies     :   0.01 ( 0%) usr   0.00 ( 0%) sys   0.01 ( 0%) wall 
   249 kB ( 0%) ggc
 repair loop structures  :   0.48 ( 0%) usr   0.00 ( 0%) sys   0.48 ( 0%) wall 
   961 kB ( 0%) ggc
 TOTAL                 :5086.93             9.45          5213.38           
1073764 kB
Extra diagnostic checks enabled; compiler may run slowly.
Configure with --enable-checking=release to disable checks.
COLLECT_GCC_OPTIONS='-c' '-v' '-save-temps' '-ftime-report' '-u'
'se-linker-plugin' '-O3' '-march=native' '-funroll-loops' '-ffast-math'
'-ffree-form' '-D' '__GFORTRAN' '-D' '__FFTSG' '-D' '__FFTW3' '-D' '__LIBINT'
'-I' '/ext/software/64/gfortran-suite/include' '-D'
'__COMPILE_ARCH="gfortran-test13"' '-D' '__COMPILE_DATE="Sun Jul 10 14:22:33
CEST 2011"' '-D' '__COMPILE_HOST="pcihopt3"' '-D'
'__COMPILE_LASTCVS="/qs_scf.F/1.527/Sat Jul  9 07:18:08 2011//"'
'-L/users/vondele/LAPACK/' '-L/ext/software/64/gfortran-suite/lib'
'-shared-libgcc' '-dumpdir'
'/data03/vondele/cp2k_gcc/cp2k/makefiles/../exe/gfortran-test13/' '-dumpbase'
'cp2k.sopt.ltrans29' '-fltrans' '-o' 'cp2k.sopt.ltrans29.ltrans.o'
 as --64 -o cp2k.sopt.ltrans29.ltrans.o cp2k.sopt.ltrans29.s
COMPILER_PATH=/data03/vondele/gnu/gcc_trunk/install/libexec/gcc/x86_64-unknown-linux-gnu/4.7.0/:/data03/vondele/gnu/gcc_trunk/install/libexec/gcc/x86_64-unknown-linux-gnu/4.7.0/:/data03/vondele/gnu/gcc_trunk/install/libexec/gcc/x86_64-unknown-linux-gnu/:/data03/vondele/gnu/gcc_trunk/install/lib/gcc/x86_64-unknown-linux-gnu/4.7.0/:/data03/vondele/gnu/gcc_trunk/install/lib/gcc/x86_64-unknown-linux-gnu/:/data03/vondele/gnu/gcc_trunk/install/libexec/gcc/x86_64-unknown-linux-gnu/4.7.0/:/data03/vondele/gnu/gcc_trunk/install/libexec/gcc/x86_64-unknown-linux-gnu/4.7.0/:/data03/vondele/gnu/gcc_trunk/install/libexec/gcc/x86_64-unknown-linux-gnu/:/data03/vondele/gnu/gcc_trunk/install/lib/gcc/x86_64-unknown-linux-gnu/4.7.0/:/data03/vondele/gnu/gcc_trunk/install/lib/gcc/x86_64-unknown-linux-gnu/
LIBRARY_PATH=/data03/vondele/gnu/gcc_trunk/install/lib/gcc/x86_64-unknown-linux-gnu/4.7.0/../../../../lib64/../lib64/:/lib/../lib64/../lib64/:/usr/lib/../lib64/../lib64/:/data03/vondele/gnu/gcc_trunk/install/lib/gcc/x86_64-unknown-linux-gnu/4.7.0/../../../../lib64/:/lib/../lib64/:/usr/lib/../lib64/:/data03/vondele/gnu/gcc_trunk/install/lib/gcc/x86_64-unknown-linux-gnu/4.7.0/:/data03/vondele/gnu/gcc_trunk/install/lib/gcc/x86_64-unknown-linux-gnu/4.7.0/../../../../lib64/:/lib/../lib64/:/usr/lib/../lib64/:/data03/vondele/gnu/gcc_trunk/install/lib/gcc/x86_64-unknown-linux-gnu/4.7.0/:/data03/vondele/gnu/gcc_trunk/install/lib/gcc/x86_64-unknown-linux-gnu/4.7.0/../../../../lib64/:/lib/../lib64/:/usr/lib/../lib64/:/data03/vondele/gnu/gcc_trunk/install/lib/gcc/x86_64-unknown-linux-gnu/4.7.0/../../../:/lib/:/usr/lib/:/data03/vondele/gnu/gcc_trunk/install/lib/gcc/x86_64-unknown-linux-gnu/4.7.0/../../../:/lib/:/usr/lib/
COLLECT_GCC_OPTIONS='-c' '-v' '-save-temps' '-ftime-report' '-u'
'se-linker-plugin' '-O3' '-march=native' '-funroll-loops' '-ffast-math'
'-ffree-form' '-D' '__GFORTRAN' '-D' '__FFTSG' '-D' '__FFTW3' '-D' '__LIBINT'
'-I' '/ext/software/64/gfortran-suite/include' '-D'
'__COMPILE_ARCH="gfortran-test13"' '-D' '__COMPILE_DATE="Sun Jul 10 14:22:33
CEST 2011"' '-D' '__COMPILE_HOST="pcihopt3"' '-D'
'__COMPILE_LASTCVS="/qs_scf.F/1.527/Sat Jul  9 07:18:08 2011//"'
'-L/users/vondele/LAPACK/' '-L/ext/software/64/gfortran-suite/lib'
'-shared-libgcc' '-dumpdir'
'/data03/vondele/cp2k_gcc/cp2k/makefiles/../exe/gfortran-test13/' '-dumpbase'
'cp2k.sopt.ltrans29' '-fltrans' '-o' 'cp2k.sopt.ltrans29.ltrans.o'
[Leaving LTRANS /dev/shm/vondele/ccGyXn6S.args]
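
A report of this size is easier to read once it is ranked. The following is a minimal sketch, not part of any GCC tooling, of how the "usr" column of such a -ftime-report table could be extracted and sorted. The names parse_time_report and top_passes are hypothetical, and the regular expression assumes the line layout shown above, including the wrapped continuation lines, which it simply skips.

```python
import re

# Matches one timer line of a GCC -ftime-report table, e.g.
# " loop unrolling          :2802.10 (55%) usr ..."
# Only the timer name and the "usr" column are captured (layout assumed
# from the report quoted in this message).
LINE_RE = re.compile(
    r"^\s*(?P<name>[^:]+?)\s*:\s*(?P<usr>\d+\.\d+)\s*\(\s*(?P<pct>\d+)%\)\s*usr"
)


def parse_time_report(text):
    """Return (name, usr_seconds, usr_percent) for every timer line in text."""
    rows = []
    for line in text.splitlines():
        m = LINE_RE.match(line)
        if m:
            rows.append((m.group("name"), float(m.group("usr")), int(m.group("pct"))))
    return rows


def top_passes(text, n=5):
    """Return the n timers with the largest user time, skipping phase lines."""
    rows = [r for r in parse_time_report(text) if not r[0].startswith("phase ")]
    return sorted(rows, key=lambda r: r[1], reverse=True)[:n]


# A few lines copied from the report above, wrapping included.
SAMPLE = """\
 phase parsing           :5086.93 (100%) usr   9.45 (100%) sys5213.35 (100%)
wall 1072513 kB (100%) ggc
 tree SSA verifier       : 376.24 ( 7%) usr   0.04 ( 0%) sys 379.72 ( 7%) wall
     0 kB ( 0%) ggc
 tree STMT verifier      : 439.42 ( 9%) usr   0.16 ( 2%) sys 443.29 ( 9%) wall
     0 kB ( 0%) ggc
 loop unswitching        : 872.57 (17%) usr   0.01 ( 0%) sys 873.51 (17%) wall
   111 kB ( 0%) ggc
 loop unrolling          :2802.10 (55%) usr   4.15 (44%) sys2910.38 (56%) wall
 99541 kB ( 9%) ggc
 TOTAL                 :5086.93             9.45          5213.38
1073764 kB
"""

for name, usr, pct in top_passes(SAMPLE, n=3):
    print(f"{name:<24}{usr:>10.2f} s {pct:>3}%")
```

On the sample lines, the ranking puts loop unrolling and loop unswitching ahead of the verifiers. The TOTAL line is ignored because it has no "usr" label.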



* [Bug lto/49700] LTO compile time hog
  2011-07-11  6:29 [Bug lto/49700] New: LTO compile time hog Joost.VandeVondele at pci dot uzh.ch
@ 2011-07-11  9:21 ` rguenth at gcc dot gnu.org
  2011-07-11  9:29 ` jakub at gcc dot gnu.org
                   ` (7 subsequent siblings)
  8 siblings, 0 replies; 10+ messages in thread
From: rguenth at gcc dot gnu.org @ 2011-07-11  9:21 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=49700

Richard Guenther <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |hubicka at gcc dot gnu.org,
                   |                            |rguenth at gcc dot gnu.org

--- Comment #1 from Richard Guenther <rguenth at gcc dot gnu.org> 2011-07-11 09:20:37 UTC ---
The

 phase parsing           :5086.93 (100%) usr   9.45 (100%) sys5213.35 (100%)
wall 1072513 kB (100%) ggc

line looks completely odd ;)  It would be interesting to know what broke the
LTO I/O type/merging accounting.

Honza?  The above is from LTRANS phase, so maybe there's some timevar
pushing missing?  It looks like materialize_cgraph is suspicious.

No, the ltrans object file is of no use.

Joost, can you point us to a source tarball?  Does the issue reproduce
with simpler flags (plain -O2 or plain -O3?  Esp. w/o -march=native?)



* [Bug lto/49700] LTO compile time hog
  2011-07-11  6:29 [Bug lto/49700] New: LTO compile time hog Joost.VandeVondele at pci dot uzh.ch
  2011-07-11  9:21 ` [Bug lto/49700] " rguenth at gcc dot gnu.org
@ 2011-07-11  9:29 ` jakub at gcc dot gnu.org
  2011-07-11  9:33 ` rguenth at gcc dot gnu.org
                   ` (6 subsequent siblings)
  8 siblings, 0 replies; 10+ messages in thread
From: jakub at gcc dot gnu.org @ 2011-07-11  9:29 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=49700

Jakub Jelinek <jakub at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |jakub at gcc dot gnu.org

--- Comment #2 from Jakub Jelinek <jakub at gcc dot gnu.org> 2011-07-11 09:27:50 UTC ---
The phase parsing numbers are uninteresting, as the lto1 FE does almost all the
work in its lto_main (aka parse_file langhook), including cgraph_optimize.  So
that time includes basically the whole compilation; you can see that the
generate phase is very short.
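
As a small worked check of this reading, the sketch below uses figures copied from the report in the first message. It shows that the three phase timers add up to the TOTAL line, and that the individual pass timers can be read as shares of the parsing phase. The variable names are illustrative only.

```python
# User-time figures (seconds) copied from the report in the first message.
phases = {"phase setup": 0.00, "phase parsing": 5086.93, "phase generate": 0.00}
total_usr = 5086.93  # TOTAL line

# The three phases partition the total, so "phase parsing" at 100% is simply
# the whole lto1 run rather than time spent in any one activity.
assert abs(sum(phases.values()) - total_usr) < 0.01

# Individual pass timers are nested inside that phase; two of the largest:
passes = {"loop unrolling": 2802.10, "loop unswitching": 872.57}
for name, usr in passes.items():
    share = usr / phases["phase parsing"]
    print(f"{name:<18}{usr:>9.2f} s  {share:.0%}")
```

The shares printed for the two passes agree with the 55% and 17% shown in the report.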



* [Bug lto/49700] LTO compile time hog
  2011-07-11  6:29 [Bug lto/49700] New: LTO compile time hog Joost.VandeVondele at pci dot uzh.ch
  2011-07-11  9:21 ` [Bug lto/49700] " rguenth at gcc dot gnu.org
  2011-07-11  9:29 ` jakub at gcc dot gnu.org
@ 2011-07-11  9:33 ` rguenth at gcc dot gnu.org
  2011-07-11 19:57 ` pinskia at gcc dot gnu.org
                   ` (5 subsequent siblings)
  8 siblings, 0 replies; 10+ messages in thread
From: rguenth at gcc dot gnu.org @ 2011-07-11  9:33 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=49700

--- Comment #3 from Richard Guenther <rguenth at gcc dot gnu.org> 2011-07-11 09:33:36 UTC ---
Oh, so it's a sum ...

Well, then I suppose you run into the usual array-prefetching compile-time hog.
Try -fno-prefetch-loop-arrays.



* [Bug lto/49700] LTO compile time hog
  2011-07-11  6:29 [Bug lto/49700] New: LTO compile time hog Joost.VandeVondele at pci dot uzh.ch
                   ` (2 preceding siblings ...)
  2011-07-11  9:33 ` rguenth at gcc dot gnu.org
@ 2011-07-11 19:57 ` pinskia at gcc dot gnu.org
  2011-07-12 10:18 ` Joost.VandeVondele at pci dot uzh.ch
                   ` (4 subsequent siblings)
  8 siblings, 0 replies; 10+ messages in thread
From: pinskia at gcc dot gnu.org @ 2011-07-11 19:57 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=49700

--- Comment #4 from Andrew Pinski <pinskia at gcc dot gnu.org> 2011-07-11 19:56:55 UTC ---
> Extra diagnostic checks enabled; compiler may run slowly.
> Configure with --enable-checking=release to disable checks.

Also try building the compiler with that option passed to configure.  The
default for trunk has always been to enable extra checking.

> CFG verifier            :  49.95 ( 1%) usr
> tree SSA verifier       : 376.24 ( 7%) usr
> tree STMT verifier      : 439.42 ( 9%) usr
> verify loop closed      : 168.68 ( 3%) usr

As you can see from the lines above, verification alone takes a significant
share of the time (at least 20% of the total).
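
The 20% figure can be reproduced from the four quoted lines. The sketch below takes the total from the TOTAL line of the original report. It covers only these four timers, so other checking overhead is not included.

```python
# User-time figures (seconds) copied from the four quoted report lines.
verifier_usr = {
    "CFG verifier": 49.95,
    "tree SSA verifier": 376.24,
    "tree STMT verifier": 439.42,
    "verify loop closed": 168.68,
}
total_usr = 5086.93  # TOTAL line of the same report

verifier_total = sum(verifier_usr.values())
verifier_share = verifier_total / total_usr

print(f"verifiers: {verifier_total:.2f} s of {total_usr:.2f} s "
      f"({verifier_share:.1%})")
# prints: verifiers: 1034.29 s of 5086.93 s (20.3%)
```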



* [Bug lto/49700] LTO compile time hog
  2011-07-11  6:29 [Bug lto/49700] New: LTO compile time hog Joost.VandeVondele at pci dot uzh.ch
                   ` (3 preceding siblings ...)
  2011-07-11 19:57 ` pinskia at gcc dot gnu.org
@ 2011-07-12 10:18 ` Joost.VandeVondele at pci dot uzh.ch
  2011-07-12 14:41 ` Joost.VandeVondele at pci dot uzh.ch
                   ` (3 subsequent siblings)
  8 siblings, 0 replies; 10+ messages in thread
From: Joost.VandeVondele at pci dot uzh.ch @ 2011-07-12 10:18 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=49700

--- Comment #5 from Joost VandeVondele <Joost.VandeVondele at pci dot uzh.ch> 2011-07-12 10:17:54 UTC ---
(In reply to comment #3)
> Oh, so it's a sum ...
> 
> Well, then I suppose you run into the usual array-prefetching compile-time hog.
> Try -fno-prefetch-loop-arrays.

This seems to reduce the time by a factor of 5. On a non-LTO run the flag makes
no difference, but I assume LTO makes more info available that triggers some
action by that pass.



* [Bug lto/49700] LTO compile time hog
  2011-07-11  6:29 [Bug lto/49700] New: LTO compile time hog Joost.VandeVondele at pci dot uzh.ch
                   ` (4 preceding siblings ...)
  2011-07-12 10:18 ` Joost.VandeVondele at pci dot uzh.ch
@ 2011-07-12 14:41 ` Joost.VandeVondele at pci dot uzh.ch
  2012-05-07 12:41 ` rguenth at gcc dot gnu.org
                   ` (2 subsequent siblings)
  8 siblings, 0 replies; 10+ messages in thread
From: Joost.VandeVondele at pci dot uzh.ch @ 2011-07-12 14:41 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=49700

--- Comment #6 from Joost VandeVondele <Joost.VandeVondele at pci dot uzh.ch> 2011-07-12 14:40:29 UTC ---
(In reply to comment #1)
> Joost, can you point us to a source tarball?  Does the issue reproduce
> with simpler flags (plain -O2 or plain -O3?  Esp. w/o -march=native?)

CP2K sources are easy to get, but a version that links needs additional
non-standard dependencies (blas, lapack, libint).



* [Bug lto/49700] LTO compile time hog
  2011-07-11  6:29 [Bug lto/49700] New: LTO compile time hog Joost.VandeVondele at pci dot uzh.ch
                   ` (5 preceding siblings ...)
  2011-07-12 14:41 ` Joost.VandeVondele at pci dot uzh.ch
@ 2012-05-07 12:41 ` rguenth at gcc dot gnu.org
  2012-05-07 19:18 ` Joost.VandeVondele at mat dot ethz.ch
  2012-05-08 18:59 ` Joost.VandeVondele at mat dot ethz.ch
  8 siblings, 0 replies; 10+ messages in thread
From: rguenth at gcc dot gnu.org @ 2012-05-07 12:41 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=49700

Richard Guenther <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |WAITING
   Last reconfirmed|                            |2012-05-07
     Ever Confirmed|0                           |1

--- Comment #7 from Richard Guenther <rguenth at gcc dot gnu.org> 2012-05-07 12:38:43 UTC ---
Has the situation improved?



* [Bug lto/49700] LTO compile time hog
  2011-07-11  6:29 [Bug lto/49700] New: LTO compile time hog Joost.VandeVondele at pci dot uzh.ch
                   ` (6 preceding siblings ...)
  2012-05-07 12:41 ` rguenth at gcc dot gnu.org
@ 2012-05-07 19:18 ` Joost.VandeVondele at mat dot ethz.ch
  2012-05-08 18:59 ` Joost.VandeVondele at mat dot ethz.ch
  8 siblings, 0 replies; 10+ messages in thread
From: Joost.VandeVondele at mat dot ethz.ch @ 2012-05-07 19:18 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=49700

--- Comment #8 from Joost VandeVondele <Joost.VandeVondele at mat dot ethz.ch> 2012-05-07 19:04:29 UTC ---
(In reply to comment #7)
> Has the situation improved?

Current trunk LTO seems to fail on CP2K with:

/data/vjoost/clean/cp2k/cp2k/src/../src/rt_propagation_methods.F: In function
‘propagate_cn_or_em’:
/data/vjoost/clean/cp2k/cp2k/src/../src/rt_propagation_methods.F:217:0: error:
type mismatch in component reference
   SUBROUTINE propagate_cn_or_em(qs_env, error)
 ^
struct array2_integer(kind=4)

struct array2_integer(kind=4)

# VUSE <.MEM_805>
D.79093_629 = D.79094_628->orders.data;

/data/vjoost/clean/cp2k/cp2k/src/../src/rt_propagation_methods.F:217:0: error:
type mismatch in component reference
   SUBROUTINE propagate_cn_or_em(qs_env, error)
 ^
struct array2_integer(kind=4)

struct array2_integer(kind=4)

# VUSE <.MEM_805>
D.79092_630 = D.79094_628->orders.offset;

/data/vjoost/clean/cp2k/cp2k/src/../src/rt_propagation_methods.F:217:0: error:
type mismatch in component reference
   SUBROUTINE propagate_cn_or_em(qs_env, error)
 ^
struct array2_integer(kind=4)

struct array2_integer(kind=4)

# VUSE <.MEM_805>
D.79090_632 = D.79094_628->orders.dim[1].stride;

/data/vjoost/clean/cp2k/cp2k/src/../src/rt_propagation_methods.F:217:0: error:
type mismatch in component reference
   SUBROUTINE propagate_cn_or_em(qs_env, error)
 ^
struct array2_integer(kind=4)

struct array2_integer(kind=4)

# VUSE <.MEM_816>
D.79093_652 = D.79094_651->orders.data;

/data/vjoost/clean/cp2k/cp2k/src/../src/rt_propagation_methods.F:217:0: error:
type mismatch in component reference
   SUBROUTINE propagate_cn_or_em(qs_env, error)
 ^
struct array2_integer(kind=4)

struct array2_integer(kind=4)

# VUSE <.MEM_816>
D.79092_653 = D.79094_651->orders.offset;

/data/vjoost/clean/cp2k/cp2k/src/../src/rt_propagation_methods.F:217:0: error:
type mismatch in component reference
   SUBROUTINE propagate_cn_or_em(qs_env, error)
 ^
struct array2_integer(kind=4)

struct array2_integer(kind=4)

# VUSE <.MEM_816>
D.79090_655 = D.79094_651->orders.dim[1].stride;

/data/vjoost/clean/cp2k/cp2k/src/../src/rt_propagation_methods.F:217:0: error:
type mismatch in component reference
   SUBROUTINE propagate_cn_or_em(qs_env, error)
 ^
struct array2_integer(kind=4)

struct array2_integer(kind=4)

# VUSE <.MEM_827>
D.79093_675 = D.79094_674->orders.data;

/data/vjoost/clean/cp2k/cp2k/src/../src/rt_propagation_methods.F:217:0: error:
type mismatch in component reference
   SUBROUTINE propagate_cn_or_em(qs_env, error)
 ^
struct array2_integer(kind=4)

struct array2_integer(kind=4)

# VUSE <.MEM_827>
D.79092_676 = D.79094_674->orders.offset;

/data/vjoost/clean/cp2k/cp2k/src/../src/rt_propagation_methods.F:217:0: error:
type mismatch in component reference
   SUBROUTINE propagate_cn_or_em(qs_env, error)
 ^
struct array2_integer(kind=4)

struct array2_integer(kind=4)

# VUSE <.MEM_827>
D.79090_678 = D.79094_674->orders.dim[1].stride;

/data/vjoost/clean/cp2k/cp2k/src/../src/rt_propagation_methods.F:217:0: error:
type mismatch in component reference
   SUBROUTINE propagate_cn_or_em(qs_env, error)
 ^
struct array2_integer(kind=4)

struct array2_integer(kind=4)

# VUSE <.MEM_838>
D.79093_700 = D.79094_699->orders.data;

/data/vjoost/clean/cp2k/cp2k/src/../src/rt_propagation_methods.F:217:0: error:
type mismatch in component reference
   SUBROUTINE propagate_cn_or_em(qs_env, error)
 ^
struct array2_integer(kind=4)

struct array2_integer(kind=4)

# VUSE <.MEM_838>
D.79092_701 = D.79094_699->orders.offset;

/data/vjoost/clean/cp2k/cp2k/src/../src/rt_propagation_methods.F:217:0: error:
type mismatch in component reference
   SUBROUTINE propagate_cn_or_em(qs_env, error)
 ^
struct array2_integer(kind=4)

struct array2_integer(kind=4)

# VUSE <.MEM_838>
D.79090_703 = D.79094_699->orders.dim[1].stride;

/data/vjoost/clean/cp2k/cp2k/src/../src/rt_propagation_methods.F:217:0:
internal compiler error: verify_gimple failed
   SUBROUTINE propagate_cn_or_em(qs_env, error)
 ^
Please submit a full bug report,
with preprocessed source if appropriate.
See <http://gcc.gnu.org/bugs.html> for instructions.
lto-wrapper: gfortran returned 1 exit status
/data/vjoost/gnu/binutils-2.22/install/bin/ld: lto-wrapper failed
collect2: error: ld returned 1 exit status



* [Bug lto/49700] LTO compile time hog
  2011-07-11  6:29 [Bug lto/49700] New: LTO compile time hog Joost.VandeVondele at pci dot uzh.ch
                   ` (7 preceding siblings ...)
  2012-05-07 19:18 ` Joost.VandeVondele at mat dot ethz.ch
@ 2012-05-08 18:59 ` Joost.VandeVondele at mat dot ethz.ch
  8 siblings, 0 replies; 10+ messages in thread
From: Joost.VandeVondele at mat dot ethz.ch @ 2012-05-08 18:59 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=49700

Joost VandeVondele <Joost.VandeVondele at mat dot ethz.ch> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|WAITING                     |RESOLVED
         Resolution|                            |FIXED

--- Comment #9 from Joost VandeVondele <Joost.VandeVondele at mat dot ethz.ch> 2012-05-08 18:52:12 UTC ---
Trying 4.7.X instead, it actually looks very reasonable now.

Using -flto=jobserver -fuse-linker-plugin -ftime-report -O3 -march=native
-ffast-math -g -ffree-form

I get CP2K to build in 4min on a 32-core server. The time report also looks
OK. I'll close this PR as fixed (the issue with 4.8 is tracked in PR 45586).


