public inbox for gcc-bugs@sourceware.org
* [Bug middle-end/45422] [4.6 Regression] compile time increases 3x.
       [not found] <bug-45422-4@http.gcc.gnu.org/bugzilla/>
@ 2011-01-17 11:59 ` Joost.VandeVondele at pci dot uzh.ch
  2011-01-21 10:31 ` jakub at gcc dot gnu.org
                   ` (16 subsequent siblings)
  17 siblings, 0 replies; 28+ messages in thread
From: Joost.VandeVondele at pci dot uzh.ch @ 2011-01-17 11:59 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=45422

Joost VandeVondele <Joost.VandeVondele at pci dot uzh.ch> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
   Last reconfirmed|2010-08-29 09:25:52         |2011-01-17 9:25:52

--- Comment #27 from Joost VandeVondele <Joost.VandeVondele at pci dot uzh.ch> 2011-01-17 11:38:36 UTC ---
timings with current trunk (release checking). 

out.4_3
 TOTAL                 :  34.62        0.43      35.27             837034 kB
out.4_5
 TOTAL                 :  45.30        0.70      46.02             897447 kB
out.trunk
 TOTAL                 : 165.89        0.99     166.97            1743679 kB

So compile time is up by about 5x and memory by about 2x relative to 4.3.
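
The ratios can be recomputed from the three TOTAL rows above. This is only a
sketch: the dictionary and the helper name ratio_vs are illustrative, and the
first TOTAL column is assumed to be usr seconds and the last one ggc memory
in kB.

```python
# (usr seconds, ggc kB) copied from the TOTAL rows of out.4_3, out.4_5, out.trunk
totals = {
    "4.3": (34.62, 837034),
    "4.5": (45.30, 897447),
    "trunk": (165.89, 1743679),
}

def ratio_vs(base, target="trunk"):
    """Return (time ratio, memory ratio) of `target` relative to `base`."""
    t0, m0 = totals[base]
    t1, m1 = totals[target]
    return round(t1 / t0, 2), round(m1 / m0, 2)

print("trunk vs 4.3:", ratio_vs("4.3"))
print("trunk vs 4.5:", ratio_vs("4.5"))
```

Relative to 4.5 the same numbers give a smaller factor than relative to 4.3,
which matters for whether this counts as a regression from the previous
release.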



* [Bug middle-end/45422] [4.6 Regression] compile time increases 3x.
       [not found] <bug-45422-4@http.gcc.gnu.org/bugzilla/>
  2011-01-17 11:59 ` [Bug middle-end/45422] [4.6 Regression] compile time increases 3x Joost.VandeVondele at pci dot uzh.ch
@ 2011-01-21 10:31 ` jakub at gcc dot gnu.org
  2011-01-21 16:46 ` xinliangli at gmail dot com
                   ` (15 subsequent siblings)
  17 siblings, 0 replies; 28+ messages in thread
From: jakub at gcc dot gnu.org @ 2011-01-21 10:31 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=45422

Jakub Jelinek <jakub at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |jakub at gcc dot gnu.org

--- Comment #28 from Jakub Jelinek <jakub at gcc dot gnu.org> 2011-01-21 09:50:25 UTC ---
David, any progress with this?



* [Bug middle-end/45422] [4.6 Regression] compile time increases 3x.
       [not found] <bug-45422-4@http.gcc.gnu.org/bugzilla/>
  2011-01-17 11:59 ` [Bug middle-end/45422] [4.6 Regression] compile time increases 3x Joost.VandeVondele at pci dot uzh.ch
  2011-01-21 10:31 ` jakub at gcc dot gnu.org
@ 2011-01-21 16:46 ` xinliangli at gmail dot com
  2011-01-21 20:08 ` xinliangli at gmail dot com
                   ` (14 subsequent siblings)
  17 siblings, 0 replies; 28+ messages in thread
From: xinliangli at gmail dot com @ 2011-01-21 16:46 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=45422

--- Comment #29 from davidxl <xinliangli at gmail dot com> 2011-01-21 16:27:43 UTC ---
(In reply to comment #28)
> David, any progress with this?

The cost function fix to make sure the solution set does not become too big will
probably be very involved and won't be available in the 4.6 time frame. I will get
a workaround using Richard's suggestion -- terminate the iterating loop when slow
convergence is detected and some limit is reached.

David
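
The workaround described above (stop iterating once convergence is slow and a
limit has been reached) can be sketched generically. This is not GCC's ivopts
code: the function name, the cost/step callbacks and the min_gain and
max_iters parameters are all hypothetical, chosen only to show the shape of
such a loop.

```python
def improve_until_converged(cost, step, state, max_iters=5, min_gain=0.01):
    """Repeat `step` while it lowers `cost` (hypothetical sketch).

    Stops when a step no longer improves the cost, or when the relative
    gain of the last step is below `min_gain` (slow convergence) and at
    least `max_iters` improving steps have already been taken.
    """
    best = cost(state)
    iters = 0
    while True:
        candidate = step(state)
        new = cost(candidate)
        if new >= best:
            break  # no improvement: converged
        gain = (best - new) / best if best else 0.0
        state, best = candidate, new
        iters += 1
        if gain < min_gain and iters >= max_iters:
            break  # slow convergence and the limit is reached
    return state, iters
```

A loop that is still making large gains is allowed to continue past
max_iters; only slowly converging runs are cut off.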



* [Bug middle-end/45422] [4.6 Regression] compile time increases 3x.
       [not found] <bug-45422-4@http.gcc.gnu.org/bugzilla/>
                   ` (2 preceding siblings ...)
  2011-01-21 16:46 ` xinliangli at gmail dot com
@ 2011-01-21 20:08 ` xinliangli at gmail dot com
  2011-01-21 21:01 ` xinliangli at gmail dot com
                   ` (13 subsequent siblings)
  17 siblings, 0 replies; 28+ messages in thread
From: xinliangli at gmail dot com @ 2011-01-21 20:08 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=45422

--- Comment #30 from davidxl <xinliangli at gmail dot com> 2011-01-21 19:58:41 UTC ---
(In reply to comment #29)
> (In reply to comment #28)
> > David, any progress with this?
> 
> The cost function fix to make sure the solution set does not become too big will
> probably be very involved and won't be available in the 4.6 time frame. I will get
> a workaround using Richard's suggestion -- terminate the iterating loop when slow
> convergence is detected and some limit is reached.
> 
> David


Two observations:
1) I cannot reproduce Joost's timing -- see below. Can someone else
measure the time independently?

2) Limiting the iteration count of the ivopt improvement loop does not help
much: going from unlimited (which can be ~40 iterations in this case) to a
maximum of 5 iterations only cuts total compile time by 2s.


The following is the timing of the trunk compiler. Options: 
-O2 -ftime-report -cpp -fbounds-check -g -O3 -ffast-math -funroll-loops
-ftree-vectorize -march=native -ffree-form 

 parser                :   0.67 ( 1%) usr   0.09 ( 6%) sys   0.77 ( 1%) wall   53556 kB ( 5%) ggc
 inline heuristics     :   0.11 ( 0%) usr   0.00 ( 0%) sys   0.11 ( 0%) wall       0 kB ( 0%) ggc
 tree gimplify         :   0.35 ( 1%) usr   0.03 ( 2%) sys   0.38 ( 1%) wall   48426 kB ( 4%) ggc
 tree eh               :   0.02 ( 0%) usr   0.00 ( 0%) sys   0.02 ( 0%) wall       0 kB ( 0%) ggc
 tree CFG construction :   0.03 ( 0%) usr   0.00 ( 0%) sys   0.03 ( 0%) wall   11978 kB ( 1%) ggc
 tree CFG cleanup      :   0.68 ( 1%) usr   0.02 ( 1%) sys   0.64 ( 1%) wall    2484 kB ( 0%) ggc
 tree VRP              :   0.83 ( 1%) usr   0.02 ( 1%) sys   1.28 ( 2%) wall   64371 kB ( 6%) ggc
 tree copy propagation :   0.16 ( 0%) usr   0.00 ( 0%) sys   0.16 ( 0%) wall    1267 kB ( 0%) ggc
 tree find ref. vars   :   0.06 ( 0%) usr   0.00 ( 0%) sys   0.06 ( 0%) wall    3806 kB ( 0%) ggc
 tree PTA              :   0.82 ( 1%) usr   0.00 ( 0%) sys   0.80 ( 1%) wall    5497 kB ( 0%) ggc
 tree PHI insertion    :   0.03 ( 0%) usr   0.00 ( 0%) sys   0.04 ( 0%) wall    3194 kB ( 0%) ggc
 tree SSA rewrite      :   0.23 ( 0%) usr   0.01 ( 1%) sys   0.21 ( 0%) wall   14021 kB ( 1%) ggc
 tree SSA other        :   0.06 ( 0%) usr   0.01 ( 1%) sys   0.09 ( 0%) wall     435 kB ( 0%) ggc
 tree SSA incremental  :   0.65 ( 1%) usr   0.02 ( 1%) sys   0.65 ( 1%) wall    6735 kB ( 1%) ggc
 tree operand scan     :   0.37 ( 1%) usr   0.14 ( 9%) sys   0.53 ( 1%) wall   47156 kB ( 4%) ggc
 dominator optimization:   0.38 ( 1%) usr   0.02 ( 1%) sys   0.50 ( 1%) wall    6948 kB ( 1%) ggc
 tree SRA              :   0.01 ( 0%) usr   0.00 ( 0%) sys   0.01 ( 0%) wall       0 kB ( 0%) ggc
 tree CCP              :   0.93 ( 1%) usr   0.01 ( 1%) sys   1.02 ( 2%) wall    4975 kB ( 0%) ggc
 tree PHI const/copy prop:   0.00 ( 0%) usr   0.00 ( 0%) sys   0.01 ( 0%) wall     124 kB ( 0%) ggc
 tree split crit edges :   0.01 ( 0%) usr   0.00 ( 0%) sys   0.01 ( 0%) wall    1743 kB ( 0%) ggc
 tree reassociation    :   0.05 ( 0%) usr   0.00 ( 0%) sys   0.20 ( 0%) wall    5095 kB ( 0%) ggc
 tree PRE              :   0.64 ( 1%) usr   0.00 ( 0%) sys   0.64 ( 1%) wall    9790 kB ( 1%) ggc
 tree FRE              :   0.28 ( 0%) usr   0.00 ( 0%) sys   0.31 ( 0%) wall    5410 kB ( 0%) ggc
 tree code sinking     :   0.04 ( 0%) usr   0.00 ( 0%) sys   0.05 ( 0%) wall     956 kB ( 0%) ggc
 tree linearize phis   :   0.01 ( 0%) usr   0.00 ( 0%) sys   0.02 ( 0%) wall       0 kB ( 0%) ggc
 tree forward propagate:   0.17 ( 0%) usr   0.00 ( 0%) sys   0.16 ( 0%) wall   11005 kB ( 1%) ggc
 tree phiprop          :   0.02 ( 0%) usr   0.00 ( 0%) sys   0.01 ( 0%) wall       0 kB ( 0%) ggc
 tree conservative DCE :   0.04 ( 0%) usr   0.02 ( 1%) sys   0.06 ( 0%) wall     944 kB ( 0%) ggc
 tree aggressive DCE   :   0.31 ( 0%) usr   0.03 ( 2%) sys   0.40 ( 1%) wall   15336 kB ( 1%) ggc
 tree DSE              :   0.05 ( 0%) usr   0.00 ( 0%) sys   0.07 ( 0%) wall     225 kB ( 0%) ggc
 tree loop bounds      :   0.16 ( 0%) usr   0.00 ( 0%) sys   0.17 ( 0%) wall    6744 kB ( 1%) ggc
 tree loop invariant motion:   0.05 ( 0%) usr   0.00 ( 0%) sys   0.08 ( 0%) wall     485 kB ( 0%) ggc
 tree canonical iv     :   0.03 ( 0%) usr   0.00 ( 0%) sys   0.04 ( 0%) wall    3128 kB ( 0%) ggc
 scev constant prop    :   0.04 ( 0%) usr   0.01 ( 1%) sys   0.03 ( 0%) wall    1924 kB ( 0%) ggc
 complete unrolling    :   0.79 ( 1%) usr   0.05 ( 3%) sys   0.85 ( 1%) wall   91364 kB ( 8%) ggc
 tree vectorization    :   0.34 ( 1%) usr   0.00 ( 0%) sys   0.37 ( 1%) wall   25117 kB ( 2%) ggc
 tree slp vectorization:   0.41 ( 1%) usr   0.00 ( 0%) sys   0.35 ( 1%) wall   19256 kB ( 2%) ggc
 tree loop distribution:   0.04 ( 0%) usr   0.00 ( 0%) sys   0.06 ( 0%) wall     850 kB ( 0%) ggc
 tree iv optimization  :  11.14 (18%) usr   0.33 (22%) sys  12.24 (18%) wall  141300 kB (12%) ggc
 predictive commoning  :   0.07 ( 0%) usr   0.00 ( 0%) sys   0.09 ( 0%) wall    2696 kB ( 0%) ggc
 tree loop init        :   0.02 ( 0%) usr   0.00 ( 0%) sys   0.02 ( 0%) wall    1220 kB ( 0%) ggc
 tree loop fini        :   0.00 ( 0%) usr   0.00 ( 0%) sys   0.01 ( 0%) wall       0 kB ( 0%) ggc
 tree copy headers     :   0.01 ( 0%) usr   0.00 ( 0%) sys   0.01 ( 0%) wall    1652 kB ( 0%) ggc
 tree SSA uncprop      :   0.00 ( 0%) usr   0.00 ( 0%) sys   0.01 ( 0%) wall       0 kB ( 0%) ggc
 tree rename SSA copies:   0.03 ( 0%) usr   0.00 ( 0%) sys   0.05 ( 0%) wall       0 kB ( 0%) ggc
 dominance frontiers   :   0.02 ( 0%) usr   0.00 ( 0%) sys   0.03 ( 0%) wall       0 kB ( 0%) ggc
 dominance computation :   0.28 ( 0%) usr   0.00 ( 0%) sys   0.32 ( 0%) wall       0 kB ( 0%) ggc
 control dependences   :   0.01 ( 0%) usr   0.00 ( 0%) sys   0.02 ( 0%) wall       0 kB ( 0%) ggc
 out of ssa            :   0.08 ( 0%) usr   0.00 ( 0%) sys   0.13 ( 0%) wall     130 kB ( 0%) ggc
 expand vars           :   0.09 ( 0%) usr   0.01 ( 1%) sys   0.10 ( 0%) wall    9013 kB ( 1%) ggc
 expand                :   0.40 ( 1%) usr   0.01 ( 1%) sys   0.52 ( 1%) wall   57975 kB ( 5%) ggc
 post expand cleanups  :   0.02 ( 0%) usr   0.00 ( 0%) sys   0.03 ( 0%) wall    3355 kB ( 0%) ggc
 lower subreg          :   0.08 ( 0%) usr   0.00 ( 0%) sys   0.06 ( 0%) wall       0 kB ( 0%) ggc
 forward prop          :   0.33 ( 1%) usr   0.00 ( 0%) sys   0.33 ( 0%) wall   11129 kB ( 1%) ggc
 CSE                   :   0.97 ( 2%) usr   0.01 ( 1%) sys   0.99 ( 1%) wall     207 kB ( 0%) ggc
 dead code elimination :   0.26 ( 0%) usr   0.00 ( 0%) sys   0.32 ( 0%) wall       0 kB ( 0%) ggc
 dead store elim1      :   0.49 ( 1%) usr   0.00 ( 0%) sys   0.45 ( 1%) wall   11519 kB ( 1%) ggc
 dead store elim2      :   0.41 ( 1%) usr   0.00 ( 0%) sys   0.46 ( 1%) wall   13060 kB ( 1%) ggc
 loop analysis         :   0.03 ( 0%) usr   0.01 ( 1%) sys   0.02 ( 0%) wall    1626 kB ( 0%) ggc
 loop invariant motion :   0.09 ( 0%) usr   0.00 ( 0%) sys   0.14 ( 0%) wall     505 kB ( 0%) ggc
 loop unswitching      :   0.07 ( 0%) usr   0.00 ( 0%) sys   0.07 ( 0%) wall       0 kB ( 0%) ggc
 loop unrolling        :   1.59 ( 3%) usr   0.02 ( 1%) sys   1.64 ( 2%) wall  102158 kB ( 9%) ggc
 CPROP                 :   0.69 ( 1%) usr   0.02 ( 1%) sys   0.77 ( 1%) wall   13208 kB ( 1%) ggc
 PRE                   :   0.58 ( 1%) usr   0.00 ( 0%) sys   0.58 ( 1%) wall    1030 kB ( 0%) ggc
 web                   :   0.20 ( 0%) usr   0.00 ( 0%) sys   0.15 ( 0%) wall    2961 kB ( 0%) ggc
 CSE 2                 :   0.87 ( 1%) usr   0.01 ( 1%) sys   1.08 ( 2%) wall    1246 kB ( 0%) ggc
 branch prediction     :   0.12 ( 0%) usr   0.02 ( 1%) sys   0.14 ( 0%) wall    6859 kB ( 1%) ggc
 combiner              :   1.75 ( 3%) usr   0.03 ( 2%) sys   1.70 ( 3%) wall   39971 kB ( 3%) ggc
 if-conversion         :   0.02 ( 0%) usr   0.00 ( 0%) sys   0.01 ( 0%) wall    1398 kB ( 0%) ggc
 regmove               :   0.26 ( 0%) usr   0.00 ( 0%) sys   0.25 ( 0%) wall       0 kB ( 0%) ggc
 integrated RA         :   3.24 ( 5%) usr   0.01 ( 1%) sys   3.39 ( 5%) wall   24873 kB ( 2%) ggc
 reload                :   1.72 ( 3%) usr   0.00 ( 0%) sys   1.72 ( 3%) wall    8401 kB ( 1%) ggc
 reload CSE regs       :   1.93 ( 3%) usr   0.00 ( 0%) sys   1.75 ( 3%) wall   19943 kB ( 2%) ggc
 load CSE after reload :   0.14 ( 0%) usr   0.00 ( 0%) sys   0.14 ( 0%) wall     487 kB ( 0%) ggc
 zee                   :   0.04 ( 0%) usr   0.00 ( 0%) sys   0.05 ( 0%) wall      31 kB ( 0%) ggc
 thread pro- & epilogue:   0.06 ( 0%) usr   0.00 ( 0%) sys   0.05 ( 0%) wall    3614 kB ( 0%) ggc
 combine stack adjustments:   0.02 ( 0%) usr   0.00 ( 0%) sys   0.02 ( 0%) wall       0 kB ( 0%) ggc
 peephole 2            :   0.07 ( 0%) usr   0.00 ( 0%) sys   0.08 ( 0%) wall    1907 kB ( 0%) ggc
 rename registers      :   0.47 ( 1%) usr   0.00 ( 0%) sys   0.49 ( 1%) wall    2169 kB ( 0%) ggc
 hard reg cprop        :   0.45 ( 1%) usr   0.00 ( 0%) sys   0.45 ( 1%) wall      22 kB ( 0%) ggc
 scheduling 2          :   4.47 ( 7%) usr   0.01 ( 1%) sys   4.35 ( 6%) wall    1114 kB ( 0%) ggc
 machine dep reorg     :   0.35 ( 1%) usr   0.00 ( 0%) sys   0.39 ( 1%) wall      22 kB ( 0%) ggc
 reorder blocks        :   0.27 ( 0%) usr   0.00 ( 0%) sys   0.26 ( 0%) wall    3129 kB ( 0%) ggc
 final                 :   0.82 ( 1%) usr   0.03 ( 2%) sys   0.83 ( 1%) wall    8473 kB ( 1%) ggc
 symout                :   0.33 ( 1%) usr   0.00 ( 0%) sys   0.33 ( 0%) wall   53120 kB ( 5%) ggc
 variable tracking     :   1.34 ( 2%) usr   0.01 ( 1%) sys   1.42 ( 2%) wall   37182 kB ( 3%) ggc
 var-tracking dataflow :   2.11 ( 3%) usr   0.00 ( 0%) sys   2.28 ( 3%) wall       0 kB ( 0%) ggc
 var-tracking emit     :   2.01 ( 3%) usr   0.00 ( 0%) sys   1.89 ( 3%) wall   18854 kB ( 2%) ggc
 rest of compilation   :   3.44 ( 5%) usr   0.33 (22%) sys   3.46 ( 5%) wall    8050 kB ( 1%) ggc
 remove unused locals  :   0.47 ( 1%) usr   0.01 ( 1%) sys   0.62 ( 1%) wall       0 kB ( 0%) ggc
 address taken         :   0.03 ( 0%) usr   0.00 ( 0%) sys   0.02 ( 0%) wall       0 kB ( 0%) ggc
 unaccounted todo      :   0.98 ( 2%) usr   0.08 ( 5%) sys   1.27 ( 2%) wall       8 kB ( 0%) ggc
 repair loop structures:   0.07 ( 0%) usr   0.01 ( 1%) sys   0.07 ( 0%) wall    4127 kB ( 0%) ggc
 TOTAL                 :  63.40             1.53            67.47            1152381 kB



* [Bug middle-end/45422] [4.6 Regression] compile time increases 3x.
       [not found] <bug-45422-4@http.gcc.gnu.org/bugzilla/>
                   ` (3 preceding siblings ...)
  2011-01-21 20:08 ` xinliangli at gmail dot com
@ 2011-01-21 21:01 ` xinliangli at gmail dot com
  2011-01-25  9:47 ` jakub at gcc dot gnu.org
                   ` (12 subsequent siblings)
  17 siblings, 0 replies; 28+ messages in thread
From: xinliangli at gmail dot com @ 2011-01-21 21:01 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=45422

--- Comment #31 from davidxl <xinliangli at gmail dot com> 2011-01-21 20:08:11 UTC ---
Comparing this timing with the 4.6 results (164s), it looks like many passes
other than ivopt have become slower (e.g. IRA increases from 3.5s to 11s);
ivopt accounts for only a small part of the 110s increase.

David


(In reply to comment #18)
> FYI, these are the 4.5 branch timings:
> 
> Execution times (seconds)
>  garbage collection    :   0.47 ( 1%) usr   0.00 ( 0%) sys   0.47 ( 1%) wall       0 kB ( 0%) ggc
>  callgraph construction:   0.05 ( 0%) usr   0.01 ( 1%) sys   0.09 ( 0%) wall    5996 kB ( 1%) ggc
>  callgraph optimization:   0.21 ( 0%) usr   0.02 ( 1%) sys   0.26 ( 0%) wall     606 kB ( 0%) ggc
>  ipa cp                :   0.09 ( 0%) usr   0.00 ( 0%) sys   0.08 ( 0%) wall    1381 kB ( 0%) ggc
>  ipa reference         :   0.06 ( 0%) usr   0.00 ( 0%) sys   0.06 ( 0%) wall       0 kB ( 0%) ggc
>  ipa pure const        :   0.06 ( 0%) usr   0.01 ( 1%) sys   0.09 ( 0%) wall       0 kB ( 0%) ggc
>  cfg cleanup           :   0.39 ( 1%) usr   0.00 ( 0%) sys   0.51 ( 1%) wall    2459 kB ( 0%) ggc
>  trivially dead code   :   0.34 ( 1%) usr   0.00 ( 0%) sys   0.30 ( 1%) wall       0 kB ( 0%) ggc
>  df multiple defs      :   0.08 ( 0%) usr   0.00 ( 0%) sys   0.13 ( 0%) wall       0 kB ( 0%) ggc
>  df reaching defs      :   0.33 ( 1%) usr   0.00 ( 0%) sys   0.27 ( 1%) wall       0 kB ( 0%) ggc
>  df live regs          :   2.08 ( 4%) usr   0.01 ( 1%) sys   2.19 ( 4%) wall       0 kB ( 0%) ggc
>  df live&initialized regs:   0.98 ( 2%) usr   0.00 ( 0%) sys   0.92 ( 2%) wall       0 kB ( 0%) ggc
>  df use-def / def-use chains:   0.24 ( 0%) usr   0.00 ( 0%) sys   0.19 ( 0%) wall       0 kB ( 0%) ggc
>  df reg dead/unused notes:   0.93 ( 2%) usr   0.00 ( 0%) sys   1.04 ( 2%) wall    5756 kB ( 1%) ggc
>  register information  :   0.51 ( 1%) usr   0.01 ( 1%) sys   0.39 ( 1%) wall       0 kB ( 0%) ggc
>  alias analysis        :   0.78 ( 1%) usr   0.01 ( 1%) sys   0.91 ( 2%) wall   22384 kB ( 3%) ggc
>  alias stmt walking    :   0.50 ( 1%) usr   0.03 ( 2%) sys   0.38 ( 1%) wall    5563 kB ( 1%) ggc
>  register scan         :   0.13 ( 0%) usr   0.00 ( 0%) sys   0.09 ( 0%) wall       0 kB ( 0%) ggc
>  rebuild jump labels   :   0.19 ( 0%) usr   0.00 ( 0%) sys   0.19 ( 0%) wall       0 kB ( 0%) ggc
>  parser                :   0.82 ( 2%) usr   0.13 ( 9%) sys   0.94 ( 2%) wall   55603 kB ( 6%) ggc
>  inline heuristics     :   0.20 ( 0%) usr   0.01 ( 1%) sys   0.16 ( 0%) wall       0 kB ( 0%) ggc
>  tree gimplify         :   0.38 ( 1%) usr   0.03 ( 2%) sys   0.40 ( 1%) wall   46588 kB ( 5%) ggc
>  tree eh               :   0.02 ( 0%) usr   0.00 ( 0%) sys   0.02 ( 0%) wall       0 kB ( 0%) ggc
>  tree CFG construction :   0.04 ( 0%) usr   0.02 ( 1%) sys   0.05 ( 0%) wall   11964 kB ( 1%) ggc
>  tree CFG cleanup      :   0.47 ( 1%) usr   0.00 ( 0%) sys   0.79 ( 1%) wall    1829 kB ( 0%) ggc
>  tree VRP              :   1.46 ( 3%) usr   0.05 ( 4%) sys   1.27 ( 2%) wall   56376 kB ( 6%) ggc
>  tree copy propagation :   0.09 ( 0%) usr   0.02 ( 1%) sys   0.22 ( 0%) wall     746 kB ( 0%) ggc
>  tree find ref. vars   :   0.09 ( 0%) usr   0.01 ( 1%) sys   0.07 ( 0%) wall    3806 kB ( 0%) ggc
>  tree PTA              :   0.30 ( 1%) usr   0.00 ( 0%) sys   0.33 ( 1%) wall    3836 kB ( 0%) ggc
>  tree PHI insertion    :   0.00 ( 0%) usr   0.00 ( 0%) sys   0.04 ( 0%) wall    3194 kB ( 0%) ggc
>  tree SSA rewrite      :   0.24 ( 0%) usr   0.01 ( 1%) sys   0.29 ( 1%) wall   13860 kB ( 2%) ggc
>  tree SSA other        :   0.13 ( 0%) usr   0.02 ( 1%) sys   0.11 ( 0%) wall     418 kB ( 0%) ggc
>  tree SSA incremental  :   0.89 ( 2%) usr   0.06 ( 4%) sys   0.97 ( 2%) wall    6811 kB ( 1%) ggc
>  tree operand scan     :   0.34 ( 1%) usr   0.23 (17%) sys   0.59 ( 1%) wall   44776 kB ( 5%) ggc
>  dominator optimization:   0.29 ( 1%) usr   0.01 ( 1%) sys   0.35 ( 1%) wall    5152 kB ( 1%) ggc
>  tree CCP              :   0.51 ( 1%) usr   0.02 ( 1%) sys   0.43 ( 1%) wall    4620 kB ( 1%) ggc
>  tree PHI const/copy prop:   0.01 ( 0%) usr   0.00 ( 0%) sys   0.01 ( 0%) wall     106 kB ( 0%) ggc
>  tree split crit edges :   0.00 ( 0%) usr   0.00 ( 0%) sys   0.00 ( 0%) wall    2019 kB ( 0%) ggc
>  tree reassociation    :   0.12 ( 0%) usr   0.01 ( 1%) sys   0.12 ( 0%) wall    2946 kB ( 0%) ggc
>  tree PRE              :   0.92 ( 2%) usr   0.00 ( 0%) sys   0.95 ( 2%) wall    7315 kB ( 1%) ggc
>  tree FRE              :   0.45 ( 1%) usr   0.04 ( 3%) sys   0.35 ( 1%) wall    5518 kB ( 1%) ggc
>  tree code sinking     :   0.03 ( 0%) usr   0.00 ( 0%) sys   0.03 ( 0%) wall    1400 kB ( 0%) ggc
>  tree linearize phis   :   0.02 ( 0%) usr   0.01 ( 1%) sys   0.01 ( 0%) wall       0 kB ( 0%) ggc
>  tree forward propagate:   0.18 ( 0%) usr   0.02 ( 1%) sys   0.16 ( 0%) wall   10006 kB ( 1%) ggc
>  tree conservative DCE :   0.05 ( 0%) usr   0.01 ( 1%) sys   0.13 ( 0%) wall     576 kB ( 0%) ggc
>  tree aggressive DCE   :   0.28 ( 1%) usr   0.01 ( 1%) sys   0.37 ( 1%) wall    8853 kB ( 1%) ggc
>  tree buildin call DCE :   0.01 ( 0%) usr   0.00 ( 0%) sys   0.00 ( 0%) wall       0 kB ( 0%) ggc
>  tree DSE              :   0.20 ( 0%) usr   0.00 ( 0%) sys   0.11 ( 0%) wall     132 kB ( 0%) ggc
>  PHI merge             :   0.01 ( 0%) usr   0.00 ( 0%) sys   0.00 ( 0%) wall      37 kB ( 0%) ggc
>  tree loop bounds      :   0.22 ( 0%) usr   0.00 ( 0%) sys   0.18 ( 0%) wall    8266 kB ( 1%) ggc
>  tree loop invariant motion:   0.06 ( 0%) usr   0.00 ( 0%) sys   0.08 ( 0%) wall      67 kB ( 0%) ggc
>  tree canonical iv     :   0.07 ( 0%) usr   0.00 ( 0%) sys   0.09 ( 0%) wall    4779 kB ( 1%) ggc
>  scev constant prop    :   0.08 ( 0%) usr   0.00 ( 0%) sys   0.07 ( 0%) wall    2345 kB ( 0%) ggc
>  tree loop unswitching :   0.00 ( 0%) usr   0.00 ( 0%) sys   0.02 ( 0%) wall     573 kB ( 0%) ggc
>  complete unrolling    :   1.05 ( 2%) usr   0.11 ( 8%) sys   1.39 ( 3%) wall   98553 kB (11%) ggc
>  tree vectorization    :   0.01 ( 0%) usr   0.00 ( 0%) sys   0.01 ( 0%) wall     883 kB ( 0%) ggc
>  tree slp vectorization:   0.61 ( 1%) usr   0.00 ( 0%) sys   0.60 ( 1%) wall   53236 kB ( 6%) ggc
>  tree iv optimization  :   5.80 (11%) usr   0.06 ( 4%) sys   5.94 (11%) wall   95356 kB (11%) ggc
>  predictive commoning  :   0.03 ( 0%) usr   0.00 ( 0%) sys   0.03 ( 0%) wall    1054 kB ( 0%) ggc
>  tree loop init        :   0.05 ( 0%) usr   0.00 ( 0%) sys   0.02 ( 0%) wall    1339 kB ( 0%) ggc
>  tree copy headers     :   0.01 ( 0%) usr   0.00 ( 0%) sys   0.03 ( 0%) wall    1613 kB ( 0%) ggc
>  tree SSA uncprop      :   0.03 ( 0%) usr   0.00 ( 0%) sys   0.01 ( 0%) wall       0 kB ( 0%) ggc
>  tree rename SSA copies:   0.06 ( 0%) usr   0.00 ( 0%) sys   0.03 ( 0%) wall       0 kB ( 0%) ggc
>  dominance frontiers   :   0.04 ( 0%) usr   0.00 ( 0%) sys   0.05 ( 0%) wall       0 kB ( 0%) ggc
>  dominance computation :   0.23 ( 0%) usr   0.00 ( 0%) sys   0.26 ( 0%) wall       0 kB ( 0%) ggc
>  expand                :   3.24 ( 6%) usr   0.07 ( 5%) sys   3.34 ( 6%) wall   69633 kB ( 8%) ggc
>  lower subreg          :   0.02 ( 0%) usr   0.00 ( 0%) sys   0.00 ( 0%) wall       0 kB ( 0%) ggc
>  forward prop          :   0.48 ( 1%) usr   0.01 ( 1%) sys   0.48 ( 1%) wall    9984 kB ( 1%) ggc
>  CSE                   :   0.73 ( 1%) usr   0.00 ( 0%) sys   0.92 ( 2%) wall     248 kB ( 0%) ggc
>  dead code elimination :   0.24 ( 0%) usr   0.00 ( 0%) sys   0.28 ( 1%) wall       0 kB ( 0%) ggc
>  dead store elim1      :   0.33 ( 1%) usr   0.01 ( 1%) sys   0.32 ( 1%) wall    5987 kB ( 1%) ggc
>  dead store elim2      :   0.44 ( 1%) usr   0.02 ( 1%) sys   0.39 ( 1%) wall    7831 kB ( 1%) ggc
>  loop analysis         :   0.01 ( 0%) usr   0.00 ( 0%) sys   0.02 ( 0%) wall     718 kB ( 0%) ggc
>  loop invariant motion :   0.05 ( 0%) usr   0.00 ( 0%) sys   0.07 ( 0%) wall     305 kB ( 0%) ggc
>  loop unswitching      :   0.01 ( 0%) usr   0.00 ( 0%) sys   0.02 ( 0%) wall       0 kB ( 0%) ggc
>  loop unrolling        :   0.65 ( 1%) usr   0.00 ( 0%) sys   0.62 ( 1%) wall   32780 kB ( 4%) ggc
>  CPROP                 :   0.70 ( 1%) usr   0.00 ( 0%) sys   0.60 ( 1%) wall    7825 kB ( 1%) ggc
>  PRE                   :   0.32 ( 1%) usr   0.00 ( 0%) sys   0.33 ( 1%) wall     719 kB ( 0%) ggc
>  web                   :   0.13 ( 0%) usr   0.00 ( 0%) sys   0.07 ( 0%) wall     594 kB ( 0%) ggc
>  CSE 2                 :   0.75 ( 1%) usr   0.01 ( 1%) sys   0.60 ( 1%) wall     470 kB ( 0%) ggc
>  branch prediction     :   0.19 ( 0%) usr   0.01 ( 1%) sys   0.14 ( 0%) wall    7344 kB ( 1%) ggc
>  combiner              :   1.19 ( 2%) usr   0.01 ( 1%) sys   1.33 ( 2%) wall   19980 kB ( 2%) ggc
>  if-conversion         :   0.04 ( 0%) usr   0.00 ( 0%) sys   0.03 ( 0%) wall     746 kB ( 0%) ggc
>  regmove               :   0.37 ( 1%) usr   0.01 ( 1%) sys   0.33 ( 1%) wall       0 kB ( 0%) ggc
>  integrated RA         :   3.51 ( 7%) usr   0.01 ( 1%) sys   3.74 ( 7%) wall   12746 kB ( 1%) ggc
>  reload                :   2.16 ( 4%) usr   0.02 ( 1%) sys   2.01 ( 4%) wall    7755 kB ( 1%) ggc
>  reload CSE regs       :   1.38 ( 3%) usr   0.00 ( 0%) sys   1.26 ( 2%) wall   12331 kB ( 1%) ggc
>  load CSE after reload :   0.18 ( 0%) usr   0.00 ( 0%) sys   0.14 ( 0%) wall     162 kB ( 0%) ggc
>  thread pro- & epilogue:   0.11 ( 0%) usr   0.00 ( 0%) sys   0.12 ( 0%) wall    4370 kB ( 0%) ggc
>  if-conversion 2       :   0.03 ( 0%) usr   0.00 ( 0%) sys   0.02 ( 0%) wall     357 kB ( 0%) ggc
>  combine stack adjustments:   0.03 ( 0%) usr   0.00 ( 0%) sys   0.02 ( 0%) wall       0 kB ( 0%) ggc
>  peephole 2            :   0.12 ( 0%) usr   0.00 ( 0%) sys   0.16 ( 0%) wall    1899 kB ( 0%) ggc
>  rename registers      :   0.46 ( 1%) usr   0.00 ( 0%) sys   0.55 ( 1%) wall    2237 kB ( 0%) ggc
>  hard reg cprop        :   0.37 ( 1%) usr   0.00 ( 0%) sys   0.48 ( 1%) wall      13 kB ( 0%) ggc
>  scheduling 2          :   3.30 ( 6%) usr   0.04 ( 3%) sys   3.10 ( 6%) wall    1216 kB ( 0%) ggc
>  machine dep reorg     :   0.38 ( 1%) usr   0.00 ( 0%) sys   0.36 ( 1%) wall      11 kB ( 0%) ggc
>  reorder blocks        :   0.15 ( 0%) usr   0.00 ( 0%) sys   0.13 ( 0%) wall    1283 kB ( 0%) ggc
>  final                 :   0.93 ( 2%) usr   0.07 ( 5%) sys   0.84 ( 2%) wall    6610 kB ( 1%) ggc
>  symout                :   0.30 ( 1%) usr   0.03 ( 2%) sys   0.34 ( 1%) wall   27006 kB ( 3%) ggc
>  variable tracking     :   3.86 ( 7%) usr   0.03 ( 2%) sys   3.99 ( 7%) wall   39804 kB ( 4%) ggc
>  plugin execution      :   0.00 ( 0%) usr   0.01 ( 1%) sys   0.05 ( 0%) wall       0 kB ( 0%) ggc
>  rest of compilation   :   0.00 ( 0%) usr   0.01 ( 1%) sys   0.00 ( 0%) wall       0 kB ( 0%) ggc
>  TOTAL                 :  52.50             1.37            53.88             893901 kB



* [Bug middle-end/45422] [4.6 Regression] compile time increases 3x.
       [not found] <bug-45422-4@http.gcc.gnu.org/bugzilla/>
                   ` (4 preceding siblings ...)
  2011-01-21 21:01 ` xinliangli at gmail dot com
@ 2011-01-25  9:47 ` jakub at gcc dot gnu.org
  2011-01-25  9:51 ` Joost.VandeVondele at pci dot uzh.ch
                   ` (11 subsequent siblings)
  17 siblings, 0 replies; 28+ messages in thread
From: jakub at gcc dot gnu.org @ 2011-01-25  9:47 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=45422

--- Comment #32 from Jakub Jelinek <jakub at gcc dot gnu.org> 2011-01-25 09:02:57 UTC ---
IMHO for P1 purposes we should just look at compile time regressions from 4.5
here at this point.  On the #c1 testcase I get with --enable-checking=release
current trunk and current 4.5 branch on x86_64-linux:

4.6 x86_64 -m64 -O3 -fbounds-check -ftime-report
 df live regs          :   1.87 ( 3%) usr   0.02 ( 1%) sys   1.66 ( 3%) wall       0 kB ( 0%) ggc
 parser                :   1.04 ( 2%) usr   0.20 ( 9%) sys   1.24 ( 2%) wall   53425 kB ( 6%) ggc
 tree VRP              :   1.82 ( 3%) usr   0.09 ( 4%) sys   2.02 ( 3%) wall   63870 kB ( 8%) ggc
 tree PTA              :   1.02 ( 2%) usr   0.01 ( 0%) sys   0.98 ( 2%) wall    5498 kB ( 1%) ggc
 tree SSA incremental  :   1.23 ( 2%) usr   0.12 ( 6%) sys   1.11 ( 2%) wall    6733 kB ( 1%) ggc
 tree CCP              :   1.33 ( 2%) usr   0.03 ( 1%) sys   1.33 ( 2%) wall    4989 kB ( 1%) ggc
 complete unrolling    :   1.07 ( 2%) usr   0.16 ( 8%) sys   1.28 ( 2%) wall   88755 kB (11%) ggc
 tree iv optimization  :  10.99 (19%) usr   0.09 ( 4%) sys  11.09 (19%) wall  138994 kB (16%) ggc
 CSE                   :   1.28 ( 2%) usr   0.01 ( 0%) sys   1.28 ( 2%) wall     229 kB ( 0%) ggc
 combiner              :   2.00 ( 3%) usr   0.00 ( 0%) sys   1.95 ( 3%) wall   31554 kB ( 4%) ggc
 integrated RA         :   3.68 ( 6%) usr   0.01 ( 0%) sys   3.78 ( 6%) wall   19906 kB ( 2%) ggc
 reload                :   2.04 ( 4%) usr   0.00 ( 0%) sys   2.18 ( 4%) wall    7106 kB ( 1%) ggc
 reload CSE regs       :   2.04 ( 4%) usr   0.02 ( 1%) sys   2.01 ( 3%) wall   12188 kB ( 1%) ggc
 scheduling 2          :   2.55 ( 4%) usr   0.01 ( 0%) sys   2.61 ( 4%) wall     895 kB ( 0%) ggc
 TOTAL                 :  57.47             2.11            59.60             845009 kB

4.5 x86_64 -m64 -O3 -fbounds-check -ftime-report
 df live regs          :   1.58 ( 4%) usr   0.00 ( 0%) sys   1.39 ( 3%) wall       0 kB ( 0%) ggc
 parser                :   1.02 ( 2%) usr   0.18 ( 9%) sys   1.21 ( 3%) wall   55472 kB ( 7%) ggc
 tree VRP              :   1.39 ( 3%) usr   0.13 ( 6%) sys   1.73 ( 4%) wall   56478 kB ( 8%) ggc
 tree PRE              :   1.03 ( 2%) usr   0.04 ( 2%) sys   1.24 ( 3%) wall    7286 kB ( 1%) ggc
 complete unrolling    :   1.32 ( 3%) usr   0.21 (10%) sys   1.55 ( 3%) wall   91137 kB (12%) ggc
 tree iv optimization  :   5.45 (12%) usr   0.09 ( 4%) sys   5.43 (12%) wall   95576 kB (13%) ggc
 expand                :   2.62 ( 6%) usr   0.16 ( 8%) sys   2.76 ( 6%) wall   58104 kB ( 8%) ggc
 CSE                   :   1.18 ( 3%) usr   0.01 ( 0%) sys   0.94 ( 2%) wall     261 kB ( 0%) ggc
 combiner              :   1.53 ( 3%) usr   0.00 ( 0%) sys   1.48 ( 3%) wall   19953 kB ( 3%) ggc
 integrated RA         :   3.21 ( 7%) usr   0.00 ( 0%) sys   3.55 ( 8%) wall   11410 kB ( 2%) ggc
 reload                :   2.13 ( 5%) usr   0.04 ( 2%) sys   2.00 ( 4%) wall    7273 kB ( 1%) ggc
 reload CSE regs       :   1.67 ( 4%) usr   0.01 ( 0%) sys   1.55 ( 3%) wall   10032 kB ( 1%) ggc
 scheduling 2          :   2.65 ( 6%) usr   0.02 ( 1%) sys   2.66 ( 6%) wall    1063 kB ( 0%) ggc
 TOTAL                 :  44.55             2.05            46.62             747832 kB

4.6 x86_64 -m32 -O3 -fbounds-check -ftime-report
 df live regs          :   1.24 ( 2%) usr   0.02 ( 1%) sys   1.05 ( 2%) wall       0 kB ( 0%) ggc
 parser                :   1.05 ( 2%) usr   0.18 ( 9%) sys   1.23 ( 2%) wall   53861 kB ( 7%) ggc
 tree VRP              :   1.48 ( 3%) usr   0.05 ( 2%) sys   1.78 ( 3%) wall   52970 kB ( 7%) ggc
 tree iv optimization  :   9.92 (19%) usr   0.15 ( 7%) sys   9.98 (18%) wall  125735 kB (17%) ggc
 CSE                   :   1.46 ( 3%) usr   0.00 ( 0%) sys   1.42 ( 3%) wall     329 kB ( 0%) ggc
 combiner              :   1.41 ( 3%) usr   0.01 ( 0%) sys   1.35 ( 2%) wall   20981 kB ( 3%) ggc
 integrated RA         :   2.89 ( 6%) usr   0.00 ( 0%) sys   2.83 ( 5%) wall   14083 kB ( 2%) ggc
 reload                :   2.59 ( 5%) usr   0.02 ( 1%) sys   2.58 ( 5%) wall   18918 kB ( 3%) ggc
 reload CSE regs       :   2.62 ( 5%) usr   0.00 ( 0%) sys   2.91 ( 5%) wall   13557 kB ( 2%) ggc
 scheduling 2          :   2.49 ( 5%) usr   0.01 ( 0%) sys   2.45 ( 5%) wall     953 kB ( 0%) ggc
 TOTAL                 :  52.36             2.02            54.39             744417 kB

4.5 x86_64 -m32 -O3 -fbounds-check -ftime-report
 df live regs          :   1.41 ( 3%) usr   0.02 ( 1%) sys   1.43 ( 3%) wall       0 kB ( 0%) ggc
 parser                :   1.02 ( 2%) usr   0.18 ( 9%) sys   1.19 ( 2%) wall   55913 kB ( 8%) ggc
 tree VRP              :   1.44 ( 3%) usr   0.14 ( 7%) sys   1.39 ( 3%) wall   54451 kB ( 8%) ggc
 tree iv optimization  :   7.76 (17%) usr   0.11 ( 5%) sys   8.02 (17%) wall  107362 kB (15%) ggc
 expand                :   2.66 ( 6%) usr   0.08 ( 4%) sys   2.73 ( 6%) wall   56088 kB ( 8%) ggc
 CSE                   :   1.41 ( 3%) usr   0.00 ( 0%) sys   1.31 ( 3%) wall     480 kB ( 0%) ggc
 integrated RA         :   2.88 ( 6%) usr   0.00 ( 0%) sys   2.78 ( 6%) wall    9890 kB ( 1%) ggc
 reload                :   2.71 ( 6%) usr   0.05 ( 2%) sys   2.68 ( 6%) wall   20135 kB ( 3%) ggc
 reload CSE regs       :   1.98 ( 4%) usr   0.00 ( 0%) sys   2.00 ( 4%) wall   13166 kB ( 2%) ggc
 scheduling 2          :   2.67 ( 6%) usr   0.04 ( 2%) sys   2.77 ( 6%) wall     840 kB ( 0%) ggc
 TOTAL                 :  46.38             2.08            48.48             708175 kB

(listing only lines with >= 1sec times).  For x86_64 -m32 it doesn't seem to be
a big deal, and even the 4.6 numbers are nowhere near the claimed 3x increase;
it is a 30% slowdown, and only half of the slowdown can actually be attributed
to ivopts.  On the #c5 testcase ivopts still takes > 50% of the reported time
though.  To me this sounds P2ish, but I'll let Richard chime in...
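
The ">= 1sec" filtering used for these listings can be reproduced offline on a saved -ftime-report dump.  What follows is only a minimal sketch: the regular expression is an assumption derived from the row layout quoted in this thread, and slow_passes is a made-up helper name, not part of GCC.

```python
import re

# One timevar row as printed in this thread, e.g.
#   " tree iv optimization  :   7.76 (17%) usr   0.11 ( 5%) sys ..."
# The pattern is an assumption based on that layout, not a GCC-defined grammar.
ROW = re.compile(r"^\s*(?P<name>.+?)\s*:\s*(?P<usr>\d+\.\d+)\s*\(\s*\d+%\)\s*usr")

def slow_passes(report, threshold=1.0):
    """Return (pass name, usr seconds) for rows whose usr time is >= threshold."""
    rows = []
    for line in report.splitlines():
        match = ROW.match(line)
        if match and float(match.group("usr")) >= threshold:
            rows.append((match.group("name"), float(match.group("usr"))))
    return rows

# Two rows and the TOTAL line copied from the 4.5 -m32 report above.
SAMPLE = """\
 tree iv optimization  :   7.76 (17%) usr   0.11 ( 5%) sys   8.02 (17%) wall
 CSE                   :   1.41 ( 3%) usr   0.00 ( 0%) sys   1.31 ( 3%) wall
 TOTAL                 :  46.38             2.08            48.48
"""

for name, usr in slow_passes(SAMPLE):
    print(f"{name:<24s} {usr:6.2f}")
```

The TOTAL line carries no percentage column, so the pattern skips it; it has to be picked up separately if needed.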


^ permalink raw reply	[flat|nested] 28+ messages in thread

* [Bug middle-end/45422] [4.6 Regression] compile time increases 3x.
       [not found] <bug-45422-4@http.gcc.gnu.org/bugzilla/>
                   ` (5 preceding siblings ...)
  2011-01-25  9:47 ` jakub at gcc dot gnu.org
@ 2011-01-25  9:51 ` Joost.VandeVondele at pci dot uzh.ch
  2011-01-25 10:03 ` jakub at gcc dot gnu.org
                   ` (10 subsequent siblings)
  17 siblings, 0 replies; 28+ messages in thread
From: Joost.VandeVondele at pci dot uzh.ch @ 2011-01-25  9:51 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=45422

--- Comment #33 from Joost VandeVondele <Joost.VandeVondele at pci dot uzh.ch> 2011-01-25 09:47:10 UTC ---
I just note that the timings reported by David and Jakub are not for the
compile options I originally reported.

With 4.6 (20110117) I now have 

gfortran -c -ftime-report -cpp -fbounds-check -g -O3 -ffast-math -funroll-loops
-ftree-vectorize -march=native -ffree-form PR45422.F90
TOTAL                 : 102.15 

while with the options used by David / Jakub I have timings similar to theirs.

gfortran -O3 -fbounds-check -ftime-report -c PR45422.F90
 TOTAL                 :  42.87

With 4.5 timings remain ~44s


^ permalink raw reply	[flat|nested] 28+ messages in thread

* [Bug middle-end/45422] [4.6 Regression] compile time increases 3x.
       [not found] <bug-45422-4@http.gcc.gnu.org/bugzilla/>
                   ` (6 preceding siblings ...)
  2011-01-25  9:51 ` Joost.VandeVondele at pci dot uzh.ch
@ 2011-01-25 10:03 ` jakub at gcc dot gnu.org
  2011-01-25 10:25 ` Joost.VandeVondele at pci dot uzh.ch
                   ` (9 subsequent siblings)
  17 siblings, 0 replies; 28+ messages in thread
From: jakub at gcc dot gnu.org @ 2011-01-25 10:03 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=45422

--- Comment #34 from Jakub Jelinek <jakub at gcc dot gnu.org> 2011-01-25 09:52:23 UTC ---
-march=native is ambiguous, please see with -v what actually is being used.


^ permalink raw reply	[flat|nested] 28+ messages in thread

* [Bug middle-end/45422] [4.6 Regression] compile time increases 3x.
       [not found] <bug-45422-4@http.gcc.gnu.org/bugzilla/>
                   ` (7 preceding siblings ...)
  2011-01-25 10:03 ` jakub at gcc dot gnu.org
@ 2011-01-25 10:25 ` Joost.VandeVondele at pci dot uzh.ch
  2011-01-25 17:58 ` xinliangli at gmail dot com
                   ` (8 subsequent siblings)
  17 siblings, 0 replies; 28+ messages in thread
From: Joost.VandeVondele at pci dot uzh.ch @ 2011-01-25 10:25 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=45422

--- Comment #35 from Joost VandeVondele <Joost.VandeVondele at pci dot uzh.ch> 2011-01-25 10:03:02 UTC ---
(In reply to comment #34)
> -march=native is ambiguous, please see with -v what actually is being used.

This was mentioned in the initial comment:
-march=k8-sse3 -mcx16 -msahf
--param l1-cache-size=64 --param l1-cache-line-size=64 --param
l2-cache-size=1024 -mtune=k8

The latest timings are on a newer machine (old one is gone now) which has:
-march=amdfam10 -mcx16 -msahf -mpopcnt -mabm --param l1-cache-size=64 --param
l1-cache-line-size=64 --param l2-cache-size=512 -mtune=amdfam10


^ permalink raw reply	[flat|nested] 28+ messages in thread

* [Bug middle-end/45422] [4.6 Regression] compile time increases 3x.
       [not found] <bug-45422-4@http.gcc.gnu.org/bugzilla/>
                   ` (8 preceding siblings ...)
  2011-01-25 10:25 ` Joost.VandeVondele at pci dot uzh.ch
@ 2011-01-25 17:58 ` xinliangli at gmail dot com
  2011-01-27 16:03 ` jakub at gcc dot gnu.org
                   ` (7 subsequent siblings)
  17 siblings, 0 replies; 28+ messages in thread
From: xinliangli at gmail dot com @ 2011-01-25 17:58 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=45422

--- Comment #36 from davidxl <xinliangli at gmail dot com> 2011-01-25 17:28:30 UTC ---
(In reply to comment #35)
> (In reply to comment #34)
> > -march=native is ambiguous, please see with -v what actually is being used.
> 
> This was mentioned in the initial comment:
> -march=k8-sse3 -mcx16 -msahf
> --param l1-cache-size=64 --param l1-cache-line-size=64 --param
> l2-cache-size=1024 -mtune=k8
> 
> The latest timings are on a newer machine (old one is gone now) which has:
> -march=amdfam10 -mcx16 -msahf -mpopcnt -mabm --param l1-cache-size=64 --param
> l1-cache-line-size=64 --param l2-cache-size=512 -mtune=amdfam10

I did use the options you originally posted "-ftime-report -cpp -fbounds-check
-g -O3 -ffast-math -funroll-loops -ftree-vectorize -march=native -ffree-form".
The timing is consistently 58s on my 2.4GHz core-2 box, and 42s on the 2.67GHz
Xeon machine.


^ permalink raw reply	[flat|nested] 28+ messages in thread

* [Bug middle-end/45422] [4.6 Regression] compile time increases 3x.
       [not found] <bug-45422-4@http.gcc.gnu.org/bugzilla/>
                   ` (9 preceding siblings ...)
  2011-01-25 17:58 ` xinliangli at gmail dot com
@ 2011-01-27 16:03 ` jakub at gcc dot gnu.org
  2011-01-27 16:17 ` jakub at gcc dot gnu.org
                   ` (6 subsequent siblings)
  17 siblings, 0 replies; 28+ messages in thread
From: jakub at gcc dot gnu.org @ 2011-01-27 16:03 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=45422

--- Comment #37 from Jakub Jelinek <jakub at gcc dot gnu.org> 2011-01-27 15:55:57 UTC ---
/usr/src/gcc/objr/gcc/f951 -quiet -ftime-report -fbounds-check -g -O3
-ffast-math -funroll-loops -ftree-vectorize -march=amdfam10 pr45422.f90 2>&1 |
grep ':[         ]*[1-9]\|TOTAL'
 garbage collection    :   1.34 ( 1%) usr   0.00 ( 0%) sys   1.32 ( 1%) wall       0 kB ( 0%) ggc
 cfg cleanup           :   2.24 ( 2%) usr   0.01 ( 0%) sys   2.26 ( 2%) wall    7301 kB ( 0%) ggc
 df reaching defs      :   1.46 ( 1%) usr   0.02 ( 1%) sys   1.34 ( 1%) wall       0 kB ( 0%) ggc
 df live regs          :   8.28 ( 6%) usr   0.02 ( 1%) sys   8.49 ( 6%) wall       0 kB ( 0%) ggc
 df live&initialized regs:   2.46 ( 2%) usr   0.00 ( 0%) sys   2.98 ( 2%) wall       0 kB ( 0%) ggc
 df use-def / def-use chains:   1.31 ( 1%) usr   0.00 ( 0%) sys   1.13 ( 1%) wall       0 kB ( 0%) ggc
 df reg dead/unused notes:   4.01 ( 3%) usr   0.00 ( 0%) sys   4.03 ( 3%) wall    7770 kB ( 0%) ggc
 register information  :   1.48 ( 1%) usr   0.00 ( 0%) sys   1.53 ( 1%) wall       0 kB ( 0%) ggc
 alias analysis        :   1.86 ( 1%) usr   0.00 ( 0%) sys   1.89 ( 1%) wall   46655 kB ( 3%) ggc
 tree VRP              :   2.25 ( 2%) usr   0.08 ( 4%) sys   2.27 ( 2%) wall   74472 kB ( 4%) ggc
 tree SSA incremental  :   1.43 ( 1%) usr   0.25 (11%) sys   1.34 ( 1%) wall    7187 kB ( 0%) ggc
 complete unrolling    :   1.19 ( 1%) usr   0.14 ( 6%) sys   1.24 ( 1%) wall   91809 kB ( 5%) ggc
 tree prefetching      :   1.31 ( 1%) usr   0.12 ( 5%) sys   1.50 ( 1%) wall   92179 kB ( 5%) ggc
 tree iv optimization  :  15.43 (11%) usr   0.09 ( 4%) sys  15.62 (11%) wall  303704 kB (17%) ggc
 expand                :   1.11 ( 1%) usr   0.03 ( 1%) sys   1.11 ( 1%) wall   81489 kB ( 5%) ggc
 forward prop          :   1.17 ( 1%) usr   0.01 ( 0%) sys   1.19 ( 1%) wall   16030 kB ( 1%) ggc
 CSE                   :   1.58 ( 1%) usr   0.01 ( 0%) sys   1.42 ( 1%) wall     667 kB ( 0%) ggc
 dead code elimination :   1.24 ( 1%) usr   0.00 ( 0%) sys   1.30 ( 1%) wall       0 kB ( 0%) ggc
 dead store elim1      :   1.37 ( 1%) usr   0.00 ( 0%) sys   1.31 ( 1%) wall   23509 kB ( 1%) ggc
 dead store elim2      :   1.10 ( 1%) usr   0.00 ( 0%) sys   1.08 ( 1%) wall   22323 kB ( 1%) ggc
 loop unrolling        :   3.99 ( 3%) usr   0.03 ( 1%) sys   4.11 ( 3%) wall  185245 kB (11%) ggc
 CPROP                 :   2.25 ( 2%) usr   0.01 ( 0%) sys   2.00 ( 1%) wall   25084 kB ( 1%) ggc
 PRE                   :   1.20 ( 1%) usr   0.00 ( 0%) sys   1.13 ( 1%) wall    1576 kB ( 0%) ggc
 web                   :   1.09 ( 1%) usr   0.00 ( 0%) sys   1.09 ( 1%) wall    8368 kB ( 0%) ggc
 CSE 2                 :   2.10 ( 2%) usr   0.01 ( 0%) sys   2.17 ( 2%) wall    2122 kB ( 0%) ggc
 combiner              :   3.97 ( 3%) usr   0.00 ( 0%) sys   3.96 ( 3%) wall   60594 kB ( 3%) ggc
 integrated RA         :  10.18 ( 7%) usr   0.01 ( 0%) sys  10.27 ( 7%) wall   44477 kB ( 3%) ggc
 reload                :   6.31 ( 5%) usr   0.01 ( 0%) sys   6.24 ( 4%) wall   10153 kB ( 1%) ggc
 reload CSE regs       :   4.39 ( 3%) usr   0.01 ( 0%) sys   4.17 ( 3%) wall   37354 kB ( 2%) ggc
 rename registers      :   1.13 ( 1%) usr   0.00 ( 0%) sys   1.18 ( 1%) wall    2500 kB ( 0%) ggc
 scheduling 2          :   5.84 ( 4%) usr   0.02 ( 1%) sys   5.81 ( 4%) wall    1160 kB ( 0%) ggc
 final                 :   4.29 ( 3%) usr   0.04 ( 2%) sys   4.66 ( 3%) wall   10463 kB ( 1%) ggc
 variable tracking     :   2.76 ( 2%) usr   0.01 ( 0%) sys   2.73 ( 2%) wall   64964 kB ( 4%) ggc
 var-tracking dataflow :   3.86 ( 3%) usr   0.02 ( 1%) sys   3.90 ( 3%) wall       0 kB ( 0%) ggc
 var-tracking emit     :   3.89 ( 3%) usr   0.01 ( 0%) sys   3.85 ( 3%) wall   19488 kB ( 1%) ggc
 rest of compilation   :   2.27 ( 2%) usr   0.08 ( 4%) sys   2.28 ( 2%) wall   21438 kB ( 1%) ggc
 remove unused locals  :   1.02 ( 1%) usr   0.01 ( 0%) sys   0.92 ( 1%) wall       0 kB ( 0%) ggc
 unaccounted todo      :   1.21 ( 1%) usr   0.05 ( 2%) sys   1.19 ( 1%) wall       8 kB ( 0%) ggc
 TOTAL                 : 137.09             2.28           139.39            1741129 kB

/usr/src/gcc-4.5/objr/gcc/f951 -quiet -ftime-report -fbounds-check -g -O3
-ffast-math -funroll-loops -ftree-vectorize -march=amdfam10 pr45422.f90 2>&1 |
grep ':[      ]*[1-9]\|TOTAL'
 df live regs          :   2.05 ( 4%) usr   0.00 ( 0%) sys   1.95 ( 4%) wall       0 kB ( 0%) ggc
 tree VRP              :   1.43 ( 3%) usr   0.15 ( 8%) sys   1.47 ( 3%) wall   56376 kB ( 6%) ggc
 complete unrolling    :   1.14 ( 2%) usr   0.18 (10%) sys   1.39 ( 3%) wall   98554 kB (11%) ggc
 tree iv optimization  :   5.31 (10%) usr   0.05 ( 3%) sys   5.40 (10%) wall   95356 kB (11%) ggc
 expand                :   2.98 ( 6%) usr   0.11 ( 6%) sys   3.29 ( 6%) wall   69642 kB ( 8%) ggc
 combiner              :   1.49 ( 3%) usr   0.00 ( 0%) sys   1.22 ( 2%) wall   19980 kB ( 2%) ggc
 integrated RA         :   3.60 ( 7%) usr   0.01 ( 1%) sys   3.56 ( 6%) wall   12746 kB ( 1%) ggc
 reload                :   2.21 ( 4%) usr   0.01 ( 1%) sys   2.18 ( 4%) wall    7748 kB ( 1%) ggc
 reload CSE regs       :   1.24 ( 2%) usr   0.01 ( 1%) sys   1.16 ( 2%) wall   12330 kB ( 1%) ggc
 scheduling 2          :   2.73 ( 5%) usr   0.01 ( 1%) sys   2.88 ( 5%) wall    1218 kB ( 0%) ggc
 final                 :   3.14 ( 6%) usr   0.03 ( 2%) sys   3.16 ( 6%) wall    7438 kB ( 1%) ggc
 variable tracking     :   4.18 ( 8%) usr   0.03 ( 2%) sys   4.25 ( 8%) wall   40204 kB ( 4%) ggc
 TOTAL                 :  53.49             1.81            55.45             897516 kB

/usr/src/gcc/objr/gcc/f951 -quiet -ftime-report -fbounds-check -g -O3
-ffast-math -funroll-loops -ftree-vectorize -march=amdfam10 pr45422.f90
-fno-ivopts 2>&1 | grep ':[      ]*[1-9]\|TOTAL'
 cfg cleanup           :   1.83 ( 2%) usr   0.01 ( 0%) sys   1.71 ( 2%) wall    7191 kB ( 1%) ggc
 df reaching defs      :   1.25 ( 1%) usr   0.00 ( 0%) sys   1.33 ( 1%) wall       0 kB ( 0%) ggc
 df live regs          :   6.19 ( 6%) usr   0.06 ( 3%) sys   6.34 ( 6%) wall       0 kB ( 0%) ggc
 df live&initialized regs:   2.47 ( 2%) usr   0.02 ( 1%) sys   2.06 ( 2%) wall       0 kB ( 0%) ggc
 df reg dead/unused notes:   2.82 ( 3%) usr   0.01 ( 0%) sys   2.79 ( 3%) wall    9653 kB ( 1%) ggc
 register information  :   1.19 ( 1%) usr   0.00 ( 0%) sys   1.21 ( 1%) wall       0 kB ( 0%) ggc
 alias analysis        :   1.75 ( 2%) usr   0.00 ( 0%) sys   1.81 ( 2%) wall   48001 kB ( 3%) ggc
 tree CFG cleanup      :   1.08 ( 1%) usr   0.01 ( 0%) sys   1.10 ( 1%) wall    4079 kB ( 0%) ggc
 tree VRP              :   2.13 ( 2%) usr   0.05 ( 2%) sys   2.43 ( 2%) wall   76935 kB ( 5%) ggc
 tree SSA incremental  :   1.23 ( 1%) usr   0.20 ( 9%) sys   1.48 ( 1%) wall    7193 kB ( 1%) ggc
 tree CCP              :   1.00 ( 1%) usr   0.06 ( 3%) sys   1.12 ( 1%) wall    4975 kB ( 0%) ggc
 complete unrolling    :   1.15 ( 1%) usr   0.16 ( 7%) sys   1.12 ( 1%) wall   91888 kB ( 6%) ggc
 tree prefetching      :   1.45 ( 1%) usr   0.07 ( 3%) sys   1.38 ( 1%) wall   92015 kB ( 6%) ggc
 expand                :   1.30 ( 1%) usr   0.03 ( 1%) sys   1.23 ( 1%) wall   98494 kB ( 7%) ggc
 forward prop          :   1.19 ( 1%) usr   0.01 ( 0%) sys   1.05 ( 1%) wall   17136 kB ( 1%) ggc
 CSE                   :   1.64 ( 2%) usr   0.00 ( 0%) sys   1.51 ( 1%) wall     683 kB ( 0%) ggc
 dead store elim1      :   1.08 ( 1%) usr   0.00 ( 0%) sys   1.23 ( 1%) wall   24050 kB ( 2%) ggc
 loop unrolling        :   3.38 ( 3%) usr   0.06 ( 3%) sys   3.69 ( 3%) wall  165346 kB (12%) ggc
 CPROP                 :   1.92 ( 2%) usr   0.02 ( 1%) sys   2.11 ( 2%) wall   23260 kB ( 2%) ggc
 PRE                   :   1.23 ( 1%) usr   0.00 ( 0%) sys   1.06 ( 1%) wall    1979 kB ( 0%) ggc
 CSE 2                 :   1.98 ( 2%) usr   0.01 ( 0%) sys   1.94 ( 2%) wall    2631 kB ( 0%) ggc
 combiner              :   4.33 ( 4%) usr   0.00 ( 0%) sys   4.40 ( 4%) wall   76446 kB ( 5%) ggc
 integrated RA         :   8.83 ( 8%) usr   0.03 ( 1%) sys   9.07 ( 8%) wall   46246 kB ( 3%) ggc
 reload                :   5.16 ( 5%) usr   0.02 ( 1%) sys   5.12 ( 5%) wall    9244 kB ( 1%) ggc
 reload CSE regs       :   3.47 ( 3%) usr   0.01 ( 0%) sys   3.59 ( 3%) wall   34826 kB ( 2%) ggc
 rename registers      :   1.21 ( 1%) usr   0.00 ( 0%) sys   1.13 ( 1%) wall    2675 kB ( 0%) ggc
 scheduling 2          :   5.54 ( 5%) usr   0.01 ( 0%) sys   5.46 ( 5%) wall    1216 kB ( 0%) ggc
 final                 :   3.94 ( 4%) usr   0.08 ( 4%) sys   4.16 ( 4%) wall    9291 kB ( 1%) ggc
 variable tracking     :   2.23 ( 2%) usr   0.05 ( 2%) sys   2.42 ( 2%) wall   61607 kB ( 4%) ggc
 var-tracking dataflow :   3.97 ( 4%) usr   0.00 ( 0%) sys   3.76 ( 3%) wall       0 kB ( 0%) ggc
 var-tracking emit     :   3.75 ( 3%) usr   0.01 ( 0%) sys   3.91 ( 4%) wall   21108 kB ( 1%) ggc
 rest of compilation   :   1.84 ( 2%) usr   0.08 ( 4%) sys   2.11 ( 2%) wall   16864 kB ( 1%) ggc
 TOTAL                 : 107.95             2.25           110.22            1435716 kB

shows that the ivopts slowdown still isn't that significant; the compiler just
slowed down everywhere on this testcase.  Both f951 binaries are
--enable-checking=release.  Surprisingly, -fno-whole-file on the trunk doesn't
make any visible difference in compile time.
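
To put numbers on that, the usr TOTAL and "tree iv optimization" figures from the three runs above can be compared directly.  This is only back-of-the-envelope arithmetic on the values quoted in this comment; the variable names are invented for the sketch.

```python
# usr seconds copied from the three -ftime-report runs in this comment.
usr_total = {"trunk": 137.09, "4.5": 53.49, "trunk -fno-ivopts": 107.95}
ivopts_usr = {"trunk": 15.43, "4.5": 5.31}

# Overall slowdown of trunk relative to 4.5, with and without ivopts enabled.
slowdown = usr_total["trunk"] / usr_total["4.5"]
slowdown_no_ivopts = usr_total["trunk -fno-ivopts"] / usr_total["4.5"]

# Share of the extra usr time that the ivopts timevar itself accounts for.
ivopts_share = (ivopts_usr["trunk"] - ivopts_usr["4.5"]) / (
    usr_total["trunk"] - usr_total["4.5"]
)

print(f"trunk vs 4.5:              {slowdown:.2f}x")
print(f"trunk -fno-ivopts vs 4.5:  {slowdown_no_ivopts:.2f}x")
print(f"ivopts share of the delta: {ivopts_share:.0%}")
```

Even with ivopts switched off entirely, trunk remains roughly twice as slow as 4.5 on these numbers, which is the point made above.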


^ permalink raw reply	[flat|nested] 28+ messages in thread

* [Bug middle-end/45422] [4.6 Regression] compile time increases 3x.
       [not found] <bug-45422-4@http.gcc.gnu.org/bugzilla/>
                   ` (10 preceding siblings ...)
  2011-01-27 16:03 ` jakub at gcc dot gnu.org
@ 2011-01-27 16:17 ` jakub at gcc dot gnu.org
  2011-01-27 16:29 ` rguenth at gcc dot gnu.org
                   ` (5 subsequent siblings)
  17 siblings, 0 replies; 28+ messages in thread
From: jakub at gcc dot gnu.org @ 2011-01-27 16:17 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=45422

--- Comment #38 from Jakub Jelinek <jakub at gcc dot gnu.org> 2011-01-27 16:02:49 UTC ---
The *.gimple dump is roughly the same size in 4.5 and 4.6, but the resulting
assembly size is 15MB in 4.5 and 23MB (with < 100KB variation with
-fno-ivopts) in 4.6.  On 4.6, -fno-inline helps neither compile time nor
assembly size, though.


^ permalink raw reply	[flat|nested] 28+ messages in thread

* [Bug middle-end/45422] [4.6 Regression] compile time increases 3x.
       [not found] <bug-45422-4@http.gcc.gnu.org/bugzilla/>
                   ` (11 preceding siblings ...)
  2011-01-27 16:17 ` jakub at gcc dot gnu.org
@ 2011-01-27 16:29 ` rguenth at gcc dot gnu.org
  2011-01-27 16:31 ` rguenth at gcc dot gnu.org
                   ` (4 subsequent siblings)
  17 siblings, 0 replies; 28+ messages in thread
From: rguenth at gcc dot gnu.org @ 2011-01-27 16:29 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=45422

--- Comment #39 from Richard Guenther <rguenth at gcc dot gnu.org> 2011-01-27 16:16:48 UTC ---
The size difference is likely from prefetching, it's 1.5MB vs. 1.1MB without
that (-O3 -fbounds-check -ffast-math -funroll-loops).  Prefetching usually
causes another set of (then RTL unrolled) loop copies.  See PR44688.


^ permalink raw reply	[flat|nested] 28+ messages in thread

* [Bug middle-end/45422] [4.6 Regression] compile time increases 3x.
       [not found] <bug-45422-4@http.gcc.gnu.org/bugzilla/>
                   ` (12 preceding siblings ...)
  2011-01-27 16:29 ` rguenth at gcc dot gnu.org
@ 2011-01-27 16:31 ` rguenth at gcc dot gnu.org
  2011-01-27 16:40 ` jakub at gcc dot gnu.org
                   ` (3 subsequent siblings)
  17 siblings, 0 replies; 28+ messages in thread
From: rguenth at gcc dot gnu.org @ 2011-01-27 16:31 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=45422

--- Comment #40 from Richard Guenther <rguenth at gcc dot gnu.org> 2011-01-27 16:19:26 UTC ---
Btw, when I remove -fbounds-check the sizes are comparable (without
prefetching), so I guess we are just better at removing bounds checks in 4.6,
and that triggers size-costly loop opts such as vectorization and unrolling.


^ permalink raw reply	[flat|nested] 28+ messages in thread

* [Bug middle-end/45422] [4.6 Regression] compile time increases 3x.
       [not found] <bug-45422-4@http.gcc.gnu.org/bugzilla/>
                   ` (13 preceding siblings ...)
  2011-01-27 16:31 ` rguenth at gcc dot gnu.org
@ 2011-01-27 16:40 ` jakub at gcc dot gnu.org
  2011-01-27 16:51 ` rguenth at gcc dot gnu.org
                   ` (2 subsequent siblings)
  17 siblings, 0 replies; 28+ messages in thread
From: jakub at gcc dot gnu.org @ 2011-01-27 16:40 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=45422

--- Comment #41 from Jakub Jelinek <jakub at gcc dot gnu.org> 2011-01-27 16:28:49 UTC ---
With additional -fno-prefetch-loop-arrays the TOTAL goes down from that 137s to
92.23, and judging from the tree dumps of 4.5 and 4.6 we do significantly more
vectorization too (4.6 *.ifcvt is 4.7MB compared to 5.3MB for 4.5 *.ifcvt, but
4.6 *.vect grows to 8.3MB while 4.5 *.vect stays at 5.3MB).
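
A rough reading of these figures, as a sketch only: it assumes 92.23 is the usr TOTAL in the same sense as the 137.09 above, and compares against the 53.49 usr TOTAL of the 4.5 run from comment #37.

```python
# Figures quoted in this thread; treating 92.23 as a usr TOTAL is an assumption.
total_trunk = 137.09          # trunk, prefetching enabled
total_no_prefetch = 92.23     # trunk with -fno-prefetch-loop-arrays
total_4_5 = 53.49             # 4.5, same options (comment #37)

# Dump sizes in MB before (*.ifcvt) and after (*.vect) vectorization.
dump_mb = {"4.5": {"ifcvt": 5.3, "vect": 5.3}, "4.6": {"ifcvt": 4.7, "vect": 8.3}}

prefetch_share = (total_trunk - total_no_prefetch) / total_trunk
remaining_slowdown = total_no_prefetch / total_4_5
vect_growth = {ver: d["vect"] / d["ifcvt"] for ver, d in dump_mb.items()}

print(f"time attributable to prefetching: {prefetch_share:.0%}")
print(f"remaining slowdown vs 4.5:        {remaining_slowdown:.2f}x")
for ver, growth in vect_growth.items():
    print(f"{ver}: *.vect is {growth:.2f}x the size of *.ifcvt")
```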


^ permalink raw reply	[flat|nested] 28+ messages in thread

* [Bug middle-end/45422] [4.6 Regression] compile time increases 3x.
       [not found] <bug-45422-4@http.gcc.gnu.org/bugzilla/>
                   ` (14 preceding siblings ...)
  2011-01-27 16:40 ` jakub at gcc dot gnu.org
@ 2011-01-27 16:51 ` rguenth at gcc dot gnu.org
  2011-01-27 16:55 ` jakub at gcc dot gnu.org
  2011-01-27 17:55 ` xinliangli at gmail dot com
  17 siblings, 0 replies; 28+ messages in thread
From: rguenth at gcc dot gnu.org @ 2011-01-27 16:51 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=45422

--- Comment #42 from Richard Guenther <rguenth at gcc dot gnu.org> 2011-01-27 16:30:52 UTC ---
Comparing -O3 -ffast-math -funroll-loops -fno-inline -fno-partial-inlining
(thus generic arch, without prefetching):

trunk:

 df live regs          :   4.22 ( 6%) usr   0.04 ( 2%) sys   4.11 ( 5%) wall       0 kB ( 0%) ggc
 tree iv optimization  :   3.92 ( 5%) usr   0.13 ( 5%) sys   4.29 ( 6%) wall   91066 kB (11%) ggc
 integrated RA         :   5.57 ( 8%) usr   0.10 ( 4%) sys   5.93 ( 8%) wall   26408 kB ( 3%) ggc
 scheduling 2          :   3.73 ( 5%) usr   0.04 ( 2%) sys   3.85 ( 5%) wall     939 kB ( 0%) ggc
 TOTAL                 :  73.68             2.37            76.91             852775 kB

4.5:

 df live regs          :   4.60 ( 7%) usr   0.02 ( 1%) sys   4.62 ( 6%) wall       0 kB ( 0%) ggc
 expand                :   3.94 ( 6%) usr   0.17 ( 8%) sys   3.94 ( 6%) wall   62218 kB ( 8%) ggc
 integrated RA         :   5.73 ( 8%) usr   0.02 ( 1%) sys   5.76 ( 8%) wall   22920 kB ( 3%) ggc
 reload                :   3.78 ( 5%) usr   0.08 ( 4%) sys   3.86 ( 5%) wall    9291 kB ( 1%) ggc
 TOTAL                 :  68.98             2.01            71.22             828137 kB

It would be nice to confirm that we are indeed much better at
optimizing bounds-checking code.  The prefetching issue is
tracked as PR44688.  So I'd close this either as a dup or as
wontfix (it's a feature that we optimize loops with bounds-checking).


^ permalink raw reply	[flat|nested] 28+ messages in thread

* [Bug middle-end/45422] [4.6 Regression] compile time increases 3x.
       [not found] <bug-45422-4@http.gcc.gnu.org/bugzilla/>
                   ` (15 preceding siblings ...)
  2011-01-27 16:51 ` rguenth at gcc dot gnu.org
@ 2011-01-27 16:55 ` jakub at gcc dot gnu.org
  2011-01-27 17:55 ` xinliangli at gmail dot com
  17 siblings, 0 replies; 28+ messages in thread
From: jakub at gcc dot gnu.org @ 2011-01-27 16:55 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=45422

Jakub Jelinek <jakub at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |RESOLVED
         Resolution|                            |WONTFIX

--- Comment #43 from Jakub Jelinek <jakub at gcc dot gnu.org> 2011-01-27 16:43:17 UTC ---
Yeah, I agree.


^ permalink raw reply	[flat|nested] 28+ messages in thread

* [Bug middle-end/45422] [4.6 Regression] compile time increases 3x.
       [not found] <bug-45422-4@http.gcc.gnu.org/bugzilla/>
                   ` (16 preceding siblings ...)
  2011-01-27 16:55 ` jakub at gcc dot gnu.org
@ 2011-01-27 17:55 ` xinliangli at gmail dot com
  17 siblings, 0 replies; 28+ messages in thread
From: xinliangli at gmail dot com @ 2011-01-27 17:55 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=45422

--- Comment #44 from davidxl <xinliangli at gmail dot com> 2011-01-27 17:33:42 UTC ---
Nice triaging..

David


^ permalink raw reply	[flat|nested] 28+ messages in thread

* [Bug middle-end/45422] [4.6 Regression] compile time increases 3x.
  2010-08-26 18:33 [Bug middle-end/45422] New: [4.6 Regression] compile time increases 8x jv244 at cam dot ac dot uk
                   ` (8 preceding siblings ...)
  2010-08-31 17:45 ` davidxl at gcc dot gnu dot org
@ 2010-09-02 11:25 ` rguenth at gcc dot gnu dot org
  9 siblings, 0 replies; 28+ messages in thread
From: rguenth at gcc dot gnu dot org @ 2010-09-02 11:25 UTC (permalink / raw)
  To: gcc-bugs



-- 

rguenth at gcc dot gnu dot org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Priority|P3                          |P1


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=45422


^ permalink raw reply	[flat|nested] 28+ messages in thread

* [Bug middle-end/45422] [4.6 Regression] compile time increases 3x.
  2010-08-26 18:33 [Bug middle-end/45422] New: [4.6 Regression] compile time increases 8x jv244 at cam dot ac dot uk
                   ` (7 preceding siblings ...)
  2010-08-30 16:41 ` davidxl at gcc dot gnu dot org
@ 2010-08-31 17:45 ` davidxl at gcc dot gnu dot org
  2010-09-02 11:25 ` rguenth at gcc dot gnu dot org
  9 siblings, 0 replies; 28+ messages in thread
From: davidxl at gcc dot gnu dot org @ 2010-08-31 17:45 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #26 from davidxl at gcc dot gnu dot org  2010-08-31 17:45 -------
Good observation re. the number of IVs in the final set. This usually points to
some problem/bug in the cost function. I briefly looked at this case -- it
indeed exposes two more bugs in the cost model:

1) the computation costs of all the cost pairs in an assignment cannot
simply be added together, because many rewrite expressions can be commoned.
We now have a mechanism to account for common loop invariants in register
pressure estimation, and this mechanism needs to be extended to computation
cost.

2) the offset is not stripped when computing loop invariant expression ids --
this can cause register pressure to be overestimated. (The case arises more
often with loop unrolling).
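
Point 1 can be illustrated with a toy model.  This is purely a sketch: the sub-expressions, their unit costs and the function names are invented for illustration and do not correspond to the actual tree-ssa-loop-ivopts.c data structures.

```python
# Hypothetical data: each use is rewritten using a set of sub-expressions.
# Expression strings and unit costs are made up for this sketch.
SUBEXPR_COST = {"base + 8*stride": 4, "n - 1": 1, "i * 4": 2}

ASSIGNMENT = {
    "use0": {"base + 8*stride", "i * 4"},
    "use1": {"base + 8*stride", "n - 1"},
    "use2": {"base + 8*stride", "i * 4"},
}

def naive_cost(assignment):
    """Add up every pair's cost independently; shared work is counted repeatedly."""
    return sum(SUBEXPR_COST[e] for exprs in assignment.values() for e in exprs)

def commoned_cost(assignment):
    """Count each distinct sub-expression once, as if it were computed once and reused."""
    distinct = set().union(*assignment.values())
    return sum(SUBEXPR_COST[e] for e in distinct)

print("naive sum:    ", naive_cost(ASSIGNMENT))
print("with commoning:", commoned_cost(ASSIGNMENT))
```

In this toy assignment the naive sum is more than twice the commoned cost, so a selection algorithm driven by the naive figure would see such an assignment as far more expensive than it really is.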

David


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=45422


^ permalink raw reply	[flat|nested] 28+ messages in thread

* [Bug middle-end/45422] [4.6 Regression] compile time increases 3x.
  2010-08-26 18:33 [Bug middle-end/45422] New: [4.6 Regression] compile time increases 8x jv244 at cam dot ac dot uk
                   ` (6 preceding siblings ...)
  2010-08-30  7:12 ` rguenth at gcc dot gnu dot org
@ 2010-08-30 16:41 ` davidxl at gcc dot gnu dot org
  2010-08-31 17:45 ` davidxl at gcc dot gnu dot org
  2010-09-02 11:25 ` rguenth at gcc dot gnu dot org
  9 siblings, 0 replies; 28+ messages in thread
From: davidxl at gcc dot gnu dot org @ 2010-08-30 16:41 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #25 from davidxl at gcc dot gnu dot org  2010-08-30 16:41 -------
(In reply to comment #24)
> (In reply to comment #20)
> > (In reply to comment #16)
> > > adjust summary according to the last timings
> > > 
> > 
> > I am surprised to see such big differences between trunk and previous releases.
> > Compiling this test case with those options on my core2 box (2.4GHz) took
> > only 56 seconds, which is comparable with the timing with a 4.4.3 compiler (with
> > google local patches including ivopt improvements).
> 
> Of course - because the ivopt improvement patches are the problem.
> 

It is just the total time diff from Joost's measure can be just explained by
ivopt component.

David


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=45422


^ permalink raw reply	[flat|nested] 28+ messages in thread

* [Bug middle-end/45422] [4.6 Regression] compile time increases 3x.
  2010-08-26 18:33 [Bug middle-end/45422] New: [4.6 Regression] compile time increases 8x jv244 at cam dot ac dot uk
                   ` (4 preceding siblings ...)
  2010-08-30  3:19 ` davidxl at gcc dot gnu dot org
@ 2010-08-30  7:12 ` rguenth at gcc dot gnu dot org
  2010-08-30  7:12 ` rguenth at gcc dot gnu dot org
                   ` (3 subsequent siblings)
  9 siblings, 0 replies; 28+ messages in thread
From: rguenth at gcc dot gnu dot org @ 2010-08-30  7:12 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #23 from rguenth at gcc dot gnu dot org  2010-08-30 07:11 -------
(In reply to comment #22)
> Given the fact that the solution space is really large -- M^N where M is the
> number of candidates and N is the number of uses (here M == 70 and N == 48), 
> and the cost function is complicated, it will be challenging to come up with
> an algorithm that converges really fast, and most importantly -- 'guarantees'
> an optimal solution..

Well - we can't guarantee an optimal solution.  We have to take compile-time
into account which means that O(M^N) is not acceptable but we need to come
up with something that can complete in O((M+N) log (M+N)) time at most.

I btw doubt that the solution found is anywhere near optimal for 32bit
x86 - using 15 IVs instead of 2 can't be cheaper.
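
For scale, here is a quick sketch of the two quantities being compared, reading the quoted figures as M = 70 candidates and N = 48 uses and ignoring constant factors in the O() bound:

```python
import math

M, N = 70, 48  # candidates and uses, per the figures quoted above

# Exhaustive search: one of M candidates chosen independently for each of N uses.
search_space = M ** N

# The O((M+N) log (M+N)) target, with the constant factor dropped.
budget = (M + N) * math.log2(M + N)

digits = len(str(search_space))
print(f"M^N has {digits} decimal digits")
print(f"(M+N) * log2(M+N) is about {budget:.0f}")
```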


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=45422


^ permalink raw reply	[flat|nested] 28+ messages in thread

* [Bug middle-end/45422] [4.6 Regression] compile time increases 3x.
  2010-08-26 18:33 [Bug middle-end/45422] New: [4.6 Regression] compile time increases 8x jv244 at cam dot ac dot uk
                   ` (5 preceding siblings ...)
  2010-08-30  7:12 ` rguenth at gcc dot gnu dot org
@ 2010-08-30  7:12 ` rguenth at gcc dot gnu dot org
  2010-08-30 16:41 ` davidxl at gcc dot gnu dot org
                   ` (2 subsequent siblings)
  9 siblings, 0 replies; 28+ messages in thread
From: rguenth at gcc dot gnu dot org @ 2010-08-30  7:12 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #24 from rguenth at gcc dot gnu dot org  2010-08-30 07:12 -------
(In reply to comment #20)
> (In reply to comment #16)
> > adjust summary according to the last timings
> > 
> 
> I am surprised to see such big differences between trunk and previous releases.
> Compiling this test case with those options on my core2 box (2.4GHz) took
> only 56 seconds, which is comparable with the timing with a 4.4.3 compiler (with
> google local patches including ivopt improvements).

Of course - because the ivopt improvement patches are the problem.


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=45422


^ permalink raw reply	[flat|nested] 28+ messages in thread

* [Bug middle-end/45422] [4.6 Regression] compile time increases 3x.
  2010-08-26 18:33 [Bug middle-end/45422] New: [4.6 Regression] compile time increases 8x jv244 at cam dot ac dot uk
                   ` (3 preceding siblings ...)
  2010-08-30  3:11 ` davidxl at gcc dot gnu dot org
@ 2010-08-30  3:19 ` davidxl at gcc dot gnu dot org
  2010-08-30  7:12 ` rguenth at gcc dot gnu dot org
                   ` (4 subsequent siblings)
  9 siblings, 0 replies; 28+ messages in thread
From: davidxl at gcc dot gnu dot org @ 2010-08-30  3:19 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #21 from davidxl at gcc dot gnu dot org  2010-08-30 03:19 -------
(In reply to comment #17)
>  tree iv optimization  :  32.57 (20%) usr   0.10 ( 5%) sys  32.73 (20%) wall 
> 322095 kB (18%) ggc
> 
> 
> 20% is still completely unreasonable for IV optimization.
> 

There was a patch in trunk that may double the time in ivopt -- i.e.
find_optimal_iv_set_1 is done twice, once with the original iv set and once
with the full set. This probably needs to be revisited. 

David


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=45422


^ permalink raw reply	[flat|nested] 28+ messages in thread

* [Bug middle-end/45422] [4.6 Regression] compile time increases 3x.
  2010-08-26 18:33 [Bug middle-end/45422] New: [4.6 Regression] compile time increases 8x jv244 at cam dot ac dot uk
                   ` (2 preceding siblings ...)
  2010-08-29 15:07 ` jv244 at cam dot ac dot uk
@ 2010-08-30  3:11 ` davidxl at gcc dot gnu dot org
  2010-08-30  3:19 ` davidxl at gcc dot gnu dot org
                   ` (5 subsequent siblings)
  9 siblings, 0 replies; 28+ messages in thread
From: davidxl at gcc dot gnu dot org @ 2010-08-30  3:11 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #20 from davidxl at gcc dot gnu dot org  2010-08-30 03:10 -------
(In reply to comment #16)
> adjust summary according to the last timings
> 

I am surprised to see such big differences between trunk and previous releases.
Compiling this test case with those options on my core2 box (2.4GHz) took
only 56 seconds, which is comparable with the timing with a 4.4.3 compiler (with
google local patches including ivopt improvements).

David


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=45422


^ permalink raw reply	[flat|nested] 28+ messages in thread

* [Bug middle-end/45422] [4.6 Regression] compile time increases 3x.
  2010-08-26 18:33 [Bug middle-end/45422] New: [4.6 Regression] compile time increases 8x jv244 at cam dot ac dot uk
  2010-08-29  6:38 ` [Bug middle-end/45422] [4.6 Regression] compile time increases 3x jv244 at cam dot ac dot uk
  2010-08-29  9:26 ` rguenth at gcc dot gnu dot org
@ 2010-08-29 15:07 ` jv244 at cam dot ac dot uk
  2010-08-30  3:11 ` davidxl at gcc dot gnu dot org
                   ` (6 subsequent siblings)
  9 siblings, 0 replies; 28+ messages in thread
From: jv244 at cam dot ac dot uk @ 2010-08-29 15:07 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #18 from jv244 at cam dot ac dot uk  2010-08-29 15:07 -------
FYI, these are the 4.5 branch timings:

Execution times (seconds)
 garbage collection          :   0.47 ( 1%) usr   0.00 ( 0%) sys   0.47 ( 1%) wall       0 kB ( 0%) ggc
 callgraph construction      :   0.05 ( 0%) usr   0.01 ( 1%) sys   0.09 ( 0%) wall    5996 kB ( 1%) ggc
 callgraph optimization      :   0.21 ( 0%) usr   0.02 ( 1%) sys   0.26 ( 0%) wall     606 kB ( 0%) ggc
 ipa cp                      :   0.09 ( 0%) usr   0.00 ( 0%) sys   0.08 ( 0%) wall    1381 kB ( 0%) ggc
 ipa reference               :   0.06 ( 0%) usr   0.00 ( 0%) sys   0.06 ( 0%) wall       0 kB ( 0%) ggc
 ipa pure const              :   0.06 ( 0%) usr   0.01 ( 1%) sys   0.09 ( 0%) wall       0 kB ( 0%) ggc
 cfg cleanup                 :   0.39 ( 1%) usr   0.00 ( 0%) sys   0.51 ( 1%) wall    2459 kB ( 0%) ggc
 trivially dead code         :   0.34 ( 1%) usr   0.00 ( 0%) sys   0.30 ( 1%) wall       0 kB ( 0%) ggc
 df multiple defs            :   0.08 ( 0%) usr   0.00 ( 0%) sys   0.13 ( 0%) wall       0 kB ( 0%) ggc
 df reaching defs            :   0.33 ( 1%) usr   0.00 ( 0%) sys   0.27 ( 1%) wall       0 kB ( 0%) ggc
 df live regs                :   2.08 ( 4%) usr   0.01 ( 1%) sys   2.19 ( 4%) wall       0 kB ( 0%) ggc
 df live&initialized regs    :   0.98 ( 2%) usr   0.00 ( 0%) sys   0.92 ( 2%) wall       0 kB ( 0%) ggc
 df use-def / def-use chains :   0.24 ( 0%) usr   0.00 ( 0%) sys   0.19 ( 0%) wall       0 kB ( 0%) ggc
 df reg dead/unused notes    :   0.93 ( 2%) usr   0.00 ( 0%) sys   1.04 ( 2%) wall    5756 kB ( 1%) ggc
 register information        :   0.51 ( 1%) usr   0.01 ( 1%) sys   0.39 ( 1%) wall       0 kB ( 0%) ggc
 alias analysis              :   0.78 ( 1%) usr   0.01 ( 1%) sys   0.91 ( 2%) wall   22384 kB ( 3%) ggc
 alias stmt walking          :   0.50 ( 1%) usr   0.03 ( 2%) sys   0.38 ( 1%) wall    5563 kB ( 1%) ggc
 register scan               :   0.13 ( 0%) usr   0.00 ( 0%) sys   0.09 ( 0%) wall       0 kB ( 0%) ggc
 rebuild jump labels         :   0.19 ( 0%) usr   0.00 ( 0%) sys   0.19 ( 0%) wall       0 kB ( 0%) ggc
 parser                      :   0.82 ( 2%) usr   0.13 ( 9%) sys   0.94 ( 2%) wall   55603 kB ( 6%) ggc
 inline heuristics           :   0.20 ( 0%) usr   0.01 ( 1%) sys   0.16 ( 0%) wall       0 kB ( 0%) ggc
 tree gimplify               :   0.38 ( 1%) usr   0.03 ( 2%) sys   0.40 ( 1%) wall   46588 kB ( 5%) ggc
 tree eh                     :   0.02 ( 0%) usr   0.00 ( 0%) sys   0.02 ( 0%) wall       0 kB ( 0%) ggc
 tree CFG construction       :   0.04 ( 0%) usr   0.02 ( 1%) sys   0.05 ( 0%) wall   11964 kB ( 1%) ggc
 tree CFG cleanup            :   0.47 ( 1%) usr   0.00 ( 0%) sys   0.79 ( 1%) wall    1829 kB ( 0%) ggc
 tree VRP                    :   1.46 ( 3%) usr   0.05 ( 4%) sys   1.27 ( 2%) wall   56376 kB ( 6%) ggc
 tree copy propagation       :   0.09 ( 0%) usr   0.02 ( 1%) sys   0.22 ( 0%) wall     746 kB ( 0%) ggc
 tree find ref. vars         :   0.09 ( 0%) usr   0.01 ( 1%) sys   0.07 ( 0%) wall    3806 kB ( 0%) ggc
 tree PTA                    :   0.30 ( 1%) usr   0.00 ( 0%) sys   0.33 ( 1%) wall    3836 kB ( 0%) ggc
 tree PHI insertion          :   0.00 ( 0%) usr   0.00 ( 0%) sys   0.04 ( 0%) wall    3194 kB ( 0%) ggc
 tree SSA rewrite            :   0.24 ( 0%) usr   0.01 ( 1%) sys   0.29 ( 1%) wall   13860 kB ( 2%) ggc
 tree SSA other              :   0.13 ( 0%) usr   0.02 ( 1%) sys   0.11 ( 0%) wall     418 kB ( 0%) ggc
 tree SSA incremental        :   0.89 ( 2%) usr   0.06 ( 4%) sys   0.97 ( 2%) wall    6811 kB ( 1%) ggc
 tree operand scan           :   0.34 ( 1%) usr   0.23 (17%) sys   0.59 ( 1%) wall   44776 kB ( 5%) ggc
 dominator optimization      :   0.29 ( 1%) usr   0.01 ( 1%) sys   0.35 ( 1%) wall    5152 kB ( 1%) ggc
 tree CCP                    :   0.51 ( 1%) usr   0.02 ( 1%) sys   0.43 ( 1%) wall    4620 kB ( 1%) ggc
 tree PHI const/copy prop    :   0.01 ( 0%) usr   0.00 ( 0%) sys   0.01 ( 0%) wall     106 kB ( 0%) ggc
 tree split crit edges       :   0.00 ( 0%) usr   0.00 ( 0%) sys   0.00 ( 0%) wall    2019 kB ( 0%) ggc
 tree reassociation          :   0.12 ( 0%) usr   0.01 ( 1%) sys   0.12 ( 0%) wall    2946 kB ( 0%) ggc
 tree PRE                    :   0.92 ( 2%) usr   0.00 ( 0%) sys   0.95 ( 2%) wall    7315 kB ( 1%) ggc
 tree FRE                    :   0.45 ( 1%) usr   0.04 ( 3%) sys   0.35 ( 1%) wall    5518 kB ( 1%) ggc
 tree code sinking           :   0.03 ( 0%) usr   0.00 ( 0%) sys   0.03 ( 0%) wall    1400 kB ( 0%) ggc
 tree linearize phis         :   0.02 ( 0%) usr   0.01 ( 1%) sys   0.01 ( 0%) wall       0 kB ( 0%) ggc
 tree forward propagate      :   0.18 ( 0%) usr   0.02 ( 1%) sys   0.16 ( 0%) wall   10006 kB ( 1%) ggc
 tree conservative DCE       :   0.05 ( 0%) usr   0.01 ( 1%) sys   0.13 ( 0%) wall     576 kB ( 0%) ggc
 tree aggressive DCE         :   0.28 ( 1%) usr   0.01 ( 1%) sys   0.37 ( 1%) wall    8853 kB ( 1%) ggc
 tree buildin call DCE       :   0.01 ( 0%) usr   0.00 ( 0%) sys   0.00 ( 0%) wall       0 kB ( 0%) ggc
 tree DSE                    :   0.20 ( 0%) usr   0.00 ( 0%) sys   0.11 ( 0%) wall     132 kB ( 0%) ggc
 PHI merge                   :   0.01 ( 0%) usr   0.00 ( 0%) sys   0.00 ( 0%) wall      37 kB ( 0%) ggc
 tree loop bounds            :   0.22 ( 0%) usr   0.00 ( 0%) sys   0.18 ( 0%) wall    8266 kB ( 1%) ggc
 tree loop invariant motion  :   0.06 ( 0%) usr   0.00 ( 0%) sys   0.08 ( 0%) wall      67 kB ( 0%) ggc
 tree canonical iv           :   0.07 ( 0%) usr   0.00 ( 0%) sys   0.09 ( 0%) wall    4779 kB ( 1%) ggc
 scev constant prop          :   0.08 ( 0%) usr   0.00 ( 0%) sys   0.07 ( 0%) wall    2345 kB ( 0%) ggc
 tree loop unswitching       :   0.00 ( 0%) usr   0.00 ( 0%) sys   0.02 ( 0%) wall     573 kB ( 0%) ggc
 complete unrolling          :   1.05 ( 2%) usr   0.11 ( 8%) sys   1.39 ( 3%) wall   98553 kB (11%) ggc
 tree vectorization          :   0.01 ( 0%) usr   0.00 ( 0%) sys   0.01 ( 0%) wall     883 kB ( 0%) ggc
 tree slp vectorization      :   0.61 ( 1%) usr   0.00 ( 0%) sys   0.60 ( 1%) wall   53236 kB ( 6%) ggc
 tree iv optimization        :   5.80 (11%) usr   0.06 ( 4%) sys   5.94 (11%) wall   95356 kB (11%) ggc
 predictive commoning        :   0.03 ( 0%) usr   0.00 ( 0%) sys   0.03 ( 0%) wall    1054 kB ( 0%) ggc
 tree loop init              :   0.05 ( 0%) usr   0.00 ( 0%) sys   0.02 ( 0%) wall    1339 kB ( 0%) ggc
 tree copy headers           :   0.01 ( 0%) usr   0.00 ( 0%) sys   0.03 ( 0%) wall    1613 kB ( 0%) ggc
 tree SSA uncprop            :   0.03 ( 0%) usr   0.00 ( 0%) sys   0.01 ( 0%) wall       0 kB ( 0%) ggc
 tree rename SSA copies      :   0.06 ( 0%) usr   0.00 ( 0%) sys   0.03 ( 0%) wall       0 kB ( 0%) ggc
 dominance frontiers         :   0.04 ( 0%) usr   0.00 ( 0%) sys   0.05 ( 0%) wall       0 kB ( 0%) ggc
 dominance computation       :   0.23 ( 0%) usr   0.00 ( 0%) sys   0.26 ( 0%) wall       0 kB ( 0%) ggc
 expand                      :   3.24 ( 6%) usr   0.07 ( 5%) sys   3.34 ( 6%) wall   69633 kB ( 8%) ggc
 lower subreg                :   0.02 ( 0%) usr   0.00 ( 0%) sys   0.00 ( 0%) wall       0 kB ( 0%) ggc
 forward prop                :   0.48 ( 1%) usr   0.01 ( 1%) sys   0.48 ( 1%) wall    9984 kB ( 1%) ggc
 CSE                         :   0.73 ( 1%) usr   0.00 ( 0%) sys   0.92 ( 2%) wall     248 kB ( 0%) ggc
 dead code elimination       :   0.24 ( 0%) usr   0.00 ( 0%) sys   0.28 ( 1%) wall       0 kB ( 0%) ggc
 dead store elim1            :   0.33 ( 1%) usr   0.01 ( 1%) sys   0.32 ( 1%) wall    5987 kB ( 1%) ggc
 dead store elim2            :   0.44 ( 1%) usr   0.02 ( 1%) sys   0.39 ( 1%) wall    7831 kB ( 1%) ggc
 loop analysis               :   0.01 ( 0%) usr   0.00 ( 0%) sys   0.02 ( 0%) wall     718 kB ( 0%) ggc
 loop invariant motion       :   0.05 ( 0%) usr   0.00 ( 0%) sys   0.07 ( 0%) wall     305 kB ( 0%) ggc
 loop unswitching            :   0.01 ( 0%) usr   0.00 ( 0%) sys   0.02 ( 0%) wall       0 kB ( 0%) ggc
 loop unrolling              :   0.65 ( 1%) usr   0.00 ( 0%) sys   0.62 ( 1%) wall   32780 kB ( 4%) ggc
 CPROP                       :   0.70 ( 1%) usr   0.00 ( 0%) sys   0.60 ( 1%) wall    7825 kB ( 1%) ggc
 PRE                         :   0.32 ( 1%) usr   0.00 ( 0%) sys   0.33 ( 1%) wall     719 kB ( 0%) ggc
 web                         :   0.13 ( 0%) usr   0.00 ( 0%) sys   0.07 ( 0%) wall     594 kB ( 0%) ggc
 CSE 2                       :   0.75 ( 1%) usr   0.01 ( 1%) sys   0.60 ( 1%) wall     470 kB ( 0%) ggc
 branch prediction           :   0.19 ( 0%) usr   0.01 ( 1%) sys   0.14 ( 0%) wall    7344 kB ( 1%) ggc
 combiner                    :   1.19 ( 2%) usr   0.01 ( 1%) sys   1.33 ( 2%) wall   19980 kB ( 2%) ggc
 if-conversion               :   0.04 ( 0%) usr   0.00 ( 0%) sys   0.03 ( 0%) wall     746 kB ( 0%) ggc
 regmove                     :   0.37 ( 1%) usr   0.01 ( 1%) sys   0.33 ( 1%) wall       0 kB ( 0%) ggc
 integrated RA               :   3.51 ( 7%) usr   0.01 ( 1%) sys   3.74 ( 7%) wall   12746 kB ( 1%) ggc
 reload                      :   2.16 ( 4%) usr   0.02 ( 1%) sys   2.01 ( 4%) wall    7755 kB ( 1%) ggc
 reload CSE regs             :   1.38 ( 3%) usr   0.00 ( 0%) sys   1.26 ( 2%) wall   12331 kB ( 1%) ggc
 load CSE after reload       :   0.18 ( 0%) usr   0.00 ( 0%) sys   0.14 ( 0%) wall     162 kB ( 0%) ggc
 thread pro- & epilogue      :   0.11 ( 0%) usr   0.00 ( 0%) sys   0.12 ( 0%) wall    4370 kB ( 0%) ggc
 if-conversion 2             :   0.03 ( 0%) usr   0.00 ( 0%) sys   0.02 ( 0%) wall     357 kB ( 0%) ggc
 combine stack adjustments   :   0.03 ( 0%) usr   0.00 ( 0%) sys   0.02 ( 0%) wall       0 kB ( 0%) ggc
 peephole 2                  :   0.12 ( 0%) usr   0.00 ( 0%) sys   0.16 ( 0%) wall    1899 kB ( 0%) ggc
 rename registers            :   0.46 ( 1%) usr   0.00 ( 0%) sys   0.55 ( 1%) wall    2237 kB ( 0%) ggc
 hard reg cprop              :   0.37 ( 1%) usr   0.00 ( 0%) sys   0.48 ( 1%) wall      13 kB ( 0%) ggc
 scheduling 2                :   3.30 ( 6%) usr   0.04 ( 3%) sys   3.10 ( 6%) wall    1216 kB ( 0%) ggc
 machine dep reorg           :   0.38 ( 1%) usr   0.00 ( 0%) sys   0.36 ( 1%) wall      11 kB ( 0%) ggc
 reorder blocks              :   0.15 ( 0%) usr   0.00 ( 0%) sys   0.13 ( 0%) wall    1283 kB ( 0%) ggc
 final                       :   0.93 ( 2%) usr   0.07 ( 5%) sys   0.84 ( 2%) wall    6610 kB ( 1%) ggc
 symout                      :   0.30 ( 1%) usr   0.03 ( 2%) sys   0.34 ( 1%) wall   27006 kB ( 3%) ggc
 variable tracking           :   3.86 ( 7%) usr   0.03 ( 2%) sys   3.99 ( 7%) wall   39804 kB ( 4%) ggc
 plugin execution            :   0.00 ( 0%) usr   0.01 ( 1%) sys   0.05 ( 0%) wall       0 kB ( 0%) ggc
 rest of compilation         :   0.00 ( 0%) usr   0.01 ( 1%) sys   0.00 ( 0%) wall       0 kB ( 0%) ggc
 TOTAL                       :  52.50             1.37            53.88             893901 kB
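
For anyone who wants to rank the passes in a report like the one above, here is a
minimal Python sketch. It is not part of the original comment: the names SAMPLE,
ROW, parse_time_report and top_passes are made up for illustration, and it assumes
each row has already been joined onto a single line (mail wrapping may split rows
in two). The five sample rows are copied from the 4.5 branch timings in this
comment.

```python
import re

# A few rows copied from the 4.5 branch report, one row per line.
SAMPLE = """\
 tree iv optimization  :   5.80 (11%) usr   0.06 ( 4%) sys   5.94 (11%) wall   95356 kB (11%) ggc
 variable tracking     :   3.86 ( 7%) usr   0.03 ( 2%) sys   3.99 ( 7%) wall   39804 kB ( 4%) ggc
 integrated RA         :   3.51 ( 7%) usr   0.01 ( 1%) sys   3.74 ( 7%) wall   12746 kB ( 1%) ggc
 scheduling 2          :   3.30 ( 6%) usr   0.04 ( 3%) sys   3.10 ( 6%) wall    1216 kB ( 0%) ggc
 expand                :   3.24 ( 6%) usr   0.07 ( 5%) sys   3.34 ( 6%) wall   69633 kB ( 8%) ggc
"""

# Matches "<pass name> : <usr> (N%) usr <sys> (N%) sys <wall> (N%) wall <kB> kB (N%) ggc".
# Lines without percentages (such as TOTAL) do not match and are skipped.
ROW = re.compile(
    r"^\s*(?P<name>.+?)\s*:\s*"
    r"(?P<usr>\d+\.\d+)\s*\(\s*\d+%\)\s*usr\s+"
    r"(?P<sys>\d+\.\d+)\s*\(\s*\d+%\)\s*sys\s+"
    r"(?P<wall>\d+\.\d+)\s*\(\s*\d+%\)\s*wall\s+"
    r"(?P<ggc>\d+)\s*kB\s*\(\s*\d+%\)\s*ggc\s*$"
)


def parse_time_report(text):
    """Return one dict per recognised pass row in the report text."""
    rows = []
    for line in text.splitlines():
        m = ROW.match(line)
        if m is None:
            continue  # header, TOTAL, or anything else that is not a pass row
        rows.append({
            "name": m.group("name"),
            "usr": float(m.group("usr")),
            "sys": float(m.group("sys")),
            "wall": float(m.group("wall")),
            "ggc_kb": int(m.group("ggc")),
        })
    return rows


def top_passes(rows, n=3):
    """Return the n rows with the largest user time, largest first."""
    return sorted(rows, key=lambda r: r["usr"], reverse=True)[:n]


if __name__ == "__main__":
    for row in top_passes(parse_time_report(SAMPLE)):
        print(f'{row["name"]:<24} {row["usr"]:6.2f} s usr {row["ggc_kb"]:8d} kB ggc')
```

Running the same parser over reports from two compiler versions and comparing
the per-pass usr values is one way to see which pass accounts for a slowdown.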


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=45422


^ permalink raw reply	[flat|nested] 28+ messages in thread

* [Bug middle-end/45422] [4.6 Regression] compile time increases 3x.
  2010-08-26 18:33 [Bug middle-end/45422] New: [4.6 Regression] compile time increases 8x jv244 at cam dot ac dot uk
  2010-08-29  6:38 ` [Bug middle-end/45422] [4.6 Regression] compile time increases 3x jv244 at cam dot ac dot uk
@ 2010-08-29  9:26 ` rguenth at gcc dot gnu dot org
  2010-08-29 15:07 ` jv244 at cam dot ac dot uk
                   ` (7 subsequent siblings)
  9 siblings, 0 replies; 28+ messages in thread
From: rguenth at gcc dot gnu dot org @ 2010-08-29  9:26 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #17 from rguenth at gcc dot gnu dot org  2010-08-29 09:25 -------
 tree iv optimization  :  32.57 (20%) usr   0.10 ( 5%) sys  32.73 (20%) wall  322095 kB (18%) ggc


20% is still completely unreasonable for IV optimization.


-- 

rguenth at gcc dot gnu dot org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|WAITING                     |NEW
   Last reconfirmed|2010-08-29 06:38:26         |2010-08-29 09:25:52
               date|                            |


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=45422


^ permalink raw reply	[flat|nested] 28+ messages in thread

* [Bug middle-end/45422] [4.6 Regression] compile time increases 3x.
  2010-08-26 18:33 [Bug middle-end/45422] New: [4.6 Regression] compile time increases 8x jv244 at cam dot ac dot uk
@ 2010-08-29  6:38 ` jv244 at cam dot ac dot uk
  2010-08-29  9:26 ` rguenth at gcc dot gnu dot org
                   ` (8 subsequent siblings)
  9 siblings, 0 replies; 28+ messages in thread
From: jv244 at cam dot ac dot uk @ 2010-08-29  6:38 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #16 from jv244 at cam dot ac dot uk  2010-08-29 06:38 -------
adjust summary according to the last timings


-- 

jv244 at cam dot ac dot uk changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
   Last reconfirmed|2010-08-29 05:31:37         |2010-08-29 06:38:26
               date|                            |
            Summary|[4.6 Regression] compile    |[4.6 Regression] compile
                   |time increases 5x.          |time increases 3x.


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=45422


^ permalink raw reply	[flat|nested] 28+ messages in thread

end of thread, other threads:[~2011-01-27 17:34 UTC | newest]

Thread overview: 28+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
     [not found] <bug-45422-4@http.gcc.gnu.org/bugzilla/>
2011-01-17 11:59 ` [Bug middle-end/45422] [4.6 Regression] compile time increases 3x Joost.VandeVondele at pci dot uzh.ch
2011-01-21 10:31 ` jakub at gcc dot gnu.org
2011-01-21 16:46 ` xinliangli at gmail dot com
2011-01-21 20:08 ` xinliangli at gmail dot com
2011-01-21 21:01 ` xinliangli at gmail dot com
2011-01-25  9:47 ` jakub at gcc dot gnu.org
2011-01-25  9:51 ` Joost.VandeVondele at pci dot uzh.ch
2011-01-25 10:03 ` jakub at gcc dot gnu.org
2011-01-25 10:25 ` Joost.VandeVondele at pci dot uzh.ch
2011-01-25 17:58 ` xinliangli at gmail dot com
2011-01-27 16:03 ` jakub at gcc dot gnu.org
2011-01-27 16:17 ` jakub at gcc dot gnu.org
2011-01-27 16:29 ` rguenth at gcc dot gnu.org
2011-01-27 16:31 ` rguenth at gcc dot gnu.org
2011-01-27 16:40 ` jakub at gcc dot gnu.org
2011-01-27 16:51 ` rguenth at gcc dot gnu.org
2011-01-27 16:55 ` jakub at gcc dot gnu.org
2011-01-27 17:55 ` xinliangli at gmail dot com
2010-08-26 18:33 [Bug middle-end/45422] New: [4.6 Regression] compile time increases 8x jv244 at cam dot ac dot uk
2010-08-29  6:38 ` [Bug middle-end/45422] [4.6 Regression] compile time increases 3x jv244 at cam dot ac dot uk
2010-08-29  9:26 ` rguenth at gcc dot gnu dot org
2010-08-29 15:07 ` jv244 at cam dot ac dot uk
2010-08-30  3:11 ` davidxl at gcc dot gnu dot org
2010-08-30  3:19 ` davidxl at gcc dot gnu dot org
2010-08-30  7:12 ` rguenth at gcc dot gnu dot org
2010-08-30  7:12 ` rguenth at gcc dot gnu dot org
2010-08-30 16:41 ` davidxl at gcc dot gnu dot org
2010-08-31 17:45 ` davidxl at gcc dot gnu dot org
2010-09-02 11:25 ` rguenth at gcc dot gnu dot org
