public inbox for gcc-bugs@sourceware.org
* [Bug middle-end/38474]  New: slow compilation at -O0 (callgraph optimization, inline heuristics, ggc expand )
@ 2008-12-10 15:26 jv244 at cam dot ac dot uk
  2008-12-10 15:27 ` [Bug middle-end/38474] " jv244 at cam dot ac dot uk
                   ` (50 more replies)
  0 siblings, 51 replies; 52+ messages in thread
From: jv244 at cam dot ac dot uk @ 2008-12-10 15:26 UTC (permalink / raw)
  To: gcc-bugs

With current trunk, the attached testcase (~15K lines) takes about 15 min to
compile (and 2.3 GB of memory). To reproduce:

gfortran -ffree-line-length-512 -g -c testcase.f90

The issue seems clear from the timing report:

Execution times (seconds)
 garbage collection    :   1.37 ( 0%) usr   0.00 ( 0%) sys   1.37 ( 0%) wall       0 kB ( 0%) ggc
 callgraph construction:   0.14 ( 0%) usr   0.01 ( 0%) sys   0.15 ( 0%) wall   12498 kB ( 2%) ggc
 callgraph optimization: 202.27 (24%) usr   1.57 (21%) sys 204.09 (24%) wall    2304 kB ( 0%) ggc
 cfg cleanup           :   0.04 ( 0%) usr   0.00 ( 0%) sys   0.03 ( 0%) wall       0 kB ( 0%) ggc
 CFG verifier          :   1.32 ( 0%) usr   0.01 ( 0%) sys   1.32 ( 0%) wall       0 kB ( 0%) ggc
 trivially dead code   :   0.52 ( 0%) usr   0.00 ( 0%) sys   0.52 ( 0%) wall       0 kB ( 0%) ggc
 df live regs          :   0.30 ( 0%) usr   0.00 ( 0%) sys   0.30 ( 0%) wall       0 kB ( 0%) ggc
 df reg dead/unused notes:   0.63 ( 0%) usr   0.01 ( 0%) sys   0.64 ( 0%) wall   25889 kB ( 4%) ggc
 register information  :   0.33 ( 0%) usr   0.01 ( 0%) sys   0.35 ( 0%) wall       0 kB ( 0%) ggc
 alias analysis        :   0.29 ( 0%) usr   0.00 ( 0%) sys   0.30 ( 0%) wall    8335 kB ( 1%) ggc
 rebuild jump labels   :   0.34 ( 0%) usr   0.00 ( 0%) sys   0.33 ( 0%) wall       0 kB ( 0%) ggc
 parser                :   5.26 ( 1%) usr   0.10 ( 1%) sys   5.37 ( 1%) wall   59009 kB ( 9%) ggc
 inline heuristics     : 402.30 (48%) usr   3.15 (42%) sys 406.73 (48%) wall       0 kB ( 0%) ggc
 tree gimplify         :   0.21 ( 0%) usr   0.01 ( 0%) sys   0.21 ( 0%) wall   16835 kB ( 2%) ggc
 tree eh               :   0.01 ( 0%) usr   0.00 ( 0%) sys   0.00 ( 0%) wall       0 kB ( 0%) ggc
 tree CFG construction :   0.00 ( 0%) usr   0.00 ( 0%) sys   0.01 ( 0%) wall     180 kB ( 0%) ggc
 tree CFG cleanup      :   0.01 ( 0%) usr   0.00 ( 0%) sys   0.00 ( 0%) wall       0 kB ( 0%) ggc
 tree find ref. vars   :   0.04 ( 0%) usr   0.00 ( 0%) sys   0.04 ( 0%) wall    7875 kB ( 1%) ggc
 tree SSA other        :   0.03 ( 0%) usr   0.01 ( 0%) sys   0.04 ( 0%) wall       0 kB ( 0%) ggc
 tree operand scan     :   0.03 ( 0%) usr   0.02 ( 0%) sys   0.05 ( 0%) wall     236 kB ( 0%) ggc
 tree SSA to normal    :   0.01 ( 0%) usr   0.00 ( 0%) sys   0.01 ( 0%) wall       0 kB ( 0%) ggc
 tree SSA verifier     :   0.02 ( 0%) usr   0.00 ( 0%) sys   0.01 ( 0%) wall       0 kB ( 0%) ggc
 tree STMT verifier    :   0.57 ( 0%) usr   0.06 ( 1%) sys   0.64 ( 0%) wall       0 kB ( 0%) ggc
 callgraph verifier    :   0.24 ( 0%) usr   0.00 ( 0%) sys   0.24 ( 0%) wall       0 kB ( 0%) ggc
 expand                : 207.53 (25%) usr   2.16 (29%) sys 210.04 (25%) wall  366504 kB (54%) ggc
 integrated RA         :  11.67 ( 1%) usr   0.15 ( 2%) sys  11.84 ( 1%) wall    8700 kB ( 1%) ggc
 reload                :   5.34 ( 1%) usr   0.20 ( 3%) sys   5.53 ( 1%) wall  163863 kB (24%) ggc
 thread pro- & epilogue:   0.50 ( 0%) usr   0.00 ( 0%) sys   0.49 ( 0%) wall     174 kB ( 0%) ggc
 final                 :   1.86 ( 0%) usr   0.10 ( 1%) sys   1.99 ( 0%) wall    7380 kB ( 1%) ggc
 symout                :   0.04 ( 0%) usr   0.00 ( 0%) sys   0.04 ( 0%) wall    4777 kB ( 1%) ggc
 TOTAL                 : 843.24             7.57           852.70             684846 kB


-- 
           Summary: slow compilation at -O0 (callgraph optimization, inline
                    heuristics, ggc expand )
           Product: gcc
           Version: 4.4.0
            Status: UNCONFIRMED
          Keywords: compile-time-hog
          Severity: normal
          Priority: P3
         Component: middle-end
        AssignedTo: unassigned at gcc dot gnu dot org
        ReportedBy: jv244 at cam dot ac dot uk


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=38474



* [Bug middle-end/38474] slow compilation at -O0 (callgraph optimization, inline heuristics, ggc expand )
  2008-12-10 15:26 [Bug middle-end/38474] New: slow compilation at -O0 (callgraph optimization, inline heuristics, ggc expand ) jv244 at cam dot ac dot uk
@ 2008-12-10 15:27 ` jv244 at cam dot ac dot uk
  2008-12-10 15:41 ` [Bug middle-end/38474] slow compilation at -O0 (callgraph optimization, inline heuristics, " rguenth at gcc dot gnu dot org
                   ` (49 subsequent siblings)
  50 siblings, 0 replies; 52+ messages in thread
From: jv244 at cam dot ac dot uk @ 2008-12-10 15:27 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #1 from jv244 at cam dot ac dot uk  2008-12-10 15:25 -------
Created an attachment (id=16873)
 --> (http://gcc.gnu.org/bugzilla/attachment.cgi?id=16873&action=view)
testcase



* [Bug middle-end/38474] slow compilation at -O0 (callgraph optimization, inline heuristics, expand )
  2008-12-10 15:26 [Bug middle-end/38474] New: slow compilation at -O0 (callgraph optimization, inline heuristics, ggc expand ) jv244 at cam dot ac dot uk
  2008-12-10 15:27 ` [Bug middle-end/38474] " jv244 at cam dot ac dot uk
@ 2008-12-10 15:41 ` rguenth at gcc dot gnu dot org
  2008-12-10 16:14 ` jv244 at cam dot ac dot uk
                   ` (48 subsequent siblings)
  50 siblings, 0 replies; 52+ messages in thread
From: rguenth at gcc dot gnu dot org @ 2008-12-10 15:41 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #2 from rguenth at gcc dot gnu dot org  2008-12-10 15:39 -------
Confirmed.  4.3 is worse (I ran out of memory).

Probably the FE presents us with sth funny.


-- 

rguenth at gcc dot gnu dot org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |NEW
     Ever Confirmed|0                           |1
           Keywords|                            |memory-hog
   Last reconfirmed|0000-00-00 00:00:00         |2008-12-10 15:39:38
               date|                            |



* [Bug middle-end/38474] slow compilation at -O0 (callgraph optimization, inline heuristics, expand )
  2008-12-10 15:26 [Bug middle-end/38474] New: slow compilation at -O0 (callgraph optimization, inline heuristics, ggc expand ) jv244 at cam dot ac dot uk
  2008-12-10 15:27 ` [Bug middle-end/38474] " jv244 at cam dot ac dot uk
  2008-12-10 15:41 ` [Bug middle-end/38474] slow compilation at -O0 (callgraph optimization, inline heuristics, " rguenth at gcc dot gnu dot org
@ 2008-12-10 16:14 ` jv244 at cam dot ac dot uk
  2008-12-10 16:58 ` rguenth at gcc dot gnu dot org
                   ` (47 subsequent siblings)
  50 siblings, 0 replies; 52+ messages in thread
From: jv244 at cam dot ac dot uk @ 2008-12-10 16:14 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #3 from jv244 at cam dot ac dot uk  2008-12-10 16:13 -------
(In reply to comment #2)
> Confirmed.  4.3 is worse (I ran out of memory).
> 
> Probably the FE presents us with sth funny.
> 
actually, I just got a timing report from 4.3 [4.3.1 20080507 (prerelease)
[gcc-4_3-branch revision 135036]] (on a different machine, but with roughly the
same clock speed, and plenty of RAM):

Execution times (seconds)
 garbage collection    :    1.05 ( 0%) usr   0.00 ( 0%) sys    1.05 ( 0%) wall       0 kB ( 0%) ggc
 cfg cleanup           :    0.00 ( 0%) usr   0.00 ( 0%) sys    0.01 ( 0%) wall       0 kB ( 0%) ggc
 trivially dead code   :    0.70 ( 0%) usr   0.00 ( 0%) sys    0.70 ( 0%) wall       0 kB ( 0%) ggc
 df live regs          :    0.96 ( 0%) usr   0.00 ( 0%) sys    0.96 ( 0%) wall       0 kB ( 0%) ggc
 df reg dead/unused notes:    1.11 ( 0%) usr   0.00 ( 0%) sys    1.12 ( 0%) wall   25889 kB ( 4%) ggc
 register information  :    0.90 ( 0%) usr   0.00 ( 0%) sys    0.89 ( 0%) wall       0 kB ( 0%) ggc
 alias analysis        :    0.82 ( 0%) usr   0.00 ( 0%) sys    0.83 ( 0%) wall    8335 kB ( 1%) ggc
 rebuild jump labels   :    1.10 ( 0%) usr   0.00 ( 0%) sys    1.10 ( 0%) wall       0 kB ( 0%) ggc
 parser                :    3.02 ( 0%) usr   0.12 ( 1%) sys    3.96 ( 0%) wall   75960 kB (12%) ggc
 inline heuristics     : 1862.97 (65%) usr   4.94 (59%) sys 2078.10 (65%) wall       1 kB ( 0%) ggc
 tree gimplify         :    0.48 ( 0%) usr   0.00 ( 0%) sys    0.47 ( 0%) wall    3446 kB ( 1%) ggc
 tree eh               :    0.01 ( 0%) usr   0.00 ( 0%) sys    0.01 ( 0%) wall       0 kB ( 0%) ggc
 tree CFG construction :    0.03 ( 0%) usr   0.00 ( 0%) sys    0.03 ( 0%) wall    7151 kB ( 1%) ggc
 expand                :  967.65 (34%) usr   2.96 (35%) sys 1102.03 (34%) wall  357297 kB (54%) ggc
 local alloc           :    5.22 ( 0%) usr   0.08 ( 1%) sys    5.29 ( 0%) wall    8652 kB ( 1%) ggc
 global alloc          :   12.28 ( 0%) usr   0.27 ( 3%) sys   12.59 ( 0%) wall  163884 kB (25%) ggc
 thread pro- & epilogue:    0.74 ( 0%) usr   0.00 ( 0%) sys    0.75 ( 0%) wall     172 kB ( 0%) ggc
 final                 :    2.91 ( 0%) usr   0.05 ( 1%) sys    2.92 ( 0%) wall     541 kB ( 0%) ggc
 symout                :    0.06 ( 0%) usr   0.01 ( 0%) sys    0.07 ( 0%) wall    5690 kB ( 1%) ggc
 TOTAL                 : 2862.03             8.43           3212.90             657217 kB



* [Bug middle-end/38474] slow compilation at -O0 (callgraph optimization, inline heuristics, expand )
  2008-12-10 15:26 [Bug middle-end/38474] New: slow compilation at -O0 (callgraph optimization, inline heuristics, ggc expand ) jv244 at cam dot ac dot uk
                   ` (2 preceding siblings ...)
  2008-12-10 16:14 ` jv244 at cam dot ac dot uk
@ 2008-12-10 16:58 ` rguenth at gcc dot gnu dot org
  2008-12-10 22:35 ` jv244 at cam dot ac dot uk
                   ` (46 subsequent siblings)
  50 siblings, 0 replies; 52+ messages in thread
From: rguenth at gcc dot gnu dot org @ 2008-12-10 16:58 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #4 from rguenth at gcc dot gnu dot org  2008-12-10 16:57 -------
Could you capture the memory requirements on the 4.3 branch?



* [Bug middle-end/38474] slow compilation at -O0 (callgraph optimization, inline heuristics, expand )
  2008-12-10 15:26 [Bug middle-end/38474] New: slow compilation at -O0 (callgraph optimization, inline heuristics, ggc expand ) jv244 at cam dot ac dot uk
                   ` (3 preceding siblings ...)
  2008-12-10 16:58 ` rguenth at gcc dot gnu dot org
@ 2008-12-10 22:35 ` jv244 at cam dot ac dot uk
  2008-12-10 22:49 ` rguenth at gcc dot gnu dot org
                   ` (45 subsequent siblings)
  50 siblings, 0 replies; 52+ messages in thread
From: jv244 at cam dot ac dot uk @ 2008-12-10 22:35 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #5 from jv244 at cam dot ac dot uk  2008-12-10 22:34 -------
(In reply to comment #4)
> Could you capture the memory requirements on the 4.3 branch?

I watched top (for 4.3.1), but can't recall anything more than 3Gb. It's a bit
boring to watch top for 45min.... any better approach?



* [Bug middle-end/38474] slow compilation at -O0 (callgraph optimization, inline heuristics, expand )
  2008-12-10 15:26 [Bug middle-end/38474] New: slow compilation at -O0 (callgraph optimization, inline heuristics, ggc expand ) jv244 at cam dot ac dot uk
                   ` (4 preceding siblings ...)
  2008-12-10 22:35 ` jv244 at cam dot ac dot uk
@ 2008-12-10 22:49 ` rguenth at gcc dot gnu dot org
  2008-12-10 22:58 ` jv244 at cam dot ac dot uk
                   ` (44 subsequent siblings)
  50 siblings, 0 replies; 52+ messages in thread
From: rguenth at gcc dot gnu dot org @ 2008-12-10 22:49 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #6 from rguenth at gcc dot gnu dot org  2008-12-10 22:48 -------
Created an attachment (id=16881)
 --> (http://gcc.gnu.org/bugzilla/attachment.cgi?id=16881&action=view)
memory measurement tool

Of course!  Try the attached with just

~/bin/maxmem2.sh gfortran ...
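
For readers without the attached script, one alternative to watching top is to
ask the kernel for the child's resource usage once it exits. The following is a
minimal C sketch under stated assumptions, not the attached maxmem2.sh: it
assumes a POSIX-like system that provides wait4(), and the function name
peak_rss_of_command is hypothetical. It reports the peak resident set size
(kilobytes on Linux), which may differ from figures read off top.

```c
#define _DEFAULT_SOURCE   /* for wait4() with glibc in strict modes */
#include <assert.h>
#include <stddef.h>
#include <sys/resource.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Run the NULL-terminated command ARGV and return the peak resident set
   size the kernel reports for it, or -1 if fork/wait4 fails.  On Linux
   the value is in kilobytes; other systems may use different units.
   The child's exit status is not inspected.  */
long
peak_rss_of_command (char *const argv[])
{
  struct rusage ru;
  int status;
  pid_t pid;

  assert (argv != NULL && argv[0] != NULL);

  pid = fork ();
  if (pid < 0)
    return -1;
  if (pid == 0)
    {
      execvp (argv[0], argv);
      _exit (127);              /* only reached if exec failed */
    }

  /* wait4 blocks until the child exits and fills RU with its usage.  */
  if (wait4 (pid, &status, 0, &ru) < 0)
    return -1;
  return ru.ru_maxrss;
}
```

Called with an argv such as { "gfortran", "-c", "testcase.f90", NULL }, it
returns the peak once the compile has finished, so nobody has to sit through
the run.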



* [Bug middle-end/38474] slow compilation at -O0 (callgraph optimization, inline heuristics, expand )
  2008-12-10 15:26 [Bug middle-end/38474] New: slow compilation at -O0 (callgraph optimization, inline heuristics, ggc expand ) jv244 at cam dot ac dot uk
                   ` (5 preceding siblings ...)
  2008-12-10 22:49 ` rguenth at gcc dot gnu dot org
@ 2008-12-10 22:58 ` jv244 at cam dot ac dot uk
  2008-12-11  8:28 ` jv244 at cam dot ac dot uk
                   ` (43 subsequent siblings)
  50 siblings, 0 replies; 52+ messages in thread
From: jv244 at cam dot ac dot uk @ 2008-12-10 22:58 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #7 from jv244 at cam dot ac dot uk  2008-12-10 22:57 -------
(In reply to comment #6)
> Created an attachment (id=16881)
 --> (http://gcc.gnu.org/bugzilla/attachment.cgi?id=16881&action=view) [edit]
> memory measurement tool
> 
> Of course!  Try the attached with just
> 
> ~/bin/maxmem2.sh gfortran ...
> 
ugh how intuitive... but very useful. Will try to run it tomorrow.



* [Bug middle-end/38474] slow compilation at -O0 (callgraph optimization, inline heuristics, expand )
  2008-12-10 15:26 [Bug middle-end/38474] New: slow compilation at -O0 (callgraph optimization, inline heuristics, ggc expand ) jv244 at cam dot ac dot uk
                   ` (6 preceding siblings ...)
  2008-12-10 22:58 ` jv244 at cam dot ac dot uk
@ 2008-12-11  8:28 ` jv244 at cam dot ac dot uk
  2008-12-11 11:35 ` rguenth at gcc dot gnu dot org
                   ` (42 subsequent siblings)
  50 siblings, 0 replies; 52+ messages in thread
From: jv244 at cam dot ac dot uk @ 2008-12-11  8:28 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #8 from jv244 at cam dot ac dot uk  2008-12-11 08:27 -------
(In reply to comment #6)
> Created an attachment (id=16881)
 --> (http://gcc.gnu.org/bugzilla/attachment.cgi?id=16881&action=view) [edit]
> memory measurement tool
> 
> Of course!  Try the attached with just
> 
> ~/bin/maxmem2.sh gfortran ...
> 
Hmmm. So this is what is returned:

4.3.3 is GNU Fortran (GCC) 4.3.3 20080912 (prerelease)
trunk is GNU Fortran (GCC) 4.4.0 20081206 (experimental) [trunk revision 142525]

4.3.3: 899675 kB (and about 33min)
trunk: 1145308 kB (and about 45min).

this is on the same machine, so times can be compared (modulo enabled
checking?).

However, for the memory usage, top (oh no) showed 2.3-2.5Gb, which is quite
different from what the script returns?



* [Bug middle-end/38474] slow compilation at -O0 (callgraph optimization, inline heuristics, expand )
  2008-12-10 15:26 [Bug middle-end/38474] New: slow compilation at -O0 (callgraph optimization, inline heuristics, ggc expand ) jv244 at cam dot ac dot uk
                   ` (7 preceding siblings ...)
  2008-12-11  8:28 ` jv244 at cam dot ac dot uk
@ 2008-12-11 11:35 ` rguenth at gcc dot gnu dot org
  2008-12-11 12:04 ` jv244 at cam dot ac dot uk
                   ` (41 subsequent siblings)
  50 siblings, 0 replies; 52+ messages in thread
From: rguenth at gcc dot gnu dot org @ 2008-12-11 11:35 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #9 from rguenth at gcc dot gnu dot org  2008-12-11 11:33 -------
The script only has received testing on linux systems, so if you are running
somewhere else it is likely that either the regexps do not match or that
you require different/additional syscalls to be traced.  It's not perfect, but
it works reliably for me.



* [Bug middle-end/38474] slow compilation at -O0 (callgraph optimization, inline heuristics, expand )
  2008-12-10 15:26 [Bug middle-end/38474] New: slow compilation at -O0 (callgraph optimization, inline heuristics, ggc expand ) jv244 at cam dot ac dot uk
                   ` (8 preceding siblings ...)
  2008-12-11 11:35 ` rguenth at gcc dot gnu dot org
@ 2008-12-11 12:04 ` jv244 at cam dot ac dot uk
  2008-12-15 19:39 ` jv244 at cam dot ac dot uk
                   ` (40 subsequent siblings)
  50 siblings, 0 replies; 52+ messages in thread
From: jv244 at cam dot ac dot uk @ 2008-12-11 12:04 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #10 from jv244 at cam dot ac dot uk  2008-12-11 12:02 -------
(In reply to comment #9)
> The script only has received testing on linux systems, so if you are running
> somewhere else it is likely that either the regexps do not match or that
> you require different/additional syscalls to be traced.  It's not perfect, but
> it works reliably for me.
> 
no, it is a linux box (actually SUSE as far as I can tell,
> uname -a
 Linux pcihopt3 2.6.16.21-0.8-smp #1 SMP Mon Jul 3 18:25:39 UTC 2006 x86_64
x86_64 x86_64 GNU/Linux
> cat /etc/SuSE-release
SUSE Linux Enterprise Server 10 (x86_64)
VERSION = 10
).



* [Bug middle-end/38474] slow compilation at -O0 (callgraph optimization, inline heuristics, expand )
  2008-12-10 15:26 [Bug middle-end/38474] New: slow compilation at -O0 (callgraph optimization, inline heuristics, ggc expand ) jv244 at cam dot ac dot uk
                   ` (9 preceding siblings ...)
  2008-12-11 12:04 ` jv244 at cam dot ac dot uk
@ 2008-12-15 19:39 ` jv244 at cam dot ac dot uk
  2008-12-15 21:19 ` steven at gcc dot gnu dot org
                   ` (39 subsequent siblings)
  50 siblings, 0 replies; 52+ messages in thread
From: jv244 at cam dot ac dot uk @ 2008-12-15 19:39 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #11 from jv244 at cam dot ac dot uk  2008-12-15 19:38 -------
as this file is included in a project compiled normally with '-O3 -march=native
-funroll-loops' the timing in that case is also important. As I'm finding out,
this becomes unworkable (>6h, and still compiling).

Looking at -fdump-tree-original, the overloaded operators (+,-,..) expand to
function calls, leading to a subroutine which contains 73000 function calls. So
it is likely that some stuff is scaling at least quadratically wrt this parameter.


-- 

jv244 at cam dot ac dot uk changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Severity|normal                      |critical



* [Bug middle-end/38474] slow compilation at -O0 (callgraph optimization, inline heuristics, expand )
  2008-12-10 15:26 [Bug middle-end/38474] New: slow compilation at -O0 (callgraph optimization, inline heuristics, ggc expand ) jv244 at cam dot ac dot uk
                   ` (10 preceding siblings ...)
  2008-12-15 19:39 ` jv244 at cam dot ac dot uk
@ 2008-12-15 21:19 ` steven at gcc dot gnu dot org
  2008-12-15 21:29 ` steven at gcc dot gnu dot org
                   ` (38 subsequent siblings)
  50 siblings, 0 replies; 52+ messages in thread
From: steven at gcc dot gnu dot org @ 2008-12-15 21:19 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #12 from steven at gcc dot gnu dot org  2008-12-15 21:17 -------
One of the bottlenecks seems to be find_temp_slot_from_address.



* [Bug middle-end/38474] slow compilation at -O0 (callgraph optimization, inline heuristics, expand )
  2008-12-10 15:26 [Bug middle-end/38474] New: slow compilation at -O0 (callgraph optimization, inline heuristics, ggc expand ) jv244 at cam dot ac dot uk
                   ` (11 preceding siblings ...)
  2008-12-15 21:19 ` steven at gcc dot gnu dot org
@ 2008-12-15 21:29 ` steven at gcc dot gnu dot org
  2008-12-15 21:54 ` steven at gcc dot gnu dot org
                   ` (37 subsequent siblings)
  50 siblings, 0 replies; 52+ messages in thread
From: steven at gcc dot gnu dot org @ 2008-12-15 21:29 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #13 from steven at gcc dot gnu dot org  2008-12-15 21:27 -------
OK, to elaborate: I'm playing with this test case on ia64-linux, and I reduced
the test case by some 8000 lines to make it compilable at all.  With these 8000
lines removed, it actually spends more time for me in "expand", in the function
"find_temp_slot_from_address (rtx x)".  It spends all of its time...

  for (i = max_slot_level (); i >= 0; i--)
    for (p = *temp_slots_at_level (i); p; p = p->next)
      {
        if (XEXP (p->slot, 0) == x
            || p->address == x
            || (GET_CODE (x) == PLUS
                && XEXP (x, 0) == virtual_stack_vars_rtx
                && GET_CODE (XEXP (x, 1)) == CONST_INT
                && INTVAL (XEXP (x, 1)) >= p->base_offset
                && INTVAL (XEXP (x, 1)) < p->base_offset + p->full_size))
          return p;

        else if (p->address != 0 && GET_CODE (p->address) == EXPR_LIST)
          for (next = p->address; next; next = XEXP (next, 1))
            if (XEXP (next, 0) == x)  /* ...here in  this loop... */
              return p;

in the "for (next = p->address; ...)" loop. This list in p->address is actually
several thousand items long and it is traversed many times:

traversals ~ max_slot_level()*temp_slots_at_level(i)*list length of p->address

which is, at best, cubic behavior.
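
The product above can be sketched as a small cost model. This is only an
illustration under the simplifying assumption that every level holds the same
number of slots and every slot carries an address list of the same length; the
function name worst_case_visits is hypothetical and is not part of GCC.

```c
#include <assert.h>
#include <stdint.h>

/* Worst-case number of list entries visited by ONE lookup that finds no
   match, assuming LEVELS slot levels, SLOTS_PER_LEVEL slots on each
   level, and LIST_LEN entries in each slot's p->address list
   (hypothetical uniform sizes).  */
uint64_t
worst_case_visits (uint64_t levels, uint64_t slots_per_level,
                   uint64_t list_len)
{
  uint64_t slots;

  /* Refuse inputs whose product would overflow 64 bits.  */
  assert (slots_per_level == 0 || levels <= UINT64_MAX / slots_per_level);
  slots = levels * slots_per_level;
  assert (list_len == 0 || slots <= UINT64_MAX / list_len);

  return slots * list_len;
}
```

Doubling all three factors multiplies the result by eight, which is the cubic
growth described above.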


-- 

steven at gcc dot gnu dot org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
   Last reconfirmed|2008-12-10 15:39:38         |2008-12-15 21:27:40
               date|                            |



* [Bug middle-end/38474] slow compilation at -O0 (callgraph optimization, inline heuristics, expand )
  2008-12-10 15:26 [Bug middle-end/38474] New: slow compilation at -O0 (callgraph optimization, inline heuristics, ggc expand ) jv244 at cam dot ac dot uk
                   ` (12 preceding siblings ...)
  2008-12-15 21:29 ` steven at gcc dot gnu dot org
@ 2008-12-15 21:54 ` steven at gcc dot gnu dot org
  2008-12-15 21:56 ` steven at gcc dot gnu dot org
                   ` (36 subsequent siblings)
  50 siblings, 0 replies; 52+ messages in thread
From: steven at gcc dot gnu dot org @ 2008-12-15 21:54 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #14 from steven at gcc dot gnu dot org  2008-12-15 21:53 -------
For the inline heuristics, almost all time is also spent in stack slot related
stuff. The culprit is estimate_stack_frame_size (or actually,
add_alias_set_conflicts) in cfgexpand.c.
(What are we doing in cfgexpand anyway, for inlining?!?)



* [Bug middle-end/38474] slow compilation at -O0 (callgraph optimization, inline heuristics, expand )
  2008-12-10 15:26 [Bug middle-end/38474] New: slow compilation at -O0 (callgraph optimization, inline heuristics, ggc expand ) jv244 at cam dot ac dot uk
                   ` (13 preceding siblings ...)
  2008-12-15 21:54 ` steven at gcc dot gnu dot org
@ 2008-12-15 21:56 ` steven at gcc dot gnu dot org
  2008-12-15 21:57 ` steven at gcc dot gnu dot org
                   ` (35 subsequent siblings)
  50 siblings, 0 replies; 52+ messages in thread
From: steven at gcc dot gnu dot org @ 2008-12-15 21:56 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #15 from steven at gcc dot gnu dot org  2008-12-15 21:55 -------
From cfgexpand.c:

static void
add_alias_set_conflicts (void)
{
  size_t i, j, n = stack_vars_num;

  for (i = 0; i < n; ++i)
    {
      tree type_i = TREE_TYPE (stack_vars[i].decl);
      bool aggr_i = AGGREGATE_TYPE_P (type_i);
      bool contains_union;

      contains_union = aggregate_contains_union_type (type_i);
      for (j = 0; j < i; ++j)
        {

Classic example of quadratic algorithm...



* [Bug middle-end/38474] slow compilation at -O0 (callgraph optimization, inline heuristics, expand )
  2008-12-10 15:26 [Bug middle-end/38474] New: slow compilation at -O0 (callgraph optimization, inline heuristics, ggc expand ) jv244 at cam dot ac dot uk
                   ` (14 preceding siblings ...)
  2008-12-15 21:56 ` steven at gcc dot gnu dot org
@ 2008-12-15 21:57 ` steven at gcc dot gnu dot org
  2008-12-16  7:52 ` jv244 at cam dot ac dot uk
                   ` (34 subsequent siblings)
  50 siblings, 0 replies; 52+ messages in thread
From: steven at gcc dot gnu dot org @ 2008-12-15 21:57 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #16 from steven at gcc dot gnu dot org  2008-12-15 21:56 -------
Oh, and FWIW, for yukawa_gn_full, stack_vars_num == 67551 for me.



* [Bug middle-end/38474] slow compilation at -O0 (callgraph optimization, inline heuristics, expand )
  2008-12-10 15:26 [Bug middle-end/38474] New: slow compilation at -O0 (callgraph optimization, inline heuristics, ggc expand ) jv244 at cam dot ac dot uk
                   ` (15 preceding siblings ...)
  2008-12-15 21:57 ` steven at gcc dot gnu dot org
@ 2008-12-16  7:52 ` jv244 at cam dot ac dot uk
  2008-12-16 11:59 ` jv244 at cam dot ac dot uk
                   ` (33 subsequent siblings)
  50 siblings, 0 replies; 52+ messages in thread
From: jv244 at cam dot ac dot uk @ 2008-12-16  7:52 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #17 from jv244 at cam dot ac dot uk  2008-12-16 07:51 -------
(In reply to comment #16)
> Oh, and FWIW, for yukawa_gn_full, stack_vars_num == 67551 for me.

Thanks for the analysis. Detailed enough to have me peek in the gcc code for
once.

This would mean that the array stack_vars_conflict takes about 2.2Gb, since it
is O(stack_vars_num**2/2) (assuming a bool is 8 bits), which is quite
consistent with what we see.
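
The estimate above can be checked with a few lines of arithmetic. This is a
sketch, assuming one table entry per unordered pair of stack variables,
n*(n-1)/2, which differs from n**2/2 only by n/2. The names conflict_pairs and
conflict_table_bytes are hypothetical and do not come from cfgexpand.c.

```c
#include <assert.h>
#include <stdint.h>

/* Number of unordered pairs among N stack variables: N * (N - 1) / 2.  */
uint64_t
conflict_pairs (uint64_t n)
{
  /* Keep N small enough that later multiplications cannot overflow.  */
  assert (n <= (UINT64_C (1) << 30));
  return n * (n - 1) / 2;
}

/* Bytes needed for a table with one entry per pair, where each entry
   takes BITS_PER_ENTRY bits (8 for a one-byte bool, 1 for a packed
   bitmap), rounded up to whole bytes.  */
uint64_t
conflict_table_bytes (uint64_t n, unsigned bits_per_entry)
{
  assert (bits_per_entry >= 1 && bits_per_entry <= 8);
  return (conflict_pairs (n) * bits_per_entry + 7) / 8;
}
```

For stack_vars_num == 67551 this gives 2281535025 pairs, i.e. roughly 2.28e9
bytes at one byte per entry, in line with the figure quoted above. With a
packed bitmap (an illustrative alternative, not something proposed in this
thread) the same table would need about 285 million bytes.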

There is already a function (defer_stack_allocation) that decides to give up
due to 'the quadratic problem'. Maybe gcc should use some drastic short-cut,
including not allocating the stack_vars_conflict array, as soon as ~10000 stack
variables are detected ?

BTW, the -O3 compilation is still running (for 17h now).



* [Bug middle-end/38474] slow compilation at -O0 (callgraph optimization, inline heuristics, expand )
  2008-12-10 15:26 [Bug middle-end/38474] New: slow compilation at -O0 (callgraph optimization, inline heuristics, ggc expand ) jv244 at cam dot ac dot uk
                   ` (16 preceding siblings ...)
  2008-12-16  7:52 ` jv244 at cam dot ac dot uk
@ 2008-12-16 11:59 ` jv244 at cam dot ac dot uk
  2008-12-16 12:46 ` steven at gcc dot gnu dot org
                   ` (32 subsequent siblings)
  50 siblings, 0 replies; 52+ messages in thread
From: jv244 at cam dot ac dot uk @ 2008-12-16 11:59 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #18 from jv244 at cam dot ac dot uk  2008-12-16 11:58 -------
(In reply to comment #17)
> BTW, the -O3 compilation is still running (for 17h now).
finished successfully after 23h... 



* [Bug middle-end/38474] slow compilation at -O0 (callgraph optimization, inline heuristics, expand )
  2008-12-10 15:26 [Bug middle-end/38474] New: slow compilation at -O0 (callgraph optimization, inline heuristics, ggc expand ) jv244 at cam dot ac dot uk
                   ` (17 preceding siblings ...)
  2008-12-16 11:59 ` jv244 at cam dot ac dot uk
@ 2008-12-16 12:46 ` steven at gcc dot gnu dot org
  2008-12-16 12:48 ` jv244 at cam dot ac dot uk
                   ` (31 subsequent siblings)
  50 siblings, 0 replies; 52+ messages in thread
From: steven at gcc dot gnu dot org @ 2008-12-16 12:46 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #19 from steven at gcc dot gnu dot org  2008-12-16 12:45 -------
Re. comment #18, I'd say "brilliant" if it wasn't such a poor performance :-)

Did you manage to get a time report out of that run?



* [Bug middle-end/38474] slow compilation at -O0 (callgraph optimization, inline heuristics, expand )
  2008-12-10 15:26 [Bug middle-end/38474] New: slow compilation at -O0 (callgraph optimization, inline heuristics, ggc expand ) jv244 at cam dot ac dot uk
                   ` (18 preceding siblings ...)
  2008-12-16 12:46 ` steven at gcc dot gnu dot org
@ 2008-12-16 12:48 ` jv244 at cam dot ac dot uk
  2008-12-16 12:50 ` jv244 at cam dot ac dot uk
                   ` (30 subsequent siblings)
  50 siblings, 0 replies; 52+ messages in thread
From: jv244 at cam dot ac dot uk @ 2008-12-16 12:48 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #20 from jv244 at cam dot ac dot uk  2008-12-16 12:47 -------
(In reply to comment #19)
> Re. comment #18, I'd say "brilliant" if it wasn't such a poor performance :-)

I agree... quite an achievement not to crash in such a case.

> Did you manage to get a time report out of that run?

no... obviously I can rerun this (numbers tomorrow, of course)?



* [Bug middle-end/38474] slow compilation at -O0 (callgraph optimization, inline heuristics, expand )
  2008-12-10 15:26 [Bug middle-end/38474] New: slow compilation at -O0 (callgraph optimization, inline heuristics, ggc expand ) jv244 at cam dot ac dot uk
                   ` (19 preceding siblings ...)
  2008-12-16 12:48 ` jv244 at cam dot ac dot uk
@ 2008-12-16 12:50 ` jv244 at cam dot ac dot uk
  2008-12-16 13:43 ` steven at gcc dot gnu dot org
                   ` (29 subsequent siblings)
  50 siblings, 0 replies; 52+ messages in thread
From: jv244 at cam dot ac dot uk @ 2008-12-16 12:50 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #21 from jv244 at cam dot ac dot uk  2008-12-16 12:48 -------
(In reply to comment #16)
> Oh, and FWIW, for yukawa_gn_full, stack_vars_num == 67551 for me.

btw, that routine only has 3800 user variables, the rest are FE generated
temporaries (which should have a limited lifetime).



* [Bug middle-end/38474] slow compilation at -O0 (callgraph optimization, inline heuristics, expand )
From: steven at gcc dot gnu dot org @ 2008-12-16 13:43 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #22 from steven at gcc dot gnu dot org  2008-12-16 13:41 -------
We may be better off with a slightly reduced test case for the -O3 report.
It's not difficult to cut out ~8000 lines (like I did yesterday) and still have
a huge test case (and the horrendous compile times to go with that).



* [Bug middle-end/38474] slow compilation at -O0 (callgraph optimization, inline heuristics, expand )
From: jv244 at cam dot ac dot uk @ 2008-12-16 14:19 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #23 from jv244 at cam dot ac dot uk  2008-12-16 14:17 -------
Created an attachment (id=16913)
 --> (http://gcc.gnu.org/bugzilla/attachment.cgi?id=16913&action=view)
reduced testcase

Just so we talk about the same file, I've reduced the testcase to a more
manageable size. This one compiles in about 1min at -O0. I'll attach
time-report output in a sec.



* [Bug middle-end/38474] slow compilation at -O0 (callgraph optimization, inline heuristics, expand )
From: jv244 at cam dot ac dot uk @ 2008-12-16 14:21 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #24 from jv244 at cam dot ac dot uk  2008-12-16 14:20 -------
(In reply to comment #23)
reduced testcase timings at -O0 and -O3. Tree operand scan anybody?

> time gfortran -O0 -ffree-line-length-512 -c -ftime-report testcase_reduced.f90

Execution times (seconds)
 garbage collection    :   0.51 ( 1%) usr   0.00 ( 0%) sys   0.49 ( 1%) wall       0 kB ( 0%) ggc
 callgraph construction:   0.05 ( 0%) usr   0.00 ( 0%) sys   0.06 ( 0%) wall    4956 kB ( 2%) ggc
 callgraph optimization:   8.13 (18%) usr   0.20 (16%) sys   8.36 (18%) wall    1280 kB ( 1%) ggc
 cfg cleanup           :   0.01 ( 0%) usr   0.00 ( 0%) sys   0.01 ( 0%) wall       0 kB ( 0%) ggc
 CFG verifier          :   0.48 ( 1%) usr   0.02 ( 2%) sys   0.46 ( 1%) wall       0 kB ( 0%) ggc
 trivially dead code   :   0.17 ( 0%) usr   0.00 ( 0%) sys   0.18 ( 0%) wall       0 kB ( 0%) ggc
 df live regs          :   0.11 ( 0%) usr   0.00 ( 0%) sys   0.11 ( 0%) wall       0 kB ( 0%) ggc
 df reg dead/unused notes:   0.24 ( 1%) usr   0.00 ( 0%) sys   0.23 ( 0%) wall    9445 kB ( 4%) ggc
 register information  :   0.11 ( 0%) usr   0.01 ( 1%) sys   0.12 ( 0%) wall       0 kB ( 0%) ggc
 alias analysis        :   0.10 ( 0%) usr   0.01 ( 1%) sys   0.10 ( 0%) wall    4239 kB ( 2%) ggc
 rebuild jump labels   :   0.12 ( 0%) usr   0.00 ( 0%) sys   0.11 ( 0%) wall       0 kB ( 0%) ggc
 parser                :   1.07 ( 2%) usr   0.05 ( 4%) sys   1.12 ( 2%) wall   22673 kB ( 9%) ggc
 inline heuristics     :  16.30 (36%) usr   0.41 (33%) sys  16.75 (36%) wall       0 kB ( 0%) ggc
 tree gimplify         :   0.06 ( 0%) usr   0.01 ( 1%) sys   0.08 ( 0%) wall    6435 kB ( 3%) ggc
 tree CFG construction :   0.01 ( 0%) usr   0.00 ( 0%) sys   0.00 ( 0%) wall     180 kB ( 0%) ggc
 tree find ref. vars   :   0.02 ( 0%) usr   0.00 ( 0%) sys   0.02 ( 0%) wall    3231 kB ( 1%) ggc
 tree SSA rewrite      :   0.01 ( 0%) usr   0.00 ( 0%) sys   0.00 ( 0%) wall      63 kB ( 0%) ggc
 tree SSA other        :   0.01 ( 0%) usr   0.00 ( 0%) sys   0.01 ( 0%) wall       0 kB ( 0%) ggc
 tree operand scan     :   0.02 ( 0%) usr   0.00 ( 0%) sys   0.02 ( 0%) wall     236 kB ( 0%) ggc
 tree SSA to normal    :   0.00 ( 0%) usr   0.00 ( 0%) sys   0.01 ( 0%) wall       0 kB ( 0%) ggc
 tree SSA verifier     :   0.01 ( 0%) usr   0.00 ( 0%) sys   0.00 ( 0%) wall       0 kB ( 0%) ggc
 tree STMT verifier    :   0.20 ( 0%) usr   0.01 ( 1%) sys   0.22 ( 0%) wall       0 kB ( 0%) ggc
 callgraph verifier    :   0.05 ( 0%) usr   0.00 ( 0%) sys   0.05 ( 0%) wall       0 kB ( 0%) ggc
 dominance computation :   0.01 ( 0%) usr   0.00 ( 0%) sys   0.00 ( 0%) wall       0 kB ( 0%) ggc
 expand                :  10.86 (24%) usr   0.38 (30%) sys  11.26 (24%) wall  132856 kB (52%) ggc
 integrated RA         :   4.08 ( 9%) usr   0.05 ( 4%) sys   4.13 ( 9%) wall    4604 kB ( 2%) ggc
 reload                :   1.88 ( 4%) usr   0.07 ( 6%) sys   1.97 ( 4%) wall   59269 kB (23%) ggc
 thread pro- & epilogue:   0.17 ( 0%) usr   0.00 ( 0%) sys   0.18 ( 0%) wall     175 kB ( 0%) ggc
 final                 :   0.63 ( 1%) usr   0.03 ( 2%) sys   0.66 ( 1%) wall    3790 kB ( 1%) ggc
 TOTAL                 :  45.42             1.25            46.73             253684 kB
Extra diagnostic checks enabled; compiler may run slowly.
Configure with --enable-checking=release to disable checks.

real    0m47.298s
user    0m45.923s
sys     0m1.316s

> time gfortran -march=native -O3 -ffree-line-length-512 -c -ftime-report testcase_reduced.f90

Execution times (seconds)
 garbage collection    :   1.48 ( 1%) usr   0.01 ( 0%) sys   1.50 ( 1%) wall       0 kB ( 0%) ggc
 callgraph construction:   0.03 ( 0%) usr   0.01 ( 0%) sys   0.05 ( 0%) wall    4955 kB ( 1%) ggc
 callgraph optimization:   6.27 ( 3%) usr   0.15 ( 7%) sys   6.46 ( 4%) wall    2366 kB ( 0%) ggc
 ipa cp                :   0.05 ( 0%) usr   0.01 ( 0%) sys   0.06 ( 0%) wall      34 kB ( 0%) ggc
 cfg cleanup           :   0.01 ( 0%) usr   0.01 ( 0%) sys   0.02 ( 0%) wall       0 kB ( 0%) ggc
 CFG verifier          :   1.41 ( 1%) usr   0.00 ( 0%) sys   1.33 ( 1%) wall       0 kB ( 0%) ggc
 trivially dead code   :   0.62 ( 0%) usr   0.00 ( 0%) sys   0.66 ( 0%) wall       0 kB ( 0%) ggc
 df reaching defs      :   0.69 ( 0%) usr   0.01 ( 0%) sys   0.67 ( 0%) wall       0 kB ( 0%) ggc
 df live regs          :   1.86 ( 1%) usr   0.00 ( 0%) sys   1.86 ( 1%) wall       0 kB ( 0%) ggc
 df live&initialized regs:   0.93 ( 1%) usr   0.00 ( 0%) sys   0.94 ( 1%) wall       0 kB ( 0%) ggc
 df use-def / def-use chains:   1.33 ( 1%) usr   0.04 ( 2%) sys   1.38 ( 1%) wall       0 kB ( 0%) ggc
 df reg dead/unused notes:   0.92 ( 1%) usr   0.00 ( 0%) sys   0.96 ( 1%) wall   13469 kB ( 3%) ggc
 register information  :   0.44 ( 0%) usr   0.00 ( 0%) sys   0.43 ( 0%) wall       0 kB ( 0%) ggc
 alias analysis        :   1.05 ( 1%) usr   0.00 ( 0%) sys   1.05 ( 1%) wall   24068 kB ( 5%) ggc
 register scan         :   0.20 ( 0%) usr   0.00 ( 0%) sys   0.17 ( 0%) wall      18 kB ( 0%) ggc
 rebuild jump labels   :   0.31 ( 0%) usr   0.00 ( 0%) sys   0.30 ( 0%) wall       0 kB ( 0%) ggc
 parser                :   1.16 ( 1%) usr   0.03 ( 1%) sys   1.21 ( 1%) wall   22673 kB ( 5%) ggc
 inline heuristics     :  15.83 ( 9%) usr   0.40 (20%) sys  16.25 ( 9%) wall     138 kB ( 0%) ggc
 integration           :   0.01 ( 0%) usr   0.00 ( 0%) sys   0.02 ( 0%) wall     885 kB ( 0%) ggc
 tree gimplify         :   0.06 ( 0%) usr   0.01 ( 0%) sys   0.07 ( 0%) wall    6434 kB ( 1%) ggc
 tree CFG construction :   0.01 ( 0%) usr   0.00 ( 0%) sys   0.00 ( 0%) wall     179 kB ( 0%) ggc
 tree CFG cleanup      :   0.01 ( 0%) usr   0.00 ( 0%) sys   0.01 ( 0%) wall       8 kB ( 0%) ggc
 tree VRP              :   0.04 ( 0%) usr   0.00 ( 0%) sys   0.06 ( 0%) wall     448 kB ( 0%) ggc
 tree copy propagation :   0.06 ( 0%) usr   0.00 ( 0%) sys   0.06 ( 0%) wall     159 kB ( 0%) ggc
 tree find ref. vars   :   0.01 ( 0%) usr   0.01 ( 0%) sys   0.01 ( 0%) wall    3229 kB ( 1%) ggc
 tree PTA              :   1.29 ( 1%) usr   0.03 ( 1%) sys   1.34 ( 1%) wall     540 kB ( 0%) ggc
 tree alias analysis   :   0.06 ( 0%) usr   0.00 ( 0%) sys   0.06 ( 0%) wall      57 kB ( 0%) ggc
 tree call clobbering  :   0.01 ( 0%) usr   0.00 ( 0%) sys   0.01 ( 0%) wall      19 kB ( 0%) ggc
 tree flow sensitive alias:   0.00 ( 0%) usr   0.00 ( 0%) sys   0.01 ( 0%) wall      60 kB ( 0%) ggc
 tree flow insensitive alias:   0.01 ( 0%) usr   0.00 ( 0%) sys   0.01 ( 0%) wall       0 kB ( 0%) ggc
 tree memory partitioning:   0.09 ( 0%) usr   0.00 ( 0%) sys   0.07 ( 0%) wall       0 kB ( 0%) ggc
 tree SSA rewrite      :   0.03 ( 0%) usr   0.00 ( 0%) sys   0.01 ( 0%) wall    8391 kB ( 2%) ggc
 tree SSA other        :   0.01 ( 0%) usr   0.00 ( 0%) sys   0.01 ( 0%) wall       0 kB ( 0%) ggc
 tree SSA incremental  :   0.03 ( 0%) usr   0.00 ( 0%) sys   0.03 ( 0%) wall      21 kB ( 0%) ggc
 tree operand scan     :  98.14 (55%) usr   0.03 ( 1%) sys  98.31 (54%) wall    4048 kB ( 1%) ggc
 dominator optimization:   0.03 ( 0%) usr   0.00 ( 0%) sys   0.02 ( 0%) wall      73 kB ( 0%) ggc
 tree CCP              :   0.05 ( 0%) usr   0.00 ( 0%) sys   0.03 ( 0%) wall     119 kB ( 0%) ggc
 tree PRE              :   0.04 ( 0%) usr   0.00 ( 0%) sys   0.04 ( 0%) wall      62 kB ( 0%) ggc
 tree FRE              :   0.06 ( 0%) usr   0.00 ( 0%) sys   0.03 ( 0%) wall      33 kB ( 0%) ggc
 tree forward propagate:   0.01 ( 0%) usr   0.00 ( 0%) sys   0.00 ( 0%) wall       3 kB ( 0%) ggc
 tree conservative DCE :   0.02 ( 0%) usr   0.00 ( 0%) sys   0.03 ( 0%) wall       0 kB ( 0%) ggc
 tree aggressive DCE   :   0.02 ( 0%) usr   0.01 ( 0%) sys   0.01 ( 0%) wall      12 kB ( 0%) ggc
 tree loop init        :   0.00 ( 0%) usr   0.00 ( 0%) sys   0.01 ( 0%) wall      44 kB ( 0%) ggc
 tree SSA to normal    :   0.01 ( 0%) usr   0.00 ( 0%) sys   0.01 ( 0%) wall       4 kB ( 0%) ggc
 tree rename SSA copies:   0.00 ( 0%) usr   0.00 ( 0%) sys   0.01 ( 0%) wall       0 kB ( 0%) ggc
 tree SSA verifier     :   0.95 ( 1%) usr   0.00 ( 0%) sys   0.97 ( 1%) wall       0 kB ( 0%) ggc
 tree STMT verifier    :   2.05 ( 1%) usr   0.04 ( 2%) sys   2.11 ( 1%) wall       0 kB ( 0%) ggc
 callgraph verifier    :   0.10 ( 0%) usr   0.00 ( 0%) sys   0.08 ( 0%) wall       0 kB ( 0%) ggc
 dominance computation :   0.01 ( 0%) usr   0.00 ( 0%) sys   0.03 ( 0%) wall       0 kB ( 0%) ggc
 expand                :  12.80 ( 7%) usr   0.33 (16%) sys  13.09 ( 7%) wall  131225 kB (27%) ggc
 lower subreg          :   0.00 ( 0%) usr   0.00 ( 0%) sys   0.03 ( 0%) wall       0 kB ( 0%) ggc
 forward prop          :   0.91 ( 1%) usr   0.01 ( 0%) sys   0.91 ( 1%) wall    9021 kB ( 2%) ggc
 CSE                   :   2.70 ( 2%) usr   0.01 ( 0%) sys   2.70 ( 1%) wall    8941 kB ( 2%) ggc
 dead code elimination :   0.37 ( 0%) usr   0.00 ( 0%) sys   0.38 ( 0%) wall       0 kB ( 0%) ggc
 dead store elim1      :   0.59 ( 0%) usr   0.02 ( 1%) sys   0.62 ( 0%) wall   13140 kB ( 3%) ggc
 dead store elim2      :   0.77 ( 0%) usr   0.00 ( 0%) sys   0.76 ( 0%) wall   13219 kB ( 3%) ggc
 CSE 2                 :   2.04 ( 1%) usr   0.00 ( 0%) sys   2.04 ( 1%) wall    3477 kB ( 1%) ggc
 combiner              :   0.77 ( 0%) usr   0.00 ( 0%) sys   0.79 ( 0%) wall    6633 kB ( 1%) ggc
 if-conversion         :   0.00 ( 0%) usr   0.00 ( 0%) sys   0.01 ( 0%) wall      50 kB ( 0%) ggc
 regmove               :   0.04 ( 0%) usr   0.00 ( 0%) sys   0.05 ( 0%) wall      28 kB ( 0%) ggc
 integrated RA         :   9.17 ( 5%) usr   0.71 (35%) sys   9.92 ( 5%) wall   25558 kB ( 5%) ggc
 reload                :   3.36 ( 2%) usr   0.10 ( 5%) sys   3.47 ( 2%) wall  101799 kB (21%) ggc
 reload CSE regs       :   1.76 ( 1%) usr   0.00 ( 0%) sys   1.75 ( 1%) wall   27970 kB ( 6%) ggc
 load CSE after reload :   0.09 ( 0%) usr   0.00 ( 0%) sys   0.09 ( 0%) wall       0 kB ( 0%) ggc
 thread pro- & epilogue:   0.18 ( 0%) usr   0.00 ( 0%) sys   0.19 ( 0%) wall     231 kB ( 0%) ggc
 peephole 2            :   0.16 ( 0%) usr   0.00 ( 0%) sys   0.15 ( 0%) wall      29 kB ( 0%) ggc
 rename registers      :   1.03 ( 1%) usr   0.00 ( 0%) sys   1.04 ( 1%) wall       0 kB ( 0%) ggc
 scheduling 2          :   3.82 ( 2%) usr   0.03 ( 1%) sys   3.82 ( 2%) wall   53812 kB (11%) ggc
 machine dep reorg     :   0.41 ( 0%) usr   0.00 ( 0%) sys   0.41 ( 0%) wall       0 kB ( 0%) ggc
 reorder blocks        :   0.04 ( 0%) usr   0.00 ( 0%) sys   0.04 ( 0%) wall       3 kB ( 0%) ggc
 final                 :   0.64 ( 0%) usr   0.02 ( 1%) sys   0.65 ( 0%) wall    3824 kB ( 1%) ggc
 TOTAL                 : 179.47             2.03           181.70             492168 kB
Extra diagnostic checks enabled; compiler may run slowly.
Configure with --enable-checking=release to disable checks.

real    3m2.238s
user    2m59.927s
sys     0m2.128s
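
Reports this long are easier to compare after ranking the entries. The
following is a minimal sketch of a parser for the -ftime-report layout shown
above; the names ENTRY, parse_time_report and top_passes are made up for this
example, and the pattern is only checked against rows like the ones in this
thread (including rows that the mailer wrapped onto two lines).

```python
import re

# One report entry: pass name, then usr/sys/wall seconds with percentages,
# then the ggc column in kB. \s also matches newlines, so an entry that was
# wrapped onto a second line still matches.
ENTRY = re.compile(
    r"^ (?P<name>[^:\n]+?)\s*:\s*"
    r"(?P<usr>\d+\.\d+) \(\s*\d+%\) usr\s*"
    r"(?P<sys>\d+\.\d+) \(\s*\d+%\) sys\s*"
    r"(?P<wall>\d+\.\d+) \(\s*\d+%\)\s+wall\s+"
    r"(?P<ggc>\d+) kB \(\s*\d+%\) ggc",
    re.MULTILINE,
)


def parse_time_report(text):
    """Return (pass name, wall seconds, ggc kB) for each entry in the report."""
    return [
        (m["name"].strip(), float(m["wall"]), int(m["ggc"]))
        for m in ENTRY.finditer(text)
    ]


def top_passes(text, n=3):
    """The n entries with the largest wall-clock time, slowest first."""
    return sorted(parse_time_report(text), key=lambda e: e[1], reverse=True)[:n]


# A few rows copied from the -O3 report above, in their wrapped form.
SAMPLE = """\
 parser                :   1.16 ( 1%) usr   0.03 ( 1%) sys   1.21 ( 1%) wall
22673 kB ( 5%) ggc
 inline heuristics     :  15.83 ( 9%) usr   0.40 (20%) sys  16.25 ( 9%) wall
 138 kB ( 0%) ggc
 tree operand scan     :  98.14 (55%) usr   0.03 ( 1%) sys  98.31 (54%) wall
4048 kB ( 1%) ggc
 df use-def / def-use chains:   1.33 ( 1%) usr   0.04 ( 2%) sys   1.38 ( 1%)
wall       0 kB ( 0%) ggc
"""

for name, wall, ggc in top_passes(SAMPLE):
    print(f"{name:<28} {wall:8.2f} s wall {ggc:8d} kB ggc")
```

The TOTAL line carries no percentages, so the pattern skips it on purpose.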



* [Bug middle-end/38474] slow compilation at -O0 (callgraph optimization, inline heuristics, expand )
From: jv244 at cam dot ac dot uk @ 2008-12-16 16:19 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #25 from jv244 at cam dot ac dot uk  2008-12-16 16:17 -------
Doing some more profiling, the -O0 problem is to a large extent due to
compute_inline_parameters and estimate_stack_frame_size. Spending 10-30min just
on estimating the stack_frame_size of something that can't reasonably be
inlined anyway seems a waste.


-- 

jv244 at cam dot ac dot uk changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |hubicka at gcc dot gnu dot
                   |                            |org



* [Bug middle-end/38474] slow compilation at -O0 (callgraph optimization, inline heuristics, expand )
From: steven at gcc dot gnu dot org @ 2008-12-16 16:28 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #26 from steven at gcc dot gnu dot org  2008-12-16 16:26 -------
I am going to work on the -O0 problems a bit.

The operand scanner is the problem at -O3. Richi, this is one you may want to
try on the alias improvements branch, if most of the time is spent on virtual
SSA names (I haven't checked, but it's likely with so many aggregate-typed
variables).


-- 

steven at gcc dot gnu dot org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
         AssignedTo|unassigned at gcc dot gnu   |steven at gcc dot gnu dot
                   |dot org                     |org
             Status|NEW                         |ASSIGNED
   Last reconfirmed|2008-12-15 21:27:40         |2008-12-16 16:26:45
               date|                            |



* [Bug middle-end/38474] slow compilation at -O0 (callgraph optimization, inline heuristics, expand )
From: jv244 at cam dot ac dot uk @ 2008-12-16 16:31 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #27 from jv244 at cam dot ac dot uk  2008-12-16 16:29 -------
The slow routines at -O3 are related to compute_may_aliases; at the point I
interrupted the profiling, this routine had called add_virtual_operand 200M
times.



* [Bug middle-end/38474] [4.3/4.4 Regression] slow compilation at -O0 (callgraph optimization, inline heuristics, expand )
From: pinskia at gcc dot gnu dot org @ 2008-12-16 19:35 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #28 from pinskia at gcc dot gnu dot org  2008-12-16 19:32 -------
The stack heuristic is new for 4.3.


-- 

pinskia at gcc dot gnu dot org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Severity|critical                    |normal
            Summary|slow compilation at -O0     |[4.3/4.4 Regression] slow
                   |(callgraph optimization,    |compilation at -O0
                   |inline heuristics, expand ) |(callgraph optimization,
                   |                            |inline heuristics, expand )
   Target Milestone|---                         |4.3.4



* [Bug middle-end/38474] [4.3/4.4 Regression] slow compilation at -O0 (callgraph optimization, inline heuristics, expand )
From: jv244 at cam dot ac dot uk @ 2008-12-16 20:32 UTC (permalink / raw)
  To: gcc-bugs



-- 

jv244 at cam dot ac dot uk changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
   Target Milestone|4.3.4                       |4.3.3



* [Bug middle-end/38474] [4.3/4.4 Regression] slow compilation at -O0 (callgraph optimization, inline heuristics, expand )
From: jv244 at cam dot ac dot uk @ 2008-12-17  6:52 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #29 from jv244 at cam dot ac dot uk  2008-12-17 06:50 -------
Doing the original testcase again at -O3 has been a useful exercise, I think.
13.5h is spent in rename_registers, 2h in tree operand scan, ~1h in inline
heuristics, and 20min in expand. (Note that this is a 4.3-based compiler; maybe
I should redo this with 4.4, or has nothing changed there despite IRA?)

gfortran-4.3 -ffree-line-length-512 -g -fopenmp -ffree-form -D__T_C_G0
-ftime-report -c -O3 -march=native -funroll-loops mpfr_yukawa.f

Execution times (seconds)
 garbage collection    :   3.96 ( 0%) usr   0.00 ( 0%) sys   3.95 ( 0%) wall       0 kB ( 0%) ggc
 callgraph construction:   0.42 ( 0%) usr   0.01 ( 0%) sys   0.44 ( 0%) wall   10751 kB ( 1%) ggc
 callgraph optimization:   0.78 ( 0%) usr   0.02 ( 0%) sys   0.82 ( 0%) wall   14320 kB ( 1%) ggc
 ipa reference         :   0.17 ( 0%) usr   0.00 ( 0%) sys   0.17 ( 0%) wall       0 kB ( 0%) ggc
 ipa pure const        :   0.01 ( 0%) usr   0.00 ( 0%) sys   0.01 ( 0%) wall       0 kB ( 0%) ggc
 cfg cleanup           :   0.08 ( 0%) usr   0.00 ( 0%) sys   0.08 ( 0%) wall       0 kB ( 0%) ggc
 trivially dead code   :   5.36 ( 0%) usr   0.00 ( 0%) sys   5.38 ( 0%) wall       0 kB ( 0%) ggc
 df reaching defs      :   6.56 ( 0%) usr   0.07 ( 1%) sys   6.60 ( 0%) wall       0 kB ( 0%) ggc
 df live regs          :  16.93 ( 0%) usr   0.00 ( 0%) sys  16.96 ( 0%) wall       0 kB ( 0%) ggc
 df live&initialized regs:   7.39 ( 0%) usr   0.00 ( 0%) sys   7.38 ( 0%) wall       0 kB ( 0%) ggc
 df use-def / def-use chains:  14.49 ( 0%) usr   0.02 ( 0%) sys  14.52 ( 0%) wall       0 kB ( 0%) ggc
 df reg dead/unused notes:  10.94 ( 0%) usr   0.00 ( 0%) sys  10.96 ( 0%) wall   37217 kB ( 3%) ggc
 register information  :   4.97 ( 0%) usr   0.00 ( 0%) sys   4.97 ( 0%) wall       0 kB ( 0%) ggc
 alias analysis        :   8.84 ( 0%) usr   0.01 ( 0%) sys   8.86 ( 0%) wall   49164 kB ( 3%) ggc
 register scan         :   1.78 ( 0%) usr   0.00 ( 0%) sys   1.80 ( 0%) wall       0 kB ( 0%) ggc
 rebuild jump labels   :   3.93 ( 0%) usr   0.00 ( 0%) sys   3.93 ( 0%) wall       0 kB ( 0%) ggc
 parser                :   3.24 ( 0%) usr   0.10 ( 1%) sys   3.35 ( 0%) wall   60522 kB ( 4%) ggc
 inline heuristics     :2516.42 ( 4%) usr   4.20 (35%) sys2542.00 ( 4%) wall       0 kB ( 0%) ggc
 tree gimplify         :   0.55 ( 0%) usr   0.00 ( 0%) sys   0.54 ( 0%) wall    3453 kB ( 0%) ggc
 tree eh               :   0.01 ( 0%) usr   0.00 ( 0%) sys   0.01 ( 0%) wall       0 kB ( 0%) ggc
 tree CFG construction :   0.02 ( 0%) usr   0.01 ( 0%) sys   0.03 ( 0%) wall    7185 kB ( 1%) ggc
 tree CFG cleanup      :   0.01 ( 0%) usr   0.00 ( 0%) sys   0.01 ( 0%) wall       1 kB ( 0%) ggc
 tree VRP              :   0.25 ( 0%) usr   0.00 ( 0%) sys   0.23 ( 0%) wall      36 kB ( 0%) ggc
 tree copy propagation :   0.63 ( 0%) usr   0.00 ( 0%) sys   0.60 ( 0%) wall       6 kB ( 0%) ggc
 tree find ref. vars   :   0.12 ( 0%) usr   0.01 ( 0%) sys   0.12 ( 0%) wall    8202 kB ( 1%) ggc
 tree PTA              :   3.61 ( 0%) usr   0.07 ( 1%) sys   3.64 ( 0%) wall      96 kB ( 0%) ggc
 tree alias analysis   :   6.71 ( 0%) usr   0.11 ( 1%) sys   6.74 ( 0%) wall       3 kB ( 0%) ggc
 tree call clobbering  :   0.39 ( 0%) usr   0.00 ( 0%) sys   0.37 ( 0%) wall       0 kB ( 0%) ggc
 tree flow sensitive alias:   0.11 ( 0%) usr   0.00 ( 0%) sys   0.12 ( 0%) wall       9 kB ( 0%) ggc
 tree flow insensitive alias:   3.54 ( 0%) usr   0.00 ( 0%) sys   3.53 ( 0%) wall       0 kB ( 0%) ggc
 tree memory partitioning:  18.88 ( 0%) usr   0.03 ( 0%) sys  18.93 ( 0%) wall       0 kB ( 0%) ggc
 tree PHI insertion    :   0.01 ( 0%) usr   0.00 ( 0%) sys   0.00 ( 0%) wall       1 kB ( 0%) ggc
 tree SSA rewrite      :   0.17 ( 0%) usr   0.00 ( 0%) sys   0.18 ( 0%) wall   25972 kB ( 2%) ggc
 tree SSA other        :   0.02 ( 0%) usr   0.03 ( 0%) sys   0.08 ( 0%) wall       0 kB ( 0%) ggc
 tree SSA incremental  :   0.82 ( 0%) usr   0.00 ( 0%) sys   0.82 ( 0%) wall       9 kB ( 0%) ggc
 tree operand scan     :6681.62 (11%) usr   0.55 ( 5%) sys6698.35 (11%) wall   19727 kB ( 1%) ggc
 dominator optimization:   0.26 ( 0%) usr   0.00 ( 0%) sys   0.25 ( 0%) wall      11 kB ( 0%) ggc
 tree SRA              :   0.06 ( 0%) usr   0.00 ( 0%) sys   0.06 ( 0%) wall       0 kB ( 0%) ggc
 tree STORE-CCP        :   0.15 ( 0%) usr   0.00 ( 0%) sys   0.15 ( 0%) wall       1 kB ( 0%) ggc
 tree CCP              :   0.18 ( 0%) usr   0.00 ( 0%) sys   0.19 ( 0%) wall       3 kB ( 0%) ggc
 tree reassociation    :   0.07 ( 0%) usr   0.00 ( 0%) sys   0.08 ( 0%) wall       2 kB ( 0%) ggc
 tree PRE              :   0.37 ( 0%) usr   0.00 ( 0%) sys   0.38 ( 0%) wall    2289 kB ( 0%) ggc
 tree FRE              :   0.38 ( 0%) usr   0.01 ( 0%) sys   0.38 ( 0%) wall    2297 kB ( 0%) ggc
 tree linearize phis   :   0.01 ( 0%) usr   0.00 ( 0%) sys   0.01 ( 0%) wall       0 kB ( 0%) ggc
 tree forward propagate:   0.04 ( 0%) usr   0.00 ( 0%) sys   0.03 ( 0%) wall       0 kB ( 0%) ggc
 tree conservative DCE :   0.53 ( 0%) usr   0.00 ( 0%) sys   0.56 ( 0%) wall       0 kB ( 0%) ggc
 tree aggressive DCE   :   0.11 ( 0%) usr   0.00 ( 0%) sys   0.12 ( 0%) wall       0 kB ( 0%) ggc
 tree DSE              :   0.06 ( 0%) usr   0.00 ( 0%) sys   0.05 ( 0%) wall       0 kB ( 0%) ggc
 tree SSA to normal    :   0.09 ( 0%) usr   0.00 ( 0%) sys   0.08 ( 0%) wall       3 kB ( 0%) ggc
 tree rename SSA copies:   0.06 ( 0%) usr   0.00 ( 0%) sys   0.06 ( 0%) wall       0 kB ( 0%) ggc
 expand                :1279.48 ( 2%) usr   2.74 (23%) sys1285.18 ( 2%) wall  440026 kB (31%) ggc
 lower subreg          :   0.09 ( 0%) usr   0.00 ( 0%) sys   0.09 ( 0%) wall       0 kB ( 0%) ggc
 forward prop          :   1.74 ( 0%) usr   0.03 ( 0%) sys   1.73 ( 0%) wall    8458 kB ( 1%) ggc
 CSE                   :   9.02 ( 0%) usr   0.04 ( 0%) sys   9.04 ( 0%) wall   27164 kB ( 2%) ggc
 dead code elimination :   3.63 ( 0%) usr   0.00 ( 0%) sys   3.62 ( 0%) wall       0 kB ( 0%) ggc
 dead store elim1      :   5.64 ( 0%) usr   0.05 ( 0%) sys   5.69 ( 0%) wall   37803 kB ( 3%) ggc
 dead store elim2      :   7.64 ( 0%) usr   0.01 ( 0%) sys   7.65 ( 0%) wall   38112 kB ( 3%) ggc
 web                   :   1.38 ( 0%) usr   0.02 ( 0%) sys   1.41 ( 0%) wall       0 kB ( 0%) ggc
 CSE 2                 :   4.26 ( 0%) usr   0.02 ( 0%) sys   4.29 ( 0%) wall   10087 kB ( 1%) ggc
 branch prediction     :   0.04 ( 0%) usr   0.00 ( 0%) sys   0.05 ( 0%) wall       3 kB ( 0%) ggc
 combiner              :   5.02 ( 0%) usr   0.03 ( 0%) sys   5.06 ( 0%) wall   17823 kB ( 1%) ggc
 regmove               :   4.83 ( 0%) usr   0.00 ( 0%) sys   4.83 ( 0%) wall       1 kB ( 0%) ggc
 local alloc           :  11.33 ( 0%) usr   0.08 ( 1%) sys  11.36 ( 0%) wall   64779 kB ( 5%) ggc
 global alloc          :  21.86 ( 0%) usr   0.69 ( 6%) sys  22.55 ( 0%) wall  275939 kB (20%) ggc
 reload CSE regs       :  10.28 ( 0%) usr   0.00 ( 0%) sys  10.29 ( 0%) wall   79246 kB ( 6%) ggc
 load CSE after reload :   0.84 ( 0%) usr   0.00 ( 0%) sys   0.85 ( 0%) wall       0 kB ( 0%) ggc
 thread pro- & epilogue:   0.89 ( 0%) usr   0.00 ( 0%) sys   0.89 ( 0%) wall      13 kB ( 0%) ggc
 peephole 2            :   1.46 ( 0%) usr   0.00 ( 0%) sys   1.46 ( 0%) wall       0 kB ( 0%) ggc
 rename registers      :49100.30 (82%) usr   2.30 (19%) sys49376.54 (82%) wall   16842 kB ( 1%) ggc
 scheduling 2          :  21.80 ( 0%) usr   0.60 ( 5%) sys  22.39 ( 0%) wall  150933 kB (11%) ggc
 machine dep reorg     :   1.99 ( 0%) usr   0.00 ( 0%) sys   2.00 ( 0%) wall       0 kB ( 0%) ggc
 reorder blocks        :   0.01 ( 0%) usr   0.00 ( 0%) sys   0.01 ( 0%) wall       1 kB ( 0%) ggc
 final                 :   3.83 ( 0%) usr   0.04 ( 0%) sys   3.88 ( 0%) wall     738 kB ( 0%) ggc
 symout                :   0.10 ( 0%) usr   0.00 ( 0%) sys   0.10 ( 0%) wall    5418 kB ( 0%) ggc
 variable tracking     :   1.86 ( 0%) usr   0.02 ( 0%) sys   1.87 ( 0%) wall      17 kB ( 0%) ggc
 TOTAL                 :59825.47            11.92          60151.75            1415015 kB
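
As a cross-check on the rounded figures quoted at the top of this comment, the
four largest wall-clock entries of this report can be converted to hours and
to a share of the total. This is only a sketch of the arithmetic; the numbers
are copied by hand from the report and the variable names are arbitrary.

```python
# Wall-clock seconds copied from the report above (4.3-based compiler, -O3).
wall_seconds = {
    "rename registers": 49376.54,
    "tree operand scan": 6698.35,
    "inline heuristics": 2542.00,
    "expand": 1285.18,
}
total_wall = 60151.75  # wall column of the TOTAL line in the same report

for name, secs in wall_seconds.items():
    hours = secs / 3600
    share = secs / total_wall
    print(f"{name:<18} {hours:6.2f} h  {share:6.1%} of total")

# Share of the whole compile taken by these four entries together.
combined = sum(wall_seconds.values()) / total_wall
print(f"combined: {combined:.1%}")
```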



* [Bug middle-end/38474] [4.3/4.4 Regression] slow compilation at -O0 (callgraph optimization, inline heuristics, expand )
From: steven at gcc dot gnu dot org @ 2008-12-17  7:03 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #30 from steven at gcc dot gnu dot org  2008-12-17 07:01 -------
I think redoing this with 4.4.0 would be useful, to check if new code (like
IRA) uses this kind of non-linear algorithm.  But the register renaming patch
hasn't changed between 4.3 and 4.4, so I would compile with
-fno-rename-registers ;-)



* [Bug middle-end/38474] [4.3/4.4 Regression] slow compilation at -O0 (callgraph optimization, inline heuristics, expand )
From: jv244 at cam dot ac dot uk @ 2008-12-17  8:37 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #31 from jv244 at cam dot ac dot uk  2008-12-17 08:36 -------
(In reply to comment #30)
> I think redoing this with 4.4.0 would be useful, to check if new code (like
> IRA) uses this kind of non-linear algorithm.  But the register renaming patch
> hasn't changed between 4.3 and 4.4, so I would compile with
> -fno-rename-registers ;-)
> 

thanks for the '-fno-rename-registers' trick, something like that is not
obvious to beginners. I tried with 4.4, but virtual memory usage went to 9Gb,
so I'll have to run it on a larger machine. I can't remember seeing this with
4.3, but I'll test. 



* [Bug middle-end/38474] [4.3/4.4 Regression] slow compilation at -O0 (callgraph optimization, inline heuristics, expand )
From: jv244 at cam dot ac dot uk @ 2008-12-17 12:59 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #32 from jv244 at cam dot ac dot uk  2008-12-17 12:58 -------
The 9.3Gb for 4.4 is confirmed. I attached gdb to the process at that point
(after about 70min of compilation), and this is the backtrace:

#0  0x0000000000b48a9a in bucket_allocno_compare_func (v1p=0x7fffe3592d98, v2p=0x7fffe3592d90)
    at /data04/vondele/gcc_trunk/gcc/gcc/ira-color.c:746
#1  0x0000000000b4abbd in push_allocno_to_stack (allocno=<value optimized out>)
    at /data04/vondele/gcc_trunk/gcc/gcc/ira-color.c:803
#2  0x0000000000b4e4d2 in color_allocnos ()
    at /data04/vondele/gcc_trunk/gcc/gcc/ira-color.c:989
#3  0x0000000000b4f614 in color_pass (loop_tree_node=<value optimized out>)
    at /data04/vondele/gcc_trunk/gcc/gcc/ira-color.c:1936
#4  0x0000000000b3ef2a in ira_traverse_loop_tree (bb_p=0 '\0', loop_node=0x7fffe3592d90,
    preorder_func=0x337d54f8, postorder_func=0)
    at /data04/vondele/gcc_trunk/gcc/gcc/ira-build.c:1381
#5  0x0000000000b4a320 in ira_color ()
    at /data04/vondele/gcc_trunk/gcc/gcc/ira-color.c:2080
#6  0x0000000000b3d7eb in rest_of_handle_ira ()
    at /data04/vondele/gcc_trunk/gcc/gcc/ira.c:1926
#7  0x000000000069e48d in execute_one_pass (pass=0x10980e0)
    at /data04/vondele/gcc_trunk/gcc/gcc/passes.c:1279
#8  0x000000000069e6d5 in execute_pass_list (pass=0x10980e0)
    at /data04/vondele/gcc_trunk/gcc/gcc/passes.c:1328
#9  0x000000000069e6ed in execute_pass_list (pass=0x1093060)
    at /data04/vondele/gcc_trunk/gcc/gcc/passes.c:1329
#10 0x0000000000794ddc in tree_rest_of_compilation (fndecl=0x7f45da99eb00)
    at /data04/vondele/gcc_trunk/gcc/gcc/tree-optimize.c:419

Apart from this, 4.4 is actually a bit faster. This is the time report:

gfortran -ffree-line-length-512 -g -ffree-form -ftime-report -c -O3
-march=native -funroll-loops -fno-rename-registers testcase.f90

Execution times (seconds)
 garbage collection    :   7.00 ( 0%) usr   0.02 ( 0%) sys   7.05 ( 0%) wall   
   0 kB ( 0%) ggc
 callgraph construction:   0.25 ( 0%) usr   0.01 ( 0%) sys   0.25 ( 0%) wall  
12496 kB ( 1%) ggc
 callgraph optimization: 387.87 ( 8%) usr   1.54 (11%) sys 389.44 ( 8%) wall   
4414 kB ( 0%) ggc
 ipa cp                :   0.30 ( 0%) usr   0.00 ( 0%) sys   0.30 ( 0%) wall   
  34 kB ( 0%) ggc
 ipa reference         :   0.04 ( 0%) usr   0.00 ( 0%) sys   0.04 ( 0%) wall   
   0 kB ( 0%) ggc
 ipa pure const        :   0.02 ( 0%) usr   0.00 ( 0%) sys   0.01 ( 0%) wall   
   0 kB ( 0%) ggc
 cfg cleanup           :   0.06 ( 0%) usr   0.00 ( 0%) sys   0.05 ( 0%) wall   
   1 kB ( 0%) ggc
 CFG verifier          :  10.67 ( 0%) usr   0.01 ( 0%) sys  10.74 ( 0%) wall   
   0 kB ( 0%) ggc
 trivially dead code   :   4.31 ( 0%) usr   0.00 ( 0%) sys   4.30 ( 0%) wall   
   0 kB ( 0%) ggc
 df reaching defs      :   4.51 ( 0%) usr   0.02 ( 0%) sys   4.54 ( 0%) wall   
   0 kB ( 0%) ggc
 df live regs          :  14.18 ( 0%) usr   0.00 ( 0%) sys  14.22 ( 0%) wall   
   0 kB ( 0%) ggc
 df live&initialized regs:   6.79 ( 0%) usr   0.00 ( 0%) sys   6.82 ( 0%) wall 
     0 kB ( 0%) ggc
 df use-def / def-use chains:  20.22 ( 0%) usr   0.05 ( 0%) sys  20.27 ( 0%)
wall       0 kB ( 0%) ggc
 df reg dead/unused notes:   6.55 ( 0%) usr   0.01 ( 0%) sys   6.53 ( 0%) wall 
 36992 kB ( 3%) ggc
 register information  :   2.68 ( 0%) usr   0.00 ( 0%) sys   2.71 ( 0%) wall   
   0 kB ( 0%) ggc
 alias analysis        :   6.93 ( 0%) usr   0.00 ( 0%) sys   6.92 ( 0%) wall  
46600 kB ( 4%) ggc
 register scan         :   1.32 ( 0%) usr   0.00 ( 0%) sys   1.33 ( 0%) wall   
  18 kB ( 0%) ggc
 rebuild jump labels   :   2.14 ( 0%) usr   0.00 ( 0%) sys   2.15 ( 0%) wall   
   0 kB ( 0%) ggc
 parser                :   6.23 ( 0%) usr   0.09 ( 1%) sys   6.30 ( 0%) wall  
59009 kB ( 4%) ggc
 inline heuristics     :1328.68 (28%) usr   3.27 (24%) sys1331.94 (28%) wall   
 138 kB ( 0%) ggc
 tree gimplify         :   0.36 ( 0%) usr   0.00 ( 0%) sys   0.38 ( 0%) wall  
16833 kB ( 1%) ggc
 tree eh               :   0.02 ( 0%) usr   0.00 ( 0%) sys   0.02 ( 0%) wall   
   0 kB ( 0%) ggc
 tree CFG construction :   0.01 ( 0%) usr   0.00 ( 0%) sys   0.02 ( 0%) wall   
 179 kB ( 0%) ggc
 tree CFG cleanup      :   0.01 ( 0%) usr   0.00 ( 0%) sys   0.00 ( 0%) wall   
   8 kB ( 0%) ggc
 tree VRP              :   0.21 ( 0%) usr   0.00 ( 0%) sys   0.20 ( 0%) wall   
 448 kB ( 0%) ggc
 tree copy propagation :   0.41 ( 0%) usr   0.00 ( 0%) sys   0.38 ( 0%) wall   
 159 kB ( 0%) ggc
 tree find ref. vars   :   0.07 ( 0%) usr   0.02 ( 0%) sys   0.08 ( 0%) wall   
7873 kB ( 1%) ggc
 tree PTA              :  19.23 ( 0%) usr   0.06 ( 0%) sys  19.31 ( 0%) wall   
 540 kB ( 0%) ggc
 tree alias analysis   :   0.47 ( 0%) usr   0.04 ( 0%) sys   0.53 ( 0%) wall   
  75 kB ( 0%) ggc
 tree call clobbering  :   0.13 ( 0%) usr   0.00 ( 0%) sys   0.12 ( 0%) wall   
  36 kB ( 0%) ggc
 tree flow sensitive alias:   0.01 ( 0%) usr   0.00 ( 0%) sys   0.01 ( 0%) wall
     61 kB ( 0%) ggc
 tree flow insensitive alias:   0.03 ( 0%) usr   0.00 ( 0%) sys   0.03 ( 0%)
wall       0 kB ( 0%) ggc
 tree memory partitioning:   0.61 ( 0%) usr   0.00 ( 0%) sys   0.60 ( 0%) wall 
     0 kB ( 0%) ggc
 tree SSA rewrite      :   0.09 ( 0%) usr   0.01 ( 0%) sys   0.10 ( 0%) wall  
20578 kB ( 2%) ggc
 tree SSA other        :   0.06 ( 0%) usr   0.01 ( 0%) sys   0.02 ( 0%) wall   
   0 kB ( 0%) ggc
 tree SSA incremental  :   0.21 ( 0%) usr   0.00 ( 0%) sys   0.17 ( 0%) wall   
  21 kB ( 0%) ggc
 tree operand scan     :2339.81 (49%) usr   0.26 ( 2%) sys2340.13 (49%) wall   
9840 kB ( 1%) ggc
 dominator optimization:   0.12 ( 0%) usr   0.00 ( 0%) sys   0.12 ( 0%) wall   
  73 kB ( 0%) ggc
 tree SRA              :   0.03 ( 0%) usr   0.00 ( 0%) sys   0.02 ( 0%) wall   
  78 kB ( 0%) ggc
 tree CCP              :   0.18 ( 0%) usr   0.00 ( 0%) sys   0.18 ( 0%) wall   
 119 kB ( 0%) ggc
 tree reassociation    :   0.01 ( 0%) usr   0.00 ( 0%) sys   0.00 ( 0%) wall   
  50 kB ( 0%) ggc
 tree PRE              :   0.13 ( 0%) usr   0.00 ( 0%) sys   0.14 ( 0%) wall   
  62 kB ( 0%) ggc
 tree FRE              :   0.14 ( 0%) usr   0.00 ( 0%) sys   0.16 ( 0%) wall   
  33 kB ( 0%) ggc
 tree linearize phis   :   0.01 ( 0%) usr   0.00 ( 0%) sys   0.00 ( 0%) wall   
  17 kB ( 0%) ggc
 tree forward propagate:   0.03 ( 0%) usr   0.00 ( 0%) sys   0.01 ( 0%) wall   
   3 kB ( 0%) ggc
 tree conservative DCE :   0.30 ( 0%) usr   0.00 ( 0%) sys   0.31 ( 0%) wall   
   0 kB ( 0%) ggc
 tree aggressive DCE   :   0.13 ( 0%) usr   0.00 ( 0%) sys   0.13 ( 0%) wall   
  12 kB ( 0%) ggc
 tree buildin call DCE :   0.00 ( 0%) usr   0.00 ( 0%) sys   0.01 ( 0%) wall   
   0 kB ( 0%) ggc
 tree DSE              :   0.02 ( 0%) usr   0.00 ( 0%) sys   0.03 ( 0%) wall   
  10 kB ( 0%) ggc
 complete unrolling    :   0.01 ( 0%) usr   0.00 ( 0%) sys   0.01 ( 0%) wall   
  46 kB ( 0%) ggc
 tree loop init        :   0.01 ( 0%) usr   0.00 ( 0%) sys   0.01 ( 0%) wall   
  44 kB ( 0%) ggc
 tree SSA uncprop      :   0.01 ( 0%) usr   0.00 ( 0%) sys   0.01 ( 0%) wall   
   0 kB ( 0%) ggc
 tree SSA to normal    :   0.09 ( 0%) usr   0.00 ( 0%) sys   0.08 ( 0%) wall   
   4 kB ( 0%) ggc
 tree rename SSA copies:   0.01 ( 0%) usr   0.00 ( 0%) sys   0.01 ( 0%) wall   
   0 kB ( 0%) ggc
 tree SSA verifier     :   7.53 ( 0%) usr   0.01 ( 0%) sys   7.55 ( 0%) wall   
   0 kB ( 0%) ggc
 tree STMT verifier    :  11.14 ( 0%) usr   0.01 ( 0%) sys  11.19 ( 0%) wall   
   0 kB ( 0%) ggc
 callgraph verifier    :   1.01 ( 0%) usr   0.00 ( 0%) sys   1.02 ( 0%) wall   
   0 kB ( 0%) ggc
 dominance computation :   0.01 ( 0%) usr   0.00 ( 0%) sys   0.04 ( 0%) wall   
   0 kB ( 0%) ggc
 expand                : 421.25 ( 9%) usr   2.03 (15%) sys 423.13 ( 9%) wall 
365559 kB (28%) ggc
 lower subreg          :   0.07 ( 0%) usr   0.00 ( 0%) sys   0.06 ( 0%) wall   
   0 kB ( 0%) ggc
 forward prop          :   5.34 ( 0%) usr   0.03 ( 0%) sys   5.36 ( 0%) wall  
24812 kB ( 2%) ggc
 CSE                   :  10.24 ( 0%) usr   0.07 ( 1%) sys  10.30 ( 0%) wall  
24343 kB ( 2%) ggc
 dead code elimination :   2.74 ( 0%) usr   0.00 ( 0%) sys   2.75 ( 0%) wall   
   0 kB ( 0%) ggc
 dead store elim1      :   2.46 ( 0%) usr   0.05 ( 0%) sys   2.52 ( 0%) wall  
36487 kB ( 3%) ggc
 dead store elim2      :   3.81 ( 0%) usr   0.00 ( 0%) sys   3.82 ( 0%) wall  
36569 kB ( 3%) ggc
 loop analysis         :   0.00 ( 0%) usr   0.00 ( 0%) sys   0.02 ( 0%) wall   
 109 kB ( 0%) ggc
 global CSE            :   0.00 ( 0%) usr   0.00 ( 0%) sys   0.01 ( 0%) wall   
   0 kB ( 0%) ggc
 web                   :   4.48 ( 0%) usr   0.01 ( 0%) sys   4.49 ( 0%) wall   
   7 kB ( 0%) ggc
 CSE 2                 :   4.96 ( 0%) usr   0.02 ( 0%) sys   4.99 ( 0%) wall   
9448 kB ( 1%) ggc
 branch prediction     :   0.03 ( 0%) usr   0.00 ( 0%) sys   0.04 ( 0%) wall   
  46 kB ( 0%) ggc
 combiner              :   3.41 ( 0%) usr   0.00 ( 0%) sys   3.41 ( 0%) wall  
18359 kB ( 1%) ggc
 regmove               :   0.27 ( 0%) usr   0.00 ( 0%) sys   0.26 ( 0%) wall   
  28 kB ( 0%) ggc
 integrated RA         :  71.46 ( 1%) usr   5.20 (38%) sys  96.42 ( 2%) wall  
66797 kB ( 5%) ggc
 reload                :  14.31 ( 0%) usr   0.19 ( 1%) sys  14.49 ( 0%) wall 
284193 kB (21%) ggc
 reload CSE regs       :   8.05 ( 0%) usr   0.00 ( 0%) sys   8.04 ( 0%) wall  
77580 kB ( 6%) ggc
 load CSE after reload :   0.62 ( 0%) usr   0.00 ( 0%) sys   0.61 ( 0%) wall   
   0 kB ( 0%) ggc
 thread pro- & epilogue:   0.79 ( 0%) usr   0.00 ( 0%) sys   0.79 ( 0%) wall   
 216 kB ( 0%) ggc
 peephole 2            :   0.99 ( 0%) usr   0.00 ( 0%) sys   0.99 ( 0%) wall   
  29 kB ( 0%) ggc
 rename registers      :   3.99 ( 0%) usr   0.00 ( 0%) sys   4.02 ( 0%) wall   
   0 kB ( 0%) ggc
 scheduling 2          :  18.25 ( 0%) usr   0.54 ( 4%) sys  18.77 ( 0%) wall 
147959 kB (11%) ggc
 machine dep reorg     :   1.87 ( 0%) usr   0.00 ( 0%) sys   1.88 ( 0%) wall   
   2 kB ( 0%) ggc
 reorder blocks        :   0.36 ( 0%) usr   0.00 ( 0%) sys   0.35 ( 0%) wall   
  17 kB ( 0%) ggc
 final                 :   2.93 ( 0%) usr   0.09 ( 1%) sys   3.01 ( 0%) wall  
11989 kB ( 1%) ggc
 symout                :   0.04 ( 0%) usr   0.01 ( 0%) sys   0.05 ( 0%) wall   
5340 kB ( 0%) ggc
 variable tracking     :   1.39 ( 0%) usr   0.06 ( 0%) sys   1.44 ( 0%) wall   
  63 kB ( 0%) ggc
 TOTAL                 :4777.54            13.76          4811.10           
1328156 kB

On the same machine, 4.3 gives this time report:

gfortran-4.3 -ffree-line-length-512 -g -ffree-form -ftime-report -c -O3
-march=native -funroll-loops -fno-rename-registers testcase.f90

Execution times (seconds)
 garbage collection    :   2.22 ( 0%) usr   0.01 ( 0%) sys   2.26 ( 0%) wall   
   0 kB ( 0%) ggc
 callgraph construction:   0.24 ( 0%) usr   0.05 ( 0%) sys   0.30 ( 0%) wall  
10435 kB ( 1%) ggc
 callgraph optimization:   0.88 ( 0%) usr   0.02 ( 0%) sys   0.94 ( 0%) wall  
13970 kB ( 1%) ggc
 ipa reference         :   0.16 ( 0%) usr   0.00 ( 0%) sys   0.16 ( 0%) wall   
   3 kB ( 0%) ggc
 ipa pure const        :   0.02 ( 0%) usr   0.00 ( 0%) sys   0.01 ( 0%) wall   
   0 kB ( 0%) ggc
 cfg cleanup           :   0.06 ( 0%) usr   0.02 ( 0%) sys   0.15 ( 0%) wall   
   1 kB ( 0%) ggc
 trivially dead code   :   3.72 ( 0%) usr   0.02 ( 0%) sys   3.76 ( 0%) wall   
   0 kB ( 0%) ggc
 df reaching defs      :   4.71 ( 0%) usr   0.06 ( 0%) sys   4.76 ( 0%) wall   
   0 kB ( 0%) ggc
 df live regs          :  14.35 ( 0%) usr   0.00 ( 0%) sys  14.35 ( 0%) wall   
   0 kB ( 0%) ggc
 df live&initialized regs:   6.28 ( 0%) usr   0.05 ( 0%) sys   6.41 ( 0%) wall 
     0 kB ( 0%) ggc
 df use-def / def-use chains:  12.90 ( 0%) usr   0.03 ( 0%) sys  12.96 ( 0%)
wall       0 kB ( 0%) ggc
 df reg dead/unused notes:   6.17 ( 0%) usr   0.01 ( 0%) sys   6.19 ( 0%) wall 
 35719 kB ( 3%) ggc
 register information  :   2.84 ( 0%) usr   0.01 ( 0%) sys   2.91 ( 0%) wall   
   0 kB ( 0%) ggc
 alias analysis        :   6.60 ( 0%) usr   0.02 ( 0%) sys   6.69 ( 0%) wall  
46609 kB ( 3%) ggc
 register scan         :   1.24 ( 0%) usr   0.00 ( 0%) sys   1.28 ( 0%) wall   
   8 kB ( 0%) ggc
 rebuild jump labels   :   2.33 ( 0%) usr   0.00 ( 0%) sys   2.32 ( 0%) wall   
   0 kB ( 0%) ggc
 parser                :   2.77 ( 0%) usr   0.86 ( 6%) sys   5.64 ( 0%) wall  
58659 kB ( 4%) ggc
 inline heuristics     :1683.93 (21%) usr   4.08 (29%) sys1689.76 (21%) wall   
 136 kB ( 0%) ggc
 integration           :   0.00 ( 0%) usr   0.02 ( 0%) sys   0.12 ( 0%) wall   
 824 kB ( 0%) ggc
 tree gimplify         :   0.43 ( 0%) usr   0.01 ( 0%) sys   0.43 ( 0%) wall   
3446 kB ( 0%) ggc
 tree eh               :   0.00 ( 0%) usr   0.00 ( 0%) sys   0.03 ( 0%) wall   
   0 kB ( 0%) ggc
 tree CFG construction :   0.03 ( 0%) usr   0.01 ( 0%) sys   0.03 ( 0%) wall   
7150 kB ( 1%) ggc
 tree CFG cleanup      :   0.02 ( 0%) usr   0.04 ( 0%) sys   0.21 ( 0%) wall   
   8 kB ( 0%) ggc
 tree VRP              :   0.19 ( 0%) usr   0.00 ( 0%) sys   0.21 ( 0%) wall   
 479 kB ( 0%) ggc
 tree copy propagation :   0.51 ( 0%) usr   0.00 ( 0%) sys   0.55 ( 0%) wall   
 159 kB ( 0%) ggc
 tree find ref. vars   :   0.11 ( 0%) usr   0.02 ( 0%) sys   0.12 ( 0%) wall   
7876 kB ( 1%) ggc
 tree PTA              :   3.88 ( 0%) usr   0.07 ( 1%) sys   3.98 ( 0%) wall   
 380 kB ( 0%) ggc
 tree alias analysis   :   8.07 ( 0%) usr   0.83 ( 6%) sys  10.31 ( 0%) wall   
  49 kB ( 0%) ggc
 tree call clobbering  :   0.32 ( 0%) usr   0.00 ( 0%) sys   0.31 ( 0%) wall   
   0 kB ( 0%) ggc
 tree flow sensitive alias:   0.08 ( 0%) usr   0.00 ( 0%) sys   0.08 ( 0%) wall
     62 kB ( 0%) ggc
 tree flow insensitive alias:   4.79 ( 0%) usr   0.00 ( 0%) sys   4.79 ( 0%)
wall       0 kB ( 0%) ggc
 tree memory partitioning:  23.61 ( 0%) usr   0.01 ( 0%) sys  23.65 ( 0%) wall 
     2 kB ( 0%) ggc
 tree PHI insertion    :   0.01 ( 0%) usr   0.00 ( 0%) sys   0.02 ( 0%) wall   
   0 kB ( 0%) ggc
 tree SSA rewrite      :   0.15 ( 0%) usr   0.02 ( 0%) sys   0.19 ( 0%) wall  
25410 kB ( 2%) ggc
 tree SSA other        :   0.15 ( 0%) usr   0.52 ( 4%) sys   2.07 ( 0%) wall   
   0 kB ( 0%) ggc
 tree SSA incremental  :   0.63 ( 0%) usr   0.02 ( 0%) sys   0.69 ( 0%) wall   
  23 kB ( 0%) ggc
 tree operand scan     :5025.85 (64%) usr   3.00 (22%) sys5036.21 (64%) wall  
19771 kB ( 1%) ggc
 dominator optimization:   0.19 ( 0%) usr   0.01 ( 0%) sys   0.18 ( 0%) wall   
 381 kB ( 0%) ggc
 tree SRA              :   0.04 ( 0%) usr   0.00 ( 0%) sys   0.06 ( 0%) wall   
  71 kB ( 0%) ggc
 tree STORE-CCP        :   0.12 ( 0%) usr   0.00 ( 0%) sys   0.12 ( 0%) wall   
  39 kB ( 0%) ggc
 tree CCP              :   0.17 ( 0%) usr   0.00 ( 0%) sys   0.20 ( 0%) wall   
  81 kB ( 0%) ggc
 tree PHI const/copy prop:   0.00 ( 0%) usr   0.00 ( 0%) sys   0.03 ( 0%) wall 
     0 kB ( 0%) ggc
 tree reassociation    :   0.05 ( 0%) usr   0.01 ( 0%) sys   0.04 ( 0%) wall   
  50 kB ( 0%) ggc
 tree PRE              :   0.37 ( 0%) usr   0.00 ( 0%) sys   0.36 ( 0%) wall   
2286 kB ( 0%) ggc
 tree FRE              :   0.31 ( 0%) usr   0.01 ( 0%) sys   0.34 ( 0%) wall   
2268 kB ( 0%) ggc
 tree code sinking     :   0.00 ( 0%) usr   0.00 ( 0%) sys   0.01 ( 0%) wall   
  25 kB ( 0%) ggc
 tree linearize phis   :   0.01 ( 0%) usr   0.00 ( 0%) sys   0.04 ( 0%) wall   
  10 kB ( 0%) ggc
 tree forward propagate:   0.04 ( 0%) usr   0.00 ( 0%) sys   0.05 ( 0%) wall   
   1 kB ( 0%) ggc
 tree conservative DCE :   0.44 ( 0%) usr   0.00 ( 0%) sys   0.47 ( 0%) wall   
   4 kB ( 0%) ggc
 tree aggressive DCE   :   0.10 ( 0%) usr   0.00 ( 0%) sys   0.12 ( 0%) wall   
   0 kB ( 0%) ggc
 tree DSE              :   0.05 ( 0%) usr   0.01 ( 0%) sys   0.04 ( 0%) wall   
   4 kB ( 0%) ggc
 PHI merge             :   0.00 ( 0%) usr   0.00 ( 0%) sys   0.01 ( 0%) wall   
   0 kB ( 0%) ggc
 tree loop optimization:   0.00 ( 0%) usr   0.01 ( 0%) sys   0.01 ( 0%) wall   
   0 kB ( 0%) ggc
 loop invariant motion :   0.00 ( 0%) usr   0.00 ( 0%) sys   0.01 ( 0%) wall   
   0 kB ( 0%) ggc
 tree canonical iv     :   0.00 ( 0%) usr   0.00 ( 0%) sys   0.01 ( 0%) wall   
   0 kB ( 0%) ggc
 tree loop unswitching :   0.00 ( 0%) usr   0.01 ( 0%) sys   0.00 ( 0%) wall   
   0 kB ( 0%) ggc
 complete unrolling    :   0.00 ( 0%) usr   0.00 ( 0%) sys   0.01 ( 0%) wall   
   0 kB ( 0%) ggc
 predictive commoning  :   0.00 ( 0%) usr   0.00 ( 0%) sys   0.01 ( 0%) wall   
   1 kB ( 0%) ggc
 tree copy headers     :   0.00 ( 0%) usr   0.01 ( 0%) sys   0.01 ( 0%) wall   
  26 kB ( 0%) ggc
 tree SSA uncprop      :   0.00 ( 0%) usr   0.00 ( 0%) sys   0.03 ( 0%) wall   
   0 kB ( 0%) ggc
 tree SSA to normal    :   0.08 ( 0%) usr   0.00 ( 0%) sys   0.08 ( 0%) wall   
   1 kB ( 0%) ggc
 tree NRV optimization :   0.00 ( 0%) usr   0.02 ( 0%) sys   0.00 ( 0%) wall   
   0 kB ( 0%) ggc
 tree rename SSA copies:   0.03 ( 0%) usr   0.01 ( 0%) sys   0.06 ( 0%) wall   
   0 kB ( 0%) ggc
 dominance frontiers   :   0.00 ( 0%) usr   0.01 ( 0%) sys   0.00 ( 0%) wall   
   0 kB ( 0%) ggc
 dominance computation :   0.00 ( 0%) usr   0.02 ( 0%) sys   0.09 ( 0%) wall   
   0 kB ( 0%) ggc
 expand                : 959.04 (12%) usr   2.65 (19%) sys 964.23 (12%) wall 
422621 kB (31%) ggc
 varconst              :   0.00 ( 0%) usr   0.00 ( 0%) sys   0.01 ( 0%) wall   
   0 kB ( 0%) ggc
 lower subreg          :   0.06 ( 0%) usr   0.00 ( 0%) sys   0.08 ( 0%) wall   
   0 kB ( 0%) ggc
 jump                  :   0.00 ( 0%) usr   0.00 ( 0%) sys   0.01 ( 0%) wall   
   0 kB ( 0%) ggc
 forward prop          :   1.26 ( 0%) usr   0.04 ( 0%) sys   1.34 ( 0%) wall   
8174 kB ( 1%) ggc
 CSE                   :   7.64 ( 0%) usr   0.04 ( 0%) sys   7.71 ( 0%) wall  
26097 kB ( 2%) ggc
 dead code elimination :   2.69 ( 0%) usr   0.00 ( 0%) sys   2.70 ( 0%) wall   
   0 kB ( 0%) ggc
 dead store elim1      :   4.05 ( 0%) usr   0.10 ( 1%) sys   4.21 ( 0%) wall  
36200 kB ( 3%) ggc
 dead store elim2      :   5.95 ( 0%) usr   0.01 ( 0%) sys   5.98 ( 0%) wall  
36506 kB ( 3%) ggc
 loop analysis         :   0.00 ( 0%) usr   0.01 ( 0%) sys   0.04 ( 0%) wall   
 111 kB ( 0%) ggc
 global CSE            :   0.00 ( 0%) usr   0.00 ( 0%) sys   0.03 ( 0%) wall   
   0 kB ( 0%) ggc
 bypass jumps          :   0.00 ( 0%) usr   0.01 ( 0%) sys   0.00 ( 0%) wall   
   6 kB ( 0%) ggc
 web                   :   0.98 ( 0%) usr   0.01 ( 0%) sys   1.00 ( 0%) wall   
   0 kB ( 0%) ggc
 CSE 2                 :   3.65 ( 0%) usr   0.01 ( 0%) sys   3.70 ( 0%) wall   
9710 kB ( 1%) ggc
 branch prediction     :   0.05 ( 0%) usr   0.00 ( 0%) sys   0.06 ( 0%) wall   
  48 kB ( 0%) ggc
 combiner              :   3.34 ( 0%) usr   0.00 ( 0%) sys   3.34 ( 0%) wall  
17434 kB ( 1%) ggc
 if-conversion         :   0.00 ( 0%) usr   0.00 ( 0%) sys   0.03 ( 0%) wall   
  50 kB ( 0%) ggc
 regmove               :   3.07 ( 0%) usr   0.00 ( 0%) sys   3.07 ( 0%) wall   
  28 kB ( 0%) ggc
 mode switching        :   0.00 ( 0%) usr   0.00 ( 0%) sys   0.01 ( 0%) wall   
   0 kB ( 0%) ggc
 local alloc           :   9.49 ( 0%) usr   0.07 ( 1%) sys   9.54 ( 0%) wall  
62106 kB ( 5%) ggc
 global alloc          :  17.28 ( 0%) usr   0.52 ( 4%) sys  18.12 ( 0%) wall 
264214 kB (20%) ggc
 reload CSE regs       :   7.61 ( 0%) usr   0.01 ( 0%) sys   7.62 ( 0%) wall  
75867 kB ( 6%) ggc
 load CSE after reload :   0.60 ( 0%) usr   0.00 ( 0%) sys   0.60 ( 0%) wall   
   0 kB ( 0%) ggc
 thread pro- & epilogue:   0.68 ( 0%) usr   0.01 ( 0%) sys   0.68 ( 0%) wall   
 236 kB ( 0%) ggc
 if-conversion 2       :   0.00 ( 0%) usr   0.00 ( 0%) sys   0.03 ( 0%) wall   
  25 kB ( 0%) ggc
 peephole 2            :   1.08 ( 0%) usr   0.00 ( 0%) sys   1.09 ( 0%) wall   
  28 kB ( 0%) ggc
 rename registers      :   1.87 ( 0%) usr   0.00 ( 0%) sys   1.90 ( 0%) wall   
   0 kB ( 0%) ggc
 scheduling 2          :  15.12 ( 0%) usr   0.38 ( 3%) sys  15.51 ( 0%) wall 
144759 kB (11%) ggc
 machine dep reorg     :   1.52 ( 0%) usr   0.00 ( 0%) sys   1.51 ( 0%) wall   
   2 kB ( 0%) ggc
 reorder blocks        :   0.00 ( 0%) usr   0.00 ( 0%) sys   0.01 ( 0%) wall   
  18 kB ( 0%) ggc
 reg stack             :   0.00 ( 0%) usr   0.00 ( 0%) sys   0.01 ( 0%) wall   
   0 kB ( 0%) ggc
 final                 :   2.94 ( 0%) usr   0.10 ( 1%) sys   3.29 ( 0%) wall   
1267 kB ( 0%) ggc
 symout                :   0.15 ( 0%) usr   0.01 ( 0%) sys   0.16 ( 0%) wall   
6928 kB ( 1%) ggc
 variable tracking     :   1.36 ( 0%) usr   0.00 ( 0%) sys   1.37 ( 0%) wall   
  76 kB ( 0%) ggc
 tree if-combine       :   0.00 ( 0%) usr   0.00 ( 0%) sys   0.01 ( 0%) wall   
   0 kB ( 0%) ggc
 rest of compilation   :   0.00 ( 0%) usr   0.00 ( 0%) sys   0.02 ( 0%) wall   
   0 kB ( 0%) ggc
 TOTAL                 :7873.75            13.93          7906.31           
1349263 kB
total: 1271414 kB


-- 

jv244 at cam dot ac dot uk changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
  GCC build triplet|                            |vmakarov


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=38474


^ permalink raw reply	[flat|nested] 52+ messages in thread

* [Bug middle-end/38474] [4.3/4.4 Regression] slow compilation at -O0 (callgraph optimization, inline heuristics, expand )
  2008-12-10 15:26 [Bug middle-end/38474] New: slow compilation at -O0 (callgraph optimization, inline heuristics, ggc expand ) jv244 at cam dot ac dot uk
                   ` (32 preceding siblings ...)
  2008-12-17 12:59 ` jv244 at cam dot ac dot uk
@ 2008-12-17 19:42 ` steven at gcc dot gnu dot org
  2008-12-20  9:00 ` jv244 at cam dot ac dot uk
                   ` (16 subsequent siblings)
  50 siblings, 0 replies; 52+ messages in thread
From: steven at gcc dot gnu dot org @ 2008-12-17 19:42 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #33 from steven at gcc dot gnu dot org  2008-12-17 19:40 -------
cfgexpand.c:defer_stack_allocation() has this gem:

  /* Without optimization, *most* variables are allocated from the
     stack, which makes the quadratic problem large exactly when we
     want compilation to proceed as quickly as possible.  On the
     other hand, we don't want the function's stack frame size to
     get completely out of hand.  So we avoid adding scalars and
     "small" aggregates to the list at all.  */
  if (optimize == 0 && tree_low_cst (DECL_SIZE_UNIT (var), 1) < 32)
    return false;

In our case, most variables are of type mpfr_type, which is ... yes, 32 bytes
:-)
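
To make the boundary concrete, here is a minimal sketch of that size test in
isolation. The function name kept_off_list and the plain integer arguments are
hypothetical stand-ins for illustration only; the real check operates on
tree_low_cst (DECL_SIZE_UNIT (var), 1) inside GCC. Going by the quoted comment,
a true result here means the variable is not added to the list at all.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the quoted test in defer_stack_allocation():
   returns true when the variable is kept off the list, i.e. when we
   are compiling without optimization and the variable is smaller than
   32 bytes.  The comparison is strict, so exactly 32 bytes fails it.  */
static bool
kept_off_list (int optimize, unsigned long size_bytes)
{
  return optimize == 0 && size_bytes < 32;
}
```

With this sketch, kept_off_list (0, 31) is true but kept_off_list (0, 32) is
false, so a 32-byte variable sits exactly on the wrong side of the strict
comparison and still ends up on the list at -O0.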


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=38474


^ permalink raw reply	[flat|nested] 52+ messages in thread

* [Bug middle-end/38474] [4.3/4.4 Regression] slow compilation at -O0 (callgraph optimization, inline heuristics, expand )
  2008-12-10 15:26 [Bug middle-end/38474] New: slow compilation at -O0 (callgraph optimization, inline heuristics, ggc expand ) jv244 at cam dot ac dot uk
                   ` (33 preceding siblings ...)
  2008-12-17 19:42 ` steven at gcc dot gnu dot org
@ 2008-12-20  9:00 ` jv244 at cam dot ac dot uk
  2008-12-20  9:56 ` steven at gcc dot gnu dot org
                   ` (15 subsequent siblings)
  50 siblings, 0 replies; 52+ messages in thread
From: jv244 at cam dot ac dot uk @ 2008-12-20  9:00 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #34 from jv244 at cam dot ac dot uk  2008-12-20 08:58 -------
BTW, should I split this PR into 4 sub-PRs, and make them block this one?

1) inline heuristics (4.3/4.4 Regression) 
2) IRA mem explosion (4.4 Regression)
3) rename registers issue (?)
4) may_alias issue (?)

This split seems sensible to me.


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=38474


^ permalink raw reply	[flat|nested] 52+ messages in thread

* [Bug middle-end/38474] [4.3/4.4 Regression] slow compilation at -O0 (callgraph optimization, inline heuristics, expand )
  2008-12-10 15:26 [Bug middle-end/38474] New: slow compilation at -O0 (callgraph optimization, inline heuristics, ggc expand ) jv244 at cam dot ac dot uk
                   ` (34 preceding siblings ...)
  2008-12-20  9:00 ` jv244 at cam dot ac dot uk
@ 2008-12-20  9:56 ` steven at gcc dot gnu dot org
  2008-12-20 11:33 ` [Bug middle-end/38474] [Meta] " jv244 at cam dot ac dot uk
                   ` (14 subsequent siblings)
  50 siblings, 0 replies; 52+ messages in thread
From: steven at gcc dot gnu dot org @ 2008-12-20  9:56 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #35 from steven at gcc dot gnu dot org  2008-12-20 09:54 -------
Re comment #34: Good idea, but add:
5) quadratic behaviour in find_temp_slot_from_address.


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=38474


^ permalink raw reply	[flat|nested] 52+ messages in thread

* [Bug middle-end/38474] [Meta] slow compilation at -O0 (callgraph optimization, inline heuristics, expand )
  2008-12-10 15:26 [Bug middle-end/38474] New: slow compilation at -O0 (callgraph optimization, inline heuristics, ggc expand ) jv244 at cam dot ac dot uk
                   ` (35 preceding siblings ...)
  2008-12-20  9:56 ` steven at gcc dot gnu dot org
@ 2008-12-20 11:33 ` jv244 at cam dot ac dot uk
  2008-12-20 15:51 ` steven at gcc dot gnu dot org
                   ` (13 subsequent siblings)
  50 siblings, 0 replies; 52+ messages in thread
From: jv244 at cam dot ac dot uk @ 2008-12-20 11:33 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #36 from jv244 at cam dot ac dot uk  2008-12-20 11:30 -------
I've added 

PR38582 : rename registers
PR38583 : ira
PR38584 : inline heuristic
PR38585 : compute_may_aliases
PR38586 : find_temp_slot_from_address

and turned this one into a meta bug.


-- 

jv244 at cam dot ac dot uk changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
            Summary|[4.3/4.4 Regression] slow   |[Meta] slow compilation at -
                   |compilation at -O0          |O0 (callgraph optimization,
                   |(callgraph optimization,    |inline heuristics, expand )
                   |inline heuristics, expand ) |


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=38474


^ permalink raw reply	[flat|nested] 52+ messages in thread

* [Bug middle-end/38474] [Meta] slow compilation at -O0 (callgraph optimization, inline heuristics, expand )
  2008-12-10 15:26 [Bug middle-end/38474] New: slow compilation at -O0 (callgraph optimization, inline heuristics, ggc expand ) jv244 at cam dot ac dot uk
                   ` (36 preceding siblings ...)
  2008-12-20 11:33 ` [Bug middle-end/38474] [Meta] " jv244 at cam dot ac dot uk
@ 2008-12-20 15:51 ` steven at gcc dot gnu dot org
  2009-01-24 10:27 ` rguenth at gcc dot gnu dot org
                   ` (12 subsequent siblings)
  50 siblings, 0 replies; 52+ messages in thread
From: steven at gcc dot gnu dot org @ 2008-12-20 15:51 UTC (permalink / raw)
  To: gcc-bugs



-- 

steven at gcc dot gnu dot org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
         AssignedTo|steven at gcc dot gnu dot   |unassigned at gcc dot gnu
                   |org                         |dot org
             Status|ASSIGNED                    |NEW


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=38474


^ permalink raw reply	[flat|nested] 52+ messages in thread

* [Bug middle-end/38474] [Meta] slow compilation at -O0 (callgraph optimization, inline heuristics, expand )
  2008-12-10 15:26 [Bug middle-end/38474] New: slow compilation at -O0 (callgraph optimization, inline heuristics, ggc expand ) jv244 at cam dot ac dot uk
                   ` (37 preceding siblings ...)
  2008-12-20 15:51 ` steven at gcc dot gnu dot org
@ 2009-01-24 10:27 ` rguenth at gcc dot gnu dot org
  2009-08-04 12:45 ` rguenth at gcc dot gnu dot org
                   ` (11 subsequent siblings)
  50 siblings, 0 replies; 52+ messages in thread
From: rguenth at gcc dot gnu dot org @ 2009-01-24 10:27 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #37 from rguenth at gcc dot gnu dot org  2009-01-24 10:21 -------
GCC 4.3.3 is being released, adjusting target milestone.


-- 

rguenth at gcc dot gnu dot org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
   Target Milestone|4.3.3                       |4.3.4


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=38474


^ permalink raw reply	[flat|nested] 52+ messages in thread

* [Bug middle-end/38474] [Meta] slow compilation at -O0 (callgraph optimization, inline heuristics, expand )
  2008-12-10 15:26 [Bug middle-end/38474] New: slow compilation at -O0 (callgraph optimization, inline heuristics, ggc expand ) jv244 at cam dot ac dot uk
                   ` (38 preceding siblings ...)
  2009-01-24 10:27 ` rguenth at gcc dot gnu dot org
@ 2009-08-04 12:45 ` rguenth at gcc dot gnu dot org
  2009-11-27  8:52 ` jv244 at cam dot ac dot uk
                   ` (10 subsequent siblings)
  50 siblings, 0 replies; 52+ messages in thread
From: rguenth at gcc dot gnu dot org @ 2009-08-04 12:45 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #38 from rguenth at gcc dot gnu dot org  2009-08-04 12:29 -------
GCC 4.3.4 is being released, adjusting target milestone.


-- 

rguenth at gcc dot gnu dot org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
   Target Milestone|4.3.4                       |4.3.5


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=38474


^ permalink raw reply	[flat|nested] 52+ messages in thread

* [Bug middle-end/38474] [Meta] slow compilation at -O0 (callgraph optimization, inline heuristics, expand )
  2008-12-10 15:26 [Bug middle-end/38474] New: slow compilation at -O0 (callgraph optimization, inline heuristics, ggc expand ) jv244 at cam dot ac dot uk
                   ` (39 preceding siblings ...)
  2009-08-04 12:45 ` rguenth at gcc dot gnu dot org
@ 2009-11-27  8:52 ` jv244 at cam dot ac dot uk
  2009-11-27  9:00 ` jv244 at cam dot ac dot uk
                   ` (9 subsequent siblings)
  50 siblings, 0 replies; 52+ messages in thread
From: jv244 at cam dot ac dot uk @ 2009-11-27  8:52 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #39 from jv244 at cam dot ac dot uk  2009-11-27 08:52 -------
I've rerun the initial (non-reduced) testcase at -O0, and memory usage is now
more reasonable (2.5Gb), with nearly all of the time now spent in 'expand'.
'expand' is now about 3 times slower than a year ago, but this is with checking
enabled, so I don't know if this is relevant:

Execution times (seconds)
 garbage collection    :   2.22 ( 0%) usr   0.00 ( 0%) sys   2.22 ( 0%) wall       0 kB ( 0%) ggc
 callgraph construction:   0.22 ( 0%) usr   0.02 ( 0%) sys   0.28 ( 0%) wall   12487 kB ( 2%) ggc
 callgraph optimization:   0.23 ( 0%) usr   0.00 ( 0%) sys   0.19 ( 0%) wall    4370 kB ( 1%) ggc
 cfg cleanup           :   0.05 ( 0%) usr   0.00 ( 0%) sys   0.05 ( 0%) wall       0 kB ( 0%) ggc
 CFG verifier          :   3.33 ( 0%) usr   0.01 ( 0%) sys   3.36 ( 0%) wall       0 kB ( 0%) ggc
 trivially dead code   :   0.92 ( 0%) usr   0.00 ( 0%) sys   0.91 ( 0%) wall       0 kB ( 0%) ggc
 df live regs          :   0.62 ( 0%) usr   0.00 ( 0%) sys   0.64 ( 0%) wall       0 kB ( 0%) ggc
 df reg dead/unused notes:   1.33 ( 0%) usr   0.02 ( 0%) sys   1.31 ( 0%) wall   19416 kB ( 3%) ggc
 register information  :   0.63 ( 0%) usr   0.01 ( 0%) sys   0.64 ( 0%) wall       0 kB ( 0%) ggc
 alias analysis        :   0.58 ( 0%) usr   0.01 ( 0%) sys   0.59 ( 0%) wall    8335 kB ( 1%) ggc
 rebuild jump labels   :   0.65 ( 0%) usr   0.00 ( 0%) sys   0.65 ( 0%) wall       0 kB ( 0%) ggc
 parser                :   4.96 ( 1%) usr   0.09 ( 2%) sys   5.06 ( 1%) wall   50732 kB ( 9%) ggc
 inline heuristics     :   0.15 ( 0%) usr   0.00 ( 0%) sys   0.11 ( 0%) wall       0 kB ( 0%) ggc
 tree gimplify         :   0.72 ( 0%) usr   0.01 ( 0%) sys   0.69 ( 0%) wall   13184 kB ( 2%) ggc
 tree eh               :   0.02 ( 0%) usr   0.00 ( 0%) sys   0.01 ( 0%) wall       0 kB ( 0%) ggc
 tree CFG construction :   0.00 ( 0%) usr   0.00 ( 0%) sys   0.01 ( 0%) wall     172 kB ( 0%) ggc
 tree find ref. vars   :   0.07 ( 0%) usr   0.00 ( 0%) sys   0.08 ( 0%) wall    3263 kB ( 1%) ggc
 tree SSA rewrite      :   0.01 ( 0%) usr   0.00 ( 0%) sys   0.01 ( 0%) wall      46 kB ( 0%) ggc
 tree SSA other        :   0.02 ( 0%) usr   0.02 ( 0%) sys   0.04 ( 0%) wall      18 kB ( 0%) ggc
 tree operand scan     :   0.06 ( 0%) usr   0.02 ( 0%) sys   0.07 ( 0%) wall     118 kB ( 0%) ggc
 tree SSA verifier     :   0.06 ( 0%) usr   0.00 ( 0%) sys   0.06 ( 0%) wall       0 kB ( 0%) ggc
 tree STMT verifier    :   1.49 ( 0%) usr   0.03 ( 1%) sys   1.53 ( 0%) wall       0 kB ( 0%) ggc
 callgraph verifier    :   1.22 ( 0%) usr   0.00 ( 0%) sys   1.25 ( 0%) wall       0 kB ( 0%) ggc
 expand                : 737.90 (94%) usr   3.54 (79%) sys 741.44 (94%) wall  309551 kB (55%) ggc
 integrated RA         :  18.91 ( 2%) usr   0.28 ( 6%) sys  19.24 ( 2%) wall    8696 kB ( 2%) ggc
 reload                :   8.08 ( 1%) usr   0.26 ( 6%) sys   8.33 ( 1%) wall  123546 kB (22%) ggc
 thread pro- & epilogue:   0.80 ( 0%) usr   0.00 ( 0%) sys   0.81 ( 0%) wall     239 kB ( 0%) ggc
 final                 :   3.13 ( 0%) usr   0.15 ( 3%) sys   3.30 ( 0%) wall     533 kB ( 0%) ggc
 symout                :   0.08 ( 0%) usr   0.02 ( 0%) sys   0.08 ( 0%) wall    4818 kB ( 1%) ggc
 TOTAL                 : 788.49             4.49           792.98             559736 kB
Extra diagnostic checks enabled; compiler may run slowly.
Configure with --enable-checking=release to disable checks.
COLLECT_GCC_OPTIONS='-ffree-line-length-512' '-g' '-ffree-form' '-ftime-report'
'-c' '-O0' '-ffree-line-length-512' '-v' '-mtune=generic'
 as -V -Qy -o PR38582.o /tmp/ccfulxg5.s


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=38474



* [Bug middle-end/38474] [Meta] slow compilation at -O0 (callgraph optimization, inline heuristics, expand )
  2008-12-10 15:26 [Bug middle-end/38474] New: slow compilation at -O0 (callgraph optimization, inline heuristics, ggc expand ) jv244 at cam dot ac dot uk
                   ` (40 preceding siblings ...)
  2009-11-27  8:52 ` jv244 at cam dot ac dot uk
@ 2009-11-27  9:00 ` jv244 at cam dot ac dot uk
  2009-11-27 10:50 ` rguenth at gcc dot gnu dot org
                   ` (8 subsequent siblings)
  50 siblings, 0 replies; 52+ messages in thread
From: jv244 at cam dot ac dot uk @ 2009-11-27  9:00 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #40 from jv244 at cam dot ac dot uk  2009-11-27 09:00 -------
With the fix for rename registers this now also runs 'fast' at -O3 (see below),
and memory is reasonable as well. Most of the time is again spent in expand.
This is the time report for -O3:

Execution times (seconds)
 garbage collection    :   7.60 ( 1%) usr   0.03 ( 0%) sys   7.65 ( 1%) wall       0 kB ( 0%) ggc
 callgraph construction:   0.23 ( 0%) usr   0.01 ( 0%) sys   0.25 ( 0%) wall   12524 kB ( 1%) ggc
 callgraph optimization:   0.48 ( 0%) usr   0.03 ( 0%) sys   0.51 ( 0%) wall    4370 kB ( 0%) ggc
 ipa cp                :   0.12 ( 0%) usr   0.00 ( 0%) sys   0.12 ( 0%) wall    2061 kB ( 0%) ggc
 ipa reference         :   0.10 ( 0%) usr   0.00 ( 0%) sys   0.11 ( 0%) wall       0 kB ( 0%) ggc
 ipa pure const        :   0.18 ( 0%) usr   0.00 ( 0%) sys   0.18 ( 0%) wall       2 kB ( 0%) ggc
 cfg cleanup           :   0.06 ( 0%) usr   0.00 ( 0%) sys   0.06 ( 0%) wall       0 kB ( 0%) ggc
 CFG verifier          :  11.18 ( 1%) usr   0.04 ( 0%) sys  11.25 ( 1%) wall       0 kB ( 0%) ggc
 trivially dead code   :   2.70 ( 0%) usr   0.01 ( 0%) sys   2.72 ( 0%) wall       0 kB ( 0%) ggc
 df multiple defs      :   3.28 ( 0%) usr   0.00 ( 0%) sys   3.28 ( 0%) wall       0 kB ( 0%) ggc
 df reaching defs      :   1.30 ( 0%) usr   0.04 ( 0%) sys   1.33 ( 0%) wall       0 kB ( 0%) ggc
 df live regs          :  11.46 ( 1%) usr   0.01 ( 0%) sys  11.47 ( 1%) wall       0 kB ( 0%) ggc
 df live&initialized regs:   6.86 ( 1%) usr   0.02 ( 0%) sys   6.87 ( 1%) wall       0 kB ( 0%) ggc
 df use-def / def-use chains:   3.87 ( 0%) usr   0.02 ( 0%) sys   3.91 ( 0%) wall       0 kB ( 0%) ggc
 df reg dead/unused notes:   9.18 ( 1%) usr   0.01 ( 0%) sys   9.23 ( 1%) wall   28894 kB ( 2%) ggc
 register information  :   3.54 ( 0%) usr   0.02 ( 0%) sys   3.58 ( 0%) wall       0 kB ( 0%) ggc
 alias analysis        :   5.55 ( 1%) usr   0.01 ( 0%) sys   5.60 ( 1%) wall   42254 kB ( 4%) ggc
 alias stmt walking    :   0.23 ( 0%) usr   0.11 ( 1%) sys   0.33 ( 0%) wall       0 kB ( 0%) ggc
 register scan         :   0.70 ( 0%) usr   0.00 ( 0%) sys   0.71 ( 0%) wall       4 kB ( 0%) ggc
 rebuild jump labels   :   1.43 ( 0%) usr   0.00 ( 0%) sys   1.46 ( 0%) wall       0 kB ( 0%) ggc
 parser                :   4.66 ( 1%) usr   0.11 ( 1%) sys   4.78 ( 1%) wall   50732 kB ( 4%) ggc
 inline heuristics     :  40.66 ( 5%) usr   8.08 (51%) sys  48.90 ( 6%) wall     112 kB ( 0%) ggc
 integration           :   0.01 ( 0%) usr   0.00 ( 0%) sys   0.06 ( 0%) wall     951 kB ( 0%) ggc
 tree gimplify         :   0.67 ( 0%) usr   0.00 ( 0%) sys   0.67 ( 0%) wall   13182 kB ( 1%) ggc
 tree eh               :   0.02 ( 0%) usr   0.00 ( 0%) sys   0.02 ( 0%) wall       0 kB ( 0%) ggc
 tree CFG construction :   0.01 ( 0%) usr   0.00 ( 0%) sys   0.00 ( 0%) wall     172 kB ( 0%) ggc
 tree CFG cleanup      :   0.01 ( 0%) usr   0.00 ( 0%) sys   0.00 ( 0%) wall       1 kB ( 0%) ggc
 tree VRP              :   0.15 ( 0%) usr   0.00 ( 0%) sys   0.17 ( 0%) wall     425 kB ( 0%) ggc
 tree copy propagation :   0.26 ( 0%) usr   0.00 ( 0%) sys   0.22 ( 0%) wall     139 kB ( 0%) ggc
 tree find ref. vars   :   0.08 ( 0%) usr   0.00 ( 0%) sys   0.09 ( 0%) wall    3262 kB ( 0%) ggc
 tree PTA              :  21.39 ( 3%) usr   0.38 ( 2%) sys  21.76 ( 3%) wall     371 kB ( 0%) ggc
 tree PHI insertion    :   0.01 ( 0%) usr   0.00 ( 0%) sys   0.00 ( 0%) wall       0 kB ( 0%) ggc
 tree SSA rewrite      :   0.03 ( 0%) usr   0.00 ( 0%) sys   0.04 ( 0%) wall    8504 kB ( 1%) ggc
 tree SSA other        :   0.04 ( 0%) usr   0.01 ( 0%) sys   0.05 ( 0%) wall      18 kB ( 0%) ggc
 tree SSA incremental  :   0.02 ( 0%) usr   0.00 ( 0%) sys   0.02 ( 0%) wall      24 kB ( 0%) ggc
 tree operand scan     :   0.04 ( 0%) usr   0.07 ( 0%) sys   0.10 ( 0%) wall    4721 kB ( 0%) ggc
 dominator optimization:   0.04 ( 0%) usr   0.00 ( 0%) sys   0.04 ( 0%) wall      68 kB ( 0%) ggc
 tree SRA              :   0.02 ( 0%) usr   0.00 ( 0%) sys   0.02 ( 0%) wall      86 kB ( 0%) ggc
 tree CCP              :   0.18 ( 0%) usr   0.00 ( 0%) sys   0.19 ( 0%) wall     105 kB ( 0%) ggc
 tree reassociation    :   0.02 ( 0%) usr   0.00 ( 0%) sys   0.05 ( 0%) wall      48 kB ( 0%) ggc
 tree PRE              :   0.11 ( 0%) usr   0.00 ( 0%) sys   0.11 ( 0%) wall     171 kB ( 0%) ggc
 tree FRE              :   0.12 ( 0%) usr   0.00 ( 0%) sys   0.13 ( 0%) wall     140 kB ( 0%) ggc
 tree code sinking     :   0.01 ( 0%) usr   0.00 ( 0%) sys   0.00 ( 0%) wall      24 kB ( 0%) ggc
 tree linearize phis   :   0.01 ( 0%) usr   0.00 ( 0%) sys   0.01 ( 0%) wall      14 kB ( 0%) ggc
 tree forward propagate:   0.03 ( 0%) usr   0.00 ( 0%) sys   0.04 ( 0%) wall       7 kB ( 0%) ggc
 tree conservative DCE :   0.40 ( 0%) usr   0.05 ( 0%) sys   0.46 ( 0%) wall       0 kB ( 0%) ggc
 tree aggressive DCE   :   0.21 ( 0%) usr   0.03 ( 0%) sys   0.19 ( 0%) wall     319 kB ( 0%) ggc
 tree buildin call DCE :   0.01 ( 0%) usr   0.00 ( 0%) sys   0.00 ( 0%) wall       0 kB ( 0%) ggc
 tree DSE              :   0.05 ( 0%) usr   0.00 ( 0%) sys   0.04 ( 0%) wall       8 kB ( 0%) ggc
 complete unrolling    :   0.00 ( 0%) usr   0.00 ( 0%) sys   0.01 ( 0%) wall      43 kB ( 0%) ggc
 tree vectorization    :   0.00 ( 0%) usr   0.00 ( 0%) sys   0.01 ( 0%) wall       0 kB ( 0%) ggc
 tree slp vectorization:   0.03 ( 0%) usr   0.00 ( 0%) sys   0.01 ( 0%) wall      27 kB ( 0%) ggc
 tree rename SSA copies:   0.04 ( 0%) usr   0.00 ( 0%) sys   0.05 ( 0%) wall       0 kB ( 0%) ggc
 tree SSA verifier     :   2.85 ( 0%) usr   0.02 ( 0%) sys   2.83 ( 0%) wall       0 kB ( 0%) ggc
 tree STMT verifier    :  13.12 ( 2%) usr   0.06 ( 0%) sys  13.20 ( 2%) wall       0 kB ( 0%) ggc
 callgraph verifier    :   1.85 ( 0%) usr   0.00 ( 0%) sys   1.86 ( 0%) wall       0 kB ( 0%) ggc
 dominance computation :   0.01 ( 0%) usr   0.00 ( 0%) sys   0.01 ( 0%) wall       0 kB ( 0%) ggc
 expand                : 548.68 (65%) usr   4.27 (27%) sys 552.92 (64%) wall  311209 kB (26%) ggc
 lower subreg          :   0.06 ( 0%) usr   0.00 ( 0%) sys   0.07 ( 0%) wall       0 kB ( 0%) ggc
 forward prop          :   6.49 ( 1%) usr   0.08 ( 1%) sys   6.57 ( 1%) wall   18623 kB ( 2%) ggc
 CSE                   :   4.60 ( 1%) usr   0.02 ( 0%) sys   4.62 ( 1%) wall   11149 kB ( 1%) ggc
 dead code elimination :   2.60 ( 0%) usr   0.01 ( 0%) sys   2.60 ( 0%) wall       0 kB ( 0%) ggc
 dead store elim1      :   3.33 ( 0%) usr   0.22 ( 1%) sys   3.51 ( 0%) wall   27472 kB ( 2%) ggc
 dead store elim2      :   8.94 ( 1%) usr   0.02 ( 0%) sys   8.92 ( 1%) wall   40503 kB ( 3%) ggc
 CPROP                 :   3.82 ( 0%) usr   0.01 ( 0%) sys   3.84 ( 0%) wall      10 kB ( 0%) ggc
 CSE 2                 :   4.43 ( 1%) usr   0.02 ( 0%) sys   4.44 ( 1%) wall    7115 kB ( 1%) ggc
 branch prediction     :   0.02 ( 0%) usr   0.00 ( 0%) sys   0.03 ( 0%) wall      43 kB ( 0%) ggc
 combiner              :   3.60 ( 0%) usr   0.03 ( 0%) sys   3.62 ( 0%) wall   13773 kB ( 1%) ggc
 regmove               :   1.00 ( 0%) usr   0.01 ( 0%) sys   1.00 ( 0%) wall       0 kB ( 0%) ggc
 integrated RA         :  30.06 ( 4%) usr   0.29 ( 2%) sys  30.38 ( 4%) wall   52314 kB ( 4%) ggc
 reload                :  11.54 ( 1%) usr   0.52 ( 3%) sys  12.09 ( 1%) wall  216344 kB (18%) ggc
 reload CSE regs       :   9.15 ( 1%) usr   0.01 ( 0%) sys   9.16 ( 1%) wall   59432 kB ( 5%) ggc
 load CSE after reload :   0.53 ( 0%) usr   0.01 ( 0%) sys   0.53 ( 0%) wall       0 kB ( 0%) ggc
 thread pro- & epilogue:   0.86 ( 0%) usr   0.00 ( 0%) sys   0.86 ( 0%) wall     302 kB ( 0%) ggc
 if-conversion 2       :   0.00 ( 0%) usr   0.00 ( 0%) sys   0.01 ( 0%) wall      24 kB ( 0%) ggc
 combine stack adjustments:   0.18 ( 0%) usr   0.00 ( 0%) sys   0.19 ( 0%) wall       0 kB ( 0%) ggc
 peephole 2            :   1.07 ( 0%) usr   0.00 ( 0%) sys   1.07 ( 0%) wall      27 kB ( 0%) ggc
 hard reg cprop        :   3.83 ( 0%) usr   0.00 ( 0%) sys   3.85 ( 0%) wall       2 kB ( 0%) ggc
 scheduling 2          :  20.89 ( 2%) usr   0.83 ( 5%) sys  21.75 ( 3%) wall  125198 kB (10%) ggc
 machine dep reorg     :   1.51 ( 0%) usr   0.00 ( 0%) sys   1.53 ( 0%) wall       0 kB ( 0%) ggc
 reorder blocks        :   0.31 ( 0%) usr   0.00 ( 0%) sys   0.30 ( 0%) wall       1 kB ( 0%) ggc
 final                 :   3.47 ( 0%) usr   0.13 ( 1%) sys   3.56 ( 0%) wall    1631 kB ( 0%) ggc
 symout                :   0.08 ( 0%) usr   0.00 ( 0%) sys   0.08 ( 0%) wall    4315 kB ( 0%) ggc
 variable tracking     :  15.85 ( 2%) usr   0.03 ( 0%) sys  15.90 ( 2%) wall  133442 kB (11%) ggc
 TOTAL                 : 844.13            15.69           860.19            1197120 kB
Extra diagnostic checks enabled; compiler may run slowly.
Configure with --enable-checking=release to disable checks.
COLLECT_GCC_OPTIONS='-ffree-line-length-512' '-g' '-ffree-form' '-ftime-report'
'-c' '-O3' '-ffree-line-length-512' '-v' '-mtune=generic'
 as -V -Qy -o PR38582.o /tmp/ccoKMKzI.s


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=38474



* [Bug middle-end/38474] [Meta] slow compilation at -O0 (callgraph optimization, inline heuristics, expand )
  2008-12-10 15:26 [Bug middle-end/38474] New: slow compilation at -O0 (callgraph optimization, inline heuristics, ggc expand ) jv244 at cam dot ac dot uk
                   ` (41 preceding siblings ...)
  2009-11-27  9:00 ` jv244 at cam dot ac dot uk
@ 2009-11-27 10:50 ` rguenth at gcc dot gnu dot org
  2009-12-03 13:37 ` matz at gcc dot gnu dot org
                   ` (7 subsequent siblings)
  50 siblings, 0 replies; 52+ messages in thread
From: rguenth at gcc dot gnu dot org @ 2009-11-27 10:50 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #41 from rguenth at gcc dot gnu dot org  2009-11-27 10:50 -------
Micha - we still spend most of the time in expand_used_vars even at -O0.
Maybe you want to have a look.

 expand                : 555.46 (92%) usr   4.88 (77%) sys 579.14 (92%) wall  310089 kB (56%) ggc
 integrated RA         :  13.46 ( 2%) usr   0.14 ( 2%) sys  13.57 ( 2%) wall    8685 kB ( 2%) ggc
 reload                :  14.29 ( 2%) usr   0.71 (11%) sys  15.04 ( 2%) wall  123548 kB (22%) ggc
 TOTAL                 : 605.85             6.35           631.31             552067 kB

We also still peak at 2.4GB for this testcase...  the detailed memory
report is as follows (just the biggest pieces):

Kind                   Nodes      Bytes
---------------------------------------
decls                 145133   24456664
exprs                 490000   28838464
random kinds          226360    9054736
---------------------------------------
Total                 923230   66015565

GIMPLE statements
Kind                   Stmts      Bytes
---------------------------------------
assignments              361      29112
phi nodes                  1        240
conditionals              61       4880
sequences               3874      92976
everything else        78134   10240768
---------------------------------------
Total                  82431   10367976

(it would probably be interesting to count calls separately)

RTX Kind               Count      Bytes
---------------------------------------
expr_list            1113875   26733000
insn                 1106192   79645824
set                  1106448   26554752
reg                   829905   26556960
mem                  2508095   60194280
plus                 2894284   69462816
---------------------------------------
Total                10615719  311545008

DF as usual is a big memory consumer, even at -O0 ...

source location                               Garbage            Freed             Leak         Overhead      Times
emit-rtl.c:907 (gen_reg_rtx)           11781120: 2.3%   11769792:33.0%          0: 0.0%    6519744:60.7%         18
rtl.c:285 (copy_rtx)                   21352344: 4.1%          0: 0.0%          0: 0.0%          0: 0.0%     889681
emit-rtl.c:425 (gen_raw_REG)           26543328: 5.1%          0: 0.0%      13600: 0.1%          0: 0.0%     829904
reload1.c:2622 (eliminate_regs_1)      26589096: 5.1%          0: 0.0%          0: 0.0%          0: 0.0%    1107879
emit-rtl.c:640 (gen_rtx_MEM)           59946576:11.5%          0: 0.0%     247704: 1.2%          0: 0.0%    2508095
emit-rtl.c:5457 (copy_insn_1)          61260256:11.8%          0: 0.0%          0: 0.0%          0: 0.0%    2608317
emit-rtl.c:3610 (make_insn_raw)        79645680:15.3%          0: 0.0%          0: 0.0%          0: 0.0%    1106190
Total                                       520012383         35665696         20430230         10739229   16135674


So most of the memory is used in the RTL parts of the compiler.


-- 

rguenth at gcc dot gnu dot org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |matz at gcc dot gnu dot org


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=38474



* [Bug middle-end/38474] [Meta] slow compilation at -O0 (callgraph optimization, inline heuristics, expand )
  2008-12-10 15:26 [Bug middle-end/38474] New: slow compilation at -O0 (callgraph optimization, inline heuristics, ggc expand ) jv244 at cam dot ac dot uk
                   ` (42 preceding siblings ...)
  2009-11-27 10:50 ` rguenth at gcc dot gnu dot org
@ 2009-12-03 13:37 ` matz at gcc dot gnu dot org
  2009-12-03 17:48 ` jv244 at cam dot ac dot uk
                   ` (6 subsequent siblings)
  50 siblings, 0 replies; 52+ messages in thread
From: matz at gcc dot gnu dot org @ 2009-12-03 13:37 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #42 from matz at gcc dot gnu dot org  2009-12-03 13:36 -------
Subject: Bug 38474

Author: matz
Date: Thu Dec  3 13:36:32 2009
New Revision: 154945

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=154945
Log:
        PR middle-end/38474
        * cfgexpand.c (struct stack_var): Add conflicts member.
        (stack_vars_conflict, stack_vars_conflict_alloc,
        n_stack_vars_conflict): Remove.
        (add_stack_var): Initialize conflicts member.
        (triangular_index, resize_stack_vars_conflict): Remove.
        (add_stack_var_conflict, stack_var_conflict_p): Rewrite in
        terms of new member.
        (union_stack_vars): Only run over the conflicts.
        (partition_stack_vars): Remove special case.
        (expand_used_vars_for_block): Don't call resize_stack_vars_conflict,
        don't create self-conflicts.
        (account_used_vars_for_block): Don't create any conflicts.
        (fini_vars_expansion): Free bitmaps, don't free or clear removed
        globals.

Modified:
    trunk/gcc/ChangeLog
    trunk/gcc/cfgexpand.c
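
As a rough illustration of the change this log describes (not the actual
cfgexpand.c code), the sketch below contrasts a dense triangular conflict
table, which needs n*(n+1)/2 flags for n stack variables, with a per-variable
conflict list that only stores conflicts that actually exist. The names
var_conflicts, has_peer, add_peer, add_conflict, conflict_p and conflict_demo
are hypothetical, and a plain index array stands in for the bitmaps the log
mentions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Dense scheme: one flag per unordered pair (i, j), stored in a
   lower-triangular array of n*(n+1)/2 entries.  */
static size_t triangular_index(size_t i, size_t j)
{
    if (i < j) {
        size_t t = i;
        i = j;
        j = t;
    }
    return i * (i + 1) / 2 + j;
}

/* Sparse scheme (hypothetical layout): each variable records only the
   indices of the variables it conflicts with.  */
struct var_conflicts {
    size_t *peers;
    size_t count;
    size_t capacity;
};

static bool has_peer(const struct var_conflicts *v, size_t other)
{
    for (size_t k = 0; k < v->count; k++)
        if (v->peers[k] == other)
            return true;
    return false;
}

static void add_peer(struct var_conflicts *v, size_t other)
{
    if (has_peer(v, other))
        return;                     /* already recorded, keep the list unique */
    if (v->count == v->capacity) {
        size_t cap = v->capacity ? 2 * v->capacity : 4;
        size_t *p = realloc(v->peers, cap * sizeof *p);
        if (p == NULL)
            abort();
        v->peers = p;
        v->capacity = cap;
    }
    v->peers[v->count++] = other;
}

/* Conflicts are symmetric, so record them on both sides.  */
static void add_conflict(struct var_conflicts *vars, size_t x, size_t y)
{
    add_peer(&vars[x], y);
    add_peer(&vars[y], x);
}

static bool conflict_p(const struct var_conflicts *vars, size_t x, size_t y)
{
    return has_peer(&vars[x], y);
}

/* Small self-check: three variables, one conflict between 0 and 2.  */
static void conflict_demo(void)
{
    struct var_conflicts vars[3] = {{0}};

    add_conflict(vars, 0, 2);
    assert(conflict_p(vars, 0, 2) && conflict_p(vars, 2, 0));
    assert(!conflict_p(vars, 0, 1));
    assert(vars[1].count == 0);     /* variable 1 costs no conflict storage */

    for (size_t i = 0; i < 3; i++)
        free(vars[i].peers);
}
```

With the sparse layout, a pass that merges two variables only has to visit the
recorded peers of each one rather than scan a full row of the table, which is
presumably what "Only run over the conflicts" refers to.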


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=38474



* [Bug middle-end/38474] [Meta] slow compilation at -O0 (callgraph optimization, inline heuristics, expand )
  2008-12-10 15:26 [Bug middle-end/38474] New: slow compilation at -O0 (callgraph optimization, inline heuristics, ggc expand ) jv244 at cam dot ac dot uk
                   ` (43 preceding siblings ...)
  2009-12-03 13:37 ` matz at gcc dot gnu dot org
@ 2009-12-03 17:48 ` jv244 at cam dot ac dot uk
  2009-12-03 21:05 ` matz at gcc dot gnu dot org
                   ` (5 subsequent siblings)
  50 siblings, 0 replies; 52+ messages in thread
From: jv244 at cam dot ac dot uk @ 2009-12-03 17:48 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #43 from jv244 at cam dot ac dot uk  2009-12-03 17:47 -------
(In reply to comment #42)
> Subject: Bug 38474
> 
> Author: matz
> Date: Thu Dec  3 13:36:32 2009
> New Revision: 154945

Looks like the initial testcase now runs with 1.3Gb, and with the following
timings (so memory and time are both better by roughly a factor of two):
 expand                : 386.46 (89%) usr   0.81 (48%) sys 387.21 (89%) wall  309554 kB (56%) ggc
 integrated RA         :  17.97 ( 4%) usr   0.26 (15%) sys  18.28 ( 4%) wall    8696 kB ( 2%) ggc
 reload                :   7.78 ( 2%) usr   0.25 (15%) sys   8.07 ( 2%) wall  123546 kB (22%) ggc
 thread pro- & epilogue:   0.74 ( 0%) usr   0.00 ( 0%) sys   0.76 ( 0%) wall     239 kB ( 0%) ggc
 final                 :   2.84 ( 1%) usr   0.12 ( 7%) sys   2.95 ( 1%) wall      20 kB ( 0%) ggc
 TOTAL                 : 434.29             1.70           436.00             553866 kB
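
The "factor of two" can be checked with a small calculation. The sketch below
assumes the baseline is the earlier -O0 run reported in this thread by the same
author (792.98 s wall, 2.5Gb), which the message does not state explicitly; the
helper names improvement and improvement_demo are made up for the example.

```c
#include <assert.h>
#include <stdio.h>

/* Ratio of an old measurement to a new one; values above 1 are gains.  */
static double improvement(double before, double after)
{
    return before / after;
}

static void improvement_demo(void)
{
    /* Assumed baseline: the earlier -O0 report (792.98 s wall, 2.5Gb).
       New values: this report (436.00 s wall, 1.3Gb).  */
    double time_gain = improvement(792.98, 436.00);
    double mem_gain = improvement(2.5, 1.3);

    assert(time_gain > 1.8 && time_gain < 1.9);
    assert(mem_gain > 1.9 && mem_gain < 2.0);
    printf("wall time: %.2fx, peak memory: %.2fx\n", time_gain, mem_gain);
}
```

Under that assumption both ratios come out slightly below 2.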


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=38474



* [Bug middle-end/38474] [Meta] slow compilation at -O0 (callgraph optimization, inline heuristics, expand )
  2008-12-10 15:26 [Bug middle-end/38474] New: slow compilation at -O0 (callgraph optimization, inline heuristics, ggc expand ) jv244 at cam dot ac dot uk
                   ` (44 preceding siblings ...)
  2009-12-03 17:48 ` jv244 at cam dot ac dot uk
@ 2009-12-03 21:05 ` matz at gcc dot gnu dot org
  2009-12-08 13:56 ` matz at gcc dot gnu dot org
                   ` (4 subsequent siblings)
  50 siblings, 0 replies; 52+ messages in thread
From: matz at gcc dot gnu dot org @ 2009-12-03 21:05 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #44 from matz at gcc dot gnu dot org  2009-12-03 21:05 -------
I'm glad.  I also plan to work on the other slow part of expand, which is the
temp slot goo, but a full solution requires touching very old and stable parts
of GCC, and hence is IMO nothing for stage 3.  I have an obvious band-aid patch
giving at least some further improvement that I plan to submit for 4.5.


-- 

matz at gcc dot gnu dot org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
         AssignedTo|unassigned at gcc dot gnu   |matz at gcc dot gnu dot org
                   |dot org                     |
             Status|NEW                         |ASSIGNED
   Last reconfirmed|2008-12-16 16:26:45         |2009-12-03 21:05:09
               date|                            |


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=38474



* [Bug middle-end/38474] [Meta] slow compilation at -O0 (callgraph optimization, inline heuristics, expand )
  2008-12-10 15:26 [Bug middle-end/38474] New: slow compilation at -O0 (callgraph optimization, inline heuristics, ggc expand ) jv244 at cam dot ac dot uk
                   ` (45 preceding siblings ...)
  2009-12-03 21:05 ` matz at gcc dot gnu dot org
@ 2009-12-08 13:56 ` matz at gcc dot gnu dot org
  2010-05-22 18:33 ` rguenth at gcc dot gnu dot org
                   ` (3 subsequent siblings)
  50 siblings, 0 replies; 52+ messages in thread
From: matz at gcc dot gnu dot org @ 2009-12-08 13:56 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #45 from matz at gcc dot gnu dot org  2009-12-08 13:56 -------
Subject: Bug 38474

Author: matz
Date: Tue Dec  8 13:56:06 2009
New Revision: 155087

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=155087
Log:
        PR middle-end/38474
        * function.c (free_temp_slots): Only walk the temp slot
        addresses and combine slots if we actually changed something.
        (pop_temp_slots): Ditto.

Modified:
    trunk/gcc/ChangeLog
    trunk/gcc/function.c
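
The idea in this log, running the expensive clean-up only when a slot was
actually released, can be sketched as follows. This is an illustration under
assumed data structures, not the function.c code: slot_sketch,
combine_adjacent_slots, free_unkept_slots and temp_slot_demo are hypothetical
names, and the combine step is reduced to a counter.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a temporary stack slot.  */
struct slot_sketch {
    bool in_use;
    bool keep;      /* slots marked "keep" survive a free request */
};

/* Counts how often the (in reality expensive) combine pass ran.  */
static unsigned long combine_passes;

static void combine_adjacent_slots(struct slot_sketch *slots, size_t n)
{
    (void) slots;
    (void) n;
    combine_passes++;
}

/* Release every in-use slot that is not marked "keep", and walk the
   slots again to combine them only if at least one was released.  */
static void free_unkept_slots(struct slot_sketch *slots, size_t n)
{
    bool some_freed = false;

    for (size_t i = 0; i < n; i++) {
        if (slots[i].in_use && !slots[i].keep) {
            slots[i].in_use = false;
            some_freed = true;
        }
    }

    if (some_freed)
        combine_adjacent_slots(slots, n);
}

static void temp_slot_demo(void)
{
    struct slot_sketch slots[2] = {
        { true, true },     /* kept: never freed */
        { true, false },    /* freed by the first call */
    };
    unsigned long start = combine_passes;

    free_unkept_slots(slots, 2);
    assert(combine_passes == start + 1);

    /* Nothing is left to free, so the combine pass is skipped.  */
    free_unkept_slots(slots, 2);
    assert(combine_passes == start + 1);
}
```

When the function is called very often and usually has nothing to free, the
guard turns most calls into a single cheap scan.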


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=38474



* [Bug middle-end/38474] [Meta] slow compilation at -O0 (callgraph optimization, inline heuristics, expand )
  2008-12-10 15:26 [Bug middle-end/38474] New: slow compilation at -O0 (callgraph optimization, inline heuristics, ggc expand ) jv244 at cam dot ac dot uk
                   ` (46 preceding siblings ...)
  2009-12-08 13:56 ` matz at gcc dot gnu dot org
@ 2010-05-22 18:33 ` rguenth at gcc dot gnu dot org
  2010-05-23  6:31 ` jv244 at cam dot ac dot uk
                   ` (2 subsequent siblings)
  50 siblings, 0 replies; 52+ messages in thread
From: rguenth at gcc dot gnu dot org @ 2010-05-22 18:33 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #46 from rguenth at gcc dot gnu dot org  2010-05-22 18:12 -------
GCC 4.3.5 is being released, adjusting target milestone.


-- 

rguenth at gcc dot gnu dot org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
   Target Milestone|4.3.5                       |4.3.6


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=38474



* [Bug middle-end/38474] [Meta] slow compilation at -O0 (callgraph optimization, inline heuristics, expand )
  2008-12-10 15:26 [Bug middle-end/38474] New: slow compilation at -O0 (callgraph optimization, inline heuristics, ggc expand ) jv244 at cam dot ac dot uk
                   ` (47 preceding siblings ...)
  2010-05-22 18:33 ` rguenth at gcc dot gnu dot org
@ 2010-05-23  6:31 ` jv244 at cam dot ac dot uk
  2010-05-23 20:09 ` rguenth at gcc dot gnu dot org
  2010-05-23 21:03 ` [Bug middle-end/38474] slow compilation at -O0 due to expand's temp slot goo steven at gcc dot gnu dot org
  50 siblings, 0 replies; 52+ messages in thread
From: jv244 at cam dot ac dot uk @ 2010-05-23  6:31 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #47 from jv244 at cam dot ac dot uk  2010-05-23 06:31 -------
all dependencies are fixed, and so is this bug.


-- 

jv244 at cam dot ac dot uk changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|ASSIGNED                    |RESOLVED
         Resolution|                            |FIXED


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=38474



* [Bug middle-end/38474] [Meta] slow compilation at -O0 (callgraph optimization, inline heuristics, expand )
  2008-12-10 15:26 [Bug middle-end/38474] New: slow compilation at -O0 (callgraph optimization, inline heuristics, ggc expand ) jv244 at cam dot ac dot uk
                   ` (48 preceding siblings ...)
  2010-05-23  6:31 ` jv244 at cam dot ac dot uk
@ 2010-05-23 20:09 ` rguenth at gcc dot gnu dot org
  2010-05-23 21:03 ` [Bug middle-end/38474] slow compilation at -O0 due to expand's temp slot goo steven at gcc dot gnu dot org
  50 siblings, 0 replies; 52+ messages in thread
From: rguenth at gcc dot gnu dot org @ 2010-05-23 20:09 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #48 from rguenth at gcc dot gnu dot org  2010-05-23 20:08 -------
Nope.  See comment#44.


-- 

rguenth at gcc dot gnu dot org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|RESOLVED                    |REOPENED
         Resolution|FIXED                       |


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=38474



* [Bug middle-end/38474] slow compilation at -O0 due to expand's temp slot goo
  2008-12-10 15:26 [Bug middle-end/38474] New: slow compilation at -O0 (callgraph optimization, inline heuristics, ggc expand ) jv244 at cam dot ac dot uk
                   ` (49 preceding siblings ...)
  2010-05-23 20:09 ` rguenth at gcc dot gnu dot org
@ 2010-05-23 21:03 ` steven at gcc dot gnu dot org
  50 siblings, 0 replies; 52+ messages in thread
From: steven at gcc dot gnu dot org @ 2010-05-23 21:03 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #49 from steven at gcc dot gnu dot org  2010-05-23 21:02 -------
Let's change the bug type at least, from a meta bug to a normal bug.


-- 

steven at gcc dot gnu dot org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
            Summary|[Meta] slow compilation at -|slow compilation at -O0 due
                   |O0 (callgraph optimization, |to expand's temp slot goo
                   |inline heuristics, expand ) |


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=38474



end of thread, other threads:[~2010-05-23 21:03 UTC | newest]

Thread overview: 52+ messages
2008-12-10 15:26 [Bug middle-end/38474] New: slow compilation at -O0 (callgraph optimization, inline heuristics, ggc expand ) jv244 at cam dot ac dot uk
2008-12-10 15:27 ` [Bug middle-end/38474] " jv244 at cam dot ac dot uk
2008-12-10 15:41 ` [Bug middle-end/38474] slow compilation at -O0 (callgraph optimization, inline heuristics, " rguenth at gcc dot gnu dot org
2008-12-10 16:14 ` jv244 at cam dot ac dot uk
2008-12-10 16:58 ` rguenth at gcc dot gnu dot org
2008-12-10 22:35 ` jv244 at cam dot ac dot uk
2008-12-10 22:49 ` rguenth at gcc dot gnu dot org
2008-12-10 22:58 ` jv244 at cam dot ac dot uk
2008-12-11  8:28 ` jv244 at cam dot ac dot uk
2008-12-11 11:35 ` rguenth at gcc dot gnu dot org
2008-12-11 12:04 ` jv244 at cam dot ac dot uk
2008-12-15 19:39 ` jv244 at cam dot ac dot uk
2008-12-15 21:19 ` steven at gcc dot gnu dot org
2008-12-15 21:29 ` steven at gcc dot gnu dot org
2008-12-15 21:54 ` steven at gcc dot gnu dot org
2008-12-15 21:56 ` steven at gcc dot gnu dot org
2008-12-15 21:57 ` steven at gcc dot gnu dot org
2008-12-16  7:52 ` jv244 at cam dot ac dot uk
2008-12-16 11:59 ` jv244 at cam dot ac dot uk
2008-12-16 12:46 ` steven at gcc dot gnu dot org
2008-12-16 12:48 ` jv244 at cam dot ac dot uk
2008-12-16 12:50 ` jv244 at cam dot ac dot uk
2008-12-16 13:43 ` steven at gcc dot gnu dot org
2008-12-16 14:19 ` jv244 at cam dot ac dot uk
2008-12-16 14:21 ` jv244 at cam dot ac dot uk
2008-12-16 16:19 ` jv244 at cam dot ac dot uk
2008-12-16 16:28 ` steven at gcc dot gnu dot org
2008-12-16 16:31 ` jv244 at cam dot ac dot uk
2008-12-16 19:35 ` [Bug middle-end/38474] [4.3/4.4 Regression] " pinskia at gcc dot gnu dot org
2008-12-16 20:32 ` jv244 at cam dot ac dot uk
2008-12-17  6:52 ` jv244 at cam dot ac dot uk
2008-12-17  7:03 ` steven at gcc dot gnu dot org
2008-12-17  8:37 ` jv244 at cam dot ac dot uk
2008-12-17 12:59 ` jv244 at cam dot ac dot uk
2008-12-17 19:42 ` steven at gcc dot gnu dot org
2008-12-20  9:00 ` jv244 at cam dot ac dot uk
2008-12-20  9:56 ` steven at gcc dot gnu dot org
2008-12-20 11:33 ` [Bug middle-end/38474] [Meta] " jv244 at cam dot ac dot uk
2008-12-20 15:51 ` steven at gcc dot gnu dot org
2009-01-24 10:27 ` rguenth at gcc dot gnu dot org
2009-08-04 12:45 ` rguenth at gcc dot gnu dot org
2009-11-27  8:52 ` jv244 at cam dot ac dot uk
2009-11-27  9:00 ` jv244 at cam dot ac dot uk
2009-11-27 10:50 ` rguenth at gcc dot gnu dot org
2009-12-03 13:37 ` matz at gcc dot gnu dot org
2009-12-03 17:48 ` jv244 at cam dot ac dot uk
2009-12-03 21:05 ` matz at gcc dot gnu dot org
2009-12-08 13:56 ` matz at gcc dot gnu dot org
2010-05-22 18:33 ` rguenth at gcc dot gnu dot org
2010-05-23  6:31 ` jv244 at cam dot ac dot uk
2010-05-23 20:09 ` rguenth at gcc dot gnu dot org
2010-05-23 21:03 ` [Bug middle-end/38474] slow compilation at -O0 due to expand's temp slot goo steven at gcc dot gnu dot org
