public inbox for gcc-bugs@sourceware.org
* [Bug optimization/2692] excessive compile time with optimization
       [not found] <20010429210601.2692.snyder@fnal.gov>
@ 2003-07-25 10:49 ` steven at gcc dot gnu dot org
  2003-07-25 15:35 ` steven at gcc dot gnu dot org
                   ` (9 subsequent siblings)
  10 siblings, 0 replies; 11+ messages in thread
From: steven at gcc dot gnu dot org @ 2003-07-25 10:49 UTC
  To: gcc-bugs

PLEASE REPLY TO gcc-bugzilla@gcc.gnu.org ONLY, *NOT* gcc-bugs@gcc.gnu.org.

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=2692



------- Additional Comments From steven at gcc dot gnu dot org  2003-07-25 10:49 -------
This is still slow, but not as bad as it used to be.  Here are the time reports
I get on an Athlon XP2000 with 256MB RAM, for "g++-3.4 (GCC) 3.4 20030718
(experimental)":

$ g++-3.4 -c -ftime-report 2692.cc
 
Execution times (seconds)
 cfg construction      :   0.02 ( 2%) usr   0.00 ( 0%) sys   0.02 ( 2%) wall
 trivially dead code   :   0.01 ( 1%) usr   0.00 ( 0%) sys   0.01 ( 1%) wall
 life analysis         :   0.10 ( 9%) usr   0.00 ( 0%) sys   0.10 ( 8%) wall
 life info update      :   0.03 ( 3%) usr   0.00 ( 0%) sys   0.03 ( 3%) wall
 register scan         :   0.01 ( 1%) usr   0.00 ( 0%) sys   0.01 ( 1%) wall
 parser                :   0.12 (11%) usr   0.01 (17%) sys   0.14 (12%) wall
 name lookup           :   0.01 ( 1%) usr   0.02 (33%) sys   0.04 ( 3%) wall
 expand                :   0.09 ( 8%) usr   0.01 (17%) sys   0.10 ( 8%) wall
 integration           :   0.03 ( 3%) usr   0.00 ( 0%) sys   0.03 ( 3%) wall
 flow analysis         :   0.02 ( 2%) usr   0.00 ( 0%) sys   0.02 ( 2%) wall
 local alloc           :   0.11 (10%) usr   0.00 ( 0%) sys   0.11 ( 9%) wall
 global alloc          :   0.34 (31%) usr   0.00 ( 0%) sys   0.34 (29%) wall
 flow 2                :   0.03 ( 3%) usr   0.00 ( 0%) sys   0.03 ( 3%) wall
 shorten branches      :   0.04 ( 4%) usr   0.00 ( 0%) sys   0.04 ( 3%) wall
 reg stack             :   0.01 ( 1%) usr   0.00 ( 0%) sys   0.01 ( 1%) wall
 final                 :   0.06 ( 5%) usr   0.01 (17%) sys   0.07 ( 6%) wall
 rest of compilation   :   0.08 ( 7%) usr   0.01 (17%) sys   0.09 ( 8%) wall
 TOTAL                 :   1.11             0.06             1.19
$ g++-3.4 -c -O -ftime-report 2692.cc
 
Execution times (seconds)
 garbage collection    :   0.17 ( 0%) usr   0.01 ( 3%) sys   0.68 ( 0%) wall
 cfg construction      :   0.11 ( 0%) usr   0.00 ( 0%) sys   0.11 ( 0%) wall
 cfg cleanup           :   0.02 ( 0%) usr   0.00 ( 0%) sys   0.07 ( 0%) wall
 trivially dead code   :   0.13 ( 0%) usr   0.01 ( 3%) sys   0.14 ( 0%) wall
 life analysis         :  99.39 (67%) usr   0.06 (17%) sys 105.98 (66%) wall
 life info update      :   0.04 ( 0%) usr   0.00 ( 0%) sys   0.04 ( 0%) wall
 alias analysis        :   0.21 ( 0%) usr   0.01 ( 3%) sys   0.24 ( 0%) wall
 register scan         :   0.08 ( 0%) usr   0.01 ( 3%) sys   0.09 ( 0%) wall
 rebuild jump labels   :   0.05 ( 0%) usr   0.00 ( 0%) sys   0.05 ( 0%) wall
 preprocessing         :   0.01 ( 0%) usr   0.00 ( 0%) sys   0.01 ( 0%) wall
 parser                :   0.12 ( 0%) usr   0.00 ( 0%) sys   0.13 ( 0%) wall
 name lookup           :   0.00 ( 0%) usr   0.03 ( 8%) sys   0.03 ( 0%) wall
 expand                :   0.52 ( 0%) usr   0.05 (14%) sys   0.60 ( 0%) wall
 varconst              :   0.01 ( 0%) usr   0.00 ( 0%) sys   0.01 ( 0%) wall
 integration           :   0.47 ( 0%) usr   0.01 ( 3%) sys   0.48 ( 0%) wall
 jump                  :   0.06 ( 0%) usr   0.00 ( 0%) sys   0.06 ( 0%) wall
 CSE                   :   2.65 ( 2%) usr   0.03 ( 8%) sys   3.05 ( 2%) wall
 loop analysis         :   0.01 ( 0%) usr   0.00 ( 0%) sys   0.01 ( 0%) wall
 branch prediction     :   0.06 ( 0%) usr   0.00 ( 0%) sys   0.06 ( 0%) wall
 flow analysis         :   0.02 ( 0%) usr   0.00 ( 0%) sys   0.03 ( 0%) wall
 combiner              :   6.36 ( 4%) usr   0.00 ( 0%) sys   6.79 ( 4%) wall
 if-conversion         :   0.01 ( 0%) usr   0.00 ( 0%) sys   0.02 ( 0%) wall
 local alloc           :   0.33 ( 0%) usr   0.01 ( 3%) sys   0.48 ( 0%) wall
 global alloc          :  34.50 (23%) usr   0.11 (31%) sys  39.30 (24%) wall
 reload CSE regs       :   1.60 ( 1%) usr   0.00 ( 0%) sys   1.81 ( 1%) wall
 flow 2                :   0.06 ( 0%) usr   0.00 ( 0%) sys   0.07 ( 0%) wall
 rename registers      :   0.14 ( 0%) usr   0.00 ( 0%) sys   0.15 ( 0%) wall
 shorten branches      :   0.06 ( 0%) usr   0.00 ( 0%) sys   0.08 ( 0%) wall
 reg stack             :   0.02 ( 0%) usr   0.00 ( 0%) sys   0.04 ( 0%) wall
 final                 :   0.09 ( 0%) usr   0.01 ( 3%) sys   0.13 ( 0%) wall
 rest of compilation   :   0.31 ( 0%) usr   0.00 ( 0%) sys   0.32 ( 0%) wall
 TOTAL                 : 147.62             0.36           161.10

So the expand hog is gone :-)

It's not a surprise that, for the test case for this PR, global alloc and life
analysis take so much time.  It would obviously be nice to have it faster, but
it is not the awful compile time hog anymore.
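
For readers who want to rank the passes in a report like the ones above mechanically, here is a minimal sketch.  It assumes the line layout shown in this message (name, colon, then usr/sys/wall columns with percentages); the regular expression and the helper name top_passes are illustrative and not part of GCC.

```python
import re

# Matches one per-pass line of a -ftime-report table, for example:
# " life analysis         :  99.39 (67%) usr   0.06 (17%) sys 105.98 (66%) wall"
# The TOTAL line has no "usr"/"sys"/"wall" labels, so it is skipped.
LINE = re.compile(
    r"^\s*(?P<name>.+?)\s*:\s*(?P<usr>\d+\.\d+)\s*\(\s*\d+%\)\s*usr"
    r"\s+(?P<sys>\d+\.\d+)\s*\(\s*\d+%\)\s*sys"
    r"\s+(?P<wall>\d+\.\d+)\s*\(\s*\d+%\)\s*wall\s*$"
)


def top_passes(report, n=3):
    """Return the n passes with the largest usr time as (name, seconds)."""
    rows = []
    for line in report.splitlines():
        m = LINE.match(line)
        if m:
            rows.append((m.group("name"), float(m.group("usr"))))
    rows.sort(key=lambda row: row[1], reverse=True)
    return rows[:n]


# A few lines copied from the -O report above.
SAMPLE = """\
 life analysis         :  99.39 (67%) usr   0.06 (17%) sys 105.98 (66%) wall
 CSE                   :   2.65 ( 2%) usr   0.03 ( 8%) sys   3.05 ( 2%) wall
 combiner              :   6.36 ( 4%) usr   0.00 ( 0%) sys   6.79 ( 4%) wall
 global alloc          :  34.50 (23%) usr   0.11 (31%) sys  39.30 (24%) wall
 TOTAL                 : 147.62             0.36           161.10
"""

if __name__ == "__main__":
    for name, usr in top_passes(SAMPLE):
        print(f"{name:<22} {usr:8.2f} s")
```

On the sample lines this puts life analysis first, then global alloc, then the combiner, which matches the reading of the report given above.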

Richard, I have not reconfirmed this PR because I am not sure what's reasonable
here.  Do you think this report can be closed, or do you think these timings
are still unacceptable?

Gr.


* [Bug optimization/2692] excessive compile time with optimization
       [not found] <20010429210601.2692.snyder@fnal.gov>
  2003-07-25 10:49 ` [Bug optimization/2692] excessive compile time with optimization steven at gcc dot gnu dot org
@ 2003-07-25 15:35 ` steven at gcc dot gnu dot org
  2003-07-25 16:56 ` steven at gcc dot gnu dot org
                   ` (8 subsequent siblings)
  10 siblings, 0 replies; 11+ messages in thread
From: steven at gcc dot gnu dot org @ 2003-07-25 15:35 UTC
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=2692



------- Additional Comments From steven at gcc dot gnu dot org  2003-07-25 15:35 -------
Compiler:
GNU C++ version 3.4 20030725 (experimental) (i686-pc-linux-gnu)
        compiled by GNU C version 3.4 20030725 (experimental).
GGC heuristics: --param ggc-min-expand=47 --param ggc-min-heapsize=31916

Flags:
-O -quiet

File:
z.cc

Flat profile:
                                                                               
                       Each sample counts as 0.01 seconds.
  %   cumulative   self              self     total
 time   seconds   seconds    calls  ms/call  ms/call  name
 10.73     40.70    40.70    19930     2.04     3.22  find_equiv_reg
  9.68     77.41    36.71 193175759     0.00     0.00  find_base_term
  9.53    113.58    36.17 178499488     0.00     0.00  refers_to_regno_p
  8.26    144.90    31.32 446734067     0.00     0.00  canon_rtx
  7.49    173.30    28.40 467218101     0.00     0.00  rtx_equal_p
  7.20    200.62    27.32 176913679     0.00     0.00  read_dependence
  6.26    224.38    23.75 73755132     0.00     0.00  addr_side_effect_eval
  5.96    246.99    22.61 295299380     0.00     0.00  true_regnum
  4.69    264.79    17.80 175470757     0.00     0.00  canon_true_dependence
  4.38    281.39    16.61 578096780     0.00     0.00  ix86_find_base_term
  4.36    297.94    16.54 178512386     0.00     0.00  reg_overlap_mentioned_p
  4.11    313.51    15.58 578101529     0.00     0.00  i386_output_dwarf_dtprel
  3.67    327.43    13.91 18646114     0.00     0.00  regno_clobbered_at_setjmp
  2.38    336.45     9.02 176829268     0.00     0.00  main
  1.29    341.36     4.91 175470757     0.00     0.00  anti_dependence
  1.07    345.41     4.05  3014519     0.00     0.00  propagate_block
  (all others <1%)

So find_equiv_reg is a bottleneck for this code.
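
As a sanity check on how to read the flat profile, the per-call columns can be recomputed from the other columns.  This is a small sketch using the standard gprof meaning of the columns ("self ms/call" is self seconds divided by calls; "total ms/call" also includes time spent in callees); the function names are illustrative.

```python
def self_ms_per_call(self_seconds, calls):
    """Average milliseconds spent in the function itself, per call."""
    return self_seconds * 1000.0 / calls


def inclusive_seconds(total_ms_per_call, calls):
    """Approximate seconds spent in the function plus its callees."""
    return total_ms_per_call * calls / 1000.0


if __name__ == "__main__":
    # find_equiv_reg row from the profile above:
    # 40.70 self seconds, 19930 calls, 3.22 total ms/call.
    print(round(self_ms_per_call(40.70, 19930), 2))   # 2.04
    print(round(inclusive_seconds(3.22, 19930), 1))   # 64.2
```

The first value reproduces the 2.04 ms/call shown in the table.  The second suggests that find_equiv_reg together with its callees accounts for roughly 64 seconds, noticeably more than its 40.70 seconds of self time.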


* [Bug optimization/2692] excessive compile time with optimization
       [not found] <20010429210601.2692.snyder@fnal.gov>
  2003-07-25 10:49 ` [Bug optimization/2692] excessive compile time with optimization steven at gcc dot gnu dot org
  2003-07-25 15:35 ` steven at gcc dot gnu dot org
@ 2003-07-25 16:56 ` steven at gcc dot gnu dot org
  2003-08-16 14:23 ` pinskia at gcc dot gnu dot org
                   ` (7 subsequent siblings)
  10 siblings, 0 replies; 11+ messages in thread
From: steven at gcc dot gnu dot org @ 2003-07-25 16:56 UTC
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=2692


steven at gcc dot gnu dot org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
   Last reconfirmed|0000-00-00 00:00:00         |2003-07-25 16:56:52
               date|                            |


------- Additional Comments From steven at gcc dot gnu dot org  2003-07-25 16:56 -------
Bug 10776 may be related to this one.


* [Bug optimization/2692] excessive compile time with optimization
       [not found] <20010429210601.2692.snyder@fnal.gov>
                   ` (2 preceding siblings ...)
  2003-07-25 16:56 ` steven at gcc dot gnu dot org
@ 2003-08-16 14:23 ` pinskia at gcc dot gnu dot org
  2003-10-30 21:54 ` pinskia at gcc dot gnu dot org
                   ` (6 subsequent siblings)
  10 siblings, 0 replies; 11+ messages in thread
From: pinskia at gcc dot gnu dot org @ 2003-08-16 14:23 UTC
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=2692



------- Additional Comments From pinskia at gcc dot gnu dot org  2003-08-16 14:23 -------
On powerpc-apple-darwin6.6, the combiner is where most of the work is done:
Execution times (seconds)
 garbage collection    :   1.12 ( 1%) usr   0.00 ( 0%) sys   2.16 ( 2%) wall
 cfg construction      :   0.18 ( 0%) usr   0.00 ( 0%) sys   0.57 ( 0%) wall
 cfg cleanup           :   0.03 ( 0%) usr   0.00 ( 0%) sys   0.04 ( 0%) wall
 trivially dead code   :   0.30 ( 0%) usr   0.00 ( 0%) sys   0.33 ( 0%) wall
 life analysis         :   9.04 ( 9%) usr   0.00 ( 0%) sys  10.26 ( 8%) wall
 life info update      :   0.28 ( 0%) usr   0.00 ( 0%) sys   0.32 ( 0%) wall
 alias analysis        :   0.42 ( 0%) usr   0.00 ( 0%) sys   0.45 ( 0%) wall
 register scan         :   0.25 ( 0%) usr   0.00 ( 0%) sys   0.35 ( 0%) wall
 rebuild jump labels   :   0.10 ( 0%) usr   0.00 ( 0%) sys   0.12 ( 0%) wall
 preprocessing         :   0.13 ( 0%) usr   0.00 ( 0%) sys   0.20 ( 0%) wall
 parser                :   0.54 ( 1%) usr   0.00 ( 0%) sys   0.54 ( 0%) wall
 name lookup           :   0.66 ( 1%) usr   0.00 ( 0%) sys   0.73 ( 1%) wall
 expand                :   1.28 ( 1%) usr   0.00 ( 0%) sys   2.42 ( 2%) wall
 integration           :   1.10 ( 1%) usr   0.00 ( 0%) sys   3.75 ( 3%) wall
 jump                  :   0.07 ( 0%) usr   0.00 ( 0%) sys   0.09 ( 0%) wall
 CSE                   :   5.04 ( 5%) usr   0.00 ( 0%) sys   6.21 ( 5%) wall
 loop analysis         :   0.01 ( 0%) usr   0.00 ( 0%) sys   0.02 ( 0%) wall
 branch prediction     :   0.08 ( 0%) usr   0.00 ( 0%) sys   0.23 ( 0%) wall
 flow analysis         :   0.05 ( 0%) usr   0.00 ( 0%) sys   0.04 ( 0%) wall
 combiner              :  63.88 (66%) usr   0.00 ( 0%) sys  82.54 (65%) wall
 if-conversion         :   0.01 ( 0%) usr   0.00 ( 0%) sys   0.02 ( 0%) wall
 local alloc           :   1.04 ( 1%) usr   0.00 ( 0%) sys   1.96 ( 2%) wall
 global alloc          :   5.34 ( 6%) usr   0.00 ( 0%) sys   6.33 ( 5%) wall
 reload CSE regs       :   3.57 ( 4%) usr   0.00 ( 0%) sys   4.07 ( 3%) wall
 flow 2                :   0.15 ( 0%) usr   0.00 ( 0%) sys   0.15 ( 0%) wall
 rename registers      :   0.24 ( 0%) usr   0.00 ( 0%) sys   0.28 ( 0%) wall
 shorten branches      :   0.11 ( 0%) usr   0.00 ( 0%) sys   0.11 ( 0%) wall
 final                 :   0.35 ( 0%) usr   0.00 ( 0%) sys   0.41 ( 0%) wall
 rest of compilation   :   0.71 ( 1%) usr   0.00 ( 0%) sys   1.76 ( 1%) wall
 TOTAL                 :  96.15             0.00           126.51


* [Bug optimization/2692] excessive compile time with optimization
       [not found] <20010429210601.2692.snyder@fnal.gov>
                   ` (3 preceding siblings ...)
  2003-08-16 14:23 ` pinskia at gcc dot gnu dot org
@ 2003-10-30 21:54 ` pinskia at gcc dot gnu dot org
  2003-11-23  8:24 ` pinskia at gcc dot gnu dot org
                   ` (5 subsequent siblings)
  10 siblings, 0 replies; 11+ messages in thread
From: pinskia at gcc dot gnu dot org @ 2003-10-30 21:54 UTC
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=2692



------- Additional Comments From pinskia at gcc dot gnu dot org  2003-10-30 21:22 -------
On the mainline (20031030), compiling this code with -O3 makes gcc ICE on
powerpc-apple-darwin in the webizer pass.


* [Bug optimization/2692] excessive compile time with optimization
       [not found] <20010429210601.2692.snyder@fnal.gov>
                   ` (4 preceding siblings ...)
  2003-10-30 21:54 ` pinskia at gcc dot gnu dot org
@ 2003-11-23  8:24 ` pinskia at gcc dot gnu dot org
  2004-01-15  1:16 ` rth at gcc dot gnu dot org
                   ` (4 subsequent siblings)
  10 siblings, 0 replies; 11+ messages in thread
From: pinskia at gcc dot gnu dot org @ 2003-11-23  8:24 UTC
  To: gcc-bugs


------- Additional Comments From pinskia at gcc dot gnu dot org  2003-11-23 08:24 -------
It is cool that -O3 and -O2 are faster than -O1 (unit-at-a-time causes this).
On the mainline, the time has migrated to the rename registers pass (for -O3 at least):
-O3 -fno-web:
Execution times (seconds)
 garbage collection    :   1.78 ( 2%) usr   0.01 ( 0%) sys   2.76 ( 2%) wall
 callgraph construction:   0.09 ( 0%) usr   0.02 ( 1%) sys   0.34 ( 0%) wall
 cfg construction      :   0.15 ( 0%) usr   0.00 ( 0%) sys   0.19 ( 0%) wall
 cfg cleanup           :   0.02 ( 0%) usr   0.00 ( 0%) sys   0.02 ( 0%) wall
 trivially dead code   :   0.34 ( 0%) usr   0.02 ( 1%) sys   0.70 ( 1%) wall
 life analysis         :   2.81 ( 3%) usr   0.00 ( 0%) sys   3.02 ( 3%) wall
 life info update      :   1.07 ( 1%) usr   0.02 ( 1%) sys   1.30 ( 1%) wall
 alias analysis        :   0.45 ( 0%) usr   0.05 ( 2%) sys   0.71 ( 1%) wall
 register scan         :   0.28 ( 0%) usr   0.00 ( 0%) sys   0.33 ( 0%) wall
 rebuild jump labels   :   0.10 ( 0%) usr   0.00 ( 0%) sys   0.13 ( 0%) wall
 preprocessing         :   0.05 ( 0%) usr   0.04 ( 2%) sys   0.15 ( 0%) wall
 parser                :   0.65 ( 1%) usr   0.20 ( 9%) sys   1.54 ( 1%) wall
 name lookup           :   0.16 ( 0%) usr   0.44 (19%) sys   0.65 ( 1%) wall
 expand                :   0.91 ( 1%) usr   0.12 ( 5%) sys   3.07 ( 3%) wall
 integration           :   1.11 ( 1%) usr   0.09 ( 4%) sys   1.55 ( 1%) wall
 jump                  :   0.07 ( 0%) usr   0.00 ( 0%) sys   0.09 ( 0%) wall
 CSE                   :   3.09 ( 3%) usr   0.07 ( 3%) sys   3.85 ( 3%) wall
 loop analysis         :   0.01 ( 0%) usr   0.00 ( 0%) sys   0.02 ( 0%) wall
 CSE 2                 :   0.92 ( 1%) usr   0.02 ( 1%) sys   1.02 ( 1%) wall
 branch prediction     :   0.04 ( 0%) usr   0.00 ( 0%) sys   0.06 ( 0%) wall
 flow analysis         :   0.02 ( 0%) usr   0.00 ( 0%) sys   0.03 ( 0%) wall
 combiner              :  15.43 (16%) usr   0.13 ( 6%) sys  18.74 (16%) wall
 if-conversion         :   0.01 ( 0%) usr   0.00 ( 0%) sys   0.00 ( 0%) wall
 regmove               :   0.40 ( 0%) usr   0.01 ( 0%) sys   0.41 ( 0%) wall
 scheduling            :   8.13 ( 8%) usr   0.46 (20%) sys   9.42 ( 8%) wall
 local alloc           :   1.21 ( 1%) usr   0.02 ( 1%) sys   1.30 ( 1%) wall
 global alloc          :   2.52 ( 3%) usr   0.08 ( 3%) sys   2.72 ( 2%) wall
 reload CSE regs       :   0.94 ( 1%) usr   0.00 ( 0%) sys   0.97 ( 1%) wall
 flow 2                :   0.05 ( 0%) usr   0.00 ( 0%) sys   0.05 ( 0%) wall
 rename registers      :  51.52 (53%) usr   0.07 ( 3%) sys  56.65 (48%) wall
 scheduling 2          :   1.63 ( 2%) usr   0.39 (17%) sys   4.27 ( 4%) wall
 shorten branches      :   0.09 ( 0%) usr   0.01 ( 0%) sys   0.12 ( 0%) wall
 final                 :   0.19 ( 0%) usr   0.02 ( 1%) sys   0.22 ( 0%) wall
 rest of compilation   :   0.45 ( 0%) usr   0.01 ( 0%) sys   0.50 ( 0%) wall
 TOTAL                 :  96.78             2.33           116.99

-O0:
Execution times (seconds)
 garbage collection    :   0.13 ( 3%) usr   0.00 ( 0%) sys   0.14 ( 2%) wall
 cfg construction      :   0.04 ( 1%) usr   0.00 ( 0%) sys   0.02 ( 0%) wall
 trivially dead code   :   0.03 ( 1%) usr   0.00 ( 0%) sys   0.07 ( 1%) wall
 life analysis         :   0.41 (10%) usr   0.00 ( 0%) sys   0.50 ( 8%) wall
 life info update      :   0.21 ( 5%) usr   0.00 ( 0%) sys   0.23 ( 4%) wall
 register scan         :   0.05 ( 1%) usr   0.00 ( 0%) sys   0.05 ( 1%) wall
 rebuild jump labels   :   0.01 ( 0%) usr   0.00 ( 0%) sys   0.04 ( 1%) wall
 preprocessing         :   0.04 ( 1%) usr   0.07 ( 9%) sys   0.11 ( 2%) wall
 parser                :   0.68 (16%) usr   0.18 (22%) sys   0.78 (12%) wall
 name lookup           :   0.19 ( 5%) usr   0.45 (56%) sys   0.78 (12%) wall
 expand                :   0.22 ( 5%) usr   0.00 ( 0%) sys   0.26 ( 4%) wall
 integration           :   0.08 ( 2%) usr   0.00 ( 0%) sys   0.09 ( 1%) wall
 jump                  :   0.00 ( 0%) usr   0.01 ( 1%) sys   0.00 ( 0%) wall
 flow analysis         :   0.04 ( 1%) usr   0.00 ( 0%) sys   0.02 ( 0%) wall
 local alloc           :   0.73 (18%) usr   0.02 ( 3%) sys   1.21 (19%) wall
 global alloc          :   0.76 (18%) usr   0.02 ( 3%) sys   1.11 (18%) wall
 flow 2                :   0.01 ( 0%) usr   0.00 ( 0%) sys   0.00 ( 0%) wall
 shorten branches      :   0.07 ( 2%) usr   0.00 ( 0%) sys   0.07 ( 1%) wall
 final                 :   0.17 ( 4%) usr   0.01 ( 1%) sys   0.35 ( 6%) wall
 rest of compilation   :   0.24 ( 6%) usr   0.00 ( 0%) sys   0.35 ( 6%) wall
 TOTAL                 :   4.15             0.80             6.25

-O1:
Execution times (seconds)
 garbage collection    :   1.23 ( 1%) usr   0.00 ( 0%) sys   1.28 ( 1%) wall
 cfg construction      :   0.13 ( 0%) usr   0.00 ( 0%) sys   0.13 ( 0%) wall
 cfg cleanup           :   0.01 ( 0%) usr   0.00 ( 0%) sys   0.02 ( 0%) wall
 trivially dead code   :   0.32 ( 0%) usr   0.01 ( 1%) sys   0.30 ( 0%) wall
 life analysis         :   3.45 ( 3%) usr   0.02 ( 1%) sys   4.09 ( 4%) wall
 life info update      :   1.02 ( 1%) usr   0.00 ( 0%) sys   1.05 ( 1%) wall
 alias analysis        :   0.41 ( 0%) usr   0.04 ( 3%) sys   0.47 ( 0%) wall
 register scan         :   0.25 ( 0%) usr   0.00 ( 0%) sys   0.28 ( 0%) wall
 rebuild jump labels   :   0.08 ( 0%) usr   0.00 ( 0%) sys   0.08 ( 0%) wall
 preprocessing         :   0.05 ( 0%) usr   0.07 ( 5%) sys   0.16 ( 0%) wall
 parser                :   0.61 ( 1%) usr   0.18 (13%) sys   0.75 ( 1%) wall
 name lookup           :   0.37 ( 0%) usr   0.34 (25%) sys   0.76 ( 1%) wall
 expand                :   0.66 ( 1%) usr   0.08 ( 6%) sys   0.74 ( 1%) wall
 integration           :   1.02 ( 1%) usr   0.02 ( 1%) sys   1.07 ( 1%) wall
 jump                  :   0.04 ( 0%) usr   0.01 ( 1%) sys   0.03 ( 0%) wall
 CSE                   :   3.27 ( 3%) usr   0.03 ( 2%) sys   3.39 ( 3%) wall
 loop analysis         :   0.01 ( 0%) usr   0.00 ( 0%) sys   0.02 ( 0%) wall
 branch prediction     :   0.03 ( 0%) usr   0.00 ( 0%) sys   0.03 ( 0%) wall
 flow analysis         :   0.04 ( 0%) usr   0.00 ( 0%) sys   0.04 ( 0%) wall
 combiner              :  81.10 (79%) usr   0.20 (15%) sys  83.87 (77%) wall
 if-conversion         :   0.00 ( 0%) usr   0.00 ( 0%) sys   0.02 ( 0%) wall
 local alloc           :   1.62 ( 2%) usr   0.05 ( 4%) sys   1.67 ( 2%) wall
 global alloc          :   3.68 ( 4%) usr   0.23 (17%) sys   4.02 ( 4%) wall
 reload CSE regs       :   1.37 ( 1%) usr   0.01 ( 1%) sys   1.94 ( 2%) wall
 flow 2                :   0.09 ( 0%) usr   0.00 ( 0%) sys   0.20 ( 0%) wall
 rename registers      :   0.47 ( 0%) usr   0.00 ( 0%) sys   0.60 ( 1%) wall
 shorten branches      :   0.17 ( 0%) usr   0.02 ( 1%) sys   0.23 ( 0%) wall
 final                 :   0.43 ( 0%) usr   0.01 ( 1%) sys   0.61 ( 1%) wall
 rest of compilation   :   0.75 ( 1%) usr   0.02 ( 1%) sys   0.80 ( 1%) wall
 TOTAL                 : 102.71             1.37           108.74

-O2:
Execution times (seconds)
 garbage collection    :   1.69 ( 4%) usr   0.00 ( 0%) sys   2.63 ( 4%) wall
 callgraph construction:   0.08 ( 0%) usr   0.02 ( 1%) sys   0.11 ( 0%) wall
 callgraph optimization:   0.01 ( 0%) usr   0.00 ( 0%) sys   0.00 ( 0%) wall
 cfg construction      :   0.13 ( 0%) usr   0.00 ( 0%) sys   0.13 ( 0%) wall
 cfg cleanup           :   0.02 ( 0%) usr   0.00 ( 0%) sys   0.03 ( 0%) wall
 trivially dead code   :   0.34 ( 1%) usr   0.01 ( 0%) sys   0.38 ( 1%) wall
 life analysis         :   2.88 ( 6%) usr   0.03 ( 1%) sys   3.48 ( 6%) wall
 life info update      :   0.97 ( 2%) usr   0.00 ( 0%) sys   1.09 ( 2%) wall
 alias analysis        :   0.44 ( 1%) usr   0.07 ( 3%) sys   0.75 ( 1%) wall
 register scan         :   0.24 ( 1%) usr   0.01 ( 0%) sys   0.25 ( 0%) wall
 rebuild jump labels   :   0.10 ( 0%) usr   0.00 ( 0%) sys   0.12 ( 0%) wall
 preprocessing         :   0.07 ( 0%) usr   0.08 ( 4%) sys   0.13 ( 0%) wall
 parser                :   0.55 ( 1%) usr   0.14 ( 6%) sys   0.82 ( 1%) wall
 name lookup           :   0.27 ( 1%) usr   0.43 (19%) sys   0.63 ( 1%) wall
 expand                :   0.73 ( 2%) usr   0.05 ( 2%) sys   0.81 ( 1%) wall
 integration           :   1.04 ( 2%) usr   0.14 ( 6%) sys   1.24 ( 2%) wall
 jump                  :   0.08 ( 0%) usr   0.02 ( 1%) sys   0.08 ( 0%) wall
 CSE                   :   3.13 ( 7%) usr   0.06 ( 3%) sys   4.16 ( 7%) wall
 loop analysis         :   0.01 ( 0%) usr   0.00 ( 0%) sys   0.03 ( 0%) wall
 CSE 2                 :   0.89 ( 2%) usr   0.02 ( 1%) sys   0.99 ( 2%) wall
 branch prediction     :   0.03 ( 0%) usr   0.01 ( 0%) sys   0.04 ( 0%) wall
 flow analysis         :   0.02 ( 0%) usr   0.00 ( 0%) sys   0.03 ( 0%) wall
 combiner              :  15.77 (35%) usr   0.11 ( 5%) sys  18.25 (31%) wall
 if-conversion         :   0.01 ( 0%) usr   0.00 ( 0%) sys   0.01 ( 0%) wall
 regmove               :   0.40 ( 1%) usr   0.00 ( 0%) sys   0.48 ( 1%) wall
 scheduling            :   8.15 (18%) usr   0.47 (21%) sys   9.75 (16%) wall
 local alloc           :   1.28 ( 3%) usr   0.02 ( 1%) sys   1.44 ( 2%) wall
 global alloc          :   2.46 ( 5%) usr   0.10 ( 4%) sys   3.05 ( 5%) wall
 reload CSE regs       :   0.96 ( 2%) usr   0.02 ( 1%) sys   1.17 ( 2%) wall
 flow 2                :   0.04 ( 0%) usr   0.00 ( 0%) sys   0.05 ( 0%) wall
 rename registers      :   0.21 ( 0%) usr   0.00 ( 0%) sys   0.55 ( 1%) wall
 scheduling 2          :   1.32 ( 3%) usr   0.41 (18%) sys   5.31 ( 9%) wall
 shorten branches      :   0.08 ( 0%) usr   0.00 ( 0%) sys   0.09 ( 0%) wall
 final                 :   0.22 ( 0%) usr   0.02 ( 1%) sys   0.89 ( 1%) wall
 rest of compilation   :   0.46 ( 1%) usr   0.00 ( 0%) sys   0.52 ( 1%) wall
 TOTAL                 :  45.16             2.24            59.57

-O1 -funit-at-a-time:

Execution times (seconds)
 garbage collection    :   1.18 ( 4%) usr   0.00 ( 0%) sys   1.22 ( 4%) wall
 callgraph construction:   0.10 ( 0%) usr   0.01 ( 1%) sys   0.10 ( 0%) wall
 cfg construction      :   0.11 ( 0%) usr   0.00 ( 0%) sys   0.12 ( 0%) wall
 cfg cleanup           :   0.02 ( 0%) usr   0.00 ( 0%) sys   0.02 ( 0%) wall
 trivially dead code   :   0.22 ( 1%) usr   0.00 ( 0%) sys   0.21 ( 1%) wall
 life analysis         :   2.70 ( 9%) usr   0.02 ( 2%) sys   2.75 ( 8%) wall
 life info update      :   0.54 ( 2%) usr   0.00 ( 0%) sys   0.56 ( 2%) wall
 alias analysis        :   0.27 ( 1%) usr   0.01 ( 1%) sys   0.31 ( 1%) wall
 register scan         :   0.17 ( 1%) usr   0.00 ( 0%) sys   0.17 ( 1%) wall
 rebuild jump labels   :   0.07 ( 0%) usr   0.00 ( 0%) sys   0.07 ( 0%) wall
 preprocessing         :   0.01 ( 0%) usr   0.12 (11%) sys   0.12 ( 0%) wall
 parser                :   0.51 ( 2%) usr   0.15 (14%) sys   0.81 ( 2%) wall
 name lookup           :   0.34 ( 1%) usr   0.38 (35%) sys   0.70 ( 2%) wall
 expand                :   0.71 ( 2%) usr   0.07 ( 6%) sys   0.78 ( 2%) wall
 integration           :   1.11 ( 4%) usr   0.08 ( 7%) sys   1.21 ( 4%) wall
 jump                  :   0.03 ( 0%) usr   0.00 ( 0%) sys   0.04 ( 0%) wall
 CSE                   :   2.00 ( 6%) usr   0.05 ( 5%) sys   2.09 ( 6%) wall
 loop analysis         :   0.00 ( 0%) usr   0.00 ( 0%) sys   0.02 ( 0%) wall
 branch prediction     :   0.04 ( 0%) usr   0.00 ( 0%) sys   0.04 ( 0%) wall
 flow analysis         :   0.02 ( 0%) usr   0.00 ( 0%) sys   0.03 ( 0%) wall
 combiner              :  17.04 (54%) usr   0.07 ( 6%) sys  17.57 (53%) wall
 local alloc           :   0.75 ( 2%) usr   0.03 ( 3%) sys   0.80 ( 2%) wall
 global alloc          :   1.80 ( 6%) usr   0.04 ( 4%) sys   1.89 ( 6%) wall
 reload CSE regs       :   0.47 ( 2%) usr   0.00 ( 0%) sys   0.49 ( 1%) wall
 flow 2                :   0.05 ( 0%) usr   0.00 ( 0%) sys   0.04 ( 0%) wall
 rename registers      :   0.23 ( 1%) usr   0.00 ( 0%) sys   0.23 ( 1%) wall
 shorten branches      :   0.08 ( 0%) usr   0.00 ( 0%) sys   0.08 ( 0%) wall
 final                 :   0.18 ( 1%) usr   0.02 ( 2%) sys   0.21 ( 1%) wall
 rest of compilation   :   0.47 ( 2%) usr   0.01 ( 1%) sys   0.47 ( 1%) wall
 TOTAL                 :  31.27             1.09            33.21

-O0 -funit-at-a-time:
Execution times (seconds)
 garbage collection    :   0.14 ( 3%) usr   0.00 ( 0%) sys   0.14 ( 2%) wall
 callgraph construction:   0.10 ( 2%) usr   0.00 ( 0%) sys   0.11 ( 2%) wall
 cfg construction      :   0.02 ( 0%) usr   0.00 ( 0%) sys   0.02 ( 0%) wall
 trivially dead code   :   0.03 ( 1%) usr   0.00 ( 0%) sys   0.02 ( 0%) wall
 life analysis         :   0.42 (10%) usr   0.01 ( 1%) sys   1.09 (15%) wall
 life info update      :   0.21 ( 5%) usr   0.00 ( 0%) sys   0.23 ( 3%) wall
 register scan         :   0.05 ( 1%) usr   0.01 ( 1%) sys   0.04 ( 1%) wall
 rebuild jump labels   :   0.02 ( 0%) usr   0.00 ( 0%) sys   0.04 ( 1%) wall
 preprocessing         :   0.07 ( 2%) usr   0.06 ( 9%) sys   0.09 ( 1%) wall
 parser                :   0.64 (15%) usr   0.15 (22%) sys   0.80 (11%) wall
 name lookup           :   0.25 ( 6%) usr   0.36 (54%) sys   0.71 (10%) wall
 expand                :   0.22 ( 5%) usr   0.01 ( 1%) sys   0.24 ( 3%) wall
 integration           :   0.07 ( 2%) usr   0.01 ( 1%) sys   0.07 ( 1%) wall
 jump                  :   0.00 ( 0%) usr   0.00 ( 0%) sys   0.02 ( 0%) wall
 flow analysis         :   0.02 ( 0%) usr   0.00 ( 0%) sys   0.02 ( 0%) wall
 local alloc           :   0.74 (17%) usr   0.01 ( 1%) sys   0.80 (11%) wall
 global alloc          :   0.74 (17%) usr   0.02 ( 3%) sys   1.25 (17%) wall
 flow 2                :   0.02 ( 0%) usr   0.00 ( 0%) sys   0.01 ( 0%) wall
 shorten branches      :   0.07 ( 2%) usr   0.00 ( 0%) sys   0.08 ( 1%) wall
 final                 :   0.20 ( 5%) usr   0.00 ( 0%) sys   1.01 (14%) wall
 rest of compilation   :   0.29 ( 7%) usr   0.00 ( 0%) sys   0.31 ( 4%) wall
 TOTAL                 :   4.37             0.67             7.16
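
To put numbers on the comparison between optimization levels, the TOTAL usr times from the reports above can be turned into ratios against -O1.  This is only a sketch: the dictionary holds the totals copied from this message, and the helper name speedup is illustrative.

```python
# TOTAL usr seconds copied from the reports in this message.
TOTAL_USR = {
    "-O0": 4.15,
    "-O0 -funit-at-a-time": 4.37,
    "-O1": 102.71,
    "-O1 -funit-at-a-time": 31.27,
    "-O2": 45.16,
    "-O3 -fno-web": 96.78,
}


def speedup(baseline, candidate, totals=TOTAL_USR):
    """How many times faster `candidate` compiles than `baseline`."""
    return totals[baseline] / totals[candidate]


if __name__ == "__main__":
    for flags in ("-O1 -funit-at-a-time", "-O2", "-O3 -fno-web"):
        print(f"-O1 vs {flags}: {speedup('-O1', flags):.2f}x")
```

With these totals, adding -funit-at-a-time to -O1 gives roughly a 3.3x reduction in usr time, and -O2 is about 2.3x faster than plain -O1.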


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=2692


* [Bug optimization/2692] excessive compile time with optimization
       [not found] <20010429210601.2692.snyder@fnal.gov>
                   ` (5 preceding siblings ...)
  2003-11-23  8:24 ` pinskia at gcc dot gnu dot org
@ 2004-01-15  1:16 ` rth at gcc dot gnu dot org
  2004-04-29  2:40 ` pinskia at gcc dot gnu dot org
                   ` (3 subsequent siblings)
  10 siblings, 0 replies; 11+ messages in thread
From: rth at gcc dot gnu dot org @ 2004-01-15  1:16 UTC
  To: gcc-bugs


------- Additional Comments From rth at gcc dot gnu dot org  2004-01-15 01:16 -------
Current state on mainline, at least for x86 at -O2, is that we spend lots
of time in flow doing dead store elimination,

 life analysis         :  60.94 (76%) usr   0.00 ( 0%) sys  61.02 (75%) wall
 TOTAL                 :  80.58             0.39            81.01

If I tweak flow.c to not do *any* store elimination at all, I can pull
the total down to ~75 seconds.  I don't see anything easy to do to even
bridge the gap between these two times at this late stage of 3.4.

On tree-ssa branch, we do significantly better.

 TOTAL                 :  18.08             0.51            18.57

This is with the original C++ test case.  If I crop the std::dcomplex parts
and use the _Complex support in C, then I get

 TOTAL                 :   5.85             0.17             6.01

Clearly there's work to do yet in unraveling the abstraction, but either
compilation time is acceptable, so I'm going to suspend this PR as fixed
pending merge to mainline.

-- 
           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|ASSIGNED                    |SUSPENDED
   Target Milestone|---                         |tree-ssa


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=2692


* [Bug optimization/2692] excessive compile time with optimization
       [not found] <20010429210601.2692.snyder@fnal.gov>
                   ` (6 preceding siblings ...)
  2004-01-15  1:16 ` rth at gcc dot gnu dot org
@ 2004-04-29  2:40 ` pinskia at gcc dot gnu dot org
  2004-04-29  3:27 ` giovannibajo at libero dot it
                   ` (2 subsequent siblings)
  10 siblings, 0 replies; 11+ messages in thread
From: pinskia at gcc dot gnu dot org @ 2004-04-29  2:40 UTC
  To: gcc-bugs


------- Additional Comments From pinskia at gcc dot gnu dot org  2004-04-29 01:05 -------
With my cast pass we cut the compile time of the current tree-ssa compiler in
half.  It also improves the generated code.

-- 
           What    |Removed                     |Added
----------------------------------------------------------------------------
   Last reconfirmed|2003-07-25 16:56:52         |2004-04-29 01:05:52
               date|                            |


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=2692


* [Bug optimization/2692] excessive compile time with optimization
       [not found] <20010429210601.2692.snyder@fnal.gov>
                   ` (7 preceding siblings ...)
  2004-04-29  2:40 ` pinskia at gcc dot gnu dot org
@ 2004-04-29  3:27 ` giovannibajo at libero dot it
  2004-04-29  6:11 ` pinskia at gcc dot gnu dot org
  2004-05-13 21:33 ` [Bug tree-optimization/2692] " pinskia at gcc dot gnu dot org
  10 siblings, 0 replies; 11+ messages in thread
From: giovannibajo at libero dot it @ 2004-04-29  3:27 UTC
  To: gcc-bugs


------- Additional Comments From giovannibajo at libero dot it  2004-04-29 01:34 -------
The C vs C++ difference can probably be tracked in a new, separate (cleaner) PR.

-- 
           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |giovannibajo at libero dot
                   |                            |it


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=2692


* [Bug optimization/2692] excessive compile time with optimization
       [not found] <20010429210601.2692.snyder@fnal.gov>
                   ` (8 preceding siblings ...)
  2004-04-29  3:27 ` giovannibajo at libero dot it
@ 2004-04-29  6:11 ` pinskia at gcc dot gnu dot org
  2004-05-13 21:33 ` [Bug tree-optimization/2692] " pinskia at gcc dot gnu dot org
  10 siblings, 0 replies; 11+ messages in thread
From: pinskia at gcc dot gnu dot org @ 2004-04-29  6:11 UTC
  To: gcc-bugs


------- Additional Comments From pinskia at gcc dot gnu dot org  2004-04-29 03:29 -------
I filed PR 15197, which should help with the code generation differences between C and C++.

-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=2692


* [Bug tree-optimization/2692] excessive compile time with optimization
       [not found] <20010429210601.2692.snyder@fnal.gov>
                   ` (9 preceding siblings ...)
  2004-04-29  6:11 ` pinskia at gcc dot gnu dot org
@ 2004-05-13 21:33 ` pinskia at gcc dot gnu dot org
  10 siblings, 0 replies; 11+ messages in thread
From: pinskia at gcc dot gnu dot org @ 2004-05-13 21:33 UTC
  To: gcc-bugs


------- Additional Comments From pinskia at gcc dot gnu dot org  2004-05-13 11:50 -------
Fixed for 3.5.0 by the merge of the tree-ssa branch.

-- 
           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|SUSPENDED                   |RESOLVED
          Component|rtl-optimization            |tree-optimization
         Resolution|                            |FIXED
   Target Milestone|tree-ssa                    |3.5.0


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=2692

