public inbox for gcc-prs@sourceware.org
* Re: optimization/4121: split_all_insns performance bottleneck
From: gerald @ 2003-02-19 15:26 UTC
  To: gcc-bugs, gcc-prs, lucier, nobody

Synopsis: split_all_insns performance bottleneck

State-Changed-From-To: open->closed
State-Changed-By: gerald
State-Changed-When: Wed Feb 19 15:26:12 2003
State-Changed-Why:
    Fixed, according to the reporter.

http://gcc.gnu.org/cgi-bin/gnatsweb.pl?cmd=view%20audit-trail&database=gcc&pr=4121


* optimization/4121: split_all_insns performance bottleneck
From: lucier @ 2001-08-24 17:06 UTC
  To: gcc-gnats

>Number:         4121
>Category:       optimization
>Synopsis:       split_all_insns performance bottleneck
>Confidential:   no
>Severity:       serious
>Priority:       medium
>Responsible:    unassigned
>State:          open
>Class:          sw-bug
>Submitter-Id:   net
>Arrival-Date:   Fri Aug 24 17:06:01 PDT 2001
>Closed-Date:
>Last-Modified:
>Originator:     B. Lucier
>Release:        3.1 20010824 (experimental)
>Organization:
>Environment:
sparc-sun-solaris28
>Description:
The following PR can probably be closed now:

http://gcc.gnu.org/ml/gcc-prs/2001-08/msg00184.html

The current problem when compiling

http://www.math.purdue.edu/~lucier/all.i.gz

seems to be that split_all_insns calls find_sub_basic_blocks each time
the CFG is altered.  Here is the timing and profiling data with

banach-109% /pkgs/gcc-2.96/bin/gcc -v
Reading specs from /pkgs/gcc-2.96/lib/gcc-lib/sparc-sun-solaris2.8/3.1/specs
Configured with: ../configure --prefix=/pkgs/gcc-2.96 --enable-checking=no --enable-languages=c
Thread model: posix
gcc version 3.1 20010824 (experimental)

and the calling options

/pkgs/gcc-2.96/lib/gcc-lib/sparc-sun-solaris2.8/3.1//cc1 -fPIC -O1 -fschedule-insns2 -fno-math-errno -fno-strict-aliasing -mcpu=supersparc -mtune=ultrasparc -Wall -W -Wno-unused all.i

 ___H__20_all {GC 72513k -> 24052k} {GC 32111k -> 25325k} {GC 33960k -> 25286k} {GC 40663k -> 24398k} {GC 37453k -> 27411k} {GC 50777k -> 30006k} ___init_proc ____20_all
Execution times (seconds)
 garbage collection    :   6.33 ( 0%) usr   0.00 ( 0%) sys   6.44 ( 0%) wall
 cfg construction      : 184.32 ( 6%) usr  14.80 (49%) sys 199.06 ( 6%) wall
 cfg cleanup           : 529.64 (16%) usr   0.00 ( 0%) sys 529.69 (16%) wall
 preprocessing         :   2.26 ( 0%) usr   3.03 (10%) sys   4.88 ( 0%) wall
 lexical analysis      :   2.28 ( 0%) usr   6.28 (21%) sys   8.94 ( 0%) wall
 parser                :  16.05 ( 0%) usr   4.14 (14%) sys  20.25 ( 1%) wall
 varconst              :   0.74 ( 0%) usr   0.01 ( 0%) sys   0.62 ( 0%) wall
 jump                  :  13.81 ( 0%) usr   0.03 ( 0%) sys  13.88 ( 0%) wall
 CSE                   :  12.64 ( 0%) usr   0.00 ( 0%) sys  12.75 ( 0%) wall
 loop analysis         :   0.06 ( 0%) usr   0.00 ( 0%) sys   0.06 ( 0%) wall
 flow analysis         : 180.06 ( 5%) usr   1.16 ( 4%) sys 181.25 ( 5%) wall
 combiner              :  13.89 ( 0%) usr   0.01 ( 0%) sys  13.94 ( 0%) wall
 if-conversion         :   9.96 ( 0%) usr   0.01 ( 0%) sys  10.00 ( 0%) wall
 local alloc           :   6.35 ( 0%) usr   0.00 ( 0%) sys   6.38 ( 0%) wall
 global alloc          :  23.79 ( 1%) usr   0.62 ( 2%) sys  24.44 ( 1%) wall
 reload CSE regs       :  87.27 ( 3%) usr   0.00 ( 0%) sys  87.25 ( 3%) wall
 flow 2                :2150.64 (65%) usr   0.00 ( 0%) sys2150.69 (64%) wall
 if-conversion 2       :  10.33 ( 0%) usr   0.00 ( 0%) sys  10.31 ( 0%) wall
 scheduling 2          :  41.51 ( 1%) usr   0.00 ( 0%) sys  41.50 ( 1%) wall
 delay branch sched    :   6.23 ( 0%) usr   0.00 ( 0%) sys   6.25 ( 0%) wall
 shorten branches      :   0.72 ( 0%) usr   0.00 ( 0%) sys   0.75 ( 0%) wall
 final                 :  16.81 ( 1%) usr   0.01 ( 0%) sys  16.75 ( 0%) wall
 symout                :   0.00 ( 0%) usr   0.00 ( 0%) sys   0.06 ( 0%) wall
 rest of compilation   :  12.87 ( 0%) usr   0.00 ( 0%) sys  12.88 ( 0%) wall
 TOTAL                 :3328.58            30.15          3359.19
3328.68u 30.31s 56:00.70 99.9%

Flat profile:

Each sample counts as 0.01 seconds.
  %   cumulative   self              self     total           
 time   seconds   seconds    calls  ms/call  ms/call  name    
 43.06    370.73   370.73 69535200     0.01     0.01  make_edge
 12.23    476.05   105.32                             internal_mcount
  8.84    552.15    76.10                             htab_traverse
  3.80    584.88    32.73       29  1128.62  1128.62  mark_critical_edges
  3.57    615.66    30.78     2474    12.44    12.44  propagate_freq
  2.52    637.35    21.69       15  1446.00  1449.33  calc_idoms
  2.51    658.99    21.64   258060     0.08     0.08  try_forward_edges
  2.49    680.42    21.43 120405144     0.00     0.00  bitmap_operation
  2.12    698.68    18.26       15  1217.33  1217.33  calc_dfs_tree_nonrec
  2.08    716.57    17.89        8  2236.25  4988.53  calculate_global_regs_live
  1.93    733.17    16.60       25   664.00   664.00  find_unreachable_blocks
  1.88    749.38    16.21        3  5403.33  9329.03  flow_loops_find
  1.29    760.50    11.12 69475595     0.00     0.00  make_label_edge
  1.12    770.15     9.65        5  1930.00  1930.00  mark_dfs_back_edges
  0.89    777.83     7.68        3  2560.00 15154.13  estimate_bb_frequencies
...
-----------------------------------------------
                0.00    0.28      10/13720       find_basic_blocks [34]
                3.87  386.48   13710/13720       find_sub_basic_blocks [10]
[8]     45.4    3.87  386.76   13720         make_edges [8]
              370.69    0.00 69527073/69535200     make_edge [11]
               11.12    0.00 69475595/69475595     make_label_edge [36]
                3.24    0.00   13712/13730       sbitmap_vector_alloc [53]
                0.05    1.56   13712/13730       sbitmap_vector_zero [71]
                0.03    0.04   57675/877995      for_each_rtx <cycle 10> [77]
                0.01    0.02   70795/76239       computed_jump_p [470]
                0.01    0.00  107911/923704      next_nonnote_insn [400]
                0.00    0.00   70795/66160938     find_reg_note [165]
                0.00    0.00   57675/295634      returnjump_p [1163]
-----------------------------------------------
                0.00  390.55       9/9           rest_of_compilation [7]
[9]     45.4    0.00  390.55       9         split_all_insns [9]
                0.03  390.27   13707/13710       find_sub_basic_blocks [10]
                0.03    0.20  387759/508164      split_insn [174]
                0.01    0.00       1/34          compute_bb_for_insn [172]
                0.00    0.00       9/188062848     sbitmap_zero [72]
                0.00    0.00       9/2519        sbitmap_alloc [1383]
                0.00    0.00       1/13762       get_max_uid [1303]
-----------------------------------------------
                0.00    0.09       3/13710       commit_edge_insertions [275]
                0.03  390.27   13707/13710       split_all_insns [9]
[10]    45.4    0.03  390.36   13710         find_sub_basic_blocks [10]
                3.87  386.48   13710/13720       make_edges [8]
                0.01    0.01   13710/27426       purge_dead_edges [458]
                0.00    0.00   13156/66160938     find_reg_note [165]
-----------------------------------------------
                0.04    0.00    8127/69535200     try_crossjump_to_edge [59]
              370.69    0.00 69527073/69535200     make_edges [8]
[11]    43.1  370.73    0.00 69535200         make_edge [11]
-----------------------------------------------
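
To illustrate why this call pattern is expensive: in the call graph above, make_edges is entered 13720 times and issues 69527073 make_edge calls, which is roughly 5,000 per invocation, and nearly all of those invocations come from find_sub_basic_blocks under split_all_insns. The following is a toy model only, not GCC code and not a proposed patch. It assumes (hypothetically) that each CFG-altering split adds one basic block and that rebuilding edges costs one make_edge call per block; the function names are made up for the illustration.

```c
/* Toy model, NOT GCC code: the "CFG" is just a block count, and
   rebuilding edges is assumed to cost one make_edge call per block.  */
static unsigned long make_edge_calls;

/* Stand-in for a full edge rebuild over n_blocks basic blocks.  */
static void rebuild_edges(unsigned long n_blocks)
{
    unsigned long i;

    for (i = 0; i < n_blocks; i++)
        make_edge_calls++;
}

/* Pattern seen in the profile: rebuild after every split that
   alters the CFG.  Assumes each such split adds one block.  */
unsigned long split_rebuild_each_time(unsigned long n_blocks,
                                      unsigned long n_splits)
{
    unsigned long s;

    make_edge_calls = 0;
    for (s = 0; s < n_splits; s++) {
        n_blocks++;               /* the split creates a new block */
        rebuild_edges(n_blocks);  /* whole rebuild for one change  */
    }
    return make_edge_calls;
}

/* Hypothetical alternative: do all the splits first, then rebuild
   the edges a single time.  */
unsigned long split_rebuild_once(unsigned long n_blocks,
                                 unsigned long n_splits)
{
    make_edge_calls = 0;
    n_blocks += n_splits;
    rebuild_edges(n_blocks);
    return make_edge_calls;
}
```

With 10 initial blocks and 4 splits, split_rebuild_each_time(10, 4) counts 11 + 12 + 13 + 14 = 50 toy make_edge calls, while split_rebuild_once(10, 4) counts 14. In this model the first pattern grows with the product of splits and blocks, and the second only with their sum.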

>How-To-Repeat:

>Fix:

>Release-Note:
>Audit-Trail:
>Unformatted:

