public inbox for gcc-prs@sourceware.org
help / color / mirror / Atom feed
* optimization/4313: Inordinately long time in CFG manipulation
@ 2001-09-12 22:06 lucier
0 siblings, 0 replies; 2+ messages in thread
From: lucier @ 2001-09-12 22:06 UTC (permalink / raw)
To: gcc-gnats
>Number: 4313
>Category: optimization
>Synopsis: Inordinately long time in CFG manipulation
>Confidential: no
>Severity: serious
>Priority: medium
>Responsible: unassigned
>State: open
>Class: sw-bug
>Submitter-Id: net
>Arrival-Date: Wed Sep 12 22:06:01 PDT 2001
>Closed-Date:
>Last-Modified:
>Originator: B. Lucier
>Release: gcc version 3.1 20010912 (experimental)
>Organization:
>Environment:
Red Hat Linux 7.1 on 350 MHz PII
Configured with: ../configure --enable-languages=c --enable-checking=no
Thread model: single
gcc version 3.1 20010912 (experimental)
built with
make bootstrap BOOT_CFLAGS='-O2 -g -pg' BOOT_LDFLAGS='-g -pg'
>Description:
Here are profiled timings for the following command applied to _num.i,
which may be at
http://www.math.purdue.edu/~lucier/_num.i.gz
(the server is down right now, I can't check):
[lucier@curie ~]$ time programs/gcc/gcc-3.1/objdir/gcc/cc1 -fno-math-errno -O1 -fschedule-insns2 -fomit-frame-pointer -D___SINGLE_HOST _num.i
__sgn __sgnf __sgnl atan2 atan2f atan2l __atan2l fmod fmodf fmodl sqrt sqrtf sqrtl __sqrtl fabs fabsf fabsl __fabsl atan atanf atanl __sgn1l floor floorf floorl ceil ceilf ceill ldexp log1p log1pf log1pl asinh asinhf asinhl acosh acoshf acoshl atanh atanhf atanhl hypot hypotf hypotl logb logbf logbl drem dremf dreml __finite ___H__20___num {GC 25002k -> 7637k} {GC 10676k -> 7660k} {GC 11856k -> 7923k} {GC 11786k -> 8417k} {GC 13063k -> 9614k} {GC 15281k -> 10232k} ___init_proc ____20___num
Execution times (seconds)
garbage collection : 3.52 ( 0%) usr 0.01 ( 0%) sys 3.67 ( 0%) wall
cfg construction : 643.65 (31%) usr 0.84 ( 0%) sys 663.76 (23%) wall
cfg cleanup : 127.71 ( 6%) usr 0.05 ( 0%) sys 136.16 ( 5%) wall
preprocessing : 2.43 ( 0%) usr 0.40 ( 0%) sys 2.90 ( 0%) wall
lexical analysis : 2.87 ( 0%) usr 0.71 ( 0%) sys 3.57 ( 0%) wall
parser : 10.21 ( 0%) usr 0.39 ( 0%) sys 11.01 ( 0%) wall
varconst : 0.41 ( 0%) usr 0.02 ( 0%) sys 0.43 ( 0%) wall
integration : 0.06 ( 0%) usr 0.00 ( 0%) sys 0.06 ( 0%) wall
jump : 3.99 ( 0%) usr 0.00 ( 0%) sys 4.04 ( 0%) wall
CSE : 6.49 ( 0%) usr 0.00 ( 0%) sys 6.64 ( 0%) wall
global CSE : 69.06 ( 3%) usr 1.58 ( 0%) sys 72.31 ( 3%) wall
loop analysis : 0.03 ( 0%) usr 0.00 ( 0%) sys 0.03 ( 0%) wall
flow analysis : 40.20 ( 2%) usr 0.35 ( 0%) sys 41.43 ( 1%) wall
combiner : 5.08 ( 0%) usr 0.00 ( 0%) sys 5.27 ( 0%) wall
if-conversion : 1.91 ( 0%) usr 0.07 ( 0%) sys 2.06 ( 0%) wall
local alloc : 1.80 ( 0%) usr 0.00 ( 0%) sys 1.82 ( 0%) wall
global alloc : 11.02 ( 1%) usr 0.06 ( 0%) sys 11.32 ( 0%) wall
reload CSE regs : 25.63 ( 1%) usr 0.00 ( 0%) sys 26.21 ( 1%) wall
flow 2 : 864.58 (41%) usr 677.04 (98%) sys1595.09 (55%) wall
if-conversion 2 : 1.58 ( 0%) usr 0.06 ( 0%) sys 1.66 ( 0%) wall
scheduling 2 : 13.27 ( 1%) usr 0.06 ( 0%) sys 13.52 ( 0%) wall
shorten branches : 0.67 ( 0%) usr 0.00 ( 0%) sys 0.69 ( 0%) wall
reg stack : 37.86 ( 2%) usr 7.50 ( 1%) sys 45.86 ( 2%) wall
final : 213.11 (10%) usr 0.18 ( 0%) sys 232.41 ( 8%) wall
rest of compilation : 3.34 ( 0%) usr 0.00 ( 0%) sys 3.36 ( 0%) wall
TOTAL :2090.54 689.32 2885.34
2090.670u 689.440s 48:05.77 96.3% 0+0k 0+0io 1094pf+0w
This file is compiled in a few minutes with gcc-3.0. Here
are reports of the top times with gprof:
Flat profile:
Each sample counts as 0.01 seconds.
% cumulative self self total
time seconds seconds calls ms/call ms/call name
29.24 199.93 199.93 9319742 0.02 0.02 cached_make_edge
26.59 381.78 181.85 7449858 0.02 0.02 remove_edge
6.21 424.22 42.44 72443984 0.00 0.00 bitmap_operation
4.68 456.23 32.01 13 2462.31 5891.02 calculate_global_regs_live
2.77 475.14 18.91 67665 0.28 0.71 try_crossjump_bb
2.52 492.36 17.22 9668 1.78 1.78 sbitmap_vector_alloc
2.24 507.68 15.32 htab_traverse
1.62 518.77 11.09 90441614 0.00 0.00 sbitmap_zero
...
-----------------------------------------------
0.00 0.25 10/9594 find_basic_blocks [14]
2.65 240.51 9584/9594 find_sub_basic_blocks [9]
[8] 35.6 2.65 240.76 9594 make_edges [8]
199.86 0.00 9316700/9319742 cached_make_edge [11]
9.74 11.05 9584/9614 sbitmap_vector_zero [26]
17.07 0.00 9584/9668 sbitmap_vector_alloc [27]
2.89 0.00 9277825/9277825 make_label_edge [69]
0.02 0.05 44343/95703 returnjump_p [286]
0.01 0.05 48208/51142 computed_jump_p [412]
0.01 0.00 75759/433825 next_nonnote_insn [541]
0.00 0.00 48208/13306482 find_reg_note [98]
-----------------------------------------------
0.00 2.77 109/9584 commit_edge_insertions [56]
0.06 240.51 9475/9584 split_all_insns [10]
[9] 35.6 0.06 243.28 9584 find_sub_basic_blocks [9]
2.65 240.51 9584/9594 make_edges [8]
0.00 0.07 96/434 split_block [205]
0.04 0.01 9584/19000 purge_dead_edges [370]
0.00 0.00 96/7449858 remove_edge [13]
0.00 0.00 9748/13306482 find_reg_note [98]
-----------------------------------------------
0.06 241.28 12/12 rest_of_compilation [7]
[10] 35.3 0.06 241.28 12 split_all_insns [10]
0.06 240.51 9475/9584 find_sub_basic_blocks [9]
0.06 0.64 252984/252984 split_insn [134]
0.01 0.00 1/34 compute_bb_for_insn [229]
0.00 0.00 12/90441614 sbitmap_zero [30]
0.00 0.00 12/565 sbitmap_alloc [1663]
0.00 0.00 1/9536 get_max_uid [1509]
-----------------------------------------------
0.07 0.00 3042/9319742 make_edge [411]
199.86 0.00 9316700/9319742 make_edges [8]
[11] 29.2 199.93 0.00 9319742 cached_make_edge [11]
-----------------------------------------------
0.28 54.49 3/10 free_basic_block_vars [20]
0.65 127.14 7/10 find_basic_blocks [14]
[12] 26.7 0.93 181.63 10 clear_edges [12]
181.63 0.00 7440953/7449858 remove_edge [13]
-----------------------------------------------
0.00 0.00 3/7449858 redirect_edge_succ_nodup [809]
0.00 0.00 18/7449858 purge_dead_edges [370]
0.00 0.00 36/7449858 try_redirect_by_replacing_jump [337]
0.00 0.00 96/7449858 find_sub_basic_blocks [9]
0.02 0.00 703/7449858 merge_blocks_nomove [325]
0.06 0.00 2620/7449858 try_crossjump_to_edge [24]
0.13 0.00 5429/7449858 flow_delete_block [41]
181.63 0.00 7440953/7449858 clear_edges [12]
[13] 26.6 181.85 0.00 7449858 remove_edge [13]
-----------------------------------------------
0.01 128.33 10/10 rest_of_compilation [7]
[14] 18.8 0.01 128.33 10 find_basic_blocks [14]
0.65 127.14 7/10 clear_edges [12]
0.00 0.25 10/9594 make_edges [8]
0.09 0.02 10/10 find_basic_blocks_1 [333]
0.08 0.00 10/34 compute_bb_for_insn [229]
0.06 0.00 10/10 count_basic_blocks [425]
0.03 0.01 10/14 tidy_fallthru_edges [437]
0.00 0.00 10/1239490 timevar_push [201]
0.00 0.00 10/254 varray_init [1743]
0.00 0.00 10/9536 get_max_uid [1509]
-----------------------------------------------
0.00 97.09 3/3 rest_of_compilation [7]
[15] 14.2 0.00 97.09 3 schedule_insns [15]
0.12 89.56 9485/9496 update_life_info [16]
0.00 2.83 9485/9485 schedule_region [72]
0.00 2.13 3/3 init_regions [82]
2.13 0.00 9485/18975 count_or_remove_death_notes [55]
0.04 0.24 3/3 sched_init [222]
0.02 0.00 3/34 compute_bb_for_insn [229]
0.01 0.00 3/13 allocate_reg_life_data [485]
0.00 0.00 9485/90441614 sbitmap_zero [30]
0.00 0.00 3/3 reposition_prologue_and_epilogue_notes [1365]
0.00 0.00 3/3796403 sbitmap_ones [242]
0.00 0.00 6/565 sbitmap_alloc [1663]
0.00 0.00 3/9536 get_max_uid [1509]
0.00 0.00 3/3 sched_finish [2063]
0.00 0.00 3/35859 get_insns [1454]
-----------------------------------------------
0.00 0.01 1/9496 if_convert [31]
0.00 0.03 3/9496 optimize_mode_switching [23]
0.00 0.07 7/9496 life_analysis [45]
0.12 89.56 9485/9496 schedule_insns [15]
[16] 13.1 0.12 89.67 9496 update_life_info [16]
32.01 44.57 13/13 calculate_global_regs_live [17]
0.00 9.83 3/22 cleanup_cfg [18]
0.18 3.04 84954/143280 propagate_block [48]
0.03 0.00 84954/229107 bitmap_copy [362]
0.00 0.01 9485/9485 verify_local_live_at_start [845]
0.00 0.00 9496/1465878 bitmap_clear [330]
0.00 0.00 3/18975 count_or_remove_death_notes [55]
0.00 0.00 9496/568566 bitmap_initialize [487]
-----------------------------------------------
32.01 44.57 13/13 update_life_info [16]
[17] 11.2 32.01 44.57 13 calculate_global_regs_live [17]
42.25 0.00 72111186/72443984 bitmap_operation [22]
0.12 2.06 57752/143280 propagate_block [48]
0.04 0.00 101235/229107 bitmap_copy [362]
0.04 0.00 203195/2166730 bitmap_set_bit [179]
0.00 0.03 57752/67237 bitmap_equal_p [501]
0.02 0.00 328090/1465878 bitmap_clear [330]
0.01 0.00 113217/568566 bitmap_initialize [487]
0.00 0.00 1/90441614 sbitmap_zero [30]
-----------------------------------------------
0.00 9.83 3/22 update_life_info [16]
0.01 62.24 19/22 rest_of_compilation [7]
[18] 10.5 0.01 72.07 22 cleanup_cfg [18]
0.09 64.62 22/22 try_optimize_cfg [19]
0.03 7.33 31/31 delete_unreachable_blocks [37]
0.00 0.00 22/1239490 timevar_pop [197]
0.00 0.00 22/1239490 timevar_push [201]
0.00 0.00 44/166659 free_EXPR_LIST_list [995]
-----------------------------------------------
0.09 64.62 22/22 cleanup_cfg [18]
[19] 9.5 0.09 64.62 22 try_optimize_cfg [19]
18.91 28.89 67665/67665 try_crossjump_bb [21]
9.23 0.11 208787/208787 try_forward_edges [33]
0.00 3.82 3146/5417 flow_delete_block [41]
0.02 2.75 208787/208787 try_simplify_condjump [73]
0.00 0.37 890/890 merge_blocks [192]
0.00 0.32 2015/10723 flow_delete_insn_chain [86]
0.02 0.12 74300/81241 redirect_edge_and_branch [278]
0.01 0.02 96070/27562210 forwarder_block_p [35]
0.02 0.01 86632/12775689 onlyjump_p [58]
0.00 0.00 2015/138502 reg_mentioned_p [346]
0.00 0.00 353/7724 redirect_edge_succ_nodup [809]
-----------------------------------------------
0.00 5.25 7/73 life_analysis [45]
0.00 6.75 9/73 if_convert [31]
0.00 42.76 57/73 rest_of_compilation [7]
[20] 8.0 0.00 54.77 73 free_basic_block_vars [20]
0.28 54.49 3/10 clear_edges [12]
-----------------------------------------------
18.91 28.89 67665/67665 try_optimize_cfg [19]
[21] 7.0 18.91 28.89 67665 try_crossjump_bb [21]
2.50 26.39 7261292/7261292 try_crossjump_to_edge [24]
-----------------------------------------------
0.00 0.00 9/72443984 find_if_case_1 [533]
0.00 0.00 14/72443984 find_if_case_2 [642]
0.00 0.00 420/72443984 dead_or_predicable [813]
0.01 0.00 18758/72443984 update_equiv_regs [208]
0.04 0.00 67237/72443984 bitmap_equal_p [501]
0.14 0.00 246360/72443984 finish_spills [181]
42.25 0.00 72111186/72443984 calculate_global_regs_live [17]
[22] 6.2 42.44 0.00 72443984 bitmap_operation [22]
-----------------------------------------------
0.88 29.45 3/3 rest_of_compilation [7]
[23] 4.4 0.88 29.45 3 optimize_mode_switching [23]
0.00 28.35 6/6 pre_edge_lcm [25]
0.25 0.57 1/5 commit_edge_insertions [56]
0.02 0.03 3/21 sbitmap_vector_ones [188]
0.01 0.03 36043/275175 get_attr_type [207]
0.02 0.02 36284/2348999 note_stores <cycle 7> [140]
0.00 0.03 3/9496 update_life_info [16]
0.01 0.01 12/9614 sbitmap_vector_zero [26]
0.02 0.00 23930/276241 reg_set_to_hard_reg_set [228]
0.02 0.00 12/9668 sbitmap_vector_alloc [27]
0.00 0.01 3541/62459 recog_memoized_1 [240]
0.01 0.00 3/13 allocate_reg_life_data [485]
0.01 0.00 27290/384090 asm_noperands [331]
0.00 0.00 14620/86781 gen_sequence [482]
0.00 0.00 14634/102819 end_sequence [728]
0.00 0.00 18762/3738844 sbitmap_not [265]
0.00 0.00 14535/14825 emit_insn_before [1120]
0.00 0.00 5/5 emit_i387_cw_initialization [1235]
0.00 0.00 10/47 assign_386_stack_local [1203]
0.00 0.00 14/123 insert_insn_on_edge [1175]
0.00 0.00 5/2348999 emit_move_insn <cycle 7> [460]
0.00 0.00 71/198 emit_block_insn_before [1383]
0.00 0.00 23420/23420 reg_dies [1471]
0.00 0.00 14620/102819 start_sequence [1426]
0.00 0.00 9385/9385 new_seginfo [1512]
0.00 0.00 9385/9385 add_seginfo [1511]
0.00 0.00 48/48 make_preds_opaque [1895]
0.00 0.00 6/6 free_edge_list [1987]
-----------------------------------------------
2.50 26.39 7261292/7261292 try_crossjump_bb [21]
[24] 4.2 2.50 26.39 7261292 try_crossjump_to_edge [24]
2.97 14.21 6289011/6289011 flow_find_cross_jump [28]
2.41 5.63 27025588/27562210 forwarder_block_p [35]
0.00 0.40 2549/10723 flow_delete_insn_chain [86]
0.35 0.02 7261156/7261156 outgoing_edges_match [195]
0.01 0.25 338/434 split_block [205]
0.06 0.00 2620/7449858 remove_edge [13]
0.01 0.05 2549/3021 make_single_succ_edge [387]
0.01 0.00 2549/2631 set_block_for_new_insns [789]
0.00 0.00 2549/9171 gen_jump [689]
0.00 0.00 2549/2596 emit_jump_insn_before [949]
0.00 0.00 2549/13306482 find_reg_note [98]
0.00 0.00 2549/433825 next_nonnote_insn [541]
0.00 0.00 2549/8290 block_label [1100]
0.00 0.00 71/176 emit_barrier_after [1166]
0.00 0.00 71/165738 gen_rtx_CONST_INT [686]
-----------------------------------------------
0.00 28.35 6/6 optimize_mode_switching [23]
[25] 4.1 0.00 28.35 6 pre_edge_lcm [25]
5.85 4.21 6/6 compute_laterin [32]
1.41 5.29 6/6 compute_antinout_edge [40]
0.88 4.41 6/6 compute_available [49]
1.50 1.51 6/6 compute_earliest [66]
1.16 0.68 6/6 compute_insert_delete [85]
1.35 0.00 6/6 create_edge_list [100]
0.10 0.00 54/9668 sbitmap_vector_alloc [27]
-----------------------------------------------
0.00 0.00 3/9614 init_dependency_caches [698]
0.01 0.01 12/9614 optimize_mode_switching [23]
0.02 0.02 15/9614 idoms_to_doms [248]
9.74 11.05 9584/9614 make_edges [8]
[26] 3.1 9.77 11.09 9614 sbitmap_vector_zero [26]
11.09 0.00 90422081/90441614 sbitmap_zero [30]
-----------------------------------------------
0.01 0.00 3/9668 flow_loops_find [63]
0.01 0.00 3/9668 init_dependency_caches [698]
0.01 0.00 6/9668 estimate_probability [34]
0.01 0.00 6/9668 if_convert [31]
0.02 0.00 12/9668 optimize_mode_switching [23]
0.10 0.00 54/9668 pre_edge_lcm [25]
17.07 0.00 9584/9668 make_edges [8]
[27] 2.5 17.22 0.00 9668 sbitmap_vector_alloc [27]
-----------------------------------------------
2.97 14.21 6289011/6289011 try_crossjump_to_edge [24]
[28] 2.5 2.97 14.21 6289011 flow_find_cross_jump [28]
7.32 0.00 5587015/5643903 rtx_renumbered_equal_p [36]
2.31 1.30 12578022/12775689 onlyjump_p [58]
0.74 2.36 11164966/11164997 find_reg_equal_equiv_note [65]
0.03 0.08 440578/508111 stack_regs_mentioned [316]
0.02 0.05 41435/95703 returnjump_p [286]
0.00 0.00 196/2223309 rtx_equal_p [239]
-----------------------------------------------
<spontaneous>
[29] 2.5 15.32 1.60 htab_traverse [29]
0.00 0.87 43785/43785 rtx_htab_mark_1 [116]
0.20 0.53 460865/460865 cselib_invalidate_mem_1 [127]
0.00 0.00 1170/1170 mark_hash_entry [1067]
0.00 0.00 3386/3386 discard_useless_locs [1565]
0.00 0.00 1631/1631 discard_useless_values [1598]
-----------------------------------------------
0.00 0.00 1/90441614 calculate_global_regs_live [17]
0.00 0.00 1/90441614 flow_depth_first_order_compute [160]
0.00 0.00 1/90441614 if_convert [31]
0.00 0.00 2/90441614 flow_loops_find [63]
0.00 0.00 3/90441614 combine_instructions [74]
0.00 0.00 6/90441614 mark_dfs_back_edges [80]
0.00 0.00 6/90441614 compute_antinout_edge [40]
0.00 0.00 6/90441614 compute_earliest [66]
0.00 0.00 6/90441614 compute_available [49]
0.00 0.00 12/90441614 split_all_insns [10]
0.00 0.00 519/90441614 flow_loop_nodes_find [1183]
0.00 0.00 9485/90441614 init_regions [82]
0.00 0.00 9485/90441614 schedule_insns [15]
11.09 0.00 90422081/90441614 sbitmap_vector_zero [26]
[30] 1.6 11.09 0.00 90441614 sbitmap_zero [30]
-----------------------------------------------
The enormous amount of system time in flow2 is from mmapping
and unmmapping a large bitmap, Daniel Berlin provided the
analysis previously.
I wanted to send this timing data now because of the large
changes being made to the cfg code, with the hope that
there can be some improvement.
Brad Lucier
>How-To-Repeat:
>Fix:
>Release-Note:
>Audit-Trail:
>Unformatted:
^ permalink raw reply [flat|nested] 2+ messages in thread
* Re: optimization/4313: Inordinately long time in CFG manipulation
@ 2003-02-19 15:25 gerald
0 siblings, 0 replies; 2+ messages in thread
From: gerald @ 2003-02-19 15:25 UTC (permalink / raw)
To: gcc-bugs, gcc-prs, lucier, nobody
Synopsis: Inordinately long time in CFG manipulation
State-Changed-From-To: open->closed
State-Changed-By: gerald
State-Changed-When: Wed Feb 19 15:25:41 2003
State-Changed-Why:
Fixed, according to the reporter.
http://gcc.gnu.org/cgi-bin/gnatsweb.pl?cmd=view%20audit-trail&database=gcc&pr=4313
^ permalink raw reply [flat|nested] 2+ messages in thread
end of thread, other threads:[~2003-02-19 15:25 UTC | newest]
Thread overview: 2+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2001-09-12 22:06 optimization/4313: Inordinately long time in CFG manipulation lucier
2003-02-19 15:25 gerald
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).