From: Steven Bosscher <stevenb.gcc@gmail.com>
To: GCC Mailing List <gcc@gcc.gnu.org>
Subject: Where does the time go?
Date: Thu, 20 May 2010 15:55:00 -0000
Message-ID: <AANLkTimYQASWjXGLX9W49v-1Sy3VsLIaybYWD2oTe808@mail.gmail.com>
Hello,
For some time now, I've wanted to see where compile time goes in a
typical GCC build, because nobody really seems to know what the
compiler spends its time on. The impressions that get published about
GCC usually indicate that there is at least a feeling that GCC is not
getting faster, and that parts of the compiler are unreasonably slow.
I was hoping to shed some light on which parts those could be.
What I've done is this:
* Build GCC 4.6.0 (trunk r159624) with --enable-checking=release and
with -O2 and install it
* Build GCC 4.6.0 (trunk r159624) again, with the installed compiler
and with "-O2 -g3 -ftime-report". The time reports (along with
everything else on stderr) are piped to an output file
* Extract, sum, and sort time consumed per timevar
Host was cfarm gcc14 (8 x 3GHz Xeon). Target was
x86_64-unknown-linux-gnu. "Build" means non-bootstrap.
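The "extract, sum, and sort" step could look roughly like the Python sketch below. It is not the script used for this mail. It assumes every -ftime-report line has the shape "name : 0.28 ( 1%) usr ..." and reads only the first number after the colon (the mail does not say which column was summed). The names sum_timevars and format_report are made up for illustration.

```python
import re
from collections import defaultdict

# Assumed shape of a -ftime-report line: " parser : 0.50 ( 8%) usr ..."
# Only the first number after the colon is read (an assumption).
TIMEVAR_RE = re.compile(r"^\s*([A-Za-z][^:]*?)\s*:\s*(\d+\.\d+)\b")


def sum_timevars(lines):
    """Sum the reported time per timevar over all lines of a build log."""
    totals = defaultdict(float)
    for line in lines:
        match = TIMEVAR_RE.match(line)
        if match is None:
            continue  # not a time-report line (warnings, headers, ...)
        # Spaces become underscores, as in the results table.
        name = match.group(1).replace(" ", "_")
        totals[name] += float(match.group(2))
    return dict(totals)


def format_report(totals):
    """Return 'name time percent' rows sorted by time, with TOTAL last."""
    grand_total = totals.get("TOTAL", 0.0)
    rows = sorted(
        (item for item in totals.items() if item[0] != "TOTAL"),
        key=lambda item: (item[1], item[0]),
    )
    rows.append(("TOTAL", grand_total))
    report = []
    for name, secs in rows:
        pct = 100.0 * secs / grand_total if grand_total else 0.0
        report.append(f"{name} {secs:.2f} {pct:g}%")
    return report
```

Feeding it the saved stderr file, for example format_report(sum_timevars(open(path))) with path being wherever the output was piped, yields rows in the same "name time percent" layout as the table in this mail.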
Results at the bottom of this mail.
Conclusions:
* There are quite a few timevars for parts of the compiler that have
been removed: TV_SEQABSTR, TV_GLOBAL_ALLOC, TV_LOCAL_ALLOC are the
ones I've spotted so far. I will go through the whole list, remove
all timevars that are unused, and submit a patch.
* The "slow" parts of the compiler are not exactly news: tree-PRE,
scheduling, register allocation
* Variable tracking costs ~7.6% of compile time. This is more than the
cost of register allocation (IRA+reload)
* The C front end (preprocessing+lexing+parsing) costs ~17%. For an
optimizing compiler with so many passes, this is quite a lot.
* The GIMPLE optimizers (done with egrep
"tree|dominator_opt|alias_stmt_walking|alias_analysis|inline_heuristics|PHI_merge")
together cost ~16%.
* Adding and subtracting the above numbers, the rest of the compiler,
which is mostly the RTL parts, still accounts for 100-17-16-8=59% of
the total compile time. This was the most surprising result for me.
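The grouping behind the percentages above can be sketched in Python as follows. The regular expression is the egrep pattern quoted in the conclusions. Everything else is an assumption for illustration: the group names, the functions classify and breakdown, and the choice to compute the "rest" as the remainder after the three named groups, mirroring the 100-17-16-8 arithmetic.

```python
import re

# Pattern copied from the mail. egrep matches anywhere in a line,
# so re.search is applied to the timevar name.
GIMPLE_RE = re.compile(
    r"tree|dominator_opt|alias_stmt_walking|alias_analysis"
    r"|inline_heuristics|PHI_merge"
)
FRONT_END = {"preprocessing", "lexical_analysis", "parser"}


def classify(name):
    """Return the group for a timevar, or None if it belongs to the rest."""
    if name in FRONT_END:
        return "front end"
    if name == "variable_tracking":
        return "variable tracking"
    if GIMPLE_RE.search(name):
        return "GIMPLE"
    return None


def breakdown(rows, total):
    """Return each group's share of `total` in percent."""
    seconds = {"front end": 0.0, "GIMPLE": 0.0, "variable tracking": 0.0}
    for name, secs in rows:
        group = classify(name)
        if group is not None:
            seconds[group] += secs
    shares = {group: 100.0 * secs / total for group, secs in seconds.items()}
    # Whatever is left over is attributed to the rest of the compiler.
    shares["rest"] = 100.0 - sum(shares.values())
    return shares
```

For example, passing just the preprocessing, lexical_analysis, and parser rows from the table with total=382.97 gives a front-end share of about 17.2%.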
Ciao!
Steven
timevar seconds percent_of_total
auto_inc_dec 0.00 0%
callgraph_verifier 0.00 0%
cfg_construction 0.00 0%
CFG_verifier 0.00 0%
delay_branch_sched 0.00 0%
df_live_byte_regs 0.00 0%
df_scan_insns 0.00 0%
df_uninitialized_regs_2 0.00 0%
dump_files 0.00 0%
global_alloc 0.00 0%
Graphite_code_generation 0.00 0%
Graphite_data_dep_analysis 0.00 0%
Graphite_loop_transforms 0.00 0%
ipa_free_lang_data 0.00 0%
ipa_lto_cgraph_IO 0.00 0%
ipa_lto_cgraph_merge 0.00 0%
ipa_lto_decl_init_IO 0.00 0%
ipa_lto_decl_IO 0.00 0%
ipa_lto_decl_merge 0.00 0%
ipa_lto_gimple_IO 0.00 0%
ipa_points_to 0.00 0%
ipa_profile 0.00 0%
ipa_type_escape 0.00 0%
life_analysis 0.00 0%
life_info_update 0.00 0%
load_CSE_after_reload 0.00 0%
local_alloc 0.00 0%
loop_doloop 0.00 0%
loop_unrolling 0.00 0%
loop_unswitching 0.00 0%
LSM 0.00 0%
lto 0.00 0%
name_lookup 0.00 0%
overload_resolution 0.00 0%
PCH_main_state_restore 0.00 0%
PCH_main_state_save 0.00 0%
PCH_pointer_reallocation 0.00 0%
PCH_pointer_sort 0.00 0%
PCH_preprocessor_state_restore 0.00 0%
PCH_preprocessor_state_save 0.00 0%
plugin_execution 0.00 0%
plugin_initialization 0.00 0%
predictive_commoning 0.00 0%
reg_stack 0.00 0%
rename_registers 0.00 0%
rest_of_compilation 0.00 0%
sequence_abstraction 0.00 0%
shorten_branches 0.00 0%
sms_modulo_scheduling 0.00 0%
template_instantiation 0.00 0%
total_time 0.00 0%
tracer 0.00 0%
tree_check_data_dependences 0.00 0%
tree_loop_distribution 0.00 0%
tree_loop_linear 0.00 0%
tree_loop_optimization 0.00 0%
tree_loop_unswitching 0.00 0%
tree_parallelize_loops 0.00 0%
tree_prefetching 0.00 0%
tree_redundant_PHIs 0.00 0%
tree_slp_vectorization 0.00 0%
tree_SSA_to_normal 0.00 0%
tree_SSA_verifier 0.00 0%
tree_STMT_verifier 0.00 0%
tree_STORE_CCP 0.00 0%
tree_store_copy_prop 0.00 0%
tree_vectorization 0.00 0%
value_profile_opts 0.00 0%
web 0.00 0%
whopr_ltrans 0.00 0%
whopr_wpa 0.00 0%
whopr_wpa_IO 0.00 0%
whopr_wpa_ltrans 0.00 0%
mode_switching 0.01 0.00261117%
tree_NRV_optimization 0.01 0.00261117%
tree_loop_fini 0.03 0.00783351%
tree_switch_initialization_conversion 0.03 0.00783351%
lower_subreg 0.04 0.0104447%
tree_buildin_call_DCE 0.05 0.0130559%
code_hoisting 0.06 0.015667%
ipa_reference 0.06 0.015667%
tree_canonical_iv 0.06 0.015667%
tree_if_combine 0.06 0.015667%
PHI_merge 0.07 0.0182782%
tree_phiprop 0.07 0.0182782%
uninit_var_anaysis 0.07 0.0182782%
control_dependences 0.08 0.0208894%
varconst 0.09 0.0235005%
tree_PHI_const_copy_prop 0.16 0.0417787%
tree_eh 0.19 0.0496122%
tree_split_crit_edges 0.19 0.0496122%
scev_constant_prop 0.20 0.0522234%
tree_PHI_insertion 0.20 0.0522234%
tree_copy_headers 0.23 0.0600569%
tree_loop_bounds 0.24 0.0626681%
tree_loop_invariant_motion 0.27 0.0705016%
variable_output 0.27 0.0705016%
combine_stack_adjustments 0.28 0.0731128%
garbage_collection 0.28 0.0731128%
loop_analysis 0.28 0.0731128%
tree_SSA_uncprop 0.28 0.0731128%
tree_SRA 0.30 0.0783351%
ipa_cp 0.34 0.0887798%
tree_linearize_phis 0.34 0.0887798%
tree_DSE 0.39 0.101836%
tree_find_ref._vars 0.39 0.101836%
varpool_construction 0.39 0.101836%
tree_rename_SSA_copies 0.47 0.122725%
complete_unrolling 0.50 0.130559%
tree_SSA_other 0.53 0.138392%
tree_loop_init 0.57 0.148837%
tree_CFG_construction 0.59 0.154059%
tree_code_sinking 0.60 0.15667%
zee 0.62 0.161893%
dominance_frontiers 0.65 0.169726%
loop_invariant_motion 0.66 0.172337%
register_scan 0.67 0.174948%
ipa_pure_const 0.71 0.185393%
tree_reassociation 0.72 0.188004%
callgraph_construction 0.73 0.190615%
if_conversion_2 0.74 0.193227%
tree_forward_propagate 0.77 0.20106%
ipa_SRA 0.91 0.237617%
peephole_2 0.95 0.248061%
tree_conservative_DCE 0.96 0.250672%
regmove 1.02 0.266339%
thread_pro_and_epilogue 1.28 0.33423%
tree_iv_optimization 1.31 0.342063%
tree_operand_scan 1.32 0.344675%
rebuild_jump_labels 1.33 0.347286%
jump 1.34 0.349897%
branch_prediction 1.35 0.352508%
machine_dep_reorg 1.36 0.355119%
inline_heuristics 1.42 0.370786%
df_multiple_defs 1.50 0.391676%
dead_code_elimination 1.74 0.454344%
tree_SSA_rewrite 1.80 0.470011%
df_use_def_def_use_chains 1.86 0.485678%
trivially_dead_code 2.00 0.522234%
reorder_blocks 2.07 0.540512%
hard_reg_cprop 2.10 0.548346%
alias_stmt_walking 2.15 0.561402%
tree_copy_propagation 2.19 0.571846%
register_information 2.27 0.592736%
tree_aggressive_DCE 2.29 0.597958%
dead_store_elim1 2.38 0.621459%
dead_store_elim2 2.40 0.626681%
integration 2.73 0.71285%
if_conversion 2.80 0.731128%
tree_CCP 2.89 0.754628%
tree_gimplify 3.03 0.791185%
callgraph_optimization 3.22 0.840797%
forward_prop 3.25 0.84863%
alias_analysis 3.41 0.890409%
df_reaching_defs 3.44 0.898243%
dominator_optimization 3.49 0.911299%
tree_SSA_incremental 3.50 0.91391%
tree_FRE 3.90 1.01836%
CSE_2 4.71 1.22986%
tree_PTA 4.80 1.25336%
CPROP 4.98 1.30036%
reload_CSE_regs 5.23 1.36564%
final 5.26 1.37348%
dominance_computation 5.44 1.42048%
df_reg_dead_unused_notes 5.62 1.46748%
tree_CFG_cleanup 5.69 1.48576%
cfg_cleanup 6.28 1.63982%
PRE 6.64 1.73382%
lexical_analysis 6.65 1.73643%
CSE 8.16 2.13072%
tree_VRP 8.36 2.18294%
symout 8.94 2.33439%
combiner 10.17 2.65556%
tree_PRE 11.42 2.98196%
scheduling_2 11.44 2.98718%
reload 11.70 3.05507%
df_live_initialized_regs 12.92 3.37363%
integrated_RA 16.31 4.25882%
df_live_regs 17.52 4.57477%
expand 24.18 6.31381%
preprocessing 27.59 7.20422%
variable_tracking 29.17 7.61678%
parser 31.53 8.23302%
TOTAL 382.97 100%
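As a starting point for the timevar cleanup mentioned in the first conclusion, the rows with zero time can be pulled out of the table with a few lines of Python. This is only a sketch under assumptions: parse_rows and zero_time_timevars are hypothetical helper names, and the input is assumed to be "name time percent" rows as printed above. A zero only shows that a timevar was not exercised in this build (LTO, Graphite, PCH, and the C++ timevars, for instance), so each candidate still has to be checked against timevar.def and its users by hand.

```python
def parse_rows(text):
    """Parse 'name time percent' rows into (name, seconds) pairs."""
    rows = []
    for line in text.splitlines():
        parts = line.split()
        if len(parts) != 3:
            continue  # not a table row
        name, secs, _percent = parts
        try:
            rows.append((name, float(secs)))
        except ValueError:
            continue  # e.g. a header row
    return rows


def zero_time_timevars(rows):
    """Return the names of all timevars whose summed time is zero."""
    return sorted(name for name, secs in rows if secs == 0.0)
```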