* Re: [google] Port LIPO support to google/main (issue4431065)
[not found] ` <BANLkTimFO5r7qW891YT2jXfWUdTd+whX0w@mail.gmail.com>
@ 2011-04-26 2:15 ` Rong Xu
0 siblings, 0 replies; only message in thread
From: Rong Xu @ 2011-04-26 2:15 UTC (permalink / raw)
To: Xinliang David Li; +Cc: reply, gcc-patches
LGTM
On Mon, Apr 25, 2011 at 4:32 PM, Xinliang David Li <davidxl@google.com> wrote:
> Bootstrapped on x86_64/linux with all regression tests passing.
>
> Additional tests: SPEC2k, SPEC06 FDO and LIPO testing.
>
> Note -- there will be work in move LIPO away from FE based solution,
> this patch is needed to make the base for forward progress.
>
> David
>
> On Mon, Apr 25, 2011 at 4:29 PM, David Li <davidxl@google.com> wrote:
>> 2011-04-25 David Li <davidxl@google.com>
>>
>> * cgraphbuild.c (rebuild_cgraph_edges): Create edges based
>> using resolved nodes in LIPO mode after ipa_tree_profiling.
>> * dyn-ipa.c: New file.
>> * dbgcnt.def: New debug counters.
>> * c-family/c-opts.c (c_common_post_options): New option.
>> (c_common_parse_file): LIPO mode driver change.
>> (push_command_line_include): New function.
>> * cgraph.c (cgraph_mark_reachable_node): Handle aux module nodes.
>> (cgraph_clone_node): New field initialization.
>> (cgraph_create_virtual_clone): New field initialization.
>> * value-prof.c (dump_histogram_value): New hist kind.
>> (check_counter): New function.
>> (gimple_value_profile_transformations): New ic kind.
>> (gimple_ic_transform): Changed interface.
>> (gimple_indirect_call_to_profile): New vpt kind.
>> (gimple_find_values_to_profile): New vpt kind.
>> * cgraphunit.c (cgraph_finalize_function): Handle LIPO mode.
>> (verify_edge_count_and_frequency): Ditto.
>> (verify_cgraph_node): Ditto.
>> (cgraph_finalize_compilation_unit): Ditto.
>> (cgraph_mark_functions_to_output): Ditto.
>> (cgraph_expand_function): Ditto.
>> (cgraph_output_in_order): Ditto.
>> (cgraph_optimize): Ditto.
>> (cgraph_copy_node_for_versioning): Ditto.
>> (cgraph_materialize_clone): Ditto.
>> (cgraph_redirect_edge_call_stmt_to_callee): Ditto.
>> * cp/decl.c (walk_namespaces): Ditto.
>> (duplicate_decls): Ditto.
>> (make_rtl_for_nonlocal_decl): Ditto.
>> * cp/rtti.c (get_tinfo_decl): Ditto.
>> (create_pseudo_type_info): Ditto.
>> (create_tinfo_types): Ditto.
>> * cp/cp-lang.c (cp_classify_record): Ditto.
>> * cp/pt.c (instantiate_decl): Ditto.
>> * cp/semantics.c (emit_associated_thunks): Ditto.
>> * cp/decl2.c (start_static_storage_duration_function): Ditto.
>> (prune_vars_needing_no_initialization): Ditto.
>> (cxx_callgraph_analyze_expr): Ditto.
>> (no_linkage_error): Ditto.
>> (collect_all_refs): Ditto.
>> (cp_write_global_declarations): Ditto.
>> * cp/parser.c (pragma_lex): Ditto.
>> * cp/cp-objcp-common.c (cp_function_decl_explicit_p): Ditto.
>> * cp/mangle.c (mangle_conv_op_name_for_type): Ditto.
>> * cp/name-lookup.c (add_decl_to_level): Ditto.
>> * gcov-io.c (gcov_read_summary): Ditto.
>> * tree-ssa-alias.c (same_type_for_tbaa): Ditto.
>> (refs_may_alias_p_1): Ditto.
>> (walk_aliased_vdefs): Ditto.
>> * ipa-inline.c (cgraph_mark_inline_edge): Ditto.
>> (cgraph_recursive_inlining_p): Ditto.
>> (cgraph_edge_badness): Ditto.
>> (update_caller_keys): Ditto.
>> (cgraph_decide_recursive_inlining): Ditto.
>> (cgraph_decide_inlining_of_small_function): Ditto.
>> (cgraph_decide_inlining): Ditto.
>> (cgraph_decide_inlining_incrementally): Ditto.
>> (estimate_function_body_sizes): Ditto.
>> * dwarf2out.c (dwarf2out_do_cfi_startproc): Ditto.
>> (dwarf2out_begin_prologue): Ditto.
>> (dwarf2out_end_epilogue): Ditto.
>> (dw_loc_list): Ditto.
>> (gen_subprogram_die): Ditto.
>> (dwarf2out_source_line): Ditto.
>> * c-decl.c (bind): Ditto.
>> (pop_scope): Ditto. Ditto.
>> (push_file_scope): Ditto.
>> (pop_file_scope): Ditto.
>> (finish_decl): Ditto.
>> (c_write_global_declarations): Ditto.
>> * langhooks.c (lhd_do_nothing_f): Ditto.
>> (lhd_do_nothing_t_t_return_null_tree): Ditto.
>> (lhd_return_null_tree_v): Ditto.
>> (lhd_builtin_function): Ditto.
>> * function.c (pop_cfun): Ditto.
>> (allocate_struct_function): Ditto.
>> * profile.c (instrument_values): Ditto.
>> (is_edge_inconsistent): Ditto.
>> * ipa.c (process_references): Ditto.
>> (cgraph_remove_unreachable_nodes): Ditto.
>> (function_and_variable_visibility):
>> * c-typeck.c (tagged_types_tu_compatible_p): Ditto.
>> * gimplify.c (gimplify_addr_expr): Ditto.
>> * except.c (sjlj_emit_function_enter): Ditto.
>> (output_one_function_exception_table): Ditto.
>> * coverage.c (get_gcov_type): Ditto.
>> (htab_counts_entry_del): Ditto.
>> (read_counts_file): Ditto.
>> (get_coverage_counts): Ditto.
>> (coverage_checksum_string): Ditto.
>> (coverage_begin_output): Ditto.
>> (coverage_end_function): Ditto.
>> (build_fn_info_type): Ditto.
>> (build_fn_info_value): Ditto.
>> (build_ctr_info_value): Ditto.
>> (build_gcov_info): Ditto.
>> (create_coverage): Ditto.
>> (coverage_finish): Ditto.
>> * tree-sra.c (convert_callers): Ditto.
>> (modify_function): Ditto.
>> (ipa_early_sra): Ditto.
>> * varasm.c (notice_global_symbol): Ditto.
>> (assemble_external): Ditto.
>> (mark_decl_referenced): Ditto.
>> (find_decl_and_mark_needed): Ditto.
>> (finish_aliases_1): Ditto.
>> (assemble_alias): Ditto.
>> * tree-ssa.c (useless_type_conversion_p): Ditto.
>> * tree-inline.c (copy_bb): Ditto.
>> (copy_edges_for_bb): Ditto.
>> (initialize_cfun): Ditto.
>> (copy_cfg_body): Ditto.
>> (tree_inlinable_function_p): Ditto.
>> (optimize_inline_calls): Ditto.
>> (copy_tree_r): Ditto.
>> * tree-profile.c (init_ic_make_global_vars): Ditto.
>> (gimple_init_edge_profiler): Ditto.
>> (gimple_gen_ic_profiler): Ditto.
>> (gimple_gen_ic_func_profiler): Ditto.
>> (tree_profiling): Ditto.
>> * opts-global.c (lang_handle_option): Ditto.
>> (add_input_filename): Ditto.
>> (read_cmdline_options): Ditto.
>> * l-ipo.c: Ditto.
>> * libgcov.c (gcov_version): Ditto.
>> (gcov_exit): Ditto.
>> (__gcov_init): Ditto.
>> (__gcov_flush): Ditto.
>> (__gcov_merge_ior): Ditto.
>> (__gcov_one_value_profiler_body): Ditto.
>> (__gcov_indirect_call_profiler): Ditto.
>> * l-ipo.h: Ditto.
>> * tree-cfg.c: (verify_types_in_gimple_reference): Ditto.
>> (verify_gimple_call): Ditto.
>> (verify_gimple_assign_single): Ditto.
>> (verify_gimple_return): Ditto.
>> (gimple_verify_flow_info): Ditto.
>> * passes.c: (rest_of_decl_compilation): Ditto.
>> (init_optimization_passes): Ditto.
>> (pass_init_dump_file): Ditto.
>> * varpool.c (varpool_node): Ditto.
>> (debug_varpool): Ditto.
>> (varpool_node_for_asm): Ditto.
>> (varpool_enqueue_needed_node): Ditto.
>> * stmt.c (expand_asm_operands): Ditto.
>> * gcov-dump.c (main): Ditto.
>> (print_usage): Ditto.
>> (print_prefix): Ditto.
>> (dump_file): Ditto.
>> (tag_summary): Ditto.
>>
>> Index: doc/invoke.texi
>> ===================================================================
>> --- doc/invoke.texi (revision 172880)
>> +++ doc/invoke.texi (working copy)
>> @@ -258,7 +258,7 @@ Objective-C and Objective-C++ Dialects}.
>> -Wparentheses -Wpedantic-ms-format -Wno-pedantic-ms-format @gol
>> -Wpointer-arith -Wno-pointer-to-int-cast @gol
>> -Wredundant-decls @gol
>> --Wreturn-type -Wsequence-point -Wshadow @gol
>> +-Wreturn-type -Wripa-opt-mismatch -Wsequence-point -Wshadow @gol
>> -Wsign-compare -Wsign-conversion -Wstack-protector @gol
>> -Wstrict-aliasing -Wstrict-aliasing=n @gol
>> -Wstrict-overflow -Wstrict-overflow=@var{n} @gol
>> @@ -380,7 +380,9 @@ Objective-C and Objective-C++ Dialects}.
>> -freciprocal-math -fregmove -frename-registers -freorder-blocks @gol
>> -freorder-blocks-and-partition -freorder-functions @gol
>> -frerun-cse-after-loop -freschedule-modulo-scheduled-loops @gol
>> --frounding-math -fsched2-use-superblocks -fsched-pressure @gol
>> +-fripa -fripa-disallow-asm-modules -fripa-disallow-opt-mismatch @gol
>> +-fripa-no-promote-always-inline-func -fripa-verbose -frounding-math @gol
>> +-fsched2-use-superblocks -fsched-pressure @gol
>> -fsched-spec-load -fsched-spec-load-dangerous @gol
>> -fsched-stalled-insns-dep[=@var{n}] -fsched-stalled-insns[=@var{n}] @gol
>> -fsched-group-heuristic -fsched-critical-path-heuristic @gol
>> @@ -1364,8 +1366,8 @@ prefix.
>> @opindex no-canonical-prefixes
>> Never expand any symbolic links, resolve references to @samp{/../}
>> or @samp{/./}, or make the path absolute when generating a relative
>> -prefix. If neither @option{-canonical-prefixes} nor
>> -@option{-canonical-prefixes} is given, GCC tries to set an appropriate
>> +prefix. If neither @option{-canonical-prefixes} nor
>> +@option{-nocanonical-prefixes} is given, GCC tries to set an appropriate
>> default by looking for a target-specific subdirectory alongside the
>> directory containing the compiler driver.
>>
>> @@ -2969,6 +2971,7 @@ Options} and @ref{Objective-C and Object
>> -Wpointer-sign @gol
>> -Wreorder @gol
>> -Wreturn-type @gol
>> +-Wripa-opt-mismatch @gol
>> -Wsequence-point @gol
>> -Wsign-compare @r{(only in C++)} @gol
>> -Wstrict-aliasing @gol
>> @@ -3395,6 +3398,16 @@ exceptions are @samp{main} and functions
>>
>> This warning is enabled by @option{-Wall}.
>>
>> +@item -Wripa-opt-mismatch
>> +@opindex Wripa-opt-mismatch
>> +@opindex Wno-ripa-opt-mismatch
>> +When doing an FDO build with @option{-fprofile-use} and @option{-fripa},
>> +warn if importing an axuiliary module that was built with a different
>> +GCC command line during the profile-generate phase than the primary
>> +module.
>> +
>> +This warning is enabled by @option{-Wall}.
>> +
>> @item -Wswitch
>> @opindex Wswitch
>> @opindex Wno-switch
>> @@ -4493,6 +4506,12 @@ Suppress warnings from casts to pointer
>> different size. In C++, casting to a pointer type of smaller size is
>> an error. @option{Wint-to-pointer-cast} is enabled by default.
>>
>> +@item max-lipo-mem
>> +When importing auxiliary modules during profile-use, check current
>> +memory consumption after parsing each auxiliary module. If it exceeds
>> +this limit (specified in kb), don't import any more auxiliary modules.
>> +Specifying a value of 0 means don't enforce this limit. This parameter
>> +is only useful when using @option{-fprofile-use} and @option{-fripa}.
>>
>> @item -Wno-pointer-to-int-cast @r{(C and Objective-C only)}
>> @opindex Wno-pointer-to-int-cast
>> @@ -7864,6 +7883,40 @@ code.
>>
>> If @var{path} is specified, GCC will look at the @var{path} to find
>> the profile feedback data files. See @option{-fprofile-dir}.
>> +
>> +@item -fripa
>> +@opindex fripa
>> +Perform dynamic inter-procedural analysis. This is used in conjunction with
>> +the @option{-fprofile-generate} and @option{-fprofile-use} options.
>> +During the @option{-fprofile-generate} phase, this flag turns on some additional
>> +instrumentation code that enables dynamic call-graph analysis.
>> +During the @option{-fprofile-use} phase, this flag enables cross-module
>> +optimizations such as inlining.
>> +
>> +@item -fripa-disallow-asm-modules
>> +@opindex fripa-disallow-asm-modules
>> +During profile-gen, if this flag is enabled, and the module has asm statements,
>> +arrange so that a bit recording this information will be set in the profile
>> +feedback data file.
>> +During profile-use, if this flag is enabled, and the same bit in auxiliary
>> +module's profile feedback data is set, don't import this auxiliary module.
>> +If this is the primary module, don't export it.
>> +
>> +@item -fripa-disallow-opt-mismatch
>> +@opindex fripa-disallow-opt-mismatch
>> +Don't import an auxiliary module, if the GCC command line options used for this
>> +auxiliary module during the profile-generate stage were different from those used
>> +for the primary module. Note that any mismatches in warning-related options are
>> +ignored for this comparison.
>> +
>> +@item -fripa-no-promote-always-inline-func
>> +@opindex fripa-no-promote-always-inline-func
>> +Do not promote static functions with always inline attribute in LIPO compilation.
>> +
>> +@item -fripa-verbose
>> +@opindex fripa-verbose
>> +Enable printing of verbose information about dynamic inter-procedural optimizations.
>> +This is used in conjunction with the @option{-fripa}.
>> @end table
>>
>> The following options control compiler behavior regarding floating
>> @@ -9954,6 +10007,12 @@ given to GCC, substitutes @code{Y}; else
>> be as many clauses as you need. This may be combined with @code{.},
>> @code{,}, @code{!}, @code{|}, and @code{*} as needed.
>>
>> +@item max-lipo-mem
>> +When importing auxiliary modules during profile-use, check current
>> +memory consumption after parsing each auxiliary module. If it exceeds
>> +this limit (specified in kb), don't import any more auxiliary modules.
>> +Specifying a value of 0 means don't enforce this limit. This parameter
>> +is only useful when using @option{-fprofile-use} and @option{-fripa}.
>>
>> @end table
>>
>> Index: cgraphbuild.c
>> ===================================================================
>> --- cgraphbuild.c (revision 172880)
>> +++ cgraphbuild.c (working copy)
>> @@ -30,9 +30,13 @@ along with GCC; see the file COPYING3.
>> #include "cgraph.h"
>> #include "intl.h"
>> #include "gimple.h"
>> +#include "toplev.h"
>> +#include "gcov-io.h"
>> +#include "coverage.h"
>> #include "tree-pass.h"
>> #include "ipa-utils.h"
>> #include "except.h"
>> +#include "l-ipo.h"
>>
>> /* Context of record_reference. */
>> struct record_reference_ctx
>> @@ -236,6 +240,140 @@ compute_call_stmt_bb_frequency (tree dec
>> return freq;
>> }
>>
>> +
>> +bool cgraph_pre_profiling_inlining_done = false;
>> +
>> +/* Return true if E is a fake indirect call edge. */
>> +
>> +bool
>> +cgraph_is_fake_indirect_call_edge (struct cgraph_edge *e)
>> +{
>> + return !e->call_stmt;
>> +}
>> +
>> +
>> +/* Add fake cgraph edges from NODE to its indirect call callees
>> + using profile data. */
>> +
>> +static void
>> +add_fake_indirect_call_edges (struct cgraph_node *node)
>> +{
>> + unsigned n_counts, i;
>> + gcov_type *ic_counts;
>> +
>> + /* Enable this only for LIPO for now. */
>> + if (!L_IPO_COMP_MODE)
>> + return;
>> +
>> + if (cgraph_pre_profiling_inlining_done)
>> + return;
>> +
>> + ic_counts
>> + = get_coverage_counts_no_warn (DECL_STRUCT_FUNCTION (node->decl),
>> + GCOV_COUNTER_ICALL_TOPNV, &n_counts);
>> +
>> + if (!ic_counts)
>> + return;
>> +
>> + gcc_assert ((n_counts % GCOV_ICALL_TOPN_NCOUNTS) == 0);
>> +
>> +/* After the early_inline_1 before value profile transformation,
>> + functions that are indirect call targets may have their bodies
>> + removed (extern inline functions or functions from aux modules,
>> + functions in comdat etc) if all direct callsites are inlined. This
>> + will lead to missing inline opportunities after profile based
>> + indirect call promotion. The solution is to add fake edges to
>> + indirect call targets. Note that such edges are not associated
>> + with actual indirect call sites because it is not possible to
>> + reliably match pre-early-inline indirect callsites with indirect
>> + call profile counters which are from post-early inline function body. */
>> +
>> + for (i = 0; i < n_counts;
>> + i += GCOV_ICALL_TOPN_NCOUNTS, ic_counts += GCOV_ICALL_TOPN_NCOUNTS)
>> + {
>> + gcov_type val1, val2, count1, count2;
>> + struct cgraph_node *direct_call1 = 0, *direct_call2 = 0;
>> +
>> + val1 = ic_counts[1];
>> + count1 = ic_counts[2];
>> + val2 = ic_counts[3];
>> + count2 = ic_counts[4];
>> +
>> + if (val1 == 0 || count1 == 0)
>> + continue;
>> +
>> + direct_call1 = find_func_by_global_id (val1);
>> + if (direct_call1)
>> + {
>> + tree decl = direct_call1->decl;
>> + cgraph_create_edge (node,
>> + cgraph_node (decl),
>> + NULL,
>> + count1, 0, 0);
>> + }
>> +
>> + if (val2 == 0 || count2 == 0)
>> + continue;
>> + direct_call2 = find_func_by_global_id (val2);
>> + if (direct_call2)
>> + {
>> + tree decl = direct_call2->decl;
>> + cgraph_create_edge (node,
>> + cgraph_node (decl),
>> + NULL,
>> + count2, 0, 0);
>> + }
>> + }
>> +}
>> +
>> +
>> +/* This can be implemented as an IPA pass that must be first one
>> + before any unreachable node elimination. */
>> +
>> +void
>> +cgraph_add_fake_indirect_call_edges (void)
>> +{
>> + struct cgraph_node *node;
>> +
>> + /* Enable this only for LIPO for now. */
>> + if (!L_IPO_COMP_MODE)
>> + return;
>> +
>> + for (node = cgraph_nodes; node; node = node->next)
>> + {
>> + if (node->analyzed && (node->needed || node->reachable))
>> + add_fake_indirect_call_edges (node);
>> + }
>> +}
>> +
>> +/* Remove zero count fake edges added for the purpose of ensuring
>> + the right processing order. This should be called after all
>> + small ipa passes. */
>> +void
>> +cgraph_remove_zero_count_fake_edges (void)
>> +{
>> + struct cgraph_node *node;
>> +
>> + /* Enable this only for LIPO for now. */
>> + if (!L_IPO_COMP_MODE)
>> + return;
>> +
>> + for (node = cgraph_nodes; node; node = node->next)
>> + {
>> + if (node->analyzed && (node->needed || node->reachable))
>> + {
>> + struct cgraph_edge *e, *f;
>> + for (e = node->callees; e; e = f)
>> + {
>> + f = e->next_callee;
>> + if (!e->call_stmt && !e->count && !e->loop_nest
>> + && !e->frequency)
>> + cgraph_remove_edge (e);
>> + }
>> + }
>> + }
>> +}
>> +
>> /* Mark address taken in STMT. */
>>
>> static bool
>> @@ -390,6 +528,7 @@ build_cgraph_edges (void)
>> mark_load, mark_store, mark_address);
>> }
>>
>> +
>> /* Look for initializers of constant variables and private statics. */
>> FOR_EACH_LOCAL_DECL (cfun, ix, decl)
>> if (TREE_CODE (decl) == VAR_DECL
>> @@ -466,9 +605,24 @@ rebuild_cgraph_edges (void)
>> bb);
>> decl = gimple_call_fndecl (stmt);
>> if (decl)
>> - cgraph_create_edge (node, cgraph_node (decl), stmt,
>> - bb->count, freq,
>> - bb->loop_depth);
>> + {
>> + struct cgraph_node *callee;
>> + /* In LIPO mode, before tree_profiling, the call graph edge
>> + needs to be built with the original target node to make
>> + sure consistent early inline decisions between profile generate
>> + and profile use. After tree-profiling, the target needs to be
>> + set to the resolved node so that ipa-inline sees the definitions. */
>> + if (L_IPO_COMP_MODE && cgraph_pre_profiling_inlining_done)
>> + callee = cgraph_lipo_get_resolved_node (decl);
>> + else
>> + callee = cgraph_node (decl);
>> + cgraph_create_edge (node, callee, stmt,
>> + bb->count, freq,
>> + bb->loop_depth);
>> + if (L_IPO_COMP_MODE && cgraph_pre_profiling_inlining_done
>> + && decl != callee->decl)
>> + gimple_call_set_fndecl (stmt, callee->decl);
>> + }
>> else
>> cgraph_create_indirect_edge (node, stmt,
>> gimple_call_flags (stmt),
>> @@ -483,6 +637,7 @@ rebuild_cgraph_edges (void)
>> walk_stmt_load_store_addr_ops (gsi_stmt (gsi), node,
>> mark_load, mark_store, mark_address);
>> }
>> + add_fake_indirect_call_edges (node);
>> record_eh_tables (node, cfun);
>> gcc_assert (!node->global.inlined_to);
>>
>> Index: dyn-ipa.c
>> ===================================================================
>> --- dyn-ipa.c (revision 0)
>> +++ dyn-ipa.c (revision 0)
>> @@ -0,0 +1,1476 @@
>> +/* Compile this one with gcc. */
>> +/* Copyright (C) 2009. Free Software Foundation, Inc.
>> + Contributed by Xinliang David Li (davidxl@google.com) and
>> + Raksit Ashok (raksit@google.com)
>> +
>> +This file is part of GCC.
>> +
>> +GCC is free software; you can redistribute it and/or modify it under
>> +the terms of the GNU General Public License as published by the Free
>> +Software Foundation; either version 3, or (at your option) any later
>> +version.
>> +
>> +GCC is distributed in the hope that it will be useful, but WITHOUT ANY
>> +WARRANTY; without even the implied warranty of MERCHANTABILITY or
>> +FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License
>> +for more details.
>> +
>> +Under Section 7 of GPL version 3, you are granted additional
>> +permissions described in the GCC Runtime Library Exception, version
>> +3.1, as published by the Free Software Foundation.
>> +
>> +You should have received a copy of the GNU General Public License and
>> +a copy of the GCC Runtime Library Exception along with this program;
>> +see the files COPYING3 and COPYING.RUNTIME respectively. If not, see
>> +<http://www.gnu.org/licenses/>. */
>> +
>> +#include "tconfig.h"
>> +#include "tsystem.h"
>> +#include "coretypes.h"
>> +#include "tm.h"
>> +
>> +#if defined(inhibit_libc)
>> +#define IN_LIBGCOV (-1)
>> +#else
>> +#undef NULL /* Avoid errors if stdio.h and our stddef.h mismatch. */
>> +#include <stdio.h>
>> +#include <stdlib.h>
>> +#define IN_LIBGCOV 1
>> +#if defined(L_gcov)
>> +#define GCOV_LINKAGE /* nothing */
>> +#endif
>> +#endif
>> +#include "gcov-io.h"
>> +
>> +struct dyn_pointer_set;
>> +
>> +#define XNEWVEC(type,ne) (type *)malloc(sizeof(type) * (ne))
>> +#define XNEW(type) (type *)malloc(sizeof(type))
>> +#define XDELETEVEC(p) free(p)
>> +#define XDELETE(p) free(p)
>> +
>> +struct dyn_cgraph_node
>> +{
>> + struct dyn_cgraph_edge *callees;
>> + struct dyn_cgraph_edge *callers;
>> + struct dyn_pointer_set *imported_modules;
>> +
>> + gcov_type guid;
>> + gcov_type sum_in_count;
>> + gcov_unsigned_t visited;
>> +};
>> +
>> +struct dyn_cgraph_edge
>> +{
>> + struct dyn_cgraph_node *caller;
>> + struct dyn_cgraph_node *callee;
>> + struct dyn_cgraph_edge *next_caller;
>> + struct dyn_cgraph_edge *next_callee;
>> + gcov_type count;
>> +};
>> +
>> +struct dyn_module_info
>> +{
>> + struct dyn_pointer_set *imported_modules;
>> + gcov_unsigned_t max_func_ident;
>> +};
>> +
>> +struct dyn_cgraph
>> +{
>> + struct dyn_pointer_set **call_graph_nodes;
>> + struct gcov_info **modules;
>> + /* supplement module information */
>> + struct dyn_module_info *sup_modules;
>> + const struct gcov_fn_info ***functions;
>> + unsigned num_modules;
>> + unsigned num_nodes_executed;
>> +};
>> +
>> +struct dyn_pointer_set
>> +{
>> + size_t log_slots;
>> + size_t n_slots; /* n_slots = 2^log_slots */
>> + size_t n_elements;
>> +
>> + void **slots;
>> + unsigned (*get_key) (const void *);
>> +};
>> +
>> +
>> +#if defined(inhibit_libc)
>> +__gcov_build_callgraph (void) {}
>> +#else
>> +
>> +void __gcov_compute_module_groups (void) ATTRIBUTE_HIDDEN;
>> +void __gcov_finalize_dyn_callgraph (void) ATTRIBUTE_HIDDEN;
>> +static void gcov_dump_callgraph (gcov_type);
>> +static void gcov_dump_cgraph_node_short (struct dyn_cgraph_node *node);
>> +static void gcov_dump_cgraph_node (struct dyn_cgraph_node *node,
>> + unsigned m, unsigned f);
>> +static void
>> +gcov_dump_cgraph_node_dot (struct dyn_cgraph_node *node,
>> + unsigned m, unsigned f,
>> + gcov_type cutoff_count);
>> +static void
>> +pointer_set_destroy (struct dyn_pointer_set *pset);
>> +static void **
>> +pointer_set_find_or_insert (struct dyn_pointer_set *pset, unsigned key);
>> +static struct dyn_pointer_set *
>> +pointer_set_create (unsigned (*get_key) (const void *));
>> +
>> +static struct dyn_cgraph the_dyn_call_graph;
>> +static int total_zero_count = 0;
>> +static int total_insane_count = 0;
>> +
>> +static void
>> +init_dyn_cgraph_node (struct dyn_cgraph_node *node, gcov_type guid)
>> +{
>> + node->callees = 0;
>> + node->callers = 0;
>> + node->imported_modules = 0;
>> + node->guid = guid;
>> + node->visited = 0;
>> +}
>> +
>> +/* Return (module_id - 1). FUNC_GUID is the global unique id. */
>> +
>> +static inline gcov_unsigned_t
>> +get_module_idx_from_func_glob_uid (gcov_type func_guid)
>> +{
>> + return EXTRACT_MODULE_ID_FROM_GLOBAL_ID (func_guid) - 1;
>> +}
>> +
>> +/* Return (module_id - 1) for MODULE_INFO. */
>> +
>> +static inline gcov_unsigned_t
>> +get_module_idx (const struct gcov_info *module_info)
>> +{
>> + return module_info->mod_info->ident - 1;
>> +}
>> +
>> +/* Return intra-module function id given function global unique id
>> + FUNC_GUID. */
>> +
>> +static inline gcov_unsigned_t
>> +get_intra_module_func_id (gcov_type func_guid)
>> +{
>> + return EXTRACT_FUNC_ID_FROM_GLOBAL_ID (func_guid);
>> +}
>> +
>> +/* Return the pointer to the dynamic call graph node for FUNC_GUID. */
>> +
>> +static inline struct dyn_cgraph_node *
>> +get_cgraph_node (gcov_type func_guid)
>> +{
>> + gcov_unsigned_t mod_id, func_id;
>> +
>> + mod_id = get_module_idx_from_func_glob_uid (func_guid);
>> +
>> + /* This is to workaround: calls in __static_initialization_and_destruction
>> + should not be instrumented as the module id context for the callees have
>> + not setup yet -- this leads to mod_id == (unsigned) (0 - 1). Multithreaded
>> + programs may also produce insane func_guid in the profile counter. */
>> + if (mod_id >= the_dyn_call_graph.num_modules)
>> + return 0;
>> +
>> + func_id = get_intra_module_func_id (func_guid);
>> + if (func_id > the_dyn_call_graph.sup_modules[mod_id].max_func_ident)
>> + return 0;
>> +
>> + return *(pointer_set_find_or_insert
>> + (the_dyn_call_graph.call_graph_nodes[mod_id], func_id));
>> +}
>> +
>> +/* Return the gcov_info pointer for module with id MODULE_ID. */
>> +
>> +static inline struct gcov_info *
>> +get_module_info (gcov_unsigned_t module_id)
>> +{
>> + return the_dyn_call_graph.modules[module_id];
>> +}
>> +
>> +struct gcov_info *__gcov_list ATTRIBUTE_HIDDEN;
>> +
>> +static inline unsigned
>> +cgraph_node_get_key (const void *p)
>> +{
>> + return get_intra_module_func_id (((const struct dyn_cgraph_node *) p)->guid);
>> +}
>> +
>> +/* Initialize dynamic call graph. */
>> +
>> +static void
>> +init_dyn_call_graph (void)
>> +{
>> + unsigned num_modules = 0;
>> + struct gcov_info *gi_ptr;
>> +
>> + the_dyn_call_graph.call_graph_nodes = 0;
>> + the_dyn_call_graph.modules = 0;
>> + the_dyn_call_graph.functions = 0;
>> + the_dyn_call_graph.num_nodes_executed = 0;
>> +
>> + gi_ptr = __gcov_list;
>> +
>> + for (; gi_ptr; gi_ptr = gi_ptr->next)
>> + num_modules++;
>> +
>> + the_dyn_call_graph.num_modules = num_modules;
>> +
>> + the_dyn_call_graph.modules
>> + = XNEWVEC (struct gcov_info *, num_modules);
>> +
>> + the_dyn_call_graph.sup_modules
>> + = XNEWVEC (struct dyn_module_info, num_modules);
>> + memset (the_dyn_call_graph.sup_modules, 0,
>> + num_modules * sizeof (struct dyn_module_info));
>> +
>> + the_dyn_call_graph.functions
>> + = XNEWVEC (const struct gcov_fn_info **, num_modules);
>> +
>> + the_dyn_call_graph.call_graph_nodes
>> + = XNEWVEC (struct dyn_pointer_set *, num_modules);
>> +
>> + gi_ptr = __gcov_list;
>> +
>> + for (; gi_ptr; gi_ptr = gi_ptr->next)
>> + {
>> + unsigned c_ix = 0, t_ix, j, mod_id, fi_stride, max_func_ident = 0;
>> + struct dyn_cgraph_node *node;
>> +
>> + mod_id = get_module_idx (gi_ptr);
>> +
>> + the_dyn_call_graph.modules[mod_id] = gi_ptr;
>> +
>> + the_dyn_call_graph.functions[mod_id]
>> + = XNEWVEC (const struct gcov_fn_info *, gi_ptr->n_functions);
>> +
>> + for (t_ix = 0; t_ix < GCOV_COUNTERS; t_ix++)
>> + if ((1 << t_ix) & gi_ptr->ctr_mask)
>> + c_ix++;
>> +
>> + fi_stride = offsetof (struct gcov_fn_info, n_ctrs) + c_ix * sizeof (unsigned);
>> + if (__alignof__ (struct gcov_fn_info) > sizeof (unsigned))
>> + {
>> + fi_stride += __alignof__ (struct gcov_fn_info) - 1;
>> + fi_stride &= ~(__alignof__ (struct gcov_fn_info) - 1);
>> + }
>> +
>> + for (j = 0; j < gi_ptr->n_functions; j++)
>> + {
>> + const struct gcov_fn_info *fi_ptr = (const struct gcov_fn_info *)
>> + ((const char *) gi_ptr->functions + j * fi_stride);
>> + the_dyn_call_graph.functions[mod_id][j] = fi_ptr;
>> + if (fi_ptr->ident > max_func_ident)
>> + max_func_ident = fi_ptr->ident;
>> + }
>> +
>> +
>> + the_dyn_call_graph.call_graph_nodes[mod_id]
>> + = pointer_set_create (cgraph_node_get_key);
>> +
>> + the_dyn_call_graph.sup_modules[mod_id].max_func_ident = max_func_ident;
>> +
>> + for (j = 0; j < gi_ptr->n_functions; j++)
>> + {
>> + const struct gcov_fn_info *fi_ptr
>> + = the_dyn_call_graph.functions[mod_id][j];
>> + *(pointer_set_find_or_insert
>> + (the_dyn_call_graph.call_graph_nodes[mod_id], fi_ptr->ident))
>> + = node = XNEW (struct dyn_cgraph_node);
>> + the_dyn_call_graph.call_graph_nodes[mod_id]->n_elements++;
>> + init_dyn_cgraph_node (node, GEN_FUNC_GLOBAL_ID (gi_ptr->mod_info->ident,
>> + fi_ptr->ident));
>> + }
>> + }
>> +}
>> +
>> +/* Free up memory allocated for dynamic call graph. */
>> +
>> +void
>> +__gcov_finalize_dyn_callgraph (void)
>> +{
>> + unsigned i;
>> + struct gcov_info *gi_ptr;
>> +
>> + for (i = 0; i < the_dyn_call_graph.num_modules; i++)
>> + {
>> + gi_ptr = the_dyn_call_graph.modules[i];
>> + const struct gcov_fn_info *fi_ptr;
>> + unsigned f_ix;
>> + for (f_ix = 0; f_ix < gi_ptr->n_functions; f_ix++)
>> + {
>> + struct dyn_cgraph_node *node;
>> + struct dyn_cgraph_edge *callees, *next_callee;
>> + fi_ptr = the_dyn_call_graph.functions[i][f_ix];
>> + node = *(pointer_set_find_or_insert
>> + (the_dyn_call_graph.call_graph_nodes[i], fi_ptr->ident));
>> + gcc_assert (node);
>> + callees = node->callees;
>> +
>> + if (!callees)
>> + continue;
>> + while (callees != 0)
>> + {
>> + next_callee = callees->next_callee;
>> + XDELETE (callees);
>> + callees = next_callee;
>> + }
>> + if (node->imported_modules)
>> + pointer_set_destroy (node->imported_modules);
>> + }
>> + if (the_dyn_call_graph.call_graph_nodes[i])
>> + pointer_set_destroy (the_dyn_call_graph.call_graph_nodes[i]);
>> + if (the_dyn_call_graph.functions[i])
>> + XDELETEVEC (the_dyn_call_graph.functions[i]);
>> + /* Now delete sup modules */
>> + if (the_dyn_call_graph.sup_modules[i].imported_modules)
>> + pointer_set_destroy (the_dyn_call_graph.sup_modules[i].imported_modules);
>> + }
>> + XDELETEVEC (the_dyn_call_graph.call_graph_nodes);
>> + XDELETEVEC (the_dyn_call_graph.functions);
>> + XDELETEVEC (the_dyn_call_graph.sup_modules);
>> + XDELETEVEC (the_dyn_call_graph.modules);
>> +}
>> +
>> +/* Add outgoing edge OUT_EDGE for caller node CALLER. */
>> +
>> +static void
>> +gcov_add_out_edge (struct dyn_cgraph_node *caller,
>> + struct dyn_cgraph_edge *out_edge)
>> +{
>> + if (!caller->callees)
>> + caller->callees = out_edge;
>> + else
>> + {
>> + out_edge->next_callee = caller->callees;
>> + caller->callees = out_edge;
>> + }
>> +}
>> +
>> +/* Add incoming edge IN_EDGE for callee node CALLEE. */
>> +
>> +static void
>> +gcov_add_in_edge (struct dyn_cgraph_node *callee,
>> + struct dyn_cgraph_edge *in_edge)
>> +{
>> + if (!callee->callers)
>> + callee->callers = in_edge;
>> + else
>> + {
>> + in_edge->next_caller = callee->callers;
>> + callee->callers = in_edge;
>> + }
>> +}
>> +
>> +/* Add a call graph edge between caller CALLER and callee CALLEE.
>> + The edge count is COUNT. */
>> +
>> +static void
>> +gcov_add_cgraph_edge (struct dyn_cgraph_node *caller,
>> + struct dyn_cgraph_node *callee,
>> + gcov_type count)
>> +{
>> + struct dyn_cgraph_edge *new_edge = XNEW (struct dyn_cgraph_edge);
>> + new_edge->caller = caller;
>> + new_edge->callee = callee;
>> + new_edge->count = count;
>> + new_edge->next_caller = 0;
>> + new_edge->next_callee = 0;
>> +
>> + gcov_add_out_edge (caller, new_edge);
>> + gcov_add_in_edge (callee, new_edge);
>> +}
>> +
>> +/* Add call graph edges from direct calls for caller CALLER. DIR_CALL_COUNTERS
>> + is the array of call counters. N_COUNTS is the number of counters. */
>> +
>> +static void
>> +gcov_build_callgraph_dc_fn (struct dyn_cgraph_node *caller,
>> + gcov_type *dir_call_counters,
>> + unsigned n_counts)
>> +{
>> + unsigned i;
>> +
>> + for (i = 0; i < n_counts; i += 2)
>> + {
>> + struct dyn_cgraph_node *callee;
>> + gcov_type count;
>> + gcov_type callee_guid = dir_call_counters[i];
>> +
>> + count = dir_call_counters[i + 1];
>> + if (count == 0)
>> + {
>> + total_zero_count++;
>> + continue;
>> + }
>> + callee = get_cgraph_node (callee_guid);
>> + if (!callee)
>> + {
>> + total_insane_count++;
>> + continue;
>> + }
>> + gcov_add_cgraph_edge (caller, callee, count);
>> + }
>> +}
>> +
>> +/* Add call graph edges from indirect calls for caller CALLER. ICALL_COUNTERS
>> + is the array of icall counters. N_COUNTS is the number of counters. */
>> +
>> +static void
>> +gcov_build_callgraph_ic_fn (struct dyn_cgraph_node *caller,
>> + gcov_type *icall_counters,
>> + unsigned n_counts)
>> +{
>> + unsigned i, j;
>> +
>> + for (i = 0; i < n_counts; i += GCOV_ICALL_TOPN_NCOUNTS)
>> + {
>> + gcov_type *value_array = &icall_counters[i + 1];
>> + for (j = 0; j < GCOV_ICALL_TOPN_NCOUNTS - 1; j += 2)
>> + {
>> + struct dyn_cgraph_node *callee;
>> + gcov_type count;
>> + gcov_type callee_guid = value_array[j];
>> +
>> + count = value_array[j + 1];
>> + /* Do not update zero count edge count
>> + * as it means there is no target in this entry. */
>> + if (count == 0)
>> + continue;
>> + callee = get_cgraph_node (callee_guid);
>> + if (!callee)
>> + {
>> + total_insane_count++;
>> + continue;
>> + }
>> + gcov_add_cgraph_edge (caller, callee, count);
>> + }
>> + }
>> +}
>> +
>> +/* Build the dynamic call graph. */
>> +
>> +static void
>> +gcov_build_callgraph (void)
>> +{
>> + struct gcov_info *gi_ptr;
>> + unsigned t_ix, m_ix;
>> +
>> + init_dyn_call_graph ();
>> +
>> + for (m_ix = 0; m_ix < the_dyn_call_graph.num_modules; m_ix++)
>> + {
>> + const struct gcov_fn_info *fi_ptr;
>> + unsigned c_ix, f_ix, n_counts, dp_cix = 0, ip_cix = 0;
>> + gcov_type *dcall_profile_values, *icall_profile_values;
>> + gcov_type *arcs_values = 0; unsigned arcs_cix;
>> +
>> + gi_ptr = the_dyn_call_graph.modules[m_ix];
>> +
>> + dcall_profile_values = 0;
>> + icall_profile_values = 0;
>> + c_ix = 0;
>> + for (t_ix = 0; t_ix < GCOV_COUNTERS; t_ix++)
>> + if ((1 << t_ix) & gi_ptr->ctr_mask)
>> + {
>> + if (t_ix == GCOV_COUNTER_DIRECT_CALL)
>> + {
>> + dcall_profile_values = gi_ptr->counts[c_ix].values;
>> + dp_cix = c_ix;
>> + }
>> + if (t_ix == GCOV_COUNTER_ICALL_TOPNV)
>> + {
>> + icall_profile_values = gi_ptr->counts[c_ix].values;
>> + ip_cix = c_ix;
>> + }
>> + if (t_ix == GCOV_COUNTER_ARCS)
>> + {
>> + arcs_values = gi_ptr->counts[c_ix].values;
>> + arcs_cix = c_ix;
>> + }
>> + c_ix++;
>> + }
>> +
>> + if (dcall_profile_values == 0 && icall_profile_values == 0)
>> + continue;
>> +
>> + for (f_ix = 0; f_ix < gi_ptr->n_functions; f_ix++)
>> + {
>> + struct dyn_cgraph_node *caller;
>> + fi_ptr = the_dyn_call_graph.functions[m_ix][f_ix];
>> + caller = *(pointer_set_find_or_insert
>> + (the_dyn_call_graph.call_graph_nodes[m_ix],
>> + fi_ptr->ident));
>> + gcc_assert (caller);
>> + if (dcall_profile_values)
>> + {
>> + unsigned offset;
>> + n_counts = fi_ptr->n_ctrs[dp_cix];
>> + offset = fi_ptr->dc_offset;
>> + gcov_build_callgraph_dc_fn (caller,
>> + dcall_profile_values + offset,
>> + n_counts);
>> + }
>> + if (icall_profile_values)
>> + {
>> + n_counts = fi_ptr->n_ctrs[ip_cix];
>> + gcov_build_callgraph_ic_fn (caller, icall_profile_values, n_counts);
>> + icall_profile_values += n_counts;
>> + }
>> + if (arcs_values && 0)
>> + {
>> + gcov_type total_arc_count = 0;
>> + unsigned arc;
>> + n_counts = fi_ptr->n_ctrs[arcs_cix];
>> + for (arc = 0; arc < n_counts; arc++)
>> + total_arc_count += arcs_values[arc];
>> + if (total_arc_count != 0)
>> + the_dyn_call_graph.num_nodes_executed++;
>> + arcs_values += n_counts;
>> + }
>> + }
>> + }
>> +
>> +}
>> +
>> +static inline size_t
>> +hash1 (unsigned p, unsigned long max, unsigned long logmax)
>> +{
>> + const unsigned long long A = 0x9e3779b97f4a7c16ull;
>> + const unsigned long long shift = 64 - logmax;
>> +
>> + return ((A * (unsigned long) p) >> shift) & (max - 1);
>> +}
>> +
>> +/* Allocate an empty imported-modules set. */
>> +
>> +static struct dyn_pointer_set *
>> +pointer_set_create (unsigned (*get_key) (const void *))
>> +{
>> + struct dyn_pointer_set *result = XNEW (struct dyn_pointer_set);
>> +
>> + result->n_elements = 0;
>> + result->log_slots = 8;
>> + result->n_slots = (size_t) 1 << result->log_slots;
>> +
>> + result->slots = XNEWVEC (void *, result->n_slots);
>> + memset (result->slots, 0, sizeof (void *) * result->n_slots);
>> + result->get_key = get_key;
>> +
>> + return result;
>> +}
>> +
>> +/* Reclaim all memory associated with PSET. */
>> +
>> +static void
>> +pointer_set_destroy (struct dyn_pointer_set *pset)
>> +{
>> + size_t i;
>> + for (i = 0; i < pset->n_slots; i++)
>> + if (pset->slots[i])
>> + XDELETE (pset->slots[i]);
>> + XDELETEVEC (pset->slots);
>> + XDELETE (pset);
>> +}
>> +
>> +/* Subroutine of pointer_set_find_or_insert. Return the insertion slot for KEY
>> + into an empty element of SLOTS, an array of length N_SLOTS. */
>> +static inline size_t
>> +insert_aux (unsigned key, void **slots,
>> + size_t n_slots, size_t log_slots,
>> + unsigned (*get_key) (const void *))
>> +{
>> + size_t n = hash1 (key, n_slots, log_slots);
>> + while (1)
>> + {
>> + if (slots[n] == 0 || get_key (slots[n]) == key)
>> + return n;
>> + else
>> + {
>> + ++n;
>> + if (n == n_slots)
>> + n = 0;
>> + }
>> + }
>> +}
>> +
>> +/* Find slot for KEY. KEY must be nonnull. */
>> +
>> +static void **
>> +pointer_set_find_or_insert (struct dyn_pointer_set *pset, unsigned key)
>> +{
>> + size_t n;
>> +
>> + /* For simplicity, expand the set even if KEY is already there. This can be
>> + superfluous but can happen at most once. */
>> + if (pset->n_elements > pset->n_slots / 4)
>> + {
>> + size_t new_log_slots = pset->log_slots + 1;
>> + size_t new_n_slots = pset->n_slots * 2;
>> + void **new_slots = XNEWVEC (void *, new_n_slots);
>> + memset (new_slots, 0, sizeof (void *) * new_n_slots);
>> + size_t i;
>> +
>> + for (i = 0; i < pset->n_slots; ++i)
>> + {
>> + void *value = pset->slots[i];
>> + if (!value)
>> + continue;
>> + n = insert_aux (pset->get_key (value), new_slots, new_n_slots,
>> + new_log_slots, pset->get_key);
>> + new_slots[n] = value;
>> + }
>> +
>> + XDELETEVEC (pset->slots);
>> + pset->n_slots = new_n_slots;
>> + pset->log_slots = new_log_slots;
>> + pset->slots = new_slots;
>> + }
>> +
>> + n = insert_aux (key, pset->slots, pset->n_slots, pset->log_slots,
>> + pset->get_key);
>> + return &pset->slots[n];
>> +}
>> +
>> +/* Pass each pointer in PSET to the function in FN, together with the fixed
>> + parameters DATA1, DATA2, DATA3. If FN returns false, the iteration stops. */
>> +
>> +static void
>> +pointer_set_traverse (const struct dyn_pointer_set *pset,
>> + int (*fn) (const void *, void *, void *, void *),
>> + void *data1, void *data2, void *data3)
>> +{
>> + size_t i;
>> + for (i = 0; i < pset->n_slots; ++i)
>> + if (pset->slots[i] && !fn (pset->slots[i], data1, data2, data3))
>> + break;
>> +}
>> +
>> +static int
>> +imp_mod_set_insert (struct dyn_pointer_set *p, const struct gcov_info *imp_mod,
>> + double wt)
>> +{
>> + struct dyn_imp_mod **m = (struct dyn_imp_mod **)
>> + pointer_set_find_or_insert (p, imp_mod->mod_info->ident);
>> + if (*m)
>> + {
>> + (*m)->weight += wt;
>> + return 1;
>> + }
>> + else
>> + {
>> + *m = XNEW (struct dyn_imp_mod);
>> + (*m)->imp_mod = imp_mod;
>> + (*m)->weight = wt;
>> + p->n_elements++;
>> + return 0;
>> + }
>> +}
>> +
>> +/* Callback function to propagate import module (VALUE) from callee to
>> + caller's imported-module-set (DATA1).
>> + The weight is scaled by the scaling-factor (DATA2) before propagation,
>> + and accumulated into DATA3. */
>> +static int
>> +gcov_propagate_imp_modules (const void *value, void *data1, void *data2,
>> + void *data3)
>> +{
>> + const struct dyn_imp_mod *m = (const struct dyn_imp_mod *) value;
>> + struct dyn_pointer_set *receiving_set = (struct dyn_pointer_set *) data1;
>> + double *scale = (double *) data2;
>> + double *sum = (double *) data3;
>> + double wt = m->weight;
>> + if (scale)
>> + wt *= *scale;
>> + if (sum)
>> + (*sum) += wt;
>> + imp_mod_set_insert (receiving_set, m->imp_mod, wt);
>> + return 1;
>> +}
>> +
>> +static int
>> +sort_by_count (const void *pa, const void *pb)
>> +{
>> + const struct dyn_cgraph_edge *edge_a = *(struct dyn_cgraph_edge * const *)pa;
>> + const struct dyn_cgraph_edge *edge_b = *(struct dyn_cgraph_edge * const *)pb;
>> +
>> + /* This can overvlow. */
>> + /* return edge_b->count - edge_a->count; */
>> + if (edge_b->count > edge_a->count)
>> + return 1;
>> + else if (edge_b->count == edge_a->count)
>> + return 0;
>> + else
>> + return -1;
>> +}
>> +
>> +/* Compute the hot callgraph edge threhold. */
>> +
>> +static gcov_type
>> +gcov_compute_cutoff_count (void)
>> +{
>> + unsigned m_ix, capacity, i;
>> + unsigned num_edges = 0;
>> + gcov_type cutoff_count;
>> + double total, cum, cum_cutoff;
>> + struct dyn_cgraph_edge **edges;
>> + struct gcov_info *gi_ptr;
>> + char *cutoff_str;
>> + char *num_perc_str;
>> + unsigned cutoff_perc;
>> + unsigned num_perc;
>> + int do_dump;
>> +
>> + capacity = 100;
>> + /* allocate an edge array */
>> + edges = XNEWVEC (struct dyn_cgraph_edge*, capacity);
>> + /* First count the number of edges. */
>> + for (m_ix = 0; m_ix < the_dyn_call_graph.num_modules; m_ix++)
>> + {
>> + const struct gcov_fn_info *fi_ptr;
>> + unsigned f_ix;
>> +
>> + gi_ptr = the_dyn_call_graph.modules[m_ix];
>> +
>> + for (f_ix = 0; f_ix < gi_ptr->n_functions; f_ix++)
>> + {
>> + struct dyn_cgraph_node *node;
>> + struct dyn_cgraph_edge *callees;
>> +
>> + fi_ptr = the_dyn_call_graph.functions[m_ix][f_ix];
>> +
>> + node = *(pointer_set_find_or_insert
>> + (the_dyn_call_graph.call_graph_nodes[m_ix], fi_ptr->ident));
>> + gcc_assert (node);
>> +
>> + callees = node->callees;
>> + while (callees != 0)
>> + {
>> + num_edges++;
>> + if (num_edges < capacity)
>> + edges[num_edges - 1] = callees;
>> + else
>> + {
>> + capacity = capacity + (capacity >> 1);
>> + edges = (struct dyn_cgraph_edge **)realloc (edges, sizeof (void*) * capacity);
>> + edges[num_edges - 1] = callees;
>> + }
>> + callees = callees->next_callee;
>> + }
>> + }
>> + }
>> +
>> + /* Now sort */
>> + qsort (edges, num_edges, sizeof (void *), sort_by_count);
>> +#define CUM_CUTOFF_PERCENT 80
>> +#define MIN_NUM_EDGE_PERCENT 0
>> + cutoff_str = getenv ("GCOV_DYN_CGRAPH_CUTOFF");
>> + if (cutoff_str && strlen (cutoff_str))
>> + {
>> + if ((num_perc_str = strchr (cutoff_str, ':')))
>> + {
>> + *num_perc_str = '\0';
>> + num_perc_str++;
>> + }
>> + cutoff_perc = atoi (cutoff_str);
>> + if (num_perc_str)
>> + num_perc = atoi (num_perc_str);
>> + else
>> + num_perc = MIN_NUM_EDGE_PERCENT;
>> + }
>> + else
>> + {
>> + cutoff_perc = CUM_CUTOFF_PERCENT;
>> + num_perc = MIN_NUM_EDGE_PERCENT;
>> + }
>> +
>> + total = 0;
>> + cum = 0;
>> + for (i = 0; i < num_edges; i++)
>> + total += edges[i]->count;
>> +
>> + cum_cutoff = (total * cutoff_perc)/100;
>> + do_dump = (getenv ("GCOV_DYN_CGRAPH_DUMP") != 0);
>> + for (i = 0; i < num_edges; i++)
>> + {
>> + cum += edges[i]->count;
>> + if (do_dump)
>> + fprintf (stderr, "// edge[%d] count = %.0f [%llx --> %llx]\n",
>> + i, (double) edges[i]->count,
>> + (long long) edges[i]->caller->guid,
>> + (long long) edges[i]->callee->guid);
>> + if (cum >= cum_cutoff && (i * 100 >= num_edges * num_perc))
>> + {
>> + cutoff_count = edges[i]->count;
>> + break;
>> + }
>> + }
>> +
>> + if (do_dump)
>> + fprintf (stderr, "// total = %.0f cum = %.0f cum/total = %.0f%%"
>> + " cutoff_count = %lld [total edges: %d hot edges: %d perc: %d%%]\n"
>> + " total_zero_count_edges = %d total_insane_count_edgess = %d\n"
>> + " total_nodes_executed = %d\n",
>> + total, cum, (cum * 100)/total, (long long) cutoff_count,
>> + num_edges, i, (i * 100)/num_edges, total_zero_count,
>> + total_insane_count, the_dyn_call_graph.num_nodes_executed);
>> +
>> + XDELETEVEC (edges);
>> + return cutoff_count;
>> +}
>> +
>> +static inline unsigned
>> +imp_mod_get_key (const void *p)
>> +{
>> + return ((const struct dyn_imp_mod *) p)->imp_mod->mod_info->ident;
>> +}
>> +
>> +/* Return the imported module set for NODE. */
>> +
>> +static struct dyn_pointer_set *
>> +gcov_get_imp_module_set (struct dyn_cgraph_node *node)
>> +{
>> + if (!node->imported_modules)
>> + node->imported_modules = pointer_set_create (imp_mod_get_key);
>> +
>> + return node->imported_modules;
>> +}
>> +
>> +/* Return the imported module set for MODULE MI. */
>> +
>> +static struct dyn_pointer_set *
>> +gcov_get_module_imp_module_set (struct dyn_module_info *mi)
>> +{
>> + if (!mi->imported_modules)
>> + mi->imported_modules = pointer_set_create (imp_mod_get_key);
>> +
>> + return mi->imported_modules;
>> +}
>> +
>> +/* Callback function to mark if a module needs to be exported. */
>> +
>> +static int
>> +gcov_mark_export_modules (const void *value,
>> + void *data1 ATTRIBUTE_UNUSED,
>> + void *data2 ATTRIBUTE_UNUSED,
>> + void *data3 ATTRIBUTE_UNUSED)
>> +{
>> + const struct gcov_info *module_info
>> + = ((const struct dyn_imp_mod *) value)->imp_mod;
>> +
>> + module_info->mod_info->is_exported = 1;
>> + return 1;
>> +}
>> +
>> +struct gcov_import_mod_array
>> +{
>> + const struct dyn_imp_mod **imported_modules;
>> + struct gcov_info *importing_module;
>> + unsigned len;
>> +};
>> +
>> +/* Callback function to compute pointer set size. */
>> +
>> +static int
>> +gcov_compute_mset_size (const void *value ATTRIBUTE_UNUSED,
>> + void *data1,
>> + void *data2 ATTRIBUTE_UNUSED,
>> + void *data3 ATTRIBUTE_UNUSED)
>> +{
>> + unsigned *len = (unsigned *) data1;
>> + (*len)++;
>> + return 1;
>> +}
>> +
>> +/* Callback function to collect imported modules. */
>> +
>> +static int
>> +gcov_collect_imported_modules (const void *value,
>> + void *data1,
>> + void *data2 ATTRIBUTE_UNUSED,
>> + void *data3 ATTRIBUTE_UNUSED)
>> +{
>> + struct gcov_import_mod_array *out_array;
>> + const struct dyn_imp_mod *m
>> + = (const struct dyn_imp_mod *) value;
>> +
>> + out_array = (struct gcov_import_mod_array *) data1;
>> +
>> + if (m->imp_mod != out_array->importing_module)
>> + out_array->imported_modules[out_array->len++] = m;
>> +
>> + return 1;
>> +}
>> +
>> +/* Comparator for sorting imported modules using weights. */
>> +
>> +static int
>> +sort_by_module_wt (const void *pa, const void *pb)
>> +{
>> + const struct dyn_imp_mod *m_a = *((const struct dyn_imp_mod * const *) pa);
>> + const struct dyn_imp_mod *m_b = *((const struct dyn_imp_mod * const *) pb);
>> +
>> + /* We want to sort in descending order of weights. */
>> + if (m_a->weight < m_b->weight)
>> + return +1;
>> + if (m_a->weight > m_b->weight)
>> + return -1;
>> + return get_module_idx (m_a->imp_mod) - get_module_idx (m_b->imp_mod);
>> +}
>> +
>> +/* Return a dynamic array of imported modules that is sorted for
>> + the importing module MOD_INFO. The length of the array is returned
>> + in *LEN. */
>> +
>> +const struct dyn_imp_mod **
>> +gcov_get_sorted_import_module_array (struct gcov_info *mod_info,
>> + unsigned *len)
>> +{
>> + unsigned mod_id;
>> + struct dyn_module_info *sup_mod_info;
>> + unsigned array_len = 0;
>> + struct gcov_import_mod_array imp_array;
>> +
>> + mod_id = get_module_idx (mod_info);
>> + sup_mod_info = &the_dyn_call_graph.sup_modules[mod_id];
>> +
>> + if (sup_mod_info->imported_modules == 0)
>> + return 0;
>> +
>> + pointer_set_traverse (sup_mod_info->imported_modules,
>> + gcov_compute_mset_size, &array_len, 0, 0);
>> + imp_array.imported_modules = XNEWVEC (const struct dyn_imp_mod *, array_len);
>> + imp_array.len = 0;
>> + imp_array.importing_module = mod_info;
>> + pointer_set_traverse (sup_mod_info->imported_modules,
>> + gcov_collect_imported_modules, &imp_array, 0, 0);
>> + *len = imp_array.len;
>> + qsort (imp_array.imported_modules, imp_array.len,
>> + sizeof (void *), sort_by_module_wt);
>> + return imp_array.imported_modules;
>> +}
>> +
>> +/* Compute modules that are needed for NODE (for cross module inlining).
>> + CUTTOFF_COUNT is the call graph edge count cutoff value.
>> + IMPORT_SCALE is the scaling-factor (percent) by which to scale the
>> + weights of imported modules of a callee before propagating them to
>> + the caller, if the callee and caller are in different modules.
>> +
>> + Each imported module is assigned a weight that corresponds to the
>> + expected benefit due to cross-module inlining. When the imported modules
>> + are written out, they are sorted with highest weight first.
>> +
>> + The following example illustrates how the weight is computed:
>> +
>> + Suppose we are processing call-graph node A. It calls function B 50 times,
>> + which calls function C 1000 times, and function E 800 times. Lets say B has
>> + another in-edge from function D, with edge-count of 50. Say all the
>> + functions are in separate modules (modules a, b, c, d, e, respectively):
>> +
>> + D
>> + |
>> + | 50
>> + |
>> + 50 v 1000
>> + A --------> B ----------> C
>> + |
>> + | 800
>> + |
>> + v
>> + E
>> +
>> + Nodes are processed in depth-first order, so when processing A, we first
>> + process B. For node B, we are going to add module c to the imported-module
>> + set, with weight 1000 (edge-count), and module e with weight 800.
>> + Coming back to A, we are going to add the imported-module-set of B to A,
>> + after doing some scaling.
>> + The first scaling factor comes from the fact that A calls B 50 times, but B
>> + has in-edge-count total of 100. So this scaling factor is 50/100 = 0.5
>> + The second scaling factor is that since B is in a different module than A,
>> + we want to slightly downgrade imported modules of B, before adding to the
>> + imported-modules set of A. This scaling factor has a default value of 50%
>> + (can be set via env variable GCOV_DYN_IMPORT_SCALE).
>> + So we end up adding modules c and e to the imported-set of A, with weights
>> + 0.5*0.5*1000=250 and 0.5*0.5*800=200, respectively.
>> +
>> + Next, we have to add module b itself to A. The weight is computed as the
>> + edge-count plus the sum of scaled-weights of all modules in the
>> + imported-module set of B, i.e., 50 + 250 + 200 = 500.
>> +
>> + In computing the weight of module b, we add the sum of scaled-weights of
>> + imported modules of b, because it doesn't make sense to import c, e in
>> + module a, until module b is imported. */
>> +
>> +static void
>> +gcov_process_cgraph_node (struct dyn_cgraph_node *node,
>> + gcov_type cutoff_count,
>> + unsigned import_scale)
>> +{
>> + unsigned mod_id;
>> + struct dyn_cgraph_edge *callees;
>> + struct dyn_cgraph_edge *callers;
>> + node->visited = 1;
>> + node->sum_in_count = 0;
>> +
>> + callers = node->callers;
>> + while (callers)
>> + {
>> + node->sum_in_count += callers->count;
>> + callers = callers->next_caller;
>> + }
>> +
>> + callees = node->callees;
>> + mod_id = get_module_idx_from_func_glob_uid (node->guid);
>> +
>> + while (callees)
>> + {
>> + if (!callees->callee->visited)
>> + gcov_process_cgraph_node (callees->callee,
>> + cutoff_count,
>> + import_scale);
>> + callees = callees->next_callee;
>> + }
>> +
>> + callees = node->callees;
>> + while (callees)
>> + {
>> + if (callees->count >= cutoff_count)
>> + {
>> + unsigned callee_mod_id;
>> + struct dyn_pointer_set *imp_modules
>> + = gcov_get_imp_module_set (node);
>> +
>> + callee_mod_id
>> + = get_module_idx_from_func_glob_uid (callees->callee->guid);
>> +
>> + double callee_mod_wt = (double) callees->count;
>> + if (callees->callee->imported_modules)
>> + {
>> + double scale = ((double) callees->count) /
>> + ((double) callees->callee->sum_in_count);
>> + /* Reduce weight if callee is in different module. */
>> + if (mod_id != callee_mod_id)
>> + scale = (scale * import_scale) / 100.0;
>> + pointer_set_traverse (callees->callee->imported_modules,
>> + gcov_propagate_imp_modules,
>> + imp_modules, &scale, &callee_mod_wt);
>> + }
>> + if (mod_id != callee_mod_id)
>> + {
>> + struct gcov_info *callee_mod_info
>> + = get_module_info (callee_mod_id);
>> + imp_mod_set_insert (imp_modules, callee_mod_info, callee_mod_wt);
>> + }
>> + }
>> +
>> + callees = callees->next_callee;
>> + }
>> +}
>> +
>> +/* Compute module grouping using CUTOFF_COUNT as the hot edge
>> + threshold. */
>> +
>> +#define DEFAULT_IMPORT_SCALE 100
>> +static void
>> +gcov_compute_module_groups (gcov_type cutoff_count)
>> +{
>> + unsigned m_ix;
>> + struct gcov_info *gi_ptr;
>> + const char *import_scale_str;
>> + unsigned import_scale = DEFAULT_IMPORT_SCALE;
>> +
>> + import_scale_str = getenv ("GCOV_DYN_IMPORT_SCALE");
>> + if (import_scale_str && strlen (import_scale_str))
>> + import_scale = atoi (import_scale_str);
>> +
>> + for (m_ix = 0; m_ix < the_dyn_call_graph.num_modules; m_ix++)
>> + {
>> + const struct gcov_fn_info *fi_ptr;
>> + unsigned f_ix;
>> +
>> + gi_ptr = the_dyn_call_graph.modules[m_ix];
>> +
>> + for (f_ix = 0; f_ix < gi_ptr->n_functions; f_ix++)
>> + {
>> + struct dyn_cgraph_node *node;
>> +
>> + fi_ptr = the_dyn_call_graph.functions[m_ix][f_ix];
>> + node = *(pointer_set_find_or_insert
>> + (the_dyn_call_graph.call_graph_nodes[m_ix], fi_ptr->ident));
>> + gcc_assert (node);
>> + if (node->visited)
>> + continue;
>> +
>> + gcov_process_cgraph_node (node, cutoff_count, import_scale);
>> + }
>> + }
>> +
>> + for (m_ix = 0; m_ix < the_dyn_call_graph.num_modules; m_ix++)
>> + {
>> + const struct gcov_fn_info *fi_ptr;
>> + unsigned f_ix;
>> +
>> + gi_ptr = the_dyn_call_graph.modules[m_ix];
>> +
>> + for (f_ix = 0; f_ix < gi_ptr->n_functions; f_ix++)
>> + {
>> + struct dyn_cgraph_node *node;
>> + unsigned mod_id;
>> + struct dyn_pointer_set *imp_modules;
>> +
>> + fi_ptr = the_dyn_call_graph.functions[m_ix][f_ix];
>> + node = *(pointer_set_find_or_insert
>> + (the_dyn_call_graph.call_graph_nodes[m_ix], fi_ptr->ident));
>> + gcc_assert (node);
>> +
>> + if (!node->imported_modules)
>> + continue;
>> +
>> + mod_id = get_module_idx_from_func_glob_uid (node->guid);
>> + gcc_assert (mod_id == m_ix);
>> +
>> + imp_modules
>> + = gcov_get_module_imp_module_set (
>> + &the_dyn_call_graph.sup_modules[mod_id]);
>> +
>> + pointer_set_traverse (node->imported_modules,
>> + gcov_propagate_imp_modules,
>> + imp_modules, 0, 0);
>> + }
>> + }
>> +
>> + /* Now compute the export attribute */
>> + for (m_ix = 0; m_ix < the_dyn_call_graph.num_modules; m_ix++)
>> + {
>> + struct dyn_module_info *mi
>> + = &the_dyn_call_graph.sup_modules[m_ix];
>> + if (mi->imported_modules)
>> + pointer_set_traverse (mi->imported_modules,
>> + gcov_mark_export_modules, 0, 0, 0);
>> + }
>> +}
>> +
>> +/* For each module, compute at random, the group of imported modules,
>> + that is of size at most MAX_GROUP_SIZE. */
>> +
>> +static void
>> +gcov_compute_random_module_groups (unsigned max_group_size)
>> +{
>> + unsigned m_ix;
>> +
>> + if (max_group_size > the_dyn_call_graph.num_modules)
>> + max_group_size = the_dyn_call_graph.num_modules;
>> +
>> + for (m_ix = 0; m_ix < the_dyn_call_graph.num_modules; m_ix++)
>> + {
>> + struct dyn_pointer_set *imp_modules =
>> + gcov_get_module_imp_module_set (&the_dyn_call_graph.sup_modules[m_ix]);
>> + int cur_group_size = random () % max_group_size;
>> + int i = 0;
>> + while (i < cur_group_size)
>> + {
>> + struct gcov_info *imp_mod_info;
>> + unsigned mod_id = random () % the_dyn_call_graph.num_modules;
>> + if (mod_id == m_ix)
>> + continue;
>> + imp_mod_info = get_module_info (mod_id);
>> + if (!imp_mod_set_insert (imp_modules, imp_mod_info, 1.0))
>> + i++;
>> + }
>> + }
>> +
>> + /* Now compute the export attribute */
>> + for (m_ix = 0; m_ix < the_dyn_call_graph.num_modules; m_ix++)
>> + {
>> + struct dyn_module_info *mi
>> + = &the_dyn_call_graph.sup_modules[m_ix];
>> + if (mi->imported_modules)
>> + pointer_set_traverse (mi->imported_modules,
>> + gcov_mark_export_modules, 0, 0, 0);
>> + }
>> +}
>> +
>> +/* Write out MOD_INFO into the gcda file. IS_PRIMARY is a flag
>> + indicating if the module is the primary module in the group. */
>> +
>> +static void
>> +gcov_write_module_info (const struct gcov_info *mod_info,
>> + unsigned is_primary)
>> +{
>> + gcov_unsigned_t len = 0, filename_len = 0, src_filename_len = 0, i, j;
>> + gcov_unsigned_t num_strings;
>> + gcov_unsigned_t *aligned_fname;
>> + struct gcov_module_info *module_info = mod_info->mod_info;
>> + filename_len = (strlen (module_info->da_filename) +
>> + sizeof (gcov_unsigned_t)) / sizeof (gcov_unsigned_t);
>> + src_filename_len = (strlen (module_info->source_filename) +
>> + sizeof (gcov_unsigned_t)) / sizeof (gcov_unsigned_t);
>> + len = filename_len + src_filename_len;
>> + len += 2; /* each name string is led by a length. */
>> +
>> + num_strings = module_info->num_quote_paths + module_info->num_bracket_paths +
>> + module_info->num_cpp_defines + module_info->num_cpp_includes +
>> + module_info->num_cl_args;
>> + for (i = 0; i < num_strings; i++)
>> + {
>> + gcov_unsigned_t string_len
>> + = (strlen (module_info->string_array[i]) + sizeof (gcov_unsigned_t))
>> + / sizeof (gcov_unsigned_t);
>> + len += string_len;
>> + len += 1; /* Each string is lead by a length. */
>> + }
>> +
>> + len += 9; /* 9 more fields */
>> +
>> + gcov_write_tag_length (GCOV_TAG_MODULE_INFO, len);
>> + gcov_write_unsigned (module_info->ident);
>> + gcov_write_unsigned (is_primary);
>> + gcov_write_unsigned (module_info->is_exported);
>> + gcov_write_unsigned (module_info->lang);
>> + gcov_write_unsigned (module_info->num_quote_paths);
>> + gcov_write_unsigned (module_info->num_bracket_paths);
>> + gcov_write_unsigned (module_info->num_cpp_defines);
>> + gcov_write_unsigned (module_info->num_cpp_includes);
>> + gcov_write_unsigned (module_info->num_cl_args);
>> +
>> + /* Now write the filenames */
>> + aligned_fname = (gcov_unsigned_t *) alloca ((filename_len + src_filename_len + 2) *
>> + sizeof (gcov_unsigned_t));
>> + memset (aligned_fname, 0,
>> + (filename_len + src_filename_len + 2) * sizeof (gcov_unsigned_t));
>> + aligned_fname[0] = filename_len;
>> + strcpy ((char*) (aligned_fname + 1), module_info->da_filename);
>> + aligned_fname[filename_len + 1] = src_filename_len;
>> + strcpy ((char*) (aligned_fname + filename_len + 2), module_info->source_filename);
>> +
>> + for (i = 0; i < (filename_len + src_filename_len + 2); i++)
>> + gcov_write_unsigned (aligned_fname[i]);
>> +
>> + /* Now write the string array. */
>> + for (j = 0; j < num_strings; j++)
>> + {
>> + gcov_unsigned_t *aligned_string;
>> + gcov_unsigned_t string_len =
>> + (strlen (module_info->string_array[j]) + sizeof (gcov_unsigned_t)) /
>> + sizeof (gcov_unsigned_t);
>> + aligned_string = (gcov_unsigned_t *)
>> + alloca ((string_len + 1) * sizeof (gcov_unsigned_t));
>> + memset (aligned_string, 0, (string_len + 1) * sizeof (gcov_unsigned_t));
>> + aligned_string[0] = string_len;
>> + strcpy ((char*) (aligned_string + 1), module_info->string_array[j]);
>> + for (i = 0; i < (string_len + 1); i++)
>> + gcov_write_unsigned (aligned_string[i]);
>> + }
>> +}
>> +
>> +/* Write out MOD_INFO and its imported modules into gcda file. */
>> +
>> +void
>> +gcov_write_module_infos (struct gcov_info *mod_info)
>> +{
>> + unsigned mod_id, imp_len = 0;
>> + const struct dyn_imp_mod **imp_mods;
>> +
>> + mod_id = get_module_idx (mod_info);
>> + gcov_write_module_info (mod_info, 1);
>> +
>> + imp_mods = gcov_get_sorted_import_module_array (mod_info, &imp_len);
>> + if (imp_mods)
>> + {
>> + unsigned i;
>> +
>> + for (i = 0; i < imp_len; i++)
>> + {
>> + const struct gcov_info *imp_mod = imp_mods[i]->imp_mod;
>> + gcov_write_module_info (imp_mod, 0);
>> + }
>> + free (imp_mods);
>> + }
>> +}
>> +
>> +/* Compute module groups needed for L-IPO compilation. */
>> +
>> +void
>> +__gcov_compute_module_groups (void)
>> +{
>> + gcov_type cut_off_count;
>> + char *seed = getenv ("LIPO_RANDOM_GROUPING");
>> + char *max_group_size = seed ? strchr (seed, ':') : 0;
>> +
>> + if (seed && max_group_size)
>> + {
>> + *max_group_size = '\0';
>> + max_group_size++;
>> + srandom (atoi (seed));
>> + init_dyn_call_graph ();
>> + gcov_compute_random_module_groups (atoi (max_group_size));
>> + return;
>> + }
>> +
>> + /* First compute dynamic call graph. */
>> + gcov_build_callgraph ();
>> +
>> + cut_off_count = gcov_compute_cutoff_count ();
>> +
>> + gcov_compute_module_groups (cut_off_count);
>> +
>> + gcov_dump_callgraph (cut_off_count);
>> +
>> +}
>> +
>> +/* Dumper function for NODE. */
>> +static void
>> +gcov_dump_cgraph_node_short (struct dyn_cgraph_node *node)
>> +{
>> + unsigned mod_id, func_id;
>> + struct gcov_info *mod_info;
>> + mod_id = get_module_idx_from_func_glob_uid (node->guid);
>> + func_id = get_intra_module_func_id (node->guid);
>> +
>> + mod_info = the_dyn_call_graph.modules[mod_id];
>> +
>> + fprintf (stderr, "NODE(%llx) module(%s) func(%u)",
>> + (long long)node->guid,
>> + mod_info->mod_info->source_filename, func_id);
>> +}
>> +
>> +/* Dumper function for NODE. M is the module id and F is the function id. */
>> +
>> +static void
>> +gcov_dump_cgraph_node (struct dyn_cgraph_node *node, unsigned m, unsigned f)
>> +{
>> + unsigned mod_id, func_id;
>> + struct gcov_info *mod_info;
>> + struct dyn_cgraph_edge *callers;
>> + struct dyn_cgraph_edge *callees;
>> +
>> + mod_id = get_module_idx_from_func_glob_uid (node->guid);
>> + func_id = get_intra_module_func_id (node->guid);
>> + gcc_assert (mod_id == m && func_id == f);
>> +
>> + mod_info = the_dyn_call_graph.modules[mod_id];
>> +
>> + fprintf (stderr, "NODE(%llx) module(%s) func(%x)\n",
>> + (long long) node->guid,
>> + mod_info->mod_info->source_filename, f);
>> +
>> + /* Now dump callers. */
>> + callers = node->callers;
>> + fprintf (stderr, "\t[CALLERS]\n");
>> + while (callers != 0)
>> + {
>> + fprintf (stderr,"\t\t[count=%ld] ", (long) callers->count);
>> + gcov_dump_cgraph_node_short (callers->caller);
>> + fprintf (stderr,"\n");
>> + callers = callers->next_caller;
>> + }
>> +
>> + callees = node->callees;
>> + fprintf (stderr, "\t[CALLEES]\n");
>> + while (callees != 0)
>> + {
>> + fprintf (stderr,"\t\t[count=%ld] ", (long) callees->count);
>> + gcov_dump_cgraph_node_short (callees->callee);
>> + fprintf (stderr,"\n");
>> + callees = callees->next_callee;
>> + }
>> +}
>> +
>> +/* Dumper function for NODE. M is the module id and F is the function id. */
>> +
>> +static void
>> +gcov_dump_cgraph_node_dot (struct dyn_cgraph_node *node,
>> + unsigned m, unsigned f,
>> + gcov_type cutoff_count)
>> +{
>> + unsigned mod_id, func_id, imp_len = 0, i;
>> + struct gcov_info *mod_info;
>> + const struct dyn_imp_mod **imp_mods;
>> + struct dyn_cgraph_edge *callees;
>> +
>> + mod_id = get_module_idx_from_func_glob_uid (node->guid);
>> + func_id = get_intra_module_func_id (node->guid);
>> + gcc_assert (mod_id == m && func_id == f);
>> +
>> + mod_info = the_dyn_call_graph.modules[mod_id];
>> +
>> + fprintf (stderr, "NODE_%llx[label=\"MODULE\\n(%s)\\n FUNC(%x)\\n",
>> + (long long) node->guid, mod_info->mod_info->source_filename, f);
>> +
>> + imp_mods = gcov_get_sorted_import_module_array (mod_info, &imp_len);
>> + fprintf (stderr, "IMPORTS:\\n");
>> + if (imp_mods)
>> + {
>> + for (i = 0; i < imp_len; i++)
>> + fprintf (stderr, "%s\\n", imp_mods[i]->imp_mod->mod_info->source_filename);
>> + fprintf (stderr, "\"]\n");
>> + free (imp_mods);
>> + }
>> + else
>> + fprintf (stderr, "\"]\n");
>> +
>> + callees = node->callees;
>> + while (callees != 0)
>> + {
>> + if (callees->count >= cutoff_count)
>> + fprintf (stderr, "NODE_%llx -> NODE_%llx[label=%lld color=red]\n",
>> + (long long) node->guid, (long long) callees->callee->guid,
>> + (long long) callees->count);
>> + else
>> + fprintf (stderr, "NODE_%llx -> NODE_%llx[label=%lld color=blue]\n",
>> + (long long) node->guid, (long long) callees->callee->guid,
>> + (long long) callees->count);
>> + callees = callees->next_callee;
>> + }
>> +}
>> +
>> +/* Dump dynamic call graph. CUTOFF_COUNT is the computed hot edge threshold. */
>> +
>> +static void
>> +gcov_dump_callgraph (gcov_type cutoff_count)
>> +{
>> + struct gcov_info *gi_ptr;
>> + unsigned m_ix;
>> + const char *dyn_cgraph_dump = 0;
>> +
>> + dyn_cgraph_dump = getenv ("GCOV_DYN_CGRAPH_DUMP");
>> +
>> + if (!dyn_cgraph_dump || !strlen (dyn_cgraph_dump))
>> + return;
>> +
>> + fprintf (stderr,"digraph dyn_call_graph {\n");
>> + fprintf (stderr,"node[shape=box]\nsize=\"11,8.5\"\n");
>> +
>> + for (m_ix = 0; m_ix < the_dyn_call_graph.num_modules; m_ix++)
>> + {
>> + const struct gcov_fn_info *fi_ptr;
>> + unsigned f_ix;
>> +
>> + gi_ptr = the_dyn_call_graph.modules[m_ix];
>> +
>> + for (f_ix = 0; f_ix < gi_ptr->n_functions; f_ix++)
>> + {
>> + struct dyn_cgraph_node *node;
>> + fi_ptr = the_dyn_call_graph.functions[m_ix][f_ix];
>> +
>> + node = *(pointer_set_find_or_insert
>> + (the_dyn_call_graph.call_graph_nodes[m_ix], fi_ptr->ident));
>> + gcc_assert (node);
>> +
>> + /* skip dead functions */
>> + if (!node->callees && !node->callers)
>> + continue;
>> +
>> + if (dyn_cgraph_dump[0] == '1')
>> + gcov_dump_cgraph_node (node, m_ix, fi_ptr->ident);
>> + else
>> + gcov_dump_cgraph_node_dot (node, m_ix, fi_ptr->ident,
>> + cutoff_count);
>> + }
>> + }
>> + fprintf (stderr,"}\n");
>> +}
>> +
>> +
>> +#endif
>> Index: dbgcnt.def
>> ===================================================================
>> --- dbgcnt.def (revision 172880)
>> +++ dbgcnt.def (working copy)
>> @@ -141,10 +141,12 @@ echo ubound: $ub
>> */
>>
>> /* Debug counter definitions. */
>> +DEBUG_COUNTER (alias)
>> DEBUG_COUNTER (auto_inc_dec)
>> DEBUG_COUNTER (ccp)
>> DEBUG_COUNTER (cfg_cleanup)
>> DEBUG_COUNTER (cse2_move2add)
>> +DEBUG_COUNTER (clone)
>> DEBUG_COUNTER (cprop)
>> DEBUG_COUNTER (dce)
>> DEBUG_COUNTER (dce_fast)
>> @@ -165,6 +167,7 @@ DEBUG_COUNTER (if_conversion)
>> DEBUG_COUNTER (if_conversion_tree)
>> DEBUG_COUNTER (if_after_combine)
>> DEBUG_COUNTER (if_after_reload)
>> +DEBUG_COUNTER (inl)
>> DEBUG_COUNTER (local_alloc_for_sched)
>> DEBUG_COUNTER (postreload_cse)
>> DEBUG_COUNTER (pre)
>> Index: c-family/c-opts.c
>> ===================================================================
>> --- c-family/c-opts.c (revision 172880)
>> +++ c-family/c-opts.c (working copy)
>> @@ -38,6 +38,9 @@ along with GCC; see the file COPYING3.
>> #include "mkdeps.h"
>> #include "target.h" /* For gcc_targetcm. */
>> #include "tm_p.h" /* For C_COMMON_OVERRIDE_OPTIONS. */
>> +#include "function.h"
>> +#include "params.h"
>> +#include "l-ipo.h"
>>
>> #ifndef DOLLARS_IN_IDENTIFIERS
>> # define DOLLARS_IN_IDENTIFIERS true
>> @@ -364,6 +367,7 @@ c_common_handle_option (size_t scode, co
>> warn_missing_braces = value;
>> warn_parentheses = value;
>> warn_return_type = value;
>> + warn_ripa_opt_mismatch = value;
>> warn_sequence_point = value; /* Was C only. */
>> warn_switch = value;
>> if (warn_strict_aliasing == -1)
>> @@ -864,6 +868,10 @@ c_common_post_options (const char **pfil
>> else if (!flag_gnu89_inline && !flag_isoc99)
>> error ("-fno-gnu89-inline is only supported in GNU99 or C99 mode");
>>
>> + if (flag_dyn_ipa && cpp_opts->preprocessed)
>> + error ("-fpreprocessed/-save-temps are not supported with -fripa");
>> +
>> +
>> /* Default to ObjC sjlj exception handling if NeXT runtime. */
>> if (flag_objc_sjlj_exceptions < 0)
>> flag_objc_sjlj_exceptions = flag_next_runtime;
>> @@ -1061,6 +1069,28 @@ c_common_init (void)
>> return true;
>> }
>>
>> +/* Return TRUE if the lipo maximum memory consumption limit is reached, and
>> + we should not import any further auxiliary modules. Check after parsing
>> + each module, the Ith module being the just parsed module. */
>> +static bool
>> +lipo_max_mem_reached (unsigned int i)
>> +{
>> + if (L_IPO_COMP_MODE && PARAM_VALUE (PARAM_MAX_LIPO_MEMORY)
>> + && i < (num_in_fnames - 1)
>> + && ((ggc_total_allocated () >> 10)
>> + > (size_t) PARAM_VALUE (PARAM_MAX_LIPO_MEMORY))) {
>> + i++;
>> + do {
>> + inform (input_location, "Not importing %s: maximum memory "
>> + "consumption reached", in_fnames[i]);
>> + i++;
>> + } while (i < num_in_fnames);
>> + return true;
>> + }
>> + return false;
>> +}
>> +
>> +
>> /* Initialize the integrated preprocessor after debug output has been
>> initialized; loop over each input file. */
>> void
>> @@ -1073,8 +1103,15 @@ c_common_parse_file (void)
>> {
>> c_finish_options ();
>> pch_init ();
>> + set_lipo_c_parsing_context (parse_in, i, verbose);
>> push_file_scope ();
>> c_parse_file ();
>> + /* In lipo mode, processing too many auxiliary files will cause us
>> + to hit memory limits, and cause thrashing -- prevent this by not
>> + processing any further auxiliary modules if we reach a certain
>> + memory limit. */
>> + if (lipo_max_mem_reached (i))
>> + num_in_fnames = i + 1;
>> pop_file_scope ();
>> /* And end the main input file, if the debug writer wants it */
>> if (debug_hooks->start_end_main_source_file)
>> @@ -1083,6 +1120,7 @@ c_common_parse_file (void)
>> break;
>> cpp_undef_all (parse_in);
>> cpp_clear_file_cache (parse_in);
>> + deferred_count = 0;
>> this_input_filename
>> = cpp_read_main_file (parse_in, in_fnames[i]);
>> /* If an input file is missing, abandon further compilation.
>> @@ -1320,9 +1358,15 @@ c_finish_options (void)
>> struct deferred_opt *opt = &deferred_opts[i];
>>
>> if (opt->code == OPT_D)
>> - cpp_define (parse_in, opt->arg);
>> + {
>> + cpp_define (parse_in, opt->arg);
>> + coverage_note_define (opt->arg, true);
>> + }
>> else if (opt->code == OPT_U)
>> - cpp_undef (parse_in, opt->arg);
>> + {
>> + cpp_undef (parse_in, opt->arg);
>> + coverage_note_define (opt->arg, false);
>> + }
>> else if (opt->code == OPT_A)
>> {
>> if (opt->arg[0] == '-')
>> @@ -1345,6 +1389,7 @@ c_finish_options (void)
>> if (opt->code == OPT_imacros
>> && cpp_push_include (parse_in, opt->arg))
>> {
>> + coverage_note_include (opt->arg);
>> /* Disable push_command_line_include callback for now. */
>> include_cursor = deferred_count + 1;
>> cpp_scan_nooutput (parse_in);
>> @@ -1376,7 +1421,10 @@ push_command_line_include (void)
>>
>> if (!cpp_opts->preprocessed && opt->code == OPT_include
>> && cpp_push_include (parse_in, opt->arg))
>> - return;
>> + {
>> + coverage_note_include (opt->arg);
>> + return;
>> + }
>> }
>>
>> if (include_cursor == deferred_count)
>> Index: cgraph.c
>> ===================================================================
>> --- cgraph.c (revision 172880)
>> +++ cgraph.c (working copy)
>> @@ -98,6 +98,7 @@ The callgraph:
>> #include "rtl.h"
>> #include "ipa-utils.h"
>> #include "lto-streamer.h"
>> +#include "l-ipo.h"
>>
>> const char * const ld_plugin_symbol_resolution_names[]=
>> {
>> @@ -485,6 +486,44 @@ cgraph_create_node (void)
>> return node;
>> }
>>
>> +/* Add an entry in assembler hash for NODE. */
>> +void
>> +cgraph_add_assembler_hash_node (struct cgraph_node *node)
>> +{
>> + if (assembler_name_hash)
>> + {
>> + void **aslot;
>> + tree name = DECL_ASSEMBLER_NAME (node->decl);
>> +
>> + aslot = htab_find_slot_with_hash (assembler_name_hash, name,
>> + decl_assembler_name_hash (name),
>> + INSERT);
>> + /* We can have multiple declarations with same assembler name. For C++
>> + it is __builtin_strlen and strlen, for instance. Do we need to
>> + record them all? Original implementation marked just first one
>> + so lets hope for the best. */
>> + if (*aslot == NULL)
>> + *aslot = node;
>> + }
>> +}
>> +
>> +/* Remove from assembler hash for NODE. */
>> +void
>> +cgraph_remove_assembler_hash_node (struct cgraph_node *node)
>> +{
>> + void **slot;
>> + if (assembler_name_hash)
>> + {
>> + tree name = DECL_ASSEMBLER_NAME (node->decl);
>> + slot = htab_find_slot_with_hash (assembler_name_hash, name,
>> + decl_assembler_name_hash (name),
>> + NO_INSERT);
>> + /* Inline clones are not hashed. */
>> + if (slot && *slot == node)
>> + htab_clear_slot (assembler_name_hash, slot);
>> + }
>> +}
>> +
>> /* Return cgraph node assigned to DECL. Create new one when needed. */
>>
>> struct cgraph_node *
>> @@ -518,21 +557,7 @@ cgraph_node (tree decl)
>> node->next_nested = node->origin->nested;
>> node->origin->nested = node;
>> }
>> - if (assembler_name_hash)
>> - {
>> - void **aslot;
>> - tree name = DECL_ASSEMBLER_NAME (decl);
>> -
>> - aslot = htab_find_slot_with_hash (assembler_name_hash, name,
>> - decl_assembler_name_hash (name),
>> - INSERT);
>> - /* We can have multiple declarations with same assembler name. For C++
>> - it is __builtin_strlen and strlen, for instance. Do we need to
>> - record them all? Original implementation marked just first one
>> - so lets hope for the best. */
>> - if (*aslot == NULL)
>> - *aslot = node;
>> - }
>> + cgraph_add_assembler_hash_node (node);
>> return node;
>> }
>>
>> @@ -830,7 +855,9 @@ cgraph_edge (struct cgraph_node *node, g
>> {
>> node->call_site_hash = htab_create_ggc (120, edge_hash, edge_eq, NULL);
>> for (e2 = node->callees; e2; e2 = e2->next_callee)
>> - cgraph_add_edge_to_call_site_hash (e2);
>> + /* Skip fake edges. */
>> + if (e2->call_stmt)
>> + cgraph_add_edge_to_call_site_hash (e2);
>> for (e2 = node->indirect_calls; e2; e2 = e2->next_callee)
>> cgraph_add_edge_to_call_site_hash (e2);
>> }
>> @@ -1139,7 +1166,7 @@ cgraph_edge_remove_caller (struct cgraph
>> else
>> e->caller->callees = e->next_callee;
>> }
>> - if (e->caller->call_site_hash)
>> + if (e->caller->call_site_hash && e->call_stmt)
>> htab_remove_elt_with_hash (e->caller->call_site_hash,
>> e->call_stmt,
>> htab_hash_pointer (e->call_stmt));
>> @@ -1178,6 +1205,26 @@ cgraph_remove_edge (struct cgraph_edge *
>> cgraph_free_edge (e);
>> }
>>
>> +/* Remove fake cgraph edges for indirect calls. NODE is the callee
>> + of the edges. */
>> +
>> +void
>> +cgraph_remove_fake_indirect_call_in_edges (struct cgraph_node *node)
>> +{
>> + struct cgraph_edge *f, *e;
>> +
>> + if (!L_IPO_COMP_MODE)
>> + return;
>> +
>> + for (e = node->callers; e; e = f)
>> + {
>> + f = e->next_caller;
>> + if (!e->call_stmt)
>> + cgraph_remove_edge (e);
>> + }
>> +}
>> +
>> +
>> /* Set callee of call graph edge E and add it to the corresponding set of
>> callers. */
>>
>> @@ -1400,6 +1447,8 @@ cgraph_node_remove_callers (struct cgrap
>> void
>> cgraph_release_function_body (struct cgraph_node *node)
>> {
>> + if (cgraph_is_aux_decl_external (node))
>> + DECL_EXTERNAL (node->decl) = 1;
>> if (DECL_STRUCT_FUNCTION (node->decl))
>> {
>> tree old_decl = current_function_decl;
>> @@ -1667,19 +1716,14 @@ cgraph_remove_node (struct cgraph_node *
>> || n->in_other_partition)))
>> kill_body = true;
>> }
>> - if (assembler_name_hash)
>> - {
>> - tree name = DECL_ASSEMBLER_NAME (node->decl);
>> - slot = htab_find_slot_with_hash (assembler_name_hash, name,
>> - decl_assembler_name_hash (name),
>> - NO_INSERT);
>> - /* Inline clones are not hashed. */
>> - if (slot && *slot == node)
>> - htab_clear_slot (assembler_name_hash, slot);
>> - }
>> +
>> + cgraph_remove_assembler_hash_node (node);
>>
>> if (kill_body)
>> cgraph_release_function_body (node);
>> +
>> + cgraph_remove_link_node (node);
>> +
>> node->decl = NULL;
>> if (node->call_site_hash)
>> {
>> @@ -1724,7 +1768,8 @@ cgraph_mark_reachable_node (struct cgrap
>> during the optimization process. This can happen for extern
>> inlines when bodies was removed after inlining. */
>> gcc_assert ((node->analyzed || node->in_other_partition
>> - || DECL_EXTERNAL (node->decl)));
>> + || DECL_EXTERNAL (node->decl)
>> + || cgraph_is_aux_decl_external (node)));
>> }
>> else
>> notice_global_symbol (node->decl);
>> @@ -2190,6 +2235,7 @@ cgraph_clone_node (struct cgraph_node *n
>> new_node->global = n->global;
>> new_node->rtl = n->rtl;
>> new_node->count = count;
>> + new_node->is_versioned_clone = n->is_versioned_clone;
>> new_node->frequency = n->frequency;
>> new_node->clone = n->clone;
>> new_node->clone.tree_map = 0;
>> @@ -2239,17 +2285,8 @@ cgraph_clone_node (struct cgraph_node *n
>> slot = (struct cgraph_node **) htab_find_slot (cgraph_hash, new_node, INSERT);
>> gcc_assert (!*slot);
>> *slot = new_node;
>> - if (assembler_name_hash)
>> - {
>> - void **aslot;
>> - tree name = DECL_ASSEMBLER_NAME (decl);
>> -
>> - aslot = htab_find_slot_with_hash (assembler_name_hash, name,
>> - decl_assembler_name_hash (name),
>> - INSERT);
>> - gcc_assert (!*aslot);
>> - *aslot = new_node;
>> - }
>> + cgraph_add_assembler_hash_node (new_node);
>> + cgraph_link_node (new_node);
>> }
>> return new_node;
>> }
>> @@ -2379,6 +2416,7 @@ cgraph_create_virtual_clone (struct cgra
>> new_node->clone.combined_args_to_skip = args_to_skip;
>> new_node->local.externally_visible = 0;
>> new_node->local.local = 1;
>> + new_node->is_versioned_clone = 1;
>> new_node->lowered = true;
>> new_node->reachable = true;
>>
>> Index: cgraph.h
>> ===================================================================
>> --- cgraph.h (revision 172880)
>> +++ cgraph.h (working copy)
>> @@ -286,6 +286,8 @@ struct GTY((chain_next ("%h.next"), chai
>> unsigned alias : 1;
>> /* Set for nodes that was constructed and finalized by frontend. */
>> unsigned finalized_by_frontend : 1;
>> + /* Is this function cloned during versioning ? */
>> + unsigned is_versioned_clone : 1;
>> /* Set for alias and thunk nodes, same_body points to the node they are alias
>> of and they are linked through the next/previous pointers. */
>> unsigned same_body_alias : 1;
>> @@ -475,6 +477,8 @@ struct GTY((chain_next ("%h.next"), chai
>> int order;
>> enum ld_plugin_symbol_resolution resolution;
>>
>> + /* The module in which it is first declared. */
>> + unsigned module_id;
>> /* Set when function must be output - it is externally visible
>> or its address is taken. */
>> unsigned needed : 1;
>> @@ -547,6 +551,11 @@ void debug_cgraph_node (struct cgraph_no
>> void cgraph_insert_node_to_hashtable (struct cgraph_node *node);
>> void cgraph_remove_edge (struct cgraph_edge *);
>> void cgraph_remove_node (struct cgraph_node *);
>> +void cgraph_add_assembler_hash_node (struct cgraph_node *);
>> +void cgraph_remove_assembler_hash_node (struct cgraph_node *);
>> +void cgraph_remove_fake_indirect_call_in_edges (struct cgraph_node *);
>> +extern bool cgraph_pre_profiling_inlining_done;
>> +extern bool cgraph_is_fake_indirect_call_edge (struct cgraph_edge *e);
>> void cgraph_remove_node_and_inline_clones (struct cgraph_node *);
>> void cgraph_release_function_body (struct cgraph_node *);
>> void cgraph_node_remove_callees (struct cgraph_node *node);
>> @@ -641,6 +650,48 @@ struct cgraph_node *save_inline_function
>> void record_references_in_initializer (tree, bool);
>> bool cgraph_process_new_functions (void);
>>
>> +/* Module info structure. */
>> +struct GTY (()) cgraph_mod_info
>> +{
>> + unsigned module_id;
>> +};
>> +
>> +/* LIPO linker symbol table entry for function symbols. */
>> +struct GTY (()) cgraph_sym
>> +{
>> + tree assembler_name;
>> + struct cgraph_node *rep_node;
>> + tree rep_decl;
>> + htab_t GTY ((param_is (struct cgraph_mod_info))) def_module_hash;
>> + bool is_promoted_static;
>> +};
>> +
>> +void cgraph_init_gid_map (void);
>> +void cgraph_add_fake_indirect_call_edges (void);
>> +void cgraph_remove_zero_count_fake_edges (void);
>> +void cgraph_do_link (void);
>> +struct cgraph_sym *cgraph_link_node (struct cgraph_node *);
>> +tree cgraph_find_decl (tree asm_name);
>> +void cgraph_remove_link_node (struct cgraph_node *node);
>> +struct cgraph_node *cgraph_lipo_get_resolved_node (tree decl);
>> +struct cgraph_node *cgraph_lipo_get_resolved_node_1 (tree decl, bool);
>> +unsigned cgraph_get_module_id (tree fndecl);
>> +bool cgraph_is_auxiliary (tree fndecl);
>> +void cgraph_process_module_scope_statics (void);
>> +bool cgraph_is_promoted_static_func (tree fndecl);
>> +bool cgraph_is_inline_body_available_in_module (tree fndecl, unsigned module_id);
>> +bool cgraph_is_aux_decl_external (struct cgraph_node *);
>> +void cgraph_unify_type_alias_sets (void);
>> +void varpool_do_link (void);
>> +void varpool_link_node (struct varpool_node *);
>> +void varpool_remove_link_node (struct varpool_node *node);
>> +struct varpool_node *real_varpool_node (tree decl);
>> +bool varpool_is_auxiliary (struct varpool_node *node);
>> +void varpool_get_referenced_asm_ids (VEC(tree, gc) **);
>> +void varpool_clear_asm_id_reference_bit (void);
>> +void varpool_reset_queue (void);
>> +void varpool_remove_duplicate_weak_decls (void);
>> +
>> bool cgraph_decide_is_function_needed (struct cgraph_node *, tree);
>>
>> typedef void (*cgraph_edge_hook)(struct cgraph_edge *, void *);
>> Index: value-prof.c
>> ===================================================================
>> --- value-prof.c (revision 172880)
>> +++ value-prof.c (working copy)
>> @@ -46,6 +46,9 @@ along with GCC; see the file COPYING3.
>> #include "timevar.h"
>> #include "tree-pass.h"
>> #include "pointer-set.h"
>> +#include "langhooks.h"
>> +#include "params.h"
>> +#include "l-ipo.h"
>>
>> /* In this file value profile based optimizations are placed. Currently the
>> following optimizations are implemented (for more detailed descriptions
>> @@ -300,6 +303,10 @@ dump_histogram_value (FILE *dump_file, h
>> }
>> fprintf (dump_file, ".\n");
>> break;
>> + case HIST_TYPE_INDIR_CALL_TOPN:
>> + fprintf (dump_file, "Indirect call -- top N\n");
>> + /* TODO add more elaborate dumping code. */
>> + break;
>> }
>> }
>>
>> @@ -487,6 +494,70 @@ check_counter (gimple stmt, const char *
>> return false;
>> }
>>
>> +/* The overall number of invocations of the counter should match
>> + execution count of basic block. Report it as error rather than
>> + internal error as it might mean that user has misused the profile
>> + somehow. STMT is the indiret call, COUNT1 and COUNT2 are counts
>> + of two top targets, and ALL is the enclosing basic block execution
>> + count. */
>> +
>> +static bool
>> +check_ic_counter (gimple stmt, gcov_type *count1, gcov_type *count2,
>> + gcov_type all)
>> +{
>> + location_t locus;
>> + if (*count1 > all && flag_profile_correction)
>> + {
>> + locus = (stmt != NULL)
>> + ? gimple_location (stmt)
>> + : DECL_SOURCE_LOCATION (current_function_decl);
>> + inform (locus, "Correcting inconsistent value profile: "
>> + "ic (topn) profiler top target count (%ld) exceeds "
>> + "BB count (%ld)", (long)*count1, (long)all);
>> + *count1 = all;
>> + }
>> + if (*count2 > all && flag_profile_correction)
>> + {
>> + locus = (stmt != NULL)
>> + ? gimple_location (stmt)
>> + : DECL_SOURCE_LOCATION (current_function_decl);
>> + inform (locus, "Correcting inconsistent value profile: "
>> + "ic (topn) profiler second target count (%ld) exceeds "
>> + "BB count (%ld)", (long)*count2, (long)all);
>> + *count2 = all;
>> + }
>> +
>> + if (*count2 > *count1)
>> + {
>> + locus = (stmt != NULL)
>> + ? gimple_location (stmt)
>> + : DECL_SOURCE_LOCATION (current_function_decl);
>> + inform (locus, "Corrupted topn ic value profile: "
>> + "first target count (%ld) is less than the second "
>> + "target count (%ld)", (long)*count1, (long)*count2);
>> + return true;
>> + }
>> +
>> + if (*count1 + *count2 > all)
>> + {
>> + /* If (COUNT1 + COUNT2) is greater than ALL by less than around 10% then
>> + just fix COUNT2 up so that (COUNT1 + COUNT2) equals ALL. */
>> + if ((*count1 + *count2 - all) < (all >> 3))
>> + *count2 = all - *count1;
>> + else
>> + {
>> + locus = (stmt != NULL)
>> + ? gimple_location (stmt)
>> + : DECL_SOURCE_LOCATION (current_function_decl);
>> + inform (locus, "Corrupted topn ic value profile: top two targets's"
>> + " total count (%ld) exceeds bb count (%ld)",
>> + (long)(*count1 + *count2), (long)all);
>> + return true;
>> + }
>> + }
>> + return false;
>> +}
>> +
>>
>> /* GIMPLE based transformations. */
>>
>> @@ -542,6 +613,14 @@ gimple_value_profile_transformations (vo
>> if (changed)
>> {
>> counts_to_freqs ();
>> + /* Value profile transformations may change inline parameters
>> + a lot (e.g., indirect call promotion introduces new direct calls).
>> + The update is also needed to avoid compiler ICE -- when MULTI
>> + target icall promotion happens, the caller's size may become
>> + negative when the promoted direct calls get promoted. */
>> + /* Guard this for LIPO for now. */
>> + if (L_IPO_COMP_MODE)
>> + compute_inline_parameters (cgraph_node (current_function_decl));
>> }
>>
>> return changed;
>> @@ -1090,6 +1169,111 @@ find_func_by_pid (int pid)
>> return pid_map [pid];
>> }
>>
>> +/* Initialize map of gids (gid -> cgraph node) */
>> +
>> +static htab_t gid_map = NULL;
>> +
>> +typedef struct func_gid_entry
>> +{
>> + struct cgraph_node *node;
>> + unsigned HOST_WIDEST_INT gid;
>> +} func_gid_entry_t;
>> +
>> +/* Hash function for function global unique ids. */
>> +
>> +static hashval_t
>> +htab_gid_hash (const void * ent)
>> +{
>> + const func_gid_entry_t *const entry = (const func_gid_entry_t *) ent;
>> + return entry->gid;
>> +}
>> +
>> +/* Hash table equality function for function global unique ids. */
>> +
>> +static int
>> +htab_gid_eq (const void *ent1, const void * ent2)
>> +{
>> + const func_gid_entry_t *const entry1 = (const func_gid_entry_t *) ent1;
>> + const func_gid_entry_t *const entry2 = (const func_gid_entry_t *) ent2;
>> + return entry1->gid == entry2->gid;
>> +}
>> +
>> +static void
>> +htab_gid_del (void *ent)
>> +{
>> + func_gid_entry_t *const entry = (func_gid_entry_t *) ent;
>> + free (entry);
>> +}
>> +
>> +/* Initialize the global unique id map for functions. */
>> +
>> +static void
>> +init_gid_map (void)
>> +{
>> + struct cgraph_node *n;
>> +
>> + gcc_assert (!gid_map);
>> +
>> + gid_map
>> + = htab_create (10, htab_gid_hash, htab_gid_eq, htab_gid_del);
>> +
>> + for (n = cgraph_nodes; n; n = n->next)
>> + {
>> + func_gid_entry_t ent, *entp;
>> + func_gid_entry_t **slot;
>> + struct function *f;
>> + ent.node = n;
>> + f = DECL_STRUCT_FUNCTION (n->decl);
>> + /* Do not care to indirect call promote a function with id. */
>> + if (!f || DECL_ABSTRACT (n->decl))
>> + continue;
>> + /* The global function id computed at profile-use time
>> + is slightly different from the one computed in
>> + instrumentation runtime -- for the latter, the intra-
>> + module function ident is 1 based while in profile-use
>> + phase, it is zero based. See get_next_funcdef_no in
>> + function.c. */
>> + ent.gid = FUNC_DECL_GLOBAL_ID (DECL_STRUCT_FUNCTION (n->decl));
>> + slot = (func_gid_entry_t **) htab_find_slot (gid_map, &ent, INSERT);
>> +
>> + gcc_assert (!*slot || ((*slot)->gid == ent.gid && (*slot)->node == n));
>> + if (!*slot)
>> + {
>> + *slot = entp = XCNEW (func_gid_entry_t);
>> + entp->node = n;
>> + entp->gid = ent.gid;
>> + }
>> + }
>> +}
>> +
>> +/* Initialize the global unique id map for functions. */
>> +
>> +void
>> +cgraph_init_gid_map (void)
>> +{
>> + if (!L_IPO_COMP_MODE)
>> + return;
>> +
>> + init_gid_map ();
>> +}
>> +
>> +/* Return cgraph node for function with global id. */
>> +
>> +struct cgraph_node *
>> +find_func_by_global_id (unsigned HOST_WIDE_INT gid)
>> +{
>> + func_gid_entry_t ent, *entp;
>> +
>> + gcc_assert (gid_map);
>> +
>> + ent.node = NULL;
>> + ent.gid = gid;
>> + entp = (func_gid_entry_t *)htab_find (gid_map, &ent);
>> + if (entp)
>> + return entp->node;
>> + return NULL;
>> +}
>> +
>> /* Perform sanity check on the indirect call target. Due to race conditions,
>> false function target may be attributed to an indirect call site. If the
>> call expression type mismatches with the target function's type, expand_call
>> @@ -1244,27 +1428,13 @@ gimple_ic (gimple icall_stmt, struct cgr
>> */
>>
>> static bool
>> -gimple_ic_transform (gimple stmt)
>> +gimple_ic_transform_single_targ (gimple stmt, histogram_value histogram)
>> {
>> - histogram_value histogram;
>> gcov_type val, count, all, bb_all;
>> gcov_type prob;
>> - tree callee;
>> gimple modify;
>> struct cgraph_node *direct_call;
>>
>> - if (gimple_code (stmt) != GIMPLE_CALL)
>> - return false;
>> -
>> - callee = gimple_call_fn (stmt);
>> -
>> - if (TREE_CODE (callee) == FUNCTION_DECL)
>> - return false;
>> -
>> - histogram = gimple_histogram_value_of_type (cfun, stmt, HIST_TYPE_INDIR_CALL);
>> - if (!histogram)
>> - return false;
>> -
>> val = histogram->hvalue.counters [0];
>> count = histogram->hvalue.counters [1];
>> all = histogram->hvalue.counters [2];
>> @@ -1312,6 +1482,195 @@ gimple_ic_transform (gimple stmt)
>> return true;
>> }
>>
>> +/* Convert indirect function call STMT into guarded direct function
>> + calls. Multiple indirect call targets are supported. HISTOGRAM
>> + is the target distribution for the callsite. */
>> +
>> +static bool
>> +gimple_ic_transform_mult_targ (gimple stmt, histogram_value histogram)
>> +{
>> + gcov_type val1, val2, count1, count2, all, bb_all;
>> + gcov_type prob1, prob2;
>> + gimple modify1, modify2;
>> + struct cgraph_node *direct_call1 = 0, *direct_call2 = 0;
>> + int perc_threshold, count_threshold, always_inline;
>> + location_t locus;
>> +
>> + val1 = histogram->hvalue.counters [1];
>> + count1 = histogram->hvalue.counters [2];
>> + val2 = histogram->hvalue.counters [3];
>> + count2 = histogram->hvalue.counters [4];
>> + bb_all = gimple_bb (stmt)->count;
>> + all = bb_all;
>> +
>> + gimple_remove_histogram_value (cfun, stmt, histogram);
>> +
>> + if (count1 == 0)
>> + return false;
>> +
>> + perc_threshold = PARAM_VALUE (PARAM_ICALL_PROMOTE_PERCENT_THRESHOLD);
>> + count_threshold = PARAM_VALUE (PARAM_ICALL_PROMOTE_COUNT_THRESHOLD);
>> + always_inline = PARAM_VALUE (PARAM_ALWAYS_INLINE_ICALL_TARGET);
>> +
>> + if (100 * count1 < all * perc_threshold || count1 < count_threshold)
>> + return false;
>> +
>> + if (check_ic_counter (stmt, &count1, &count2, all))
>> + return false;
>> +
>> + if (all > 0)
>> + {
>> + prob1 = (count1 * REG_BR_PROB_BASE + all / 2) / all;
>> + if (all - count1 > 0)
>> + prob2 = (count2 * REG_BR_PROB_BASE
>> + + (all - count1) / 2) / (all - count1);
>> + else
>> + prob2 = 0;
>> + }
>> + else
>> + prob1 = prob2 = 0;
>> +
>> + direct_call1 = find_func_by_global_id (val1);
>> +
>> + if (val2 && (100 * count2 >= all * perc_threshold)
>> + && count2 > count_threshold)
>> + direct_call2 = find_func_by_global_id (val2);
>> +
>> + locus = (stmt != NULL) ? gimple_location (stmt)
>> + : DECL_SOURCE_LOCATION (current_function_decl);
>> + if (direct_call1 == NULL
>> + || !check_ic_target (stmt, direct_call1))
>> + {
>> + if (flag_ripa_verbose)
>> + {
>> + if (!direct_call1)
>> + inform (locus, "Can not find indirect call target decl "
>> + "(%d:%d)[cnt:%u] in current module",
>> + EXTRACT_MODULE_ID_FROM_GLOBAL_ID (val1),
>> + EXTRACT_FUNC_ID_FROM_GLOBAL_ID (val1), (unsigned) count1);
>> + else
>> + inform (locus,
>> + "Can not find promote indirect call target decl -- type mismatch "
>> + "(%d:%d)[cnt:%u] in current module",
>> + EXTRACT_MODULE_ID_FROM_GLOBAL_ID (val1),
>> + EXTRACT_FUNC_ID_FROM_GLOBAL_ID (val1), (unsigned) count1);
>> + }
>> + return false;
>> + }
>> +
>> + /* Don't indirect-call promote if the target is in auxiliary module and
>> + DECL_ARTIFICIAL and not TREE_PUBLIC, because we don't static-promote
>> + DECL_ARTIFICIALs yet. */
>> + if (cgraph_is_auxiliary (direct_call1->decl)
>> + && DECL_ARTIFICIAL (direct_call1->decl)
>> + && ! TREE_PUBLIC (direct_call1->decl))
>> + return false;
>> +
>> + modify1 = gimple_ic (stmt, direct_call1, prob1, count
^ permalink raw reply [flat|nested] only message in thread
only message in thread, other threads:[~2011-04-25 23:39 UTC | newest]
Thread overview: (only message) (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
[not found] <20110425232927.75B241B40B6@aples.mtv.corp.google.com>
[not found] ` <BANLkTimFO5r7qW891YT2jXfWUdTd+whX0w@mail.gmail.com>
2011-04-26 2:15 ` [google] Port LIPO support to google/main (issue4431065) Rong Xu
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).