Re: [google] Port LIPO support to google/main (issue4431065)

public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed

* Re: [google] Port LIPO support to google/main (issue4431065)
       [not found] ` <BANLkTimFO5r7qW891YT2jXfWUdTd+whX0w@mail.gmail.com>
@ 2011-04-26  2:15   ` Rong Xu
  0 siblings, 0 replies; only message in thread
From: Rong Xu @ 2011-04-26  2:15 UTC (permalink / raw)
  To: Xinliang David Li; +Cc: reply, gcc-patches

LGTM

On Mon, Apr 25, 2011 at 4:32 PM, Xinliang David Li <davidxl@google.com> wrote:
> Bootstrapped on x86_64/linux with all regression tests passing.
>
> Additional tests: SPEC2k, SPEC06 FDO and LIPO testing.
>
> Note -- there will be work in move LIPO away from FE based solution,
> this patch is needed to make the base for forward progress.
>
> David
>
> On Mon, Apr 25, 2011 at 4:29 PM, David Li <davidxl@google.com> wrote:
>> 2011-04-25  David Li  <davidxl@google.com>
>>
>>        * cgraphbuild.c (rebuild_cgraph_edges): Create edges based
>>        using resolved nodes in LIPO mode after ipa_tree_profiling.
>>        * dyn-ipa.c:  New file.
>>        * dbgcnt.def: New debug counters.
>>        * c-family/c-opts.c (c_common_post_options): New option.
>>        (c_common_parse_file): LIPO mode driver change.
>>        (push_command_line_include): New function.
>>        * cgraph.c (cgraph_mark_reachable_node): Handle aux module nodes.
>>        (cgraph_clone_node): New field initialization.
>>        (cgraph_create_virtual_clone): New field initialization.
>>        * value-prof.c (dump_histogram_value): New hist kind.
>>        (check_counter): New function.
>>        (gimple_value_profile_transformations): New ic kind.
>>        (gimple_ic_transform): Changed interface.
>>        (gimple_indirect_call_to_profile): New vpt kind.
>>        (gimple_find_values_to_profile): New vpt kind.
>>        * cgraphunit.c (cgraph_finalize_function): Handle LIPO mode.
>>        (verify_edge_count_and_frequency): Ditto.
>>        (verify_cgraph_node): Ditto.
>>        (cgraph_finalize_compilation_unit): Ditto.
>>        (cgraph_mark_functions_to_output): Ditto.
>>        (cgraph_expand_function): Ditto.
>>        (cgraph_output_in_order): Ditto.
>>        (cgraph_optimize): Ditto.
>>        (cgraph_copy_node_for_versioning): Ditto.
>>        (cgraph_materialize_clone): Ditto.
>>        (cgraph_redirect_edge_call_stmt_to_callee): Ditto.
>>        * cp/decl.c (walk_namespaces): Ditto.
>>        (duplicate_decls): Ditto.
>>        (make_rtl_for_nonlocal_decl): Ditto.
>>        * cp/rtti.c (get_tinfo_decl): Ditto.
>>        (create_pseudo_type_info): Ditto.
>>        (create_tinfo_types): Ditto.
>>        * cp/cp-lang.c (cp_classify_record): Ditto.
>>        * cp/pt.c (instantiate_decl): Ditto.
>>        * cp/semantics.c (emit_associated_thunks): Ditto.
>>        * cp/decl2.c (start_static_storage_duration_function): Ditto.
>>        (prune_vars_needing_no_initialization):  Ditto.
>>        (cxx_callgraph_analyze_expr): Ditto.
>>        (no_linkage_error): Ditto.
>>        (collect_all_refs): Ditto.
>>        (cp_write_global_declarations): Ditto.
>>        * cp/parser.c (pragma_lex): Ditto.
>>        * cp/cp-objcp-common.c (cp_function_decl_explicit_p): Ditto.
>>        * cp/mangle.c (mangle_conv_op_name_for_type): Ditto.
>>        * cp/name-lookup.c (add_decl_to_level): Ditto.
>>        * gcov-io.c (gcov_read_summary): Ditto.
>>        * tree-ssa-alias.c (same_type_for_tbaa): Ditto.
>>        (refs_may_alias_p_1): Ditto.
>>        (walk_aliased_vdefs): Ditto.
>>        * ipa-inline.c (cgraph_mark_inline_edge): Ditto.
>>        (cgraph_recursive_inlining_p): Ditto.
>>        (cgraph_edge_badness): Ditto.
>>        (update_caller_keys): Ditto.
>>        (cgraph_decide_recursive_inlining): Ditto.
>>        (cgraph_decide_inlining_of_small_function): Ditto.
>>        (cgraph_decide_inlining): Ditto.
>>        (cgraph_decide_inlining_incrementally): Ditto.
>>        (estimate_function_body_sizes): Ditto.
>>        * dwarf2out.c (dwarf2out_do_cfi_startproc): Ditto.
>>        (dwarf2out_begin_prologue): Ditto.
>>        (dwarf2out_end_epilogue): Ditto.
>>        (dw_loc_list): Ditto.
>>        (gen_subprogram_die): Ditto.
>>        (dwarf2out_source_line): Ditto.
>>        * c-decl.c (bind): Ditto.
>>        (pop_scope): Ditto. Ditto.
>>        (push_file_scope): Ditto.
>>        (pop_file_scope): Ditto.
>>        (finish_decl): Ditto.
>>        (c_write_global_declarations): Ditto.
>>        * langhooks.c (lhd_do_nothing_f): Ditto.
>>        (lhd_do_nothing_t_t_return_null_tree): Ditto.
>>        (lhd_return_null_tree_v): Ditto.
>>        (lhd_builtin_function): Ditto.
>>        * function.c (pop_cfun): Ditto.
>>        (allocate_struct_function): Ditto.
>>        * profile.c (instrument_values): Ditto.
>>        (is_edge_inconsistent): Ditto.
>>        * ipa.c (process_references): Ditto.
>>        (cgraph_remove_unreachable_nodes): Ditto.
>>        (function_and_variable_visibility):
>>        * c-typeck.c (tagged_types_tu_compatible_p): Ditto.
>>        * gimplify.c (gimplify_addr_expr): Ditto.
>>        * except.c (sjlj_emit_function_enter): Ditto.
>>        (output_one_function_exception_table): Ditto.
>>        * coverage.c (get_gcov_type): Ditto.
>>        (htab_counts_entry_del): Ditto.
>>        (read_counts_file): Ditto.
>>        (get_coverage_counts): Ditto.
>>        (coverage_checksum_string): Ditto.
>>        (coverage_begin_output): Ditto.
>>        (coverage_end_function): Ditto.
>>        (build_fn_info_type): Ditto.
>>        (build_fn_info_value): Ditto.
>>        (build_ctr_info_value): Ditto.
>>        (build_gcov_info): Ditto.
>>        (create_coverage): Ditto.
>>        (coverage_finish): Ditto.
>>        * tree-sra.c (convert_callers): Ditto.
>>        (modify_function): Ditto.
>>        (ipa_early_sra): Ditto.
>>        * varasm.c (notice_global_symbol): Ditto.
>>        (assemble_external): Ditto.
>>        (mark_decl_referenced): Ditto.
>>        (find_decl_and_mark_needed): Ditto.
>>        (finish_aliases_1): Ditto.
>>        (assemble_alias): Ditto.
>>        * tree-ssa.c (useless_type_conversion_p): Ditto.
>>        * tree-inline.c (copy_bb): Ditto.
>>        (copy_edges_for_bb): Ditto.
>>        (initialize_cfun): Ditto.
>>        (copy_cfg_body): Ditto.
>>        (tree_inlinable_function_p): Ditto.
>>        (optimize_inline_calls): Ditto.
>>        (copy_tree_r): Ditto.
>>        * tree-profile.c (init_ic_make_global_vars): Ditto.
>>        (gimple_init_edge_profiler): Ditto.
>>        (gimple_gen_ic_profiler): Ditto.
>>        (gimple_gen_ic_func_profiler): Ditto.
>>        (tree_profiling): Ditto.
>>        * opts-global.c (lang_handle_option): Ditto.
>>        (add_input_filename): Ditto.
>>        (read_cmdline_options): Ditto.
>>        * l-ipo.c: Ditto.
>>        * libgcov.c (gcov_version): Ditto.
>>        (gcov_exit): Ditto.
>>        (__gcov_init): Ditto.
>>        (__gcov_flush): Ditto.
>>        (__gcov_merge_ior): Ditto.
>>        (__gcov_one_value_profiler_body): Ditto.
>>        (__gcov_indirect_call_profiler): Ditto.
>>        * l-ipo.h: Ditto.
>>        * tree-cfg.c: (verify_types_in_gimple_reference): Ditto.
>>        (verify_gimple_call): Ditto.
>>        (verify_gimple_assign_single): Ditto.
>>        (verify_gimple_return): Ditto.
>>        (gimple_verify_flow_info): Ditto.
>>        * passes.c: (rest_of_decl_compilation): Ditto.
>>        (init_optimization_passes): Ditto.
>>        (pass_init_dump_file): Ditto.
>>        * varpool.c (varpool_node): Ditto.
>>        (debug_varpool): Ditto.
>>        (varpool_node_for_asm): Ditto.
>>        (varpool_enqueue_needed_node): Ditto.
>>        * stmt.c (expand_asm_operands): Ditto.
>>        * gcov-dump.c (main): Ditto.
>>        (print_usage): Ditto.
>>        (print_prefix): Ditto.
>>        (dump_file): Ditto.
>>        (tag_summary): Ditto.
>>
>> Index: doc/invoke.texi
>> ===================================================================
>> --- doc/invoke.texi     (revision 172880)
>> +++ doc/invoke.texi     (working copy)
>> @@ -258,7 +258,7 @@ Objective-C and Objective-C++ Dialects}.
>>  -Wparentheses  -Wpedantic-ms-format -Wno-pedantic-ms-format @gol
>>  -Wpointer-arith  -Wno-pointer-to-int-cast @gol
>>  -Wredundant-decls @gol
>> --Wreturn-type  -Wsequence-point  -Wshadow @gol
>> +-Wreturn-type  -Wripa-opt-mismatch  -Wsequence-point  -Wshadow @gol
>>  -Wsign-compare  -Wsign-conversion  -Wstack-protector @gol
>>  -Wstrict-aliasing -Wstrict-aliasing=n @gol
>>  -Wstrict-overflow -Wstrict-overflow=@var{n} @gol
>> @@ -380,7 +380,9 @@ Objective-C and Objective-C++ Dialects}.
>>  -freciprocal-math -fregmove -frename-registers -freorder-blocks @gol
>>  -freorder-blocks-and-partition -freorder-functions @gol
>>  -frerun-cse-after-loop -freschedule-modulo-scheduled-loops @gol
>> --frounding-math -fsched2-use-superblocks -fsched-pressure @gol
>> +-fripa -fripa-disallow-asm-modules -fripa-disallow-opt-mismatch @gol
>> +-fripa-no-promote-always-inline-func -fripa-verbose -frounding-math @gol
>> +-fsched2-use-superblocks -fsched-pressure @gol
>>  -fsched-spec-load -fsched-spec-load-dangerous @gol
>>  -fsched-stalled-insns-dep[=@var{n}] -fsched-stalled-insns[=@var{n}] @gol
>>  -fsched-group-heuristic -fsched-critical-path-heuristic @gol
>> @@ -1364,8 +1366,8 @@ prefix.
>>  @opindex no-canonical-prefixes
>>  Never expand any symbolic links, resolve references to @samp{/../}
>>  or @samp{/./}, or make the path absolute when generating a relative
>> -prefix.  If neither @option{-canonical-prefixes} nor
>> -@option{-canonical-prefixes} is given, GCC tries to set an appropriate
>> +prefix. If neither @option{-canonical-prefixes} nor
>> +@option{-nocanonical-prefixes} is given, GCC tries to set an appropriate
>>  default by looking for a target-specific subdirectory alongside the
>>  directory containing the compiler driver.
>>
>> @@ -2969,6 +2971,7 @@ Options} and @ref{Objective-C and Object
>>  -Wpointer-sign  @gol
>>  -Wreorder   @gol
>>  -Wreturn-type  @gol
>> +-Wripa-opt-mismatch @gol
>>  -Wsequence-point  @gol
>>  -Wsign-compare @r{(only in C++)}  @gol
>>  -Wstrict-aliasing  @gol
>> @@ -3395,6 +3398,16 @@ exceptions are @samp{main} and functions
>>
>>  This warning is enabled by @option{-Wall}.
>>
>> +@item -Wripa-opt-mismatch
>> +@opindex Wripa-opt-mismatch
>> +@opindex Wno-ripa-opt-mismatch
>> +When doing an FDO build with @option{-fprofile-use} and @option{-fripa},
>> +warn if importing an axuiliary module that was built with a different
>> +GCC command line during the profile-generate phase than the primary
>> +module.
>> +
>> +This warning is enabled by @option{-Wall}.
>> +
>>  @item -Wswitch
>>  @opindex Wswitch
>>  @opindex Wno-switch
>> @@ -4493,6 +4506,12 @@ Suppress warnings from casts to pointer
>>  different size. In C++, casting to a pointer type of smaller size is
>>  an error. @option{Wint-to-pointer-cast} is enabled by default.
>>
>> +@item max-lipo-mem
>> +When importing auxiliary modules during profile-use, check current
>> +memory consumption after parsing each auxiliary module. If it exceeds
>> +this limit (specified in kb), don't import any more auxiliary modules.
>> +Specifying a value of 0 means don't enforce this limit. This parameter
>> +is only useful when using @option{-fprofile-use} and @option{-fripa}.
>>
>>  @item -Wno-pointer-to-int-cast @r{(C and Objective-C only)}
>>  @opindex Wno-pointer-to-int-cast
>> @@ -7864,6 +7883,40 @@ code.
>>
>>  If @var{path} is specified, GCC will look at the @var{path} to find
>>  the profile feedback data files. See @option{-fprofile-dir}.
>> +
>> +@item -fripa
>> +@opindex fripa
>> +Perform dynamic inter-procedural analysis. This is used in conjunction with
>> +the @option{-fprofile-generate} and @option{-fprofile-use} options.
>> +During the @option{-fprofile-generate} phase, this flag turns on some additional
>> +instrumentation code that enables dynamic call-graph analysis.
>> +During the @option{-fprofile-use} phase, this flag enables cross-module
>> +optimizations such as inlining.
>> +
>> +@item -fripa-disallow-asm-modules
>> +@opindex fripa-disallow-asm-modules
>> +During profile-gen, if this flag is enabled, and the module has asm statements,
>> +arrange so that a bit recording this information will be set in the profile
>> +feedback data file.
>> +During profile-use, if this flag is enabled, and the same bit in auxiliary
>> +module's profile feedback data is set, don't import this auxiliary module.
>> +If this is the primary module, don't export it.
>> +
>> +@item -fripa-disallow-opt-mismatch
>> +@opindex fripa-disallow-opt-mismatch
>> +Don't import an auxiliary module, if the GCC command line options used for this
>> +auxiliary module during the profile-generate stage were different from those used
>> +for the primary module. Note that any mismatches in warning-related options are
>> +ignored for this comparison.
>> +
>> +@item -fripa-no-promote-always-inline-func
>> +@opindex fripa-no-promote-always-inline-func
>> +Do not promote static functions with always inline attribute in LIPO compilation.
>> +
>> +@item -fripa-verbose
>> +@opindex fripa-verbose
>> +Enable printing of verbose information about dynamic inter-procedural optimizations.
>> +This is used in conjunction with the @option{-fripa}.
>>  @end table
>>
>>  The following options control compiler behavior regarding floating
>> @@ -9954,6 +10007,12 @@ given to GCC, substitutes @code{Y}; else
>>  be as many clauses as you need.  This may be combined with @code{.},
>>  @code{,}, @code{!}, @code{|}, and @code{*} as needed.
>>
>> +@item max-lipo-mem
>> +When importing auxiliary modules during profile-use, check current
>> +memory consumption after parsing each auxiliary module. If it exceeds
>> +this limit (specified in kb), don't import any more auxiliary modules.
>> +Specifying a value of 0 means don't enforce this limit. This parameter
>> +is only useful when using @option{-fprofile-use} and @option{-fripa}.
>>
>>  @end table
>>
>> Index: cgraphbuild.c
>> ===================================================================
>> --- cgraphbuild.c       (revision 172880)
>> +++ cgraphbuild.c       (working copy)
>> @@ -30,9 +30,13 @@ along with GCC; see the file COPYING3.
>>  #include "cgraph.h"
>>  #include "intl.h"
>>  #include "gimple.h"
>> +#include "toplev.h"
>> +#include "gcov-io.h"
>> +#include "coverage.h"
>>  #include "tree-pass.h"
>>  #include "ipa-utils.h"
>>  #include "except.h"
>> +#include "l-ipo.h"
>>
>>  /* Context of record_reference.  */
>>  struct record_reference_ctx
>> @@ -236,6 +240,140 @@ compute_call_stmt_bb_frequency (tree dec
>>   return freq;
>>  }
>>
>> +
>> +bool cgraph_pre_profiling_inlining_done = false;
>> +
>> +/* Return true if E is a fake indirect call edge.  */
>> +
>> +bool
>> +cgraph_is_fake_indirect_call_edge (struct cgraph_edge *e)
>> +{
>> +  return !e->call_stmt;
>> +}
>> +
>> +
>> +/* Add fake cgraph edges from NODE to its indirect call callees
>> +   using profile data.  */
>> +
>> +static void
>> +add_fake_indirect_call_edges (struct cgraph_node *node)
>> +{
>> +  unsigned n_counts, i;
>> +  gcov_type *ic_counts;
>> +
>> +  /* Enable this only for LIPO for now.  */
>> +  if (!L_IPO_COMP_MODE)
>> +    return;
>> +
>> +  if (cgraph_pre_profiling_inlining_done)
>> +    return;
>> +
>> +  ic_counts
>> +      = get_coverage_counts_no_warn (DECL_STRUCT_FUNCTION (node->decl),
>> +                                     GCOV_COUNTER_ICALL_TOPNV, &n_counts);
>> +
>> +  if (!ic_counts)
>> +    return;
>> +
>> +  gcc_assert ((n_counts % GCOV_ICALL_TOPN_NCOUNTS) == 0);
>> +
>> +/* After the early_inline_1 before value profile transformation,
>> +   functions that are indirect call targets may have their bodies
>> +   removed (extern inline functions or functions from aux modules,
>> +   functions in comdat etc) if all direct callsites are inlined. This
>> +   will lead to missing inline opportunities after profile based
>> +   indirect call promotion. The solution is to add fake edges to
>> +   indirect call targets. Note that such edges are not associated
>> +   with actual indirect call sites because it is not possible to
>> +   reliably match pre-early-inline indirect callsites with indirect
>> +   call profile counters which are from post-early inline function body.  */
>> +
>> +  for (i = 0; i < n_counts;
>> +       i += GCOV_ICALL_TOPN_NCOUNTS, ic_counts += GCOV_ICALL_TOPN_NCOUNTS)
>> +    {
>> +      gcov_type val1, val2, count1, count2;
>> +      struct cgraph_node *direct_call1 = 0, *direct_call2 = 0;
>> +
>> +      val1 = ic_counts[1];
>> +      count1 = ic_counts[2];
>> +      val2 = ic_counts[3];
>> +      count2 = ic_counts[4];
>> +
>> +      if (val1 == 0 || count1 == 0)
>> +        continue;
>> +
>> +      direct_call1 = find_func_by_global_id (val1);
>> +      if (direct_call1)
>> +        {
>> +          tree decl = direct_call1->decl;
>> +          cgraph_create_edge (node,
>> +                             cgraph_node (decl),
>> +                             NULL,
>> +                              count1, 0, 0);
>> +        }
>> +
>> +      if (val2 == 0 || count2 == 0)
>> +        continue;
>> +      direct_call2 = find_func_by_global_id (val2);
>> +      if (direct_call2)
>> +        {
>> +          tree decl = direct_call2->decl;
>> +          cgraph_create_edge (node,
>> +                             cgraph_node (decl),
>> +                              NULL,
>> +                              count2, 0, 0);
>> +        }
>> +    }
>> +}
>> +
>> +
>> +/* This can be implemented as an IPA pass that must be first one
>> +   before any unreachable node elimination. */
>> +
>> +void
>> +cgraph_add_fake_indirect_call_edges (void)
>> +{
>> +  struct cgraph_node *node;
>> +
>> +  /* Enable this only for LIPO for now.  */
>> +  if (!L_IPO_COMP_MODE)
>> +    return;
>> +
>> +  for (node = cgraph_nodes; node; node = node->next)
>> +    {
>> +      if (node->analyzed && (node->needed || node->reachable))
>> +        add_fake_indirect_call_edges (node);
>> +    }
>> +}
>> +
>> +/* Remove zero count fake edges added for the purpose of ensuring
>> +   the right processing order.  This should be called after all
>> +   small ipa passes.  */
>> +void
>> +cgraph_remove_zero_count_fake_edges (void)
>> +{
>> +  struct cgraph_node *node;
>> +
>> +  /* Enable this only for LIPO for now.  */
>> +  if (!L_IPO_COMP_MODE)
>> +    return;
>> +
>> +  for (node = cgraph_nodes; node; node = node->next)
>> +    {
>> +      if (node->analyzed && (node->needed || node->reachable))
>> +        {
>> +          struct cgraph_edge *e, *f;
>> +         for (e = node->callees; e; e = f)
>> +           {
>> +             f = e->next_callee;
>> +             if (!e->call_stmt && !e->count && !e->loop_nest
>> +                 && !e->frequency)
>> +                cgraph_remove_edge (e);
>> +            }
>> +       }
>> +    }
>> +}
>> +
>>  /* Mark address taken in STMT.  */
>>
>>  static bool
>> @@ -390,6 +528,7 @@ build_cgraph_edges (void)
>>                                       mark_load, mark_store, mark_address);
>>    }
>>
>> +
>>   /* Look for initializers of constant variables and private statics.  */
>>   FOR_EACH_LOCAL_DECL (cfun, ix, decl)
>>     if (TREE_CODE (decl) == VAR_DECL
>> @@ -466,9 +605,24 @@ rebuild_cgraph_edges (void)
>>                                                         bb);
>>              decl = gimple_call_fndecl (stmt);
>>              if (decl)
>> -               cgraph_create_edge (node, cgraph_node (decl), stmt,
>> -                                   bb->count, freq,
>> -                                   bb->loop_depth);
>> +               {
>> +                 struct cgraph_node *callee;
>> +                 /* In LIPO mode, before tree_profiling, the call graph edge
>> +                    needs to be built with the original target node to make
>> +                    sure consistent early inline decisions between profile generate
>> +                    and profile use. After tree-profiling, the target needs to be
>> +                    set to the resolved node so that ipa-inline sees the definitions.  */
>> +                 if (L_IPO_COMP_MODE && cgraph_pre_profiling_inlining_done)
>> +                   callee = cgraph_lipo_get_resolved_node (decl);
>> +                  else
>> +                   callee = cgraph_node (decl);
>> +                  cgraph_create_edge (node, callee, stmt,
>> +                                      bb->count, freq,
>> +                                      bb->loop_depth);
>> +                  if (L_IPO_COMP_MODE && cgraph_pre_profiling_inlining_done
>> +                     && decl != callee->decl)
>> +                   gimple_call_set_fndecl (stmt, callee->decl);
>> +                }
>>              else
>>                cgraph_create_indirect_edge (node, stmt,
>>                                             gimple_call_flags (stmt),
>> @@ -483,6 +637,7 @@ rebuild_cgraph_edges (void)
>>        walk_stmt_load_store_addr_ops (gsi_stmt (gsi), node,
>>                                       mark_load, mark_store, mark_address);
>>     }
>> +  add_fake_indirect_call_edges (node);
>>   record_eh_tables (node, cfun);
>>   gcc_assert (!node->global.inlined_to);
>>
>> Index: dyn-ipa.c
>> ===================================================================
>> --- dyn-ipa.c   (revision 0)
>> +++ dyn-ipa.c   (revision 0)
>> @@ -0,0 +1,1476 @@
>> +/* Compile this one with gcc.  */
>> +/* Copyright (C) 2009. Free Software Foundation, Inc.
>> +   Contributed by Xinliang David Li (davidxl@google.com) and
>> +                  Raksit Ashok  (raksit@google.com)
>> +
>> +This file is part of GCC.
>> +
>> +GCC is free software; you can redistribute it and/or modify it under
>> +the terms of the GNU General Public License as published by the Free
>> +Software Foundation; either version 3, or (at your option) any later
>> +version.
>> +
>> +GCC is distributed in the hope that it will be useful, but WITHOUT ANY
>> +WARRANTY; without even the implied warranty of MERCHANTABILITY or
>> +FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License
>> +for more details.
>> +
>> +Under Section 7 of GPL version 3, you are granted additional
>> +permissions described in the GCC Runtime Library Exception, version
>> +3.1, as published by the Free Software Foundation.
>> +
>> +You should have received a copy of the GNU General Public License and
>> +a copy of the GCC Runtime Library Exception along with this program;
>> +see the files COPYING3 and COPYING.RUNTIME respectively.  If not, see
>> +<http://www.gnu.org/licenses/>.  */
>> +
>> +#include "tconfig.h"
>> +#include "tsystem.h"
>> +#include "coretypes.h"
>> +#include "tm.h"
>> +
>> +#if defined(inhibit_libc)
>> +#define IN_LIBGCOV (-1)
>> +#else
>> +#undef NULL /* Avoid errors if stdio.h and our stddef.h mismatch.  */
>> +#include <stdio.h>
>> +#include <stdlib.h>
>> +#define IN_LIBGCOV 1
>> +#if defined(L_gcov)
>> +#define GCOV_LINKAGE /* nothing */
>> +#endif
>> +#endif
>> +#include "gcov-io.h"
>> +
>> +struct dyn_pointer_set;
>> +
>> +#define XNEWVEC(type,ne) (type *)malloc(sizeof(type) * (ne))
>> +#define XNEW(type) (type *)malloc(sizeof(type))
>> +#define XDELETEVEC(p) free(p)
>> +#define XDELETE(p) free(p)
>> +
>> +struct dyn_cgraph_node
>> +{
>> +  struct dyn_cgraph_edge *callees;
>> +  struct dyn_cgraph_edge *callers;
>> +  struct dyn_pointer_set *imported_modules;
>> +
>> +  gcov_type guid;
>> +  gcov_type sum_in_count;
>> +  gcov_unsigned_t visited;
>> +};
>> +
>> +struct dyn_cgraph_edge
>> +{
>> +  struct dyn_cgraph_node *caller;
>> +  struct dyn_cgraph_node *callee;
>> +  struct dyn_cgraph_edge *next_caller;
>> +  struct dyn_cgraph_edge *next_callee;
>> +  gcov_type count;
>> +};
>> +
>> +struct dyn_module_info
>> +{
>> +  struct dyn_pointer_set *imported_modules;
>> +  gcov_unsigned_t max_func_ident;
>> +};
>> +
>> +struct dyn_cgraph
>> +{
>> +  struct dyn_pointer_set **call_graph_nodes;
>> +  struct gcov_info **modules;
>> +  /* supplement module information  */
>> +  struct dyn_module_info *sup_modules;
>> +  const struct gcov_fn_info ***functions;
>> +  unsigned num_modules;
>> +  unsigned num_nodes_executed;
>> +};
>> +
>> +struct dyn_pointer_set
>> +{
>> +  size_t log_slots;
>> +  size_t n_slots;              /* n_slots = 2^log_slots */
>> +  size_t n_elements;
>> +
>> +  void **slots;
>> +  unsigned (*get_key) (const void *);
>> +};
>> +
>> +
>> +#if defined(inhibit_libc)
>> +__gcov_build_callgraph (void) {}
>> +#else
>> +
>> +void __gcov_compute_module_groups (void) ATTRIBUTE_HIDDEN;
>> +void __gcov_finalize_dyn_callgraph (void) ATTRIBUTE_HIDDEN;
>> +static void gcov_dump_callgraph (gcov_type);
>> +static void gcov_dump_cgraph_node_short (struct dyn_cgraph_node *node);
>> +static void gcov_dump_cgraph_node (struct dyn_cgraph_node *node,
>> +                                  unsigned m, unsigned f);
>> +static void
>> +gcov_dump_cgraph_node_dot (struct dyn_cgraph_node *node,
>> +                           unsigned m, unsigned f,
>> +                           gcov_type cutoff_count);
>> +static void
>> +pointer_set_destroy (struct dyn_pointer_set *pset);
>> +static void **
>> +pointer_set_find_or_insert (struct dyn_pointer_set *pset, unsigned key);
>> +static struct dyn_pointer_set *
>> +pointer_set_create (unsigned (*get_key) (const void *));
>> +
>> +static struct dyn_cgraph the_dyn_call_graph;
>> +static int total_zero_count = 0;
>> +static int total_insane_count = 0;
>> +
>> +static void
>> +init_dyn_cgraph_node (struct dyn_cgraph_node *node, gcov_type guid)
>> +{
>> +  node->callees = 0;
>> +  node->callers = 0;
>> +  node->imported_modules = 0;
>> +  node->guid = guid;
>> +  node->visited = 0;
>> +}
>> +
>> +/* Return (module_id - 1). FUNC_GUID is the global unique id.  */
>> +
>> +static inline gcov_unsigned_t
>> +get_module_idx_from_func_glob_uid (gcov_type func_guid)
>> +{
>> +  return EXTRACT_MODULE_ID_FROM_GLOBAL_ID (func_guid) - 1;
>> +}
>> +
>> +/* Return (module_id - 1) for MODULE_INFO.  */
>> +
>> +static inline gcov_unsigned_t
>> +get_module_idx (const struct gcov_info *module_info)
>> +{
>> +  return module_info->mod_info->ident - 1;
>> +}
>> +
>> +/* Return intra-module function id given function global unique id
>> +   FUNC_GUID.  */
>> +
>> +static inline gcov_unsigned_t
>> +get_intra_module_func_id (gcov_type func_guid)
>> +{
>> +  return EXTRACT_FUNC_ID_FROM_GLOBAL_ID (func_guid);
>> +}
>> +
>> +/* Return the pointer to the dynamic call graph node for FUNC_GUID.  */
>> +
>> +static inline struct dyn_cgraph_node *
>> +get_cgraph_node (gcov_type func_guid)
>> +{
>> +  gcov_unsigned_t mod_id, func_id;
>> +
>> +  mod_id = get_module_idx_from_func_glob_uid (func_guid);
>> +
>> +  /* This is to workaround: calls in __static_initialization_and_destruction
>> +     should not be instrumented as the module id context for the callees have
>> +     not setup yet -- this leads to mod_id == (unsigned) (0 - 1). Multithreaded
>> +     programs may also produce insane func_guid in the profile counter.  */
>> +  if (mod_id >= the_dyn_call_graph.num_modules)
>> +    return 0;
>> +
>> +  func_id = get_intra_module_func_id (func_guid);
>> +  if (func_id > the_dyn_call_graph.sup_modules[mod_id].max_func_ident)
>> +    return 0;
>> +
>> +  return *(pointer_set_find_or_insert
>> +          (the_dyn_call_graph.call_graph_nodes[mod_id], func_id));
>> +}
>> +
>> +/* Return the gcov_info pointer for module with id MODULE_ID.  */
>> +
>> +static inline struct gcov_info *
>> +get_module_info (gcov_unsigned_t module_id)
>> +{
>> +  return the_dyn_call_graph.modules[module_id];
>> +}
>> +
>> +struct gcov_info *__gcov_list ATTRIBUTE_HIDDEN;
>> +
>> +static inline unsigned
>> +cgraph_node_get_key (const void *p)
>> +{
>> +  return get_intra_module_func_id (((const struct dyn_cgraph_node *) p)->guid);
>> +}
>> +
>> +/* Initialize dynamic call graph.  */
>> +
>> +static void
>> +init_dyn_call_graph (void)
>> +{
>> +  unsigned num_modules = 0;
>> +  struct gcov_info *gi_ptr;
>> +
>> +  the_dyn_call_graph.call_graph_nodes = 0;
>> +  the_dyn_call_graph.modules = 0;
>> +  the_dyn_call_graph.functions = 0;
>> +  the_dyn_call_graph.num_nodes_executed = 0;
>> +
>> +  gi_ptr = __gcov_list;
>> +
>> +  for (; gi_ptr; gi_ptr = gi_ptr->next)
>> +    num_modules++;
>> +
>> +  the_dyn_call_graph.num_modules = num_modules;
>> +
>> +  the_dyn_call_graph.modules
>> +    = XNEWVEC (struct gcov_info *, num_modules);
>> +
>> +  the_dyn_call_graph.sup_modules
>> +    = XNEWVEC (struct dyn_module_info, num_modules);
>> +  memset (the_dyn_call_graph.sup_modules, 0,
>> +          num_modules * sizeof (struct dyn_module_info));
>> +
>> +  the_dyn_call_graph.functions
>> +    = XNEWVEC (const struct gcov_fn_info **, num_modules);
>> +
>> +  the_dyn_call_graph.call_graph_nodes
>> +    = XNEWVEC (struct dyn_pointer_set *, num_modules);
>> +
>> +  gi_ptr = __gcov_list;
>> +
>> +  for (; gi_ptr; gi_ptr = gi_ptr->next)
>> +    {
>> +      unsigned c_ix = 0, t_ix, j, mod_id, fi_stride, max_func_ident = 0;
>> +      struct dyn_cgraph_node *node;
>> +
>> +      mod_id = get_module_idx (gi_ptr);
>> +
>> +      the_dyn_call_graph.modules[mod_id] = gi_ptr;
>> +
>> +      the_dyn_call_graph.functions[mod_id]
>> +          = XNEWVEC (const struct gcov_fn_info *, gi_ptr->n_functions);
>> +
>> +      for (t_ix = 0; t_ix < GCOV_COUNTERS; t_ix++)
>> +       if ((1 << t_ix) & gi_ptr->ctr_mask)
>> +           c_ix++;
>> +
>> +      fi_stride = offsetof (struct gcov_fn_info, n_ctrs) + c_ix * sizeof (unsigned);
>> +      if (__alignof__ (struct gcov_fn_info) > sizeof (unsigned))
>> +       {
>> +         fi_stride += __alignof__ (struct gcov_fn_info) - 1;
>> +         fi_stride &= ~(__alignof__ (struct gcov_fn_info) - 1);
>> +       }
>> +
>> +      for (j = 0; j < gi_ptr->n_functions; j++)
>> +       {
>> +          const struct gcov_fn_info *fi_ptr = (const struct gcov_fn_info *)
>> +           ((const char *) gi_ptr->functions + j * fi_stride);
>> +          the_dyn_call_graph.functions[mod_id][j] = fi_ptr;
>> +          if (fi_ptr->ident > max_func_ident)
>> +            max_func_ident = fi_ptr->ident;
>> +        }
>> +
>> +
>> +      the_dyn_call_graph.call_graph_nodes[mod_id]
>> +       = pointer_set_create (cgraph_node_get_key);
>> +
>> +      the_dyn_call_graph.sup_modules[mod_id].max_func_ident = max_func_ident;
>> +
>> +      for (j = 0; j < gi_ptr->n_functions; j++)
>> +       {
>> +          const struct gcov_fn_info *fi_ptr
>> +              = the_dyn_call_graph.functions[mod_id][j];
>> +         *(pointer_set_find_or_insert
>> +           (the_dyn_call_graph.call_graph_nodes[mod_id], fi_ptr->ident))
>> +           = node = XNEW (struct dyn_cgraph_node);
>> +         the_dyn_call_graph.call_graph_nodes[mod_id]->n_elements++;
>> +         init_dyn_cgraph_node (node, GEN_FUNC_GLOBAL_ID (gi_ptr->mod_info->ident,
>> +                                                         fi_ptr->ident));
>> +       }
>> +    }
>> +}
>> +
>> +/* Free up memory allocated for dynamic call graph.  */
>> +
>> +void
>> +__gcov_finalize_dyn_callgraph (void)
>> +{
>> +  unsigned i;
>> +  struct gcov_info *gi_ptr;
>> +
>> +  for (i = 0; i < the_dyn_call_graph.num_modules; i++)
>> +    {
>> +      gi_ptr = the_dyn_call_graph.modules[i];
>> +      const struct gcov_fn_info *fi_ptr;
>> +      unsigned f_ix;
>> +      for (f_ix = 0; f_ix < gi_ptr->n_functions; f_ix++)
>> +       {
>> +         struct dyn_cgraph_node *node;
>> +          struct dyn_cgraph_edge *callees, *next_callee;
>> +          fi_ptr = the_dyn_call_graph.functions[i][f_ix];
>> +          node = *(pointer_set_find_or_insert
>> +                  (the_dyn_call_graph.call_graph_nodes[i], fi_ptr->ident));
>> +         gcc_assert (node);
>> +          callees = node->callees;
>> +
>> +          if (!callees)
>> +            continue;
>> +          while (callees != 0)
>> +            {
>> +              next_callee = callees->next_callee;
>> +              XDELETE (callees);
>> +              callees = next_callee;
>> +            }
>> +         if (node->imported_modules)
>> +           pointer_set_destroy (node->imported_modules);
>> +        }
>> +      if (the_dyn_call_graph.call_graph_nodes[i])
>> +        pointer_set_destroy (the_dyn_call_graph.call_graph_nodes[i]);
>> +      if (the_dyn_call_graph.functions[i])
>> +        XDELETEVEC (the_dyn_call_graph.functions[i]);
>> +      /* Now delete sup modules */
>> +      if (the_dyn_call_graph.sup_modules[i].imported_modules)
>> +        pointer_set_destroy (the_dyn_call_graph.sup_modules[i].imported_modules);
>> +    }
>> +  XDELETEVEC (the_dyn_call_graph.call_graph_nodes);
>> +  XDELETEVEC (the_dyn_call_graph.functions);
>> +  XDELETEVEC (the_dyn_call_graph.sup_modules);
>> +  XDELETEVEC (the_dyn_call_graph.modules);
>> +}
>> +
>> +/* Add outgoing edge OUT_EDGE for caller node CALLER.  */
>> +
>> +static void
>> +gcov_add_out_edge (struct dyn_cgraph_node *caller,
>> +                  struct dyn_cgraph_edge *out_edge)
>> +{
>> +  if (!caller->callees)
>> +    caller->callees = out_edge;
>> +  else
>> +    {
>> +      out_edge->next_callee = caller->callees;
>> +      caller->callees = out_edge;
>> +    }
>> +}
>> +
>> +/* Add incoming edge IN_EDGE for callee node CALLEE.  */
>> +
>> +static void
>> +gcov_add_in_edge (struct dyn_cgraph_node *callee,
>> +                 struct dyn_cgraph_edge *in_edge)
>> +{
>> +  if (!callee->callers)
>> +    callee->callers = in_edge;
>> +  else
>> +    {
>> +      in_edge->next_caller = callee->callers;
>> +      callee->callers = in_edge;
>> +    }
>> +}
>> +
>> +/* Add a call graph edge between caller CALLER and callee CALLEE.
>> +   The edge count is COUNT.  */
>> +
>> +static void
>> +gcov_add_cgraph_edge (struct dyn_cgraph_node *caller,
>> +                     struct dyn_cgraph_node *callee,
>> +                     gcov_type count)
>> +{
>> +  struct dyn_cgraph_edge *new_edge = XNEW (struct dyn_cgraph_edge);
>> +  new_edge->caller = caller;
>> +  new_edge->callee = callee;
>> +  new_edge->count = count;
>> +  new_edge->next_caller = 0;
>> +  new_edge->next_callee = 0;
>> +
>> +  gcov_add_out_edge (caller, new_edge);
>> +  gcov_add_in_edge (callee, new_edge);
>> +}
>> +
>> +/* Add call graph edges from direct calls for caller CALLER. DIR_CALL_COUNTERS
>> +   is the array of call counters. N_COUNTS is the number of counters.  */
>> +
>> +static void
>> +gcov_build_callgraph_dc_fn (struct dyn_cgraph_node *caller,
>> +                            gcov_type *dir_call_counters,
>> +                            unsigned n_counts)
>> +{
>> +  unsigned i;
>> +
>> +  for (i = 0; i < n_counts; i += 2)
>> +    {
>> +      struct dyn_cgraph_node *callee;
>> +      gcov_type count;
>> +      gcov_type callee_guid = dir_call_counters[i];
>> +
>> +      count = dir_call_counters[i + 1];
>> +      if (count == 0)
>> +        {
>> +          total_zero_count++;
>> +          continue;
>> +        }
>> +      callee = get_cgraph_node (callee_guid);
>> +      if (!callee)
>> +        {
>> +          total_insane_count++;
>> +          continue;
>> +        }
>> +      gcov_add_cgraph_edge (caller, callee, count);
>> +    }
>> +}
>> +
>> +/* Add call graph edges from indirect calls for caller CALLER. ICALL_COUNTERS
>> +   is the array of icall counters. N_COUNTS is the number of counters.  */
>> +
>> +static void
>> +gcov_build_callgraph_ic_fn (struct dyn_cgraph_node *caller,
>> +                            gcov_type *icall_counters,
>> +                            unsigned n_counts)
>> +{
>> +  unsigned i, j;
>> +
>> +  for (i = 0; i < n_counts; i += GCOV_ICALL_TOPN_NCOUNTS)
>> +    {
>> +      gcov_type *value_array = &icall_counters[i + 1];
>> +      for (j = 0; j < GCOV_ICALL_TOPN_NCOUNTS - 1; j += 2)
>> +        {
>> +          struct dyn_cgraph_node *callee;
>> +          gcov_type count;
>> +          gcov_type callee_guid = value_array[j];
>> +
>> +          count = value_array[j + 1];
>> +         /* Do not update zero count edge count
>> +          * as it means there is no target in this entry.  */
>> +          if (count == 0)
>> +            continue;
>> +          callee = get_cgraph_node (callee_guid);
>> +          if (!callee)
>> +           {
>> +              total_insane_count++;
>> +              continue;
>> +           }
>> +          gcov_add_cgraph_edge (caller, callee, count);
>> +        }
>> +    }
>> +}
>> +
>> +/* Build the dynamic call graph.  */
>> +
>> +static void
>> +gcov_build_callgraph (void)
>> +{
>> +  struct gcov_info *gi_ptr;
>> +  unsigned t_ix, m_ix;
>> +
>> +  init_dyn_call_graph ();
>> +
>> +  for (m_ix = 0; m_ix < the_dyn_call_graph.num_modules; m_ix++)
>> +    {
>> +      const struct gcov_fn_info *fi_ptr;
>> +      unsigned c_ix, f_ix, n_counts, dp_cix = 0, ip_cix = 0;
>> +      gcov_type *dcall_profile_values, *icall_profile_values;
>> +      gcov_type *arcs_values = 0; unsigned arcs_cix;
>> +
>> +      gi_ptr = the_dyn_call_graph.modules[m_ix];
>> +
>> +      dcall_profile_values = 0;
>> +      icall_profile_values = 0;
>> +      c_ix = 0;
>> +      for (t_ix = 0; t_ix < GCOV_COUNTERS; t_ix++)
>> +       if ((1 << t_ix) & gi_ptr->ctr_mask)
>> +         {
>> +           if (t_ix == GCOV_COUNTER_DIRECT_CALL)
>> +             {
>> +               dcall_profile_values = gi_ptr->counts[c_ix].values;
>> +               dp_cix = c_ix;
>> +             }
>> +           if (t_ix == GCOV_COUNTER_ICALL_TOPNV)
>> +             {
>> +               icall_profile_values = gi_ptr->counts[c_ix].values;
>> +               ip_cix = c_ix;
>> +             }
>> +           if (t_ix == GCOV_COUNTER_ARCS)
>> +             {
>> +               arcs_values = gi_ptr->counts[c_ix].values;
>> +               arcs_cix = c_ix;
>> +              }
>> +           c_ix++;
>> +         }
>> +
>> +      if (dcall_profile_values == 0 && icall_profile_values == 0)
>> +        continue;
>> +
>> +      for (f_ix = 0; f_ix < gi_ptr->n_functions; f_ix++)
>> +        {
>> +          struct dyn_cgraph_node *caller;
>> +          fi_ptr = the_dyn_call_graph.functions[m_ix][f_ix];
>> +          caller = *(pointer_set_find_or_insert
>> +                    (the_dyn_call_graph.call_graph_nodes[m_ix],
>> +                     fi_ptr->ident));
>> +         gcc_assert (caller);
>> +          if (dcall_profile_values)
>> +            {
>> +              unsigned offset;
>> +              n_counts = fi_ptr->n_ctrs[dp_cix];
>> +              offset = fi_ptr->dc_offset;
>> +              gcov_build_callgraph_dc_fn (caller,
>> +                                          dcall_profile_values + offset,
>> +                                          n_counts);
>> +            }
>> +          if (icall_profile_values)
>> +            {
>> +              n_counts = fi_ptr->n_ctrs[ip_cix];
>> +              gcov_build_callgraph_ic_fn (caller, icall_profile_values, n_counts);
>> +              icall_profile_values += n_counts;
>> +            }
>> +          if (arcs_values && 0)
>> +            {
>> +              gcov_type total_arc_count = 0;
>> +              unsigned arc;
>> +              n_counts = fi_ptr->n_ctrs[arcs_cix];
>> +              for (arc = 0; arc < n_counts; arc++)
>> +                total_arc_count += arcs_values[arc];
>> +              if (total_arc_count != 0)
>> +                the_dyn_call_graph.num_nodes_executed++;
>> +              arcs_values += n_counts;
>> +            }
>> +        }
>> +    }
>> +
>> +}
>> +
>> +static inline size_t
>> +hash1 (unsigned p, unsigned long max, unsigned long logmax)
>> +{
>> +  const unsigned long long A = 0x9e3779b97f4a7c16ull;
>> +  const unsigned long long shift = 64 - logmax;
>> +
>> +  return ((A * (unsigned long) p) >> shift) & (max - 1);
>> +}
>> +
>> +/* Allocate an empty imported-modules set.  */
>> +
>> +static struct dyn_pointer_set *
>> +pointer_set_create (unsigned (*get_key) (const void *))
>> +{
>> +  struct dyn_pointer_set *result = XNEW (struct dyn_pointer_set);
>> +
>> +  result->n_elements = 0;
>> +  result->log_slots = 8;
>> +  result->n_slots = (size_t) 1 << result->log_slots;
>> +
>> +  result->slots = XNEWVEC (void *, result->n_slots);
>> +  memset (result->slots, 0, sizeof (void *) * result->n_slots);
>> +  result->get_key = get_key;
>> +
>> +  return result;
>> +}
>> +
>> +/* Reclaim all memory associated with PSET.  */
>> +
>> +static void
>> +pointer_set_destroy (struct dyn_pointer_set *pset)
>> +{
>> +  size_t i;
>> +  for (i = 0; i < pset->n_slots; i++)
>> +    if (pset->slots[i])
>> +      XDELETE (pset->slots[i]);
>> +  XDELETEVEC (pset->slots);
>> +  XDELETE (pset);
>> +}
>> +
>> +/* Subroutine of pointer_set_find_or_insert.  Return the insertion slot for KEY
>> +   into an empty element of SLOTS, an array of length N_SLOTS.  */
>> +static inline size_t
>> +insert_aux (unsigned key, void **slots,
>> +           size_t n_slots, size_t log_slots,
>> +           unsigned (*get_key) (const void *))
>> +{
>> +  size_t n = hash1 (key, n_slots, log_slots);
>> +  while (1)
>> +    {
>> +      if (slots[n] == 0 || get_key (slots[n]) == key)
>> +       return n;
>> +      else
>> +       {
>> +         ++n;
>> +         if (n == n_slots)
>> +           n = 0;
>> +       }
>> +    }
>> +}
>> +
>> +/* Find slot for KEY. KEY must be nonnull.  */
>> +
>> +static void **
>> +pointer_set_find_or_insert (struct dyn_pointer_set *pset, unsigned key)
>> +{
>> +  size_t n;
>> +
>> +  /* For simplicity, expand the set even if KEY is already there.  This can be
>> +     superfluous but can happen at most once.  */
>> +  if (pset->n_elements > pset->n_slots / 4)
>> +    {
>> +      size_t new_log_slots = pset->log_slots + 1;
>> +      size_t new_n_slots = pset->n_slots * 2;
>> +      void **new_slots = XNEWVEC (void *, new_n_slots);
>> +      memset (new_slots, 0, sizeof (void *) * new_n_slots);
>> +      size_t i;
>> +
>> +      for (i = 0; i < pset->n_slots; ++i)
>> +        {
>> +         void *value = pset->slots[i];
>> +         if (!value)
>> +           continue;
>> +         n = insert_aux (pset->get_key (value), new_slots, new_n_slots,
>> +                         new_log_slots, pset->get_key);
>> +         new_slots[n] = value;
>> +       }
>> +
>> +      XDELETEVEC (pset->slots);
>> +      pset->n_slots = new_n_slots;
>> +      pset->log_slots = new_log_slots;
>> +      pset->slots = new_slots;
>> +    }
>> +
>> +  n = insert_aux (key, pset->slots, pset->n_slots, pset->log_slots,
>> +                 pset->get_key);
>> +  return &pset->slots[n];
>> +}
>> +
>> +/* Pass each pointer in PSET to the function in FN, together with the fixed
>> +   parameters DATA1, DATA2, DATA3.  If FN returns false, the iteration stops.  */
>> +
>> +static void
>> +pointer_set_traverse (const struct dyn_pointer_set *pset,
>> +                      int (*fn) (const void *, void *, void *, void *),
>> +                     void *data1, void *data2, void *data3)
>> +{
>> +  size_t i;
>> +  for (i = 0; i < pset->n_slots; ++i)
>> +    if (pset->slots[i] && !fn (pset->slots[i], data1, data2, data3))
>> +      break;
>> +}
>> +
>> +static int
>> +imp_mod_set_insert (struct dyn_pointer_set *p, const struct gcov_info *imp_mod,
>> +                   double wt)
>> +{
>> +  struct dyn_imp_mod **m = (struct dyn_imp_mod **)
>> +    pointer_set_find_or_insert (p, imp_mod->mod_info->ident);
>> +  if (*m)
>> +    {
>> +      (*m)->weight += wt;
>> +      return 1;
>> +    }
>> +  else
>> +    {
>> +      *m = XNEW (struct dyn_imp_mod);
>> +      (*m)->imp_mod = imp_mod;
>> +      (*m)->weight = wt;
>> +      p->n_elements++;
>> +      return 0;
>> +    }
>> +}
>> +
>> +/* Callback function to propagate import module (VALUE) from callee to
>> +   caller's imported-module-set (DATA1).
>> +   The weight is scaled by the scaling-factor (DATA2) before propagation,
>> +   and accumulated into DATA3.  */
>> +static int
>> +gcov_propagate_imp_modules (const void *value, void *data1, void *data2,
>> +                           void *data3)
>> +{
>> +  const struct dyn_imp_mod *m = (const struct dyn_imp_mod *) value;
>> +  struct dyn_pointer_set *receiving_set = (struct dyn_pointer_set *) data1;
>> +  double *scale = (double *) data2;
>> +  double *sum = (double *) data3;
>> +  double wt = m->weight;
>> +  if (scale)
>> +    wt *= *scale;
>> +  if (sum)
>> +    (*sum) += wt;
>> +  imp_mod_set_insert (receiving_set, m->imp_mod, wt);
>> +  return 1;
>> +}
>> +
>> +static int
>> +sort_by_count (const void *pa, const void *pb)
>> +{
>> +  const struct dyn_cgraph_edge *edge_a = *(struct dyn_cgraph_edge * const *)pa;
>> +  const struct dyn_cgraph_edge *edge_b = *(struct dyn_cgraph_edge * const *)pb;
>> +
>> +  /* This can overvlow.  */
>> +  /* return edge_b->count - edge_a->count;  */
>> +  if (edge_b->count > edge_a->count)
>> +    return 1;
>> +  else if (edge_b->count == edge_a->count)
>> +    return 0;
>> +  else
>> +    return -1;
>> +}
>> +
>> +/* Compute the hot callgraph edge threhold.  */
>> +
>> +static gcov_type
>> +gcov_compute_cutoff_count (void)
>> +{
>> +  unsigned m_ix, capacity, i;
>> +  unsigned num_edges = 0;
>> +  gcov_type cutoff_count;
>> +  double total, cum, cum_cutoff;
>> +  struct dyn_cgraph_edge **edges;
>> +  struct gcov_info *gi_ptr;
>> +  char *cutoff_str;
>> +  char *num_perc_str;
>> +  unsigned cutoff_perc;
>> +  unsigned num_perc;
>> +  int do_dump;
>> +
>> +  capacity = 100;
>> +  /* allocate an edge array */
>> +  edges = XNEWVEC (struct dyn_cgraph_edge*, capacity);
>> +  /* First count the number of edges.  */
>> +  for (m_ix = 0; m_ix < the_dyn_call_graph.num_modules; m_ix++)
>> +    {
>> +      const struct gcov_fn_info *fi_ptr;
>> +      unsigned f_ix;
>> +
>> +      gi_ptr = the_dyn_call_graph.modules[m_ix];
>> +
>> +      for (f_ix = 0; f_ix < gi_ptr->n_functions; f_ix++)
>> +       {
>> +         struct dyn_cgraph_node *node;
>> +          struct dyn_cgraph_edge *callees;
>> +
>> +         fi_ptr = the_dyn_call_graph.functions[m_ix][f_ix];
>> +
>> +         node = *(pointer_set_find_or_insert
>> +                  (the_dyn_call_graph.call_graph_nodes[m_ix], fi_ptr->ident));
>> +         gcc_assert (node);
>> +
>> +          callees = node->callees;
>> +          while (callees != 0)
>> +            {
>> +              num_edges++;
>> +              if (num_edges < capacity)
>> +                edges[num_edges - 1] = callees;
>> +              else
>> +                {
>> +                  capacity = capacity + (capacity >> 1);
>> +                  edges = (struct dyn_cgraph_edge **)realloc (edges, sizeof (void*) * capacity);
>> +                  edges[num_edges - 1] = callees;
>> +                }
>> +              callees = callees->next_callee;
>> +            }
>> +       }
>> +    }
>> +
>> +  /* Now sort */
>> + qsort (edges, num_edges, sizeof (void *), sort_by_count);
>> +#define CUM_CUTOFF_PERCENT 80
>> +#define MIN_NUM_EDGE_PERCENT 0
>> +  cutoff_str = getenv ("GCOV_DYN_CGRAPH_CUTOFF");
>> +  if (cutoff_str && strlen (cutoff_str))
>> +    {
>> +      if ((num_perc_str = strchr (cutoff_str, ':')))
>> +        {
>> +          *num_perc_str = '\0';
>> +          num_perc_str++;
>> +        }
>> +      cutoff_perc = atoi (cutoff_str);
>> +      if (num_perc_str)
>> +        num_perc = atoi (num_perc_str);
>> +      else
>> +        num_perc = MIN_NUM_EDGE_PERCENT;
>> +    }
>> +  else
>> +    {
>> +      cutoff_perc = CUM_CUTOFF_PERCENT;
>> +      num_perc = MIN_NUM_EDGE_PERCENT;
>> +    }
>> +
>> +  total = 0;
>> +  cum = 0;
>> +  for (i = 0; i < num_edges; i++)
>> +    total += edges[i]->count;
>> +
>> +  cum_cutoff = (total * cutoff_perc)/100;
>> +  do_dump = (getenv ("GCOV_DYN_CGRAPH_DUMP") != 0);
>> +  for (i = 0; i < num_edges; i++)
>> +    {
>> +      cum += edges[i]->count;
>> +      if (do_dump)
>> +        fprintf (stderr, "// edge[%d] count = %.0f [%llx --> %llx]\n",
>> +                 i, (double) edges[i]->count,
>> +                 (long long) edges[i]->caller->guid,
>> +                 (long long) edges[i]->callee->guid);
>> +      if (cum >= cum_cutoff && (i * 100 >= num_edges * num_perc))
>> +        {
>> +          cutoff_count = edges[i]->count;
>> +          break;
>> +        }
>> +    }
>> +
>> +  if (do_dump)
>> +    fprintf (stderr, "// total = %.0f cum = %.0f cum/total = %.0f%%"
>> +             " cutoff_count = %lld [total edges: %d hot edges: %d perc: %d%%]\n"
>> +            " total_zero_count_edges = %d total_insane_count_edgess = %d\n"
>> +             " total_nodes_executed = %d\n",
>> +             total, cum, (cum * 100)/total, (long long) cutoff_count,
>> +             num_edges, i, (i * 100)/num_edges, total_zero_count,
>> +             total_insane_count, the_dyn_call_graph.num_nodes_executed);
>> +
>> +  XDELETEVEC (edges);
>> +  return cutoff_count;
>> +}
>> +
>> +static inline unsigned
>> +imp_mod_get_key (const void *p)
>> +{
>> +  return ((const struct dyn_imp_mod *) p)->imp_mod->mod_info->ident;
>> +}
>> +
>> +/* Return the imported module set for NODE.  */
>> +
>> +static struct dyn_pointer_set *
>> +gcov_get_imp_module_set (struct dyn_cgraph_node *node)
>> +{
>> +  if (!node->imported_modules)
>> +    node->imported_modules = pointer_set_create (imp_mod_get_key);
>> +
>> +  return node->imported_modules;
>> +}
>> +
>> +/* Return the imported module set for MODULE MI.  */
>> +
>> +static struct dyn_pointer_set *
>> +gcov_get_module_imp_module_set (struct dyn_module_info *mi)
>> +{
>> +  if (!mi->imported_modules)
>> +    mi->imported_modules = pointer_set_create (imp_mod_get_key);
>> +
>> +  return mi->imported_modules;
>> +}
>> +
>> +/* Callback function to mark if a module needs to be exported.  */
>> +
>> +static int
>> +gcov_mark_export_modules (const void *value,
>> +                         void *data1 ATTRIBUTE_UNUSED,
>> +                         void *data2 ATTRIBUTE_UNUSED,
>> +                         void *data3 ATTRIBUTE_UNUSED)
>> +{
>> +  const struct gcov_info *module_info
>> +    = ((const struct dyn_imp_mod *) value)->imp_mod;
>> +
>> +  module_info->mod_info->is_exported = 1;
>> +  return 1;
>> +}
>> +
>> +struct gcov_import_mod_array
>> +{
>> +  const struct dyn_imp_mod **imported_modules;
>> +  struct gcov_info *importing_module;
>> +  unsigned len;
>> +};
>> +
>> +/* Callback function to compute pointer set size.  */
>> +
>> +static int
>> +gcov_compute_mset_size (const void *value ATTRIBUTE_UNUSED,
>> +                        void *data1,
>> +                       void *data2 ATTRIBUTE_UNUSED,
>> +                       void *data3 ATTRIBUTE_UNUSED)
>> +{
>> +  unsigned *len = (unsigned *) data1;
>> +  (*len)++;
>> +  return 1;
>> +}
>> +
>> +/* Callback function to collect imported modules.  */
>> +
>> +static int
>> +gcov_collect_imported_modules (const void *value,
>> +                              void *data1,
>> +                              void *data2 ATTRIBUTE_UNUSED,
>> +                              void *data3 ATTRIBUTE_UNUSED)
>> +{
>> +  struct gcov_import_mod_array *out_array;
>> +  const struct dyn_imp_mod *m
>> +    = (const struct dyn_imp_mod *) value;
>> +
>> +  out_array = (struct gcov_import_mod_array *) data1;
>> +
>> +  if (m->imp_mod != out_array->importing_module)
>> +    out_array->imported_modules[out_array->len++] = m;
>> +
>> +  return 1;
>> +}
>> +
>> +/* Comparator for sorting imported modules using weights.  */
>> +
>> +static int
>> +sort_by_module_wt (const void *pa, const void *pb)
>> +{
>> +  const struct dyn_imp_mod *m_a = *((const struct dyn_imp_mod * const *) pa);
>> +  const struct dyn_imp_mod *m_b = *((const struct dyn_imp_mod * const *) pb);
>> +
>> +  /* We want to sort in descending order of weights.  */
>> +  if (m_a->weight < m_b->weight)
>> +    return +1;
>> +  if (m_a->weight > m_b->weight)
>> +    return -1;
>> +  return get_module_idx (m_a->imp_mod) - get_module_idx (m_b->imp_mod);
>> +}
>> +
>> +/* Return a dynamic array of imported modules that is sorted for
>> +   the importing module MOD_INFO. The length of the array is returned
>> +   in *LEN.  */
>> +
>> +const struct dyn_imp_mod **
>> +gcov_get_sorted_import_module_array (struct gcov_info *mod_info,
>> +                                     unsigned *len)
>> +{
>> +  unsigned mod_id;
>> +  struct dyn_module_info *sup_mod_info;
>> +  unsigned array_len = 0;
>> +  struct gcov_import_mod_array imp_array;
>> +
>> +  mod_id = get_module_idx (mod_info);
>> +  sup_mod_info = &the_dyn_call_graph.sup_modules[mod_id];
>> +
>> +  if (sup_mod_info->imported_modules == 0)
>> +    return 0;
>> +
>> +  pointer_set_traverse (sup_mod_info->imported_modules,
>> +                        gcov_compute_mset_size, &array_len, 0, 0);
>> +  imp_array.imported_modules = XNEWVEC (const struct dyn_imp_mod *, array_len);
>> +  imp_array.len = 0;
>> +  imp_array.importing_module = mod_info;
>> +  pointer_set_traverse (sup_mod_info->imported_modules,
>> +                        gcov_collect_imported_modules, &imp_array, 0, 0);
>> +  *len = imp_array.len;
>> +  qsort (imp_array.imported_modules, imp_array.len,
>> +         sizeof (void *), sort_by_module_wt);
>> +  return imp_array.imported_modules;
>> +}
>> +
>> +/* Compute modules that are needed for NODE (for cross module inlining).
>> +   CUTTOFF_COUNT is the call graph edge count cutoff value.
>> +   IMPORT_SCALE is the scaling-factor (percent) by which to scale the
>> +   weights of imported modules of a callee before propagating them to
>> +   the caller, if the callee and caller are in different modules.
>> +
>> +   Each imported module is assigned a weight that corresponds to the
>> +   expected benefit due to cross-module inlining. When the imported modules
>> +   are written out, they are sorted with highest weight first.
>> +
>> +   The following example illustrates how the weight is computed:
>> +
>> +   Suppose we are processing call-graph node A. It calls function B 50 times,
>> +   which calls function C 1000 times, and function E 800 times. Lets say B has
>> +   another in-edge from function D, with edge-count of 50. Say all the
>> +   functions are in separate modules (modules a, b, c, d, e, respectively):
>> +
>> +              D
>> +              |
>> +              | 50
>> +              |
>> +       50     v     1000
>> +  A --------> B ----------> C
>> +              |
>> +              | 800
>> +              |
>> +              v
>> +              E
>> +
>> +  Nodes are processed in depth-first order, so when processing A, we first
>> +  process B. For node B, we are going to add module c to the imported-module
>> +  set, with weight 1000 (edge-count), and module e with weight 800.
>> +  Coming back to A, we are going to add the imported-module-set of B to A,
>> +  after doing some scaling.
>> +  The first scaling factor comes from the fact that A calls B 50 times, but B
>> +  has in-edge-count total of 100. So this scaling factor is 50/100 = 0.5
>> +  The second scaling factor is that since B is in a different module than A,
>> +  we want to slightly downgrade imported modules of B, before adding to the
>> +  imported-modules set of A. This scaling factor has a default value of 50%
>> +  (can be set via env variable GCOV_DYN_IMPORT_SCALE).
>> +  So we end up adding modules c and e to the imported-set of A, with weights
>> +  0.5*0.5*1000=250 and 0.5*0.5*800=200, respectively.
>> +
>> +  Next, we have to add module b itself to A. The weight is computed as the
>> +  edge-count plus the sum of scaled-weights of all modules in the
>> +  imported-module set of B, i.e., 50 + 250 + 200 = 500.
>> +
>> +  In computing the weight of module b, we add the sum of scaled-weights of
>> +  imported modules of b, because it doesn't make sense to import c, e in
>> +  module a, until module b is imported.  */
>> +
>> +static void
>> +gcov_process_cgraph_node (struct dyn_cgraph_node *node,
>> +                          gcov_type cutoff_count,
>> +                         unsigned import_scale)
>> +{
>> +  unsigned mod_id;
>> +  struct dyn_cgraph_edge *callees;
>> +  struct dyn_cgraph_edge *callers;
>> +  node->visited = 1;
>> +  node->sum_in_count = 0;
>> +
>> +  callers = node->callers;
>> +  while (callers)
>> +    {
>> +      node->sum_in_count += callers->count;
>> +      callers = callers->next_caller;
>> +    }
>> +
>> +  callees = node->callees;
>> +  mod_id = get_module_idx_from_func_glob_uid (node->guid);
>> +
>> +  while (callees)
>> +    {
>> +      if (!callees->callee->visited)
>> +        gcov_process_cgraph_node (callees->callee,
>> +                                  cutoff_count,
>> +                                 import_scale);
>> +      callees = callees->next_callee;
>> +    }
>> +
>> +  callees = node->callees;
>> +  while (callees)
>> +    {
>> +      if (callees->count >= cutoff_count)
>> +        {
>> +          unsigned callee_mod_id;
>> +          struct dyn_pointer_set *imp_modules
>> +              = gcov_get_imp_module_set (node);
>> +
>> +          callee_mod_id
>> +              = get_module_idx_from_func_glob_uid (callees->callee->guid);
>> +
>> +         double callee_mod_wt = (double) callees->count;
>> +          if (callees->callee->imported_modules)
>> +           {
>> +             double scale = ((double) callees->count) /
>> +               ((double) callees->callee->sum_in_count);
>> +             /* Reduce weight if callee is in different module.  */
>> +             if (mod_id != callee_mod_id)
>> +               scale = (scale * import_scale) / 100.0;
>> +             pointer_set_traverse (callees->callee->imported_modules,
>> +                                   gcov_propagate_imp_modules,
>> +                                   imp_modules, &scale, &callee_mod_wt);
>> +           }
>> +          if (mod_id != callee_mod_id)
>> +            {
>> +              struct gcov_info *callee_mod_info
>> +                  = get_module_info (callee_mod_id);
>> +              imp_mod_set_insert (imp_modules, callee_mod_info, callee_mod_wt);
>> +            }
>> +        }
>> +
>> +      callees = callees->next_callee;
>> +    }
>> +}
>> +
>> +/* Compute module grouping using CUTOFF_COUNT as the hot edge
>> +   threshold.  */
>> +
>> +#define DEFAULT_IMPORT_SCALE 100
>> +static void
>> +gcov_compute_module_groups (gcov_type cutoff_count)
>> +{
>> +  unsigned m_ix;
>> +  struct gcov_info *gi_ptr;
>> +  const char *import_scale_str;
>> +  unsigned import_scale = DEFAULT_IMPORT_SCALE;
>> +
>> +  import_scale_str = getenv ("GCOV_DYN_IMPORT_SCALE");
>> +  if (import_scale_str && strlen (import_scale_str))
>> +    import_scale = atoi (import_scale_str);
>> +
>> +  for (m_ix = 0; m_ix < the_dyn_call_graph.num_modules; m_ix++)
>> +    {
>> +      const struct gcov_fn_info *fi_ptr;
>> +      unsigned f_ix;
>> +
>> +      gi_ptr = the_dyn_call_graph.modules[m_ix];
>> +
>> +      for (f_ix = 0; f_ix < gi_ptr->n_functions; f_ix++)
>> +       {
>> +         struct dyn_cgraph_node *node;
>> +
>> +         fi_ptr = the_dyn_call_graph.functions[m_ix][f_ix];
>> +         node = *(pointer_set_find_or_insert
>> +                  (the_dyn_call_graph.call_graph_nodes[m_ix], fi_ptr->ident));
>> +         gcc_assert (node);
>> +          if (node->visited)
>> +            continue;
>> +
>> +          gcov_process_cgraph_node (node, cutoff_count, import_scale);
>> +       }
>> +    }
>> +
>> +  for (m_ix = 0; m_ix < the_dyn_call_graph.num_modules; m_ix++)
>> +    {
>> +      const struct gcov_fn_info *fi_ptr;
>> +      unsigned f_ix;
>> +
>> +      gi_ptr = the_dyn_call_graph.modules[m_ix];
>> +
>> +      for (f_ix = 0; f_ix < gi_ptr->n_functions; f_ix++)
>> +       {
>> +         struct dyn_cgraph_node *node;
>> +          unsigned mod_id;
>> +          struct dyn_pointer_set *imp_modules;
>> +
>> +         fi_ptr = the_dyn_call_graph.functions[m_ix][f_ix];
>> +         node = *(pointer_set_find_or_insert
>> +                  (the_dyn_call_graph.call_graph_nodes[m_ix], fi_ptr->ident));
>> +         gcc_assert (node);
>> +
>> +          if (!node->imported_modules)
>> +            continue;
>> +
>> +          mod_id = get_module_idx_from_func_glob_uid (node->guid);
>> +          gcc_assert (mod_id == m_ix);
>> +
>> +          imp_modules
>> +              = gcov_get_module_imp_module_set (
>> +                  &the_dyn_call_graph.sup_modules[mod_id]);
>> +
>> +          pointer_set_traverse (node->imported_modules,
>> +                                gcov_propagate_imp_modules,
>> +                                imp_modules, 0, 0);
>> +       }
>> +    }
>> +
>> +  /* Now compute the export attribute  */
>> +  for (m_ix = 0; m_ix < the_dyn_call_graph.num_modules; m_ix++)
>> +    {
>> +      struct dyn_module_info *mi
>> +          = &the_dyn_call_graph.sup_modules[m_ix];
>> +      if (mi->imported_modules)
>> +        pointer_set_traverse (mi->imported_modules,
>> +                              gcov_mark_export_modules, 0, 0, 0);
>> +    }
>> +}
>> +
>> +/* For each module, compute at random, the group of imported modules,
>> +   that is of size at most MAX_GROUP_SIZE.  */
>> +
>> +static void
>> +gcov_compute_random_module_groups (unsigned max_group_size)
>> +{
>> +  unsigned m_ix;
>> +
>> +  if (max_group_size > the_dyn_call_graph.num_modules)
>> +    max_group_size = the_dyn_call_graph.num_modules;
>> +
>> +  for (m_ix = 0; m_ix < the_dyn_call_graph.num_modules; m_ix++)
>> +    {
>> +      struct dyn_pointer_set *imp_modules =
>> +       gcov_get_module_imp_module_set (&the_dyn_call_graph.sup_modules[m_ix]);
>> +      int cur_group_size = random () % max_group_size;
>> +      int i = 0;
>> +      while (i < cur_group_size)
>> +       {
>> +         struct gcov_info *imp_mod_info;
>> +         unsigned mod_id = random () % the_dyn_call_graph.num_modules;
>> +         if (mod_id == m_ix)
>> +           continue;
>> +         imp_mod_info = get_module_info (mod_id);
>> +         if (!imp_mod_set_insert (imp_modules, imp_mod_info, 1.0))
>> +           i++;
>> +       }
>> +    }
>> +
>> +  /* Now compute the export attribute  */
>> +  for (m_ix = 0; m_ix < the_dyn_call_graph.num_modules; m_ix++)
>> +    {
>> +      struct dyn_module_info *mi
>> +       = &the_dyn_call_graph.sup_modules[m_ix];
>> +      if (mi->imported_modules)
>> +        pointer_set_traverse (mi->imported_modules,
>> +                              gcov_mark_export_modules, 0, 0, 0);
>> +    }
>> +}
>> +
>> +/* Write out MOD_INFO into the gcda file. IS_PRIMARY is a flag
>> +   indicating if the module is the primary module in the group.  */
>> +
>> +static void
>> +gcov_write_module_info (const struct gcov_info *mod_info,
>> +                        unsigned is_primary)
>> +{
>> +  gcov_unsigned_t len = 0, filename_len = 0, src_filename_len = 0, i, j;
>> +  gcov_unsigned_t num_strings;
>> +  gcov_unsigned_t *aligned_fname;
>> +  struct gcov_module_info  *module_info = mod_info->mod_info;
>> +  filename_len = (strlen (module_info->da_filename) +
>> +                 sizeof (gcov_unsigned_t)) / sizeof (gcov_unsigned_t);
>> +  src_filename_len = (strlen (module_info->source_filename) +
>> +                     sizeof (gcov_unsigned_t)) / sizeof (gcov_unsigned_t);
>> +  len = filename_len + src_filename_len;
>> +  len += 2; /* each name string is led by a length.  */
>> +
>> +  num_strings = module_info->num_quote_paths + module_info->num_bracket_paths +
>> +    module_info->num_cpp_defines + module_info->num_cpp_includes +
>> +    module_info->num_cl_args;
>> +  for (i = 0; i < num_strings; i++)
>> +    {
>> +      gcov_unsigned_t string_len
>> +          = (strlen (module_info->string_array[i]) + sizeof (gcov_unsigned_t))
>> +          / sizeof (gcov_unsigned_t);
>> +      len += string_len;
>> +      len += 1; /* Each string is lead by a length.  */
>> +    }
>> +
>> +  len += 9; /* 9 more fields */
>> +
>> +  gcov_write_tag_length (GCOV_TAG_MODULE_INFO, len);
>> +  gcov_write_unsigned (module_info->ident);
>> +  gcov_write_unsigned (is_primary);
>> +  gcov_write_unsigned (module_info->is_exported);
>> +  gcov_write_unsigned (module_info->lang);
>> +  gcov_write_unsigned (module_info->num_quote_paths);
>> +  gcov_write_unsigned (module_info->num_bracket_paths);
>> +  gcov_write_unsigned (module_info->num_cpp_defines);
>> +  gcov_write_unsigned (module_info->num_cpp_includes);
>> +  gcov_write_unsigned (module_info->num_cl_args);
>> +
>> +  /* Now write the filenames */
>> +  aligned_fname = (gcov_unsigned_t *) alloca ((filename_len + src_filename_len + 2) *
>> +                                             sizeof (gcov_unsigned_t));
>> +  memset (aligned_fname, 0,
>> +          (filename_len + src_filename_len + 2) * sizeof (gcov_unsigned_t));
>> +  aligned_fname[0] = filename_len;
>> +  strcpy ((char*) (aligned_fname + 1), module_info->da_filename);
>> +  aligned_fname[filename_len + 1] = src_filename_len;
>> +  strcpy ((char*) (aligned_fname + filename_len + 2), module_info->source_filename);
>> +
>> +  for (i = 0; i < (filename_len + src_filename_len + 2); i++)
>> +    gcov_write_unsigned (aligned_fname[i]);
>> +
>> +  /* Now write the string array.  */
>> +  for (j = 0; j < num_strings; j++)
>> +    {
>> +      gcov_unsigned_t *aligned_string;
>> +      gcov_unsigned_t string_len =
>> +       (strlen (module_info->string_array[j]) + sizeof (gcov_unsigned_t)) /
>> +       sizeof (gcov_unsigned_t);
>> +      aligned_string = (gcov_unsigned_t *)
>> +       alloca ((string_len + 1) * sizeof (gcov_unsigned_t));
>> +      memset (aligned_string, 0, (string_len + 1) * sizeof (gcov_unsigned_t));
>> +      aligned_string[0] = string_len;
>> +      strcpy ((char*) (aligned_string + 1), module_info->string_array[j]);
>> +      for (i = 0; i < (string_len + 1); i++)
>> +        gcov_write_unsigned (aligned_string[i]);
>> +    }
>> +}
>> +
>> +/* Write out MOD_INFO and its imported modules into gcda file.  */
>> +
>> +void
>> +gcov_write_module_infos (struct gcov_info *mod_info)
>> +{
>> +  unsigned mod_id, imp_len = 0;
>> +  const struct dyn_imp_mod **imp_mods;
>> +
>> +  mod_id = get_module_idx (mod_info);
>> +  gcov_write_module_info (mod_info, 1);
>> +
>> +  imp_mods = gcov_get_sorted_import_module_array (mod_info, &imp_len);
>> +  if (imp_mods)
>> +    {
>> +      unsigned i;
>> +
>> +      for (i = 0; i < imp_len; i++)
>> +        {
>> +          const struct gcov_info *imp_mod = imp_mods[i]->imp_mod;
>> +          gcov_write_module_info (imp_mod, 0);
>> +        }
>> +      free (imp_mods);
>> +    }
>> +}
>> +
>> +/* Compute module groups needed for L-IPO compilation.  */
>> +
>> +void
>> +__gcov_compute_module_groups (void)
>> +{
>> +  gcov_type cut_off_count;
>> +  char *seed = getenv ("LIPO_RANDOM_GROUPING");
>> +  char *max_group_size = seed ? strchr (seed, ':') : 0;
>> +
>> +  if (seed && max_group_size)
>> +    {
>> +      *max_group_size = '\0';
>> +      max_group_size++;
>> +      srandom (atoi (seed));
>> +      init_dyn_call_graph ();
>> +      gcov_compute_random_module_groups (atoi (max_group_size));
>> +      return;
>> +    }
>> +
>> +  /* First compute dynamic call graph.  */
>> +  gcov_build_callgraph ();
>> +
>> +  cut_off_count = gcov_compute_cutoff_count ();
>> +
>> +  gcov_compute_module_groups (cut_off_count);
>> +
>> +  gcov_dump_callgraph (cut_off_count);
>> +
>> +}
>> +
>> +/* Dumper function for NODE.  */
>> +static void
>> +gcov_dump_cgraph_node_short (struct dyn_cgraph_node *node)
>> +{
>> +  unsigned mod_id, func_id;
>> +  struct gcov_info *mod_info;
>> +  mod_id = get_module_idx_from_func_glob_uid (node->guid);
>> +  func_id = get_intra_module_func_id (node->guid);
>> +
>> +  mod_info = the_dyn_call_graph.modules[mod_id];
>> +
>> +  fprintf (stderr, "NODE(%llx) module(%s) func(%u)",
>> +           (long long)node->guid,
>> +           mod_info->mod_info->source_filename, func_id);
>> +}
>> +
>> +/* Dumper function for NODE.   M is the module id and F is the function id.  */
>> +
>> +static void
>> +gcov_dump_cgraph_node (struct dyn_cgraph_node *node, unsigned m, unsigned f)
>> +{
>> +  unsigned mod_id, func_id;
>> +  struct gcov_info *mod_info;
>> +  struct dyn_cgraph_edge *callers;
>> +  struct dyn_cgraph_edge *callees;
>> +
>> +  mod_id = get_module_idx_from_func_glob_uid (node->guid);
>> +  func_id = get_intra_module_func_id (node->guid);
>> +  gcc_assert (mod_id == m && func_id == f);
>> +
>> +  mod_info = the_dyn_call_graph.modules[mod_id];
>> +
>> +  fprintf (stderr, "NODE(%llx) module(%s) func(%x)\n",
>> +           (long long) node->guid,
>> +           mod_info->mod_info->source_filename, f);
>> +
>> +  /* Now dump callers.  */
>> +  callers = node->callers;
>> +  fprintf (stderr, "\t[CALLERS]\n");
>> +  while (callers != 0)
>> +    {
>> +      fprintf (stderr,"\t\t[count=%ld] ", (long)  callers->count);
>> +      gcov_dump_cgraph_node_short (callers->caller);
>> +      fprintf (stderr,"\n");
>> +      callers = callers->next_caller;
>> +    }
>> +
>> +  callees = node->callees;
>> +  fprintf (stderr, "\t[CALLEES]\n");
>> +  while (callees != 0)
>> +    {
>> +      fprintf (stderr,"\t\t[count=%ld] ", (long)  callees->count);
>> +      gcov_dump_cgraph_node_short (callees->callee);
>> +      fprintf (stderr,"\n");
>> +      callees = callees->next_callee;
>> +    }
>> +}
>> +
>> +/* Dumper function for NODE.   M is the module id and F is the function id.  */
>> +
>> +static void
>> +gcov_dump_cgraph_node_dot (struct dyn_cgraph_node *node,
>> +                           unsigned m, unsigned f,
>> +                           gcov_type cutoff_count)
>> +{
>> +  unsigned mod_id, func_id, imp_len = 0, i;
>> +  struct gcov_info *mod_info;
>> +  const struct dyn_imp_mod **imp_mods;
>> +  struct dyn_cgraph_edge *callees;
>> +
>> +  mod_id = get_module_idx_from_func_glob_uid (node->guid);
>> +  func_id = get_intra_module_func_id (node->guid);
>> +  gcc_assert (mod_id == m && func_id == f);
>> +
>> +  mod_info = the_dyn_call_graph.modules[mod_id];
>> +
>> +  fprintf (stderr, "NODE_%llx[label=\"MODULE\\n(%s)\\n FUNC(%x)\\n",
>> +           (long long) node->guid, mod_info->mod_info->source_filename, f);
>> +
>> +  imp_mods = gcov_get_sorted_import_module_array (mod_info, &imp_len);
>> +  fprintf (stderr, "IMPORTS:\\n");
>> +  if (imp_mods)
>> +    {
>> +      for (i = 0; i < imp_len; i++)
>> +        fprintf (stderr, "%s\\n", imp_mods[i]->imp_mod->mod_info->source_filename);
>> +      fprintf (stderr, "\"]\n");
>> +      free (imp_mods);
>> +    }
>> +  else
>> +    fprintf (stderr, "\"]\n");
>> +
>> +  callees = node->callees;
>> +  while (callees != 0)
>> +    {
>> +      if (callees->count >= cutoff_count)
>> +        fprintf (stderr, "NODE_%llx -> NODE_%llx[label=%lld color=red]\n",
>> +                 (long long) node->guid, (long long) callees->callee->guid,
>> +                 (long long) callees->count);
>> +      else
>> +        fprintf (stderr, "NODE_%llx -> NODE_%llx[label=%lld color=blue]\n",
>> +                 (long long) node->guid, (long long) callees->callee->guid,
>> +                 (long long) callees->count);
>> +      callees = callees->next_callee;
>> +    }
>> +}
>> +
>> +/* Dump dynamic call graph.  CUTOFF_COUNT is the computed hot edge threshold.  */
>> +
>> +static void
>> +gcov_dump_callgraph (gcov_type cutoff_count)
>> +{
>> +  struct gcov_info *gi_ptr;
>> +  unsigned m_ix;
>> +  const char *dyn_cgraph_dump = 0;
>> +
>> +  dyn_cgraph_dump = getenv ("GCOV_DYN_CGRAPH_DUMP");
>> +
>> +  if (!dyn_cgraph_dump || !strlen (dyn_cgraph_dump))
>> +      return;
>> +
>> +  fprintf (stderr,"digraph dyn_call_graph {\n");
>> +  fprintf (stderr,"node[shape=box]\nsize=\"11,8.5\"\n");
>> +
>> +  for (m_ix = 0; m_ix < the_dyn_call_graph.num_modules; m_ix++)
>> +    {
>> +      const struct gcov_fn_info *fi_ptr;
>> +      unsigned f_ix;
>> +
>> +      gi_ptr = the_dyn_call_graph.modules[m_ix];
>> +
>> +      for (f_ix = 0; f_ix < gi_ptr->n_functions; f_ix++)
>> +       {
>> +         struct dyn_cgraph_node *node;
>> +         fi_ptr = the_dyn_call_graph.functions[m_ix][f_ix];
>> +
>> +         node = *(pointer_set_find_or_insert
>> +                  (the_dyn_call_graph.call_graph_nodes[m_ix], fi_ptr->ident));
>> +         gcc_assert (node);
>> +
>> +          /* skip dead functions  */
>> +          if (!node->callees && !node->callers)
>> +            continue;
>> +
>> +          if (dyn_cgraph_dump[0] == '1')
>> +            gcov_dump_cgraph_node (node, m_ix, fi_ptr->ident);
>> +          else
>> +            gcov_dump_cgraph_node_dot (node, m_ix, fi_ptr->ident,
>> +                                       cutoff_count);
>> +       }
>> +    }
>> +  fprintf (stderr,"}\n");
>> +}
>> +
>> +
>> +#endif
>> Index: dbgcnt.def
>> ===================================================================
>> --- dbgcnt.def  (revision 172880)
>> +++ dbgcnt.def  (working copy)
>> @@ -141,10 +141,12 @@ echo ubound: $ub
>>  */
>>
>>  /* Debug counter definitions.  */
>> +DEBUG_COUNTER (alias)
>>  DEBUG_COUNTER (auto_inc_dec)
>>  DEBUG_COUNTER (ccp)
>>  DEBUG_COUNTER (cfg_cleanup)
>>  DEBUG_COUNTER (cse2_move2add)
>> +DEBUG_COUNTER (clone)
>>  DEBUG_COUNTER (cprop)
>>  DEBUG_COUNTER (dce)
>>  DEBUG_COUNTER (dce_fast)
>> @@ -165,6 +167,7 @@ DEBUG_COUNTER (if_conversion)
>>  DEBUG_COUNTER (if_conversion_tree)
>>  DEBUG_COUNTER (if_after_combine)
>>  DEBUG_COUNTER (if_after_reload)
>> +DEBUG_COUNTER (inl)
>>  DEBUG_COUNTER (local_alloc_for_sched)
>>  DEBUG_COUNTER (postreload_cse)
>>  DEBUG_COUNTER (pre)
>> Index: c-family/c-opts.c
>> ===================================================================
>> --- c-family/c-opts.c   (revision 172880)
>> +++ c-family/c-opts.c   (working copy)
>> @@ -38,6 +38,9 @@ along with GCC; see the file COPYING3.
>>  #include "mkdeps.h"
>>  #include "target.h"            /* For gcc_targetcm.  */
>>  #include "tm_p.h"              /* For C_COMMON_OVERRIDE_OPTIONS.  */
>> +#include "function.h"
>> +#include "params.h"
>> +#include "l-ipo.h"
>>
>>  #ifndef DOLLARS_IN_IDENTIFIERS
>>  # define DOLLARS_IN_IDENTIFIERS true
>> @@ -364,6 +367,7 @@ c_common_handle_option (size_t scode, co
>>       warn_missing_braces = value;
>>       warn_parentheses = value;
>>       warn_return_type = value;
>> +      warn_ripa_opt_mismatch = value;
>>       warn_sequence_point = value;     /* Was C only.  */
>>       warn_switch = value;
>>       if (warn_strict_aliasing == -1)
>> @@ -864,6 +868,10 @@ c_common_post_options (const char **pfil
>>   else if (!flag_gnu89_inline && !flag_isoc99)
>>     error ("-fno-gnu89-inline is only supported in GNU99 or C99 mode");
>>
>> +  if (flag_dyn_ipa && cpp_opts->preprocessed)
>> +    error ("-fpreprocessed/-save-temps are not supported with -fripa");
>> +
>> +
>>   /* Default to ObjC sjlj exception handling if NeXT runtime.  */
>>   if (flag_objc_sjlj_exceptions < 0)
>>     flag_objc_sjlj_exceptions = flag_next_runtime;
>> @@ -1061,6 +1069,28 @@ c_common_init (void)
>>   return true;
>>  }
>>
>> +/* Return TRUE if the lipo maximum memory consumption limit is reached, and
>> +   we should not import any further auxiliary modules. Check after parsing
>> +   each module, the Ith module being the just parsed module.  */
>> +static bool
>> +lipo_max_mem_reached (unsigned int i)
>> +{
>> +  if (L_IPO_COMP_MODE && PARAM_VALUE (PARAM_MAX_LIPO_MEMORY)
>> +      && i < (num_in_fnames - 1)
>> +      && ((ggc_total_allocated () >> 10)
>> +          > (size_t) PARAM_VALUE (PARAM_MAX_LIPO_MEMORY))) {
>> +    i++;
>> +    do {
>> +      inform (input_location, "Not importing %s: maximum memory "
>> +             "consumption reached", in_fnames[i]);
>> +      i++;
>> +    } while (i < num_in_fnames);
>> +    return true;
>> +  }
>> +  return false;
>> +}
>> +
>> +
>>  /* Initialize the integrated preprocessor after debug output has been
>>    initialized; loop over each input file.  */
>>  void
>> @@ -1073,8 +1103,15 @@ c_common_parse_file (void)
>>     {
>>       c_finish_options ();
>>       pch_init ();
>> +      set_lipo_c_parsing_context (parse_in, i, verbose);
>>       push_file_scope ();
>>       c_parse_file ();
>> +      /* In lipo mode, processing too many auxiliary files will cause us
>> +        to hit memory limits, and cause thrashing -- prevent this by not
>> +        processing any further auxiliary modules if we reach a certain
>> +        memory limit.  */
>> +      if (lipo_max_mem_reached (i))
>> +       num_in_fnames = i + 1;
>>       pop_file_scope ();
>>       /* And end the main input file, if the debug writer wants it  */
>>       if (debug_hooks->start_end_main_source_file)
>> @@ -1083,6 +1120,7 @@ c_common_parse_file (void)
>>        break;
>>       cpp_undef_all (parse_in);
>>       cpp_clear_file_cache (parse_in);
>> +      deferred_count = 0;
>>       this_input_filename
>>        = cpp_read_main_file (parse_in, in_fnames[i]);
>>       /* If an input file is missing, abandon further compilation.
>> @@ -1320,9 +1358,15 @@ c_finish_options (void)
>>          struct deferred_opt *opt = &deferred_opts[i];
>>
>>          if (opt->code == OPT_D)
>> -           cpp_define (parse_in, opt->arg);
>> +           {
>> +             cpp_define (parse_in, opt->arg);
>> +             coverage_note_define (opt->arg, true);
>> +           }
>>          else if (opt->code == OPT_U)
>> -           cpp_undef (parse_in, opt->arg);
>> +           {
>> +             cpp_undef (parse_in, opt->arg);
>> +             coverage_note_define (opt->arg, false);
>> +           }
>>          else if (opt->code == OPT_A)
>>            {
>>              if (opt->arg[0] == '-')
>> @@ -1345,6 +1389,7 @@ c_finish_options (void)
>>          if (opt->code == OPT_imacros
>>              && cpp_push_include (parse_in, opt->arg))
>>            {
>> +             coverage_note_include (opt->arg);
>>              /* Disable push_command_line_include callback for now.  */
>>              include_cursor = deferred_count + 1;
>>              cpp_scan_nooutput (parse_in);
>> @@ -1376,7 +1421,10 @@ push_command_line_include (void)
>>
>>       if (!cpp_opts->preprocessed && opt->code == OPT_include
>>          && cpp_push_include (parse_in, opt->arg))
>> -       return;
>> +       {
>> +         coverage_note_include (opt->arg);
>> +         return;
>> +       }
>>     }
>>
>>   if (include_cursor == deferred_count)
>> Index: cgraph.c
>> ===================================================================
>> --- cgraph.c    (revision 172880)
>> +++ cgraph.c    (working copy)
>> @@ -98,6 +98,7 @@ The callgraph:
>>  #include "rtl.h"
>>  #include "ipa-utils.h"
>>  #include "lto-streamer.h"
>> +#include "l-ipo.h"
>>
>>  const char * const ld_plugin_symbol_resolution_names[]=
>>  {
>> @@ -485,6 +486,44 @@ cgraph_create_node (void)
>>   return node;
>>  }
>>
>> +/* Add an entry in assembler hash for NODE.  */
>> +void
>> +cgraph_add_assembler_hash_node (struct cgraph_node *node)
>> +{
>> +  if (assembler_name_hash)
>> +    {
>> +      void **aslot;
>> +      tree name = DECL_ASSEMBLER_NAME (node->decl);
>> +
>> +      aslot = htab_find_slot_with_hash (assembler_name_hash, name,
>> +                                       decl_assembler_name_hash (name),
>> +                                       INSERT);
>> +      /* We can have multiple declarations with same assembler name. For C++
>> +        it is __builtin_strlen and strlen, for instance.  Do we need to
>> +        record them all?  Original implementation marked just first one
>> +        so lets hope for the best.  */
>> +      if (*aslot == NULL)
>> +       *aslot = node;
>> +    }
>> +}
>> +
>> +/* Remove from assembler hash for NODE.  */
>> +void
>> +cgraph_remove_assembler_hash_node (struct cgraph_node *node)
>> +{
>> +  void **slot;
>> +  if (assembler_name_hash)
>> +    {
>> +      tree name = DECL_ASSEMBLER_NAME (node->decl);
>> +      slot = htab_find_slot_with_hash (assembler_name_hash, name,
>> +                                      decl_assembler_name_hash (name),
>> +                                      NO_INSERT);
>> +      /* Inline clones are not hashed.  */
>> +      if (slot && *slot == node)
>> +        htab_clear_slot (assembler_name_hash, slot);
>> +    }
>> +}
>> +
>>  /* Return cgraph node assigned to DECL.  Create new one when needed.  */
>>
>>  struct cgraph_node *
>> @@ -518,21 +557,7 @@ cgraph_node (tree decl)
>>       node->next_nested = node->origin->nested;
>>       node->origin->nested = node;
>>     }
>> -  if (assembler_name_hash)
>> -    {
>> -      void **aslot;
>> -      tree name = DECL_ASSEMBLER_NAME (decl);
>> -
>> -      aslot = htab_find_slot_with_hash (assembler_name_hash, name,
>> -                                       decl_assembler_name_hash (name),
>> -                                       INSERT);
>> -      /* We can have multiple declarations with same assembler name. For C++
>> -        it is __builtin_strlen and strlen, for instance.  Do we need to
>> -        record them all?  Original implementation marked just first one
>> -        so lets hope for the best.  */
>> -      if (*aslot == NULL)
>> -       *aslot = node;
>> -    }
>> +  cgraph_add_assembler_hash_node (node);
>>   return node;
>>  }
>>
>> @@ -830,7 +855,9 @@ cgraph_edge (struct cgraph_node *node, g
>>     {
>>       node->call_site_hash = htab_create_ggc (120, edge_hash, edge_eq, NULL);
>>       for (e2 = node->callees; e2; e2 = e2->next_callee)
>> -       cgraph_add_edge_to_call_site_hash (e2);
>> +       /* Skip fake edges.  */
>> +       if (e2->call_stmt)
>> +         cgraph_add_edge_to_call_site_hash (e2);
>>       for (e2 = node->indirect_calls; e2; e2 = e2->next_callee)
>>        cgraph_add_edge_to_call_site_hash (e2);
>>     }
>> @@ -1139,7 +1166,7 @@ cgraph_edge_remove_caller (struct cgraph
>>       else
>>        e->caller->callees = e->next_callee;
>>     }
>> -  if (e->caller->call_site_hash)
>> +  if (e->caller->call_site_hash && e->call_stmt)
>>     htab_remove_elt_with_hash (e->caller->call_site_hash,
>>                               e->call_stmt,
>>                               htab_hash_pointer (e->call_stmt));
>> @@ -1178,6 +1205,26 @@ cgraph_remove_edge (struct cgraph_edge *
>>   cgraph_free_edge (e);
>>  }
>>
>> +/* Remove fake cgraph edges for indirect calls. NODE is the callee
>> +   of the edges.  */
>> +
>> +void
>> +cgraph_remove_fake_indirect_call_in_edges (struct cgraph_node *node)
>> +{
>> +  struct cgraph_edge *f, *e;
>> +
>> +  if (!L_IPO_COMP_MODE)
>> +    return;
>> +
>> +  for (e = node->callers; e; e = f)
>> +    {
>> +      f = e->next_caller;
>> +      if (!e->call_stmt)
>> +        cgraph_remove_edge (e);
>> +    }
>> +}
>> +
>> +
>>  /* Set callee of call graph edge E and add it to the corresponding set of
>>    callers. */
>>
>> @@ -1400,6 +1447,8 @@ cgraph_node_remove_callers (struct cgrap
>>  void
>>  cgraph_release_function_body (struct cgraph_node *node)
>>  {
>> +  if (cgraph_is_aux_decl_external (node))
>> +    DECL_EXTERNAL (node->decl) = 1;
>>   if (DECL_STRUCT_FUNCTION (node->decl))
>>     {
>>       tree old_decl = current_function_decl;
>> @@ -1667,19 +1716,14 @@ cgraph_remove_node (struct cgraph_node *
>>                  || n->in_other_partition)))
>>        kill_body = true;
>>     }
>> -  if (assembler_name_hash)
>> -    {
>> -      tree name = DECL_ASSEMBLER_NAME (node->decl);
>> -      slot = htab_find_slot_with_hash (assembler_name_hash, name,
>> -                                      decl_assembler_name_hash (name),
>> -                                      NO_INSERT);
>> -      /* Inline clones are not hashed.  */
>> -      if (slot && *slot == node)
>> -        htab_clear_slot (assembler_name_hash, slot);
>> -    }
>> +
>> +  cgraph_remove_assembler_hash_node (node);
>>
>>   if (kill_body)
>>     cgraph_release_function_body (node);
>> +
>> +  cgraph_remove_link_node (node);
>> +
>>   node->decl = NULL;
>>   if (node->call_site_hash)
>>     {
>> @@ -1724,7 +1768,8 @@ cgraph_mark_reachable_node (struct cgrap
>>             during the optimization process.  This can happen for extern
>>             inlines when bodies was removed after inlining.  */
>>          gcc_assert ((node->analyzed || node->in_other_partition
>> -                      || DECL_EXTERNAL (node->decl)));
>> +                      || DECL_EXTERNAL (node->decl)
>> +                       || cgraph_is_aux_decl_external (node)));
>>        }
>>       else
>>         notice_global_symbol (node->decl);
>> @@ -2190,6 +2235,7 @@ cgraph_clone_node (struct cgraph_node *n
>>   new_node->global = n->global;
>>   new_node->rtl = n->rtl;
>>   new_node->count = count;
>> +  new_node->is_versioned_clone = n->is_versioned_clone;
>>   new_node->frequency = n->frequency;
>>   new_node->clone = n->clone;
>>   new_node->clone.tree_map = 0;
>> @@ -2239,17 +2285,8 @@ cgraph_clone_node (struct cgraph_node *n
>>       slot = (struct cgraph_node **) htab_find_slot (cgraph_hash, new_node, INSERT);
>>       gcc_assert (!*slot);
>>       *slot = new_node;
>> -      if (assembler_name_hash)
>> -       {
>> -         void **aslot;
>> -         tree name = DECL_ASSEMBLER_NAME (decl);
>> -
>> -         aslot = htab_find_slot_with_hash (assembler_name_hash, name,
>> -                                           decl_assembler_name_hash (name),
>> -                                           INSERT);
>> -         gcc_assert (!*aslot);
>> -         *aslot = new_node;
>> -       }
>> +      cgraph_add_assembler_hash_node (new_node);
>> +      cgraph_link_node (new_node);
>>     }
>>   return new_node;
>>  }
>> @@ -2379,6 +2416,7 @@ cgraph_create_virtual_clone (struct cgra
>>     new_node->clone.combined_args_to_skip = args_to_skip;
>>   new_node->local.externally_visible = 0;
>>   new_node->local.local = 1;
>> +  new_node->is_versioned_clone = 1;
>>   new_node->lowered = true;
>>   new_node->reachable = true;
>>
>> Index: cgraph.h
>> ===================================================================
>> --- cgraph.h    (revision 172880)
>> +++ cgraph.h    (working copy)
>> @@ -286,6 +286,8 @@ struct GTY((chain_next ("%h.next"), chai
>>   unsigned alias : 1;
>>   /* Set for nodes that was constructed and finalized by frontend.  */
>>   unsigned finalized_by_frontend : 1;
>> +  /* Is this function cloned during versioning ?  */
>> +  unsigned is_versioned_clone : 1;
>>   /* Set for alias and thunk nodes, same_body points to the node they are alias
>>      of and they are linked through the next/previous pointers.  */
>>   unsigned same_body_alias : 1;
>> @@ -475,6 +477,8 @@ struct GTY((chain_next ("%h.next"), chai
>>   int order;
>>   enum ld_plugin_symbol_resolution resolution;
>>
>> +  /* The module in which it is first declared.  */
>> +  unsigned module_id;
>>   /* Set when function must be output - it is externally visible
>>      or its address is taken.  */
>>   unsigned needed : 1;
>> @@ -547,6 +551,11 @@ void debug_cgraph_node (struct cgraph_no
>>  void cgraph_insert_node_to_hashtable (struct cgraph_node *node);
>>  void cgraph_remove_edge (struct cgraph_edge *);
>>  void cgraph_remove_node (struct cgraph_node *);
>> +void cgraph_add_assembler_hash_node (struct cgraph_node *);
>> +void cgraph_remove_assembler_hash_node (struct cgraph_node *);
>> +void cgraph_remove_fake_indirect_call_in_edges (struct cgraph_node *);
>> +extern bool cgraph_pre_profiling_inlining_done;
>> +extern bool cgraph_is_fake_indirect_call_edge (struct cgraph_edge *e);
>>  void cgraph_remove_node_and_inline_clones (struct cgraph_node *);
>>  void cgraph_release_function_body (struct cgraph_node *);
>>  void cgraph_node_remove_callees (struct cgraph_node *node);
>> @@ -641,6 +650,48 @@ struct cgraph_node *save_inline_function
>>  void record_references_in_initializer (tree, bool);
>>  bool cgraph_process_new_functions (void);
>>
>> +/* Module info structure.  */
>> +struct GTY (()) cgraph_mod_info
>> +{
>> +  unsigned module_id;
>> +};
>> +
>> +/* LIPO linker symbol table entry for function symbols.  */
>> +struct GTY (()) cgraph_sym
>> +{
>> +  tree assembler_name;
>> +  struct cgraph_node *rep_node;
>> +  tree rep_decl;
>> +  htab_t GTY ((param_is (struct cgraph_mod_info))) def_module_hash;
>> +  bool is_promoted_static;
>> +};
>> +
>> +void cgraph_init_gid_map (void);
>> +void cgraph_add_fake_indirect_call_edges (void);
>> +void cgraph_remove_zero_count_fake_edges (void);
>> +void cgraph_do_link (void);
>> +struct cgraph_sym *cgraph_link_node (struct cgraph_node *);
>> +tree cgraph_find_decl (tree asm_name);
>> +void cgraph_remove_link_node (struct cgraph_node *node);
>> +struct cgraph_node *cgraph_lipo_get_resolved_node (tree decl);
>> +struct cgraph_node *cgraph_lipo_get_resolved_node_1 (tree decl, bool);
>> +unsigned  cgraph_get_module_id (tree fndecl);
>> +bool cgraph_is_auxiliary (tree fndecl);
>> +void cgraph_process_module_scope_statics (void);
>> +bool cgraph_is_promoted_static_func (tree fndecl);
>> +bool cgraph_is_inline_body_available_in_module (tree fndecl, unsigned module_id);
>> +bool cgraph_is_aux_decl_external (struct cgraph_node *);
>> +void cgraph_unify_type_alias_sets (void);
>> +void varpool_do_link (void);
>> +void varpool_link_node (struct varpool_node *);
>> +void varpool_remove_link_node (struct varpool_node *node);
>> +struct varpool_node *real_varpool_node (tree decl);
>> +bool varpool_is_auxiliary (struct varpool_node *node);
>> +void varpool_get_referenced_asm_ids (VEC(tree, gc) **);
>> +void varpool_clear_asm_id_reference_bit (void);
>> +void varpool_reset_queue (void);
>> +void varpool_remove_duplicate_weak_decls (void);
>> +
>>  bool cgraph_decide_is_function_needed (struct cgraph_node *, tree);
>>
>>  typedef void (*cgraph_edge_hook)(struct cgraph_edge *, void *);
>> Index: value-prof.c
>> ===================================================================
>> --- value-prof.c        (revision 172880)
>> +++ value-prof.c        (working copy)
>> @@ -46,6 +46,9 @@ along with GCC; see the file COPYING3.
>>  #include "timevar.h"
>>  #include "tree-pass.h"
>>  #include "pointer-set.h"
>> +#include "langhooks.h"
>> +#include "params.h"
>> +#include "l-ipo.h"
>>
>>  /* In this file value profile based optimizations are placed.  Currently the
>>    following optimizations are implemented (for more detailed descriptions
>> @@ -300,6 +303,10 @@ dump_histogram_value (FILE *dump_file, h
>>        }
>>       fprintf (dump_file, ".\n");
>>       break;
>> +    case HIST_TYPE_INDIR_CALL_TOPN:
>> +      fprintf (dump_file, "Indirect call -- top N\n");
>> +      /* TODO add more elaborate dumping code.  */
>> +      break;
>>    }
>>  }
>>
>> @@ -487,6 +494,70 @@ check_counter (gimple stmt, const char *
>>   return false;
>>  }
>>
>> +/* The overall number of invocations of the counter should match
>> +   execution count of basic block.  Report it as error rather than
>> +   internal error as it might mean that user has misused the profile
>> +   somehow.  STMT is the indiret call, COUNT1 and COUNT2 are counts
>> +   of two top targets, and ALL is the enclosing basic block execution
>> +   count.  */
>> +
>> +static bool
>> +check_ic_counter (gimple stmt, gcov_type *count1, gcov_type *count2,
>> +                  gcov_type all)
>> +{
>> +  location_t locus;
>> +  if (*count1 > all && flag_profile_correction)
>> +    {
>> +      locus = (stmt != NULL)
>> +              ? gimple_location (stmt)
>> +              : DECL_SOURCE_LOCATION (current_function_decl);
>> +      inform (locus, "Correcting inconsistent value profile: "
>> +              "ic (topn) profiler top target count (%ld) exceeds "
>> +             "BB count (%ld)", (long)*count1, (long)all);
>> +      *count1 = all;
>> +    }
>> +  if (*count2 > all && flag_profile_correction)
>> +    {
>> +      locus = (stmt != NULL)
>> +              ? gimple_location (stmt)
>> +              : DECL_SOURCE_LOCATION (current_function_decl);
>> +      inform (locus, "Correcting inconsistent value profile: "
>> +              "ic (topn) profiler second target count (%ld) exceeds "
>> +             "BB count (%ld)", (long)*count2, (long)all);
>> +      *count2 = all;
>> +    }
>> +
>> +  if (*count2 > *count1)
>> +    {
>> +      locus = (stmt != NULL)
>> +              ? gimple_location (stmt)
>> +              : DECL_SOURCE_LOCATION (current_function_decl);
>> +      inform (locus, "Corrupted topn ic value profile: "
>> +             "first target count (%ld) is less than the second "
>> +             "target count (%ld)", (long)*count1, (long)*count2);
>> +      return true;
>> +    }
>> +
>> +  if (*count1 + *count2 > all)
>> +    {
>> +      /* If (COUNT1 + COUNT2) is greater than ALL by less than around 10% then
>> +        just fix COUNT2 up so that (COUNT1 + COUNT2) equals ALL.  */
>> +      if ((*count1 + *count2 - all) < (all >> 3))
>> +       *count2 = all - *count1;
>> +      else
>> +       {
>> +         locus = (stmt != NULL)
>> +           ? gimple_location (stmt)
>> +           : DECL_SOURCE_LOCATION (current_function_decl);
>> +         inform (locus, "Corrupted topn ic value profile: top two targets's"
>> +                 " total count (%ld) exceeds bb count (%ld)",
>> +                 (long)(*count1 + *count2), (long)all);
>> +         return true;
>> +       }
>> +    }
>> +  return false;
>> +}
>> +
>>
>>  /* GIMPLE based transformations. */
>>
>> @@ -542,6 +613,14 @@ gimple_value_profile_transformations (vo
>>   if (changed)
>>     {
>>       counts_to_freqs ();
>> +      /* Value profile transformations may change inline parameters
>> +         a lot (e.g., indirect call promotion introduces new direct calls).
>> +         The update is also needed to avoid compiler ICE -- when MULTI
>> +         target icall promotion happens, the caller's size may become
>> +         negative when the promoted direct calls get promoted.  */
>> +      /* Guard this for LIPO for now.  */
>> +      if (L_IPO_COMP_MODE)
>> +        compute_inline_parameters (cgraph_node (current_function_decl));
>>     }
>>
>>   return changed;
>> @@ -1090,6 +1169,111 @@ find_func_by_pid (int   pid)
>>   return pid_map [pid];
>>  }
>>
>> +/* Initialize map of gids (gid -> cgraph node) */
>> +
>> +static htab_t gid_map = NULL;
>> +
>> +typedef struct func_gid_entry
>> +{
>> +  struct cgraph_node *node;
>> +  unsigned HOST_WIDEST_INT gid;
>> +} func_gid_entry_t;
>> +
>> +/* Hash function for function global unique ids.  */
>> +
>> +static hashval_t
>> +htab_gid_hash (const void * ent)
>> +{
>> +  const func_gid_entry_t *const entry = (const func_gid_entry_t *) ent;
>> +  return entry->gid;
>> +}
>> +
>> +/* Hash table equality function for function global unique ids.  */
>> +
>> +static int
>> +htab_gid_eq (const void *ent1, const void * ent2)
>> +{
>> +  const func_gid_entry_t *const entry1 = (const func_gid_entry_t *) ent1;
>> +  const func_gid_entry_t *const entry2 = (const func_gid_entry_t *) ent2;
>> +  return entry1->gid == entry2->gid;
>> +}
>> +
>> +static void
>> +htab_gid_del (void *ent)
>> +{
>> +  func_gid_entry_t *const entry = (func_gid_entry_t *) ent;
>> +  free (entry);
>> +}
>> +
>> +/* Initialize the global unique id map for functions.  */
>> +
>> +static void
>> +init_gid_map (void)
>> +{
>> +  struct cgraph_node *n;
>> +
>> +  gcc_assert (!gid_map);
>> +
>> +  gid_map
>> +      = htab_create (10, htab_gid_hash, htab_gid_eq, htab_gid_del);
>> +
>> +  for (n = cgraph_nodes; n; n = n->next)
>> +    {
>> +      func_gid_entry_t ent, *entp;
>> +      func_gid_entry_t **slot;
>> +      struct function *f;
>> +      ent.node = n;
>> +      f = DECL_STRUCT_FUNCTION (n->decl);
>> +      /* Do not care to indirect call promote a function with id.  */
>> +      if (!f || DECL_ABSTRACT (n->decl))
>> +        continue;
>> +      /* The global function id computed at profile-use time
>> +         is slightly different from the one computed in
>> +         instrumentation runtime -- for the latter, the intra-
>> +         module function ident is 1 based while in profile-use
>> +         phase, it is zero based. See get_next_funcdef_no in
>> +         function.c.  */
>> +      ent.gid = FUNC_DECL_GLOBAL_ID (DECL_STRUCT_FUNCTION (n->decl));
>> +      slot = (func_gid_entry_t **) htab_find_slot (gid_map, &ent, INSERT);
>> +
>> +      gcc_assert (!*slot || ((*slot)->gid == ent.gid && (*slot)->node == n));
>> +      if (!*slot)
>> +        {
>> +          *slot = entp = XCNEW (func_gid_entry_t);
>> +          entp->node = n;
>> +          entp->gid = ent.gid;
>> +        }
>> +    }
>> +}
>> +
>> +/* Initialize the global unique id map for functions.  */
>> +
>> +void
>> +cgraph_init_gid_map (void)
>> +{
>> +  if (!L_IPO_COMP_MODE)
>> +    return;
>> +
>> +  init_gid_map ();
>> +}
>> +
>> +/* Return cgraph node for function with global id.  */
>> +
>> +struct cgraph_node *
>> +find_func_by_global_id (unsigned HOST_WIDE_INT gid)
>> +{
>> +  func_gid_entry_t ent, *entp;
>> +
>> +  gcc_assert (gid_map);
>> +
>> +  ent.node = NULL;
>> +  ent.gid = gid;
>> +  entp = (func_gid_entry_t *)htab_find (gid_map, &ent);
>> +  if (entp)
>> +    return entp->node;
>> +  return NULL;
>> +}
>> +
>>  /* Perform sanity check on the indirect call target. Due to race conditions,
>>    false function target may be attributed to an indirect call site. If the
>>    call expression type mismatches with the target function's type, expand_call
>> @@ -1244,27 +1428,13 @@ gimple_ic (gimple icall_stmt, struct cgr
>>  */
>>
>>  static bool
>> -gimple_ic_transform (gimple stmt)
>> +gimple_ic_transform_single_targ (gimple stmt, histogram_value histogram)
>>  {
>> -  histogram_value histogram;
>>   gcov_type val, count, all, bb_all;
>>   gcov_type prob;
>> -  tree callee;
>>   gimple modify;
>>   struct cgraph_node *direct_call;
>>
>> -  if (gimple_code (stmt) != GIMPLE_CALL)
>> -    return false;
>> -
>> -  callee = gimple_call_fn (stmt);
>> -
>> -  if (TREE_CODE (callee) == FUNCTION_DECL)
>> -    return false;
>> -
>> -  histogram = gimple_histogram_value_of_type (cfun, stmt, HIST_TYPE_INDIR_CALL);
>> -  if (!histogram)
>> -    return false;
>> -
>>   val = histogram->hvalue.counters [0];
>>   count = histogram->hvalue.counters [1];
>>   all = histogram->hvalue.counters [2];
>> @@ -1312,6 +1482,195 @@ gimple_ic_transform (gimple stmt)
>>   return true;
>>  }
>>
>> +/* Convert indirect function call STMT into guarded direct function
>> +   calls. Multiple indirect call targets are supported. HISTOGRAM
>> +   is the target distribution for the callsite.  */
>> +
>> +static bool
>> +gimple_ic_transform_mult_targ (gimple stmt, histogram_value histogram)
>> +{
>> +  gcov_type val1, val2, count1, count2, all, bb_all;
>> +  gcov_type prob1, prob2;
>> +  gimple modify1, modify2;
>> +  struct cgraph_node *direct_call1 = 0, *direct_call2 = 0;
>> +  int perc_threshold, count_threshold, always_inline;
>> +  location_t locus;
>> +
>> +  val1 = histogram->hvalue.counters [1];
>> +  count1 = histogram->hvalue.counters [2];
>> +  val2 = histogram->hvalue.counters [3];
>> +  count2 = histogram->hvalue.counters [4];
>> +  bb_all = gimple_bb (stmt)->count;
>> +  all = bb_all;
>> +
>> +  gimple_remove_histogram_value (cfun, stmt, histogram);
>> +
>> +  if (count1 == 0)
>> +    return false;
>> +
>> +  perc_threshold = PARAM_VALUE (PARAM_ICALL_PROMOTE_PERCENT_THRESHOLD);
>> +  count_threshold = PARAM_VALUE (PARAM_ICALL_PROMOTE_COUNT_THRESHOLD);
>> +  always_inline = PARAM_VALUE (PARAM_ALWAYS_INLINE_ICALL_TARGET);
>> +
>> +  if (100 * count1 < all * perc_threshold || count1 < count_threshold)
>> +    return false;
>> +
>> +  if (check_ic_counter (stmt, &count1, &count2, all))
>> +    return false;
>> +
>> +  if (all > 0)
>> +    {
>> +      prob1 = (count1 * REG_BR_PROB_BASE + all / 2) / all;
>> +      if (all - count1 > 0)
>> +        prob2 = (count2 * REG_BR_PROB_BASE
>> +                 + (all - count1) / 2) / (all - count1);
>> +      else
>> +        prob2 = 0;
>> +    }
>> +  else
>> +    prob1 = prob2 = 0;
>> +
>> +  direct_call1 = find_func_by_global_id (val1);
>> +
>> +  if (val2 && (100 * count2 >= all * perc_threshold)
>> +      && count2 > count_threshold)
>> +    direct_call2 = find_func_by_global_id (val2);
>> +
>> +  locus = (stmt != NULL) ? gimple_location (stmt)
>> +      : DECL_SOURCE_LOCATION (current_function_decl);
>> +  if (direct_call1 == NULL
>> +      || !check_ic_target (stmt, direct_call1))
>> +    {
>> +      if (flag_ripa_verbose)
>> +        {
>> +          if (!direct_call1)
>> +            inform (locus, "Can not find indirect call target decl "
>> +                    "(%d:%d)[cnt:%u] in current module",
>> +                    EXTRACT_MODULE_ID_FROM_GLOBAL_ID (val1),
>> +                    EXTRACT_FUNC_ID_FROM_GLOBAL_ID (val1), (unsigned) count1);
>> +          else
>> +            inform (locus,
>> +                    "Can not find promote indirect call target decl -- type mismatch "
>> +                    "(%d:%d)[cnt:%u] in current module",
>> +                    EXTRACT_MODULE_ID_FROM_GLOBAL_ID (val1),
>> +                    EXTRACT_FUNC_ID_FROM_GLOBAL_ID (val1), (unsigned) count1);
>> +        }
>> +      return false;
>> +    }
>> +
>> +  /* Don't indirect-call promote if the target is in auxiliary module and
>> +     DECL_ARTIFICIAL and not TREE_PUBLIC, because we don't static-promote
>> +     DECL_ARTIFICIALs yet.  */
>> +  if (cgraph_is_auxiliary (direct_call1->decl)
>> +      && DECL_ARTIFICIAL (direct_call1->decl)
>> +      && ! TREE_PUBLIC (direct_call1->decl))
>> +    return false;
>> +
>> +  modify1 = gimple_ic (stmt, direct_call1, prob1, count

^ permalink raw reply	[flat|nested] only message in thread

only message in thread, other threads:[~2011-04-25 23:39 UTC | newest]

Thread overview: (only message) (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
     [not found] <20110425232927.75B241B40B6@aples.mtv.corp.google.com>
     [not found] ` <BANLkTimFO5r7qW891YT2jXfWUdTd+whX0w@mail.gmail.com>
2011-04-26  2:15   ` [google] Port LIPO support to google/main (issue4431065) Rong Xu

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).