public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed
* Type inheritance graph analysis & speculative devirtualization, part 7/7 (speculative devirtualizatoin)
@ 2013-09-01 13:57 Jan Hubicka
  2013-09-01 16:04 ` Xinliang David Li
                   ` (2 more replies)
  0 siblings, 3 replies; 5+ messages in thread
From: Jan Hubicka @ 2013-09-01 13:57 UTC (permalink / raw)
  To: gcc-patches, jason, Nathan Froyd, tglek, glandium, marxin.liska

Hi,
this patch implement speculative devirtualization.  It is a trivial pass that
asks for targets of every polymorphic call in a program and if the list
contains one likely target, it produces an speculative call. No context
sensitive analysis is done at the moment.  This call may or may not survive
into final program depending if we do somehting useful about the direct call.

The pass is currently disabled for LTO because
http://gcc.gnu.org/ml/gcc-patches/2013-08/msg01007.html is needed to properly
build type inheritance graph.

With LTO it is supposed to be effective on a premise that most types are not
escaping LTO unit and thus we see all possible targets.  Without LTO it makes
stronger assumption that you usually call only targets defined in current unit,
if any.

Path is suprisingly effective on Firefox:
105306 polymorphic calls, 0 devirtualized, 34258 speculatively devirtualized, 4056 cold
66875 have multiple targets, 0 overwritable, 0 already speculated (0 agree, 0 disagree), 0 not defined

So about 32% of calls are devirutalized.  By random checking, these can be
tracked to real occurences of code where virtual is used in a silly way.
I plan to introduce warning for that (I have code for that already since it
makes it easier to analyze what changes are really made and why).

Martin Liska rebuilt with FDO based indirect call resolution.  Here we get:
23928 indirect calls trained.
12837 (53.65%) have common target.
342 (1.43%) targets was not found.
8378 (35.01%) speculations seems useless.
4117 (17.21%) speculations produced.

I compared the overlap that is devirtualized by both techniques.  There is
almost 100% match, except that FDO code is not dealing well with thunks and
those 342 calls that seem to go out of libxul into plugins.  I will fix the
thunk issue later.

I also tested QT, where the numbers are smaller - only about 20% of devirtualized
calls, but largery things seems similar.

For non-LTO build, the devirtualization also seems sane, there seems to be
about 8% of miss rate on GCC bootstrap that seems acceptable. I tracked most of
those down into randomly included headers that do define derived types of a
given class that are unused in the current unit.  I think we can track this by
computing reachable functions in current unit and looking for vtables actually
used by construction.  One of such actually triggers undefined reference in
build of libstdc++ and therefore I added the check disabling devirtualization
to DECL_EXTERNAL for now.  It is because the libstdc++ header seems to have
explicit instantiation of a template that is never linked with.

I currently enabled the pass by default at -O2.  Based on the experience about
missrate, we may want to disable it for -O2 non-LTO if it shows to be too risky
on some codebases.

Bootstrapped/regtested x86_64-linux and ppc64-linux, also tested with lto bootstrap
with the LTO ODR code and tested on Firefox and QT builds. Will commit the patch
later today.

Comments are welcome.
Honza

	* common.opt (fdevirtualize-speculatively): New function.
	* invoke.texi (fdevirtualize-speculatively): Document.
	* ipa-devirt.c: Include ipa-inline.h
	(likely_target_p): New function.
	(ipa_devirt): New function.
	(gate_ipa_devirt): New function.
	(pass_data_ipa_devirt): New static var.
	(pass_ipa_devirt): Likewise.
	(make_pass_ipa_devirt): New function.
	* opts.c (default_options): Add OPT_fdevirtualize_speculatively.
	(common_handle_option): Disable devirtualization when
	value range profiling is available.
	* passes.def (pass_ipa_devirt): Add.
	* timever.def (TV_IPA_DEVIRT): New timevar.
	* tree-pass.h (make_pass_ipa_devirt): 
	
Index: common.opt
===================================================================
--- common.opt	(revision 202136)
+++ common.opt	(working copy)
@@ -1007,6 +1007,10 @@ fdevirtualize
 Common Report Var(flag_devirtualize) Optimization
 Try to convert virtual calls to direct ones.
 
+fdevirtualize-speculatively
+Common Report Var(flag_devirtualize_speculatively) Optimization
+Perform speculative devirtualization
+
 fdiagnostics-show-location=
 Common Joined RejectNegative Enum(diagnostic_prefixing_rule)
 -fdiagnostics-show-location=[once|every-line]	How often to emit source location at the beginning of line-wrapped diagnostics
@@ -1366,7 +1370,7 @@ Common RejectNegative Joined
 
 fipa-cp
 Common Report Var(flag_ipa_cp) Optimization
-Perform Interprocedural constant propagation
+Perform interprocedural constant propagation
 
 fipa-cp-clone
 Common Report Var(flag_ipa_cp_clone) Optimization
Index: doc/invoke.texi
===================================================================
--- doc/invoke.texi	(revision 202136)
+++ doc/invoke.texi	(working copy)
@@ -365,7 +365,7 @@ Objective-C and Objective-C++ Dialects}.
 -fcse-follow-jumps -fcse-skip-blocks -fcx-fortran-rules @gol
 -fcx-limited-range @gol
 -fdata-sections -fdce -fdelayed-branch @gol
--fdelete-null-pointer-checks -fdevirtualize -fdse @gol
+-fdelete-null-pointer-checks -fdevirtualize -fdevirtualize-speculatively -fdse @gol
 -fearly-inlining -fipa-sra -fexpensive-optimizations -ffat-lto-objects @gol
 -ffast-math -ffinite-math-only -ffloat-store -fexcess-precision=@var{style} @gol
 -fforward-propagate -ffp-contract=@var{style} -ffunction-sections @gol
@@ -6712,7 +6712,7 @@ also turns on the following optimization
 -fcrossjumping @gol
 -fcse-follow-jumps  -fcse-skip-blocks @gol
 -fdelete-null-pointer-checks @gol
--fdevirtualize @gol
+-fdevirtualize -fdevirtualize-speculatively @gol
 -fexpensive-optimizations @gol
 -fgcse  -fgcse-lm  @gol
 -fhoist-adjacent-loads @gol
@@ -7257,6 +7257,15 @@ indirect inlining (@code{-findirect-inli
 propagation (@option{-fipa-cp}).
 Enabled at levels @option{-O2}, @option{-O3}, @option{-Os}.
 
+@item -fdevirtualize-speculatively
+@opindex fdevirtualize-speculatively
+Attempt to convert calls to virtual functions to speculative direct calls.
+Based on the analysis of the type inheritance graph, determine for a given call
+the set of likely targets. If the set is small, preferably of size 1, change
+the call into an conditional deciding on direct and indirect call.  The
+speculative calls enable more optimizations, such as inlining.  When they seem
+useless after further optimization, they are converted back into original form.
+
 @item -fexpensive-optimizations
 @opindex fexpensive-optimizations
 Perform a number of minor optimizations that are relatively expensive.
Index: ipa-devirt.c
===================================================================
--- ipa-devirt.c	(revision 202136)
+++ ipa-devirt.c	(working copy)
@@ -101,6 +101,8 @@ along with GCC; see the file COPYING3.
 
   possible_polymorphic_call_targets returns, given an parameters found in
   indirect polymorphic edge all possible polymorphic call targets of the call.
+
+  pass_ipa_devirt performs simple speculative devirtualization.
 */
 
 #include "config.h"
@@ -116,6 +118,7 @@ along with GCC; see the file COPYING3.
 #include "tree-pretty-print.h"
 #include "ipa-utils.h"
 #include "gimple.h"
+#include "ipa-inline.h"
 
 /* Pointer set of all call targets appearing in the cache.  */
 static pointer_set_t *cached_polymorphic_call_targets;
@@ -728,4 +731,267 @@ update_type_inheritance_graph (void)
       get_odr_type (method_class_type (TREE_TYPE (n->symbol.decl)), true);
   timevar_pop (TV_IPA_INHERITANCE);
 }
+
+
+/* Return true if N looks like likely target of a polymorphic call.
+   Rule out cxa_pure_virtual, noreturns, function declared cold and
+   other obvious cases.  */
+
+bool
+likely_target_p (struct cgraph_node *n)
+{
+  int flags;
+  /* cxa_pure_virtual and similar things are not likely.  */
+  if (TREE_CODE (TREE_TYPE (n->symbol.decl)) != METHOD_TYPE)
+    return false;
+  flags = flags_from_decl_or_type (n->symbol.decl);
+  if (flags & ECF_NORETURN)
+    return false;
+  if (lookup_attribute ("cold",
+			DECL_ATTRIBUTES (n->symbol.decl)))
+    return false;
+  if (n->frequency < NODE_FREQUENCY_NORMAL)
+    return false;
+  return true;
+}
+
+/* The ipa-devirt pass.
+   This performs very trivial devirtualization:
+     1) when polymorphic call is known to have precisely one target,
+        turn it into direct call
+     2) when polymorphic call has only one likely target in the unit,
+        turn it into speculative call.  */
+
+static unsigned int
+ipa_devirt (void)
+{
+  struct cgraph_node *n;
+  struct pointer_set_t *bad_call_targets = pointer_set_create ();
+  struct cgraph_edge *e;
+
+  int npolymorphic = 0, nspeculated = 0, nconverted = 0, ncold = 0;
+  int nmultiple = 0, noverwritable = 0, ndevirtualized = 0, nnotdefined = 0;
+  int nwrong = 0, nok = 0, nexternal = 0;;
+
+  FOR_EACH_DEFINED_FUNCTION (n)
+    {	
+      bool update = false;
+      if (dump_file && n->indirect_calls)
+	fprintf (dump_file, "\n\nProcesing function %s/%i\n",
+		 cgraph_node_name (n), n->symbol.order);
+      for (e = n->indirect_calls; e; e = e->next_callee)
+	if (e->indirect_info->polymorphic)
+	  {
+	    struct cgraph_node *likely_target = NULL;
+	    void *cache_token;
+	    bool final;
+	    vec <cgraph_node *>targets
+	       = possible_polymorphic_call_targets
+		    (e, &final, &cache_token);
+	    unsigned int i;
+
+	    if (dump_file)
+	      dump_possible_polymorphic_call_targets 
+		(dump_file, e);
+	    npolymorphic++;
+
+	    if (final)
+	      {
+		gcc_assert (targets.length());
+		if (targets.length() == 1)
+		  {
+		    if (dump_file)
+		      fprintf (dump_file,
+			       "Devirtualizing call in %s/%i to %s/%i\n",
+			       cgraph_node_name (n), n->symbol.order,
+			       cgraph_node_name (targets[0]), targets[0]->symbol.order);
+		    cgraph_make_edge_direct (e, targets[0]);
+		    ndevirtualized++;
+		    update = true;
+		    continue;
+		  }
+	      }
+	    if (!flag_devirtualize_speculatively)
+	      continue;
+	    if (!cgraph_maybe_hot_edge_p (e))
+	      {
+		if (dump_file)
+		  fprintf (dump_file, "Call is cold\n");
+		ncold++;
+		continue;
+	      }
+	    if (e->speculative)
+	      {
+		if (dump_file)
+		  fprintf (dump_file, "Call is aready speculated\n");
+		nspeculated++;
+
+		/* When dumping see if we agree with speculation.  */
+		if (!dump_file)
+		  continue;
+	      }
+	    if (pointer_set_contains (bad_call_targets,
+				      cache_token))
+	      {
+		if (dump_file)
+		  fprintf (dump_file, "Target list is known to be useless\n");
+		nmultiple++;
+		continue;
+	      }
+	    for (i = 0; i < targets.length(); i++)
+	      if (likely_target_p (targets[i]))
+		{
+		  if (likely_target)
+		    {
+		      likely_target = NULL;
+		      if (dump_file)
+			fprintf (dump_file, "More than one likely target\n");
+		      nmultiple++;
+		      break;
+		    }
+		  likely_target = targets[i];
+		}
+	    if (!likely_target)
+	      {
+		pointer_set_insert (bad_call_targets, cache_token);
+	        continue;
+	      }
+	    /* This is reached only when dumping; check if we agree or disagree
+ 	       with the speculation.  */
+	    if (e->speculative)
+	      {
+		struct cgraph_edge *e2;
+		struct ipa_ref *ref;
+		cgraph_speculative_call_info (e, e2, e, ref);
+		if (cgraph_function_or_thunk_node (e2->callee, NULL)
+		    == cgraph_function_or_thunk_node (likely_target, NULL))
+		  {
+		    fprintf (dump_file, "We agree with speculation\n");
+		    nok++;
+		  }
+		else
+		  {
+		    fprintf (dump_file, "We disagree with speculation\n");
+		    nwrong++;
+		  }
+		continue;
+	      }
+	    if (!likely_target->symbol.definition)
+	      {
+		if (dump_file)
+		  fprintf (dump_file, "Target is not an definition\n");
+		nnotdefined++;
+		continue;
+	      }
+	    /* Do not introduce new references to external symbols.  While we
+	       can handle these just well, it is common for programs to
+	       incorrectly with headers defining methods they are linked
+	       with.  */
+	    if (DECL_EXTERNAL (likely_target->symbol.decl))
+	      {
+		if (dump_file)
+		  fprintf (dump_file, "Target is external\n");
+		nexternal++;
+		continue;
+	      }
+	    if (cgraph_function_body_availability (likely_target)
+		<= AVAIL_OVERWRITABLE
+		&& symtab_can_be_discarded ((symtab_node) likely_target))
+	      {
+		if (dump_file)
+		  fprintf (dump_file, "Target is overwritable\n");
+		noverwritable++;
+		continue;
+	      }
+	    else
+	      {
+		if (dump_file)
+		  fprintf (dump_file,
+			   "Speculatively devirtualizing call in %s/%i to %s/%i\n",
+			   cgraph_node_name (n), n->symbol.order,
+			   cgraph_node_name (likely_target),
+			   likely_target->symbol.order);
+		if (!symtab_can_be_discarded ((symtab_node) likely_target))
+		  likely_target = cgraph (symtab_nonoverwritable_alias ((symtab_node)likely_target));
+		nconverted++;
+		update = true;
+		cgraph_turn_edge_to_speculative
+		  (e, likely_target, e->count * 8 / 10, e->frequency * 8 / 10);
+	      }
+	  }
+      if (update)
+	inline_update_overall_summary (n);
+    }
+  pointer_set_destroy (bad_call_targets);
+
+  if (dump_file)
+    fprintf (dump_file,
+	     "%i polymorphic calls, %i devirtualized,"
+	     " %i speculatively devirtualized, %i cold\n"
+	     "%i have multiple targets, %i overwritable,"
+	     " %i already speculated (%i agree, %i disagree),"
+	     " %i external, %i not defined\n",
+	     npolymorphic, ndevirtualized, nconverted, ncold,
+	     nmultiple, noverwritable, nspeculated, nok, nwrong,
+	     nexternal, nnotdefined);
+  return ndevirtualized ? TODO_remove_functions : 0;
+}
+
+/* Gate for IPCP optimization.  */
+
+static bool
+gate_ipa_devirt (void)
+{
+  /* FIXME: We should remove the optimize check after we ensure we never run
+     IPA passes when not optimizing.  */
+  return (flag_devirtualize || flag_devirtualize_speculatively) && !in_lto_p;
+}
+
+namespace {
+
+const pass_data pass_data_ipa_devirt =
+{
+  IPA_PASS, /* type */
+  "devirt", /* name */
+  OPTGROUP_NONE, /* optinfo_flags */
+  true, /* has_gate */
+  true, /* has_execute */
+  TV_IPA_DEVIRT, /* tv_id */
+  0, /* properties_required */
+  0, /* properties_provided */
+  0, /* properties_destroyed */
+  0, /* todo_flags_start */
+  ( TODO_dump_symtab ), /* todo_flags_finish */
+};
+
+class pass_ipa_devirt : public ipa_opt_pass_d
+{
+public:
+  pass_ipa_devirt(gcc::context *ctxt)
+    : ipa_opt_pass_d(pass_data_ipa_devirt, ctxt,
+		     NULL, /* generate_summary */
+		     NULL, /* write_summary */
+		     NULL, /* read_summary */
+		     NULL, /* write_optimization_summary */
+		     NULL, /* read_optimization_summary */
+		     NULL, /* stmt_fixup */
+		     0, /* function_transform_todo_flags_start */
+		     NULL, /* function_transform */
+		     NULL) /* variable_transform */
+  {}
+
+  /* opt_pass methods: */
+  bool gate () { return gate_ipa_devirt (); }
+  unsigned int execute () { return ipa_devirt (); }
+
+}; // class pass_ipa_devirt
+
+} // anon namespace
+
+ipa_opt_pass_d *
+make_pass_ipa_devirt (gcc::context *ctxt)
+{
+  return new pass_ipa_devirt (ctxt);
+}
+
 #include "gt-ipa-devirt.h"
Index: opts.c
===================================================================
--- opts.c	(revision 202136)
+++ opts.c	(working copy)
@@ -479,6 +479,7 @@ static const struct default_options defa
     { OPT_LEVELS_2_PLUS, OPT_ftree_switch_conversion, NULL, 1 },
     { OPT_LEVELS_2_PLUS, OPT_fipa_cp, NULL, 1 },
     { OPT_LEVELS_2_PLUS, OPT_fdevirtualize, NULL, 1 },
+    { OPT_LEVELS_2_PLUS, OPT_fdevirtualize_speculatively, NULL, 1 },
     { OPT_LEVELS_2_PLUS, OPT_fipa_sra, NULL, 1 },
     { OPT_LEVELS_2_PLUS, OPT_falign_loops, NULL, 1 },
     { OPT_LEVELS_2_PLUS, OPT_falign_jumps, NULL, 1 },
@@ -1665,6 +1666,11 @@ common_handle_option (struct gcc_options
 	opts->x_flag_vect_cost_model = value;
       if (!opts_set->x_flag_tree_loop_distribute_patterns)
 	opts->x_flag_tree_loop_distribute_patterns = value;
+      /* Indirect call profiling should do all useful transformations
+ 	 speculative devirutalization does.  */
+      if (!opts_set->x_flag_devirtualize_speculatively
+	  && opts->x_flag_value_profile_transformations)
+	opts->x_flag_devirtualize_speculatively = false;
       break;
 
     case OPT_fprofile_generate_:
Index: passes.def
===================================================================
--- passes.def	(revision 202136)
+++ passes.def	(working copy)
@@ -104,6 +104,7 @@ along with GCC; see the file COPYING3.
   INSERT_PASSES_AFTER (all_regular_ipa_passes)
   NEXT_PASS (pass_ipa_whole_program_visibility);
   NEXT_PASS (pass_ipa_profile);
+  NEXT_PASS (pass_ipa_devirt);
   NEXT_PASS (pass_ipa_cp);
   NEXT_PASS (pass_ipa_cdtor_merge);
   NEXT_PASS (pass_ipa_inline);
Index: timevar.def
===================================================================
--- timevar.def	(revision 202136)
+++ timevar.def	(working copy)
@@ -66,6 +66,7 @@ DEFTIMEVAR (TV_CGRAPH                , "
 DEFTIMEVAR (TV_CGRAPHOPT             , "callgraph optimization")
 DEFTIMEVAR (TV_IPA_INHERITANCE       , "ipa inheritance graph")
 DEFTIMEVAR (TV_IPA_VIRTUAL_CALL      , "ipa virtual call target")
+DEFTIMEVAR (TV_IPA_DEVIRT	     , "ipa devirtualization")
 DEFTIMEVAR (TV_IPA_CONSTANT_PROP     , "ipa cp")
 DEFTIMEVAR (TV_IPA_INLINING          , "ipa inlining heuristics")
 DEFTIMEVAR (TV_IPA_FNSPLIT           , "ipa function splitting")
Index: tree-pass.h
===================================================================
--- tree-pass.h	(revision 202136)
+++ tree-pass.h	(working copy)
@@ -468,6 +468,7 @@ extern simple_ipa_opt_pass *make_pass_ip
 extern simple_ipa_opt_pass *make_pass_ipa_free_inline_summary (gcc::context
 							       *ctxt);
 extern ipa_opt_pass_d *make_pass_ipa_cp (gcc::context *ctxt);
+extern ipa_opt_pass_d *make_pass_ipa_devirt (gcc::context *ctxt);
 extern ipa_opt_pass_d *make_pass_ipa_reference (gcc::context *ctxt);
 extern ipa_opt_pass_d *make_pass_ipa_pure_const (gcc::context *ctxt);
 extern simple_ipa_opt_pass *make_pass_ipa_pta (gcc::context *ctxt);

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: Type inheritance graph analysis & speculative devirtualization, part 7/7 (speculative devirtualizatoin)
  2013-09-01 13:57 Type inheritance graph analysis & speculative devirtualization, part 7/7 (speculative devirtualizatoin) Jan Hubicka
@ 2013-09-01 16:04 ` Xinliang David Li
  2013-09-01 16:39   ` Jan Hubicka
       [not found] ` <52260602.5090500@mozilla.com>
  2014-11-14 19:01 ` H.J. Lu
  2 siblings, 1 reply; 5+ messages in thread
From: Xinliang David Li @ 2013-09-01 16:04 UTC (permalink / raw)
  To: Jan Hubicka
  Cc: GCC Patches, Jason Merrill, Nathan Froyd, Taras Glek,
	Mike Hommey, Martin Liška

Missing test cases?

Have you tested the optimization with SPEC2k and SPEC06? There are a
couple of benchmarks benefit greatly from devirtualization, such as
eon, povray etc.   I believe astar will probably improve with this
optimization at O2 (it has hot virtual functions that are not
overridden at all). For eon, the assumption at O2 for speculative
devirt may not work well.

thanks,

David

On Sun, Sep 1, 2013 at 6:57 AM, Jan Hubicka <hubicka@ucw.cz> wrote:
> Hi,
> this patch implement speculative devirtualization.  It is a trivial pass that
> asks for targets of every polymorphic call in a program and if the list
> contains one likely target, it produces an speculative call. No context
> sensitive analysis is done at the moment.  This call may or may not survive
> into final program depending if we do somehting useful about the direct call.
>
> The pass is currently disabled for LTO because
> http://gcc.gnu.org/ml/gcc-patches/2013-08/msg01007.html is needed to properly
> build type inheritance graph.
>
> With LTO it is supposed to be effective on a premise that most types are not
> escaping LTO unit and thus we see all possible targets.  Without LTO it makes
> stronger assumption that you usually call only targets defined in current unit,
> if any.
>
> Path is suprisingly effective on Firefox:
> 105306 polymorphic calls, 0 devirtualized, 34258 speculatively devirtualized, 4056 cold
> 66875 have multiple targets, 0 overwritable, 0 already speculated (0 agree, 0 disagree), 0 not defined
>
> So about 32% of calls are devirutalized.  By random checking, these can be
> tracked to real occurences of code where virtual is used in a silly way.
> I plan to introduce warning for that (I have code for that already since it
> makes it easier to analyze what changes are really made and why).
>
> Martin Liska rebuilt with FDO based indirect call resolution.  Here we get:
> 23928 indirect calls trained.
> 12837 (53.65%) have common target.
> 342 (1.43%) targets was not found.
> 8378 (35.01%) speculations seems useless.
> 4117 (17.21%) speculations produced.
>
> I compared the overlap that is devirtualized by both techniques.  There is
> almost 100% match, except that FDO code is not dealing well with thunks and
> those 342 calls that seem to go out of libxul into plugins.  I will fix the
> thunk issue later.
>
> I also tested QT, where the numbers are smaller - only about 20% of devirtualized
> calls, but largery things seems similar.
>
> For non-LTO build, the devirtualization also seems sane, there seems to be
> about 8% of miss rate on GCC bootstrap that seems acceptable. I tracked most of
> those down into randomly included headers that do define derived types of a
> given class that are unused in the current unit.  I think we can track this by
> computing reachable functions in current unit and looking for vtables actually
> used by construction.  One of such actually triggers undefined reference in
> build of libstdc++ and therefore I added the check disabling devirtualization
> to DECL_EXTERNAL for now.  It is because the libstdc++ header seems to have
> explicit instantiation of a template that is never linked with.
>
> I currently enabled the pass by default at -O2.  Based on the experience about
> missrate, we may want to disable it for -O2 non-LTO if it shows to be too risky
> on some codebases.
>
> Bootstrapped/regtested x86_64-linux and ppc64-linux, also tested with lto bootstrap
> with the LTO ODR code and tested on Firefox and QT builds. Will commit the patch
> later today.
>
> Comments are welcome.
> Honza
>
>         * common.opt (fdevirtualize-speculatively): New function.
>         * invoke.texi (fdevirtualize-speculatively): Document.
>         * ipa-devirt.c: Include ipa-inline.h
>         (likely_target_p): New function.
>         (ipa_devirt): New function.
>         (gate_ipa_devirt): New function.
>         (pass_data_ipa_devirt): New static var.
>         (pass_ipa_devirt): Likewise.
>         (make_pass_ipa_devirt): New function.
>         * opts.c (default_options): Add OPT_fdevirtualize_speculatively.
>         (common_handle_option): Disable devirtualization when
>         value range profiling is available.
>         * passes.def (pass_ipa_devirt): Add.
>         * timever.def (TV_IPA_DEVIRT): New timevar.
>         * tree-pass.h (make_pass_ipa_devirt):
>
> Index: common.opt
> ===================================================================
> --- common.opt  (revision 202136)
> +++ common.opt  (working copy)
> @@ -1007,6 +1007,10 @@ fdevirtualize
>  Common Report Var(flag_devirtualize) Optimization
>  Try to convert virtual calls to direct ones.
>
> +fdevirtualize-speculatively
> +Common Report Var(flag_devirtualize_speculatively) Optimization
> +Perform speculative devirtualization
> +
>  fdiagnostics-show-location=
>  Common Joined RejectNegative Enum(diagnostic_prefixing_rule)
>  -fdiagnostics-show-location=[once|every-line]  How often to emit source location at the beginning of line-wrapped diagnostics
> @@ -1366,7 +1370,7 @@ Common RejectNegative Joined
>
>  fipa-cp
>  Common Report Var(flag_ipa_cp) Optimization
> -Perform Interprocedural constant propagation
> +Perform interprocedural constant propagation
>
>  fipa-cp-clone
>  Common Report Var(flag_ipa_cp_clone) Optimization
> Index: doc/invoke.texi
> ===================================================================
> --- doc/invoke.texi     (revision 202136)
> +++ doc/invoke.texi     (working copy)
> @@ -365,7 +365,7 @@ Objective-C and Objective-C++ Dialects}.
>  -fcse-follow-jumps -fcse-skip-blocks -fcx-fortran-rules @gol
>  -fcx-limited-range @gol
>  -fdata-sections -fdce -fdelayed-branch @gol
> --fdelete-null-pointer-checks -fdevirtualize -fdse @gol
> +-fdelete-null-pointer-checks -fdevirtualize -fdevirtualize-speculatively -fdse @gol
>  -fearly-inlining -fipa-sra -fexpensive-optimizations -ffat-lto-objects @gol
>  -ffast-math -ffinite-math-only -ffloat-store -fexcess-precision=@var{style} @gol
>  -fforward-propagate -ffp-contract=@var{style} -ffunction-sections @gol
> @@ -6712,7 +6712,7 @@ also turns on the following optimization
>  -fcrossjumping @gol
>  -fcse-follow-jumps  -fcse-skip-blocks @gol
>  -fdelete-null-pointer-checks @gol
> --fdevirtualize @gol
> +-fdevirtualize -fdevirtualize-speculatively @gol
>  -fexpensive-optimizations @gol
>  -fgcse  -fgcse-lm  @gol
>  -fhoist-adjacent-loads @gol
> @@ -7257,6 +7257,15 @@ indirect inlining (@code{-findirect-inli
>  propagation (@option{-fipa-cp}).
>  Enabled at levels @option{-O2}, @option{-O3}, @option{-Os}.
>
> +@item -fdevirtualize-speculatively
> +@opindex fdevirtualize-speculatively
> +Attempt to convert calls to virtual functions to speculative direct calls.
> +Based on the analysis of the type inheritance graph, determine for a given call
> +the set of likely targets. If the set is small, preferably of size 1, change
> +the call into an conditional deciding on direct and indirect call.  The
> +speculative calls enable more optimizations, such as inlining.  When they seem
> +useless after further optimization, they are converted back into original form.
> +
>  @item -fexpensive-optimizations
>  @opindex fexpensive-optimizations
>  Perform a number of minor optimizations that are relatively expensive.
> Index: ipa-devirt.c
> ===================================================================
> --- ipa-devirt.c        (revision 202136)
> +++ ipa-devirt.c        (working copy)
> @@ -101,6 +101,8 @@ along with GCC; see the file COPYING3.
>
>    possible_polymorphic_call_targets returns, given an parameters found in
>    indirect polymorphic edge all possible polymorphic call targets of the call.
> +
> +  pass_ipa_devirt performs simple speculative devirtualization.
>  */
>
>  #include "config.h"
> @@ -116,6 +118,7 @@ along with GCC; see the file COPYING3.
>  #include "tree-pretty-print.h"
>  #include "ipa-utils.h"
>  #include "gimple.h"
> +#include "ipa-inline.h"
>
>  /* Pointer set of all call targets appearing in the cache.  */
>  static pointer_set_t *cached_polymorphic_call_targets;
> @@ -728,4 +731,267 @@ update_type_inheritance_graph (void)
>        get_odr_type (method_class_type (TREE_TYPE (n->symbol.decl)), true);
>    timevar_pop (TV_IPA_INHERITANCE);
>  }
> +
> +
> +/* Return true if N looks like likely target of a polymorphic call.
> +   Rule out cxa_pure_virtual, noreturns, function declared cold and
> +   other obvious cases.  */
> +
> +bool
> +likely_target_p (struct cgraph_node *n)
> +{
> +  int flags;
> +  /* cxa_pure_virtual and similar things are not likely.  */
> +  if (TREE_CODE (TREE_TYPE (n->symbol.decl)) != METHOD_TYPE)
> +    return false;
> +  flags = flags_from_decl_or_type (n->symbol.decl);
> +  if (flags & ECF_NORETURN)
> +    return false;
> +  if (lookup_attribute ("cold",
> +                       DECL_ATTRIBUTES (n->symbol.decl)))
> +    return false;
> +  if (n->frequency < NODE_FREQUENCY_NORMAL)
> +    return false;
> +  return true;
> +}
> +
> +/* The ipa-devirt pass.
> +   This performs very trivial devirtualization:
> +     1) when polymorphic call is known to have precisely one target,
> +        turn it into direct call
> +     2) when polymorphic call has only one likely target in the unit,
> +        turn it into speculative call.  */
> +
> +static unsigned int
> +ipa_devirt (void)
> +{
> +  struct cgraph_node *n;
> +  struct pointer_set_t *bad_call_targets = pointer_set_create ();
> +  struct cgraph_edge *e;
> +
> +  int npolymorphic = 0, nspeculated = 0, nconverted = 0, ncold = 0;
> +  int nmultiple = 0, noverwritable = 0, ndevirtualized = 0, nnotdefined = 0;
> +  int nwrong = 0, nok = 0, nexternal = 0;;
> +
> +  FOR_EACH_DEFINED_FUNCTION (n)
> +    {
> +      bool update = false;
> +      if (dump_file && n->indirect_calls)
> +       fprintf (dump_file, "\n\nProcesing function %s/%i\n",
> +                cgraph_node_name (n), n->symbol.order);
> +      for (e = n->indirect_calls; e; e = e->next_callee)
> +       if (e->indirect_info->polymorphic)
> +         {
> +           struct cgraph_node *likely_target = NULL;
> +           void *cache_token;
> +           bool final;
> +           vec <cgraph_node *>targets
> +              = possible_polymorphic_call_targets
> +                   (e, &final, &cache_token);
> +           unsigned int i;
> +
> +           if (dump_file)
> +             dump_possible_polymorphic_call_targets
> +               (dump_file, e);
> +           npolymorphic++;
> +
> +           if (final)
> +             {
> +               gcc_assert (targets.length());
> +               if (targets.length() == 1)
> +                 {
> +                   if (dump_file)
> +                     fprintf (dump_file,
> +                              "Devirtualizing call in %s/%i to %s/%i\n",
> +                              cgraph_node_name (n), n->symbol.order,
> +                              cgraph_node_name (targets[0]), targets[0]->symbol.order);
> +                   cgraph_make_edge_direct (e, targets[0]);
> +                   ndevirtualized++;
> +                   update = true;
> +                   continue;
> +                 }
> +             }
> +           if (!flag_devirtualize_speculatively)
> +             continue;
> +           if (!cgraph_maybe_hot_edge_p (e))
> +             {
> +               if (dump_file)
> +                 fprintf (dump_file, "Call is cold\n");
> +               ncold++;
> +               continue;
> +             }
> +           if (e->speculative)
> +             {
> +               if (dump_file)
> +                 fprintf (dump_file, "Call is aready speculated\n");
> +               nspeculated++;
> +
> +               /* When dumping see if we agree with speculation.  */
> +               if (!dump_file)
> +                 continue;
> +             }
> +           if (pointer_set_contains (bad_call_targets,
> +                                     cache_token))
> +             {
> +               if (dump_file)
> +                 fprintf (dump_file, "Target list is known to be useless\n");
> +               nmultiple++;
> +               continue;
> +             }
> +           for (i = 0; i < targets.length(); i++)
> +             if (likely_target_p (targets[i]))
> +               {
> +                 if (likely_target)
> +                   {
> +                     likely_target = NULL;
> +                     if (dump_file)
> +                       fprintf (dump_file, "More than one likely target\n");
> +                     nmultiple++;
> +                     break;
> +                   }
> +                 likely_target = targets[i];
> +               }
> +           if (!likely_target)
> +             {
> +               pointer_set_insert (bad_call_targets, cache_token);
> +               continue;
> +             }
> +           /* This is reached only when dumping; check if we agree or disagree
> +              with the speculation.  */
> +           if (e->speculative)
> +             {
> +               struct cgraph_edge *e2;
> +               struct ipa_ref *ref;
> +               cgraph_speculative_call_info (e, e2, e, ref);
> +               if (cgraph_function_or_thunk_node (e2->callee, NULL)
> +                   == cgraph_function_or_thunk_node (likely_target, NULL))
> +                 {
> +                   fprintf (dump_file, "We agree with speculation\n");
> +                   nok++;
> +                 }
> +               else
> +                 {
> +                   fprintf (dump_file, "We disagree with speculation\n");
> +                   nwrong++;
> +                 }
> +               continue;
> +             }
> +           if (!likely_target->symbol.definition)
> +             {
> +               if (dump_file)
> +                 fprintf (dump_file, "Target is not an definition\n");
> +               nnotdefined++;
> +               continue;
> +             }
> +           /* Do not introduce new references to external symbols.  While we
> +              can handle these just well, it is common for programs to
> +              incorrectly with headers defining methods they are linked
> +              with.  */
> +           if (DECL_EXTERNAL (likely_target->symbol.decl))
> +             {
> +               if (dump_file)
> +                 fprintf (dump_file, "Target is external\n");
> +               nexternal++;
> +               continue;
> +             }
> +           if (cgraph_function_body_availability (likely_target)
> +               <= AVAIL_OVERWRITABLE
> +               && symtab_can_be_discarded ((symtab_node) likely_target))
> +             {
> +               if (dump_file)
> +                 fprintf (dump_file, "Target is overwritable\n");
> +               noverwritable++;
> +               continue;
> +             }
> +           else
> +             {
> +               if (dump_file)
> +                 fprintf (dump_file,
> +                          "Speculatively devirtualizing call in %s/%i to %s/%i\n",
> +                          cgraph_node_name (n), n->symbol.order,
> +                          cgraph_node_name (likely_target),
> +                          likely_target->symbol.order);
> +               if (!symtab_can_be_discarded ((symtab_node) likely_target))
> +                 likely_target = cgraph (symtab_nonoverwritable_alias ((symtab_node)likely_target));
> +               nconverted++;
> +               update = true;
> +               cgraph_turn_edge_to_speculative
> +                 (e, likely_target, e->count * 8 / 10, e->frequency * 8 / 10);
> +             }
> +         }
> +      if (update)
> +       inline_update_overall_summary (n);
> +    }
> +  pointer_set_destroy (bad_call_targets);
> +
> +  if (dump_file)
> +    fprintf (dump_file,
> +            "%i polymorphic calls, %i devirtualized,"
> +            " %i speculatively devirtualized, %i cold\n"
> +            "%i have multiple targets, %i overwritable,"
> +            " %i already speculated (%i agree, %i disagree),"
> +            " %i external, %i not defined\n",
> +            npolymorphic, ndevirtualized, nconverted, ncold,
> +            nmultiple, noverwritable, nspeculated, nok, nwrong,
> +            nexternal, nnotdefined);
> +  return ndevirtualized ? TODO_remove_functions : 0;
> +}
> +
> +/* Gate for IPCP optimization.  */
> +
> +static bool
> +gate_ipa_devirt (void)
> +{
> +  /* FIXME: We should remove the optimize check after we ensure we never run
> +     IPA passes when not optimizing.  */
> +  return (flag_devirtualize || flag_devirtualize_speculatively) && !in_lto_p;
> +}
> +
> +namespace {
> +
> +const pass_data pass_data_ipa_devirt =
> +{
> +  IPA_PASS, /* type */
> +  "devirt", /* name */
> +  OPTGROUP_NONE, /* optinfo_flags */
> +  true, /* has_gate */
> +  true, /* has_execute */
> +  TV_IPA_DEVIRT, /* tv_id */
> +  0, /* properties_required */
> +  0, /* properties_provided */
> +  0, /* properties_destroyed */
> +  0, /* todo_flags_start */
> +  ( TODO_dump_symtab ), /* todo_flags_finish */
> +};
> +
> +class pass_ipa_devirt : public ipa_opt_pass_d
> +{
> +public:
> +  pass_ipa_devirt(gcc::context *ctxt)
> +    : ipa_opt_pass_d(pass_data_ipa_devirt, ctxt,
> +                    NULL, /* generate_summary */
> +                    NULL, /* write_summary */
> +                    NULL, /* read_summary */
> +                    NULL, /* write_optimization_summary */
> +                    NULL, /* read_optimization_summary */
> +                    NULL, /* stmt_fixup */
> +                    0, /* function_transform_todo_flags_start */
> +                    NULL, /* function_transform */
> +                    NULL) /* variable_transform */
> +  {}
> +
> +  /* opt_pass methods: */
> +  bool gate () { return gate_ipa_devirt (); }
> +  unsigned int execute () { return ipa_devirt (); }
> +
> +}; // class pass_ipa_devirt
> +
> +} // anon namespace
> +
> +ipa_opt_pass_d *
> +make_pass_ipa_devirt (gcc::context *ctxt)
> +{
> +  return new pass_ipa_devirt (ctxt);
> +}
> +
>  #include "gt-ipa-devirt.h"
> Index: opts.c
> ===================================================================
> --- opts.c      (revision 202136)
> +++ opts.c      (working copy)
> @@ -479,6 +479,7 @@ static const struct default_options defa
>      { OPT_LEVELS_2_PLUS, OPT_ftree_switch_conversion, NULL, 1 },
>      { OPT_LEVELS_2_PLUS, OPT_fipa_cp, NULL, 1 },
>      { OPT_LEVELS_2_PLUS, OPT_fdevirtualize, NULL, 1 },
> +    { OPT_LEVELS_2_PLUS, OPT_fdevirtualize_speculatively, NULL, 1 },
>      { OPT_LEVELS_2_PLUS, OPT_fipa_sra, NULL, 1 },
>      { OPT_LEVELS_2_PLUS, OPT_falign_loops, NULL, 1 },
>      { OPT_LEVELS_2_PLUS, OPT_falign_jumps, NULL, 1 },
> @@ -1665,6 +1666,11 @@ common_handle_option (struct gcc_options
>         opts->x_flag_vect_cost_model = value;
>        if (!opts_set->x_flag_tree_loop_distribute_patterns)
>         opts->x_flag_tree_loop_distribute_patterns = value;
> +      /* Indirect call profiling should do all useful transformations
> +        speculative devirutalization does.  */
> +      if (!opts_set->x_flag_devirtualize_speculatively
> +         && opts->x_flag_value_profile_transformations)
> +       opts->x_flag_devirtualize_speculatively = false;
>        break;
>
>      case OPT_fprofile_generate_:
> Index: passes.def
> ===================================================================
> --- passes.def  (revision 202136)
> +++ passes.def  (working copy)
> @@ -104,6 +104,7 @@ along with GCC; see the file COPYING3.
>    INSERT_PASSES_AFTER (all_regular_ipa_passes)
>    NEXT_PASS (pass_ipa_whole_program_visibility);
>    NEXT_PASS (pass_ipa_profile);
> +  NEXT_PASS (pass_ipa_devirt);
>    NEXT_PASS (pass_ipa_cp);
>    NEXT_PASS (pass_ipa_cdtor_merge);
>    NEXT_PASS (pass_ipa_inline);
> Index: timevar.def
> ===================================================================
> --- timevar.def (revision 202136)
> +++ timevar.def (working copy)
> @@ -66,6 +66,7 @@ DEFTIMEVAR (TV_CGRAPH                , "
>  DEFTIMEVAR (TV_CGRAPHOPT             , "callgraph optimization")
>  DEFTIMEVAR (TV_IPA_INHERITANCE       , "ipa inheritance graph")
>  DEFTIMEVAR (TV_IPA_VIRTUAL_CALL      , "ipa virtual call target")
> +DEFTIMEVAR (TV_IPA_DEVIRT           , "ipa devirtualization")
>  DEFTIMEVAR (TV_IPA_CONSTANT_PROP     , "ipa cp")
>  DEFTIMEVAR (TV_IPA_INLINING          , "ipa inlining heuristics")
>  DEFTIMEVAR (TV_IPA_FNSPLIT           , "ipa function splitting")
> Index: tree-pass.h
> ===================================================================
> --- tree-pass.h (revision 202136)
> +++ tree-pass.h (working copy)
> @@ -468,6 +468,7 @@ extern simple_ipa_opt_pass *make_pass_ip
>  extern simple_ipa_opt_pass *make_pass_ipa_free_inline_summary (gcc::context
>                                                                *ctxt);
>  extern ipa_opt_pass_d *make_pass_ipa_cp (gcc::context *ctxt);
> +extern ipa_opt_pass_d *make_pass_ipa_devirt (gcc::context *ctxt);
>  extern ipa_opt_pass_d *make_pass_ipa_reference (gcc::context *ctxt);
>  extern ipa_opt_pass_d *make_pass_ipa_pure_const (gcc::context *ctxt);
>  extern simple_ipa_opt_pass *make_pass_ipa_pta (gcc::context *ctxt);

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: Type inheritance graph analysis & speculative devirtualization, part 7/7 (speculative devirtualizatoin)
  2013-09-01 16:04 ` Xinliang David Li
@ 2013-09-01 16:39   ` Jan Hubicka
  0 siblings, 0 replies; 5+ messages in thread
From: Jan Hubicka @ 2013-09-01 16:39 UTC (permalink / raw)
  To: Xinliang David Li
  Cc: Jan Hubicka, GCC Patches, Jason Merrill, Nathan Froyd,
	Taras Glek, Mike Hommey, Martin Liška

> Missing test cases?
Good point.  The testcases I have needs rest of the patches from the series to
hit the mainline.  For now I have added the following to test the basic scenario

	* g++.dg/ipa/devirt-15.C: New testcase.
Index: g++.dg/ipa/devirt-15.C
===================================================================
--- g++.dg/ipa/devirt-15.C	(revision 0)
+++ g++.dg/ipa/devirt-15.C	(working copy)
@@ -0,0 +1,40 @@
+/* Check that we speculatively devirutalize call to FOO to A::foo becuase
+   B is noreturn.  */
+/* { dg-do run } */
+/* { dg-options "-O2 -fdump-ipa-devirt"  } */
+class A {
+public:
+  virtual int foo(void)
+    {
+      throw (1);
+      return 0;
+    }
+};
+
+
+class B : public A {
+public:
+  virtual int foo(void);
+};
+
+int
+B::foo(void)
+{
+  return 1;
+}
+class A a, *b=&a;
+void
+m(void)
+{
+  b->foo();
+}
+main()
+{
+  m();
+}
+
+/* { dg-final { scan-ipa-dump "Speculatively devirtualizing call" "devirt"} } */
+/* { dg-final { cleanup-ipa-dump "devirt" } } */
+/* Match if (PROF_6 == foo) to verify that the speculation survived.  */
+/* { dg-final { scan-tree-dump "== foo" "optimized"} } */
+/* { dg-final { cleanup-tree-dump "optimized" } } */
> 
> Have you tested the optimization with SPEC2k and SPEC06? There are a
> couple of benchmarks benefit greatly from devirtualization, such as
> eon, povray etc.   I believe astar will probably improve with this
> optimization at O2 (it has hot virtual functions that are not
> overridden at all). For eon, the assumption at O2 for speculative
> devirt may not work well.

I trusted Martin (Jambor) that only interesting testcase for devirtualization
in SPEC in xalancbmk.  We will definitely see tomorrow.

Concerning eon, I think it includes all the classes for individual objects so
it should not get devirtualized the wrong way.  I will double check.  Wrong
guess in general should not be terribly expensive. Many speculative calls are
eliminated by inlining and other will just end up with the extra
conditinal+jump (to jump).  For firefox the code growth for all those 30000
devirtualization is about 1.8%.  Here about 70% of them survive till
final executable since they use virtual functions for very cheap accessors
that gets inlined well.

There is an accounting bug in ipa-cp that makes us to mistakely remove
references to some virutal functions as dead after devirtualization, so I
disabled tracking of described references for virtual calls.  Once this is
fixed, the code size may get better.

Honza

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: Type inheritance graph analysis & speculative devirtualization, part 7/7 (speculative devirtualizatoin)
       [not found] ` <52260602.5090500@mozilla.com>
@ 2013-09-03 16:24   ` Jan Hubicka
  0 siblings, 0 replies; 5+ messages in thread
From: Jan Hubicka @ 2013-09-03 16:24 UTC (permalink / raw)
  To: Taras Glek
  Cc: Jan Hubicka, gcc-patches, jason, Nathan Froyd, glandium, marxin.liska

> What is the footprint impact of speculative devirtualization?

It is less than 2% of text section and once we solve problems with ipa-prop tracking,
I hope it will be less.

I hope we now understand better how to devirtualize and I think we can improve non-speculative
devirt noticeably.

> 
> Firefox is sensitive to footprint increases, especially on Android startup.

That is what Martin's patch is expected to solve.  With his ordering only
fraction of the text and rodata section needs to be preloaded.  We are still
chasing some issues there, but it seems that "only" about 10MB is needed
(out of 60MB binary).  Not accounting dynamic linking of data.rel.ro that
adds around another 5MB.  It depends on FDO though.

Honza

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: Type inheritance graph analysis & speculative devirtualization, part 7/7 (speculative devirtualizatoin)
  2013-09-01 13:57 Type inheritance graph analysis & speculative devirtualization, part 7/7 (speculative devirtualizatoin) Jan Hubicka
  2013-09-01 16:04 ` Xinliang David Li
       [not found] ` <52260602.5090500@mozilla.com>
@ 2014-11-14 19:01 ` H.J. Lu
  2 siblings, 0 replies; 5+ messages in thread
From: H.J. Lu @ 2014-11-14 19:01 UTC (permalink / raw)
  To: Jan Hubicka
  Cc: GCC Patches, Jason Merrill, Nathan Froyd, Taras Glek, glandium,
	Martin Liška

On Sun, Sep 1, 2013 at 6:57 AM, Jan Hubicka <hubicka@ucw.cz> wrote:
> Hi,
> this patch implement speculative devirtualization.  It is a trivial pass that
> asks for targets of every polymorphic call in a program and if the list
> contains one likely target, it produces an speculative call. No context
> sensitive analysis is done at the moment.  This call may or may not survive
> into final program depending if we do somehting useful about the direct call.
>
> The pass is currently disabled for LTO because
> http://gcc.gnu.org/ml/gcc-patches/2013-08/msg01007.html is needed to properly
> build type inheritance graph.
>
> With LTO it is supposed to be effective on a premise that most types are not
> escaping LTO unit and thus we see all possible targets.  Without LTO it makes
> stronger assumption that you usually call only targets defined in current unit,
> if any.
>
> Path is suprisingly effective on Firefox:
> 105306 polymorphic calls, 0 devirtualized, 34258 speculatively devirtualized, 4056 cold
> 66875 have multiple targets, 0 overwritable, 0 already speculated (0 agree, 0 disagree), 0 not defined
>
> So about 32% of calls are devirutalized.  By random checking, these can be
> tracked to real occurences of code where virtual is used in a silly way.
> I plan to introduce warning for that (I have code for that already since it
> makes it easier to analyze what changes are really made and why).
>
> Martin Liska rebuilt with FDO based indirect call resolution.  Here we get:
> 23928 indirect calls trained.
> 12837 (53.65%) have common target.
> 342 (1.43%) targets was not found.
> 8378 (35.01%) speculations seems useless.
> 4117 (17.21%) speculations produced.
>
> I compared the overlap that is devirtualized by both techniques.  There is
> almost 100% match, except that FDO code is not dealing well with thunks and
> those 342 calls that seem to go out of libxul into plugins.  I will fix the
> thunk issue later.
>
> I also tested QT, where the numbers are smaller - only about 20% of devirtualized
> calls, but largery things seems similar.
>
> For non-LTO build, the devirtualization also seems sane, there seems to be
> about 8% of miss rate on GCC bootstrap that seems acceptable. I tracked most of
> those down into randomly included headers that do define derived types of a
> given class that are unused in the current unit.  I think we can track this by
> computing reachable functions in current unit and looking for vtables actually
> used by construction.  One of such actually triggers undefined reference in
> build of libstdc++ and therefore I added the check disabling devirtualization
> to DECL_EXTERNAL for now.  It is because the libstdc++ header seems to have
> explicit instantiation of a template that is never linked with.
>
> I currently enabled the pass by default at -O2.  Based on the experience about
> missrate, we may want to disable it for -O2 non-LTO if it shows to be too risky
> on some codebases.
>
> Bootstrapped/regtested x86_64-linux and ppc64-linux, also tested with lto bootstrap
> with the LTO ODR code and tested on Firefox and QT builds. Will commit the patch
> later today.
>
> Comments are welcome.
> Honza
>
>         * common.opt (fdevirtualize-speculatively): New function.
>         * invoke.texi (fdevirtualize-speculatively): Document.
>         * ipa-devirt.c: Include ipa-inline.h
>         (likely_target_p): New function.
>         (ipa_devirt): New function.
>         (gate_ipa_devirt): New function.
>         (pass_data_ipa_devirt): New static var.
>         (pass_ipa_devirt): Likewise.
>         (make_pass_ipa_devirt): New function.
>         * opts.c (default_options): Add OPT_fdevirtualize_speculatively.
>         (common_handle_option): Disable devirtualization when
>         value range profiling is available.
>         * passes.def (pass_ipa_devirt): Add.
>         * timever.def (TV_IPA_DEVIRT): New timevar.
>         * tree-pass.h (make_pass_ipa_devirt):
>

This triggered:

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63814

H.J.

^ permalink raw reply	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2014-11-14 18:42 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2013-09-01 13:57 Type inheritance graph analysis & speculative devirtualization, part 7/7 (speculative devirtualizatoin) Jan Hubicka
2013-09-01 16:04 ` Xinliang David Li
2013-09-01 16:39   ` Jan Hubicka
     [not found] ` <52260602.5090500@mozilla.com>
2013-09-03 16:24   ` Jan Hubicka
2014-11-14 19:01 ` H.J. Lu

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).