public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed
From: Martin Jambor <mjambor@suse.cz>
To: GCC Patches <gcc-patches@gcc.gnu.org>
Cc: Jan Hubicka <jh@suse.cz>, Xionghu Luo <luoxhu@linux.ibm.com>
Subject: [PATCH 4/4] ipa-cp: Select saner profile count to base heuristics on
Date: Mon, 23 Aug 2021 20:49:14 +0200	[thread overview]
Message-ID: <96160a5131c9e5eb302fb9f4db43c5d8b4cfe042.1629805719.git.mjambor@suse.cz> (raw)
In-Reply-To: <cover.1629805719.git.mjambor@suse.cz>

When profile feedback is available, IPA-CP takes the count of the
hottest node and then evaluates all call contexts relative to it.
This means that typically almost no clones for specialized contexts
are ever created because the maximum is some special function, called
from everywhere (that is likely to get inlined anyway) and all the
examined edges look cold compared to it.

This patch changes the selection.  It simply sorts counts of all edges
eligible for cloning in a vector and then picks the count in 90th
percentile (the actual number is configurable via a parameter).

I also tried more complex approaches which were summing the counts and
picking the edge which together with all hotter edges accounted for a
given portion of the total sum of all edge counts.  But first it was
not apparently clear to me that they make more logical sense that the
simple method and practically I always also had to ignore a few
percent of the hottest edges with really extreme counts (looking at
bash and python).  And when I had to do that anyway, it seemed simpler
to just "ignore" more and take the first non-ignored count as the
base.

Nevertheless, if people think some more sophisticated method should be
used anyway, I am willing to be persuaded.  But this patch is a clear
improvement over the current situation.

gcc/ChangeLog:

2021-08-23  Martin Jambor  <mjambor@suse.cz>

	* params.opt (param_ipa_cp_profile_count_base): New parameter.
	* ipa-cp.c (max_count): Replace with base_count, replace all
	occurrences too, unless otherwise stated.
	(ipcp_cloning_candidate_p): identify mostly-directly called
	functions based on their counts, not max_count.
	(compare_edge_profile_counts): New function.
	(ipcp_propagate_stage): Instead of setting max_count, find the
	appropriate edge count in a sorted vector of counts of eligible
	edges and make it the base_count.
---
 gcc/ipa-cp.c   | 82 +++++++++++++++++++++++++++++++++++++++++++++-----
 gcc/params.opt |  4 +++
 2 files changed, 78 insertions(+), 8 deletions(-)

diff --git a/gcc/ipa-cp.c b/gcc/ipa-cp.c
index 53cca7aa804..6ab74f61e83 100644
--- a/gcc/ipa-cp.c
+++ b/gcc/ipa-cp.c
@@ -400,9 +400,9 @@ object_allocator<ipcp_value_source<tree> > ipcp_sources_pool
 object_allocator<ipcp_agg_lattice> ipcp_agg_lattice_pool
   ("IPA_CP aggregate lattices");
 
-/* Maximal count found in program.  */
+/* Base count to use in heuristics when using profile feedback.  */
 
-static profile_count max_count;
+static profile_count base_count;
 
 /* Original overall size of the program.  */
 
@@ -809,7 +809,8 @@ ipcp_cloning_candidate_p (struct cgraph_node *node)
   /* When profile is available and function is hot, propagate into it even if
      calls seems cold; constant propagation can improve function's speed
      significantly.  */
-  if (max_count > profile_count::zero ())
+  if (stats.count_sum > profile_count::zero ()
+      && node->count.ipa ().initialized_p ())
     {
       if (stats.count_sum > node->count.ipa ().apply_scale (90, 100))
 	{
@@ -3310,10 +3311,10 @@ good_cloning_opportunity_p (struct cgraph_node *node, sreal time_benefit,
 
   ipa_node_params *info = ipa_node_params_sum->get (node);
   int eval_threshold = opt_for_fn (node->decl, param_ipa_cp_eval_threshold);
-  if (max_count > profile_count::zero ())
+  if (base_count > profile_count::zero ())
     {
 
-      sreal factor = count_sum.probability_in (max_count).to_sreal ();
+      sreal factor = count_sum.probability_in (base_count).to_sreal ();
       sreal evaluation = (time_benefit * factor) / size_cost;
       evaluation = incorporate_penalties (node, info, evaluation);
       evaluation *= 1000;
@@ -3950,6 +3951,21 @@ value_topo_info<valtype>::propagate_effects ()
     }
 }
 
+/* Callback for qsort to sort counts of all edges.  */
+
+static int
+compare_edge_profile_counts (const void *a, const void *b)
+{
+  const profile_count *cnt1 = (const profile_count *) a;
+  const profile_count *cnt2 = (const profile_count *) b;
+
+  if (*cnt1 < *cnt2)
+    return 1;
+  if (*cnt1 > *cnt2)
+    return -1;
+  return 0;
+}
+
 
 /* Propagate constants, polymorphic contexts and their effects from the
    summaries interprocedurally.  */
@@ -3962,8 +3978,10 @@ ipcp_propagate_stage (class ipa_topo_info *topo)
   if (dump_file)
     fprintf (dump_file, "\n Propagating constants:\n\n");
 
-  max_count = profile_count::uninitialized ();
+  base_count = profile_count::uninitialized ();
 
+  bool compute_count_base = false;
+  unsigned base_count_pos_percent = 0;
   FOR_EACH_DEFINED_FUNCTION (node)
   {
     if (node->has_gimple_body_p ()
@@ -3981,9 +3999,57 @@ ipcp_propagate_stage (class ipa_topo_info *topo)
     ipa_size_summary *s = ipa_size_summaries->get (node);
     if (node->definition && !node->alias && s != NULL)
       overall_size += s->self_size;
-    max_count = max_count.max (node->count.ipa ());
+    if (node->count.ipa ().initialized_p ())
+      {
+	compute_count_base = true;
+	unsigned pos_percent = opt_for_fn (node->decl,
+					   param_ipa_cp_profile_count_base);
+	base_count_pos_percent = MAX (base_count_pos_percent, pos_percent);
+      }
   }
 
+  if (compute_count_base)
+    {
+      auto_vec<profile_count> all_edge_counts;
+      all_edge_counts.reserve_exact (symtab->edges_count);
+      FOR_EACH_DEFINED_FUNCTION (node)
+	for (cgraph_edge *cs = node->callees; cs; cs = cs->next_callee)
+	  {
+	    profile_count count = cs->count.ipa ();
+	    if (!(count > profile_count::zero ()))
+	      continue;
+
+	    enum availability avail;
+	    cgraph_node *tgt
+	      = cs->callee->function_or_virtual_thunk_symbol (&avail);
+	    ipa_node_params *info = ipa_node_params_sum->get (tgt);
+	    if (info && info->versionable)
+	      all_edge_counts.quick_push (count);
+	  }
+
+      if (!all_edge_counts.is_empty ())
+	{
+	  gcc_assert (base_count_pos_percent <= 100);
+	  all_edge_counts.qsort (compare_edge_profile_counts);
+
+	  unsigned base_count_pos
+	    = ((all_edge_counts.length () * (base_count_pos_percent)) / 100);
+	  base_count = all_edge_counts[base_count_pos];
+
+	  if (dump_file)
+	    {
+	      fprintf (dump_file, "\nSelected base_count from %u edges at "
+		       "position %u, arriving at: ", all_edge_counts.length (),
+		       base_count_pos);
+	      base_count.dump (dump_file);
+	      fprintf (dump_file, "\n");
+	    }
+	}
+      else if (dump_file)
+	fprintf (dump_file, "\nNo candidates with non-zero call count found, "
+		 "continuing as if without profile feedback.\n");
+    }
+
   orig_overall_size = overall_size;
 
   if (dump_file)
@@ -6576,7 +6642,7 @@ make_pass_ipa_cp (gcc::context *ctxt)
 void
 ipa_cp_c_finalize (void)
 {
-  max_count = profile_count::uninitialized ();
+  base_count = profile_count::uninitialized ();
   overall_size = 0;
   orig_overall_size = 0;
   ipcp_free_transformation_sum ();
diff --git a/gcc/params.opt b/gcc/params.opt
index 8d772309407..5223f784bf0 100644
--- a/gcc/params.opt
+++ b/gcc/params.opt
@@ -290,6 +290,10 @@ The size of translation unit that IPA-CP pass considers large.
 Common Joined UInteger Var(param_ipa_cp_value_list_size) Init(8) Param Optimization
 Maximum size of a list of values associated with each parameter for interprocedural constant propagation.
 
+-param=ipa-cp-profile-count-base=
+Common Joined UInteger Var(param_ipa_cp_profile_count_base) Init(10) IntegerRange(0, 100) Param Optimization
+When using profile feedback, use the edge at this percentage position in frequncy histogram as the bases for IPA-CP heuristics.
+
 -param=ipa-jump-function-lookups=
 Common Joined UInteger Var(param_ipa_jump_function_lookups) Init(8) Param Optimization
 Maximum number of statements visited during jump function offset discovery.
-- 
2.32.0

  parent reply	other threads:[~2021-08-24 11:52 UTC|newest]

Thread overview: 16+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-08-24 11:48 [PATCH 0/4] IPA-CP profile feedback handling fixes Martin Jambor
2021-08-20 17:43 ` [PATCH 3/4] ipa-cp: Fix updating of profile counts and self-gen value evaluation Martin Jambor
2021-10-08 11:31   ` Jan Hubicka
2021-10-18 16:56     ` Martin Jambor
2021-10-27 13:18       ` Martin Jambor
2021-08-20 17:43 ` [PATCH 2/4] ipa-cp: Propagation boost for recursion generated values Martin Jambor
2021-10-06 15:49   ` Jan Hubicka
2021-10-07 14:59     ` Martin Jambor
2021-10-07 15:25       ` Jan Hubicka
2021-08-20 17:43 ` [PATCH 1/4] cgraph: Do not warn about caller count mismatches of removed functions Martin Jambor
2021-09-16 15:10   ` Martin Jambor
2021-08-23 18:49 ` Martin Jambor [this message]
2021-10-06 15:33   ` [PATCH 4/4] ipa-cp: Select saner profile count to base heuristics on Jan Hubicka
2021-10-18 17:10     ` Martin Jambor
2021-10-27 13:22       ` Martin Jambor
2021-10-27 13:20   ` Martin Jambor

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=96160a5131c9e5eb302fb9f4db43c5d8b4cfe042.1629805719.git.mjambor@suse.cz \
    --to=mjambor@suse.cz \
    --cc=gcc-patches@gcc.gnu.org \
    --cc=jh@suse.cz \
    --cc=luoxhu@linux.ibm.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).