public inbox for gcc-patches@gcc.gnu.org
* [PATCH] Remove arc profile histogram in non-LTO mode.
@ 2018-09-19 18:19 Martin Liška
From: Martin Liška @ 2018-09-19 18:19 UTC
  To: gcc-patches; +Cc: Jan Hubicka

[-- Attachment #1: Type: text/plain, Size: 5121 bytes --]

Hello.

I've been working for some time on a patch that simplifies how we set
the hotness threshold of basic blocks. Currently, we calculate so-called
arc profile histograms that should identify the edges that cover 99.9% of all
branching. These edges are then treated as hot. The disadvantage of this approach
is that it comes with significant run-time overhead, and the related GCC code
is not trivial either. Moreover, whenever a histogram is merged after an instrumented
run, the resulting histogram is misleading.
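
For illustration only, here is a minimal sketch of what "the edges that
cover 99.9% of all branching" means. It is not the code being removed: it
sorts exact counters instead of using log2-scaled histogram buckets, and the
name working_set_min_counter is hypothetical. The 99.9% target is
approximated as sum_all - sum_all / 1024, the same shift-friendly
approximation used by the removed compute_working_sets.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* qsort comparator: descending order of counter value.  */
static int
cmp_desc (const void *a, const void *b)
{
  int64_t x = *(const int64_t *) a;
  int64_t y = *(const int64_t *) b;
  return (x < y) - (x > y);
}

/* Hypothetical helper: return the smallest counter that still belongs to
   the set of hottest counters covering roughly 99.9% of the total
   execution count.  Sorts COUNTS in place; returns 0 when N is 0.  */
static int64_t
working_set_min_counter (int64_t *counts, size_t n)
{
  int64_t sum_all = 0;
  int64_t cum = 0;

  assert (counts != NULL || n == 0);
  if (n == 0)
    return 0;

  for (size_t i = 0; i < n; i++)
    sum_all += counts[i];

  /* Roughly 99.9% of the total count.  */
  int64_t target = sum_all - sum_all / 1024;

  /* Walk the counters from hottest to coldest until the target is met.  */
  qsort (counts, n, sizeof counts[0], cmp_desc);
  for (size_t i = 0; i < n; i++)
    {
      cum += counts[i];
      if (cum >= target)
	return counts[i];
    }
  return 0;
}
```

Every counter at or above the returned value would be considered hot under
this scheme. The real implementation has to approximate this with bucketed
histograms gathered at run time, which is where the overhead comes from.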

That said, I decided to simplify it again, remove the usage of the histogram and return
to what we had before (--param hot-bb-count-fraction). That basically says that
we consider hot each edge whose execution count is bigger than sum_max / 10000.
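
As a rough sketch of that rule (the struct and function names below are
hypothetical stand-ins, not the identifiers used in predict.c, and the
strict comparison simply follows the wording above):

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the simplified profile summary, which keeps
   only the number of runs and the sum of the per-run maximum counters.  */
struct summary_sketch
{
  unsigned runs;
  int64_t sum_max;
};

/* Threshold derived from the summary and the count fraction
   (the value of --param hot-bb-count-fraction).  */
static int64_t
hot_threshold (const struct summary_sketch *s, int64_t fraction)
{
  assert (fraction > 0);
  return s->sum_max / fraction;
}

/* Nonzero if an edge executed COUNT times is considered hot.  */
static int
is_hot (int64_t count, const struct summary_sketch *s, int64_t fraction)
{
  return count > hot_threshold (s, fraction);
}
```

With sum_max = 1000000 and a fraction of 10000, the threshold is 100, so an
edge executed 101 times is hot and one executed 100 times is not. No
histogram is needed at run time, only the single sum_max value.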

Note that LTO+PGO remains untouched, as it still uses a histogram that is
computed dynamically from the arc counts that are read in.

Note the statistics of the patch:
  19 files changed, 101 insertions(+), 1216 deletions(-)

I'm attaching the file sizes of the SPEC2006 int benchmarks.

The patch survives testing on an x86_64-linux-gnu machine.
Ready to be installed?

Martin

gcc/ChangeLog:

2018-09-19  Martin Liska  <mliska@suse.cz>

	* auto-profile.c (autofdo_source_profile::read): Do not
	set sum_all.
	(read_profile): Do not add working sets.
	(read_autofdo_file): Remove sum_all.
	(afdo_callsite_hot_enough_for_early_inline): Remove const
	qualifier.
	* coverage.c (struct counts_entry): Remove gcov_summary.
	(read_counts_file): Read new GCOV_TAG_OBJECT_SUMMARY,
	do not support GCOV_TAG_PROGRAM_SUMMARY.
	(get_coverage_counts): Remove summary and expected
	arguments.
	* coverage.h (get_coverage_counts): Likewise.
	* doc/gcov-dump.texi: Remove -w option.
	* gcov-dump.c (dump_working_sets): Remove.
	(main): Do not support '-w' option.
	(print_usage): Likewise.
	(tag_summary): Likewise.
	* gcov-io.c (gcov_write_summary): Do not dump
	histogram.
	(gcov_read_summary): Likewise.
	(gcov_histo_index): Remove.
	(gcov_histogram_merge): Likewise.
	(compute_working_sets): Likewise.
	* gcov-io.h (GCOV_TAG_OBJECT_SUMMARY): Mark
	it not obsolete.
	(GCOV_TAG_PROGRAM_SUMMARY): Mark it obsolete.
	(GCOV_TAG_SUMMARY_LENGTH): Adjust.
	(GCOV_HISTOGRAM_SIZE): Remove.
	(GCOV_HISTOGRAM_BITVECTOR_SIZE): Likewise.
	(struct gcov_summary): Simplify rapidly just
	to runs and sum_max fields.
	(gcov_histo_index): Remove.
	(NUM_GCOV_WORKING_SETS): Likewise.
	(compute_working_sets): Likewise.
	* gcov-tool.c (print_overlap_usage_message): Remove
	trailing empty line.
	* gcov.c (read_count_file): Read GCOV_TAG_OBJECT_SUMMARY.
	(output_lines): Remove program related line.
	* ipa-profile.c (ipa_profile): Do not consider GCOV histogram.
	* lto-cgraph.c (output_profile_summary): Do not stream GCOV
	histogram.
	(input_profile_summary): Do not read it.
	(merge_profile_summaries): And do not merge it.
	(input_symtab): Do not call removed function.
	* modulo-sched.c (sms_schedule): Do not print sum_max.
	* params.def (HOT_BB_COUNT_FRACTION): Reincarnate param that was
	removed when histogram method was invented.
	(HOT_BB_COUNT_WS_PERMILLE): Mention that it's used only in LTO
	mode.
	* postreload-gcse.c (eliminate_partially_redundant_load): Fix
	GCOV coding style.
	* predict.c (get_hot_bb_threshold): Use HOT_BB_COUNT_FRACTION
	and dump selected value.
	* profile.c (add_working_set): Remove.
	(get_working_sets): Likewise.
	(find_working_set): Likewise.
	(get_exec_counts): Do not work with working sets.
	(read_profile_edge_counts): Do not inform as sum_max is removed.
	(compute_branch_probabilities): Likewise.
	(compute_value_histograms): Remove argument for call of
	get_coverage_counts.
	* profile.h: Do not make gcov_summary const.

libgcc/ChangeLog:

2018-09-19  Martin Liska  <mliska@suse.cz>

	* libgcov-driver.c (crc32_unsigned): Remove.
	(gcov_histogram_insert): Likewise.
	(gcov_compute_histogram): Likewise.
	(compute_summary): Simplify rapidly.
	(merge_one_data): Do not handle PROGRAM_SUMMARY tag.
	(merge_summary): Rapidly simplify.
	(dump_one_gcov): Ignore gcov_summary.
	(gcov_do_dump): Do not handle program summary, it's not
	used.
	* libgcov-util.c (tag_summary): Remove.
	(read_gcda_finalize): Fix coding style.
	(read_gcda_file): Initialize curr_object_summary.
	(compute_summary): Remove.
	(calculate_overlap): Remove settings of run_max.
---
  gcc/auto-profile.c      |  21 +--
  gcc/coverage.c          |  59 +-----
  gcc/coverage.h          |   4 +-
  gcc/doc/gcov-dump.texi  |   6 +-
  gcc/gcov-dump.c         |  81 +-------
  gcc/gcov-io.c           | 398 +---------------------------------------
  gcc/gcov-io.h           |  71 +------
  gcc/gcov-tool.c         |   1 -
  gcc/gcov.c              |   7 +-
  gcc/ipa-profile.c       |  26 +--
  gcc/lto-cgraph.c        | 136 +-------------
  gcc/modulo-sched.c      |   8 -
  gcc/params.def          |   7 +-
  gcc/postreload-gcse.c   |   2 +-
  gcc/predict.c           |   9 +-
  gcc/profile.c           | 116 +-----------
  gcc/profile.h           |   2 +-
  libgcc/libgcov-driver.c | 324 ++++----------------------------
  libgcc/libgcov-util.c   |  39 +---
  19 files changed, 101 insertions(+), 1216 deletions(-)



[-- Attachment #2: 0001-Remove-arc-profile-histogram-in-non-LTO-mode.patch --]
[-- Type: text/x-patch, Size: 73496 bytes --]

diff --git a/gcc/auto-profile.c b/gcc/auto-profile.c
index 197fa10e08c..68abe327cce 100644
--- a/gcc/auto-profile.c
+++ b/gcc/auto-profile.c
@@ -867,7 +867,6 @@ autofdo_source_profile::read ()
       function_instance::function_instance_stack stack;
       function_instance *s = function_instance::read_function_instance (
           &stack, gcov_read_counter ());
-      afdo_profile_info->sum_all += s->total_count ();
       map_[s->name ()] = s;
     }
   return true;
@@ -958,23 +957,6 @@ read_profile (void)
 
   /* autofdo_module_profile.  */
   fake_read_autofdo_module_profile ();
-
-  /* Read in the working set.  */
-  if (gcov_read_unsigned () != GCOV_TAG_AFDO_WORKING_SET)
-    {
-      error ("cannot read working set from %s", auto_profile_file);
-      return;
-    }
-
-  /* Skip the length of the section.  */
-  gcov_read_unsigned ();
-  gcov_working_set_t set[128];
-  for (unsigned i = 0; i < 128; i++)
-    {
-      set[i].num_counters = gcov_read_unsigned ();
-      set[i].min_counter = gcov_read_counter ();
-    }
-  add_working_set (set);
 }
 
 /* From AutoFDO profiles, find values inside STMT for that we want to measure
@@ -1685,7 +1667,6 @@ read_autofdo_file (void)
   autofdo::afdo_profile_info = XNEW (gcov_summary);
   autofdo::afdo_profile_info->runs = 1;
   autofdo::afdo_profile_info->sum_max = 0;
-  autofdo::afdo_profile_info->sum_all = 0;
 
   /* Read the profile from the profile file.  */
   autofdo::read_profile ();
@@ -1712,7 +1693,7 @@ afdo_callsite_hot_enough_for_early_inline (struct cgraph_edge *edge)
   if (count > 0)
     {
       bool is_hot;
-      const gcov_summary *saved_profile_info = profile_info;
+      gcov_summary *saved_profile_info = profile_info;
       /* At early inline stage, profile_info is not set yet. We need to
          temporarily set it to afdo_profile_info to calculate hotness.  */
       profile_info = autofdo::afdo_profile_info;
diff --git a/gcc/coverage.c b/gcc/coverage.c
index bae6f5cafac..26cce2bc63a 100644
--- a/gcc/coverage.c
+++ b/gcc/coverage.c
@@ -49,6 +49,7 @@ along with GCC; see the file COPYING3.  If not see
 #include "intl.h"
 #include "params.h"
 #include "auto-profile.h"
+#include "profile.h"
 
 #include "gcov-io.c"
 
@@ -73,7 +74,6 @@ struct counts_entry : pointer_hash <counts_entry>
   unsigned lineno_checksum;
   unsigned cfg_checksum;
   gcov_type *counts;
-  gcov_summary summary;
 
   /* hash_table support.  */
   static inline hashval_t hash (const counts_entry *);
@@ -185,8 +185,6 @@ static void
 read_counts_file (void)
 {
   gcov_unsigned_t fn_ident = 0;
-  gcov_summary summary;
-  unsigned new_summary = 1;
   gcov_unsigned_t tag;
   int is_error = 0;
   unsigned lineno_checksum = 0;
@@ -236,27 +234,12 @@ read_counts_file (void)
 	    }
 	  else
 	    fn_ident = lineno_checksum = cfg_checksum = 0;
-	  new_summary = 1;
 	}
-      else if (tag == GCOV_TAG_PROGRAM_SUMMARY)
+      else if (tag == GCOV_TAG_OBJECT_SUMMARY)
 	{
-	  struct gcov_summary sum;
-
-	  if (new_summary)
-	    memset (&summary, 0, sizeof (summary));
-
-	  gcov_read_summary (&sum);
-	  summary.runs += sum.runs;
-	  summary.sum_all += sum.sum_all;
-	  if (summary.run_max < sum.run_max)
-	    summary.run_max = sum.run_max;
-	  summary.sum_max += sum.sum_max;
-          if (new_summary)
-	    memcpy (summary.histogram, sum.histogram,
-		sizeof (gcov_bucket_type) * GCOV_HISTOGRAM_SIZE);
-          else
-	    gcov_histogram_merge (summary.histogram, sum.histogram);
-	  new_summary = 0;
+	  profile_info = XCNEW (gcov_summary);
+	  profile_info->runs = gcov_read_unsigned ();
+	  profile_info->sum_max = gcov_read_unsigned ();
 	}
       else if (GCOV_TAG_IS_COUNTER (tag) && fn_ident)
 	{
@@ -276,9 +259,6 @@ read_counts_file (void)
 	      entry->ctr = elt.ctr;
 	      entry->lineno_checksum = lineno_checksum;
 	      entry->cfg_checksum = cfg_checksum;
-	      if (elt.ctr == GCOV_COUNTER_ARCS)
-		entry->summary = summary;
-              entry->summary.num = n_counts;
 	      entry->counts = XCNEWVEC (gcov_type, n_counts);
 	    }
 	  else if (entry->lineno_checksum != lineno_checksum
@@ -292,22 +272,6 @@ read_counts_file (void)
 	      counts_hash = NULL;
 	      break;
 	    }
-	  else if (entry->summary.num != n_counts)
-	    {
-	      error ("Profile data for function %u is corrupted", fn_ident);
-	      error ("number of counters is %d instead of %d", entry->summary.num, n_counts);
-	      delete counts_hash;
-	      counts_hash = NULL;
-	      break;
-	    }
-	  else
-	    {
-	      entry->summary.runs += summary.runs;
-	      entry->summary.sum_all += summary.sum_all;
-	      if (entry->summary.run_max < summary.run_max)
-		entry->summary.run_max = summary.run_max;
-	      entry->summary.sum_max += summary.sum_max;
-	    }
 	  for (ix = 0; ix != n_counts; ix++)
 	    entry->counts[ix] += gcov_read_counter ();
 	}
@@ -330,9 +294,8 @@ read_counts_file (void)
 /* Returns the counters for a particular tag.  */
 
 gcov_type *
-get_coverage_counts (unsigned counter, unsigned expected,
-                     unsigned cfg_checksum, unsigned lineno_checksum,
-		     const gcov_summary **summary)
+get_coverage_counts (unsigned counter, unsigned cfg_checksum,
+		     unsigned lineno_checksum)
 {
   counts_entry *entry, elt;
 
@@ -363,14 +326,13 @@ get_coverage_counts (unsigned counter, unsigned expected,
     }
   elt.ctr = counter;
   entry = counts_hash->find (&elt);
-  if (!entry || !entry->summary.num)
+  if (!entry)
     /* The function was not emitted, or is weak and not chosen in the
        final executable.  Silently fail, because there's nothing we
        can do about it.  */
     return NULL;
   
-  if (entry->cfg_checksum != cfg_checksum
-      || entry->summary.num != expected)
+  if (entry->cfg_checksum != cfg_checksum)
     {
       static int warned = 0;
       bool warning_printed = false;
@@ -414,9 +376,6 @@ get_coverage_counts (unsigned counter, unsigned expected,
 	       DECL_ASSEMBLER_NAME (current_function_decl));
     }
 
-  if (summary)
-    *summary = &entry->summary;
-
   return entry->counts;
 }
 
diff --git a/gcc/coverage.h b/gcc/coverage.h
index 842d6952c16..d612c38d159 100644
--- a/gcc/coverage.h
+++ b/gcc/coverage.h
@@ -51,10 +51,8 @@ extern tree tree_coverage_counter_addr (unsigned /*counter*/, unsigned/*num*/);
 
 /* Get all the counters for the current function.  */
 extern gcov_type *get_coverage_counts (unsigned /*counter*/,
-				       unsigned /*expected*/,
 				       unsigned /*cfg_checksum*/,
-				       unsigned /*lineno_checksum*/,
-				       const gcov_summary **);
+				       unsigned /*lineno_checksum*/);
 
 extern tree get_gcov_type (void);
 extern bool coverage_node_map_initialized_p (void);
diff --git a/gcc/doc/gcov-dump.texi b/gcc/doc/gcov-dump.texi
index e526bdeb858..0313358cdb0 100644
--- a/gcc/doc/gcov-dump.texi
+++ b/gcc/doc/gcov-dump.texi
@@ -61,7 +61,7 @@ gcov-dump [@option{-v}|@option{--version}]
      [@option{-h}|@option{--help}]
      [@option{-l}|@option{--long}]
      [@option{-p}|@option{--positions}]
-     [@option{-w}|@option{--working-sets}] @var{gcovfiles}
+     @var{gcovfiles}
 @c man end
 @end ignore
 
@@ -84,10 +84,6 @@ Dump positions of records.
 @itemx --version
 Display the @command{gcov-dump} version number (on the standard output),
 and exit without doing any further processing.
-
-@item -w
-@itemx --working-sets
-Dump working set computed from summary.
 @end table
 
 @c man end
diff --git a/gcc/gcov-dump.c b/gcc/gcov-dump.c
index 3ff11a6aa0b..7762e4e8190 100644
--- a/gcc/gcov-dump.c
+++ b/gcc/gcov-dump.c
@@ -38,9 +38,6 @@ static void tag_arcs (const char *, unsigned, unsigned, unsigned);
 static void tag_lines (const char *, unsigned, unsigned, unsigned);
 static void tag_counters (const char *, unsigned, unsigned, unsigned);
 static void tag_summary (const char *, unsigned, unsigned, unsigned);
-static void dump_working_sets (const char *filename ATTRIBUTE_UNUSED,
-			       const gcov_summary *summary,
-			       unsigned depth);
 extern int main (int, char **);
 
 typedef struct tag_format
@@ -52,7 +49,6 @@ typedef struct tag_format
 
 static int flag_dump_contents = 0;
 static int flag_dump_positions = 0;
-static int flag_dump_working_sets = 0;
 
 static const struct option options[] =
 {
@@ -60,7 +56,6 @@ static const struct option options[] =
   { "version",              no_argument,       NULL, 'v' },
   { "long",                 no_argument,       NULL, 'l' },
   { "positions",	    no_argument,       NULL, 'o' },
-  { "working-sets",	    no_argument,       NULL, 'w' },
   { 0, 0, 0, 0 }
 };
 
@@ -77,7 +72,6 @@ static const tag_format_t tag_table[] =
   {GCOV_TAG_ARCS, "ARCS", tag_arcs},
   {GCOV_TAG_LINES, "LINES", tag_lines},
   {GCOV_TAG_OBJECT_SUMMARY, "OBJECT_SUMMARY", tag_summary},
-  {GCOV_TAG_PROGRAM_SUMMARY, "PROGRAM_SUMMARY", tag_summary},
   {0, NULL, NULL}
 };
 
@@ -117,9 +111,6 @@ main (int argc ATTRIBUTE_UNUSED, char **argv)
 	case 'p':
 	  flag_dump_positions = 1;
 	  break;
-	case 'w':
-	  flag_dump_working_sets = 1;
-	  break;
 	default:
 	  fprintf (stderr, "unknown flag `%c'\n", opt);
 	}
@@ -139,7 +130,6 @@ print_usage (void)
   printf ("  -l, --long           Dump record contents too\n");
   printf ("  -p, --positions      Dump record positions\n");
   printf ("  -v, --version        Print version number\n");
-  printf ("  -w, --working-sets   Dump working set computed from summary\n");
   printf ("\nFor bug reporting instructions, please see:\n%s.\n",
 	   bug_report_url);
 }
@@ -465,75 +455,10 @@ tag_counters (const char *filename ATTRIBUTE_UNUSED,
 static void
 tag_summary (const char *filename ATTRIBUTE_UNUSED,
 	     unsigned tag ATTRIBUTE_UNUSED, unsigned length ATTRIBUTE_UNUSED,
-	     unsigned depth)
+	     unsigned depth ATTRIBUTE_UNUSED)
 {
   gcov_summary summary;
-  unsigned h_ix;
-  gcov_bucket_type *histo_bucket;
-
   gcov_read_summary (&summary);
-  printf (" checksum=0x%08x", summary.checksum);
-
-  printf ("\n");
-  print_prefix (filename, depth, 0);
-  printf (VALUE_PADDING_PREFIX "counts=%u, runs=%u",
-	  summary.num, summary.runs);
-
-  printf (", sum_all=%" PRId64,
-	  (int64_t)summary.sum_all);
-  printf (", run_max=%" PRId64,
-	  (int64_t)summary.run_max);
-  printf (", sum_max=%" PRId64,
-	  (int64_t)summary.sum_max);
-  printf ("\n");
-  print_prefix (filename, depth, 0);
-  printf (VALUE_PADDING_PREFIX "counter histogram:");
-  for (h_ix = 0; h_ix < GCOV_HISTOGRAM_SIZE; h_ix++)
-    {
-      histo_bucket = &summary.histogram[h_ix];
-      if (!histo_bucket->num_counters)
-	continue;
-      printf ("\n");
-      print_prefix (filename, depth, 0);
-      printf (VALUE_PADDING_PREFIX VALUE_PREFIX "num counts=%u, "
-	      "min counter=%" PRId64 ", cum_counter=%" PRId64,
-	      h_ix, histo_bucket->num_counters,
-	      (int64_t)histo_bucket->min_value,
-	      (int64_t)histo_bucket->cum_value);
-    }
-  if (flag_dump_working_sets)
-    dump_working_sets (filename, &summary, depth);
-}
-
-static void
-dump_working_sets (const char *filename ATTRIBUTE_UNUSED,
-		   const gcov_summary *summary,
-		   unsigned depth)
-{
-  gcov_working_set_t gcov_working_sets[NUM_GCOV_WORKING_SETS];
-  unsigned ws_ix, pctinc, pct;
-  gcov_working_set_t *ws_info;
-
-  compute_working_sets (summary, gcov_working_sets);
-
-  printf ("\n");
-  print_prefix (filename, depth, 0);
-  printf (VALUE_PADDING_PREFIX "counter working sets:");
-  /* Multiply the percentage by 100 to avoid float.  */
-  pctinc = 100 * 100 / NUM_GCOV_WORKING_SETS;
-  for (ws_ix = 0, pct = pctinc; ws_ix < NUM_GCOV_WORKING_SETS;
-       ws_ix++, pct += pctinc)
-    {
-      if (ws_ix == NUM_GCOV_WORKING_SETS - 1)
-        pct = 9990;
-      ws_info = &gcov_working_sets[ws_ix];
-      /* Print out the percentage using int arithmatic to avoid float.  */
-      printf ("\n");
-      print_prefix (filename, depth + 1, 0);
-      printf (VALUE_PADDING_PREFIX "%u.%02u%%: num counts=%u, min counter="
-               "%" PRId64,
-               pct / 100, pct - (pct / 100 * 100),
-               ws_info->num_counters,
-               (int64_t)ws_info->min_counter);
-    }
+  printf (" runs=%d, sum_max=%" PRId64,
+	  summary.runs, summary.sum_max);
 }
diff --git a/gcc/gcov-io.c b/gcc/gcov-io.c
index 311e4d014bf..63cc7fcb048 100644
--- a/gcc/gcov-io.c
+++ b/gcc/gcov-io.c
@@ -446,39 +446,11 @@ gcov_write_tag_length (gcov_unsigned_t tag, gcov_unsigned_t length)
 GCOV_LINKAGE void
 gcov_write_summary (gcov_unsigned_t tag, const struct gcov_summary *summary)
 {
-  unsigned h_ix, bv_ix, h_cnt = 0;
-  unsigned histo_bitvector[GCOV_HISTOGRAM_BITVECTOR_SIZE];
-
-  /* Count number of non-zero histogram entries, and fill in a bit vector
-     of non-zero indices. The histogram is only currently computed for arc
-     counters.  */
-  for (bv_ix = 0; bv_ix < GCOV_HISTOGRAM_BITVECTOR_SIZE; bv_ix++)
-    histo_bitvector[bv_ix] = 0;
-  for (h_ix = 0; h_ix < GCOV_HISTOGRAM_SIZE; h_ix++)
-    if (summary->histogram[h_ix].num_counters)
-      {
-	histo_bitvector[h_ix / 32] |= 1 << (h_ix % 32);
-	h_cnt++;
-      }
-  gcov_write_tag_length (tag, GCOV_TAG_SUMMARY_LENGTH (h_cnt));
-  gcov_write_unsigned (summary->checksum);
-
-  gcov_write_unsigned (summary->num);
+  gcov_write_tag_length (tag, GCOV_TAG_SUMMARY_LENGTH);
   gcov_write_unsigned (summary->runs);
-  gcov_write_counter (summary->sum_all);
-  gcov_write_counter (summary->run_max);
-  gcov_write_counter (summary->sum_max);
-  for (bv_ix = 0; bv_ix < GCOV_HISTOGRAM_BITVECTOR_SIZE; bv_ix++)
-    gcov_write_unsigned (histo_bitvector[bv_ix]);
-  for (h_ix = 0; h_ix < GCOV_HISTOGRAM_SIZE; h_ix++)
-    {
-      if (!summary->histogram[h_ix].num_counters)
-	continue;
-      gcov_write_unsigned (summary->histogram[h_ix].num_counters);
-      gcov_write_counter (summary->histogram[h_ix].min_value);
-      gcov_write_counter (summary->histogram[h_ix].cum_value);
-    }
+  gcov_write_unsigned (summary->sum_max);
 }
+
 #endif /* IN_LIBGCOV */
 
 #endif /*!IN_GCOV */
@@ -637,65 +609,8 @@ gcov_read_string (void)
 GCOV_LINKAGE void
 gcov_read_summary (struct gcov_summary *summary)
 {
-  unsigned h_ix, bv_ix, h_cnt = 0;
-  unsigned histo_bitvector[GCOV_HISTOGRAM_BITVECTOR_SIZE];
-  unsigned cur_bitvector;
-
-  summary->checksum = gcov_read_unsigned ();
-  summary->num = gcov_read_unsigned ();
   summary->runs = gcov_read_unsigned ();
-  summary->sum_all = gcov_read_counter ();
-  summary->run_max = gcov_read_counter ();
-  summary->sum_max = gcov_read_counter ();
-  memset (summary->histogram, 0,
-	  sizeof (gcov_bucket_type) * GCOV_HISTOGRAM_SIZE);
-  for (bv_ix = 0; bv_ix < GCOV_HISTOGRAM_BITVECTOR_SIZE; bv_ix++)
-    {
-      histo_bitvector[bv_ix] = gcov_read_unsigned ();
-#if IN_LIBGCOV
-      /* When building libgcov we don't include system.h, which includes
-	 hwint.h (where popcount_hwi is declared). However, libgcov.a
-	 is built by the bootstrapped compiler and therefore the builtins
-	 are always available.  */
-      h_cnt += __builtin_popcount (histo_bitvector[bv_ix]);
-#else
-      h_cnt += popcount_hwi (histo_bitvector[bv_ix]);
-#endif
-    }
-  bv_ix = 0;
-  h_ix = 0;
-  cur_bitvector = 0;
-  while (h_cnt--)
-    {
-      /* Find the index corresponding to the next entry we will read in.
-	 First find the next non-zero bitvector and re-initialize
-	 the histogram index accordingly, then right shift and increment
-	 the index until we find a set bit.  */
-      while (!cur_bitvector)
-	{
-	  h_ix = bv_ix * 32;
-	  if (bv_ix >= GCOV_HISTOGRAM_BITVECTOR_SIZE)
-	    gcov_error ("corrupted profile info: summary histogram "
-			"bitvector is corrupt");
-	  cur_bitvector = histo_bitvector[bv_ix++];
-	}
-      while (!(cur_bitvector & 0x1))
-	{
-	  h_ix++;
-	  cur_bitvector >>= 1;
-	}
-      if (h_ix >= GCOV_HISTOGRAM_SIZE)
-	gcov_error ("corrupted profile info: summary histogram "
-		    "index is corrupt");
-
-      summary->histogram[h_ix].num_counters = gcov_read_unsigned ();
-      summary->histogram[h_ix].min_value = gcov_read_counter ();
-      summary->histogram[h_ix].cum_value = gcov_read_counter ();
-      /* Shift off the index we are done with and increment to the
-	 corresponding next histogram entry.  */
-      cur_bitvector >>= 1;
-      h_ix++;
-    }
+  summary->sum_max = gcov_read_unsigned ();
 }
 
 /* We need to expose the below function when compiling for gcov-tool.  */
@@ -747,308 +662,3 @@ gcov_time (void)
     return status.st_mtime;
 }
 #endif /* IN_GCOV */
-
-#if !IN_GCOV
-/* Determine the index into histogram for VALUE. */
-
-#if IN_LIBGCOV
-static unsigned
-#else
-GCOV_LINKAGE unsigned
-#endif
-gcov_histo_index (gcov_type value)
-{
-  gcov_type_unsigned v = (gcov_type_unsigned)value;
-  unsigned r = 0;
-  unsigned prev2bits = 0;
-
-  /* Find index into log2 scale histogram, where each of the log2
-     sized buckets is divided into 4 linear sub-buckets for better
-     focus in the higher buckets.  */
-
-  /* Find the place of the most-significant bit set.  */
-  if (v > 0)
-    {
-#if IN_LIBGCOV
-      /* When building libgcov we don't include system.h, which includes
-         hwint.h (where floor_log2 is declared). However, libgcov.a
-         is built by the bootstrapped compiler and therefore the builtins
-         are always available.  */
-      r = sizeof (long long) * __CHAR_BIT__ - 1 - __builtin_clzll (v);
-#else
-      /* We use floor_log2 from hwint.c, which takes a HOST_WIDE_INT
-         that is 64 bits and gcov_type_unsigned is 64 bits.  */
-      r = floor_log2 (v);
-#endif
-    }
-
-  /* If at most the 2 least significant bits are set (value is
-     0 - 3) then that value is our index into the lowest set of
-     four buckets.  */
-  if (r < 2)
-    return (unsigned)value;
-
-  gcov_nonruntime_assert (r < 64);
-
-  /* Find the two next most significant bits to determine which
-     of the four linear sub-buckets to select.  */
-  prev2bits = (v >> (r - 2)) & 0x3;
-  /* Finally, compose the final bucket index from the log2 index and
-     the next 2 bits. The minimum r value at this point is 2 since we
-     returned above if r was 2 or more, so the minimum bucket at this
-     point is 4.  */
-  return (r - 1) * 4 + prev2bits;
-}
-
-/* Merge SRC_HISTO into TGT_HISTO. The counters are assumed to be in
-   the same relative order in both histograms, and are matched up
-   and merged in reverse order. Each counter is assigned an equal portion of
-   its entry's original cumulative counter value when computing the
-   new merged cum_value.  */
-
-static void gcov_histogram_merge (gcov_bucket_type *tgt_histo,
-                                  gcov_bucket_type *src_histo)
-{
-  int src_i, tgt_i, tmp_i = 0;
-  unsigned src_num, tgt_num, merge_num;
-  gcov_type src_cum, tgt_cum, merge_src_cum, merge_tgt_cum, merge_cum;
-  gcov_type merge_min;
-  gcov_bucket_type tmp_histo[GCOV_HISTOGRAM_SIZE];
-  int src_done = 0;
-
-  memset (tmp_histo, 0, sizeof (gcov_bucket_type) * GCOV_HISTOGRAM_SIZE);
-
-  /* Assume that the counters are in the same relative order in both
-     histograms. Walk the histograms from largest to smallest entry,
-     matching up and combining counters in order.  */
-  src_num = 0;
-  src_cum = 0;
-  src_i = GCOV_HISTOGRAM_SIZE - 1;
-  for (tgt_i = GCOV_HISTOGRAM_SIZE - 1; tgt_i >= 0 && !src_done; tgt_i--)
-    {
-      tgt_num = tgt_histo[tgt_i].num_counters;
-      tgt_cum = tgt_histo[tgt_i].cum_value;
-      /* Keep going until all of the target histogram's counters at this
-         position have been matched and merged with counters from the
-         source histogram.  */
-      while (tgt_num > 0 && !src_done)
-        {
-          /* If this is either the first time through this loop or we just
-             exhausted the previous non-zero source histogram entry, look
-             for the next non-zero source histogram entry.  */
-          if (!src_num)
-            {
-              /* Locate the next non-zero entry.  */
-              while (src_i >= 0 && !src_histo[src_i].num_counters)
-                src_i--;
-              /* If source histogram has fewer counters, then just copy over the
-                 remaining target counters and quit.  */
-              if (src_i < 0)
-                {
-                  tmp_histo[tgt_i].num_counters += tgt_num;
-                  tmp_histo[tgt_i].cum_value += tgt_cum;
-                  if (!tmp_histo[tgt_i].min_value ||
-                      tgt_histo[tgt_i].min_value < tmp_histo[tgt_i].min_value)
-                    tmp_histo[tgt_i].min_value = tgt_histo[tgt_i].min_value;
-                  while (--tgt_i >= 0)
-                    {
-                      tmp_histo[tgt_i].num_counters
-                          += tgt_histo[tgt_i].num_counters;
-                      tmp_histo[tgt_i].cum_value += tgt_histo[tgt_i].cum_value;
-                      if (!tmp_histo[tgt_i].min_value ||
-                          tgt_histo[tgt_i].min_value
-                          < tmp_histo[tgt_i].min_value)
-                        tmp_histo[tgt_i].min_value = tgt_histo[tgt_i].min_value;
-                    }
-
-                  src_done = 1;
-                  break;
-                }
-
-              src_num = src_histo[src_i].num_counters;
-              src_cum = src_histo[src_i].cum_value;
-            }
-
-          /* The number of counters to merge on this pass is the minimum
-             of the remaining counters from the current target and source
-             histogram entries.  */
-          merge_num = tgt_num;
-          if (src_num < merge_num)
-            merge_num = src_num;
-
-          /* The merged min_value is the sum of the min_values from target
-             and source.  */
-          merge_min = tgt_histo[tgt_i].min_value + src_histo[src_i].min_value;
-
-          /* Compute the portion of source and target entries' cum_value
-             that will be apportioned to the counters being merged.
-             The total remaining cum_value from each entry is divided
-             equally among the counters from that histogram entry if we
-             are not merging all of them.  */
-          merge_src_cum = src_cum;
-          if (merge_num < src_num)
-            merge_src_cum = merge_num * src_cum / src_num;
-          merge_tgt_cum = tgt_cum;
-          if (merge_num < tgt_num)
-            merge_tgt_cum = merge_num * tgt_cum / tgt_num;
-          /* The merged cum_value is the sum of the source and target
-             components.  */
-          merge_cum = merge_src_cum + merge_tgt_cum;
-
-          /* Update the remaining number of counters and cum_value left
-             to be merged from this source and target entry.  */
-          src_cum -= merge_src_cum;
-          tgt_cum -= merge_tgt_cum;
-          src_num -= merge_num;
-          tgt_num -= merge_num;
-
-          /* The merged counters get placed in the new merged histogram
-             at the entry for the merged min_value.  */
-          tmp_i = gcov_histo_index (merge_min);
-          gcov_nonruntime_assert (tmp_i < GCOV_HISTOGRAM_SIZE);
-          tmp_histo[tmp_i].num_counters += merge_num;
-          tmp_histo[tmp_i].cum_value += merge_cum;
-          if (!tmp_histo[tmp_i].min_value ||
-              merge_min < tmp_histo[tmp_i].min_value)
-            tmp_histo[tmp_i].min_value = merge_min;
-
-          /* Ensure the search for the next non-zero src_histo entry starts
-             at the next smallest histogram bucket.  */
-          if (!src_num)
-            src_i--;
-        }
-    }
-
-  gcov_nonruntime_assert (tgt_i < 0);
-
-  /* In the case where there were more counters in the source histogram,
-     accumulate the remaining unmerged cumulative counter values. Add
-     those to the smallest non-zero target histogram entry. Otherwise,
-     the total cumulative counter values in the histogram will be smaller
-     than the sum_all stored in the summary, which will complicate
-     computing the working set information from the histogram later on.  */
-  if (src_num)
-    src_i--;
-  while (src_i >= 0)
-    {
-      src_cum += src_histo[src_i].cum_value;
-      src_i--;
-    }
-  /* At this point, tmp_i should be the smallest non-zero entry in the
-     tmp_histo.  */
-  gcov_nonruntime_assert (tmp_i >= 0 && tmp_i < GCOV_HISTOGRAM_SIZE
-                          && tmp_histo[tmp_i].num_counters > 0);
-  tmp_histo[tmp_i].cum_value += src_cum;
-
-  /* Finally, copy the merged histogram into tgt_histo.  */
-  memcpy (tgt_histo, tmp_histo,
-	  sizeof (gcov_bucket_type) * GCOV_HISTOGRAM_SIZE);
-}
-#endif /* !IN_GCOV */
-
-/* This is used by gcov-dump (IN_GCOV == -1) and in the compiler
-   (!IN_GCOV && !IN_LIBGCOV).  */
-#if IN_GCOV <= 0 && !IN_LIBGCOV
-/* Compute the working set information from the counter histogram in
-   the profile summary. This is an array of information corresponding to a
-   range of percentages of the total execution count (sum_all), and includes
-   the number of counters required to cover that working set percentage and
-   the minimum counter value in that working set.  */
-
-GCOV_LINKAGE void
-compute_working_sets (const gcov_summary *summary,
-                      gcov_working_set_t *gcov_working_sets)
-{
-  gcov_type working_set_cum_values[NUM_GCOV_WORKING_SETS];
-  gcov_type ws_cum_hotness_incr;
-  gcov_type cum, tmp_cum;
-  const gcov_bucket_type *histo_bucket;
-  unsigned ws_ix, c_num, count;
-  int h_ix;
-
-  /* Compute the amount of sum_all that the cumulative hotness grows
-     by in each successive working set entry, which depends on the
-     number of working set entries.  */
-  ws_cum_hotness_incr = summary->sum_all / NUM_GCOV_WORKING_SETS;
-
-  /* Next fill in an array of the cumulative hotness values corresponding
-     to each working set summary entry we are going to compute below.
-     Skip 0% statistics, which can be extrapolated from the
-     rest of the summary data.  */
-  cum = ws_cum_hotness_incr;
-  for (ws_ix = 0; ws_ix < NUM_GCOV_WORKING_SETS;
-       ws_ix++, cum += ws_cum_hotness_incr)
-    working_set_cum_values[ws_ix] = cum;
-  /* The last summary entry is reserved for (roughly) 99.9% of the
-     working set. Divide by 1024 so it becomes a shift, which gives
-     almost exactly 99.9%.  */
-  working_set_cum_values[NUM_GCOV_WORKING_SETS-1]
-      = summary->sum_all - summary->sum_all/1024;
-
-  /* Next, walk through the histogram in decending order of hotness
-     and compute the statistics for the working set summary array.
-     As histogram entries are accumulated, we check to see which
-     working set entries have had their expected cum_value reached
-     and fill them in, walking the working set entries in increasing
-     size of cum_value.  */
-  ws_ix = 0; /* The current entry into the working set array.  */
-  cum = 0; /* The current accumulated counter sum.  */
-  count = 0; /* The current accumulated count of block counters.  */
-  for (h_ix = GCOV_HISTOGRAM_SIZE - 1;
-       h_ix >= 0 && ws_ix < NUM_GCOV_WORKING_SETS; h_ix--)
-    {
-      histo_bucket = &summary->histogram[h_ix];
-
-      /* If we haven't reached the required cumulative counter value for
-         the current working set percentage, simply accumulate this histogram
-         entry into the running sums and continue to the next histogram
-         entry.  */
-      if (cum + histo_bucket->cum_value < working_set_cum_values[ws_ix])
-        {
-          cum += histo_bucket->cum_value;
-          count += histo_bucket->num_counters;
-          continue;
-        }
-
-      /* If adding the current histogram entry's cumulative counter value
-         causes us to exceed the current working set size, then estimate
-         how many of this histogram entry's counter values are required to
-         reach the working set size, and fill in working set entries
-         as we reach their expected cumulative value.  */
-      for (c_num = 0, tmp_cum = cum;
-           c_num < histo_bucket->num_counters && ws_ix < NUM_GCOV_WORKING_SETS;
-           c_num++)
-        {
-          count++;
-          /* If we haven't reached the last histogram entry counter, add
-             in the minimum value again. This will underestimate the
-             cumulative sum so far, because many of the counter values in this
-             entry may have been larger than the minimum. We could add in the
-             average value every time, but that would require an expensive
-             divide operation.  */
-          if (c_num + 1 < histo_bucket->num_counters)
-            tmp_cum += histo_bucket->min_value;
-          /* If we have reached the last histogram entry counter, then add
-             in the entire cumulative value.  */
-          else
-            tmp_cum = cum + histo_bucket->cum_value;
-
-	  /* Next walk through successive working set entries and fill in
-	     the statistics for any whose size we have reached by accumulating
-	     this histogram counter.  */
-	  while (ws_ix < NUM_GCOV_WORKING_SETS
-		 && tmp_cum >= working_set_cum_values[ws_ix])
-            {
-              gcov_working_sets[ws_ix].num_counters = count;
-              gcov_working_sets[ws_ix].min_counter
-                  = histo_bucket->min_value;
-              ws_ix++;
-            }
-        }
-      /* Finally, update the running cumulative value since we were
-         using a temporary above.  */
-      cum += histo_bucket->cum_value;
-    }
-  gcov_nonruntime_assert (ws_ix == NUM_GCOV_WORKING_SETS);
-}
-#endif /* IN_GCOV <= 0 && !IN_LIBGCOV */
diff --git a/gcc/gcov-io.h b/gcc/gcov-io.h
index 7a11f0aec7f..1fc31f52eee 100644
--- a/gcc/gcov-io.h
+++ b/gcc/gcov-io.h
@@ -133,17 +133,14 @@ see the files COPYING3 and COPYING.RUNTIME respectively.  If not, see
    blocks they are for.
 
    The data file contains the following records.
-        data: {unit summary:object summary:program* function-data*}*
+	data: {unit summary:object function-data*}*
 	unit: header int32:checksum
-        function-data:	announce_function present counts
+	function-data:	announce_function present counts
 	announce_function: header int32:ident
 		int32:lineno_checksum int32:cfg_checksum
 	present: header int32:present
 	counts: header int64:count*
-	summary: int32:checksum int32:num int32:runs int64:sum
-		 int64:max int64:sum_max histogram
-        histogram: {int32:bitvector}8 histogram-buckets*
-        histogram-buckets: int32:num int64:min int64:sum
+	summary: int32:checksum int32:runs int32:sum_max
 
    The ANNOUNCE_FUNCTION record is the same as that in the note file,
    but without the source location.  The COUNTS gives the
@@ -190,7 +187,7 @@ typedef uint64_t gcov_type_unsigned;
 
 #define ATTRIBUTE_HIDDEN
 
-#endif /* !IN_LIBGOCV */
+#endif /* !IN_LIBGCOV */
 
 #ifndef GCOV_LINKAGE
 #define GCOV_LINKAGE extern
@@ -240,9 +237,9 @@ typedef uint64_t gcov_type_unsigned;
 #define GCOV_TAG_COUNTER_BASE 	 ((gcov_unsigned_t)0x01a10000)
 #define GCOV_TAG_COUNTER_LENGTH(NUM) ((NUM) * 2)
 #define GCOV_TAG_COUNTER_NUM(LENGTH) ((LENGTH) / 2)
-#define GCOV_TAG_OBJECT_SUMMARY  ((gcov_unsigned_t)0xa1000000) /* Obsolete */
-#define GCOV_TAG_PROGRAM_SUMMARY ((gcov_unsigned_t)0xa3000000)
-#define GCOV_TAG_SUMMARY_LENGTH(NUM) (1 + (10 + 3 * 2) + (NUM) * 5)
+#define GCOV_TAG_OBJECT_SUMMARY  ((gcov_unsigned_t)0xa1000000)
+#define GCOV_TAG_PROGRAM_SUMMARY ((gcov_unsigned_t)0xa3000000) /* Obsolete */
+#define GCOV_TAG_SUMMARY_LENGTH (2)
 #define GCOV_TAG_AFDO_FILE_NAMES ((gcov_unsigned_t)0xaa000000)
 #define GCOV_TAG_AFDO_FUNCTION ((gcov_unsigned_t)0xac000000)
 #define GCOV_TAG_AFDO_WORKING_SET ((gcov_unsigned_t)0xaf000000)
@@ -307,43 +304,12 @@ GCOV_COUNTERS
 #define GCOV_ARC_FAKE		(1 << 1)
 #define GCOV_ARC_FALLTHROUGH	(1 << 2)
 
-/* Structured records.  */
-
-/* Structure used for each bucket of the log2 histogram of counter values.  */
-typedef struct
-{
-  /* Number of counters whose profile count falls within the bucket.  */
-  gcov_unsigned_t num_counters;
-  /* Smallest profile count included in this bucket.  */
-  gcov_type min_value;
-  /* Cumulative value of the profile counts in this bucket.  */
-  gcov_type cum_value;
-} gcov_bucket_type;
-
-/* For a log2 scale histogram with each range split into 4
-   linear sub-ranges, there will be at most 64 (max gcov_type bit size) - 1 log2
-   ranges since the lowest 2 log2 values share the lowest 4 linear
-   sub-range (values 0 - 3).  This is 252 total entries (63*4).  */
-
-#define GCOV_HISTOGRAM_SIZE 252
-
-/* How many unsigned ints are required to hold a bit vector of non-zero
-   histogram entries when the histogram is written to the gcov file.
-   This is essentially a ceiling divide by 32 bits.  */
-#define GCOV_HISTOGRAM_BITVECTOR_SIZE (GCOV_HISTOGRAM_SIZE + 31) / 32
-
 /* Object & program summary record.  */
 
 struct gcov_summary
 {
-  gcov_unsigned_t checksum;	/* Checksum of program.  */
-  gcov_unsigned_t num;		/* Number of counters.  */
   gcov_unsigned_t runs;		/* Number of program runs.  */
-  gcov_type sum_all;		/* Sum of all counters accumulated.  */
-  gcov_type run_max;		/* Maximum value on a single run.  */
   gcov_type sum_max;    	/* Sum of individual run max values.  */
-  gcov_bucket_type histogram[GCOV_HISTOGRAM_SIZE]; /* Histogram of
-						      counter values.  */
 };
 
 #if !defined(inhibit_libc)
@@ -380,35 +346,12 @@ GCOV_LINKAGE void gcov_write_unsigned (gcov_unsigned_t) ATTRIBUTE_HIDDEN;
 
 #if !IN_GCOV && !IN_LIBGCOV
 /* Available only in compiler */
-GCOV_LINKAGE unsigned gcov_histo_index (gcov_type value);
 GCOV_LINKAGE void gcov_write_string (const char *);
 GCOV_LINKAGE void gcov_write_filename (const char *);
 GCOV_LINKAGE gcov_position_t gcov_write_tag (gcov_unsigned_t);
 GCOV_LINKAGE void gcov_write_length (gcov_position_t /*position*/);
 #endif
 
-#if IN_GCOV <= 0 && !IN_LIBGCOV
-/* Available in gcov-dump and the compiler.  */
-
-/* Number of data points in the working set summary array. Using 128
-   provides information for at least every 1% increment of the total
-   profile size. The last entry is hardwired to 99.9% of the total.  */
-#define NUM_GCOV_WORKING_SETS 128
-
-/* Working set size statistics for a given percentage of the entire
-   profile (sum_all from the counter summary).  */
-typedef struct gcov_working_set_info
-{
-  /* Number of hot counters included in this working set.  */
-  unsigned num_counters;
-  /* Smallest counter included in this working set.  */
-  gcov_type min_counter;
-} gcov_working_set_t;
-
-GCOV_LINKAGE void compute_working_sets (const gcov_summary *summary,
-                                        gcov_working_set_t *gcov_working_sets);
-#endif
-
 #if IN_GCOV > 0
 /* Available in gcov */
 GCOV_LINKAGE time_t gcov_time (void);
diff --git a/gcc/gcov-tool.c b/gcc/gcov-tool.c
index 15fd710b18c..88539f9647f 100644
--- a/gcc/gcov-tool.c
+++ b/gcc/gcov-tool.c
@@ -423,7 +423,6 @@ print_overlap_usage_message (int error_p)
   fnotice (file, "    -o, --object                        Print object level info\n");
   fnotice (file, "    -t <float>, --hot_threshold <float> Set the threshold for hotness\n");
   fnotice (file, "    -v, --verbose                       Verbose mode\n");
-
 }
 
 static const struct option overlap_options[] =
diff --git a/gcc/gcov.c b/gcc/gcov.c
index c09d5060053..922e2de2646 100644
--- a/gcc/gcov.c
+++ b/gcc/gcov.c
@@ -418,7 +418,6 @@ static vector<char *> processed_files;
 /* This holds data summary information.  */
 
 static unsigned object_runs;
-static unsigned program_count;
 
 static unsigned total_lines;
 static unsigned total_executed;
@@ -1829,12 +1828,11 @@ read_count_file (void)
       unsigned length = gcov_read_unsigned ();
       unsigned long base = gcov_position ();
 
-      if (tag == GCOV_TAG_PROGRAM_SUMMARY)
+      if (tag == GCOV_TAG_OBJECT_SUMMARY)
 	{
 	  struct gcov_summary summary;
 	  gcov_read_summary (&summary);
-	  object_runs += summary.runs;
-	  program_count++;
+	  object_runs = summary.runs;
 	}
       else if (tag == GCOV_TAG_FUNCTION && !length)
 	; /* placeholder  */
@@ -2952,7 +2950,6 @@ output_lines (FILE *gcov_file, const source_info *src)
 	       no_data_file ? "-" : da_file_name);
       fprintf (gcov_file, DEFAULT_LINE_START "Runs:%u\n", object_runs);
     }
-  fprintf (gcov_file, DEFAULT_LINE_START "Programs:%u\n", program_count);
 
   source_file = fopen (src->name, "r");
   if (!source_file)
diff --git a/gcc/ipa-profile.c b/gcc/ipa-profile.c
index f921d1bb6f4..c74f4a4a41d 100644
--- a/gcc/ipa-profile.c
+++ b/gcc/ipa-profile.c
@@ -25,13 +25,6 @@ along with GCC; see the file COPYING3.  If not see
      from profile feedback. This histogram is complete only with LTO,
      otherwise it contains information only about the current unit.
 
-     Similar histogram is also estimated by coverage runtime.  This histogram
-     is not dependent on LTO, but it suffers from various defects; first
-     gcov runtime is not weighting individual basic block by estimated execution
-     time and second the merging of multiple runs makes assumption that the
-     histogram distribution did not change.  Consequentely histogram constructed
-     here may be more precise.
-
      The information is used to set hot/cold thresholds.
    - Next speculative indirect call resolution is performed:  the local
      profile pass assigns profile-id to each function and provide us with a
@@ -512,25 +505,7 @@ ipa_profile (void)
       gcov_type threshold;
 
       gcc_assert (overall_size);
-      if (dump_file)
-	{
-	  gcov_type min, cumulated_time = 0, cumulated_size = 0;
 
-	  fprintf (dump_file, "Overall time: %" PRId64"\n",
-		   (int64_t)overall_time);
-	  min = get_hot_bb_threshold ();
-          for (i = 0; i < (int)histogram.length () && histogram[i]->count >= min;
-	       i++)
-	    {
-	      cumulated_time += histogram[i]->count * histogram[i]->time;
-	      cumulated_size += histogram[i]->size;
-	    }
-	  fprintf (dump_file, "GCOV min count: %" PRId64
-		   " Time:%3.2f%% Size:%3.2f%%\n", 
-		   (int64_t)min,
-		   cumulated_time * 100.0 / overall_time,
-		   cumulated_size * 100.0 / overall_size);
-	}
       cutoff = (overall_time * PARAM_VALUE (HOT_BB_COUNT_WS_PERMILLE) + 500) / 1000;
       threshold = 0;
       for (i = 0; cumulated < cutoff; i++)
@@ -557,6 +532,7 @@ ipa_profile (void)
 		   cumulated_time * 100.0 / overall_time,
 		   cumulated_size * 100.0 / overall_size);
 	}
+
       if (threshold > get_hot_bb_threshold ()
 	  || in_lto_p)
 	{
diff --git a/gcc/lto-cgraph.c b/gcc/lto-cgraph.c
index 1e6a7adeaa2..6d9eea13f22 100644
--- a/gcc/lto-cgraph.c
+++ b/gcc/lto-cgraph.c
@@ -693,39 +693,14 @@ lto_output_ref (struct lto_simple_output_block *ob, struct ipa_ref *ref,
 static void
 output_profile_summary (struct lto_simple_output_block *ob)
 {
-  unsigned h_ix;
-  struct bitpack_d bp;
-
   if (profile_info)
     {
       /* We do not output num and run_max, they are not used by
          GCC profile feedback and they are difficult to merge from multiple
          units.  */
-      gcc_assert (profile_info->runs);
-      streamer_write_uhwi_stream (ob->main_stream, profile_info->runs);
-      streamer_write_gcov_count_stream (ob->main_stream, profile_info->sum_max);
+      unsigned runs = profile_info->runs;
+      streamer_write_uhwi_stream (ob->main_stream, runs);
 
-      /* sum_all is needed for computing the working set with the
-         histogram.  */
-      streamer_write_gcov_count_stream (ob->main_stream, profile_info->sum_all);
-
-      /* Create and output a bitpack of non-zero histogram entries indices.  */
-      bp = bitpack_create (ob->main_stream);
-      for (h_ix = 0; h_ix < GCOV_HISTOGRAM_SIZE; h_ix++)
-        bp_pack_value (&bp, profile_info->histogram[h_ix].num_counters > 0, 1);
-      streamer_write_bitpack (&bp);
-      /* Now stream out only those non-zero entries.  */
-      for (h_ix = 0; h_ix < GCOV_HISTOGRAM_SIZE; h_ix++)
-        {
-          if (!profile_info->histogram[h_ix].num_counters)
-            continue;
-          streamer_write_gcov_count_stream (ob->main_stream,
-                                      profile_info->histogram[h_ix].num_counters);
-          streamer_write_gcov_count_stream (ob->main_stream,
-                                      profile_info->histogram[h_ix].min_value);
-          streamer_write_gcov_count_stream (ob->main_stream,
-                                      profile_info->histogram[h_ix].cum_value);
-         }
       /* IPA-profile computes hot bb threshold based on cumulated
 	 whole program profile.  We need to stream it down to ltrans.  */
        if (flag_wpa)
@@ -1591,46 +1566,16 @@ input_refs (struct lto_input_block *ib,
     }
 }
 	    
-
-static gcov_summary lto_gcov_summary;
-
 /* Input profile_info from IB.  */
 static void
 input_profile_summary (struct lto_input_block *ib,
 		       struct lto_file_decl_data *file_data)
 {
-  unsigned h_ix;
-  struct bitpack_d bp;
   unsigned int runs = streamer_read_uhwi (ib);
   if (runs)
     {
       file_data->profile_info.runs = runs;
-      file_data->profile_info.sum_max = streamer_read_gcov_count (ib);
-      file_data->profile_info.sum_all = streamer_read_gcov_count (ib);
-
-      memset (file_data->profile_info.histogram, 0,
-              sizeof (gcov_bucket_type) * GCOV_HISTOGRAM_SIZE);
-      /* Input the bitpack of non-zero histogram indices.  */
-      bp = streamer_read_bitpack (ib);
-      /* Read in and unpack the full bitpack, flagging non-zero
-         histogram entries by setting the num_counters non-zero.  */
-      for (h_ix = 0; h_ix < GCOV_HISTOGRAM_SIZE; h_ix++)
-        {
-          file_data->profile_info.histogram[h_ix].num_counters
-              = bp_unpack_value (&bp, 1);
-        }
-      for (h_ix = 0; h_ix < GCOV_HISTOGRAM_SIZE; h_ix++)
-        {
-          if (!file_data->profile_info.histogram[h_ix].num_counters)
-            continue;
-
-          file_data->profile_info.histogram[h_ix].num_counters
-              = streamer_read_gcov_count (ib);
-          file_data->profile_info.histogram[h_ix].min_value
-              = streamer_read_gcov_count (ib);
-          file_data->profile_info.histogram[h_ix].cum_value
-              = streamer_read_gcov_count (ib);
-        }
+
       /* IPA-profile computes hot bb threshold based on cumulated
 	 whole program profile.  We need to stream it down to ltrans.  */
       if (flag_ltrans)
@@ -1645,13 +1590,10 @@ static void
 merge_profile_summaries (struct lto_file_decl_data **file_data_vec)
 {
   struct lto_file_decl_data *file_data;
-  unsigned int j, h_ix;
+  unsigned int j;
   gcov_unsigned_t max_runs = 0;
   struct cgraph_node *node;
   struct cgraph_edge *edge;
-  gcov_type saved_sum_all = 0;
-  gcov_summary *saved_profile_info = 0;
-  int saved_scale = 0;
 
   /* Find unit with maximal number of runs.  If we ever get serious about
      roundoff errors, we might also consider computing smallest common
@@ -1672,70 +1614,8 @@ merge_profile_summaries (struct lto_file_decl_data **file_data_vec)
       return;
     }
 
-  profile_info = &lto_gcov_summary;
-  lto_gcov_summary.runs = max_runs;
-  lto_gcov_summary.sum_max = 0;
-  memset (lto_gcov_summary.histogram, 0,
-          sizeof (gcov_bucket_type) * GCOV_HISTOGRAM_SIZE);
-
-  /* Rescale all units to the maximal number of runs.
-     sum_max can not be easily merged, as we have no idea what files come from
-     the same run.  We do not use the info anyway, so leave it 0.  */
-  for (j = 0; (file_data = file_data_vec[j]) != NULL; j++)
-    if (file_data->profile_info.runs)
-      {
-	int scale = GCOV_COMPUTE_SCALE (max_runs,
-                                        file_data->profile_info.runs);
-	lto_gcov_summary.sum_max
-            = MAX (lto_gcov_summary.sum_max,
-                   apply_scale (file_data->profile_info.sum_max, scale));
-	lto_gcov_summary.sum_all
-            = MAX (lto_gcov_summary.sum_all,
-                   apply_scale (file_data->profile_info.sum_all, scale));
-        /* Save a pointer to the profile_info with the largest
-           scaled sum_all and the scale for use in merging the
-           histogram.  */
-        if (!saved_profile_info
-            || lto_gcov_summary.sum_all > saved_sum_all)
-          {
-            saved_profile_info = &file_data->profile_info;
-            saved_sum_all = lto_gcov_summary.sum_all;
-            saved_scale = scale;
-          }
-      }
-
-  gcc_assert (saved_profile_info);
-
-  /* Scale up the histogram from the profile that had the largest
-     scaled sum_all above.  */
-  for (h_ix = 0; h_ix < GCOV_HISTOGRAM_SIZE; h_ix++)
-    {
-      /* Scale up the min value as we did the corresponding sum_all
-         above. Use that to find the new histogram index.  */
-      gcov_type scaled_min
-          = apply_scale (saved_profile_info->histogram[h_ix].min_value,
-                         saved_scale);
-      /* The new index may be shared with another scaled histogram entry,
-         so we need to account for a non-zero histogram entry at new_ix.  */
-      unsigned new_ix = gcov_histo_index (scaled_min);
-      lto_gcov_summary.histogram[new_ix].min_value
-          = (lto_gcov_summary.histogram[new_ix].num_counters
-             ? MIN (lto_gcov_summary.histogram[new_ix].min_value, scaled_min)
-             : scaled_min);
-      /* Some of the scaled counter values would ostensibly need to be placed
-         into different (larger) histogram buckets, but we keep things simple
-         here and place the scaled cumulative counter value in the bucket
-         corresponding to the scaled minimum counter value.  */
-      lto_gcov_summary.histogram[new_ix].cum_value
-          += apply_scale (saved_profile_info->histogram[h_ix].cum_value,
-                          saved_scale);
-      lto_gcov_summary.histogram[new_ix].num_counters
-          += saved_profile_info->histogram[h_ix].num_counters;
-    }
-
-  /* Watch roundoff errors.  */
-  if (lto_gcov_summary.sum_max < max_runs)
-    lto_gcov_summary.sum_max = max_runs;
+  profile_info = XCNEW (gcov_summary);
+  profile_info->runs = max_runs;
 
   /* If merging already happent at WPA time, we are done.  */
   if (flag_ltrans)
@@ -1814,10 +1694,6 @@ input_symtab (void)
 
   merge_profile_summaries (file_data_vec);
 
-  if (!flag_auto_profile)
-    get_working_sets ();
-
-
   /* Clear out the aux field that was used to store enough state to
      tell which nodes should be overwritten.  */
   FOR_EACH_FUNCTION (node)
diff --git a/gcc/modulo-sched.c b/gcc/modulo-sched.c
index 9a27365bfbc..121e6191afd 100644
--- a/gcc/modulo-sched.c
+++ b/gcc/modulo-sched.c
@@ -1449,10 +1449,6 @@ sms_schedule (void)
                   fprintf (dump_file, "%" PRId64 "max %" PRId64,
                            (int64_t) trip_count, (int64_t) max_trip_count);
                   fprintf (dump_file, "\n");
-	      	  fprintf (dump_file, "SMS profile-sum-max ");
-	      	  fprintf (dump_file, "%" PRId64,
-	          	   (int64_t) profile_info->sum_max);
-	      	  fprintf (dump_file, "\n");
 	    	}
 	    }
           continue;
@@ -1567,10 +1563,6 @@ sms_schedule (void)
 	      fprintf (dump_file, "%" PRId64,
 	               (int64_t) bb->count.to_gcov_type ());
 	      fprintf (dump_file, "\n");
-	      fprintf (dump_file, "SMS profile-sum-max ");
-	      fprintf (dump_file, "%" PRId64,
-	               (int64_t) profile_info->sum_max);
-	      fprintf (dump_file, "\n");
 	    }
 	  fprintf (dump_file, "SMS doloop\n");
 	  fprintf (dump_file, "SMS built-ddg %d\n", g->num_nodes);
diff --git a/gcc/params.def b/gcc/params.def
index a0ad3ecdad6..9f0697327d4 100644
--- a/gcc/params.def
+++ b/gcc/params.def
@@ -393,10 +393,15 @@ DEFPARAM(PARAM_SMS_LOOP_AVERAGE_COUNT_THRESHOLD,
 	 "A threshold on the average loop count considered by the swing modulo scheduler.",
 	 0, 0, 0)
 
+DEFPARAM(HOT_BB_COUNT_FRACTION,
+	 "hot-bb-count-fraction",
+	 "The denominator n of the fraction 1/n of the maximal basic block execution count in the "
+	 "program that a basic block must reach to be considered hot (used in non-LTO mode).",
+	 10000, 0, 0)
 DEFPARAM(HOT_BB_COUNT_WS_PERMILLE,
 	 "hot-bb-count-ws-permille",
          "A basic block profile count is considered hot if it contributes to "
-         "the given permillage of the entire profiled execution.",
+         "the given permillage of the entire profiled execution (used in LTO mode).",
 	 999, 0, 1000)
 DEFPARAM(HOT_BB_FREQUENCY_FRACTION,
 	 "hot-bb-frequency-fraction",
diff --git a/gcc/postreload-gcse.c b/gcc/postreload-gcse.c
index afa61dcede6..b56993183d0 100644
--- a/gcc/postreload-gcse.c
+++ b/gcc/postreload-gcse.c
@@ -1161,7 +1161,7 @@ eliminate_partially_redundant_load (basic_block bb, rtx_insn *insn,
       || (optimize_bb_for_size_p (bb) && npred_ok > 1)
       /* If we don't have profile information we cannot tell if splitting
          a critical edge is profitable or not so don't do it.  */
-      || ((! profile_info || profile_status_for_fn (cfun) != PROFILE_READ
+      || ((!profile_info || profile_status_for_fn (cfun) != PROFILE_READ
 	   || targetm.cannot_modify_jumps_p ())
 	  && critical_edge_split))
     goto cleanup;
diff --git a/gcc/predict.c b/gcc/predict.c
index 51145526d2a..ab2dc8ed031 100644
--- a/gcc/predict.c
+++ b/gcc/predict.c
@@ -129,12 +129,13 @@ static gcov_type min_count = -1;
 gcov_type
 get_hot_bb_threshold ()
 {
-  gcov_working_set_t *ws;
   if (min_count == -1)
     {
-      ws = find_working_set (PARAM_VALUE (HOT_BB_COUNT_WS_PERMILLE));
-      gcc_assert (ws);
-      min_count = ws->min_counter;
+      min_count
+	= profile_info->sum_max / PARAM_VALUE (HOT_BB_COUNT_FRACTION);
+      if (dump_file)
+	fprintf (dump_file, "Setting hotness threshold to %" PRId64 ".\n",
+		 min_count);
     }
   return min_count;
 }
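
Note on the predict.c hunk above: in non-LTO mode the hot threshold is now a plain integer division of sum_max by the hot-bb-count-fraction parameter. The following is only a minimal standalone sketch of that arithmetic, not GCC code. The macro and the helper name hot_bb_threshold are hypothetical stand-ins for PARAM_VALUE (HOT_BB_COUNT_FRACTION) and get_hot_bb_threshold, and the default of 10000 is taken from the params.def hunk.

```c
#include <stdint.h>

/* Hypothetical stand-in for PARAM_VALUE (HOT_BB_COUNT_FRACTION);
   10000 is the default listed in the params.def hunk.  */
#define HOT_BB_COUNT_FRACTION 10000

/* Hypothetical helper mirroring the arithmetic of the new
   get_hot_bb_threshold: the threshold is sum_max divided by the
   fraction, truncated towards zero.  */
static int64_t
hot_bb_threshold (int64_t sum_max)
{
  return sum_max / HOT_BB_COUNT_FRACTION;
}
```

With the default value, a sum_max below 10000 truncates to a threshold of 0, so small training runs would place very little below the threshold.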
diff --git a/gcc/profile.c b/gcc/profile.c
index cb51e0d4c51..2130319b081 100644
--- a/gcc/profile.c
+++ b/gcc/profile.c
@@ -84,11 +84,7 @@ struct bb_profile_info {
 
 /* Counter summary from the last set of coverage counts read.  */
 
-const gcov_summary *profile_info;
-
-/* Counter working set information computed from the current counter
-   summary. Not initialized unless profile_info summary is non-NULL.  */
-static gcov_working_set_t gcov_working_sets[NUM_GCOV_WORKING_SETS];
+gcov_summary *profile_info;
 
 /* Collect statistics on the performance of this pass for the entire source
    file.  */
@@ -103,14 +99,6 @@ static int total_num_times_called;
 static int total_hist_br_prob[20];
 static int total_num_branches;
 
-/* Helper function to update gcov_working_sets.  */
-
-void add_working_set (gcov_working_set_t *set) {
-  int i = 0;
-  for (; i < NUM_GCOV_WORKING_SETS; i++)
-    gcov_working_sets[i] = set[i];
-}
-
 /* Forward declarations.  */
 static void find_spanning_tree (struct edge_list *);
 
@@ -207,60 +195,6 @@ instrument_values (histogram_values values)
 }
 \f
 
-/* Fill the working set information into the profile_info structure.  */
-
-void
-get_working_sets (void)
-{
-  unsigned ws_ix, pctinc, pct;
-  gcov_working_set_t *ws_info;
-
-  if (!profile_info)
-    return;
-
-  compute_working_sets (profile_info, gcov_working_sets);
-
-  if (dump_file)
-    {
-      fprintf (dump_file, "Counter working sets:\n");
-      /* Multiply the percentage by 100 to avoid float.  */
-      pctinc = 100 * 100 / NUM_GCOV_WORKING_SETS;
-      for (ws_ix = 0, pct = pctinc; ws_ix < NUM_GCOV_WORKING_SETS;
-           ws_ix++, pct += pctinc)
-        {
-          if (ws_ix == NUM_GCOV_WORKING_SETS - 1)
-            pct = 9990;
-          ws_info = &gcov_working_sets[ws_ix];
-          /* Print out the percentage using int arithmatic to avoid float.  */
-          fprintf (dump_file, "\t\t%u.%02u%%: num counts=%u, min counter="
-                   "%" PRId64 "\n",
-                   pct / 100, pct - (pct / 100 * 100),
-                   ws_info->num_counters,
-                   (int64_t)ws_info->min_counter);
-        }
-    }
-}
-
-/* Given a the desired percentage of the full profile (sum_all from the
-   summary), multiplied by 10 to avoid float in PCT_TIMES_10, returns
-   the corresponding working set information. If an exact match for
-   the percentage isn't found, the closest value is used.  */
-
-gcov_working_set_t *
-find_working_set (unsigned pct_times_10)
-{
-  unsigned i;
-  if (!profile_info)
-    return NULL;
-  gcc_assert (pct_times_10 <= 1000);
-  if (pct_times_10 >= 999)
-    return &gcov_working_sets[NUM_GCOV_WORKING_SETS - 1];
-  i = pct_times_10 * NUM_GCOV_WORKING_SETS / 1000;
-  if (!i)
-    return &gcov_working_sets[0];
-  return &gcov_working_sets[i - 1];
-}
-
 /* Computes hybrid profile for all matching entries in da_file.  
    
    CFG_CHECKSUM is the precomputed checksum for the CFG.  */
@@ -283,21 +217,14 @@ get_exec_counts (unsigned cfg_checksum, unsigned lineno_checksum)
 	  num_edges++;
     }
 
-  counts = get_coverage_counts (GCOV_COUNTER_ARCS, num_edges, cfg_checksum,
-				lineno_checksum, &profile_info);
+  counts = get_coverage_counts (GCOV_COUNTER_ARCS, cfg_checksum,
+				lineno_checksum);
   if (!counts)
     return NULL;
 
-  get_working_sets ();
-
-  if (dump_file && profile_info)
-    fprintf (dump_file, "Merged %u profiles with maximal count %u.\n",
-	     profile_info->runs, (unsigned) profile_info->sum_max);
-
   return counts;
 }
 
-
 static bool
 is_edge_inconsistent (vec<edge, va_gc> *edges)
 {
@@ -439,29 +366,7 @@ read_profile_edge_counts (gcov_type *exec_counts)
 	  {
 	    num_edges++;
 	    if (exec_counts)
-	      {
-		edge_gcov_count (e) = exec_counts[exec_counts_pos++];
-		if (edge_gcov_count (e) > profile_info->sum_max)
-		  {
-		    if (flag_profile_correction)
-		      {
-			static bool informed = 0;
-			if (dump_enabled_p () && !informed)
-			  {
-			    dump_location_t loc
-			      = dump_location_t::from_location_t
-			        (input_location);
-			    dump_printf_loc (MSG_NOTE, loc,
-					     "corrupted profile info: edge count"
-					     " exceeds maximal count\n");
-			  }
-			informed = 1;
-		      }
-		    else
-		      error ("corrupted profile info: edge from %i to %i exceeds maximal count",
-			     bb->index, e->dest->index);
-		  }
-	      }
+	      edge_gcov_count (e) = exec_counts[exec_counts_pos++];
 	    else
 	      edge_gcov_count (e) = 0;
 
@@ -511,12 +416,6 @@ compute_branch_probabilities (unsigned cfg_checksum, unsigned lineno_checksum)
   bb_gcov_counts.safe_grow_cleared (last_basic_block_for_fn (cfun));
   edge_gcov_counts = new hash_map<edge,gcov_type>;
 
-  if (profile_info->sum_all < profile_info->sum_max)
-    {
-      error ("corrupted profile info: sum_all is smaller than sum_max");
-      exec_counts = NULL;
-    }
-
   /* Attach extra info block to each bb.  */
   alloc_aux_for_blocks (sizeof (struct bb_profile_info));
   FOR_BB_BETWEEN (bb, ENTRY_BLOCK_PTR_FOR_FN (cfun), NULL, next_bb)
@@ -871,10 +770,9 @@ compute_value_histograms (histogram_values values, unsigned cfg_checksum,
 	  continue;
 	}
 
-      histogram_counts[t] =
-	get_coverage_counts (COUNTER_FOR_HIST_TYPE (t),
-			     n_histogram_counters[t], cfg_checksum,
-			     lineno_checksum, NULL);
+      histogram_counts[t] = get_coverage_counts (COUNTER_FOR_HIST_TYPE (t),
+						 cfg_checksum,
+						 lineno_checksum);
       if (histogram_counts[t])
 	any = 1;
       act_count[t] = histogram_counts[t];
diff --git a/gcc/profile.h b/gcc/profile.h
index 6b37bb6f3df..183e8d83b65 100644
--- a/gcc/profile.h
+++ b/gcc/profile.h
@@ -75,6 +75,6 @@ extern void get_working_sets (void);
 
 /* Counter summary from the last set of coverage counts read by
    profile.c.  */
-extern const struct gcov_summary *profile_info;
+extern struct gcov_summary *profile_info;
 
 #endif /* PROFILE_H */
diff --git a/libgcc/libgcov-driver.c b/libgcc/libgcov-driver.c
index 1f2c4a74298..cdebb747326 100644
--- a/libgcc/libgcov-driver.c
+++ b/libgcc/libgcov-driver.c
@@ -24,6 +24,7 @@ see the files COPYING3 and COPYING.RUNTIME respectively.  If not, see
 <http://www.gnu.org/licenses/>.  */
 
 #include "libgcov.h"
+#include "gcov-io.h"
 
 #if defined(inhibit_libc)
 /* If libc and its header files are not available, provide dummy functions.  */
@@ -156,25 +157,6 @@ fail:
   return (struct gcov_fn_buffer **)free_fn_data (gi_ptr, fn_buffer, ix);
 }
 
-/* Add an unsigned value to the current crc */
-
-static gcov_unsigned_t
-crc32_unsigned (gcov_unsigned_t crc32, gcov_unsigned_t value)
-{
-  unsigned ix;
-
-  for (ix = 32; ix--; value <<= 1)
-    {
-      unsigned feedback;
-
-      feedback = (value ^ crc32) & 0x80000000 ? 0x04c11db7 : 0;
-      crc32 <<= 1;
-      crc32 ^= feedback;
-    }
-
-  return crc32;
-}
-
 /* Check if VERSION of the info block PTR matches libgcov one.
    Return 1 on success, or zero in case of versions mismatch.
    If FILENAME is not NULL, its value used for reporting purposes
@@ -198,117 +180,8 @@ gcov_version (struct gcov_info *ptr, gcov_unsigned_t version,
   return 1;
 }
 
-/* Insert counter VALUE into HISTOGRAM.  */
-
-static void
-gcov_histogram_insert(gcov_bucket_type *histogram, gcov_type value)
-{
-  unsigned i;
-
-  i = gcov_histo_index(value);
-  histogram[i].num_counters++;
-  histogram[i].cum_value += value;
-  if (value < histogram[i].min_value)
-    histogram[i].min_value = value;
-}
-
-/* Computes a histogram of the arc counters to place in the summary SUM.  */
-
-static void
-gcov_compute_histogram (struct gcov_info *list, struct gcov_summary *sum)
-{
-  struct gcov_info *gi_ptr;
-  const struct gcov_fn_info *gfi_ptr;
-  const struct gcov_ctr_info *ci_ptr;
-  unsigned f_ix, ix;
-  int h_ix;
-
-  /* First check if there are any counts recorded for this counter.  */
-  if (!sum->num)
-    return;
-
-  for (h_ix = 0; h_ix < GCOV_HISTOGRAM_SIZE; h_ix++)
-    {
-      sum->histogram[h_ix].num_counters = 0;
-      sum->histogram[h_ix].min_value = sum->run_max;
-      sum->histogram[h_ix].cum_value = 0;
-    }
-
-  /* Walk through all the per-object structures and record each of
-     the count values in histogram.  */
-  for (gi_ptr = list; gi_ptr; gi_ptr = gi_ptr->next)
-    {
-      for (f_ix = 0; f_ix != gi_ptr->n_functions; f_ix++)
-        {
-          gfi_ptr = gi_ptr->functions[f_ix];
-
-          if (!gfi_ptr || gfi_ptr->key != gi_ptr)
-            continue;
-
-	  ci_ptr = &gfi_ptr->ctrs[0];
-	  for (ix = 0; ix < ci_ptr->num; ix++)
-	    gcov_histogram_insert (sum->histogram, ci_ptr->values[ix]);
-        }
-    }
-}
-
 /* buffer for the fn_data from another program.  */
 static struct gcov_fn_buffer *fn_buffer;
-/* buffer for summary from other programs to be written out. */
-static struct gcov_summary_buffer *sum_buffer;
-
-/* This function computes the program level summary and the histo-gram.
-   It computes and returns CRC32 and stored summary in THIS_PRG.  */
-
-#if !IN_GCOV_TOOL
-static
-#endif
-gcov_unsigned_t
-compute_summary (struct gcov_info *list, struct gcov_summary *this_prg)
-{
-  struct gcov_info *gi_ptr;
-  const struct gcov_fn_info *gfi_ptr;
-  const struct gcov_ctr_info *ci_ptr;
-  int f_ix;
-  gcov_unsigned_t c_num;
-  gcov_unsigned_t crc32 = 0;
-
-  /* Find the totals for this execution.  */
-  memset (this_prg, 0, sizeof (*this_prg));
-  for (gi_ptr = list; gi_ptr; gi_ptr = gi_ptr->next)
-    {
-      crc32 = crc32_unsigned (crc32, gi_ptr->stamp);
-      crc32 = crc32_unsigned (crc32, gi_ptr->n_functions);
-
-      for (f_ix = 0; (unsigned)f_ix != gi_ptr->n_functions; f_ix++)
-        {
-          gfi_ptr = gi_ptr->functions[f_ix];
-
-          if (gfi_ptr && gfi_ptr->key != gi_ptr)
-            gfi_ptr = 0;
-
-          crc32 = crc32_unsigned (crc32, gfi_ptr ? gfi_ptr->cfg_checksum : 0);
-          crc32 = crc32_unsigned (crc32,
-                                  gfi_ptr ? gfi_ptr->lineno_checksum : 0);
-          if (!gfi_ptr)
-            continue;
-
-	  ci_ptr = gfi_ptr->ctrs;
-	  this_prg->num += ci_ptr->num;
-	  crc32 = crc32_unsigned (crc32, ci_ptr->num);
-
-	  for (c_num = 0; c_num < ci_ptr->num; c_num++)
-	    {
-	      this_prg->sum_all += ci_ptr->values[c_num];
-	      if (this_prg->run_max < ci_ptr->values[c_num])
-		this_prg->run_max = ci_ptr->values[c_num];
-	    }
-	  ci_ptr++;
-	}
-    }
-  gcov_compute_histogram (list, this_prg);
-  return crc32;
-}
 
 /* Including system dependent components. */
 #include "libgcov-driver-system.c"
@@ -320,18 +193,13 @@ compute_summary (struct gcov_info *list, struct gcov_summary *this_prg)
 static int
 merge_one_data (const char *filename,
 		struct gcov_info *gi_ptr,
-		struct gcov_summary *prg_p,
-		struct gcov_summary *this_prg,
-		gcov_position_t *summary_pos_p,
-		gcov_position_t *eof_pos_p,
-		gcov_unsigned_t crc32)
+		struct gcov_summary *summary)
 {
   gcov_unsigned_t tag, length;
   unsigned t_ix;
-  int f_ix;
+  int f_ix = -1;
   int error = 0;
   struct gcov_fn_buffer **fn_tail = &fn_buffer;
-  struct gcov_summary_buffer **sum_tail = &sum_buffer;
 
   length = gcov_read_unsigned ();
   if (!gcov_version (gi_ptr, length, filename))
@@ -346,46 +214,14 @@ merge_one_data (const char *filename,
       return 0;
     }
 
-  /* Look for program summary.  */
-  for (f_ix = 0;;)
-    {
-      struct gcov_summary tmp;
-
-      *eof_pos_p = gcov_position ();
-      tag = gcov_read_unsigned ();
-      if (tag != GCOV_TAG_PROGRAM_SUMMARY)
-        break;
-
-      f_ix--;
-      length = gcov_read_unsigned ();
-      gcov_read_summary (&tmp);
-      if ((error = gcov_is_error ()))
-        goto read_error;
-      if (*summary_pos_p)
-        {
-          /* Save all summaries after the one that will be
-             merged into below. These will need to be rewritten
-             as histogram merging may change the number of non-zero
-             histogram entries that will be emitted, and thus the
-             size of the merged summary.  */
-          (*sum_tail) = (struct gcov_summary_buffer *)
-              xmalloc (sizeof(struct gcov_summary_buffer));
-          (*sum_tail)->summary = tmp;
-          (*sum_tail)->next = 0;
-          sum_tail = &((*sum_tail)->next);
-          goto next_summary;
-        }
-      if (tmp.checksum != crc32)
-        goto next_summary;
-
-      if (tmp.num != this_prg->num)
-	goto next_summary;
-      *prg_p = tmp;
-      *summary_pos_p = *eof_pos_p;
-
-    next_summary:;
-    }
+  tag = gcov_read_unsigned ();
+  if (tag != GCOV_TAG_OBJECT_SUMMARY)
+    goto read_mismatch;
+  length = gcov_read_unsigned ();
+  gcc_assert (length > 0);
+  gcov_read_summary (summary);
 
+  tag = gcov_read_unsigned ();
   /* Merge execution counts for each function.  */
   for (f_ix = 0; (unsigned)f_ix != gi_ptr->n_functions;
        f_ix++, tag = gcov_read_unsigned ())
@@ -472,38 +308,15 @@ read_error:
 
 static void
 write_one_data (const struct gcov_info *gi_ptr,
-		const struct gcov_summary *prg_p,
-		const gcov_position_t eof_pos,
-		const gcov_position_t summary_pos)
+		const struct gcov_summary *prg_p)
 {
   unsigned f_ix;
-  struct gcov_summary_buffer *next_sum_buffer;
 
-  /* Write out the data.  */
-  if (!eof_pos)
-    {
-      gcov_write_tag_length (GCOV_DATA_MAGIC, GCOV_VERSION);
-      gcov_write_unsigned (gi_ptr->stamp);
-    }
-
-  if (summary_pos)
-    gcov_seek (summary_pos);
+  gcov_write_tag_length (GCOV_DATA_MAGIC, GCOV_VERSION);
+  gcov_write_unsigned (gi_ptr->stamp);
 
   /* Generate whole program statistics.  */
-  gcov_write_summary (GCOV_TAG_PROGRAM_SUMMARY, prg_p);
-
-  /* Rewrite all the summaries that were after the summary we merged
-     into. This is necessary as the merged summary may have a different
-     size due to the number of non-zero histogram entries changing after
-     merging.  */
-
-  while (sum_buffer)
-    {
-      gcov_write_summary (GCOV_TAG_PROGRAM_SUMMARY, &sum_buffer->summary);
-      next_sum_buffer = sum_buffer->next;
-      free (sum_buffer);
-      sum_buffer = next_sum_buffer;
-    }
+  gcov_write_summary (GCOV_TAG_OBJECT_SUMMARY, prg_p);
 
   /* Write execution counts for each function.  */
   for (f_ix = 0; f_ix != gi_ptr->n_functions; f_ix++)
@@ -562,70 +375,19 @@ write_one_data (const struct gcov_info *gi_ptr,
   gcov_write_unsigned (0);
 }
 
-/* Helper function for merging summary.
-   Return -1 on error. Return 0 on success.  */
+/* Helper function for merging summary.  */
 
-static int
-merge_summary (const char *filename __attribute__ ((unused)), int run_counted,
-	       struct gcov_summary *prg,
-	       struct gcov_summary *this_prg, gcov_unsigned_t crc32,
-	       struct gcov_summary *all_prg __attribute__ ((unused)))
+static void
+merge_summary (int run_counted, struct gcov_summary *summary,
+	      gcov_type run_max)
 {
-#if !GCOV_LOCKED 
-  /* summary for all instances of program.  */ 
-  struct gcov_summary *all;
-#endif 
-
-  /* Merge the summary.  */
-  int first = !prg->runs;
-
   if (!run_counted)
-    prg->runs++;
-  if (first)
-    prg->num = this_prg->num;
-  prg->sum_all += this_prg->sum_all;
-  if (prg->run_max < this_prg->run_max)
-    prg->run_max = this_prg->run_max;
-  prg->sum_max += this_prg->run_max;
-  if (first)
-    memcpy (prg->histogram, this_prg->histogram,
-	    sizeof (gcov_bucket_type) * GCOV_HISTOGRAM_SIZE);
-  else
-    gcov_histogram_merge (prg->histogram, this_prg->histogram);
-#if !GCOV_LOCKED
-  all = all_prg;
-  if (!all->runs && prg->runs)
-    {
-      all->num = prg->num;
-      all->runs = prg->runs;
-      all->sum_all = prg->sum_all;
-      all->run_max = prg->run_max;
-      all->sum_max = prg->sum_max;
-    }
-  else if (!all_prg->checksum
-	   /* Don't compare the histograms, which may have slight
-	      variations depending on the order they were updated
-	      due to the truncating integer divides used in the
-	      merge.  */
-	   && (all->num != prg->num
-	       || all->runs != prg->runs
-	       || all->sum_all != prg->sum_all
-	       || all->run_max != prg->run_max
-	       || all->sum_max != prg->sum_max))
     {
-      gcov_error ("profiling:%s:Data file mismatch - some "
-		  "data files may have been concurrently "
-		  "updated without locking support\n", filename);
-      all_prg->checksum = ~0u;
+      summary->runs++;
+      summary->sum_max += run_max;
     }
-#endif
-
-  prg->checksum = crc32;
-
-  return 0;
 }
 
-
 /* Sort N entries in VALUE_ARRAY in descending order.
    Each entry in VALUE_ARRAY has two values. The sorting
    is based on the second value.  */
@@ -713,18 +475,13 @@ gcov_sort_topn_counter_arrays (const struct gcov_info *gi_ptr)
 
 static void
 dump_one_gcov (struct gcov_info *gi_ptr, struct gcov_filename *gf,
-	       unsigned run_counted,
-	       gcov_unsigned_t crc32, struct gcov_summary *all_prg,
-	       struct gcov_summary *this_prg)
+	       unsigned run_counted, gcov_type run_max)
 {
-  struct gcov_summary prg; /* summary for this object over all program.  */
+  struct gcov_summary summary = {};
   int error;
   gcov_unsigned_t tag;
-  gcov_position_t summary_pos = 0;
-  gcov_position_t eof_pos = 0;
 
   fn_buffer = 0;
-  sum_buffer = 0;
 
   gcov_sort_topn_counter_arrays (gi_ptr);
 
@@ -741,26 +498,16 @@ dump_one_gcov (struct gcov_info *gi_ptr, struct gcov_filename *gf,
           gcov_error ("profiling:%s:Not a gcov data file\n", gf->filename);
           goto read_fatal;
         }
-      error = merge_one_data (gf->filename, gi_ptr, &prg, this_prg,
-			      &summary_pos, &eof_pos, crc32);
+      error = merge_one_data (gf->filename, gi_ptr, &summary);
       if (error == -1)
         goto read_fatal;
     }
 
   gcov_rewrite ();
 
-  if (!summary_pos)
-    {
-      memset (&prg, 0, sizeof (prg));
-      summary_pos = eof_pos;
-    }
-
-  error = merge_summary (gf->filename, run_counted, &prg, this_prg,
-			 crc32, all_prg);
-  if (error == -1)
-    goto read_fatal;
+  merge_summary (run_counted, &summary, run_max);
 
-  write_one_data (gi_ptr, &prg, eof_pos, summary_pos);
+  write_one_data (gi_ptr, &summary);
   /* fall through */
 
 read_fatal:;
@@ -787,21 +534,26 @@ gcov_do_dump (struct gcov_info *list, int run_counted)
 {
   struct gcov_info *gi_ptr;
   struct gcov_filename gf;
-  gcov_unsigned_t crc32;
-  struct gcov_summary all_prg;
-  struct gcov_summary this_prg;
 
-  crc32 = compute_summary (list, &this_prg);
+  /* Compute run_max of this program run.  */
+  gcov_type run_max = 0;
+  for (gi_ptr = list; gi_ptr; gi_ptr = gi_ptr->next)
+    for (unsigned f_ix = 0; (unsigned)f_ix != gi_ptr->n_functions; f_ix++)
+      {
+	const struct gcov_ctr_info *cinfo
+	  = &gi_ptr->functions[f_ix]->ctrs[GCOV_COUNTER_ARCS];
+
+	for (unsigned i = 0; i < cinfo->num; i++)
+	  if (run_max < cinfo->values[i])
+	    run_max = cinfo->values[i];
+      }
 
   allocate_filename_struct (&gf);
-#if !GCOV_LOCKED
-  memset (&all_prg, 0, sizeof (all_prg));
-#endif
 
   /* Now merge each file.  */
   for (gi_ptr = list; gi_ptr; gi_ptr = gi_ptr->next)
     {
-      dump_one_gcov (gi_ptr, &gf, run_counted, crc32, &all_prg, &this_prg);
+      dump_one_gcov (gi_ptr, &gf, run_counted, run_max);
       free (gf.filename);
     }
 
diff --git a/libgcc/libgcov-util.c b/libgcc/libgcov-util.c
index 37dd186beaa..408bda82236 100644
--- a/libgcc/libgcov-util.c
+++ b/libgcc/libgcov-util.c
@@ -32,6 +32,7 @@ see the files COPYING3 and COPYING.RUNTIME respectively.  If not, see
 #include "diagnostic.h"
 #include "version.h"
 #include "demangle.h"
+#include "gcov-io.h"
 
 /* Borrowed from basic-block.h.  */
 #define RDIV(X,Y) (((X) + (Y) / 2) / (Y))
@@ -79,6 +80,8 @@ static int k_ctrs_mask[GCOV_COUNTERS];
 static struct gcov_ctr_info k_ctrs[GCOV_COUNTERS];
 /* Number of kind of counters that have been seen.  */
 static int k_ctrs_types;
+/* The object summary being processed.  */
+static struct gcov_summary *curr_object_summary;
 
 /* Merge functions for counters.  */
 #define DEF_GCOV_COUNTER(COUNTER, NAME, FN_TYPE) __gcov_merge ## FN_TYPE,
@@ -131,7 +134,6 @@ static const tag_format_t tag_table[] =
   {GCOV_TAG_ARCS, "ARCS", tag_arcs},
   {GCOV_TAG_LINES, "LINES", tag_lines},
   {GCOV_TAG_OBJECT_SUMMARY, "OBJECT_SUMMARY", tag_summary},
-  {GCOV_TAG_PROGRAM_SUMMARY, "PROGRAM_SUMMARY", tag_summary},
   {0, NULL, NULL}
 };
 
@@ -223,9 +225,8 @@ tag_counters (unsigned tag, unsigned length)
 static void
 tag_summary (unsigned tag ATTRIBUTE_UNUSED, unsigned length ATTRIBUTE_UNUSED)
 {
-  struct gcov_summary summary;
-
-  gcov_read_summary (&summary);
+  curr_object_summary = (gcov_summary *) xcalloc (sizeof (gcov_summary), 1);
+  gcov_read_summary (curr_object_summary);
 }
 
 /* This function is called at the end of reading a gcda file.
@@ -239,7 +240,8 @@ read_gcda_finalize (struct gcov_info *obj_info)
   set_fn_ctrs (curr_fn_info);
   obstack_ptr_grow (&fn_info, curr_fn_info);
 
-  /* We set the following fields: merge, n_functions, and functions.  */
+  /* We set the following fields: merge, n_functions, functions
+     and summary.  */
   obj_info->n_functions = num_fn_info;
   obj_info->functions = (const struct gcov_fn_info**) obstack_finish (&fn_info);
 
@@ -299,6 +301,7 @@ read_gcda_file (const char *filename)
   obstack_init (&fn_info);
   num_fn_info = 0;
   curr_fn_info = 0;
+  curr_object_summary = NULL;
   {
     size_t len = strlen (filename) + 1;
     char *str_dup = (char*) xmalloc (len);
@@ -892,8 +895,6 @@ calculate_2_entries (const unsigned long v1, const unsigned long v2,
 }
 
 /*  Compute the overlap score between GCOV_INFO1 and GCOV_INFO2.
-    SUM_1 is the sum_all for profile1 where GCOV_INFO1 belongs.
-    SUM_2 is the sum_all for profile2 where GCOV_INFO2 belongs.
     This function also updates cumulative score CUM_1_RESULT and
     CUM_2_RESULT.  */
 
@@ -1048,12 +1049,6 @@ struct overlap_t {
 /* Cumlative overlap dscore for profile1 and profile2.  */
 static double overlap_sum_1, overlap_sum_2;
 
-/* sum_all for profile1 and profile2.  */
-static gcov_type p1_sum_all, p2_sum_all;
-
-/* run_max for profile1 and profile2.  */
-static gcov_type p1_run_max, p2_run_max;
-
 /* The number of gcda files in the profiles.  */
 static unsigned gcda_files[2];
 
@@ -1200,10 +1195,6 @@ matched_gcov_info (const struct gcov_info *info1, const struct gcov_info *info2)
   return 1;
 }
 
-/* Defined in libgcov-driver.c.  */
-extern gcov_unsigned_t compute_summary (struct gcov_info *,
-					struct gcov_summary *);
-
 /* Compute the overlap score of two profiles with the head of GCOV_LIST1 and
    GCOV_LIST1. Return a number ranging from [0.0, 1.0], with 0.0 meaning no
    match and 1.0 meaning a perfect match.  */
@@ -1212,21 +1203,11 @@ static double
 calculate_overlap (struct gcov_info *gcov_list1,
                    struct gcov_info *gcov_list2)
 {
-  struct gcov_summary this_prg;
   unsigned list1_cnt = 0, list2_cnt= 0, all_cnt;
   unsigned int i, j;
   const struct gcov_info *gi_ptr;
   struct overlap_t *all_infos;
 
-  compute_summary (gcov_list1, &this_prg);
-  overlap_sum_1 = (double) (this_prg.sum_all);
-  p1_sum_all = this_prg.sum_all;
-  p1_run_max = this_prg.run_max;
-  compute_summary (gcov_list2, &this_prg);
-  overlap_sum_2 = (double) (this_prg.sum_all);
-  p2_sum_all = this_prg.sum_all;
-  p2_run_max = this_prg.run_max;
-
   for (gi_ptr = gcov_list1; gi_ptr; gi_ptr = gi_ptr->next)
     list1_cnt++;
   for (gi_ptr = gcov_list2; gi_ptr; gi_ptr = gi_ptr->next)
@@ -1334,10 +1315,6 @@ calculate_overlap (struct gcov_info *gcov_list1,
 	  cold_gcda_files[1], both_cold_cnt);
   printf ("    zero files:  %12u\t%12u\t%12u\n", zero_gcda_files[0],
 	  zero_gcda_files[1], both_zero_cnt);
-  printf ("       sum_all:  %12" PRId64 "\t%12" PRId64 "\n",
-	  p1_sum_all, p2_sum_all);
-  printf ("       run_max:  %12" PRId64 "\t%12" PRId64 "\n",
-	  p1_run_max, p2_run_max);
 
   return prg_val;
 }


[-- Attachment #3: spec-int-change.txt --]
[-- Type: text/plain, Size: 7695 bytes --]


400.perlbench
     VM SIZE                      FILE SIZE
 ++++++++++++++ GROWING        ++++++++++++++
  [ = ]       0 [Unmapped]     +2.49Ki   +35%
  +0.9%    +840 .eh_frame         +840  +0.9%

 -------------- SHRINKING      --------------
  [ = ]       0 .debug_loc     -77.0Ki  -4.8%
  -4.0% -36.7Ki .text          -36.7Ki  -4.0%
 -12.8% -18.6Ki .rodata        -18.6Ki -12.8%
  [ = ]       0 .debug_line    -17.1Ki  -2.6%
  [ = ]       0 .debug_info    -12.7Ki  -0.8%
  [ = ]       0 .debug_ranges  -10.5Ki  -6.0%
  [ = ]       0 .debug_aranges -1.16Ki -10.2%
  [ = ]       0 .debug_abbrev     -147  -0.2%
  [ = ]       0 .strtab            -54  -0.1%
  [ = ]       0 .symtab            -48  -0.1%
  -0.1%     -16 .eh_frame_hdr      -16  -0.1%

  -4.5% -54.5Ki TOTAL           -170Ki  -3.1%

401.bzip2
     VM SIZE                      FILE SIZE
 ++++++++++++++ GROWING        ++++++++++++++
  [ = ]       0 [Unmapped]        +485   +15%
  [ = ]       0 .debug_info        +12  +0.0%

 -------------- SHRINKING      --------------
  -0.7%    -480 .text             -480  -0.7%
  [ = ]       0 .debug_line       -249  -0.5%
  [ = ]       0 .debug_ranges     -112  -0.8%
  [ = ]       0 .debug_loc         -31  -0.0%
  [ = ]       0 .debug_aranges     -16  -1.9%
  [ = ]       0 .debug_abbrev       -9  -0.1%

  -0.5%    -480 TOTAL             -400  -0.1%

403.gcc
     VM SIZE                      FILE SIZE
 ++++++++++++++ GROWING        ++++++++++++++
  [ = ]       0 .symtab           +984  +0.5%
  [ = ]       0 .strtab           +941  +0.6%
  +0.7%    +320 .eh_frame_hdr     +320  +0.7%

 -------------- SHRINKING      --------------
  [ = ]       0 .debug_loc      -792Ki -13.6%
 -12.0%  -352Ki .text           -352Ki -12.0%
  [ = ]       0 .debug_line     -192Ki  -9.7%
  [ = ]       0 .debug_info     -174Ki  -3.6%
  [ = ]       0 .debug_ranges   -162Ki -19.5%
  -5.7% -55.4Ki .rodata        -55.4Ki  -5.7%
  [ = ]       0 .debug_aranges -7.42Ki -22.8%
  -1.5% -3.69Ki .eh_frame      -3.69Ki  -1.5%
  [ = ]       0 [Unmapped]        -785  -9.5%
  [ = ]       0 .debug_abbrev      -26  -0.0%

  -8.3%  -411Ki TOTAL          -1.70Mi  -9.4%

429.mcf
     VM SIZE                   FILE SIZE
 ++++++++++++++ GROWING     ++++++++++++++
  +0.1%     +16 .text           +16  +0.1%

 -------------- SHRINKING   --------------
  [ = ]       0 .debug_loc      -63  -0.3%
  [ = ]       0 [Unmapped]      -26  -0.4%
  [ = ]       0 .debug_line      -7  -0.0%

  +0.1%     +16 TOTAL           -80  -0.1%

445.gobmk
     VM SIZE                      FILE SIZE
 ++++++++++++++ GROWING        ++++++++++++++
  [ = ]       0 [Unmapped]     +1.39Ki   +30%
  [ = ]       0 .symtab           +192  +0.1%
  [ = ]       0 .strtab           +162  +0.1%
  +0.2%     +56 .eh_frame_hdr      +56  +0.2%

 -------------- SHRINKING      --------------
  [ = ]       0 .debug_loc      -151Ki  -8.3%
  -9.2% -72.8Ki .text          -72.8Ki  -9.2%
  [ = ]       0 .debug_line    -38.5Ki  -7.7%
  [ = ]       0 .debug_info    -26.1Ki  -1.9%
  [ = ]       0 .debug_ranges  -21.6Ki -12.5%
  [ = ]       0 .debug_aranges -1.38Ki -15.6%
  -0.5%    -584 .eh_frame         -584  -0.5%
  [ = ]       0 .debug_abbrev     -199  -0.3%
  -0.0%     -88 .rodata            -88  -0.0%

  -1.2% -73.4Ki TOTAL           -311Ki  -3.8%

456.hmmer
     VM SIZE                     FILE SIZE
 ++++++++++++++ GROWING       ++++++++++++++
  [ = ]       0 [Unmapped]        +46  +0.7%
  [ = ]       0 .debug_ranges     +32  +0.2%

 -------------- SHRINKING     --------------
  [ = ]       0 .debug_loc       -156  -0.1%
  -0.0%     -48 .text             -48  -0.0%
  [ = ]       0 .debug_line       -18  -0.0%

  -0.0%     -48 TOTAL            -144  -0.0%

458.sjeng
     VM SIZE                      FILE SIZE
 ++++++++++++++ GROWING        ++++++++++++++
  [ = ]       0 [Unmapped]     +3.97Ki  +204%
  [ = ]       0 .debug_loc         +67  +0.0%
  [ = ]       0 .debug_line        +61  +0.1%
  +0.0%     +32 .text              +32  +0.0%
  [ = ]       0 .debug_aranges     +16  +0.7%
  [ = ]       0 .debug_ranges      +16  +0.1%

  +0.0%     +32 TOTAL          +4.16Ki  +0.7%

462.libquantum
     VM SIZE                   FILE SIZE
 ++++++++++++++ GROWING     ++++++++++++++
  +0.2%     +64 .text           +64  +0.2%
  [ = ]       0 .debug_line     +43  +0.2%

 -------------- SHRINKING   --------------
  [ = ]       0 .debug_loc     -219  -0.3%
  [ = ]       0 [Unmapped]      -64  -0.8%

  +0.2%     +64 TOTAL          -176  -0.1%

464.h264ref
     VM SIZE                      FILE SIZE
 ++++++++++++++ GROWING        ++++++++++++++
  [ = ]       0 .debug_loc     +70.5Ki +10.0%
  +6.1% +30.0Ki .text          +30.0Ki  +6.1%
  [ = ]       0 .debug_line    +19.3Ki  +5.6%
  [ = ]       0 .debug_info    +7.85Ki  +0.9%
  [ = ]       0 .debug_ranges  +4.58Ki   +14%
  [ = ]       0 .debug_abbrev     +233  +0.5%
  +0.3%    +224 .rodata           +224  +0.3%
  [ = ]       0 .debug_aranges    +112  +2.2%

 -------------- SHRINKING      --------------
  [ = ]       0 [Unmapped]     -1.96Ki -54.5%
  [ = ]       0 .strtab           -172  -0.7%
  -0.5%    -168 .eh_frame         -168  -0.5%
  [ = ]       0 .symtab           -144  -0.4%
  -0.8%     -48 .eh_frame_hdr      -48  -0.8%

  +3.0% +30.0Ki TOTAL           +130Ki  +4.8%

471.omnetpp
     VM SIZE                         FILE SIZE
 ++++++++++++++ GROWING           ++++++++++++++
  [ = ]       0 .debug_loc        +3.49Ki  +0.2%
  +0.6% +2.11Ki .text             +2.11Ki  +0.6%
  [ = ]       0 [Unmapped]        +2.04Ki   +41%
  [ = ]       0 .debug_ranges     +1.70Ki  +1.1%
  [ = ]       0 .debug_line       +1.66Ki  +0.4%
  [ = ]       0 .debug_info       +1.29Ki  +0.1%
  [ = ]       0 .debug_aranges       +176  +1.0%
  +0.0%     +16 .gcc_except_table     +16  +0.0%
  [ = ]       0 .debug_abbrev         +14  +0.0%

 -------------- SHRINKING         --------------
  -0.2%    -136 .eh_frame            -136  -0.2%
  [ = ]       0 .strtab               -83  -0.1%
  [ = ]       0 .symtab               -72  -0.1%
  -0.1%     -24 .eh_frame_hdr         -24  -0.1%

  +0.3% +1.97Ki TOTAL             +12.2Ki  +0.2%

473.astar
     VM SIZE                      FILE SIZE
 ++++++++++++++ GROWING        ++++++++++++++
  [ = ]       0 .debug_loc     +6.20Ki  +6.0%
  +9.5% +3.25Ki .text          +3.25Ki  +9.5%
  [ = ]       0 .debug_line    +2.88Ki  +6.9%
  [ = ]       0 .debug_ranges  +2.56Ki   +14%
  [ = ]       0 .debug_info    +1.90Ki  +0.9%
  [ = ]       0 [Unmapped]        +717   +11%
  [ = ]       0 .debug_aranges     +48  +2.8%
  +0.8%     +48 .eh_frame          +48  +0.8%

 -------------- SHRINKING      --------------
  [ = ]       0 .strtab            -27  -0.5%
  [ = ]       0 .symtab            -24  -0.4%
  [ = ]       0 .debug_abbrev      -18  -0.1%
  -0.8%      -8 .eh_frame_hdr       -8  -0.8%

  +6.5% +3.29Ki TOTAL          +17.5Ki  +3.8%

483.xalancbmk
     VM SIZE                         FILE SIZE
 ++++++++++++++ GROWING           ++++++++++++++
  [ = ]       0 .debug_loc        +2.46Ki  +0.0%
  +0.1% +1.09Ki .text             +1.09Ki  +0.1%
  [ = ]       0 .debug_ranges     +1.08Ki  +0.1%
  [ = ]       0 .debug_line       +1.02Ki  +0.0%
  [ = ]       0 .debug_info          +344  +0.0%
  [ = ]       0 .debug_aranges        +64  +0.0%
  +0.0%     +24 .gcc_except_table     +24  +0.0%
  +0.0%     +16 .eh_frame             +16  +0.0%
  [ = ]       0 .debug_abbrev          +1  +0.0%

 -------------- SHRINKING         --------------
  [ = ]       0 [Unmapped]        -1.12Ki -13.8%

  +0.0% +1.13Ki TOTAL             +4.97Ki  +0.0%

999.specrand
     VM SIZE        FILE SIZE
 ++++++++++++++  ++++++++++++++

  [ = ]       0        0  [ = ]


* Re: [PATCH] Remove arc profile histogram in non-LTO mode.
  2018-09-19 18:19 [PATCH] Remove arc profile histogram in non-LTO mode Martin Liška
@ 2018-09-20  3:25 ` Bin.Cheng
  2018-09-20  9:34   ` Jan Hubicka
  0 siblings, 1 reply; 7+ messages in thread
From: Bin.Cheng @ 2018-09-20  3:25 UTC (permalink / raw)
  To: Martin Liška; +Cc: gcc-patches List, Jan Hubicka

On Thu, Sep 20, 2018 at 2:11 AM Martin Liška <mliska@suse.cz> wrote:
>
> Hello.
>
> I've been working for some time on a patch that simplifies how we set
> the hotness threshold of basic blocks. Currently, we calculate so called
> arc profile histograms that should identify edges that cover 99.9% of all
> branching. These edges are then identified as hot. Disadvantage of the approach
> is that it comes with significant overhead in run-time and GCC related code
> is also not trivial. Moreover, anytime a histogram is merged after an instrumented
> run, the resulting histogram is misleading.
>
> That said, I decided to simplify it again, remove usage of the histogram and return
> to what we have before (--param hot-bb-count-fraction). That basically says that
> we consider hot each edge that has execution count bigger than sum_max / 10.000.
>
> Note that LTO+PGO remains untouched as it still uses histogram that is dynamically
> calculated by read arc counts.
Hi,
Does this affect AutoFDO stuff?  AutoFDO is broken and I am fixing it
now, on the basis of current code.

Thanks,
bin
>
> Note the statistics of the patch:
>   19 files changed, 101 insertions(+), 1216 deletions(-)
>
> I'm attaching file sizes of SPEC2006 int benchmark.
>
> Patch survives testing on x86_64-linux-gnu machine.
> Ready to be installed?
>
> Martin
>
> gcc/ChangeLog:
>
> 2018-09-19  Martin Liska  <mliska@suse.cz>
>
>         * auto-profile.c (autofdo_source_profile::read): Do not
>         set sum_all.
>         (read_profile): Do not add working sets.
>         (read_autofdo_file): Remove sum_all.
>         (afdo_callsite_hot_enough_for_early_inline): Remove const
>         qualifier.
>         * coverage.c (struct counts_entry): Remove gcov_summary.
>         (read_counts_file): Read new GCOV_TAG_OBJECT_SUMMARY,
>         do not support GCOV_TAG_PROGRAM_SUMMARY.
>         (get_coverage_counts): Remove summary and expected
>         arguments.
>         * coverage.h (get_coverage_counts): Likewise.
>         * doc/gcov-dump.texi: Remove -w option.
>         * gcov-dump.c (dump_working_sets): Remove.
>         (main): Do not support '-w' option.
>         (print_usage): Likewise.
>         (tag_summary): Likewise.
>         * gcov-io.c (gcov_write_summary): Do not dump
>         histogram.
>         (gcov_read_summary): Likewise.
>         (gcov_histo_index): Remove.
>         (gcov_histogram_merge): Likewise.
>         (compute_working_sets): Likewise.
>         * gcov-io.h (GCOV_TAG_OBJECT_SUMMARY): Mark
>         it not obsolete.
>         (GCOV_TAG_PROGRAM_SUMMARY): Mark it obsolete.
>         (GCOV_TAG_SUMMARY_LENGTH): Adjust.
>         (GCOV_HISTOGRAM_SIZE): Remove.
>         (GCOV_HISTOGRAM_BITVECTOR_SIZE): Likewise.
>         (struct gcov_summary): Simplify rapidly just
>         to runs and sum_max fields.
>         (gcov_histo_index): Remove.
>         (NUM_GCOV_WORKING_SETS): Likewise.
>         (compute_working_sets): Likewise.
>         * gcov-tool.c (print_overlap_usage_message): Remove
>         trailing empty line.
>         * gcov.c (read_count_file): Read GCOV_TAG_OBJECT_SUMMARY.
>         (output_lines): Remove program related line.
>         * ipa-profile.c (ipa_profile): Do not consider GCOV histogram.
>         * lto-cgraph.c (output_profile_summary): Do not stream GCOV
>         histogram.
>         (input_profile_summary): Do not read it.
>         (merge_profile_summaries): And do not merge it.
>         (input_symtab): Do not call removed function.
>         * modulo-sched.c (sms_schedule): Do not print sum_max.
>         * params.def (HOT_BB_COUNT_FRACTION): Reincarnate param that was
>         removed when histogram method was invented.
>         (HOT_BB_COUNT_WS_PERMILLE): Mention that it's used only in LTO
>         mode.
>         * postreload-gcse.c (eliminate_partially_redundant_load): Fix
>         GCOV coding style.
>         * predict.c (get_hot_bb_threshold): Use HOT_BB_COUNT_FRACTION
>         and dump selected value.
>         * profile.c (add_working_set): Remove.
>         (get_working_sets): Likewise.
>         (find_working_set): Likewise.
>         (get_exec_counts): Do not work with working sets.
>         (read_profile_edge_counts): Do not inform as sum_max is removed.
>         (compute_branch_probabilities): Likewise.
>         (compute_value_histograms): Remove argument for call of
>         get_coverage_counts.
>         * profile.h: Do not make gcov_summary const.
>
> libgcc/ChangeLog:
>
> 2018-09-19  Martin Liska  <mliska@suse.cz>
>
>         * libgcov-driver.c (crc32_unsigned): Remove.
>         (gcov_histogram_insert): Likewise.
>         (gcov_compute_histogram): Likewise.
>         (compute_summary): Simplify rapidly.
>         (merge_one_data): Do not handle PROGRAM_SUMMARY tag.
>         (merge_summary): Rapidly simplify.
>         (dump_one_gcov): Ignore gcov_summary.
>         (gcov_do_dump): Do not handle program summary, it's not
>         used.
>         * libgcov-util.c (tag_summary): Remove.
>         (read_gcda_finalize): Fix coding style.
>         (read_gcda_file): Initialize curr_object_summary.
>         (compute_summary): Remove.
>         (calculate_overlap): Remove settings of run_max.
> ---
>   gcc/auto-profile.c      |  21 +--
>   gcc/coverage.c          |  59 +-----
>   gcc/coverage.h          |   4 +-
>   gcc/doc/gcov-dump.texi  |   6 +-
>   gcc/gcov-dump.c         |  81 +-------
>   gcc/gcov-io.c           | 398 +---------------------------------------
>   gcc/gcov-io.h           |  71 +------
>   gcc/gcov-tool.c         |   1 -
>   gcc/gcov.c              |   7 +-
>   gcc/ipa-profile.c       |  26 +--
>   gcc/lto-cgraph.c        | 136 +-------------
>   gcc/modulo-sched.c      |   8 -
>   gcc/params.def          |   7 +-
>   gcc/postreload-gcse.c   |   2 +-
>   gcc/predict.c           |   9 +-
>   gcc/profile.c           | 116 +-----------
>   gcc/profile.h           |   2 +-
>   libgcc/libgcov-driver.c | 324 ++++----------------------------
>   libgcc/libgcov-util.c   |  39 +---
>   19 files changed, 101 insertions(+), 1216 deletions(-)
>
>


* Re: [PATCH] Remove arc profile histogram in non-LTO mode.
  2018-09-20  3:25 ` Bin.Cheng
@ 2018-09-20  9:34   ` Jan Hubicka
  2018-09-20 10:45     ` Bin.Cheng
  0 siblings, 1 reply; 7+ messages in thread
From: Jan Hubicka @ 2018-09-20  9:34 UTC (permalink / raw)
  To: Bin.Cheng; +Cc: Martin Liška, gcc-patches List

> On Thu, Sep 20, 2018 at 2:11 AM Martin Liška <mliska@suse.cz> wrote:
> >
> > Hello.
> >
> > I've been working for some time on a patch that simplifies how we set
> > the hotness threshold of basic blocks. Currently, we calculate so called
> > arc profile histograms that should identify edges that cover 99.9% of all
> > branching. These edges are then identified as hot. Disadvantage of the approach
> > is that it comes with significant overhead in run-time and GCC related code
> > is also not trivial. Moreover, anytime a histogram is merged after an instrumented
> > run, the resulting histogram is misleading.
> >
> > That said, I decided to simplify it again, remove usage of the histogram and return
> > to what we have before (--param hot-bb-count-fraction). That basically says that
> > we consider hot each edge that has execution count bigger than sum_max / 10.000.
> >
> > Note that LTO+PGO remains untouched as it still uses histogram that is dynamically
> > calculated by read arc counts.
> Hi,
> Does this affect AutoFDO stuff?  AutoFDO is broken and I am fixing it
> now, on the basis of current code.

This is independent of AutoFDO. There we can probably define cutoffs for hot-cold
partitions in the tool translating global data into per-file data read by GCC.
It is great that you will take a deeper look at AutoFDO. It indeed needs work!

The patch is OK, thanks for working on it!  Histograms were added by Google as
a bit of an experiment, but I do not think they turned out to be useful. The data
they produced was not closely related to what the IPA profile generation produces,
and thus it did not seem to match reality very well.

Honza
> 
> Thanks,
> bin
> >
> > Note the statistics of the patch:
> >   19 files changed, 101 insertions(+), 1216 deletions(-)
> >
> > I'm attaching file sizes of SPEC2006 int benchmark.
> >
> > Patch survives testing on x86_64-linux-gnu machine.
> > Ready to be installed?
> >
> > Martin
> >
> > gcc/ChangeLog:
> >
> > 2018-09-19  Martin Liska  <mliska@suse.cz>
> >
> >         * auto-profile.c (autofdo_source_profile::read): Do not
> >         set sum_all.
> >         (read_profile): Do not add working sets.
> >         (read_autofdo_file): Remove sum_all.
> >         (afdo_callsite_hot_enough_for_early_inline): Remove const
> >         qualifier.
> >         * coverage.c (struct counts_entry): Remove gcov_summary.
> >         (read_counts_file): Read new GCOV_TAG_OBJECT_SUMMARY,
> >         do not support GCOV_TAG_PROGRAM_SUMMARY.
> >         (get_coverage_counts): Remove summary and expected
> >         arguments.
> >         * coverage.h (get_coverage_counts): Likewise.
> >         * doc/gcov-dump.texi: Remove -w option.
> >         * gcov-dump.c (dump_working_sets): Remove.
> >         (main): Do not support '-w' option.
> >         (print_usage): Likewise.
> >         (tag_summary): Likewise.
> >         * gcov-io.c (gcov_write_summary): Do not dump
> >         histogram.
> >         (gcov_read_summary): Likewise.
> >         (gcov_histo_index): Remove.
> >         (gcov_histogram_merge): Likewise.
> >         (compute_working_sets): Likewise.
> >         * gcov-io.h (GCOV_TAG_OBJECT_SUMMARY): Mark
> >         it not obsolete.
> >         (GCOV_TAG_PROGRAM_SUMMARY): Mark it obsolete.
> >         (GCOV_TAG_SUMMARY_LENGTH): Adjust.
> >         (GCOV_HISTOGRAM_SIZE): Remove.
> >         (GCOV_HISTOGRAM_BITVECTOR_SIZE): Likewise.
> >         (struct gcov_summary): Simplify rapidly just
> >         to runs and sum_max fields.
> >         (gcov_histo_index): Remove.
> >         (NUM_GCOV_WORKING_SETS): Likewise.
> >         (compute_working_sets): Likewise.
> >         * gcov-tool.c (print_overlap_usage_message): Remove
> >         trailing empty line.
> >         * gcov.c (read_count_file): Read GCOV_TAG_OBJECT_SUMMARY.
> >         (output_lines): Remove program related line.
> >         * ipa-profile.c (ipa_profile): Do not consider GCOV histogram.
> >         * lto-cgraph.c (output_profile_summary): Do not stream GCOV
> >         histogram.
> >         (input_profile_summary): Do not read it.
> >         (merge_profile_summaries): And do not merge it.
> >         (input_symtab): Do not call removed function.
> >         * modulo-sched.c (sms_schedule): Do not print sum_max.
> >         * params.def (HOT_BB_COUNT_FRACTION): Reincarnate param that was
> >         removed when histogram method was invented.
> >         (HOT_BB_COUNT_WS_PERMILLE): Mention that it's used only in LTO
> >         mode.
> >         * postreload-gcse.c (eliminate_partially_redundant_load): Fix
> >         GCOV coding style.
> >         * predict.c (get_hot_bb_threshold): Use HOT_BB_COUNT_FRACTION
> >         and dump selected value.
> >         * profile.c (add_working_set): Remove.
> >         (get_working_sets): Likewise.
> >         (find_working_set): Likewise.
> >         (get_exec_counts): Do not work with working sets.
> >         (read_profile_edge_counts): Do not inform as sum_max is removed.
> >         (compute_branch_probabilities): Likewise.
> >         (compute_value_histograms): Remove argument for call of
> >         get_coverage_counts.
> >         * profile.h: Do not make gcov_summary const.
> >
> > libgcc/ChangeLog:
> >
> > 2018-09-19  Martin Liska  <mliska@suse.cz>
> >
> >         * libgcov-driver.c (crc32_unsigned): Remove.
> >         (gcov_histogram_insert): Likewise.
> >         (gcov_compute_histogram): Likewise.
> >         (compute_summary): Simplify rapidly.
> >         (merge_one_data): Do not handle PROGRAM_SUMMARY tag.
> >         (merge_summary): Rapidly simplify.
> >         (dump_one_gcov): Ignore gcov_summary.
> >         (gcov_do_dump): Do not handle program summary, it's not
> >         used.
> >         * libgcov-util.c (tag_summary): Remove.
> >         (read_gcda_finalize): Fix coding style.
> >         (read_gcda_file): Initialize curr_object_summary.
> >         (compute_summary): Remove.
> >         (calculate_overlap): Remove settings of run_max.
> > ---
> >   gcc/auto-profile.c      |  21 +--
> >   gcc/coverage.c          |  59 +-----
> >   gcc/coverage.h          |   4 +-
> >   gcc/doc/gcov-dump.texi  |   6 +-
> >   gcc/gcov-dump.c         |  81 +-------
> >   gcc/gcov-io.c           | 398 +---------------------------------------
> >   gcc/gcov-io.h           |  71 +------
> >   gcc/gcov-tool.c         |   1 -
> >   gcc/gcov.c              |   7 +-
> >   gcc/ipa-profile.c       |  26 +--
> >   gcc/lto-cgraph.c        | 136 +-------------
> >   gcc/modulo-sched.c      |   8 -
> >   gcc/params.def          |   7 +-
> >   gcc/postreload-gcse.c   |   2 +-
> >   gcc/predict.c           |   9 +-
> >   gcc/profile.c           | 116 +-----------
> >   gcc/profile.h           |   2 +-
> >   libgcc/libgcov-driver.c | 324 ++++----------------------------
> >   libgcc/libgcov-util.c   |  39 +---
> >   19 files changed, 101 insertions(+), 1216 deletions(-)
> >
> >

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH] Remove arc profile histogram in non-LTO mode.
  2018-09-20  9:34   ` Jan Hubicka
@ 2018-09-20 10:45     ` Bin.Cheng
  2018-09-20 10:53       ` Jan Hubicka
  0 siblings, 1 reply; 7+ messages in thread
From: Bin.Cheng @ 2018-09-20 10:45 UTC (permalink / raw)
  To: Jan Hubicka; +Cc: Martin Liška, gcc-patches List

On Thu, Sep 20, 2018 at 5:26 PM Jan Hubicka <hubicka@ucw.cz> wrote:
>
> > On Thu, Sep 20, 2018 at 2:11 AM Martin Liška <mliska@suse.cz> wrote:
> > >
> > > Hello.
> > >
> > > I've been working for some time on a patch that simplifies how we set
> > > the hotness threshold of basic blocks. Currently, we calculate so called
> > > arc profile histograms that should identify edges that cover 99.9% of all
> > > branching. These edges are then identified as hot. Disadvantage of the approach
> > > is that it comes with significant overhead in run-time and GCC related code
> > > is also not trivial. Moreover, anytime a histogram is merged after an instrumented
> > > run, the resulting histogram is misleading.
> > >
> > > That said, I decided to simplify it again, remove usage of the histogram and return
> > > to what we have before (--param hot-bb-count-fraction). That basically says that
> > > we consider hot each edge that has execution count bigger than sum_max / 10.000.
> > >
> > > Note that LTO+PGO remains untouched as it still uses histogram that is dynamically
> > > calculated by read arc counts.
> > Hi,
> > Does this affect AutoFDO stuff?  AutoFDO is broken and I am fixing it
> > now, on the basis of current code.
>
> This is independent of Auto-FDO. There we can probably define cutoffs for hot-cold
> partitions in the tool translating global data into per-file data read by GCC.
> It is great you will take a deeper look at autoFDO. It indeed needs work!
>
> The patch is OK, thanks for working on it!  Histograms were added by Google as
> a bit of an experiment, but I do not think they turned out to be useful. The data
I did some experiments showing it is somewhat useful for autoFDO.  To
what extent it is useful remains a question I need to investigate
later.

Thanks,
bin
> produced by them was not very related to what the IPA profile generation produces
> and thus it did not seem to match reality very well.
>
> Honza
> >
> > Thanks,
> > bin
> > >
> > > Note the statistics of the patch:
> > >   19 files changed, 101 insertions(+), 1216 deletions(-)
> > >
> > > I'm attaching file sizes of SPEC2006 int benchmark.
> > >
> > > Patch survives testing on x86_64-linux-gnu machine.
> > > Ready to be installed?
> > >
> > > Martin
> > >
> > > gcc/ChangeLog:
> > >
> > > 2018-09-19  Martin Liska  <mliska@suse.cz>
> > >
> > >         * auto-profile.c (autofdo_source_profile::read): Do not
> > >         set sum_all.
> > >         (read_profile): Do not add working sets.
> > >         (read_autofdo_file): Remove sum_all.
> > >         (afdo_callsite_hot_enough_for_early_inline): Remove const
> > >         qualifier.
> > >         * coverage.c (struct counts_entry): Remove gcov_summary.
> > >         (read_counts_file): Read new GCOV_TAG_OBJECT_SUMMARY,
> > >         do not support GCOV_TAG_PROGRAM_SUMMARY.
> > >         (get_coverage_counts): Remove summary and expected
> > >         arguments.
> > >         * coverage.h (get_coverage_counts): Likewise.
> > >         * doc/gcov-dump.texi: Remove -w option.
> > >         * gcov-dump.c (dump_working_sets): Remove.
> > >         (main): Do not support '-w' option.
> > >         (print_usage): Likewise.
> > >         (tag_summary): Likewise.
> > >         * gcov-io.c (gcov_write_summary): Do not dump
> > >         histogram.
> > >         (gcov_read_summary): Likewise.
> > >         (gcov_histo_index): Remove.
> > >         (gcov_histogram_merge): Likewise.
> > >         (compute_working_sets): Likewise.
> > >         * gcov-io.h (GCOV_TAG_OBJECT_SUMMARY): Mark
> > >         it not obsolete.
> > >         (GCOV_TAG_PROGRAM_SUMMARY): Mark it obsolete.
> > >         (GCOV_TAG_SUMMARY_LENGTH): Adjust.
> > >         (GCOV_HISTOGRAM_SIZE): Remove.
> > >         (GCOV_HISTOGRAM_BITVECTOR_SIZE): Likewise.
> > >         (struct gcov_summary): Simplify rapidly just
> > >         to runs and sum_max fields.
> > >         (gcov_histo_index): Remove.
> > >         (NUM_GCOV_WORKING_SETS): Likewise.
> > >         (compute_working_sets): Likewise.
> > >         * gcov-tool.c (print_overlap_usage_message): Remove
> > >         trailing empty line.
> > >         * gcov.c (read_count_file): Read GCOV_TAG_OBJECT_SUMMARY.
> > >         (output_lines): Remove program related line.
> > >         * ipa-profile.c (ipa_profile): Do not consider GCOV histogram.
> > >         * lto-cgraph.c (output_profile_summary): Do not stream GCOV
> > >         histogram.
> > >         (input_profile_summary): Do not read it.
> > >         (merge_profile_summaries): And do not merge it.
> > >         (input_symtab): Do not call removed function.
> > >         * modulo-sched.c (sms_schedule): Do not print sum_max.
> > >         * params.def (HOT_BB_COUNT_FRACTION): Reincarnate param that was
> > >         removed when histogram method was invented.
> > >         (HOT_BB_COUNT_WS_PERMILLE): Mention that it's used only in LTO
> > >         mode.
> > >         * postreload-gcse.c (eliminate_partially_redundant_load): Fix
> > >         GCOV coding style.
> > >         * predict.c (get_hot_bb_threshold): Use HOT_BB_COUNT_FRACTION
> > >         and dump selected value.
> > >         * profile.c (add_working_set): Remove.
> > >         (get_working_sets): Likewise.
> > >         (find_working_set): Likewise.
> > >         (get_exec_counts): Do not work with working sets.
> > >         (read_profile_edge_counts): Do not inform as sum_max is removed.
> > >         (compute_branch_probabilities): Likewise.
> > >         (compute_value_histograms): Remove argument for call of
> > >         get_coverage_counts.
> > >         * profile.h: Do not make gcov_summary const.
> > >
> > > libgcc/ChangeLog:
> > >
> > > 2018-09-19  Martin Liska  <mliska@suse.cz>
> > >
> > >         * libgcov-driver.c (crc32_unsigned): Remove.
> > >         (gcov_histogram_insert): Likewise.
> > >         (gcov_compute_histogram): Likewise.
> > >         (compute_summary): Simplify rapidly.
> > >         (merge_one_data): Do not handle PROGRAM_SUMMARY tag.
> > >         (merge_summary): Rapidly simplify.
> > >         (dump_one_gcov): Ignore gcov_summary.
> > >         (gcov_do_dump): Do not handle program summary, it's not
> > >         used.
> > >         * libgcov-util.c (tag_summary): Remove.
> > >         (read_gcda_finalize): Fix coding style.
> > >         (read_gcda_file): Initialize curr_object_summary.
> > >         (compute_summary): Remove.
> > >         (calculate_overlap): Remove settings of run_max.
> > > ---
> > >   gcc/auto-profile.c      |  21 +--
> > >   gcc/coverage.c          |  59 +-----
> > >   gcc/coverage.h          |   4 +-
> > >   gcc/doc/gcov-dump.texi  |   6 +-
> > >   gcc/gcov-dump.c         |  81 +-------
> > >   gcc/gcov-io.c           | 398 +---------------------------------------
> > >   gcc/gcov-io.h           |  71 +------
> > >   gcc/gcov-tool.c         |   1 -
> > >   gcc/gcov.c              |   7 +-
> > >   gcc/ipa-profile.c       |  26 +--
> > >   gcc/lto-cgraph.c        | 136 +-------------
> > >   gcc/modulo-sched.c      |   8 -
> > >   gcc/params.def          |   7 +-
> > >   gcc/postreload-gcse.c   |   2 +-
> > >   gcc/predict.c           |   9 +-
> > >   gcc/profile.c           | 116 +-----------
> > >   gcc/profile.h           |   2 +-
> > >   libgcc/libgcov-driver.c | 324 ++++----------------------------
> > >   libgcc/libgcov-util.c   |  39 +---
> > >   19 files changed, 101 insertions(+), 1216 deletions(-)
> > >
> > >

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH] Remove arc profile histogram in non-LTO mode.
  2018-09-20 10:45     ` Bin.Cheng
@ 2018-09-20 10:53       ` Jan Hubicka
  2018-09-21  7:19         ` Bin.Cheng
  0 siblings, 1 reply; 7+ messages in thread
From: Jan Hubicka @ 2018-09-20 10:53 UTC (permalink / raw)
  To: Bin.Cheng; +Cc: Martin Liška, gcc-patches List

> On Thu, Sep 20, 2018 at 5:26 PM Jan Hubicka <hubicka@ucw.cz> wrote:
> >
> > > On Thu, Sep 20, 2018 at 2:11 AM Martin Liška <mliska@suse.cz> wrote:
> > > >
> > > > Hello.
> > > >
> > > > I've been working for some time on a patch that simplifies how we set
> > > > the hotness threshold of basic blocks. Currently, we calculate so called
> > > > arc profile histograms that should identify edges that cover 99.9% of all
> > > > branching. These edges are then identified as hot. Disadvantage of the approach
> > > > is that it comes with significant overhead in run-time and GCC related code
> > > > is also not trivial. Moreover, anytime a histogram is merged after an instrumented
> > > > run, the resulting histogram is misleading.
> > > >
> > > > That said, I decided to simplify it again, remove usage of the histogram and return
> > > > to what we have before (--param hot-bb-count-fraction). That basically says that
> > > > we consider hot each edge that has execution count bigger than sum_max / 10.000.
> > > >
> > > > Note that LTO+PGO remains untouched as it still uses histogram that is dynamically
> > > > calculated by read arc counts.
> > > Hi,
> > > Does this affect AutoFDO stuff?  AutoFDO is broken and I am fixing it
> > > now, on the basis of current code.
> >
> > This is independent of Auto-FDO. There we can probably define cutoffs for hot-cold
> > partitions in the tool translating global data into per-file data read by GCC.
> > It is great you will take a deeper look at autoFDO. It indeed needs work!
> >
> > The patch is OK, thanks for working on it!  Histograms were added by Google as
> > a bit of an experiment, but I do not think they turned out to be useful. The data
> I did some experiments showing it is somewhat useful for autoFDO.  To
> what extent it is useful remains a question I need to investigate
> later.

Indeed, auto-FDO has a better idea about whole-program behaviour. We could revive
the patch for streaming histograms and reading them into the compiler if that turns
out to be a good idea. I can see that auto-FDO profile data tells you pretty
clearly where the hot spots are, and it is not as easy to recover this information
from a profile-annotated CFG because of all the transforms we do.
Let's fix and benchmark auto-FDO first and then we can decide what the best option is.
Putting the stream-in code back should not be hard if it turns out to be useful.

The main problem with the current histograms with normal FDO is that you need
to merge them between runs, which is a technically impossible job, so they
work for programs run once, but not for programs run many times in train runs,
like GCC itself.  It seems to me that for those really interested in
performance it is a good idea to switch to LTO, and that makes it possible to
calculate histograms during the linking stage.
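
One way to see the merging problem is a toy example. This is only a sketch
under my own assumptions, not the libgcov code: the bucketing in bucket_of,
the two-counter setup and all names below are made up for illustration. Two
training scenarios produce identical per-run histograms, yet the histograms of
the summed counters differ, so the merged histogram cannot be reconstructed
from the per-run histograms alone.

```c
#include <stdint.h>
#include <string.h>

#define N_COUNTERS 2
#define N_BUCKETS 4

/* Toy bucketing by magnitude (an assumption for this example).  */
static int bucket_of(int64_t v)
{
    if (v == 0) return 0;
    if (v < 10) return 1;
    if (v < 1000) return 2;
    return 3;
}

/* Count how many counters fall into each bucket.  */
static void build_histogram(const int64_t *counters, int *hist)
{
    memset(hist, 0, N_BUCKETS * sizeof *hist);
    for (int i = 0; i < N_COUNTERS; i++)
        hist[bucket_of(counters[i])]++;
}

/* Returns 1 when both scenarios have the same per-run histograms but
   different histograms of the summed counters.  */
static int merge_is_ambiguous(void)
{
    /* Scenario A: the same edge is hot in both runs.  */
    const int64_t a1[N_COUNTERS] = {500, 0}, a2[N_COUNTERS] = {500, 0};
    /* Scenario B: a different edge is hot in each run.  */
    const int64_t b1[N_COUNTERS] = {500, 0}, b2[N_COUNTERS] = {0, 500};
    int64_t a_sum[N_COUNTERS], b_sum[N_COUNTERS];
    int ha1[N_BUCKETS], ha2[N_BUCKETS], hb1[N_BUCKETS], hb2[N_BUCKETS];
    int ha_sum[N_BUCKETS], hb_sum[N_BUCKETS];

    for (int i = 0; i < N_COUNTERS; i++) {
        a_sum[i] = a1[i] + a2[i];
        b_sum[i] = b1[i] + b2[i];
    }

    build_histogram(a1, ha1);
    build_histogram(a2, ha2);
    build_histogram(b1, hb1);
    build_histogram(b2, hb2);
    build_histogram(a_sum, ha_sum);
    build_histogram(b_sum, hb_sum);

    int same_inputs = memcmp(ha1, hb1, sizeof ha1) == 0
                      && memcmp(ha2, hb2, sizeof ha2) == 0;
    int different_result = memcmp(ha_sum, hb_sum, sizeof ha_sum) != 0;
    return same_inputs && different_result;
}
```

Here the per-run histograms are the same in both scenarios, but the summed
counters are {1000, 0} in one case and {500, 500} in the other, which land in
different buckets. With the whole set of counters available at link time this
ambiguity does not arise.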

Honza
> 
> Thanks,
> bin
> > produced by them was not very related to what the IPA profile generation produces
> > and thus it did not seem to match reality very well.
> >
> > Honza
> > >
> > > Thanks,
> > > bin
> > > >
> > > > Note the statistics of the patch:
> > > >   19 files changed, 101 insertions(+), 1216 deletions(-)
> > > >
> > > > I'm attaching file sizes of SPEC2006 int benchmark.
> > > >
> > > > Patch survives testing on x86_64-linux-gnu machine.
> > > > Ready to be installed?
> > > >
> > > > Martin
> > > >
> > > > gcc/ChangeLog:
> > > >
> > > > 2018-09-19  Martin Liska  <mliska@suse.cz>
> > > >
> > > >         * auto-profile.c (autofdo_source_profile::read): Do not
> > > >         set sum_all.
> > > >         (read_profile): Do not add working sets.
> > > >         (read_autofdo_file): Remove sum_all.
> > > >         (afdo_callsite_hot_enough_for_early_inline): Remove const
> > > >         qualifier.
> > > >         * coverage.c (struct counts_entry): Remove gcov_summary.
> > > >         (read_counts_file): Read new GCOV_TAG_OBJECT_SUMMARY,
> > > >         do not support GCOV_TAG_PROGRAM_SUMMARY.
> > > >         (get_coverage_counts): Remove summary and expected
> > > >         arguments.
> > > >         * coverage.h (get_coverage_counts): Likewise.
> > > >         * doc/gcov-dump.texi: Remove -w option.
> > > >         * gcov-dump.c (dump_working_sets): Remove.
> > > >         (main): Do not support '-w' option.
> > > >         (print_usage): Likewise.
> > > >         (tag_summary): Likewise.
> > > >         * gcov-io.c (gcov_write_summary): Do not dump
> > > >         histogram.
> > > >         (gcov_read_summary): Likewise.
> > > >         (gcov_histo_index): Remove.
> > > >         (gcov_histogram_merge): Likewise.
> > > >         (compute_working_sets): Likewise.
> > > >         * gcov-io.h (GCOV_TAG_OBJECT_SUMMARY): Mark
> > > >         it not obsolete.
> > > >         (GCOV_TAG_PROGRAM_SUMMARY): Mark it obsolete.
> > > >         (GCOV_TAG_SUMMARY_LENGTH): Adjust.
> > > >         (GCOV_HISTOGRAM_SIZE): Remove.
> > > >         (GCOV_HISTOGRAM_BITVECTOR_SIZE): Likewise.
> > > >         (struct gcov_summary): Simplify rapidly just
> > > >         to runs and sum_max fields.
> > > >         (gcov_histo_index): Remove.
> > > >         (NUM_GCOV_WORKING_SETS): Likewise.
> > > >         (compute_working_sets): Likewise.
> > > >         * gcov-tool.c (print_overlap_usage_message): Remove
> > > >         trailing empty line.
> > > >         * gcov.c (read_count_file): Read GCOV_TAG_OBJECT_SUMMARY.
> > > >         (output_lines): Remove program related line.
> > > >         * ipa-profile.c (ipa_profile): Do not consider GCOV histogram.
> > > >         * lto-cgraph.c (output_profile_summary): Do not stream GCOV
> > > >         histogram.
> > > >         (input_profile_summary): Do not read it.
> > > >         (merge_profile_summaries): And do not merge it.
> > > >         (input_symtab): Do not call removed function.
> > > >         * modulo-sched.c (sms_schedule): Do not print sum_max.
> > > >         * params.def (HOT_BB_COUNT_FRACTION): Reincarnate param that was
> > > >         removed when histogram method was invented.
> > > >         (HOT_BB_COUNT_WS_PERMILLE): Mention that it's used only in LTO
> > > >         mode.
> > > >         * postreload-gcse.c (eliminate_partially_redundant_load): Fix
> > > >         GCOV coding style.
> > > >         * predict.c (get_hot_bb_threshold): Use HOT_BB_COUNT_FRACTION
> > > >         and dump selected value.
> > > >         * profile.c (add_working_set): Remove.
> > > >         (get_working_sets): Likewise.
> > > >         (find_working_set): Likewise.
> > > >         (get_exec_counts): Do not work with working sets.
> > > >         (read_profile_edge_counts): Do not inform as sum_max is removed.
> > > >         (compute_branch_probabilities): Likewise.
> > > >         (compute_value_histograms): Remove argument for call of
> > > >         get_coverage_counts.
> > > >         * profile.h: Do not make gcov_summary const.
> > > >
> > > > libgcc/ChangeLog:
> > > >
> > > > 2018-09-19  Martin Liska  <mliska@suse.cz>
> > > >
> > > >         * libgcov-driver.c (crc32_unsigned): Remove.
> > > >         (gcov_histogram_insert): Likewise.
> > > >         (gcov_compute_histogram): Likewise.
> > > >         (compute_summary): Simplify rapidly.
> > > >         (merge_one_data): Do not handle PROGRAM_SUMMARY tag.
> > > >         (merge_summary): Rapidly simplify.
> > > >         (dump_one_gcov): Ignore gcov_summary.
> > > >         (gcov_do_dump): Do not handle program summary, it's not
> > > >         used.
> > > >         * libgcov-util.c (tag_summary): Remove.
> > > >         (read_gcda_finalize): Fix coding style.
> > > >         (read_gcda_file): Initialize curr_object_summary.
> > > >         (compute_summary): Remove.
> > > >         (calculate_overlap): Remove settings of run_max.
> > > > ---
> > > >   gcc/auto-profile.c      |  21 +--
> > > >   gcc/coverage.c          |  59 +-----
> > > >   gcc/coverage.h          |   4 +-
> > > >   gcc/doc/gcov-dump.texi  |   6 +-
> > > >   gcc/gcov-dump.c         |  81 +-------
> > > >   gcc/gcov-io.c           | 398 +---------------------------------------
> > > >   gcc/gcov-io.h           |  71 +------
> > > >   gcc/gcov-tool.c         |   1 -
> > > >   gcc/gcov.c              |   7 +-
> > > >   gcc/ipa-profile.c       |  26 +--
> > > >   gcc/lto-cgraph.c        | 136 +-------------
> > > >   gcc/modulo-sched.c      |   8 -
> > > >   gcc/params.def          |   7 +-
> > > >   gcc/postreload-gcse.c   |   2 +-
> > > >   gcc/predict.c           |   9 +-
> > > >   gcc/profile.c           | 116 +-----------
> > > >   gcc/profile.h           |   2 +-
> > > >   libgcc/libgcov-driver.c | 324 ++++----------------------------
> > > >   libgcc/libgcov-util.c   |  39 +---
> > > >   19 files changed, 101 insertions(+), 1216 deletions(-)
> > > >
> > > >

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH] Remove arc profile histogram in non-LTO mode.
  2018-09-20 10:53       ` Jan Hubicka
@ 2018-09-21  7:19         ` Bin.Cheng
  2018-09-21  8:41           ` Martin Liška
  0 siblings, 1 reply; 7+ messages in thread
From: Bin.Cheng @ 2018-09-21  7:19 UTC (permalink / raw)
  To: Jan Hubicka; +Cc: Martin Liška, gcc-patches List

On Thu, Sep 20, 2018 at 6:43 PM Jan Hubicka <hubicka@ucw.cz> wrote:
>
> > On Thu, Sep 20, 2018 at 5:26 PM Jan Hubicka <hubicka@ucw.cz> wrote:
> > >
> > > > On Thu, Sep 20, 2018 at 2:11 AM Martin Liška <mliska@suse.cz> wrote:
> > > > >
> > > > > Hello.
> > > > >
> > > > > I've been working for some time on a patch that simplifies how we set
> > > > > the hotness threshold of basic blocks. Currently, we calculate so called
> > > > > arc profile histograms that should identify edges that cover 99.9% of all
> > > > > branching. These edges are then identified as hot. Disadvantage of the approach
> > > > > is that it comes with significant overhead in run-time and GCC related code
> > > > > is also not trivial. Moreover, anytime a histogram is merged after an instrumented
> > > > > run, the resulting histogram is misleading.
> > > > >
> > > > > That said, I decided to simplify it again, remove usage of the histogram and return
> > > > > to what we have before (--param hot-bb-count-fraction). That basically says that
> > > > > we consider hot each edge that has execution count bigger than sum_max / 10.000.
> > > > >
> > > > > Note that LTO+PGO remains untouched as it still uses histogram that is dynamically
> > > > > calculated by read arc counts.
> > > > Hi,
> > > > Does this affect AutoFDO stuff?  AutoFDO is broken and I am fixing it
> > > > now, on the basis of current code.
> > >
> > > This is independent of Auto-FDO. There we can probably define cutoffs for hot-cold
> > > partitions in the tool translating global data into per-file data read by GCC.
> > > It is great you will take a deeper look at autoFDO. It indeed needs work!
> > >
> > > The patch is OK, thanks for working on it!  Histograms were added by Google as
> > > a bit of an experiment, but I do not think they turned out to be useful. The data
> > I did some experiments showing it is somewhat useful for autoFDO.  To
> > what extent it is useful remains a question I need to investigate
> > later.
>
> Indeed, auto-FDO has a better idea about whole-program behaviour. We could revive
> the patch for streaming histograms and reading them into the compiler if that turns
> out to be a good idea. I can see that auto-FDO profile data tells you pretty
> clearly where the hot spots are, and it is not as easy to recover this information
> from a profile-annotated CFG because of all the transforms we do.
> Let's fix and benchmark auto-FDO first and then we can decide what the best option is.
> Putting the stream-in code back should not be hard if it turns out to be useful.
>
> The main problem with the current histograms with normal FDO is that you need
> to merge them between runs, which is a technically impossible job, so they
> work for programs run once, but not for programs run many times in train runs,
> like GCC itself.  It seems to me that for those really interested in
> performance it is a good idea to switch to LTO, and that makes it possible to
> calculate histograms during the linking stage.

Honza, thanks very much for the detailed explanation.

Thanks,
bin
>
> Honza
> >
> > Thanks,
> > bin
> > > produced by them was not very related to what the IPA profile generation produces
> > > and thus it did not seem to match reality very well.
> > >
> > > Honza
> > > >
> > > > Thanks,
> > > > bin
> > > > >
> > > > > Note the statistics of the patch:
> > > > >   19 files changed, 101 insertions(+), 1216 deletions(-)
> > > > >
> > > > > I'm attaching file sizes of SPEC2006 int benchmark.
> > > > >
> > > > > Patch survives testing on x86_64-linux-gnu machine.
> > > > > Ready to be installed?
> > > > >
> > > > > Martin
> > > > >
> > > > > gcc/ChangeLog:
> > > > >
> > > > > 2018-09-19  Martin Liska  <mliska@suse.cz>
> > > > >
> > > > >         * auto-profile.c (autofdo_source_profile::read): Do not
> > > > >         set sum_all.
> > > > >         (read_profile): Do not add working sets.
> > > > >         (read_autofdo_file): Remove sum_all.
> > > > >         (afdo_callsite_hot_enough_for_early_inline): Remove const
> > > > >         qualifier.
> > > > >         * coverage.c (struct counts_entry): Remove gcov_summary.
> > > > >         (read_counts_file): Read new GCOV_TAG_OBJECT_SUMMARY,
> > > > >         do not support GCOV_TAG_PROGRAM_SUMMARY.
> > > > >         (get_coverage_counts): Remove summary and expected
> > > > >         arguments.
> > > > >         * coverage.h (get_coverage_counts): Likewise.
> > > > >         * doc/gcov-dump.texi: Remove -w option.
> > > > >         * gcov-dump.c (dump_working_sets): Remove.
> > > > >         (main): Do not support '-w' option.
> > > > >         (print_usage): Likewise.
> > > > >         (tag_summary): Likewise.
> > > > >         * gcov-io.c (gcov_write_summary): Do not dump
> > > > >         histogram.
> > > > >         (gcov_read_summary): Likewise.
> > > > >         (gcov_histo_index): Remove.
> > > > >         (gcov_histogram_merge): Likewise.
> > > > >         (compute_working_sets): Likewise.
> > > > >         * gcov-io.h (GCOV_TAG_OBJECT_SUMMARY): Mark
> > > > >         it not obsolete.
> > > > >         (GCOV_TAG_PROGRAM_SUMMARY): Mark it obsolete.
> > > > >         (GCOV_TAG_SUMMARY_LENGTH): Adjust.
> > > > >         (GCOV_HISTOGRAM_SIZE): Remove.
> > > > >         (GCOV_HISTOGRAM_BITVECTOR_SIZE): Likewise.
> > > > >         (struct gcov_summary): Simplify rapidly just
> > > > >         to runs and sum_max fields.
> > > > >         (gcov_histo_index): Remove.
> > > > >         (NUM_GCOV_WORKING_SETS): Likewise.
> > > > >         (compute_working_sets): Likewise.
> > > > >         * gcov-tool.c (print_overlap_usage_message): Remove
> > > > >         trailing empty line.
> > > > >         * gcov.c (read_count_file): Read GCOV_TAG_OBJECT_SUMMARY.
> > > > >         (output_lines): Remove program related line.
> > > > >         * ipa-profile.c (ipa_profile): Do not consider GCOV histogram.
> > > > >         * lto-cgraph.c (output_profile_summary): Do not stream GCOV
> > > > >         histogram.
> > > > >         (input_profile_summary): Do not read it.
> > > > >         (merge_profile_summaries): And do not merge it.
> > > > >         (input_symtab): Do not call removed function.
> > > > >         * modulo-sched.c (sms_schedule): Do not print sum_max.
> > > > >         * params.def (HOT_BB_COUNT_FRACTION): Reincarnate param that was
> > > > >         removed when histogram method was invented.
> > > > >         (HOT_BB_COUNT_WS_PERMILLE): Mention that it's used only in LTO
> > > > >         mode.
> > > > >         * postreload-gcse.c (eliminate_partially_redundant_load): Fix
> > > > >         GCOV coding style.
> > > > >         * predict.c (get_hot_bb_threshold): Use HOT_BB_COUNT_FRACTION
> > > > >         and dump selected value.
> > > > >         * profile.c (add_working_set): Remove.
> > > > >         (get_working_sets): Likewise.
> > > > >         (find_working_set): Likewise.
> > > > >         (get_exec_counts): Do not work with working sets.
> > > > >         (read_profile_edge_counts): Do not inform as sum_max is removed.
> > > > >         (compute_branch_probabilities): Likewise.
> > > > >         (compute_value_histograms): Remove argument for call of
> > > > >         get_coverage_counts.
> > > > >         * profile.h: Do not make gcov_summary const.
> > > > >
> > > > > libgcc/ChangeLog:
> > > > >
> > > > > 2018-09-19  Martin Liska  <mliska@suse.cz>
> > > > >
> > > > >         * libgcov-driver.c (crc32_unsigned): Remove.
> > > > >         (gcov_histogram_insert): Likewise.
> > > > >         (gcov_compute_histogram): Likewise.
> > > > >         (compute_summary): Simplify rapidly.
> > > > >         (merge_one_data): Do not handle PROGRAM_SUMMARY tag.
> > > > >         (merge_summary): Rapidly simplify.
> > > > >         (dump_one_gcov): Ignore gcov_summary.
> > > > >         (gcov_do_dump): Do not handle program summary, it's not
> > > > >         used.
> > > > >         * libgcov-util.c (tag_summary): Remove.
> > > > >         (read_gcda_finalize): Fix coding style.
> > > > >         (read_gcda_file): Initialize curr_object_summary.
> > > > >         (compute_summary): Remove.
> > > > >         (calculate_overlap): Remove settings of run_max.
> > > > > ---
> > > > >   gcc/auto-profile.c      |  21 +--
> > > > >   gcc/coverage.c          |  59 +-----
> > > > >   gcc/coverage.h          |   4 +-
> > > > >   gcc/doc/gcov-dump.texi  |   6 +-
> > > > >   gcc/gcov-dump.c         |  81 +-------
> > > > >   gcc/gcov-io.c           | 398 +---------------------------------------
> > > > >   gcc/gcov-io.h           |  71 +------
> > > > >   gcc/gcov-tool.c         |   1 -
> > > > >   gcc/gcov.c              |   7 +-
> > > > >   gcc/ipa-profile.c       |  26 +--
> > > > >   gcc/lto-cgraph.c        | 136 +-------------
> > > > >   gcc/modulo-sched.c      |   8 -
> > > > >   gcc/params.def          |   7 +-
> > > > >   gcc/postreload-gcse.c   |   2 +-
> > > > >   gcc/predict.c           |   9 +-
> > > > >   gcc/profile.c           | 116 +-----------
> > > > >   gcc/profile.h           |   2 +-
> > > > >   libgcc/libgcov-driver.c | 324 ++++----------------------------
> > > > >   libgcc/libgcov-util.c   |  39 +---
> > > > >   19 files changed, 101 insertions(+), 1216 deletions(-)
> > > > >
> > > > >

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH] Remove arc profile histogram in non-LTO mode.
  2018-09-21  7:19         ` Bin.Cheng
@ 2018-09-21  8:41           ` Martin Liška
  0 siblings, 0 replies; 7+ messages in thread
From: Martin Liška @ 2018-09-21  8:41 UTC (permalink / raw)
  To: Bin.Cheng, Jan Hubicka; +Cc: gcc-patches List

On 9/21/18 6:05 AM, Bin.Cheng wrote:
> On Thu, Sep 20, 2018 at 6:43 PM Jan Hubicka <hubicka@ucw.cz> wrote:
>>
>>> On Thu, Sep 20, 2018 at 5:26 PM Jan Hubicka <hubicka@ucw.cz> wrote:
>>>>
>>>>> On Thu, Sep 20, 2018 at 2:11 AM Martin Liška <mliska@suse.cz> wrote:
>>>>>>
>>>>>> Hello.
>>>>>>
>>>>>> I've been working for some time on a patch that simplifies how we set
>>>>>> the hotness threshold of basic blocks. Currently, we calculate so-called
>>>>>> arc profile histograms that should identify edges that cover 99.9% of all
>>>>>> branching. These edges are then identified as hot. The disadvantage of this approach
>>>>>> is that it comes with significant run-time overhead, and the related GCC code
>>>>>> is also not trivial. Moreover, any time a histogram is merged after an instrumented
>>>>>> run, the resulting histogram is misleading.
>>>>>>
>>>>>> That said, I decided to simplify it again, remove usage of the histogram and return
>>>>>> to what we had before (--param hot-bb-count-fraction). That basically says that
>>>>>> we consider hot each edge that has an execution count bigger than sum_max / 10000.
>>>>>>
>>>>>> Note that LTO+PGO remains untouched as it still uses a histogram that is dynamically
>>>>>> calculated from the arc counts that are read.
>>>>> Hi,
>>>>> Does this affect AutoFDO stuff?  AutoFDO is broken and I am fixing it
>>>>> now, on the basis of current code.
>>>>
>>>> This is independent of AutoFDO. There we can probably define cutoffs for hot-cold
>>>> partitions in the tool translating global data into per-file data read by GCC.
>>>> It is great you will take a deeper look at AutoFDO. It indeed needs work!
>>>>
>>>> The patch is OK, thanks for working on it!  Histograms were added by Google as
>>>> a bit of an experiment, but I do not think they turned out to be useful. The data
>>> I did some experiments showing it is somewhat useful for AutoFDO.  To
>>> what extent it is useful remains a question I need to investigate
>>> later.
>>
>> Indeed, AutoFDO has a better idea about whole-program behaviour. We could revive
>> the patch for streaming histograms and reading them into the compiler if that turns
>> out to be a good idea. I can see that AutoFDO profile data tells you pretty
>> clearly where the hot spots are, and it is not as easy to recover this information
>> from the profile-annotated CFG because of all the transforms we do.
>> Let's fix and benchmark AutoFDO first and then we can decide what the best option is.
>> Putting the stream-in code back should not be hard if it turns out to be useful.
>>
>> The main problem with the current histograms with normal FDO is the fact that you need
>> to merge them between runs, which is a technically impossible job to do, so they
>> work for programs run once, but not for programs run many times in train runs
>> like gcc itself.  It seems to me that for those really interested in
>> performance it is a good idea to switch to LTO, and that makes it possible to
>> calculate histograms during the linking stage.
> 
> Honza, thanks very much for the detailed explanation.

Thanks, Honza, for the review.

Bin: Do not hesitate to ask me about whatever you need. As Honza mentioned,
I don't see any problem with propagating hotness information from AutoFDO into
*.gcda files. The format can be discussed.

Martin

> 
> Thanks,
> bin
>>
>> Honza
>>>
>>> Thanks,
>>> bin
>>>> produced by them was not very related to what the IPA profile generation produces
>>>> and thus it did not seem to match reality very well.
>>>>
>>>> Honza
>>>>>
>>>>> Thanks,
>>>>> bin
>>>>>>
>>>>>> Note the statistics of the patch:
>>>>>>   19 files changed, 101 insertions(+), 1216 deletions(-)
>>>>>>
>>>>>> I'm attaching file sizes of SPEC2006 int benchmark.
>>>>>>
>>>>>> Patch survives testing on x86_64-linux-gnu machine.
>>>>>> Ready to be installed?
>>>>>>
>>>>>> Martin
>>>>>>
>>>>>> gcc/ChangeLog:
>>>>>>
>>>>>> 2018-09-19  Martin Liska  <mliska@suse.cz>
>>>>>>
>>>>>>         * auto-profile.c (autofdo_source_profile::read): Do not
>>>>>>         set sum_all.
>>>>>>         (read_profile): Do not add working sets.
>>>>>>         (read_autofdo_file): Remove sum_all.
>>>>>>         (afdo_callsite_hot_enough_for_early_inline): Remove const
>>>>>>         qualifier.
>>>>>>         * coverage.c (struct counts_entry): Remove gcov_summary.
>>>>>>         (read_counts_file): Read new GCOV_TAG_OBJECT_SUMMARY,
>>>>>>         do not support GCOV_TAG_PROGRAM_SUMMARY.
>>>>>>         (get_coverage_counts): Remove summary and expected
>>>>>>         arguments.
>>>>>>         * coverage.h (get_coverage_counts): Likewise.
>>>>>>         * doc/gcov-dump.texi: Remove -w option.
>>>>>>         * gcov-dump.c (dump_working_sets): Remove.
>>>>>>         (main): Do not support '-w' option.
>>>>>>         (print_usage): Likewise.
>>>>>>         (tag_summary): Likewise.
>>>>>>         * gcov-io.c (gcov_write_summary): Do not dump
>>>>>>         histogram.
>>>>>>         (gcov_read_summary): Likewise.
>>>>>>         (gcov_histo_index): Remove.
>>>>>>         (gcov_histogram_merge): Likewise.
>>>>>>         (compute_working_sets): Likewise.
>>>>>>         * gcov-io.h (GCOV_TAG_OBJECT_SUMMARY): Mark
>>>>>>         it not obsolete.
>>>>>>         (GCOV_TAG_PROGRAM_SUMMARY): Mark it obsolete.
>>>>>>         (GCOV_TAG_SUMMARY_LENGTH): Adjust.
>>>>>>         (GCOV_HISTOGRAM_SIZE): Remove.
>>>>>>         (GCOV_HISTOGRAM_BITVECTOR_SIZE): Likewise.
>>>>>>         (struct gcov_summary): Simplify rapidly just
>>>>>>         to runs and sum_max fields.
>>>>>>         (gcov_histo_index): Remove.
>>>>>>         (NUM_GCOV_WORKING_SETS): Likewise.
>>>>>>         (compute_working_sets): Likewise.
>>>>>>         * gcov-tool.c (print_overlap_usage_message): Remove
>>>>>>         trailing empty line.
>>>>>>         * gcov.c (read_count_file): Read GCOV_TAG_OBJECT_SUMMARY.
>>>>>>         (output_lines): Remove program related line.
>>>>>>         * ipa-profile.c (ipa_profile): Do not consider GCOV histogram.
>>>>>>         * lto-cgraph.c (output_profile_summary): Do not stream GCOV
>>>>>>         histogram.
>>>>>>         (input_profile_summary): Do not read it.
>>>>>>         (merge_profile_summaries): And do not merge it.
>>>>>>         (input_symtab): Do not call removed function.
>>>>>>         * modulo-sched.c (sms_schedule): Do not print sum_max.
>>>>>>         * params.def (HOT_BB_COUNT_FRACTION): Reincarnate param that was
>>>>>>         removed when histogram method was invented.
>>>>>>         (HOT_BB_COUNT_WS_PERMILLE): Mention that it's used only in LTO
>>>>>>         mode.
>>>>>>         * postreload-gcse.c (eliminate_partially_redundant_load): Fix
>>>>>>         GCOV coding style.
>>>>>>         * predict.c (get_hot_bb_threshold): Use HOT_BB_COUNT_FRACTION
>>>>>>         and dump selected value.
>>>>>>         * profile.c (add_working_set): Remove.
>>>>>>         (get_working_sets): Likewise.
>>>>>>         (find_working_set): Likewise.
>>>>>>         (get_exec_counts): Do not work with working sets.
>>>>>>         (read_profile_edge_counts): Do not inform as sum_max is removed.
>>>>>>         (compute_branch_probabilities): Likewise.
>>>>>>         (compute_value_histograms): Remove argument for call of
>>>>>>         get_coverage_counts.
>>>>>>         * profile.h: Do not make gcov_summary const.
>>>>>>
>>>>>> libgcc/ChangeLog:
>>>>>>
>>>>>> 2018-09-19  Martin Liska  <mliska@suse.cz>
>>>>>>
>>>>>>         * libgcov-driver.c (crc32_unsigned): Remove.
>>>>>>         (gcov_histogram_insert): Likewise.
>>>>>>         (gcov_compute_histogram): Likewise.
>>>>>>         (compute_summary): Simplify rapidly.
>>>>>>         (merge_one_data): Do not handle PROGRAM_SUMMARY tag.
>>>>>>         (merge_summary): Rapidly simplify.
>>>>>>         (dump_one_gcov): Ignore gcov_summary.
>>>>>>         (gcov_do_dump): Do not handle program summary, it's not
>>>>>>         used.
>>>>>>         * libgcov-util.c (tag_summary): Remove.
>>>>>>         (read_gcda_finalize): Fix coding style.
>>>>>>         (read_gcda_file): Initialize curr_object_summary.
>>>>>>         (compute_summary): Remove.
>>>>>>         (calculate_overlap): Remove settings of run_max.
>>>>>> ---
>>>>>>   gcc/auto-profile.c      |  21 +--
>>>>>>   gcc/coverage.c          |  59 +-----
>>>>>>   gcc/coverage.h          |   4 +-
>>>>>>   gcc/doc/gcov-dump.texi  |   6 +-
>>>>>>   gcc/gcov-dump.c         |  81 +-------
>>>>>>   gcc/gcov-io.c           | 398 +---------------------------------------
>>>>>>   gcc/gcov-io.h           |  71 +------
>>>>>>   gcc/gcov-tool.c         |   1 -
>>>>>>   gcc/gcov.c              |   7 +-
>>>>>>   gcc/ipa-profile.c       |  26 +--
>>>>>>   gcc/lto-cgraph.c        | 136 +-------------
>>>>>>   gcc/modulo-sched.c      |   8 -
>>>>>>   gcc/params.def          |   7 +-
>>>>>>   gcc/postreload-gcse.c   |   2 +-
>>>>>>   gcc/predict.c           |   9 +-
>>>>>>   gcc/profile.c           | 116 +-----------
>>>>>>   gcc/profile.h           |   2 +-
>>>>>>   libgcc/libgcov-driver.c | 324 ++++----------------------------
>>>>>>   libgcc/libgcov-util.c   |  39 +---
>>>>>>   19 files changed, 101 insertions(+), 1216 deletions(-)
>>>>>>
>>>>>>

^ permalink raw reply	[flat|nested] 7+ messages in thread
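
For readers skimming the thread, the fixed-fraction rule Martin describes (an edge is hot when its execution count is bigger than sum_max / 10000, tunable through --param hot-bb-count-fraction) can be illustrated with a small sketch. This is not GCC code: the function names, the edge names and the counts are hypothetical, sum_max is simply taken as an input, and the strict ">" comparison follows the wording of the mail rather than the actual implementation.

```python
# Illustrative sketch only; names and sample numbers are hypothetical.

def hot_bb_threshold(sum_max: int, hot_bb_count_fraction: int = 10000) -> int:
    """Return the execution-count threshold sum_max / fraction.

    Integer division is an assumption made for this sketch.
    """
    if hot_bb_count_fraction <= 0:
        raise ValueError("hot_bb_count_fraction must be positive")
    return sum_max // hot_bb_count_fraction


def is_hot(count: int, sum_max: int, hot_bb_count_fraction: int = 10000) -> bool:
    """An edge is treated as hot when its count exceeds the threshold."""
    return count > hot_bb_threshold(sum_max, hot_bb_count_fraction)


# Hypothetical per-edge execution counts from one profile.
edge_counts = {"loop_body": 5_000_000, "init": 600, "error_path": 12}
sum_max = 5_000_000

# Collect the edges whose count is above sum_max / 10000.
hot_edges = sorted(name for name, count in edge_counts.items()
                   if is_hot(count, sum_max))
print(hot_edges)
```

With these made-up numbers the threshold is 500, so the loop body and the initialization edge count as hot and the error path does not. Unlike the histogram, this rule needs only a single summary value from the profile, which is the simplification the patch is after.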

end of thread, other threads:[~2018-09-21  8:33 UTC | newest]

Thread overview: 7+ messages
2018-09-19 18:19 [PATCH] Remove arc profile histogram in non-LTO mode Martin Liška
2018-09-20  3:25 ` Bin.Cheng
2018-09-20  9:34   ` Jan Hubicka
2018-09-20 10:45     ` Bin.Cheng
2018-09-20 10:53       ` Jan Hubicka
2018-09-21  7:19         ` Bin.Cheng
2018-09-21  8:41           ` Martin Liška
