public inbox for gcc-patches@gcc.gnu.org
* Profile housekeeping 6/n (-fprofile-consistency-report)
@ 2012-10-04 14:01 Jan Hubicka
  2012-10-04 14:09 ` Steven Bosscher
  0 siblings, 1 reply; 13+ messages in thread
From: Jan Hubicka @ 2012-10-04 14:01 UTC
  To: gcc-patches

Hi,
this patch implements -fprofile-consistency-report, which is useful for getting
statistics about which passes are the major offenders in failing to keep the
profile up to date.

For example, the following is the output for combine.c:
 Pass: fnsplit              (after pass) mismatched in:  +16 (freqs)   +0 (counts); michmatched out:   +0 (freqs)   +0 (counts)
 Pass: fnsplit              (after TODO) mismatched in:  -16 (freqs)   +0 (counts); michmatched out:   +0 (freqs)   +0 (counts)
 Pass: inline               (after pass) mismatched in: +197 (freqs)   +0 (counts); michmatched out:   +0 (freqs)   +0 (counts)
 Pass: inline               (after TODO) mismatched in: -209 (freqs)   +0 (counts); michmatched out:   +0 (freqs)   +0 (counts)
 Pass: ccp                  (after TODO) mismatched in:   +8 (freqs)   +0 (counts); michmatched out:   +0 (freqs)   +0 (counts)
 Pass: vrp                  (after pass) mismatched in: +191 (freqs)   +0 (counts); michmatched out:   +0 (freqs)   +0 (counts)
 Pass: vrp                  (after TODO) mismatched in:  +25 (freqs)   +0 (counts); michmatched out:   +0 (freqs)   +0 (counts)
 Pass: dce                  (after TODO) mismatched in:  -19 (freqs)   +0 (counts); michmatched out:   +0 (freqs)   +0 (counts)
 Pass: cdce                 (after pass) mismatched in:   -1 (freqs)   +0 (counts); michmatched out:   +0 (freqs)   +0 (counts)
 Pass: cselim               (after pass) mismatched in:   +1 (freqs)   +0 (counts); michmatched out:   +0 (freqs)   +0 (counts)
 Pass: ifcombine            (after TODO) mismatched in:   +1 (freqs)   +0 (counts); michmatched out:   +0 (freqs)   +0 (counts)
 Pass: phiopt               (after pass) mismatched in:   -2 (freqs)   +0 (counts); michmatched out:   +0 (freqs)   +0 (counts)
 Pass: ch                   (after pass) mismatched in:   +2 (freqs)   +0 (counts); michmatched out:   +0 (freqs)   +0 (counts)
 Pass: ch                   (after TODO) mismatched in:   +1 (freqs)   +0 (counts); michmatched out:   +0 (freqs)   +0 (counts)
 Pass: dom                  (after pass) mismatched in:  +89 (freqs)   +0 (counts); michmatched out:   +0 (freqs)   +0 (counts)
 Pass: dom                  (after TODO) mismatched in:   -3 (freqs)   +0 (counts); michmatched out:   +0 (freqs)   +0 (counts)
 Pass: phicprop             (after TODO) mismatched in:   -6 (freqs)   +0 (counts); michmatched out:   +0 (freqs)   +0 (counts)
 Pass: dce                  (after TODO) mismatched in:   -2 (freqs)   +0 (counts); michmatched out:   +0 (freqs)   +0 (counts)
 Pass: copyprop             (after TODO) mismatched in:  -17 (freqs)   +0 (counts); michmatched out:   +0 (freqs)   +0 (counts)
 Pass: unswitch             (after pass) mismatched in:   +7 (freqs)   +0 (counts); michmatched out:   +0 (freqs)   +0 (counts)
 Pass: unswitch             (after TODO) mismatched in:  +19 (freqs)   +0 (counts); michmatched out:   +0 (freqs)   +0 (counts)
 Pass: cunroll              (after pass) mismatched in:  +10 (freqs)   +0 (counts); michmatched out:   +0 (freqs)   +0 (counts)
 Pass: vrp                  (after pass) mismatched in:  +18 (freqs)   +0 (counts); michmatched out:   +0 (freqs)   +0 (counts)
 Pass: vrp                  (after TODO) mismatched in:   -8 (freqs)   +0 (counts); michmatched out:   +0 (freqs)   +0 (counts)
 Pass: dom                  (after pass) mismatched in:  +14 (freqs)   +0 (counts); michmatched out:   +0 (freqs)   +0 (counts)
 Pass: dom                  (after TODO) mismatched in:   +4 (freqs)   +0 (counts); michmatched out:   +0 (freqs)   +0 (counts)
 Pass: phicprop             (after TODO) mismatched in:   +1 (freqs)   +0 (counts); michmatched out:   +0 (freqs)   +0 (counts)
 Pass: cddce                (after TODO) mismatched in:   -1 (freqs)   +0 (counts); michmatched out:   +0 (freqs)   +0 (counts)
 Pass: expand               (after pass) mismatched in: +435 (freqs)   +0 (counts); michmatched out:   +0 (freqs)   +0 (counts)
 Pass: jump                 (after pass) mismatched in:   +6 (freqs)   +0 (counts); michmatched out:   +0 (freqs)   +0 (counts)
 Pass: cse1                 (after pass) mismatched in:   +1 (freqs)   +0 (counts); michmatched out:   +0 (freqs)   +0 (counts)
 Pass: cprop                (after pass) mismatched in:   -8 (freqs)   +0 (counts); michmatched out:   +0 (freqs)   +0 (counts)
 Pass: rtl pre              (after pass) mismatched in:   -1 (freqs)   +0 (counts); michmatched out:   +0 (freqs)   +0 (counts)
 Pass: cse_local            (after pass) mismatched in:   -7 (freqs)   +0 (counts); michmatched out:   +0 (freqs)   +0 (counts)
 Pass: ce1                  (after pass) mismatched in:   +5 (freqs)   +0 (counts); michmatched out:   +0 (freqs)   +0 (counts)
 Pass: loop2_init           (after pass) mismatched in:   +1 (freqs)   +0 (counts); michmatched out:   +0 (freqs)   +0 (counts)
 Pass: loop2_done           (after pass) mismatched in:   -1 (freqs)   +0 (counts); michmatched out:   +0 (freqs)   +0 (counts)
 Pass: reload               (after pass) mismatched in:   -4 (freqs)   +0 (counts); michmatched out:   +0 (freqs)   +0 (counts)
 Pass: gcse2                (after pass) mismatched in:   -1 (freqs)   +0 (counts); michmatched out:   +0 (freqs)   +0 (counts)
 Pass: split2               (after pass) mismatched in:   +1 (freqs)   +0 (counts); michmatched out:   +0 (freqs)   +0 (counts)
 Pass: pro_and_epilogue     (after pass) mismatched in:   +4 (freqs)   +0 (counts); michmatched out:   +0 (freqs)   +0 (counts)
 Pass: dse2                 (after pass) mismatched in:   -2 (freqs)   +0 (counts); michmatched out:   +0 (freqs)   +0 (counts)
 Pass: jump2                (after pass) mismatched in:   -1 (freqs)   +0 (counts); michmatched out:   +0 (freqs)   +0 (counts)
 Pass: ce3                  (after pass) mismatched in:   -1 (freqs)   +0 (counts); michmatched out:   +0 (freqs)   +0 (counts)
 Pass: bbro                 (after pass) mismatched in:   -6 (freqs)   +0 (counts); michmatched out:   +0 (freqs)   +0 (counts)
 Pass: compgotos            (after pass) mismatched in:   -1 (freqs)   +0 (counts); michmatched out:   +0 (freqs)   +0 (counts)
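
The signed numbers above are deltas: per the patch below, each pass accumulates
a count of mismatching blocks and the report prints the difference against the
previously printed row.  A minimal standalone sketch of that bookkeeping follows;
the function names are hypothetical and not part of the patch.

```c
#include <assert.h>
#include <stdio.h>

/* Return the change in a cumulative mismatch counter since the last
   reported value, and remember CURRENT for the next call.  */
int
delta_since_last (int current, int *last)
{
  int delta;

  assert (last != NULL);
  delta = current - *last;
  *last = current;
  return delta;
}

/* Print one report line in roughly the style shown above.  NAME and
   STAGE are whatever labels the caller wants on the row.  */
void
print_row (FILE *f, const char *name, const char *stage, int delta)
{
  assert (f != NULL);
  fprintf (f, " Pass: %-20s %s mismatched in: %+4i (freqs)\n",
	   name, stage, delta);
}
```

With made-up cumulative totals of 3, 19 and 3, two successive calls yield +16
and -16, which is the pattern seen when a TODO cleanup undoes what its pass
introduced.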

As a quick explanation, the inliner recomputes frequencies as part of its TODO,
hence the divergence.  VRP and DOM break a lot of probabilities by jump threading
where the original estimate was inconsistent (it would make sense to do some of
the jump threading prior to profile estimation).
Expansion obviously needs a lot of TLC, not only in switch expansion, which
is currently all wrong and quite commonly exercised in combine.c.
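
For readers who want the criteria without reading the diff, here is a
standalone sketch of the two frequency checks that the patch below applies per
basic block.  It works on plain integer arrays rather than GCC's CFG types;
PROB_BASE is assumed to equal REG_BR_PROB_BASE (10000), and the function names
are made up for illustration.

```c
#include <assert.h>
#include <stdlib.h>

/* Assumed to match GCC's REG_BR_PROB_BASE.  */
#define PROB_BASE 10000

/* Nonzero if the outgoing edge probabilities PROBS[0..N-1] of a block
   fail to add up to PROB_BASE within a tolerance of 100.  A block with
   no successors is never counted.  */
int
mismatched_out (const int *probs, int n)
{
  int sum = 0;

  assert (n >= 0);
  for (int i = 0; i < n; i++)
    sum += probs[i];
  return n > 0 && abs (sum - PROB_BASE) > 100;
}

/* Nonzero if the summed incoming edge frequencies IN_FREQS[0..N-1]
   disagree with the block's own frequency FREQ, either by more than
   100 in absolute terms or by more than 10% once the larger of the
   two values exceeds 10.  */
int
mismatched_in (const int *in_freqs, int n, int freq)
{
  int sum = 0;

  assert (n >= 0);
  for (int i = 0; i < n; i++)
    sum += in_freqs[i];
  int max = sum > freq ? sum : freq;
  return abs (sum - freq) > 100
	 || (max > 10 && abs ((sum - freq) * 100 / (max + 1)) > 10);
}
```

For instance, a block whose two outgoing edges carry 7000 and 2000 is counted
as mismatched out, and a block of frequency 60 reached only by an edge of
frequency 50 is counted as mismatched in through the relative test.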

Regtested/bootstrapped x86_64-linux.  If there are no complaints I will commit
the patch tomorrow.

Honza
	* doc/invoke.texi (-fprofile-consistency-report): Document.
	* common.opt (fprofile-consistency-report): New.
	* toplev.h (dump_profile_consistency_report): Declare.
	* toplev.c (finalize): Call dump_profile_consistency_report.
	* passes.c (profile_record): New global var.
	(check_profile_consistency): New function.
	(dump_profile_consistency_report): New function.
	(execute_one_ipa_transform_pass): Call check_profile_consistency.
	(execute_one_pass): Likewise.
Index: doc/invoke.texi
===================================================================
--- doc/invoke.texi	(revision 191852)
+++ doc/invoke.texi	(working copy)
@@ -386,7 +386,7 @@ Objective-C and Objective-C++ Dialects}.
 -fno-toplevel-reorder -fno-trapping-math -fno-zero-initialized-in-bss @gol
 -fomit-frame-pointer -foptimize-register-move -foptimize-sibling-calls @gol
 -fpartial-inlining -fpeel-loops -fpredictive-commoning @gol
--fprefetch-loop-arrays @gol
+-fprefetch-loop-arrays -fprofile-consistency-report @gol
 -fprofile-correction -fprofile-dir=@var{path} -fprofile-generate @gol
 -fprofile-generate=@var{path} @gol
 -fprofile-use -fprofile-use=@var{path} -fprofile-values @gol
@@ -5149,6 +5149,11 @@ allocation for the WPA phase only.
 Makes the compiler print some statistics about permanent memory
 allocation before or after interprocedural optimization.
 
+@item -fprofile-consistency-report
+@opindex fprofile-consistency-report
+Makes the compiler print some statistics about consistency of the
+(estimated) profile.
+
 @item -fstack-usage
 @opindex fstack-usage
 Makes the compiler output stack usage information for the program, on a
Index: common.opt
===================================================================
--- common.opt	(revision 191852)
+++ common.opt	(working copy)
@@ -1649,6 +1649,10 @@ fprofile-values
 Common Report Var(flag_profile_values)
 Insert code to profile values of expressions
 
+fprofile-consistency-report
+Common Report Var(profile_report)
+Report on consistency of profile
+
 frandom-seed
 Common Var(common_deferred_options) Defer
 
Index: toplev.c
===================================================================
--- toplev.c	(revision 191852)
+++ toplev.c	(working copy)
@@ -1815,6 +1815,9 @@ finalize (bool no_backend)
   if (mem_report)
     dump_memory_report (true);
 
+  if (profile_report)
+    dump_profile_consistency_report ();
+
   /* Language-specific end of compilation actions.  */
   lang_hooks.finish ();
 }
Index: passes.c
===================================================================
--- passes.c	(revision 191852)
+++ passes.c	(working copy)
@@ -1782,6 +1784,108 @@ execute_function_dump (void *data ATTRIB
     }
 }
 
+/* Hold statistic about profile consistency.  */
+
+struct profile_record
+{
+  int num_mismatched_freq_in[2];
+  int num_mismatched_freq_out[2];
+  int num_mismatched_count_in[2];
+  int num_mismatched_count_out[2];
+  bool tested;
+};
+
+static struct profile_record *profile_record;
+
+/* Account profile inconsistencies for pass INDEX. If SUBPASS is non-zero, the
+   accounting happens after TODO.  */
+
+static void
+check_profile_consistency (int index, int subpass)
+{
+  basic_block bb;
+  edge_iterator ei;
+  edge e;
+  int sum;
+  gcov_type lsum;
+
+  if (index == -1)
+    return;
+  if (!profile_record)
+    profile_record = XCNEWVEC (struct profile_record,
+			       passes_by_id_size);
+  gcc_assert (index < passes_by_id_size && index >= 0);
+  gcc_assert (subpass < 2);
+  profile_record[index].tested = true;
+
+  FOR_ALL_BB (bb)
+   {
+      if (bb != EXIT_BLOCK_PTR_FOR_FUNCTION (cfun))
+	{
+	  sum = 0;
+	  FOR_EACH_EDGE (e, ei, bb->succs)
+	    sum += e->probability;
+	  if (EDGE_COUNT (bb->succs) && abs (sum - REG_BR_PROB_BASE) > 100)
+	    profile_record[index].num_mismatched_freq_out[subpass]++;
+	  lsum = 0;
+	  FOR_EACH_EDGE (e, ei, bb->succs)
+	    lsum += e->count;
+	  if (EDGE_COUNT (bb->succs)
+	      && (lsum - bb->count > 100 || lsum - bb->count < -100))
+	    profile_record[index].num_mismatched_count_out[subpass]++;
+	}
+	if (bb != ENTRY_BLOCK_PTR_FOR_FUNCTION (cfun))
+	  {
+	    sum = 0;
+	    FOR_EACH_EDGE (e, ei, bb->preds)
+	      sum += EDGE_FREQUENCY (e);
+	    if (abs (sum - bb->frequency) > 100
+		|| (MAX (sum, bb->frequency) > 10
+		    && abs ((sum - bb->frequency) * 100 / (MAX (sum, bb->frequency) + 1)) > 10))
+	      profile_record[index].num_mismatched_freq_in[subpass]++;
+	    lsum = 0;
+	    FOR_EACH_EDGE (e, ei, bb->preds)
+	      lsum += e->count;
+	    if (lsum - bb->count > 100 || lsum - bb->count < -100)
+	      profile_record[index].num_mismatched_count_in[subpass]++;
+	  }
+   }
+}
+
+/* Output profile consistency.  */
+
+void
+dump_profile_consistency_report (void)
+{
+  int i, j;
+  int last_freq_in = 0, last_count_in = 0, last_freq_out = 0, last_count_out = 0;
+
+  if (!profile_record)
+    return;
+  fprintf (stderr, "\nProfile consistency report:\n");
+  for (i = 0; i < passes_by_id_size; i++)
+    for (j = 0 ; j < 2; j++)
+      if ((profile_record[i].num_mismatched_freq_in[j] != last_freq_in
+	   || profile_record[i].num_mismatched_freq_out[j] != last_freq_out
+	   || profile_record[i].num_mismatched_count_in[j] != last_count_in
+	   || profile_record[i].num_mismatched_count_out[j] != last_count_out)
+	  && profile_record[i].tested)
+      {
+	fprintf (stderr," Pass: %-20s %s mismatched in: %+4i (freqs) %+4i (counts); "
+		 "michmatched out: %+4i (freqs) %+4i (counts)\n",
+		 passes_by_id [i]->name,
+		 j ? "(after TODO)" : "(after pass)",
+		 profile_record[i].num_mismatched_freq_in[j] - last_freq_in,
+		 profile_record[i].num_mismatched_count_in[j] - last_count_in,
+		 profile_record[i].num_mismatched_freq_out[j] - last_freq_out,
+		 profile_record[i].num_mismatched_count_out[j] - last_count_out);
+	last_freq_in = profile_record[i].num_mismatched_freq_in[j];
+	last_freq_out = profile_record[i].num_mismatched_freq_out[j];
+	last_count_in = profile_record[i].num_mismatched_count_in[j];
+	last_count_out = profile_record[i].num_mismatched_count_out[j];
+      }
+}
+
 /* Perform all TODO actions that ought to be done on each function.  */
 
 static void
@@ -2050,9 +2150,16 @@ execute_one_ipa_transform_pass (struct c
   if (pass->tv_id != TV_NONE)
     timevar_pop (pass->tv_id);
 
+  if (profile_report && cfun && (cfun->curr_properties & PROP_cfg)
+      && profile_status != PROFILE_ABSENT)
+    check_profile_consistency (pass->static_pass_number, 0);
+
   /* Run post-pass cleanup and verification.  */
   execute_todo (todo_after);
   verify_interpass_invariants ();
+  if (profile_report && cfun && (cfun->curr_properties & PROP_cfg)
+      && profile_status != PROFILE_ABSENT)
+    check_profile_consistency (pass->static_pass_number, 1);
 
   do_per_function (execute_function_dump, NULL);
   pass_fini_dump_file (pass);
@@ -2218,8 +2325,16 @@ execute_one_pass (struct opt_pass *pass)
       clean_graph_dump_file (dump_file_name);
     }
 
+  if (profile_report && cfun && (cfun->curr_properties & PROP_cfg)
+      && profile_status != PROFILE_ABSENT)
+    check_profile_consistency (pass->static_pass_number, 0);
+
   /* Run post-pass cleanup and verification.  */
   execute_todo (todo_after | pass->todo_flags_finish);
+  if (profile_report && cfun && (cfun->curr_properties & PROP_cfg)
+      && profile_status != PROFILE_ABSENT)
+    check_profile_consistency (pass->static_pass_number, 1);
+
   verify_interpass_invariants ();
   do_per_function (execute_function_dump, NULL);
   if (pass->type == IPA_PASS)
Index: toplev.h
===================================================================
--- toplev.h	(revision 191852)
+++ toplev.h	(working copy)
@@ -49,6 +49,7 @@ extern void emit_debug_global_declaratio
 extern void write_global_declarations (void);
 
 extern void dump_memory_report (bool);
+extern void dump_profile_consistency_report (void);
 
 extern void target_reinit (void);
 

* Re: Profile housekeeping 6/n (-fprofile-consistency-report)
  2012-10-04 14:01 Profile housekeeping 6/n (-fprofile-consistency-report) Jan Hubicka
@ 2012-10-04 14:09 ` Steven Bosscher
  2012-10-04 14:40   ` Jan Hubicka
  2012-10-06 14:10   ` Jan Hubicka
  0 siblings, 2 replies; 13+ messages in thread
From: Steven Bosscher @ 2012-10-04 14:09 UTC
  To: Jan Hubicka; +Cc: gcc-patches

On Thu, Oct 4, 2012 at 4:01 PM, Jan Hubicka wrote:
>         * doc/invoke.texi (-fprofile-consistency-report): Document.
>         * common.opt (fprofile-consistency-report): New.
>         * toplev.h (dump_profile_consistency_report): Declare.
>         * toplev.c (finalize): Call dump_profile_consistency_report.
>         * passes.c (profile_record): New global var.
>         (check_profile_consistency): New function.
>         (dump_profile_consistency_report): New function.
>         (execute_one_ipa_transform_pass): Call check_profile_consistency.
>         (execute_one_pass): Likewise.


Nice. And long overdue! :-)


> +fprofile-consistency-report
> +Common Report Var(profile_report)
> +Report on consistency of profile

Maybe make this a -d flag instead of -f?


> Index: passes.c
> +/* Hold statistic about profile consistency.  */
...

I don't see why this should live in passes.c, can you please put it in
a more logical place (profile.c, perhaps)?

Ciao!
Steven

* Re: Profile housekeeping 6/n (-fprofile-consistency-report)
  2012-10-04 14:09 ` Steven Bosscher
@ 2012-10-04 14:40   ` Jan Hubicka
  2012-10-06 14:10   ` Jan Hubicka
  1 sibling, 0 replies; 13+ messages in thread
From: Jan Hubicka @ 2012-10-04 14:40 UTC
  To: Steven Bosscher; +Cc: Jan Hubicka, gcc-patches

> On Thu, Oct 4, 2012 at 4:01 PM, Jan Hubicka wrote:
> >         * doc/invoke.texi (-fprofile-consistency-report): Document.
> >         * common.opt (fprofile-consistency-report): New.
> >         * toplev.h (dump_profile_consistency_report): Declare.
> >         * toplev.c (finalize): Call dump_profile_consistency_report.
> >         * passes.c (profile_record): New global var.
> >         (check_profile_consistency): New function.
> >         (dump_profile_consistency_report): New function.
> >         (execute_one_ipa_transform_pass): Call check_profile_consistency.
> >         (execute_one_pass): Likewise.
> 
> 
> Nice. And long overdue! :-)
> 
> 
> > +fprofile-consistency-report
> > +Common Report Var(profile_report)
> > +Report on consistency of profile
> 
> Maybe make this a -d flag instead of -f?

-ftime-report and -fmem-report are also -f flags, so I guess we should move all of them or none.
> 
> 
> > Index: passes.c
> > +/* Hold statistic about profile consistency.  */
> ...
> 
> I don't see why this should live in passes.c, can you please put it in
> a more logical place (profile.c, perhaps)?

Hmm, I guess predict.c then.
I had it there, but then I remembered Richard's effort to pull out functions that
are only called from elsewhere and do not use anything from the given unit ;)

Honza
> 
> Ciao!
> Steven

* Re: Profile housekeeping 6/n (-fprofile-consistency-report)
  2012-10-04 14:09 ` Steven Bosscher
  2012-10-04 14:40   ` Jan Hubicka
@ 2012-10-06 14:10   ` Jan Hubicka
  2012-10-06 14:39     ` Steven Bosscher
  2012-10-06 15:15     ` Graham Stott
  1 sibling, 2 replies; 13+ messages in thread
From: Jan Hubicka @ 2012-10-06 14:10 UTC
  To: Steven Bosscher; +Cc: Jan Hubicka, gcc-patches

> > Index: passes.c
> > +/* Hold statistic about profile consistency.  */
> ...
> 
> I don't see why this should live in passes.c, can you please put it in
> a more logical place (profile.c, perhaps)?

Hmm, the problem here is that the code uses the pass manager's dumping bits
to order the passes and assign them names.  Moving it elsewhere would require
exporting those, which is not nice.  passes.c does similar stuff already, so I
decided to keep it there.

Here is the patch I committed; it prints a more detailed report.  It also reports
changes in the overall unit time/size estimates, and when a pass did nothing it
is reported as ineffective.  For tramp3d it looks this way:

Pass name                        |mismatch in |mismated out|Overall
                                 |freq count  |freq count  |size       time
cfg                  (after TODO)|            |            |  -1.5474%
ssa                              |            |            |  -2.0148%
inline_param         ------------|            |            |
einline                          |            |            |  -0.4991%
einline              (after TODO)|            |            |  -0.0129%
early_optimizations  ------------|            |            |
copyrename           ------------|            |            |
ccp                              |            |            |  -0.2273%
forwprop                         |            |            |  -0.0688%
ealias               ------------|            |            |
esra                             |            |            |  -0.1892%
fre                              |            |            |  -7.9369%
copyprop             (after TODO)|            |            |  -0.0187%
mergephi             ------------|            |            |
cddce                            |            |            |  -0.1655%
eipa_sra                         |            |            |  -0.0237%
tailr                            |            |            |  -0.5305%
switchconv                       |            |            |  +0.0190%
profile_estimate                 |    +1      |            |
local-pure-const     ------------|            |            |
fnsplit                          |   +20      |            |  +0.2333%  +0.8267%
fnsplit              (after TODO)|   -20      |            |  -1.8146%  -1.3300%
release_ssa          ------------|            |            |
inline_param         ------------|            |            |
inline                           |  +229      |            | +14.5138% -10.6020%
inline               (after TODO)|  -225      |            |  -0.2662% -10.2195%
copyrename                       |            |            |  -2.7026%  -2.9448%
cunrolli             ------------|            |            |
ccp                              |            |            |  -0.0174%  -0.0142%
ccp                  (after TODO)|    +8      |            |  -0.3571%  -0.1213%
forwprop                         |            |            |  -0.1442%  -0.5343%
alias                ------------|            |            |
retslot              ------------|            |            |
phiprop              ------------|            |            |
fre                              |            |            |  -0.2801%  -0.2814%
fre                  (after TODO)|            |            |  -0.2897%  -0.0362%
copyprop                         |            |            |  -0.0264%  -0.0251%
mergephi             ------------|            |            |
vrp                              |   +80      |            |  +0.1233%  -1.0271%
vrp                  (after TODO)|    +3      |            |  -0.8618%  -0.2421%
dce                              |            |            |  -0.0089%  -0.0021%
dce                  (after TODO)|    -7      |            |  -0.0089%  -0.0347%
cdce                 ------------|            |            |
cselim                           |            |            |  -0.0177%  +0.0000%
ifcombine                        |            |            |  +0.0044%  +0.0001%
ifcombine            (after TODO)|            |            |  -0.0089%  +0.0000%
phiopt                           |            |            |  -0.1730%  -0.1203%
tailr                ------------|            |            |
ch                               |    +2      |            |  +2.0535%  +0.0000%
ch                   (after TODO)|            |            |            +0.0002%
cplxlower            ------------|            |            |
sra                              |            |            |  +0.0087%
copyrename           ------------|            |            |
dom                              |   +86      |            |  +1.1802%  -0.1366%
dom                  (after TODO)|    +5      |            |  -0.1894%  -0.3042%
phicprop                         |            |            |  -0.0043%
phicprop             (after TODO)|    -6      |            |            +0.2117%
dse                              |            |            |  -0.0086%  -0.0009%
reassoc                          |            |            |  +0.0302%  +0.0116%
dce                              |            |            |  -0.6640%  -0.1408%
dce                  (after TODO)|    -2      |            |
forwprop                         |            |            |  -0.1129%  -0.1826%
phiopt               ------------|            |            |
objsz                ------------|            |            |
strlen               ------------|            |            |
ccp                  (after TODO)|            |            |  -0.0087%
copyprop             ------------|            |            |
sincos               ------------|            |            |
bswap                ------------|            |            |
crited               ------------|            |            |
pre                              |    +1      |            |  +1.2168%  -1.3703%
pre                  (after TODO)|    -2      |            |  -0.0215%  +0.0764%
sink                             |            |            |            -0.0256%
loop                 ------------|            |            |
loopinit             ------------|            |            |
lim                              |            |            |  +0.0086%  -0.1529%
copyprop             (after TODO)|   -11      |            |            +0.1483%
dceloop                          |            |            |  -0.0172%  -0.0028%
unswitch                         |    +7      |            |  +1.0265%  +0.0404%
unswitch             (after TODO)|   +21      |            |  -0.5569%  -0.7164%
sccp                 ------------|            |            |
ldist                ------------|            |            |
copyprop             ------------|            |            |
ivcanon                          |            |            |  +0.0214%  +0.0661%
ifcvt                ------------|            |            |
vect                 ------------|            |            |
dceloop                          |            |            |  -0.0085%  -0.0009%
pcom                             |            |            |            -0.0002%
cunroll                          |    -2      |            |  +0.8891%  -0.0781%
slp                  ------------|            |            |
ivopts                           |            |            |  +0.1822%  +0.0837%
lim                  ------------|            |            |
loopdone             ------------|            |            |
veclower2            ------------|            |            |
reassoc                          |            |            |  -0.0042%  -0.0169%
vrp                              |   +17      |            |  -0.1015%  -0.2523%
vrp                  (after TODO)|    +1      |            |  -0.2540%  -0.3505%
slsr                             |            |            |  +0.1952%  +0.6254%
dom                              |   +16      |            |  -0.0254%  -0.6641%
dom                  (after TODO)|    +5      |            |  -0.0424%  -0.0243%
phicprop                         |            |            |  -0.0170%
phicprop             (after TODO)|    +2      |            |            +0.0141%
cddce                            |            |            |  -0.1229%  -0.0515%
cddce                (after TODO)|    -1      |            |  -0.0085%
dse                  ------------|            |            |
forwprop                         |            |            |  -0.0382%  -0.0254%
phiopt                           |            |            |  -0.0255%  -0.0462%
fab                  ------------|            |            |
widening_mul         ------------|            |            |
tailc                ------------|            |            |
copyrename           ------------|            |            |
uncprop              ------------|            |            |
local-pure-const     ------------|            |            |
nrv                  ------------|            |            |
optimized            ------------|            |            |
expand                           |  +430      |            |----------
vregs                ------------|            |            |
into_cfglayout                   |            |            |  -4.4596%  -3.2103%
jump                             |    +6      |            |  -0.5852%  -0.5775%
subreg1                          |            |            |  -0.0589%
dfinit               ------------|            |            |
cse1                             |            |            |  -0.0966%  -0.0620%
fwprop1                          |            |            |  -2.5152%  -3.0170%
cprop                            |    -8      |            |  -0.4209%  -0.4256%
rtl pre                          |            |            |  +0.8649%  +0.7079%
hoist                ------------|            |            |
cprop                            |    -7      |            |  -0.9099%  -1.0555%
cse_local                        |            |            |  -0.1787%  -0.2454%
ce1                              |    -1      |            |  +0.0913%  +0.3673%
reginfo              ------------|            |            |
loop2                ------------|            |            |
loop2_init           ------------|            |            |
loop2_invariant                  |            |            |  +0.0523%  +0.1014%
loop2_unswitch       ------------|            |            |
loop2_done           ------------|            |            |
cprop                            |            |            |  -0.3951%  -0.2580%
cse2                             |            |            |  -0.3698%  -0.3355%
dse1                             |            |            |  -0.0025%  -0.0011%
fwprop2                          |            |            |  -0.0012%  -0.0013%
init-regs            ------------|            |            |
ud_dce                           |            |            |  -0.0784%  -0.0126%
combine                          |            |            |  -3.1886%  -3.4658%
ce2                              |            |            |            +0.0001%
regmove                          |            |            |  +0.0025%  +0.0000%
outof_cfglayout                  |            |            |  +3.9667%  +1.8602%
split1                           |            |            |  +0.0828%  +0.0878%
subreg2              ------------|            |            |
mode_sw              ------------|            |            |
asmcons              ------------|            |            |
ira                              |            |            |  +3.0210%  +0.5602%
reload                           |    -4      |            |  -2.3169%  -3.8620%
postreload                       |            |            |  -0.3193%  -0.0548%
gcse2                            |            |            |  -0.0170%  -0.0051%
split2                           |            |            |  +0.2646%  +0.3206%
ree                              |            |            |  +0.0012%  +0.0175%
pro_and_epilogue                 |    +4      |            |  +3.8163% +18.4507%
dse2                 ------------|            |            |
csa                  ------------|            |            |
jump2                            |            |            |  -4.4781%  -0.0612%
peephole2                        |            |            |  +0.0073%  -0.0539%
ce3                              |            |            |  -0.0049%  +0.0092%
cprop_hardreg        ------------|            |            |
rtl_dce                          |            |            |  -0.0903%  -0.0736%
bbro                             |    -7      |            |  +0.4776%  -1.0224%
split4               ------------|            |            |
sched2               ------------|            |            |
stack                ------------|            |            |
alignments           ------------|            |            |
compgotos            ------------|            |            |
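
The size and time columns read as relative changes in percent.  Assuming each
value is the change of the unit-wide estimate against the estimate from the
previous row (an assumption; the hunk computing it is not shown in this part of
the mail), the arithmetic could be sketched as below.  Both function names are
hypothetical.

```c
#include <assert.h>
#include <stdio.h>

/* Relative change from BEFORE to AFTER in percent.  Returns 0 when
   BEFORE is 0 so that the caller never divides by zero.  */
double
relative_change (double before, double after)
{
  if (before == 0)
    return 0;
  return (after - before) * 100.0 / before;
}

/* Print PCT as a signed percentage with four decimals, similar in
   shape to the columns above.  */
void
print_change (FILE *f, double pct)
{
  assert (f != NULL);
  fprintf (f, "%+8.4f%%", pct);
}
```

Under that assumption, an estimate growing from 200 to 250 would be reported
as a change of +25%, and one shrinking from 100 to 90 as -10%.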

Honza

	* doc/invoke.texi (-fprofile-report): Document.
	* common.opt (-fprofile-report): New option.
	* toplev.c (finalize): Call dump_profile_report.
	* toplev.h (profile_report): Declare.
	* passes.c (profile_record): New static var.
	(check_profile_consistency): New function.
	(dump_profile_record): New function.
	(execute_one_ipa_transform_pass): Call check_profile_consistency.
	(execute_one_pass): Likewise.
Index: doc/invoke.texi
===================================================================
--- doc/invoke.texi	(revision 192116)
+++ doc/invoke.texi	(working copy)
@@ -388,7 +388,7 @@ Objective-C and Objective-C++ Dialects}.
 -fno-toplevel-reorder -fno-trapping-math -fno-zero-initialized-in-bss @gol
 -fomit-frame-pointer -foptimize-register-move -foptimize-sibling-calls @gol
 -fpartial-inlining -fpeel-loops -fpredictive-commoning @gol
--fprefetch-loop-arrays @gol
+-fprefetch-loop-arrays -fprofile-report @gol
 -fprofile-correction -fprofile-dir=@var{path} -fprofile-generate @gol
 -fprofile-generate=@var{path} @gol
 -fprofile-use -fprofile-use=@var{path} -fprofile-values @gol
@@ -5153,6 +5153,11 @@ allocation for the WPA phase only.
 Makes the compiler print some statistics about permanent memory
 allocation before or after interprocedural optimization.
 
+@item -fprofile-report
+@opindex fprofile-report
+Makes the compiler print some statistics about consistency of the
+(estimated) profile and effect of individual passes.
+
 @item -fstack-usage
 @opindex fstack-usage
 Makes the compiler output stack usage information for the program, on a
Index: common.opt
===================================================================
--- common.opt	(revision 192116)
+++ common.opt	(working copy)
@@ -1654,6 +1654,10 @@ fprofile-values
 Common Report Var(flag_profile_values)
 Insert code to profile values of expressions
 
+fprofile-report
+Common Report Var(profile_report)
+Report on consistency of profile
+
 frandom-seed
 Common Var(common_deferred_options) Defer
 
Index: toplev.c
===================================================================
--- toplev.c	(revision 192116)
+++ toplev.c	(working copy)
@@ -1815,6 +1815,9 @@ finalize (bool no_backend)
   if (mem_report)
     dump_memory_report (true);
 
+  if (profile_report)
+    dump_profile_report ();
+
   /* Language-specific end of compilation actions.  */
   lang_hooks.finish ();
 }
Index: toplev.h
===================================================================
--- toplev.h	(revision 192116)
+++ toplev.h	(working copy)
@@ -49,6 +49,7 @@ extern void emit_debug_global_declaratio
 extern void write_global_declarations (void);
 
 extern void dump_memory_report (bool);
+extern void dump_profile_report (void);
 
 extern void target_reinit (void);
 
Index: passes.c
===================================================================
--- passes.c	(revision 192116)
+++ passes.c	(working copy)
@@ -1778,6 +1780,209 @@ execute_function_dump (void *data ATTRIB
     }
 }
 
+/* Make statistic about profile consistency.  */
+
+struct profile_record
+{
+  int num_mismatched_freq_in[2];
+  int num_mismatched_freq_out[2];
+  int num_mismatched_count_in[2];
+  int num_mismatched_count_out[2];
+  bool run;
+  gcov_type time[2];
+  int size[2];
+};
+
+static struct profile_record *profile_record;
+
+static void
+check_profile_consistency (int index, int subpass, bool run)
+{
+  basic_block bb;
+  edge_iterator ei;
+  edge e;
+  int sum;
+  gcov_type lsum;
+
+  if (index == -1)
+    return;
+  if (!profile_record)
+    profile_record = XCNEWVEC (struct profile_record,
+			       passes_by_id_size);
+  gcc_assert (index < passes_by_id_size && index >= 0);
+  gcc_assert (subpass < 2);
+  profile_record[index].run |= run;
+
+  FOR_ALL_BB (bb)
+   {
+      if (bb != EXIT_BLOCK_PTR_FOR_FUNCTION (cfun)
+	  && profile_status != PROFILE_ABSENT)
+	{
+	  sum = 0;
+	  FOR_EACH_EDGE (e, ei, bb->succs)
+	    sum += e->probability;
+	  if (EDGE_COUNT (bb->succs) && abs (sum - REG_BR_PROB_BASE) > 100)
+	    profile_record[index].num_mismatched_freq_out[subpass]++;
+	  lsum = 0;
+	  FOR_EACH_EDGE (e, ei, bb->succs)
+	    lsum += e->count;
+	  if (EDGE_COUNT (bb->succs)
+	      && (lsum - bb->count > 100 || lsum - bb->count < -100))
+	    profile_record[index].num_mismatched_count_out[subpass]++;
+	}
+      if (bb != ENTRY_BLOCK_PTR_FOR_FUNCTION (cfun)
+	  && profile_status != PROFILE_ABSENT)
+	{
+	  sum = 0;
+	  FOR_EACH_EDGE (e, ei, bb->preds)
+	    sum += EDGE_FREQUENCY (e);
+	  if (abs (sum - bb->frequency) > 100
+	      || (MAX (sum, bb->frequency) > 10
+		  && abs ((sum - bb->frequency) * 100 / (MAX (sum, bb->frequency) + 1)) > 10))
+	    profile_record[index].num_mismatched_freq_in[subpass]++;
+	  lsum = 0;
+	  FOR_EACH_EDGE (e, ei, bb->preds)
+	    lsum += e->count;
+	  if (lsum - bb->count > 100 || lsum - bb->count < -100)
+	    profile_record[index].num_mismatched_count_in[subpass]++;
+	}
+      if (bb == ENTRY_BLOCK_PTR_FOR_FUNCTION (cfun)
+	  || bb == EXIT_BLOCK_PTR_FOR_FUNCTION (cfun))
+	continue;
+      if ((cfun && (cfun->curr_properties & PROP_trees)))
+	{
+	  gimple_stmt_iterator i;
+
+	  for (i = gsi_start_bb (bb); !gsi_end_p (i); gsi_next (&i))
+	    {
+	      profile_record[index].size[subpass]
+		 += estimate_num_insns (gsi_stmt (i), &eni_size_weights);
+	      if (profile_status == PROFILE_READ)
+		profile_record[index].time[subpass]
+		   += estimate_num_insns (gsi_stmt (i),
+					  &eni_time_weights) * bb->count;
+	      else if (profile_status == PROFILE_GUESSED)
+		profile_record[index].time[subpass]
+		   += estimate_num_insns (gsi_stmt (i),
+					  &eni_time_weights) * bb->frequency;
+	    }
+	}
+      else if (cfun && (cfun->curr_properties & PROP_rtl))
+	{
+	  rtx insn;
+	  for (insn = NEXT_INSN (BB_HEAD (bb)); insn && insn != NEXT_INSN (BB_END (bb));
+	       insn = NEXT_INSN (insn))
+	    if (INSN_P (insn))
+	      {
+		profile_record[index].size[subpass]
+		   += insn_rtx_cost (PATTERN (insn), false);
+		if (profile_status == PROFILE_READ)
+		  profile_record[index].time[subpass]
+		     += insn_rtx_cost (PATTERN (insn), true) * bb->count;
+		else if (profile_status == PROFILE_GUESSED)
+		  profile_record[index].time[subpass]
+		     += insn_rtx_cost (PATTERN (insn), true) * bb->frequency;
+	      }
+	}
+   }
+}
+
+/* Output profile consistency.  */
+
+void
+dump_profile_report (void)
+{
+  int i, j;
+  int last_freq_in = 0, last_count_in = 0, last_freq_out = 0, last_count_out = 0;
+  gcov_type last_time = 0, last_size = 0;
+  double rel_time_change, rel_size_change;
+  int last_reported = 0;
+
+  if (!profile_record)
+    return;
+  fprintf (stderr, "\nProfile consistency report:\n\n");
+  fprintf (stderr, "Pass name                        |mismatch in |mismated out|Overall\n");
+  fprintf (stderr, "                                 |freq count  |freq count  |size   time\n");
+	   
+  for (i = 0; i < passes_by_id_size; i++)
+    for (j = 0 ; j < 2; j++)
+      if (profile_record[i].run)
+	{
+	  if (last_time)
+	    rel_time_change = (profile_record[i].time[j]
+			       - (double)last_time) * 100 / (double)last_time;
+	  else
+	    rel_time_change = 0;
+	  if (last_size)
+	    rel_size_change = (profile_record[i].size[j]
+			       - (double)last_size) * 100 / (double)last_size;
+	  else
+	    rel_size_change = 0;
+
+	  if (profile_record[i].num_mismatched_freq_in[j] != last_freq_in
+	      || profile_record[i].num_mismatched_freq_out[j] != last_freq_out
+	      || profile_record[i].num_mismatched_count_in[j] != last_count_in
+	      || profile_record[i].num_mismatched_count_out[j] != last_count_out
+	      || rel_time_change || rel_size_change)
+	    {
+	      last_reported = i;
+              fprintf (stderr, "%-20s %s",
+		       passes_by_id [i]->name,
+		       j ? "(after TODO)" : "            ");
+	      if (profile_record[i].num_mismatched_freq_in[j] != last_freq_in)
+		fprintf (stderr, "| %+5i",
+		         profile_record[i].num_mismatched_freq_in[j]
+			  - last_freq_in);
+	      else
+		fprintf (stderr, "|      ");
+	      if (profile_record[i].num_mismatched_count_in[j] != last_count_in)
+		fprintf (stderr, " %+5i",
+		         profile_record[i].num_mismatched_count_in[j]
+			  - last_count_in);
+	      else
+		fprintf (stderr, "      ");
+	      if (profile_record[i].num_mismatched_freq_out[j] != last_freq_out)
+		fprintf (stderr, "| %+5i",
+		         profile_record[i].num_mismatched_freq_out[j]
+			  - last_freq_out);
+	      else
+		fprintf (stderr, "|      ");
+	      if (profile_record[i].num_mismatched_count_out[j] != last_count_out)
+		fprintf (stderr, " %+5i",
+		         profile_record[i].num_mismatched_count_out[j]
+			  - last_count_out);
+	      else
+		fprintf (stderr, "      ");
+
+	      /* Size/time units change across gimple and RTL.  */
+	      if (i == pass_expand.pass.static_pass_number)
+		fprintf (stderr, "|----------");
+	      else
+		{
+		  if (rel_size_change)
+		    fprintf (stderr, "| %+8.4f%%", rel_size_change);
+		  else
+		    fprintf (stderr, "|          ");
+		  if (rel_time_change)
+		    fprintf (stderr, " %+8.4f%%", rel_time_change);
+		}
+	      fprintf (stderr, "\n");
+	      last_freq_in = profile_record[i].num_mismatched_freq_in[j];
+	      last_freq_out = profile_record[i].num_mismatched_freq_out[j];
+	      last_count_in = profile_record[i].num_mismatched_count_in[j];
+	      last_count_out = profile_record[i].num_mismatched_count_out[j];
+	    }
+	  else if (j && last_reported != i)
+	    {
+	      last_reported = i;
+              fprintf (stderr, "%-20s ------------|            |            |\n",
+		       passes_by_id [i]->name);
+	    }
+	  last_time = profile_record[i].time[j];
+	  last_size = profile_record[i].size[j];
+	}
+}
+
 /* Perform all TODO actions that ought to be done on each function.  */
 
 static void
@@ -2042,9 +2247,14 @@ execute_one_ipa_transform_pass (struct c
   if (pass->tv_id != TV_NONE)
     timevar_pop (pass->tv_id);
 
+  if (profile_report && cfun && (cfun->curr_properties & PROP_cfg))
+    check_profile_consistency (pass->static_pass_number, 0, true);
+
   /* Run post-pass cleanup and verification.  */
   execute_todo (todo_after);
   verify_interpass_invariants ();
+  if (profile_report && cfun && (cfun->curr_properties & PROP_cfg))
+    check_profile_consistency (pass->static_pass_number, 1, true);
 
   do_per_function (execute_function_dump, NULL);
   pass_fini_dump_file (pass);
@@ -2144,6 +2354,13 @@ execute_one_pass (struct opt_pass *pass)
 
   if (!gate_status)
     {
+      /* Run so passes selectively disabling themselves on a given function
+	 are not miscounted.  */
+      if (profile_report && cfun && (cfun->curr_properties & PROP_cfg))
+	{
+          check_profile_consistency (pass->static_pass_number, 0, false);
+          check_profile_consistency (pass->static_pass_number, 1, false);
+	}
       current_pass = NULL;
       return false;
     }
@@ -2210,8 +2427,14 @@ execute_one_pass (struct opt_pass *pass)
       clean_graph_dump_file (dump_file_name);
     }
 
+  if (profile_report && cfun && (cfun->curr_properties & PROP_cfg))
+    check_profile_consistency (pass->static_pass_number, 0, true);
+
   /* Run post-pass cleanup and verification.  */
   execute_todo (todo_after | pass->todo_flags_finish);
+  if (profile_report && cfun && (cfun->curr_properties & PROP_cfg))
+    check_profile_consistency (pass->static_pass_number, 1, true);
+
   verify_interpass_invariants ();
   do_per_function (execute_function_dump, NULL);
   if (pass->type == IPA_PASS)
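
The tolerance tests used by check_profile_consistency can be tried in isolation with the sketch below. It is a stand-alone approximation, not the patch itself: the function names are hypothetical, plain integers stand in for the basic_block and edge fields, and prob_base stands in for REG_BR_PROB_BASE (assumed here to be 10000).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Incoming frequencies: SUM is the sum of EDGE_FREQUENCY over the
   predecessors, FREQ is bb->frequency.  A block is counted as mismatched
   when the two differ by more than 100 units, or by more than 10%
   (small frequencies, at most 10, are exempt from the relative test).  */
static bool
freq_in_mismatched (int sum, int freq)
{
  int max = sum > freq ? sum : freq;

  if (abs (sum - freq) > 100)
    return true;
  return max > 10 && abs ((sum - freq) * 100 / (max + 1)) > 10;
}

/* Outgoing probabilities: for a block with successors, the edge
   probabilities should sum to PROB_BASE within 100 units.  */
static bool
prob_out_mismatched (int n_succs, int prob_sum, int prob_base)
{
  assert (n_succs >= 0);
  return n_succs > 0 && abs (prob_sum - prob_base) > 100;
}

/* Execution counts: the sum of edge counts LSUM should match the block
   count COUNT within 100 executions.  */
static bool
count_mismatched (long long lsum, long long count)
{
  return lsum - count > 100 || lsum - count < -100;
}
```

With these thresholds, a block of frequency 40 whose incoming edges sum to 50 is reported (about 19% off), while 95 against 100 is tolerated.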

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: Profile housekeeping 6/n (-fprofile-consistency-report)
  2012-10-06 14:10   ` Jan Hubicka
@ 2012-10-06 14:39     ` Steven Bosscher
  2012-10-06 15:43       ` Steven Bosscher
  2012-10-06 15:44       ` Jan Hubicka
  2012-10-06 15:15     ` Graham Stott
  1 sibling, 2 replies; 13+ messages in thread
From: Steven Bosscher @ 2012-10-06 14:39 UTC (permalink / raw)
  To: Jan Hubicka; +Cc: gcc-patches

On Sat, Oct 6, 2012 at 4:10 PM, Jan Hubicka <hubicka@ucw.cz> wrote:
>> > Index: passes.c
>> > +/* Hold statistic about profile consistency.  */
>> ...
>>
>> I don't see why this should live in passes.c, can you please put it in
>> a more logical place (profile.c, perhaps)?
>
> Hmm, the problem here is that the code is using passmanager's dumping bits
> to order the passes and assign them names.  So moving it elsewhere requires
> exporting it, which is not nice.  passes.c does similar stuff already, so I decided
> to keep it there.

I think this is not the right decision. We can also throw _all_ code
on one pile because there are inter-dependencies. Or they can be fixed
with a proper interface.

> If there are no complaints I will commit the patch tomorrow.

+1 complaint.
You're putting profile stuff and even RTL stuff in the pass manager.
That is Just Wrong.

Ciao!
Steven

* Re: Profile housekeeping 6/n (-fprofile-consistency-report)
  2012-10-06 14:10   ` Jan Hubicka
  2012-10-06 14:39     ` Steven Bosscher
@ 2012-10-06 15:15     ` Graham Stott
  2012-10-06 15:34       ` Jan Hubicka
  1 sibling, 1 reply; 13+ messages in thread
From: Graham Stott @ 2012-10-06 15:15 UTC (permalink / raw)
  To: Jan Hubicka; +Cc: gcc-patches

Jan.

This patch also breaks bootstrap due to compilation errors reported for
passes.c and toplev.c.

Graham

* Re: Profile housekeeping 6/n (-fprofile-consistency-report)
  2012-10-06 15:15     ` Graham Stott
@ 2012-10-06 15:34       ` Jan Hubicka
  0 siblings, 0 replies; 13+ messages in thread
From: Jan Hubicka @ 2012-10-06 15:34 UTC (permalink / raw)
  To: Graham Stott; +Cc: Jan Hubicka, gcc-patches

> Jan.
> 
> This patch also breaks bootstrap due to compilation errors reported for
> passes.c and toplev.c.

Sorry about that.  I swapped files and used an old version of the patch.  It
should be fixed now.

Honza
> 
> Graham

* Re: Profile housekeeping 6/n (-fprofile-consistency-report)
  2012-10-06 14:39     ` Steven Bosscher
@ 2012-10-06 15:43       ` Steven Bosscher
  2012-10-06 15:46         ` Jan Hubicka
  2012-10-06 15:44       ` Jan Hubicka
  1 sibling, 1 reply; 13+ messages in thread
From: Steven Bosscher @ 2012-10-06 15:43 UTC (permalink / raw)
  To: Jan Hubicka; +Cc: gcc-patches

On Sat, Oct 6, 2012 at 4:39 PM, Steven Bosscher <stevenb.gcc@gmail.com> wrote:

>> If there are no complaints I will commit the patch tomorrow.
>
> +1 complaint.
> You're putting profile stuff and even RTL stuff in the pass manager.
> That is Just Wrong.

You already committed the patch. Your tomorrow started early? ;-)

Ciao!
Steven

* Re: Profile housekeeping 6/n (-fprofile-consistency-report)
  2012-10-06 14:39     ` Steven Bosscher
  2012-10-06 15:43       ` Steven Bosscher
@ 2012-10-06 15:44       ` Jan Hubicka
  1 sibling, 0 replies; 13+ messages in thread
From: Jan Hubicka @ 2012-10-06 15:44 UTC (permalink / raw)
  To: Steven Bosscher; +Cc: Jan Hubicka, gcc-patches

> On Sat, Oct 6, 2012 at 4:10 PM, Jan Hubicka <hubicka@ucw.cz> wrote:
> >> > Index: passes.c
> >> > +/* Hold statistic about profile consistency.  */
> >> ...
> >>
> >> I don't see why this should live in passes.c, can you please put it in
> >> a more logical place (profile.c, perhaps)?
> >
> > Hmm, the problem here is that the code is using passmanager's dumping bits
> > to order the passes and assign them names.  So moving it elsewhere requires
> > exporting it, which is not nice.  passes.c does similar stuff already, so I decided
> > to keep it there.
> 
> I think this is not the right decision. We can also throw _all_ code
> on one pile because there are inter-dependencies. Or they can be fixed
> with a proper interface.
> 
> > If there are no complaints I will commit the patch tomorrow.
> 
> +1 complaint.
> You're putting profile stuff and even RTL stuff in the pass manager.
> That is Just Wrong.

Hmm, the code really is about collecting statistics of individual passes.
I do not think it is that different from the rest of the TODO handling logic,
but I do not care much.  It is not supposed to be something that will grow
over time and become an essential part of the compiler.

The code uses the CFG/Gimple/RTL interfaces, which are used by many parts of
the compiler (including the passmanager), plus passmanager's passes_by_id and
passes_by_id_size, which are not exported.

profile.c is not a good place because that is only about reading profile
feedback.  cfg.c or predict.c seems better because that is where most of the
cfg/profile API lives.  I will probably move it to cfg.c, because it should be
in sync with the BB dumping code, which should report the same inconsistencies
as the statistics loop.

Would you prefer exporting the profile_record structure and putting the actual
walk of the CFG into cfg.c, or exporting the passmanager's passes_by_id used by
the dumping?  (Actually passes_by_id is missing "static" in the declaration, so
it is a matter of putting it in the header.)  I will implement one of those.

Honza
> 
> Ciao!
> Steven

* Re: Profile housekeeping 6/n (-fprofile-consistency-report)
  2012-10-06 15:43       ` Steven Bosscher
@ 2012-10-06 15:46         ` Jan Hubicka
  2012-10-06 15:56           ` Jan Hubicka
  0 siblings, 1 reply; 13+ messages in thread
From: Jan Hubicka @ 2012-10-06 15:46 UTC (permalink / raw)
  To: Steven Bosscher; +Cc: Jan Hubicka, gcc-patches

> On Sat, Oct 6, 2012 at 4:39 PM, Steven Bosscher <stevenb.gcc@gmail.com> wrote:
> 
> >> If there are no complaints I will commit the patch tomorrow.
> >
> > +1 complaint.
> > You're putting profile stuff and even RTL stuff in the pass manager.
> > That is Just Wrong.
> 
> You already committed the patch. Your tomorrow started early? ;-)

The tomorrow you cite is from Thursday :)

Honza

* Re: Profile housekeeping 6/n (-fprofile-consistency-report)
  2012-10-06 15:46         ` Jan Hubicka
@ 2012-10-06 15:56           ` Jan Hubicka
  2012-10-08 21:31             ` Steven Bosscher
  0 siblings, 1 reply; 13+ messages in thread
From: Jan Hubicka @ 2012-10-06 15:56 UTC (permalink / raw)
  To: Jan Hubicka; +Cc: Steven Bosscher, gcc-patches

Hi,
does this look better?  Moving it to cfg.c would require importing tree-pass.h
and rtl.h, which is not cool either.  predict.c does all of these.
Obviously it can also go to a separate file, if preferred.

If it looks fine to you, I will commit it after testing.

Honza

Index: tree-pass.h
===================================================================
*** tree-pass.h	(revision 192158)
--- tree-pass.h	(working copy)
*************** extern void do_per_function_toporder (vo
*** 547,551 ****
--- 547,553 ----
  extern void disable_pass (const char *);
  extern void enable_pass (const char *);
  extern void dump_passes (void);
+ extern struct opt_pass **passes_by_id;
+ extern int passes_by_id_size;
  
  #endif /* GCC_TREE_PASS_H */
Index: predict.c
===================================================================
*** predict.c	(revision 192158)
--- predict.c	(working copy)
*************** rebuild_frequencies (void)
*** 2799,2801 ****
--- 2799,3005 ----
      gcc_unreachable ();
    timevar_pop (TV_REBUILD_FREQUENCIES);
  }
+ 
+ /* Make statistic about profile consistency.  */
+ 
+ struct profile_record
+ {
+   int num_mismatched_freq_in[2];
+   int num_mismatched_freq_out[2];
+   int num_mismatched_count_in[2];
+   int num_mismatched_count_out[2];
+   bool run;
+   gcov_type time[2];
+   int size[2];
+ };
+ 
+ static struct profile_record *profile_record;
+ 
+ static void
+ check_profile_consistency (int index, int subpass, bool run)
+ {
+   basic_block bb;
+   edge_iterator ei;
+   edge e;
+   int sum;
+   gcov_type lsum;
+ 
+   if (index == -1)
+     return;
+   if (!profile_record)
+     profile_record = XCNEWVEC (struct profile_record,
+ 			       passes_by_id_size);
+   gcc_assert (index < passes_by_id_size && index >= 0);
+   gcc_assert (subpass < 2);
+   profile_record[index].run |= run;
+ 
+   FOR_ALL_BB (bb)
+    {
+       if (bb != EXIT_BLOCK_PTR_FOR_FUNCTION (cfun)
+ 	  && profile_status != PROFILE_ABSENT)
+ 	{
+ 	  sum = 0;
+ 	  FOR_EACH_EDGE (e, ei, bb->succs)
+ 	    sum += e->probability;
+ 	  if (EDGE_COUNT (bb->succs) && abs (sum - REG_BR_PROB_BASE) > 100)
+ 	    profile_record[index].num_mismatched_freq_out[subpass]++;
+ 	  lsum = 0;
+ 	  FOR_EACH_EDGE (e, ei, bb->succs)
+ 	    lsum += e->count;
+ 	  if (EDGE_COUNT (bb->succs)
+ 	      && (lsum - bb->count > 100 || lsum - bb->count < -100))
+ 	    profile_record[index].num_mismatched_count_out[subpass]++;
+ 	}
+       if (bb != ENTRY_BLOCK_PTR_FOR_FUNCTION (cfun)
+ 	  && profile_status != PROFILE_ABSENT)
+ 	{
+ 	  sum = 0;
+ 	  FOR_EACH_EDGE (e, ei, bb->preds)
+ 	    sum += EDGE_FREQUENCY (e);
+ 	  if (abs (sum - bb->frequency) > 100
+ 	      || (MAX (sum, bb->frequency) > 10
+ 		  && abs ((sum - bb->frequency) * 100 / (MAX (sum, bb->frequency) + 1)) > 10))
+ 	    profile_record[index].num_mismatched_freq_in[subpass]++;
+ 	  lsum = 0;
+ 	  FOR_EACH_EDGE (e, ei, bb->preds)
+ 	    lsum += e->count;
+ 	  if (lsum - bb->count > 100 || lsum - bb->count < -100)
+ 	    profile_record[index].num_mismatched_count_in[subpass]++;
+ 	}
+       if (bb == ENTRY_BLOCK_PTR_FOR_FUNCTION (cfun)
+ 	  || bb == EXIT_BLOCK_PTR_FOR_FUNCTION (cfun))
+ 	continue;
+       if ((cfun && (cfun->curr_properties & PROP_trees)))
+ 	{
+ 	  gimple_stmt_iterator i;
+ 
+ 	  for (i = gsi_start_bb (bb); !gsi_end_p (i); gsi_next (&i))
+ 	    {
+ 	      profile_record[index].size[subpass]
+ 		 += estimate_num_insns (gsi_stmt (i), &eni_size_weights);
+ 	      if (profile_status == PROFILE_READ)
+ 		profile_record[index].time[subpass]
+ 		   += estimate_num_insns (gsi_stmt (i),
+ 					  &eni_time_weights) * bb->count;
+ 	      else if (profile_status == PROFILE_GUESSED)
+ 		profile_record[index].time[subpass]
+ 		   += estimate_num_insns (gsi_stmt (i),
+ 					  &eni_time_weights) * bb->frequency;
+ 	    }
+ 	}
+       else if (cfun && (cfun->curr_properties & PROP_rtl))
+ 	{
+ 	  rtx insn;
+ 	  for (insn = NEXT_INSN (BB_HEAD (bb)); insn && insn != NEXT_INSN (BB_END (bb));
+ 	       insn = NEXT_INSN (insn))
+ 	    if (INSN_P (insn))
+ 	      {
+ 		profile_record[index].size[subpass]
+ 		   += insn_rtx_cost (PATTERN (insn), false);
+ 		if (profile_status == PROFILE_READ)
+ 		  profile_record[index].time[subpass]
+ 		     += insn_rtx_cost (PATTERN (insn), true) * bb->count;
+ 		else if (profile_status == PROFILE_GUESSED)
+ 		  profile_record[index].time[subpass]
+ 		     += insn_rtx_cost (PATTERN (insn), true) * bb->frequency;
+ 	      }
+ 	}
+    }
+ }
+ 
+ /* Output profile consistency.  */
+ 
+ void
+ dump_profile_report (void)
+ {
+   int i, j;
+   int last_freq_in = 0, last_count_in = 0, last_freq_out = 0, last_count_out = 0;
+   gcov_type last_time = 0, last_size = 0;
+   double rel_time_change, rel_size_change;
+   int last_reported = 0;
+ 
+   if (!profile_record)
+     return;
+   fprintf (stderr, "\nProfile consistency report:\n\n");
+   fprintf (stderr, "Pass name                        |mismatch in |mismated out|Overall\n");
+   fprintf (stderr, "                                 |freq count  |freq count  |size      time\n");
+ 	   
+   for (i = 0; i < passes_by_id_size; i++)
+     for (j = 0 ; j < 2; j++)
+       if (profile_record[i].run)
+ 	{
+ 	  if (last_time)
+ 	    rel_time_change = (profile_record[i].time[j]
+ 			       - (double)last_time) * 100 / (double)last_time;
+ 	  else
+ 	    rel_time_change = 0;
+ 	  if (last_size)
+ 	    rel_size_change = (profile_record[i].size[j]
+ 			       - (double)last_size) * 100 / (double)last_size;
+ 	  else
+ 	    rel_size_change = 0;
+ 
+ 	  if (profile_record[i].num_mismatched_freq_in[j] != last_freq_in
+ 	      || profile_record[i].num_mismatched_freq_out[j] != last_freq_out
+ 	      || profile_record[i].num_mismatched_count_in[j] != last_count_in
+ 	      || profile_record[i].num_mismatched_count_out[j] != last_count_out
+ 	      || rel_time_change || rel_size_change)
+ 	    {
+ 	      last_reported = i;
+               fprintf (stderr, "%-20s %s",
+ 		       passes_by_id [i]->name,
+ 		       j ? "(after TODO)" : "            ");
+ 	      if (profile_record[i].num_mismatched_freq_in[j] != last_freq_in)
+ 		fprintf (stderr, "| %+5i",
+ 		         profile_record[i].num_mismatched_freq_in[j]
+ 			  - last_freq_in);
+ 	      else
+ 		fprintf (stderr, "|      ");
+ 	      if (profile_record[i].num_mismatched_count_in[j] != last_count_in)
+ 		fprintf (stderr, " %+5i",
+ 		         profile_record[i].num_mismatched_count_in[j]
+ 			  - last_count_in);
+ 	      else
+ 		fprintf (stderr, "      ");
+ 	      if (profile_record[i].num_mismatched_freq_out[j] != last_freq_out)
+ 		fprintf (stderr, "| %+5i",
+ 		         profile_record[i].num_mismatched_freq_out[j]
+ 			  - last_freq_out);
+ 	      else
+ 		fprintf (stderr, "|      ");
+ 	      if (profile_record[i].num_mismatched_count_out[j] != last_count_out)
+ 		fprintf (stderr, " %+5i",
+ 		         profile_record[i].num_mismatched_count_out[j]
+ 			  - last_count_out);
+ 	      else
+ 		fprintf (stderr, "      ");
+ 
+ 	      /* Size/time units change across gimple and RTL.  */
+ 	      if (i == pass_expand.pass.static_pass_number)
+ 		fprintf (stderr, "|----------");
+ 	      else
+ 		{
+ 		  if (rel_size_change)
+ 		    fprintf (stderr, "| %+8.4f%%", rel_size_change);
+ 		  else
+ 		    fprintf (stderr, "|          ");
+ 		  if (rel_time_change)
+ 		    fprintf (stderr, " %+8.4f%%", rel_time_change);
+ 		}
+ 	      fprintf (stderr, "\n");
+ 	      last_freq_in = profile_record[i].num_mismatched_freq_in[j];
+ 	      last_freq_out = profile_record[i].num_mismatched_freq_out[j];
+ 	      last_count_in = profile_record[i].num_mismatched_count_in[j];
+ 	      last_count_out = profile_record[i].num_mismatched_count_out[j];
+ 	    }
+ 	  else if (j && last_reported != i)
+ 	    {
+ 	      last_reported = i;
+               fprintf (stderr, "%-20s ------------|            |            |\n",
+ 		       passes_by_id [i]->name);
+ 	    }
+ 	  last_time = profile_record[i].time[j];
+ 	  last_size = profile_record[i].size[j];
+ 	}
+ }
+ 
Index: basic-block.h
===================================================================
*** basic-block.h	(revision 192158)
--- basic-block.h	(working copy)
*************** extern void alloc_aux_for_edge (edge, in
*** 741,746 ****
--- 741,747 ----
  extern void alloc_aux_for_edges (int);
  extern void clear_aux_for_edges (void);
  extern void free_aux_for_edges (void);
+ extern void check_profile_consistency (int, int, bool);
  
  /* In cfganal.c  */
  extern void find_unreachable_blocks (void);
Index: passes.c
===================================================================
*** passes.c	(revision 192164)
--- passes.c	(working copy)
*************** execute_function_dump (void *data ATTRIB
*** 1778,1986 ****
      }
  }
  
- /* Make statistic about profile consistency.  */
- 
- struct profile_record
- {
-   int num_mismatched_freq_in[2];
-   int num_mismatched_freq_out[2];
-   int num_mismatched_count_in[2];
-   int num_mismatched_count_out[2];
-   bool run;
-   gcov_type time[2];
-   int size[2];
- };
- 
- static struct profile_record *profile_record;
- 
- static void
- check_profile_consistency (int index, int subpass, bool run)
- {
-   basic_block bb;
-   edge_iterator ei;
-   edge e;
-   int sum;
-   gcov_type lsum;
- 
-   if (index == -1)
-     return;
-   if (!profile_record)
-     profile_record = XCNEWVEC (struct profile_record,
- 			       passes_by_id_size);
-   gcc_assert (index < passes_by_id_size && index >= 0);
-   gcc_assert (subpass < 2);
-   profile_record[index].run |= run;
- 
-   FOR_ALL_BB (bb)
-    {
-       if (bb != EXIT_BLOCK_PTR_FOR_FUNCTION (cfun)
- 	  && profile_status != PROFILE_ABSENT)
- 	{
- 	  sum = 0;
- 	  FOR_EACH_EDGE (e, ei, bb->succs)
- 	    sum += e->probability;
- 	  if (EDGE_COUNT (bb->succs) && abs (sum - REG_BR_PROB_BASE) > 100)
- 	    profile_record[index].num_mismatched_freq_out[subpass]++;
- 	  lsum = 0;
- 	  FOR_EACH_EDGE (e, ei, bb->succs)
- 	    lsum += e->count;
- 	  if (EDGE_COUNT (bb->succs)
- 	      && (lsum - bb->count > 100 || lsum - bb->count < -100))
- 	    profile_record[index].num_mismatched_count_out[subpass]++;
- 	}
-       if (bb != ENTRY_BLOCK_PTR_FOR_FUNCTION (cfun)
- 	  && profile_status != PROFILE_ABSENT)
- 	{
- 	  sum = 0;
- 	  FOR_EACH_EDGE (e, ei, bb->preds)
- 	    sum += EDGE_FREQUENCY (e);
- 	  if (abs (sum - bb->frequency) > 100
- 	      || (MAX (sum, bb->frequency) > 10
- 		  && abs ((sum - bb->frequency) * 100 / (MAX (sum, bb->frequency) + 1)) > 10))
- 	    profile_record[index].num_mismatched_freq_in[subpass]++;
- 	  lsum = 0;
- 	  FOR_EACH_EDGE (e, ei, bb->preds)
- 	    lsum += e->count;
- 	  if (lsum - bb->count > 100 || lsum - bb->count < -100)
- 	    profile_record[index].num_mismatched_count_in[subpass]++;
- 	}
-       if (bb == ENTRY_BLOCK_PTR_FOR_FUNCTION (cfun)
- 	  || bb == EXIT_BLOCK_PTR_FOR_FUNCTION (cfun))
- 	continue;
-       if ((cfun && (cfun->curr_properties & PROP_trees)))
- 	{
- 	  gimple_stmt_iterator i;
- 
- 	  for (i = gsi_start_bb (bb); !gsi_end_p (i); gsi_next (&i))
- 	    {
- 	      profile_record[index].size[subpass]
- 		 += estimate_num_insns (gsi_stmt (i), &eni_size_weights);
- 	      if (profile_status == PROFILE_READ)
- 		profile_record[index].time[subpass]
- 		   += estimate_num_insns (gsi_stmt (i),
- 					  &eni_time_weights) * bb->count;
- 	      else if (profile_status == PROFILE_GUESSED)
- 		profile_record[index].time[subpass]
- 		   += estimate_num_insns (gsi_stmt (i),
- 					  &eni_time_weights) * bb->frequency;
- 	    }
- 	}
-       else if (cfun && (cfun->curr_properties & PROP_rtl))
- 	{
- 	  rtx insn;
- 	  for (insn = NEXT_INSN (BB_HEAD (bb)); insn && insn != NEXT_INSN (BB_END (bb));
- 	       insn = NEXT_INSN (insn))
- 	    if (INSN_P (insn))
- 	      {
- 		profile_record[index].size[subpass]
- 		   += insn_rtx_cost (PATTERN (insn), false);
- 		if (profile_status == PROFILE_READ)
- 		  profile_record[index].time[subpass]
- 		     += insn_rtx_cost (PATTERN (insn), true) * bb->count;
- 		else if (profile_status == PROFILE_GUESSED)
- 		  profile_record[index].time[subpass]
- 		     += insn_rtx_cost (PATTERN (insn), true) * bb->frequency;
- 	      }
- 	}
-    }
- }
- 
- /* Output profile consistency.  */
- 
- void
- dump_profile_report (void)
- {
-   int i, j;
-   int last_freq_in = 0, last_count_in = 0, last_freq_out = 0, last_count_out = 0;
-   gcov_type last_time = 0, last_size = 0;
-   double rel_time_change, rel_size_change;
-   int last_reported = 0;
- 
-   if (!profile_record)
-     return;
-   fprintf (stderr, "\nProfile consistency report:\n\n");
-   fprintf (stderr, "Pass name                        |mismatch in |mismated out|Overall\n");
-   fprintf (stderr, "                                 |freq count  |freq count  |size      time\n");
- 	   
-   for (i = 0; i < passes_by_id_size; i++)
-     for (j = 0 ; j < 2; j++)
-       if (profile_record[i].run)
- 	{
- 	  if (last_time)
- 	    rel_time_change = (profile_record[i].time[j]
- 			       - (double)last_time) * 100 / (double)last_time;
- 	  else
- 	    rel_time_change = 0;
- 	  if (last_size)
- 	    rel_size_change = (profile_record[i].size[j]
- 			       - (double)last_size) * 100 / (double)last_size;
- 	  else
- 	    rel_size_change = 0;
- 
- 	  if (profile_record[i].num_mismatched_freq_in[j] != last_freq_in
- 	      || profile_record[i].num_mismatched_freq_out[j] != last_freq_out
- 	      || profile_record[i].num_mismatched_count_in[j] != last_count_in
- 	      || profile_record[i].num_mismatched_count_out[j] != last_count_out
- 	      || rel_time_change || rel_size_change)
- 	    {
- 	      last_reported = i;
-               fprintf (stderr, "%-20s %s",
- 		       passes_by_id [i]->name,
- 		       j ? "(after TODO)" : "            ");
- 	      if (profile_record[i].num_mismatched_freq_in[j] != last_freq_in)
- 		fprintf (stderr, "| %+5i",
- 		         profile_record[i].num_mismatched_freq_in[j]
- 			  - last_freq_in);
- 	      else
- 		fprintf (stderr, "|      ");
- 	      if (profile_record[i].num_mismatched_count_in[j] != last_count_in)
- 		fprintf (stderr, " %+5i",
- 		         profile_record[i].num_mismatched_count_in[j]
- 			  - last_count_in);
- 	      else
- 		fprintf (stderr, "      ");
- 	      if (profile_record[i].num_mismatched_freq_out[j] != last_freq_out)
- 		fprintf (stderr, "| %+5i",
- 		         profile_record[i].num_mismatched_freq_out[j]
- 			  - last_freq_out);
- 	      else
- 		fprintf (stderr, "|      ");
- 	      if (profile_record[i].num_mismatched_count_out[j] != last_count_out)
- 		fprintf (stderr, " %+5i",
- 		         profile_record[i].num_mismatched_count_out[j]
- 			  - last_count_out);
- 	      else
- 		fprintf (stderr, "      ");
- 
- 	      /* Size/time units change across gimple and RTL.  */
- 	      if (i == pass_expand.pass.static_pass_number)
- 		fprintf (stderr, "|----------");
- 	      else
- 		{
- 		  if (rel_size_change)
- 		    fprintf (stderr, "| %+8.4f%%", rel_size_change);
- 		  else
- 		    fprintf (stderr, "|          ");
- 		  if (rel_time_change)
- 		    fprintf (stderr, " %+8.4f%%", rel_time_change);
- 		}
- 	      fprintf (stderr, "\n");
- 	      last_freq_in = profile_record[i].num_mismatched_freq_in[j];
- 	      last_freq_out = profile_record[i].num_mismatched_freq_out[j];
- 	      last_count_in = profile_record[i].num_mismatched_count_in[j];
- 	      last_count_out = profile_record[i].num_mismatched_count_out[j];
- 	    }
- 	  else if (j && last_reported != i)
- 	    {
- 	      last_reported = i;
-               fprintf (stderr, "%-20s ------------|            |            |\n",
- 		       passes_by_id [i]->name);
- 	    }
- 	  last_time = profile_record[i].time[j];
- 	  last_size = profile_record[i].size[j];
- 	}
- }
- 
  /* Perform all TODO actions that ought to be done on each function.  */
  
  static void
--- 1778,1783 ----
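
The size and time columns in the report loop above are relative changes against the previously seen value, and are treated as zero when there is no previous value. A minimal standalone sketch of that arithmetic follows; `rel_change` and `print_rel_change` are hypothetical names used only for illustration and are not part of the patch.

```c
#include <stdio.h>

/* Hypothetical helper, not part of the patch: percentage change from
   LAST to CUR.  As in the report loop, the change is defined as 0 when
   there is no previous value to compare against.  */
static double
rel_change (long long cur, long long last)
{
  if (!last)
    return 0;
  return ((double) cur - (double) last) * 100 / (double) last;
}

/* Print one size/time cell the way the report does: a signed percentage
   when something changed, blank padding of the same width otherwise.  */
static void
print_rel_change (FILE *f, long long cur, long long last)
{
  double change = rel_change (cur, last);
  if (change)
    fprintf (f, "| %+8.4f%%", change);
  else
    fprintf (f, "|          ");
}
```

Calling `print_rel_change (stderr, 110, 100)` would print a growth of ten percent, while an unchanged value leaves the cell empty so that only passes that changed something stand out.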

* Re: Profile housekeeping 6/n (-fprofile-consistency-report)
  2012-10-06 15:56           ` Jan Hubicka
@ 2012-10-08 21:31             ` Steven Bosscher
  2012-10-09  8:45               ` Jan Hubicka
  0 siblings, 1 reply; 13+ messages in thread
From: Steven Bosscher @ 2012-10-08 21:31 UTC (permalink / raw)
  To: Jan Hubicka; +Cc: gcc-patches

[-- Attachment #1: Type: text/plain, Size: 951 bytes --]

On Sat, Oct 6, 2012 at 5:56 PM, Jan Hubicka <hubicka@ucw.cz> wrote:
> Hi,
> does this look better? Moving to cfg.c would mean importing tree-pass.h and rtl.h,
> which is not cool either. predict.c does all of these.
> Obviously can also go to a separate file, if preferred.

Attached is how I would do it. What do you think about this?

Ciao!
Steven

        * basic-block.h (profile_record): New struct, moved from passes.c.
        * cfghooks.h (struct cfg_hooks) <account_profile_record>: New hook.
        (account_profile_record): New prototype.
        * cfghooks.c (account_profile_record): New function.
        * tree-cfg.c (gimple_account_profile_record): New function
        (gimple_cfg_hooks): Add it.
        * cfgrtl.c (rtl_account_profile_record): New function
        (rtl_cfg_hooks, cfg_layout_rtl_cfg_hooks): Add it.
        * passes.c (check_profile_consistency): Simplify.  Move IR-dependent
        code around using cfghooks machinery.
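
The idea behind the ChangeLog above is that a generic driver walks the blocks and delegates the IR-specific accounting to a function pointer stored in the active hook table. The following is only a toy sketch of that dispatch pattern, assuming simplified stand-in types; every `toy_` name is hypothetical and none of this is GCC code.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-ins for basic_block and struct profile_record.  */
struct toy_block { int n_insns; long count; };
struct toy_record { int size[2]; long time[2]; };

/* A hook table: one function pointer per IR-specific operation.  */
struct toy_hooks
{
  const char *name;
  void (*account) (const struct toy_block *, int, struct toy_record *);
};

/* "GIMPLE" flavour: one size unit per statement.  */
static void
toy_gimple_account (const struct toy_block *bb, int after_pass,
		    struct toy_record *r)
{
  r->size[after_pass] += bb->n_insns;
  r->time[after_pass] += bb->n_insns * bb->count;
}

/* "RTL" flavour: a different cost unit, so the totals are not comparable
   with the GIMPLE ones.  */
static void
toy_rtl_account (const struct toy_block *bb, int after_pass,
		 struct toy_record *r)
{
  r->size[after_pass] += 2 * bb->n_insns;
  r->time[after_pass] += 2 * bb->n_insns * bb->count;
}

static const struct toy_hooks toy_gimple_hooks = { "gimple", toy_gimple_account };
static const struct toy_hooks toy_rtl_hooks = { "rtl", toy_rtl_account };

/* Generic driver: IR-independent, it only knows about the hook table.  */
static void
toy_account_all (const struct toy_hooks *hooks, const struct toy_block *bbs,
		 size_t n, int after_pass, struct toy_record *r)
{
  size_t i;
  assert (hooks->account);
  for (i = 0; i < n; i++)
    hooks->account (&bbs[i], after_pass, r);
}
```

With this shape the driver needs no IR-specific headers; switching between the two flavours is a matter of passing a different table.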

[-- Attachment #2: profile_consistency.diff --]
[-- Type: application/octet-stream, Size: 12640 bytes --]

	* basic-block.h (profile_record): New struct, moved from passes.c.
	* cfghooks.h (struct cfg_hooks) <account_profile_record>: New hook.
	(account_profile_record): New prototype.
	* cfghooks.c (account_profile_record): New function.
	* tree-cfg.c (gimple_account_profile_record): New function
	(gimple_cfg_hooks): Add it.
	* cfgrtl.c (rtl_account_profile_record): New function
	(rtl_cfg_hooks, cfg_layout_rtl_cfg_hooks): Add it.
	* passes.c (check_profile_consistency): Simplify.  Move IR-dependent
	code around using cfghooks machinery.

Index: basic-block.h
===================================================================
--- basic-block.h	(revision 192222)
+++ basic-block.h	(working copy)
@@ -101,6 +101,37 @@ typedef struct gcov_working_set_info
   gcov_type min_counter;
 } gcov_working_set_t;
 
+/* Structure to gather statistics about profile consistency, per pass.
+   An array of this structure, indexed by pass static number, is allocated
+   in passes.c.  The structure is defined here so that different CFG modes
+   can do their book-keeping via CFG hooks.
+
+   For every field[2], field[0] is the count before the pass runs, and
+   field[1] is the post-pass count.  This allows us to monitor the effect
+   of each individual pass on the profile consistency.
+   
+   This structure is not supposed to be used by anything other than passes.c
+   and one CFG hook per CFG mode.  */
+struct profile_record
+{
+  /* The number of basic blocks where sum(freq) of the block's predecessors
+     doesn't match reasonably well with the incoming frequency.  */
+  int num_mismatched_freq_in[2];
+  /* Likewise for a basic block's successors.  */
+  int num_mismatched_freq_out[2];
+  /* The number of basic blocks where sum(count) of the block's predecessors
+     doesn't match reasonably well with the block's count.  */
+  int num_mismatched_count_in[2];
+  /* Likewise for a basic block's successors.  */
+  int num_mismatched_count_out[2];
+  /* A weighted cost of the run-time of the function body.  */
+  gcov_type time[2];
+  /* A weighted cost of the size of the function body.  */
+  int size[2];
+  /* True iff this pass actually was run.  */
+  bool run;
+};
+
 /* Declared in cfgloop.h.  */
 struct loop;
 
Index: cfghooks.h
===================================================================
--- cfghooks.h	(revision 192222)
+++ cfghooks.h	(working copy)
@@ -145,6 +145,9 @@ struct cfg_hooks
   /* Split a basic block if it ends with a conditional branch and if
      the other part of the block is not empty.  */
   basic_block (*split_block_before_cond_jump) (basic_block);
+
+  /* Do book-keeping of a basic block for the profile consistency checker.  */
+  void (*account_profile_record) (basic_block, int, struct profile_record *);
 };
 
 extern void verify_flow_info (void);
@@ -198,6 +201,8 @@ extern void copy_bbs (basic_block *, unsigned, bas
 		      edge *, unsigned, edge *, struct loop *,
 		      basic_block);
 
+void account_profile_record (struct profile_record *, int);
+
 extern void cfg_layout_initialize (unsigned int);
 extern void cfg_layout_finalize (void);
 
Index: cfghooks.c
===================================================================
--- cfghooks.c	(revision 192222)
+++ cfghooks.c	(working copy)
@@ -1324,3 +1324,57 @@ split_block_before_cond_jump (basic_block bb)
   return cfg_hooks->split_block_before_cond_jump (bb);
 }
 
+/* Work-horse for passes.c:check_profile_consistency.
+   Do book-keeping of the CFG for the profile consistency checker.
+   If AFTER_PASS is 0, do pre-pass accounting, or if AFTER_PASS is 1
+   then do post-pass accounting.  Store the counting in RECORD.  */
+
+void
+account_profile_record (struct profile_record *record, int after_pass)
+{
+  basic_block bb;
+  edge_iterator ei;
+  edge e;
+  int sum;
+  gcov_type lsum;
+
+  FOR_ALL_BB (bb)
+   {
+      if (bb != EXIT_BLOCK_PTR_FOR_FUNCTION (cfun)
+	  && profile_status != PROFILE_ABSENT)
+	{
+	  sum = 0;
+	  FOR_EACH_EDGE (e, ei, bb->succs)
+	    sum += e->probability;
+	  if (EDGE_COUNT (bb->succs) && abs (sum - REG_BR_PROB_BASE) > 100)
+	    record->num_mismatched_freq_out[after_pass]++;
+	  lsum = 0;
+	  FOR_EACH_EDGE (e, ei, bb->succs)
+	    lsum += e->count;
+	  if (EDGE_COUNT (bb->succs)
+	      && (lsum - bb->count > 100 || lsum - bb->count < -100))
+	    record->num_mismatched_count_out[after_pass]++;
+	}
+      if (bb != ENTRY_BLOCK_PTR_FOR_FUNCTION (cfun)
+	  && profile_status != PROFILE_ABSENT)
+	{
+	  sum = 0;
+	  FOR_EACH_EDGE (e, ei, bb->preds)
+	    sum += EDGE_FREQUENCY (e);
+	  if (abs (sum - bb->frequency) > 100
+	      || (MAX (sum, bb->frequency) > 10
+		  && abs ((sum - bb->frequency) * 100 / (MAX (sum, bb->frequency) + 1)) > 10))
+	    record->num_mismatched_freq_in[after_pass]++;
+	  lsum = 0;
+	  FOR_EACH_EDGE (e, ei, bb->preds)
+	    lsum += e->count;
+	  if (lsum - bb->count > 100 || lsum - bb->count < -100)
+	    record->num_mismatched_count_in[after_pass]++;
+	}
+      if (bb == ENTRY_BLOCK_PTR_FOR_FUNCTION (cfun)
+	  || bb == EXIT_BLOCK_PTR_FOR_FUNCTION (cfun))
+	continue;
+      gcc_assert (cfg_hooks->account_profile_record);
+      cfg_hooks->account_profile_record (bb, after_pass, record);
+   }
+}
Index: tree-cfg.c
===================================================================
--- tree-cfg.c	(revision 192222)
+++ tree-cfg.c	(working copy)
@@ -7591,6 +7591,30 @@ gimple_lv_add_condition_to_bb (basic_block first_h
   e0->flags |= EDGE_FALSE_VALUE;
 }
 
+
+/* Do book-keeping of basic block BB for the profile consistency checker.
+   If AFTER_PASS is 0, do pre-pass accounting, or if AFTER_PASS is 1
+   then do post-pass accounting.  Store the counting in RECORD.  */
+static void
+gimple_account_profile_record (basic_block bb, int after_pass,
+			       struct profile_record *record)
+{
+  gimple_stmt_iterator i;
+  for (i = gsi_start_bb (bb); !gsi_end_p (i); gsi_next (&i))
+    {
+      record->size[after_pass]
+	+= estimate_num_insns (gsi_stmt (i), &eni_size_weights);
+      if (profile_status == PROFILE_READ)
+	record->time[after_pass]
+	  += estimate_num_insns (gsi_stmt (i),
+				 &eni_time_weights) * bb->count;
+      else if (profile_status == PROFILE_GUESSED)
+	record->time[after_pass]
+	  += estimate_num_insns (gsi_stmt (i),
+				 &eni_time_weights) * bb->frequency;
+    }
+}
+
 struct cfg_hooks gimple_cfg_hooks = {
   "gimple",
   gimple_verify_flow_info,
@@ -7624,6 +7648,7 @@ struct cfg_hooks gimple_cfg_hooks = {
   flush_pending_stmts, 		/* flush_pending_stmts */  
   gimple_empty_block_p,           /* block_empty_p */
   gimple_split_block_before_cond_jump, /* split_block_before_cond_jump */
+  gimple_account_profile_record,
 };
 
 
Index: cfgrtl.c
===================================================================
--- cfgrtl.c	(revision 192222)
+++ cfgrtl.c	(working copy)
@@ -4452,13 +4452,35 @@ rtl_duplicate_bb (basic_block bb)
   return bb;
 }
 
+/* Do book-keeping of basic block BB for the profile consistency checker.
+   If AFTER_PASS is 0, do pre-pass accounting, or if AFTER_PASS is 1
+   then do post-pass accounting.  Store the counting in RECORD.  */
+static void
+rtl_account_profile_record (basic_block bb, int after_pass,
+			    struct profile_record *record)
+{
+  rtx insn;
+  FOR_BB_INSNS (bb, insn)
+    if (INSN_P (insn))
+      {
+	record->size[after_pass]
+	  += insn_rtx_cost (PATTERN (insn), false);
+	if (profile_status == PROFILE_READ)
+	  record->time[after_pass]
+	    += insn_rtx_cost (PATTERN (insn), true) * bb->count;
+	else if (profile_status == PROFILE_GUESSED)
+	  record->time[after_pass]
+	    += insn_rtx_cost (PATTERN (insn), true) * bb->frequency;
+      }
+}
+
 /* Implementation of CFG manipulation for linearized RTL.  */
 struct cfg_hooks rtl_cfg_hooks = {
-  "rtl",
-  rtl_verify_flow_info,
-  rtl_dump_bb,
-  rtl_create_basic_block,
-  rtl_redirect_edge_and_branch,
+    "rtl",
+    rtl_verify_flow_info,
+    rtl_dump_bb,
+    rtl_create_basic_block,
+    rtl_redirect_edge_and_branch,
   rtl_redirect_edge_and_branch_force,
   rtl_can_remove_branch_p,
   rtl_delete_block,
@@ -4486,6 +4508,7 @@ struct cfg_hooks rtl_cfg_hooks = {
   NULL, /* flush_pending_stmts */
   rtl_block_empty_p, /* block_empty_p */
   rtl_split_block_before_cond_jump, /* split_block_before_cond_jump */
+  rtl_account_profile_record,
 };
 
 /* Implementation of CFG manipulation for cfg layout RTL, where
@@ -4526,6 +4549,7 @@ struct cfg_hooks cfg_layout_rtl_cfg_hooks = {
   NULL, /* flush_pending_stmts */  
   rtl_block_empty_p, /* block_empty_p */
   rtl_split_block_before_cond_jump, /* split_block_before_cond_jump */
+  rtl_account_profile_record,
 };
 
 #include "gt-cfgrtl.h"
Index: passes.c
===================================================================
--- passes.c	(revision 192222)
+++ passes.c	(working copy)
@@ -1778,30 +1778,16 @@ execute_function_dump (void *data ATTRIBUTE_UNUSED
     }
 }
 
-/* Make statistic about profile consistency.  */
-
-struct profile_record
-{
-  int num_mismatched_freq_in[2];
-  int num_mismatched_freq_out[2];
-  int num_mismatched_count_in[2];
-  int num_mismatched_count_out[2];
-  bool run;
-  gcov_type time[2];
-  int size[2];
-};
-
 static struct profile_record *profile_record;
 
+/* Do profile consistency book-keeping for the pass with static number INDEX.
+   If SUBPASS is zero, we run _before_ the pass, and if SUBPASS is one, then
+   we run _after_ the pass.  RUN is true if the pass really runs, or FALSE
+   if we are only book-keeping on passes that may have selectively disabled
+   themselves on a given function.  */
 static void
 check_profile_consistency (int index, int subpass, bool run)
 {
-  basic_block bb;
-  edge_iterator ei;
-  edge e;
-  int sum;
-  gcov_type lsum;
-
   if (index == -1)
     return;
   if (!profile_record)
@@ -1810,79 +1796,7 @@ check_profile_consistency (int index, int subpass,
   gcc_assert (index < passes_by_id_size && index >= 0);
   gcc_assert (subpass < 2);
   profile_record[index].run |= run;
-
-  FOR_ALL_BB (bb)
-   {
-      if (bb != EXIT_BLOCK_PTR_FOR_FUNCTION (cfun)
-	  && profile_status != PROFILE_ABSENT)
-	{
-	  sum = 0;
-	  FOR_EACH_EDGE (e, ei, bb->succs)
-	    sum += e->probability;
-	  if (EDGE_COUNT (bb->succs) && abs (sum - REG_BR_PROB_BASE) > 100)
-	    profile_record[index].num_mismatched_freq_out[subpass]++;
-	  lsum = 0;
-	  FOR_EACH_EDGE (e, ei, bb->succs)
-	    lsum += e->count;
-	  if (EDGE_COUNT (bb->succs)
-	      && (lsum - bb->count > 100 || lsum - bb->count < -100))
-	    profile_record[index].num_mismatched_count_out[subpass]++;
-	}
-      if (bb != ENTRY_BLOCK_PTR_FOR_FUNCTION (cfun)
-	  && profile_status != PROFILE_ABSENT)
-	{
-	  sum = 0;
-	  FOR_EACH_EDGE (e, ei, bb->preds)
-	    sum += EDGE_FREQUENCY (e);
-	  if (abs (sum - bb->frequency) > 100
-	      || (MAX (sum, bb->frequency) > 10
-		  && abs ((sum - bb->frequency) * 100 / (MAX (sum, bb->frequency) + 1)) > 10))
-	    profile_record[index].num_mismatched_freq_in[subpass]++;
-	  lsum = 0;
-	  FOR_EACH_EDGE (e, ei, bb->preds)
-	    lsum += e->count;
-	  if (lsum - bb->count > 100 || lsum - bb->count < -100)
-	    profile_record[index].num_mismatched_count_in[subpass]++;
-	}
-      if (bb == ENTRY_BLOCK_PTR_FOR_FUNCTION (cfun)
-	  || bb == EXIT_BLOCK_PTR_FOR_FUNCTION (cfun))
-	continue;
-      if ((cfun && (cfun->curr_properties & PROP_trees)))
-	{
-	  gimple_stmt_iterator i;
-
-	  for (i = gsi_start_bb (bb); !gsi_end_p (i); gsi_next (&i))
-	    {
-	      profile_record[index].size[subpass]
-		 += estimate_num_insns (gsi_stmt (i), &eni_size_weights);
-	      if (profile_status == PROFILE_READ)
-		profile_record[index].time[subpass]
-		   += estimate_num_insns (gsi_stmt (i),
-					  &eni_time_weights) * bb->count;
-	      else if (profile_status == PROFILE_GUESSED)
-		profile_record[index].time[subpass]
-		   += estimate_num_insns (gsi_stmt (i),
-					  &eni_time_weights) * bb->frequency;
-	    }
-	}
-      else if (cfun && (cfun->curr_properties & PROP_rtl))
-	{
-	  rtx insn;
-	  for (insn = NEXT_INSN (BB_HEAD (bb)); insn && insn != NEXT_INSN (BB_END (bb));
-	       insn = NEXT_INSN (insn))
-	    if (INSN_P (insn))
-	      {
-		profile_record[index].size[subpass]
-		   += insn_rtx_cost (PATTERN (insn), false);
-		if (profile_status == PROFILE_READ)
-		  profile_record[index].time[subpass]
-		     += insn_rtx_cost (PATTERN (insn), true) * bb->count;
-		else if (profile_status == PROFILE_GUESSED)
-		  profile_record[index].time[subpass]
-		     += insn_rtx_cost (PATTERN (insn), true) * bb->frequency;
-	      }
-	}
-   }
+  account_profile_record (&profile_record[index], subpass);
 }
 
 /* Output profile consistency.  */
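
For reference, the mismatch tests used by the accounting code can be read as two small predicates: incoming frequencies count as mismatched when they differ by more than 100 units, or by more than roughly 10% once the larger value exceeds 10, while counts use only the absolute tolerance of 100. The sketch below restates those conditions in standalone form; the function and macro names are hypothetical and do not appear in the patch.

```c
#include <stdlib.h>

#define TOY_MAX(a, b) ((a) > (b) ? (a) : (b))

/* Hypothetical helper: SUM is the summed frequency of the incoming edges,
   FREQ is the block's own frequency.  Returns nonzero on a mismatch.  */
static int
freq_in_mismatch_p (int sum, int freq)
{
  int big = TOY_MAX (sum, freq);
  return abs (sum - freq) > 100
	 || (big > 10 && abs ((sum - freq) * 100 / (big + 1)) > 10);
}

/* Hypothetical helper: SUM is the summed edge count, COUNT the block's
   count.  Only an absolute tolerance of 100 is applied.  */
static int
count_mismatch_p (long long sum, long long count)
{
  return sum - count > 100 || sum - count < -100;
}
```

Under these rules a block with frequency 1050 fed by edges summing to 1000 is still considered consistent, whereas 10 versus 20 is flagged because the relative error is large even though the absolute difference is small.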

* Re: Profile housekeeping 6/n (-fprofile-consistency-report)
  2012-10-08 21:31             ` Steven Bosscher
@ 2012-10-09  8:45               ` Jan Hubicka
  0 siblings, 0 replies; 13+ messages in thread
From: Jan Hubicka @ 2012-10-09  8:45 UTC (permalink / raw)
  To: Steven Bosscher; +Cc: Jan Hubicka, gcc-patches

> On Sat, Oct 6, 2012 at 5:56 PM, Jan Hubicka <hubicka@ucw.cz> wrote:
> > Hi,
> > does this look better? Moving to cfg.c would mean importing tree-pass.h and rtl.h,
> > which is not cool either. predict.c does all of these.
> > Obviously can also go to a separate file, if preferred.
> 
> Attached is how I would do it. What do you think about this?
> 
> Ciao!
> Steven
> 
>         * basic-block.h (profile_record): New struct, moved from passes.c.
>         * cfghooks.h (struct cfg_hooks) <account_profile_record>: New hook.
>         (account_profile_record): New prototype.
>         * cfghooks.c (account_profile_record): New function.
>         * tree-cfg.c (gimple_account_profile_record): New function
>         (gimple_cfg_hooks): Add it.
>         * cfgrtl.c (rtl_account_profile_record): New function
>         (rtl_cfg_hooks, cfg_layout_rtl_cfg_hooks): Add it.
>         * passes.c (check_profile_consistency): Simplify.  Move IR-dependent
>         code around using cfghooks machinery.

OK, I did not want to go as far as adding a hook, but why not.  Patch approved.
Thanks for looking into it!
Honza

Thread overview: 13+ messages
2012-10-04 14:01 Profile housekeeping 6/n (-fprofile-consistency-report) Jan Hubicka
2012-10-04 14:09 ` Steven Bosscher
2012-10-04 14:40   ` Jan Hubicka
2012-10-06 14:10   ` Jan Hubicka
2012-10-06 14:39     ` Steven Bosscher
2012-10-06 15:43       ` Steven Bosscher
2012-10-06 15:46         ` Jan Hubicka
2012-10-06 15:56           ` Jan Hubicka
2012-10-08 21:31             ` Steven Bosscher
2012-10-09  8:45               ` Jan Hubicka
2012-10-06 15:44       ` Jan Hubicka
2012-10-06 15:15     ` Graham Stott
2012-10-06 15:34       ` Jan Hubicka
