[PATCH] Use new dump scheme for loop unroll passes

public inbox for gcc-patches@gcc.gnu.org

* [PATCH] Use new dump scheme for loop unroll passes
@ 2012-12-14  2:16 Sharad Singhai
  2012-12-14  4:59 ` Xinliang David Li
  0 siblings, 1 reply; 4+ messages in thread
From: Sharad Singhai @ 2012-12-14  2:16 UTC
  To: gcc-patches; +Cc: Richard Biener, David Li

[-- Attachment #1: Type: text/plain, Size: 3742 bytes --]

Hi,

As per the discussion in http://gcc.gnu.org/ml/gcc/2012-12/msg00056.html,
the attached patch updates the loop unroll passes to use the new dump
infrastructure.

This patch sorts the relevant dump messages into the following
three categories:

- optimized: an optimization was successfully applied
- missed: an optimization was missed due to the described reason
- note: other relevant/detailed info during optimization. For example,
  loop unrolling computes the loop bounds and size.
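
The three categories above behave like bit flags that a filter can select
from. Below is a minimal sketch of that idea, not GCC's implementation: the
SKETCH_* names, their values, and both functions are hypothetical, and the
real MSG_* flags in dumpfile.h are not reproduced here. It assumes that an
empty category name selects everything, as the "-fopt-info-loop" sample in
this message does.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical category bits for illustration only; GCC's real MSG_*
   values are defined in dumpfile.h and are not reproduced here.  */
enum sketch_msg_kind
{
  SKETCH_MSG_OPTIMIZED = 1 << 0,  /* an optimization was applied */
  SKETCH_MSG_MISSED    = 1 << 1,  /* an optimization was missed */
  SKETCH_MSG_NOTE      = 1 << 2,  /* other detailed info */
  SKETCH_MSG_ALL       = SKETCH_MSG_OPTIMIZED | SKETCH_MSG_MISSED
                         | SKETCH_MSG_NOTE
};

/* Map a category name to the set of categories it selects.  An empty
   name selects everything (assumption, see lead-in); unknown names
   select nothing.  */
int
sketch_filter_from_name (const char *name)
{
  assert (name != NULL);
  if (name[0] == '\0')
    return SKETCH_MSG_ALL;
  if (strcmp (name, "optimized") == 0)
    return SKETCH_MSG_OPTIMIZED;
  if (strcmp (name, "missed") == 0)
    return SKETCH_MSG_MISSED;
  if (strcmp (name, "note") == 0)
    return SKETCH_MSG_NOTE;
  return 0;
}

/* Return nonzero if a message tagged KIND passes FILTER.  */
int
sketch_should_emit (int kind, int filter)
{
  return (kind & filter) != 0;
}
```

With this sketch, a message tagged SKETCH_MSG_NOTE is dropped under the
"optimized" filter and kept under the empty (all categories) filter.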

Two sample outputs from one of the gcc tests (gcc.dg/unroll_1.c) are below.

Sample 1:
--- Info about optimized loops via "-fopt-info-loop-optimized" ---
$ gcc gcc.dg/unroll_1.c -fno-diagnostics-show-caret -O2 -S \
    -fdump-rtl-loop2_unroll -funroll-loops -fopt-info-loop-optimized

Unrolled loop 1 completely (duplicated 2 times).
Exit condition of peeled iterations was eliminated.
Last iteration exit edge was proved true.
Unrolled loop 1 completely (duplicated 2 times).
Exit condition of peeled iterations was eliminated.
Last iteration exit edge was proved true.
--------------------------------

Sample 2:
--- All available loop optimization info (optimized+missed+note) via "-fopt-info-loop" ---
$ gcc gcc.dg/unroll_1.c -fno-diagnostics-show-caret -O2 -S \
    -fdump-rtl-loop2_unroll -funroll-loops -fopt-info-loop

Loop 1 iterates 2 times.
Loop 1 iterates at most 2 times.
Estimating sizes for loop 1
 BB: 4, after_exit: 0
  size:   2 if (i_1 <= 1)
   Exit condition will be eliminated in peeled copies.
 BB: 3, after_exit: 1
  size:   1 _5 = b[i_1];
  size:   1 _6 = _5 + 1;
  size:   1 a[i_1] = _6;
  size:   1 i_8 = i_1 + 1;
   Induction variable computation will be folded away.
size: 6-3, last_iteration: 2-0
  Loop size: 6
  Estimated size after unrolling: 5
Unrolled loop 1 completely (duplicated 2 times).
Exit condition of peeled iterations was eliminated.
Last iteration exit edge was proved true.
Forced exit to be taken: if (1 == 0)
Loop 1 iterates 2 times.
Loop 1 iterates at most 2 times.
Estimating sizes for loop 1
 BB: 4, after_exit: 0
  size:   2 if (i_1 <= 1)
   Exit condition will be eliminated in peeled copies.
 BB: 3, after_exit: 1
  size:   1 _4 = b[i_1];
  size:   1 _5 = _4 + 1;
  size:   1 a[i_1] = _5;
  size:   1 i_7 = i_1 + 1;
   Induction variable computation will be folded away.
size: 6-3, last_iteration: 2-0
  Loop size: 6
  Estimated size after unrolling: 5
Unrolled loop 1 completely (duplicated 2 times).
Exit condition of peeled iterations was eliminated.
Last iteration exit edge was proved true.
Forced exit to be taken: if (1 == 0)
--------------------------------

I would like to mention that this information is perhaps too verbose
and that the source location of optimized loops is not displayed. I can
add source line info (and fix up the corresponding tests) if needed, but
for now I wanted to preserve the current dump format faithfully. Perhaps
the format can be tweaked for better readability.

Note that all information dumped in response to -fopt-info is also
present in the regular dump file(s) when the corresponding dumps are
enabled. Thus, in the examples above, the loop optimization info is also
present in the *.loop2_unroll dump file, since the command line specified
a dump file via "-fdump-rtl-loop2_unroll" in addition to -fopt-info.
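
The two-destination behavior described above can be sketched as follows.
This is an illustrative stand-in, not the code in dumpfile.c: the
sketch_stream type and sketch_dump_printf function are hypothetical, and the
only detail taken from the patch is the per-stream test used by dump_rtl
(a stream receives a message when it is open and the message kind intersects
that stream's flags).

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>

/* Hypothetical stand-in for one output stream plus the message kinds it
   accepts.  GCC keeps the equivalent state in globals such as
   dump_file/pflags and alt_dump_file/alt_flags (see dump_rtl in the
   patch).  */
struct sketch_stream
{
  FILE *fp;   /* NULL when the stream is disabled */
  int flags;  /* bitmask of accepted message kinds */
};

/* Print a message of KIND on every enabled stream whose flags accept it.
   Return the number of streams written to.  */
int
sketch_dump_printf (const struct sketch_stream *dump,
                    const struct sketch_stream *alt,
                    int kind, const char *fmt, ...)
{
  const struct sketch_stream *streams[2];
  int written = 0;
  int i;

  assert (dump != NULL && alt != NULL && fmt != NULL);
  streams[0] = dump;
  streams[1] = alt;
  for (i = 0; i < 2; i++)
    if (streams[i]->fp && (kind & streams[i]->flags))
      {
        /* Restart the argument list for each stream.  */
        va_list ap;
        va_start (ap, fmt);
        vfprintf (streams[i]->fp, fmt, ap);
        va_end (ap);
        written++;
      }
  return written;
}
```

A message whose kind is accepted by both streams is written twice, once per
destination, which is why the same text shows up both on the -fopt-info
output and in the pass dump file.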

(As a side note, while doing the conversion, I found that the MSG_*
dump flags are unwieldy when combined with other flags. Perhaps these
flags should be renamed/shortened. I propose the following:
       MSG_MISSED_OPTIMIZATION  ==> MSG_MISSED
       MSG_OPTIMIZED_LOCATIONS  ==> MSG_OPTIMIZED
But that is pure renaming and can be done separately.)

I have bootstrapped and tested this patch on x86_64 and found no new
failures. Okay for trunk?

Thanks,
Sharad

[-- Attachment #2: unroll.opt_info.patch --]
[-- Type: application/octet-stream, Size: 38018 bytes --]

2012-12-13  Sharad Singhai  <singhai@google.com>

	* dumpfile.c (dump_rtl): New function.
	* dumpfile.h (dump_rtl): Add extern declaration.
	(print_rtl): Likewise.
	* tree-ssa-loop-ivcanon.c: Use dump_printf instead of fprintf, with
	appropriate categorization for -fopt-info.
	* loop-unroll.c: Likewise.

Index: dumpfile.c
===================================================================
--- dumpfile.c	(revision 194420)
+++ dumpfile.c	(working copy)
@@ -901,3 +901,14 @@ enable_rtl_dump_file (void)
 {
   return dump_enable_all (TDF_RTL | TDF_DETAILS | TDF_BLOCKS, NULL) > 0;
 }
+
+/* Print rtx on the dump streams.  */
+
+void
+dump_rtl (int dump_kind, const_rtx rtx)
+{
+  if (dump_file && (dump_kind & pflags))
+    print_rtl (dump_file, rtx);
+  if (alt_dump_file && (dump_kind & alt_flags))
+    print_rtl (alt_dump_file, rtx);
+}
Index: dumpfile.h
===================================================================
--- dumpfile.h	(revision 194420)
+++ dumpfile.h	(working copy)
@@ -136,6 +136,7 @@ extern void dump_printf (int, const char *, ...) A
 extern void dump_printf_loc (int, source_location,
                              const char *, ...) ATTRIBUTE_PRINTF_3;
 extern void dump_basic_block (int, basic_block, int);
+extern void dump_rtl (int, const_rtx);
 extern void dump_generic_expr_loc (int, source_location, int, tree);
 extern void dump_generic_expr (int, int, tree);
 extern void dump_gimple_stmt_loc (int, source_location, int, gimple, int);
@@ -145,10 +146,12 @@ extern unsigned int dump_register (const char *, c
                                    int, int);
 extern bool enable_rtl_dump_file (void);
 
-/* In combine.c  */
+/* In combine.c.  */
 extern void dump_combine_total_stats (FILE *);
-/* In cfghooks.c  */
+/* In cfghooks.c.  */
 extern void dump_bb (FILE *, basic_block, int, int);
+/* In print-rtl.c.  */
+extern void print_rtl (FILE *, const_rtx);
 
 /* Global variables used to communicate with passes.  */
 extern FILE *dump_file;
Index: tree-ssa-loop-ivcanon.c
===================================================================
--- tree-ssa-loop-ivcanon.c	(revision 194420)
+++ tree-ssa-loop-ivcanon.c	(working copy)
@@ -74,11 +74,12 @@ create_canonical_iv (struct loop *loop, edge exit,
   gimple_stmt_iterator incr_at;
   enum tree_code cmp;
 
-  if (dump_file && (dump_flags & TDF_DETAILS))
+  if (dump_enabled_p ())
     {
-      fprintf (dump_file, "Added canonical iv to loop %d, ", loop->num);
-      print_generic_expr (dump_file, niter, TDF_SLIM);
-      fprintf (dump_file, " iterations.\n");
+      dump_printf (TDF_DETAILS | MSG_NOTE,
+                   "Added canonical iv to loop %d, ", loop->num);
+      dump_generic_expr (TDF_DETAILS | MSG_NOTE, TDF_SLIM, niter);
+      dump_printf (TDF_DETAILS | MSG_NOTE, " iterations.\n");
     }
 
   cond = last_stmt (exit->src);
@@ -230,8 +231,8 @@ tree_estimate_loop_size (struct loop *loop, edge e
   size->num_branches_on_hot_path = 0;
   size->constant_iv = 0;
 
-  if (dump_file && (dump_flags & TDF_DETAILS))
-    fprintf (dump_file, "Estimating sizes for loop %i\n", loop->num);
+  dump_printf (TDF_DETAILS | MSG_NOTE,
+               "Estimating sizes for loop %i\n", loop->num);
   for (i = 0; i < loop->num_nodes; i++)
     {
       if (edge_to_cancel && body[i] != edge_to_cancel->src
@@ -239,9 +240,9 @@ tree_estimate_loop_size (struct loop *loop, edge e
 	after_exit = true;
       else
 	after_exit = false;
-      if (dump_file && (dump_flags & TDF_DETAILS))
-	fprintf (dump_file, " BB: %i, after_exit: %i\n", body[i]->index, after_exit);
 
+      dump_printf (TDF_DETAILS | MSG_NOTE,
+                   " BB: %i, after_exit: %i\n", body[i]->index, after_exit);
       for (gsi = gsi_start_bb (body[i]); !gsi_end_p (gsi); gsi_next (&gsi))
 	{
 	  gimple stmt = gsi_stmt (gsi);
@@ -250,11 +251,11 @@ tree_estimate_loop_size (struct loop *loop, edge e
 	  bool likely_eliminated_last = false;
 	  bool likely_eliminated_peeled = false;
 
-	  if (dump_file && (dump_flags & TDF_DETAILS))
-	    {
-	      fprintf (dump_file, "  size: %3i ", num);
-	      print_gimple_stmt (dump_file, gsi_stmt (gsi), 0, 0);
-	    }
+          if (dump_enabled_p ())
+            {
+              dump_printf (TDF_DETAILS | MSG_NOTE, "  size: %3i ", num);
+              dump_gimple_stmt (TDF_DETAILS | MSG_NOTE, 0, gsi_stmt (gsi), 0);
+            }
 
 	  /* Look for reasons why we might optimize this stmt away. */
 
@@ -262,26 +263,26 @@ tree_estimate_loop_size (struct loop *loop, edge e
 	  if (exit && body[i] == exit->src
 		   && stmt == last_stmt (exit->src))
 	    {
-	      if (dump_file && (dump_flags & TDF_DETAILS))
-	        fprintf (dump_file, "   Exit condition will be eliminated "
-			 "in peeled copies.\n");
+              dump_printf (TDF_DETAILS | MSG_NOTE,
+                           "   Exit condition will be eliminated "
+                           "in peeled copies.\n");
 	      likely_eliminated_peeled = true;
 	    }
 	  else if (edge_to_cancel && body[i] == edge_to_cancel->src
 		   && stmt == last_stmt (edge_to_cancel->src))
 	    {
-	      if (dump_file && (dump_flags & TDF_DETAILS))
-	        fprintf (dump_file, "   Exit condition will be eliminated "
-			 "in last copy.\n");
+              dump_printf (TDF_DETAILS | MSG_NOTE,
+                           "   Exit condition will be eliminated "
+                           "in last copy.\n");
 	      likely_eliminated_last = true;
 	    }
 	  /* Sets of IV variables  */
 	  else if (gimple_code (stmt) == GIMPLE_ASSIGN
 	      && constant_after_peeling (gimple_assign_lhs (stmt), stmt, loop))
 	    {
-	      if (dump_file && (dump_flags & TDF_DETAILS))
-	        fprintf (dump_file, "   Induction variable computation will"
-			 " be folded away.\n");
+              dump_printf (TDF_DETAILS | MSG_NOTE,
+                           "   Induction variable computation will"
+                           " be folded away.\n");
 	      likely_eliminated = true;
 	    }
 	  /* Assignments of IV variables.  */
@@ -293,8 +294,8 @@ tree_estimate_loop_size (struct loop *loop, edge e
 		       				  stmt, loop)))
 	    {
 	      size->constant_iv = true;
-	      if (dump_file && (dump_flags & TDF_DETAILS))
-	        fprintf (dump_file, "   Constant expression will be folded away.\n");
+              dump_printf (TDF_DETAILS | MSG_NOTE,
+                           "   Constant expression will be folded away.\n");
 	      likely_eliminated = true;
 	    }
 	  /* Conditionals.  */
@@ -304,8 +305,8 @@ tree_estimate_loop_size (struct loop *loop, edge e
 		   || (gimple_code (stmt) == GIMPLE_SWITCH
 		       && constant_after_peeling (gimple_switch_index (stmt), stmt, loop)))
 	    {
-	      if (dump_file && (dump_flags & TDF_DETAILS))
-	        fprintf (dump_file, "   Constant conditional.\n");
+              dump_printf (TDF_DETAILS | MSG_NOTE,
+                           "   Constant conditional.\n");
 	      likely_eliminated = true;
 	    }
 
@@ -359,10 +360,10 @@ tree_estimate_loop_size (struct loop *loop, edge e
 	}
     }
   path.release ();
-  if (dump_file && (dump_flags & TDF_DETAILS))
-    fprintf (dump_file, "size: %i-%i, last_iteration: %i-%i\n", size->overall,
-    	     size->eliminated_by_peeling, size->last_iteration,
-	     size->last_iteration_eliminated_by_peeling);
+  dump_printf (TDF_DETAILS | MSG_NOTE,
+               "size: %i-%i, last_iteration: %i-%i\n", size->overall,
+               size->eliminated_by_peeling, size->last_iteration,
+               size->last_iteration_eliminated_by_peeling);
 
   free (body);
   return false;
@@ -495,10 +496,11 @@ remove_exits_and_undefined_stmts (struct loop *loo
 	  gimple_set_location (stmt, gimple_location (elt->stmt));
 	  gsi_insert_before (&gsi, stmt, GSI_NEW_STMT);
 	  changed = true;
-	  if (dump_file && (dump_flags & TDF_DETAILS))
+          if (dump_enabled_p ())
 	    {
-	      fprintf (dump_file, "Forced statement unreachable: ");
-	      print_gimple_stmt (dump_file, elt->stmt, 0, 0);
+              dump_printf (TDF_DETAILS | MSG_NOTE,
+                           "Forced statement unreachable: ");
+              dump_gimple_stmt (TDF_DETAILS | MSG_NOTE, 0, elt->stmt, 0);
 	    }
 	}
       /* If we know the exit will be taken after peeling, update.  */
@@ -508,10 +510,10 @@ remove_exits_and_undefined_stmts (struct loop *loo
 	  basic_block bb = gimple_bb (elt->stmt);
 	  edge exit_edge = EDGE_SUCC (bb, 0);
 
-	  if (dump_file && (dump_flags & TDF_DETAILS))
-	    {
-	      fprintf (dump_file, "Forced exit to be taken: ");
-	      print_gimple_stmt (dump_file, elt->stmt, 0, 0);
+          if (dump_enabled_p ())
+            {
+              dump_printf (TDF_DETAILS | MSG_NOTE, "Forced exit to be taken: ");
+              dump_gimple_stmt (TDF_DETAILS | MSG_NOTE, 0, elt->stmt, 0);
 	    }
 	  if (!loop_exit_edge_p (loop, exit_edge))
 	    exit_edge = EDGE_SUCC (bb, 1);
@@ -564,11 +566,12 @@ remove_redundant_iv_tests (struct loop *loop)
 	      || !loop->nb_iterations_upper_bound.ult
 		   (tree_to_double_int (niter.niter)))
 	    continue;
-	  
-	  if (dump_file && (dump_flags & TDF_DETAILS))
+          if (dump_enabled_p ())
 	    {
-	      fprintf (dump_file, "Removed pointless exit: ");
-	      print_gimple_stmt (dump_file, elt->stmt, 0, 0);
+              dump_printf (TDF_DETAILS | MSG_OPTIMIZED_LOCATIONS,
+                           "Removed pointless exit: ");
+              dump_gimple_stmt (TDF_DETAILS | MSG_OPTIMIZED_LOCATIONS, 0,
+                                elt->stmt, 0);
 	    }
 	  if (exit_edge->flags & EDGE_TRUE_VALUE)
 	    gimple_cond_make_false (elt->stmt);
@@ -719,18 +722,20 @@ try_unroll_loop_completely (struct loop *loop,
       ninsns = size.overall;
       if (large)
 	{
-	  if (dump_file && (dump_flags & TDF_DETAILS))
-	    fprintf (dump_file, "Not unrolling loop %d: it is too large.\n",
-		     loop->num);
+          dump_printf (TDF_DETAILS | MSG_MISSED_OPTIMIZATION,
+                       "Not unrolling loop %d: it is too large.\n",
+                       loop->num);
 	  return false;
 	}
 
       unr_insns = estimated_unrolled_size (&size, n_unroll);
-      if (dump_file && (dump_flags & TDF_DETAILS))
+      if (dump_enabled_p ())
 	{
-	  fprintf (dump_file, "  Loop size: %d\n", (int) ninsns);
-	  fprintf (dump_file, "  Estimated size after unrolling: %d\n",
-		   (int) unr_insns);
+          dump_printf (TDF_DETAILS | MSG_NOTE,
+                       "  Loop size: %d\n", (int) ninsns);
+          dump_printf (TDF_DETAILS | MSG_NOTE,
+                       "  Estimated size after unrolling: %d\n",
+                       (int) unr_insns);
 	}
 
       /* If the code is going to shrink, we don't need to be extra cautious
@@ -746,9 +751,9 @@ try_unroll_loop_completely (struct loop *loop,
 	 this is always a good idea.  */
       else if (ul == UL_NO_GROWTH)
 	{
-	  if (dump_file && (dump_flags & TDF_DETAILS))
-	    fprintf (dump_file, "Not unrolling loop %d: size would grow.\n",
-		     loop->num);
+          dump_printf (TDF_DETAILS | MSG_MISSED_OPTIMIZATION,
+                       "Not unrolling loop %d: size would grow.\n",
+                       loop->num);
 	  return false;
 	}
       /* Outer loops tend to be less interesting candidates for complette
@@ -757,20 +762,18 @@ try_unroll_loop_completely (struct loop *loop,
 	 grow.  */
       else if (loop->inner)
 	{
-	  if (dump_file && (dump_flags & TDF_DETAILS))
-	    fprintf (dump_file, "Not unrolling loop %d: "
-		     "it is not innermost and code would grow.\n",
-		     loop->num);
+          dump_printf (TDF_DETAILS | MSG_MISSED_OPTIMIZATION,
+                       "Not unrolling loop %d: it is not innermost "
+                       "and code would grow.\n", loop->num);
 	  return false;
 	}
       /* If there is call on a hot path through the loop, then
 	 there is most probably not much to optimize.  */
       else if (size.num_non_pure_calls_on_hot_path)
 	{
-	  if (dump_file && (dump_flags & TDF_DETAILS))
-	    fprintf (dump_file, "Not unrolling loop %d: "
-		     "contains call and code would grow.\n",
-		     loop->num);
+          dump_printf (TDF_DETAILS | MSG_MISSED_OPTIMIZATION,
+                       "Not unrolling loop %d: contains call and"
+                       " code would grow.\n", loop->num);
 	  return false;
 	}
       /* If there is pure/const call in the function, then we
@@ -784,10 +787,9 @@ try_unroll_loop_completely (struct loop *loop,
 	       && (size.non_call_stmts_on_hot_path
 		   <= 3 + size.num_pure_calls_on_hot_path))
 	{
-	  if (dump_file && (dump_flags & TDF_DETAILS))
-	    fprintf (dump_file, "Not unrolling loop %d: "
-		     "contains just pure calls and code would grow.\n",
-		     loop->num);
+          dump_printf (TDF_DETAILS | MSG_MISSED_OPTIMIZATION,
+                       "Not unrolling loop %d: contains just pure calls"
+                       " and code would grow.\n", loop->num);
 	  return false;
 	}
       /* Complette unrolling is major win when control flow is removed and
@@ -799,20 +801,20 @@ try_unroll_loop_completely (struct loop *loop,
       else if (size.num_branches_on_hot_path * (int)n_unroll
 	       > PARAM_VALUE (PARAM_MAX_PEEL_BRANCHES))
 	{
-	  if (dump_file && (dump_flags & TDF_DETAILS))
-	    fprintf (dump_file, "Not unrolling loop %d: "
-		     " number of branches on hot path in the unrolled sequence"
-		     " reach --param max-peel-branches limit.\n",
-		     loop->num);
+          dump_printf (TDF_DETAILS | MSG_MISSED_OPTIMIZATION,
+                       "Not unrolling loop %d: "
+                       " number of branches on hot path in the unrolled"
+                       "  sequence reach --param max-peel-branches limit.\n",
+                       loop->num);
 	  return false;
 	}
       else if (unr_insns
 	       > (unsigned) PARAM_VALUE (PARAM_MAX_COMPLETELY_PEELED_INSNS))
 	{
-	  if (dump_file && (dump_flags & TDF_DETAILS))
-	    fprintf (dump_file, "Not unrolling loop %d: "
-		     "(--param max-completely-peeled-insns limit reached).\n",
-		     loop->num);
+          dump_printf (TDF_DETAILS | MSG_MISSED_OPTIMIZATION,
+                       "Not unrolling loop %d: "
+                       "(--param max-completely-peeled-insns limit reached).\n",
+                       loop->num);
 	  return false;
 	}
 
@@ -829,8 +831,8 @@ try_unroll_loop_completely (struct loop *loop,
 	{
           free_original_copy_tables ();
 	  free (wont_exit);
-	  if (dump_file && (dump_flags & TDF_DETAILS))
-	    fprintf (dump_file, "Failed to duplicate the loop\n");
+          dump_printf (TDF_DETAILS | MSG_MISSED_OPTIMIZATION,
+                       "Failed to duplicate the loop\n");
 	  return false;
 	}
 
@@ -863,22 +865,26 @@ try_unroll_loop_completely (struct loop *loop,
   loops_to_unloop.safe_push (loop);
   loops_to_unloop_nunroll.safe_push (n_unroll);
 
-  if (dump_file && (dump_flags & TDF_DETAILS))
+  if (dump_enabled_p ())
     {
       if (!n_unroll)
-        fprintf (dump_file, "Turned loop %d to non-loop; it never loops.\n",
-		 num);
+        dump_printf (TDF_DETAILS | MSG_MISSED_OPTIMIZATION,
+                     "Turned loop %d to non-loop; it never loops.\n", num);
       else
-        fprintf (dump_file, "Unrolled loop %d completely "
-		 "(duplicated %i times).\n", num, (int)n_unroll);
+        dump_printf (TDF_DETAILS | MSG_OPTIMIZED_LOCATIONS,
+                     "Unrolled loop %d completely "
+                     "(duplicated %i times).\n", num, (int)n_unroll);
       if (exit)
-        fprintf (dump_file, "Exit condition of peeled iterations was "
-		 "eliminated.\n");
+        dump_printf (TDF_DETAILS | MSG_OPTIMIZED_LOCATIONS,
+                     "Exit condition of peeled iterations was "
+                     "eliminated.\n");
       if (edge_to_cancel)
-        fprintf (dump_file, "Last iteration exit edge was proved true.\n");
+        dump_printf (TDF_DETAILS | MSG_OPTIMIZED_LOCATIONS,
+                     "Last iteration exit edge was proved true.\n");
       else
-        fprintf (dump_file, "Latch of last iteration was marked by "
-		 "__builtin_unreachable ().\n");
+        dump_printf (TDF_DETAILS | MSG_OPTIMIZED_LOCATIONS,
+                     "Latch of last iteration was marked by "
+                     "__builtin_unreachable ().\n");
     }
 
   return true;
@@ -931,18 +937,17 @@ canonicalize_loop_induction_variables (struct loop
   /* Force re-computation of loop bounds so we can remove redundant exits.  */
   maxiter = max_loop_iterations_int (loop);
 
-  if (dump_file && (dump_flags & TDF_DETAILS)
-      && TREE_CODE (niter) == INTEGER_CST)
+  if (dump_enabled_p () && TREE_CODE (niter) == INTEGER_CST)
     {
-      fprintf (dump_file, "Loop %d iterates ", loop->num);
-      print_generic_expr (dump_file, niter, TDF_SLIM);
-      fprintf (dump_file, " times.\n");
+      dump_printf (TDF_DETAILS | MSG_NOTE, "Loop %d iterates ", loop->num);
+      dump_generic_expr (TDF_DETAILS | MSG_NOTE, TDF_SLIM, niter);
+      dump_printf (TDF_DETAILS | MSG_NOTE, " times.\n");
     }
-  if (dump_file && (dump_flags & TDF_DETAILS)
-      && maxiter >= 0)
+  if (dump_enabled_p () && maxiter >= 0)
     {
-      fprintf (dump_file, "Loop %d iterates at most %i times.\n", loop->num,
-	       (int)maxiter);
+      dump_printf (TDF_DETAILS | MSG_NOTE,
+                   "Loop %d iterates at most %i times.\n", loop->num,
+                   (int)maxiter);
     }
 
   /* Remove exits that are known to be never taken based on loop bound.
Index: loop-unroll.c
===================================================================
--- loop-unroll.c	(revision 194420)
+++ loop-unroll.c	(working copy)
@@ -235,10 +235,10 @@ peel_loops_completely (int flags)
     {
       loop->lpt_decision.decision = LPT_NONE;
 
-      if (dump_file)
-	fprintf (dump_file,
-		 "\n;; *** Considering loop %d for complete peeling ***\n",
-		 loop->num);
+      if (dump_enabled_p ())
+        dump_printf (TDF_RTL | MSG_NOTE,
+                     "\n;; *** Considering loop %d for complete peeling ***\n",
+                     loop->num);
 
       loop->ninsns = num_loop_insns (loop);
 
@@ -268,31 +268,31 @@ decide_unrolling_and_peeling (int flags)
     {
       loop->lpt_decision.decision = LPT_NONE;
 
-      if (dump_file)
-	fprintf (dump_file, "\n;; *** Considering loop %d ***\n", loop->num);
+      if (dump_enabled_p ())
+	dump_printf (TDF_RTL | MSG_NOTE,
+                     "\n;; *** Considering loop %d ***\n", loop->num);
 
       /* Do not peel cold areas.  */
       if (optimize_loop_for_size_p (loop))
 	{
-	  if (dump_file)
-	    fprintf (dump_file, ";; Not considering loop, cold area\n");
+          dump_printf (TDF_RTL | MSG_NOTE,
+                       ";; Not considering loop, cold area\n");
 	  continue;
 	}
 
       /* Can the loop be manipulated?  */
       if (!can_duplicate_loop_p (loop))
 	{
-	  if (dump_file)
-	    fprintf (dump_file,
-		     ";; Not considering loop, cannot duplicate\n");
+          dump_printf (TDF_RTL | MSG_NOTE,
+                       ";; Not considering loop, cannot duplicate\n");
 	  continue;
 	}
 
       /* Skip non-innermost loops.  */
       if (loop->inner)
 	{
-	  if (dump_file)
-	    fprintf (dump_file, ";; Not considering loop, is not innermost\n");
+          dump_printf (TDF_RTL | MSG_NOTE,
+                       ";; Not considering loop, is not innermost\n");
 	  continue;
 	}
 
@@ -319,14 +319,15 @@ decide_peel_once_rolling (struct loop *loop, int f
 {
   struct niter_desc *desc;
 
-  if (dump_file)
-    fprintf (dump_file, "\n;; Considering peeling once rolling loop\n");
+  if (dump_enabled_p ())
+    dump_printf (TDF_RTL | MSG_NOTE,
+                 "\n;; Considering peeling once rolling loop\n");
 
   /* Is the loop small enough?  */
   if ((unsigned) PARAM_VALUE (PARAM_MAX_ONCE_PEELED_INSNS) < loop->ninsns)
     {
-      if (dump_file)
-	fprintf (dump_file, ";; Not considering loop, is too big\n");
+      dump_printf (TDF_RTL | MSG_NOTE,
+                   ";; Not considering loop, is too big\n");
       return;
     }
 
@@ -341,15 +342,14 @@ decide_peel_once_rolling (struct loop *loop, int f
       || (desc->niter != 0
 	  && max_loop_iterations_int (loop) != 0))
     {
-      if (dump_file)
-	fprintf (dump_file,
-		 ";; Unable to prove that the loop rolls exactly once\n");
+      dump_printf (TDF_RTL | MSG_NOTE,
+                   ";; Unable to prove that the loop rolls exactly once\n");
       return;
     }
 
   /* Success.  */
-  if (dump_file)
-    fprintf (dump_file, ";; Decided to peel exactly once rolling loop\n");
+  dump_printf (TDF_RTL | MSG_OPTIMIZED_LOCATIONS,
+               ";; Decided to peel exactly once rolling loop\n");
   loop->lpt_decision.decision = LPT_PEEL_COMPLETELY;
 }
 
@@ -360,31 +360,30 @@ decide_peel_completely (struct loop *loop, int fla
   unsigned npeel;
   struct niter_desc *desc;
 
-  if (dump_file)
-    fprintf (dump_file, "\n;; Considering peeling completely\n");
+  dump_printf (TDF_RTL | MSG_NOTE,
+               "\n;; Considering peeling completely\n");
 
   /* Skip non-innermost loops.  */
   if (loop->inner)
     {
-      if (dump_file)
-	fprintf (dump_file, ";; Not considering loop, is not innermost\n");
+      dump_printf (TDF_RTL | MSG_NOTE,
+                   ";; Not considering loop, is not innermost\n");
       return;
     }
 
   /* Do not peel cold areas.  */
   if (optimize_loop_for_size_p (loop))
     {
-      if (dump_file)
-	fprintf (dump_file, ";; Not considering loop, cold area\n");
+      dump_printf (TDF_RTL | MSG_MISSED_OPTIMIZATION,
+                   ";; Not considering loop, cold area\n");
       return;
     }
 
   /* Can the loop be manipulated?  */
   if (!can_duplicate_loop_p (loop))
     {
-      if (dump_file)
-	fprintf (dump_file,
-		 ";; Not considering loop, cannot duplicate\n");
+      dump_printf (TDF_RTL | MSG_MISSED_OPTIMIZATION,
+                   ";; Not considering loop, cannot duplicate\n");
       return;
     }
 
@@ -396,8 +395,8 @@ decide_peel_completely (struct loop *loop, int fla
   /* Is the loop small enough?  */
   if (!npeel)
     {
-      if (dump_file)
-	fprintf (dump_file, ";; Not considering loop, is too big\n");
+      dump_printf (TDF_RTL | MSG_MISSED_OPTIMIZATION,
+                   ";; Not considering loop, is too big\n");
       return;
     }
 
@@ -410,27 +409,28 @@ decide_peel_completely (struct loop *loop, int fla
       || !desc->const_iter
       || desc->infinite)
     {
-      if (dump_file)
-	fprintf (dump_file,
-		 ";; Unable to prove that the loop iterates constant times\n");
+      dump_printf (TDF_RTL | MSG_MISSED_OPTIMIZATION,
+                   ";; Unable to prove that the loop iterates constant times\n");
       return;
     }
 
   if (desc->niter > npeel - 1)
     {
-      if (dump_file)
+      if (dump_enabled_p ())
 	{
-	  fprintf (dump_file,
-		   ";; Not peeling loop completely, rolls too much (");
-	  fprintf (dump_file, HOST_WIDEST_INT_PRINT_DEC, desc->niter);
-	  fprintf (dump_file, " iterations > %d [maximum peelings])\n", npeel);
+	  dump_printf (TDF_RTL | MSG_MISSED_OPTIMIZATION,
+                       ";; Not peeling loop completely, rolls too much (");
+	  dump_printf (TDF_RTL | MSG_MISSED_OPTIMIZATION,
+                       HOST_WIDEST_INT_PRINT_DEC, desc->niter);
+	  dump_printf (TDF_RTL | MSG_MISSED_OPTIMIZATION,
+                       " iterations > %d [maximum peelings])\n", npeel);
 	}
       return;
     }
 
   /* Success.  */
-  if (dump_file)
-    fprintf (dump_file, ";; Decided to peel loop completely\n");
+  dump_printf (TDF_RTL | MSG_OPTIMIZED_LOCATIONS,
+               ";; Decided to peel loop completely\n");
   loop->lpt_decision.decision = LPT_PEEL_COMPLETELY;
 }
 
@@ -508,8 +508,8 @@ peel_loop_completely (struct loop *loop)
      the loop.  */
   remove_path (ein);
 
-  if (dump_file)
-    fprintf (dump_file, ";; Peeled loop completely, %d times\n", (int) npeel);
+  dump_printf (TDF_RTL | MSG_OPTIMIZED_LOCATIONS,
+               ";; Peeled loop completely, %d times\n", (int) npeel);
 }
 
 /* Decide whether to unroll LOOP iterating constant number of times
@@ -528,10 +528,9 @@ decide_unroll_constant_iterations (struct loop *lo
       return;
     }
 
-  if (dump_file)
-    fprintf (dump_file,
-	     "\n;; Considering unrolling loop with constant "
-	     "number of iterations\n");
+  dump_printf (TDF_RTL | MSG_NOTE,
+               "\n;; Considering unrolling loop with constant "
+               "number of iterations\n");
 
   /* nunroll = total number of copies of the original loop body in
      unrolled loop (i.e. if it is 2, we have to duplicate loop body once.  */
@@ -546,8 +545,8 @@ decide_unroll_constant_iterations (struct loop *lo
   /* Skip big loops.  */
   if (nunroll <= 1)
     {
-      if (dump_file)
-	fprintf (dump_file, ";; Not considering loop, is too big\n");
+      dump_printf (TDF_RTL | MSG_MISSED_OPTIMIZATION,
+                   ";; Not considering loop, is too big\n");
       return;
     }
 
@@ -557,9 +556,8 @@ decide_unroll_constant_iterations (struct loop *lo
   /* Check number of iterations.  */
   if (!desc->simple_p || !desc->const_iter || desc->assumptions)
     {
-      if (dump_file)
-	fprintf (dump_file,
-		 ";; Unable to prove that the loop iterates constant times\n");
+      dump_printf (TDF_RTL | MSG_MISSED_OPTIMIZATION,
+                   ";; Unable to prove that the loop iterates constant times\n");
       return;
     }
 
@@ -572,8 +570,8 @@ decide_unroll_constant_iterations (struct loop *lo
 	   || max_loop_iterations (loop, &iterations))
 	  && iterations.ult (double_int::from_shwi (2 * nunroll))))
     {
-      if (dump_file)
-	fprintf (dump_file, ";; Not unrolling loop, doesn't roll\n");
+      dump_printf (TDF_RTL | MSG_MISSED_OPTIMIZATION,
+                   ";; Not unrolling loop, doesn't roll\n");
       return;
     }
 
@@ -609,9 +607,9 @@ decide_unroll_constant_iterations (struct loop *lo
   loop->lpt_decision.decision = LPT_UNROLL_CONSTANT;
   loop->lpt_decision.times = best_unroll;
 
-  if (dump_file)
-    fprintf (dump_file, ";; Decided to unroll the loop %d times (%d copies).\n",
-	     loop->lpt_decision.times, best_copies);
+  dump_printf (TDF_RTL | MSG_OPTIMIZED_LOCATIONS,
+               ";; Decided to unroll the loop %d times (%d copies).\n",
+               loop->lpt_decision.times, best_copies);
 }
 
 /* Unroll LOOP with constant number of iterations LOOP->LPT_DECISION.TIMES times.
@@ -669,8 +667,8 @@ unroll_loop_constant_iterations (struct loop *loop
 	 in the first copy, so that the loops that start with test
 	 of exit condition have continuous body after unrolling.  */
 
-      if (dump_file)
-	fprintf (dump_file, ";; Condition at beginning of loop.\n");
+      dump_printf (TDF_RTL | MSG_NOTE,
+                   ";; Condition at beginning of loop.\n");
 
       /* Peel exit_mod iterations.  */
       bitmap_clear_bit (wont_exit, 0);
@@ -710,10 +708,9 @@ unroll_loop_constant_iterations (struct loop *loop
     {
       /* Leave exit test in last copy, for the same reason as above if
 	 the loop tests the condition at the end of loop body.  */
+      dump_printf (TDF_RTL | MSG_NOTE,
+                   ";; Condition at end of loop.\n");
 
-      if (dump_file)
-	fprintf (dump_file, ";; Condition at end of loop.\n");
-
       /* We know that niter >= max_unroll + 2; so we do not need to care of
 	 case when we would exit before reaching the loop.  So just peel
 	 exit_mod + 1 iterations.  */
@@ -810,10 +807,9 @@ unroll_loop_constant_iterations (struct loop *loop
     remove_path (e);
   remove_edges.release ();
 
-  if (dump_file)
-    fprintf (dump_file,
-	     ";; Unrolled loop %d times, constant # of iterations %i insns\n",
-	     max_unroll, num_loop_insns (loop));
+  dump_printf (TDF_RTL | MSG_OPTIMIZED_LOCATIONS,
+               ";; Unrolled loop %d times, constant # of iterations %i insns\n",
+               max_unroll, num_loop_insns (loop));
 }
 
 /* Decide whether to unroll LOOP iterating runtime computable number of times
@@ -831,10 +827,9 @@ decide_unroll_runtime_iterations (struct loop *loo
       return;
     }
 
-  if (dump_file)
-    fprintf (dump_file,
-	     "\n;; Considering unrolling loop with runtime "
-	     "computable number of iterations\n");
+  dump_printf (TDF_RTL | MSG_NOTE,
+               "\n;; Considering unrolling loop with runtime "
+               "computable number of iterations\n");
 
   /* nunroll = total number of copies of the original loop body in
      unrolled loop (i.e. if it is 2, we have to duplicate loop body once.  */
@@ -851,8 +846,8 @@ decide_unroll_runtime_iterations (struct loop *loo
   /* Skip big loops.  */
   if (nunroll <= 1)
     {
-      if (dump_file)
-	fprintf (dump_file, ";; Not considering loop, is too big\n");
+      dump_printf (TDF_RTL | MSG_MISSED_OPTIMIZATION,
+                   ";; Not considering loop, is too big\n");
       return;
     }
 
@@ -862,17 +857,16 @@ decide_unroll_runtime_iterations (struct loop *loo
   /* Check simpleness.  */
   if (!desc->simple_p || desc->assumptions)
     {
-      if (dump_file)
-	fprintf (dump_file,
-		 ";; Unable to prove that the number of iterations "
-		 "can be counted in runtime\n");
+      dump_printf (TDF_RTL | MSG_MISSED_OPTIMIZATION,
+                   ";; Unable to prove that the number of iterations "
+                   "can be counted in runtime\n");
       return;
     }
 
   if (desc->const_iter)
     {
-      if (dump_file)
-	fprintf (dump_file, ";; Loop iterates constant times\n");
+      dump_printf (TDF_RTL | MSG_MISSED_OPTIMIZATION,
+                   ";; Loop iterates constant times\n");
       return;
     }
 
@@ -881,8 +875,8 @@ decide_unroll_runtime_iterations (struct loop *loo
        || max_loop_iterations (loop, &iterations))
       && iterations.ult (double_int::from_shwi (2 * nunroll)))
     {
-      if (dump_file)
-	fprintf (dump_file, ";; Not unrolling loop, doesn't roll\n");
+      dump_printf (TDF_RTL | MSG_MISSED_OPTIMIZATION,
+                   ";; Not unrolling loop, doesn't roll\n");
       return;
     }
 
@@ -894,9 +888,9 @@ decide_unroll_runtime_iterations (struct loop *loo
   loop->lpt_decision.decision = LPT_UNROLL_RUNTIME;
   loop->lpt_decision.times = i - 1;
 
-  if (dump_file)
-    fprintf (dump_file, ";; Decided to unroll the loop %d times.\n",
-	     loop->lpt_decision.times);
+  dump_printf (TDF_RTL | MSG_OPTIMIZED_LOCATIONS,
+               ";; Decided to unroll the loop %d times.\n",
+               loop->lpt_decision.times);
 }
 
 /* Splits edge E and inserts the sequence of instructions INSNS on it, and
@@ -1215,11 +1209,10 @@ unroll_loop_runtime_iterations (struct loop *loop)
 	loop->any_estimate = false;
     }
 
-  if (dump_file)
-    fprintf (dump_file,
-	     ";; Unrolled loop %d times, counting # of iterations "
-	     "in runtime, %i insns\n",
-	     max_unroll, num_loop_insns (loop));
+  dump_printf (TDF_RTL | MSG_OPTIMIZED_LOCATIONS,
+               ";; Unrolled loop %d times, counting # of iterations "
+               "in runtime, %i insns\n",
+               max_unroll, num_loop_insns (loop));
 
   dom_bbs.release ();
 }
@@ -1237,8 +1230,7 @@ decide_peel_simple (struct loop *loop, int flags)
       return;
     }
 
-  if (dump_file)
-    fprintf (dump_file, "\n;; Considering simply peeling loop\n");
+  dump_printf (TDF_RTL | MSG_NOTE, "\n;; Considering simply peeling loop\n");
 
   /* npeel = number of iterations to peel.  */
   npeel = PARAM_VALUE (PARAM_MAX_PEELED_INSNS) / loop->ninsns;
@@ -1248,8 +1240,8 @@ decide_peel_simple (struct loop *loop, int flags)
   /* Skip big loops.  */
   if (!npeel)
     {
-      if (dump_file)
-	fprintf (dump_file, ";; Not considering loop, is too big\n");
+      dump_printf (TDF_RTL | MSG_MISSED_OPTIMIZATION,
+                   ";; Not considering loop, is too big\n");
       return;
     }
 
@@ -1265,8 +1257,8 @@ decide_peel_simple (struct loop *loop, int flags)
   if (num_loop_branches (loop) > 1
       && profile_status != PROFILE_READ)
     {
-      if (dump_file)
-	fprintf (dump_file, ";; Not peeling, contains branches\n");
+      dump_printf (TDF_RTL | MSG_MISSED_OPTIMIZATION,
+                   ";; Not peeling, contains branches\n");
       return;
     }
 
@@ -1275,13 +1267,15 @@ decide_peel_simple (struct loop *loop, int flags)
     {
       if (double_int::from_shwi (npeel).ule (iterations))
 	{
-	  if (dump_file)
+          if (dump_enabled_p ())
 	    {
-	      fprintf (dump_file, ";; Not peeling loop, rolls too much (");
-	      fprintf (dump_file, HOST_WIDEST_INT_PRINT_DEC,
-		       (HOST_WIDEST_INT) (iterations.to_shwi () + 1));
-	      fprintf (dump_file, " iterations > %d [maximum peelings])\n",
-		       npeel);
+	      dump_printf (TDF_RTL | MSG_MISSED_OPTIMIZATION,
+                           ";; Not peeling loop, rolls too much (");
+	      dump_printf (TDF_RTL | MSG_MISSED_OPTIMIZATION,
+                           HOST_WIDEST_INT_PRINT_DEC,
+                           (HOST_WIDEST_INT) (iterations.to_shwi () + 1));
+	      dump_printf (TDF_RTL | MSG_MISSED_OPTIMIZATION,
+                           " iterations > %d [maximum peelings])\n", npeel);
 	    }
 	  return;
 	}
@@ -1296,9 +1290,8 @@ decide_peel_simple (struct loop *loop, int flags)
     {
       /* For now we have no good heuristics to decide whether loop peeling
          will be effective, so disable it.  */
-      if (dump_file)
-	fprintf (dump_file,
-		 ";; Not peeling loop, no evidence it will be profitable\n");
+      dump_printf (TDF_RTL | MSG_MISSED_OPTIMIZATION,
+                   ";; Not peeling loop, no evidence it will be profitable\n");
       return;
     }
 
@@ -1306,9 +1299,9 @@ decide_peel_simple (struct loop *loop, int flags)
   loop->lpt_decision.decision = LPT_PEEL_SIMPLE;
   loop->lpt_decision.times = npeel;
 
-  if (dump_file)
-    fprintf (dump_file, ";; Decided to simply peel the loop %d times.\n",
-	     loop->lpt_decision.times);
+  dump_printf (TDF_RTL | MSG_OPTIMIZED_LOCATIONS,
+               ";; Decided to simply peel the loop %d times.\n",
+               loop->lpt_decision.times);
 }
 
 /* Peel a LOOP LOOP->LPT_DECISION.TIMES times.  The transformation does this:
@@ -1378,8 +1371,7 @@ peel_loop_simple (struct loop *loop)
 	  free_simple_loop_desc (loop);
 	}
     }
-  if (dump_file)
-    fprintf (dump_file, ";; Peeling loop %d times\n", npeel);
+  dump_printf (TDF_RTL | MSG_OPTIMIZED_LOCATIONS, ";; Peeling loop %d times\n", npeel);
 }
 
 /* Decide whether to unroll LOOP stupidly and how much.  */
@@ -1396,8 +1388,7 @@ decide_unroll_stupid (struct loop *loop, int flags
       return;
     }
 
-  if (dump_file)
-    fprintf (dump_file, "\n;; Considering unrolling loop stupidly\n");
+  dump_printf (TDF_RTL | MSG_NOTE, "\n;; Considering unrolling loop stupidly\n");
 
   /* nunroll = total number of copies of the original loop body in
      unrolled loop (i.e. if it is 2, we have to duplicate loop body once.  */
@@ -1415,8 +1406,8 @@ decide_unroll_stupid (struct loop *loop, int flags
   /* Skip big loops.  */
   if (nunroll <= 1)
     {
-      if (dump_file)
-	fprintf (dump_file, ";; Not considering loop, is too big\n");
+      dump_printf (TDF_RTL | MSG_MISSED_OPTIMIZATION,
+                   ";; Not considering loop, is too big\n");
       return;
     }
 
@@ -1426,8 +1417,7 @@ decide_unroll_stupid (struct loop *loop, int flags
   /* Check simpleness.  */
   if (desc->simple_p && !desc->assumptions)
     {
-      if (dump_file)
-	fprintf (dump_file, ";; The loop is simple\n");
+      dump_printf (TDF_RTL | MSG_MISSED_OPTIMIZATION, ";; The loop is simple\n");
       return;
     }
 
@@ -1437,8 +1427,8 @@ decide_unroll_stupid (struct loop *loop, int flags
      is also relatively good reason to not unroll.  */
   if (num_loop_branches (loop) > 1)
     {
-      if (dump_file)
-	fprintf (dump_file, ";; Not unrolling, contains branches\n");
+      dump_printf (TDF_RTL | MSG_MISSED_OPTIMIZATION,
+                   ";; Not unrolling, contains branches\n");
       return;
     }
 
@@ -1447,8 +1437,8 @@ decide_unroll_stupid (struct loop *loop, int flags
        || max_loop_iterations (loop, &iterations))
       && iterations.ult (double_int::from_shwi (2 * nunroll)))
     {
-      if (dump_file)
-	fprintf (dump_file, ";; Not unrolling loop, doesn't roll\n");
+      dump_printf (TDF_RTL | MSG_MISSED_OPTIMIZATION,
+                   ";; Not unrolling loop, doesn't roll\n");
       return;
     }
 
@@ -1461,9 +1451,9 @@ decide_unroll_stupid (struct loop *loop, int flags
   loop->lpt_decision.decision = LPT_UNROLL_STUPID;
   loop->lpt_decision.times = i - 1;
 
-  if (dump_file)
-    fprintf (dump_file, ";; Decided to unroll the loop stupidly %d times.\n",
-	     loop->lpt_decision.times);
+  dump_printf (TDF_RTL | MSG_OPTIMIZED_LOCATIONS,
+               ";; Decided to unroll the loop stupidly %d times.\n",
+               loop->lpt_decision.times);
 }
 
 /* Unroll a LOOP LOOP->LPT_DECISION.TIMES times.  The transformation does this:
@@ -1530,9 +1520,9 @@ unroll_loop_stupid (struct loop *loop)
       desc->simple_p = false;
     }
 
-  if (dump_file)
-    fprintf (dump_file, ";; Unrolled loop %d times, %i insns\n",
-	     nunroll, num_loop_insns (loop));
+  dump_printf (TDF_RTL | MSG_OPTIMIZED_LOCATIONS,
+               ";; Unrolled loop %d times, %i insns\n",
+               nunroll, num_loop_insns (loop));
 }
 
 /* A hash function for information about insns to split.  */
@@ -1738,11 +1728,11 @@ analyze_insn_to_expand_var (struct loop *loop, rtx
   if (!referenced_in_one_insn_in_loop_p (loop, dest, &debug_uses))
     return NULL;
 
-  if (dump_file)
+  if (dump_enabled_p ())
     {
-      fprintf (dump_file, "\n;; Expanding Accumulator ");
-      print_rtl (dump_file, dest);
-      fprintf (dump_file, "\n");
+      dump_printf (TDF_RTL | MSG_NOTE, "\n;; Expanding Accumulator ");
+      dump_rtl (TDF_RTL | MSG_NOTE, dest);
+      dump_printf (TDF_RTL | MSG_NOTE, "\n");
     }
 
   if (debug_uses)


* Re: [PATCH] Use new dump scheme for loop unroll passes
  2012-12-14  2:16 [PATCH] Use new dump scheme for loop unroll passes Sharad Singhai
@ 2012-12-14  4:59 ` Xinliang David Li
  2012-12-14  5:38   ` Teresa Johnson
  0 siblings, 1 reply; 4+ messages in thread
From: Xinliang David Li @ 2012-12-14  4:59 UTC (permalink / raw)
  To: Sharad Singhai; +Cc: gcc-patches, Richard Biener

A couple of comments:

1) Please dump with source location if possible. What is the use of
the information if there is no line number?
2) Please do not use the existing dump report -- "Loop 1, 2, 3" etc.
means nothing to the user.
3) The optimization report should be standardized with some template
(similar to informational notes):

    file line:column note: <xxxx> is <yyyy>ed <additional information>

  where xxxx is a source construct such as a loop, a branch, a
function etc., while yyyy is the transformation such as 'vectorized',
'unrolled', 'peeled', 'if converted', 'hoisted' etc. The additional
information can describe more about the transformation and the
source construct: for instance, unrolled N times, unrolled
completely, and profile information of the loop (entry count,
average trip count etc.). The additional information needs to be
concise. Please do *not* dump with the verbosity you proposed
(including the size, induction variable folding, exit condition
elimination etc.).

4) The existing dump (into the dump file) can be changed to use the
same dump format above.
5) For loop unroll/peeling, the dumping code can be refactored to use
one report function -- see the code in the google branch.

6) do not forget the tree level unroller.
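
To make points 3) and 5) concrete, here is a minimal sketch of a single
report function that fills in the template. It is not the code from the
google branch and does not use any GCC interface: struct opt_report and
format_opt_report are hypothetical names introduced only for
illustration, and the colon-separated "file:line:column: note:" prefix
is an assumption modelled on ordinary GCC diagnostics.

```c
#include <stdio.h>

/* Hypothetical record describing one transformation; not a GCC type.  */
struct opt_report
{
  const char *file;       /* source file of the construct */
  int line;               /* 1-based line number */
  int column;             /* 1-based column number */
  const char *construct;  /* "loop", "branch", "function", ... */
  const char *transform;  /* "unrolled", "peeled", "vectorized", ... */
  const char *extra;      /* short additional information, or NULL */
};

/* Format R into BUF (of SIZE bytes) following the template
   "<xxxx> is <yyyy>ed <additional information>".
   Returns whatever snprintf returns.  */
static int
format_opt_report (char *buf, size_t size, const struct opt_report *r)
{
  if (r->extra)
    return snprintf (buf, size, "%s:%d:%d: note: %s is %s (%s)",
                     r->file, r->line, r->column,
                     r->construct, r->transform, r->extra);
  return snprintf (buf, size, "%s:%d:%d: note: %s is %s",
                   r->file, r->line, r->column,
                   r->construct, r->transform);
}
```

With a made-up location of line 12, column 3 in unroll_1.c, a loop
unrolled "2 times" would be reported as
"unroll_1.c:12:3: note: loop is unrolled (2 times)". Routing every
unroll and peel message through one such function keeps the wording
uniform across the RTL and tree level unrollers.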

David

On Thu, Dec 13, 2012 at 6:15 PM, Sharad Singhai <singhai@google.com> wrote:
> Hi,
>
> As per discussion in http://gcc.gnu.org/ml/gcc/2012-12/msg00056.html,
> the attached patch updates loop unroll passes to use new dump
> infrastructure.
>
> This patch filters relevant dump messages into the following
> three categories
>
> - optimized: an optimization was successfully applied
> - missed: an optimization was missed due to the described reason
> - note: other relevant/detailed info during optimization. For example,
>   loop unrolling computes the loop bounds and size.
>
> Two sample outputs from one of the gcc tests (gcc.dg/unroll_1.c) are below.
>
> Sample 1
> -------------- info about optimized loops via
> "-fopt-info-loop-optimized" -------
> $ gcc gcc.dg/unroll_1.c -fno-diagnostics-show-caret -O2 -S
> -fdump-rtl-loop2_unroll -funroll-loops -fopt-info-loop-optimized
>
> Unrolled loop 1 completely (duplicated 2 times).
> Exit condition of peeled iterations was eliminated.
> Last iteration exit edge was proved true.
> Unrolled loop 1 completely (duplicated 2 times).
> Exit condition of peeled iterations was eliminated.
> Last iteration exit edge was proved true.
> --------------------------------
>
> Sample 2:
> --- All available loop optimization info, i.e., optimized+missed+note
> via "-fopt-info-loop" ---
> $ gcc gcc.dg/unroll_1.c -fno-diagnostics-show-caret -O2 -S
> -fdump-rtl-loop2_unroll -funroll-loops -fopt-info-loop
>
> Loop 1 iterates 2 times.
> Loop 1 iterates at most 2 times.
> Estimating sizes for loop 1
>  BB: 4, after_exit: 0
>   size:   2 if (i_1 <= 1)
>    Exit condition will be eliminated in peeled copies.
>  BB: 3, after_exit: 1
>   size:   1 _5 = b[i_1];
>   size:   1 _6 = _5 + 1;
>   size:   1 a[i_1] = _6;
>   size:   1 i_8 = i_1 + 1;
>    Induction variable computation will be folded away.
> size: 6-3, last_iteration: 2-0
>   Loop size: 6
>   Estimated size after unrolling: 5
> Unrolled loop 1 completely (duplicated 2 times).
> Exit condition of peeled iterations was eliminated.
> Last iteration exit edge was proved true.
> Forced exit to be taken: if (1 == 0)
> Loop 1 iterates 2 times.
> Loop 1 iterates at most 2 times.
> Estimating sizes for loop 1
>  BB: 4, after_exit: 0
>   size:   2 if (i_1 <= 1)
>    Exit condition will be eliminated in peeled copies.
>  BB: 3, after_exit: 1
>   size:   1 _4 = b[i_1];
>   size:   1 _5 = _4 + 1;
>   size:   1 a[i_1] = _5;
>   size:   1 i_7 = i_1 + 1;
>    Induction variable computation will be folded away.
> size: 6-3, last_iteration: 2-0
>   Loop size: 6
>   Estimated size after unrolling: 5
> Unrolled loop 1 completely (duplicated 2 times).
> Exit condition of peeled iterations was eliminated.
> Last iteration exit edge was proved true.
> Forced exit to be taken: if (1 == 0)
> --------------------------------
>
> I would like to mention that this information is perhaps too verbose
> and the source location of optimized loops is not displayed. I can
> add source line info (and fix up corresponding tests) if needed. But
> right now I wanted to maintain current dump format faithfully. Perhaps
> the format can be tweaked for better readability.
>
> Note that all information dumped in response to -fopt-info is also
> present in regular dump file(s) when corresponding dumps are
> enabled. Thus in above examples, the loop optimization info is also
> present in *.loop2_unroll dump file since the command line specified a
> dump file via "-fdump-rtl-loop2_unroll" in addition to -fopt-info.
>
> (As a side note, while doing the conversion, I found that the MSG_*
> dump flags are unwieldy when used in conjunction with other
> flags. Perhaps these flags should be renamed/shortened. I propose the following
>        MSG_MISSED_OPTIMIZATION  ==> MSG_MISSED
>        MSG_OPTIMIZED_LOCATIONS  ==> MSG_OPTIMIZED
> But that is pure renaming and can be done separately.)
>
> I have bootstrapped and tested this patch on x86_64 and found no new
> failures. Okay for trunk?
>
> Thanks,
> Sharad


* Re: [PATCH] Use new dump scheme for loop unroll passes
  2012-12-14  4:59 ` Xinliang David Li
@ 2012-12-14  5:38   ` Teresa Johnson
  2012-12-15  3:02     ` Sharad Singhai
  0 siblings, 1 reply; 4+ messages in thread
From: Teresa Johnson @ 2012-12-14  5:38 UTC (permalink / raw)
  To: Xinliang David Li; +Cc: Sharad Singhai, gcc-patches, Richard Biener

On Thu, Dec 13, 2012 at 8:58 PM, Xinliang David Li <davidxl@google.com> wrote:
> A couple of comments:
>
> 1) please dump with source location if possible. What is the use of
> the information if there is no line number

The google branches have the code to identify the source location of a
loop, and a message similar to the one proposed below (which uses the
inform() interface on the google branches). I have a trunk patch ready
to submit with this ported to the new dumping infrastructure, which I
was going to submit after Sharad's patch. Sharad, do you want me to
submit that one first, so that it can be leveraged if you want to extend
the messages? But I agree with David that the bulk of these
types of messages should stay in the dump file and not be emitted by
-fopt-info, because they are too verbose and low-level. Can the new
dumping infrastructure be used to dump only to the dump file and not
via -fopt-info?

Teresa




--
Teresa Johnson | Software Engineer | tejohnson@google.com | 408-460-2413


* Re: [PATCH] Use new dump scheme for loop unroll passes
  2012-12-14  5:38   ` Teresa Johnson
@ 2012-12-15  3:02     ` Sharad Singhai
  0 siblings, 0 replies; 4+ messages in thread
From: Sharad Singhai @ 2012-12-15  3:02 UTC (permalink / raw)
  To: Teresa Johnson; +Cc: Xinliang David Li, gcc-patches, Richard Biener

Teresa,

Yes, I didn't take the enhancements in the google branches into account
while porting this patch. In light of these comments, I withdraw this
patch and will wait for your patch. Once your patch is in, I will update
this patch for regular dumps.

To answer your other question: yes, the new dump infrastructure can
dump to the dump file, to the opt-info stream, or to both, depending on
dump_flags. If dump_flags contains TDF_* flags, the dump goes to the
regular dump files; if it contains MSG_* flags, the dump goes to the
opt-info stream.
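
The routing rule above can be sketched as follows. This is only an
illustration of the rule as described in this message, not GCC's
dump_printf: every sk_* name and every bit value is an assumption made
up for the example, and GCC's real TDF_* and MSG_* constants are
defined in its own headers.

```c
#include <stdarg.h>
#include <stdio.h>

/* Illustrative bit values only; they are not GCC's constants.  */
#define SK_TDF_RTL        (1 << 0)  /* stands in for a TDF_* flag */
#define SK_MSG_OPTIMIZED  (1 << 1)  /* stands in for MSG_OPTIMIZED_LOCATIONS */
#define SK_MSG_MISSED     (1 << 2)  /* stands in for MSG_MISSED_OPTIMIZATION */
#define SK_MSG_NOTE       (1 << 3)  /* stands in for MSG_NOTE */
#define SK_TDF_ALL  (SK_TDF_RTL)
#define SK_MSG_ALL  (SK_MSG_OPTIMIZED | SK_MSG_MISSED | SK_MSG_NOTE)

static FILE *sk_dump_file;      /* pass dump file, or NULL if disabled */
static FILE *sk_opt_info_file;  /* opt-info stream, or NULL if disabled */
static int sk_opt_info_mask;    /* MSG_* kinds requested by the user */

/* Print to every enabled stream that FLAGS selects.
   Returns the number of streams written to.  */
static int
sk_dump_printf (int flags, const char *fmt, ...)
{
  int written = 0;
  va_list ap;

  /* TDF_* bits route the message to the regular dump file.  */
  if (sk_dump_file && (flags & SK_TDF_ALL))
    {
      va_start (ap, fmt);
      vfprintf (sk_dump_file, fmt, ap);
      va_end (ap);
      written++;
    }

  /* MSG_* bits route it to the opt-info stream, filtered by kind.  */
  if (sk_opt_info_file && (flags & sk_opt_info_mask & SK_MSG_ALL))
    {
      va_start (ap, fmt);
      vfprintf (sk_opt_info_file, fmt, ap);
      va_end (ap);
      written++;
    }
  return written;
}
```

Under this sketch, a call that passes only a TDF_* style flag never
reaches the opt-info stream, which is one way the verbose low-level
messages could stay in the dump file.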

Thanks,
Sharad


