public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed
* [PATCH, Loop optimizer]: Add logic to disable certain loop optimizations on pre-/post-loops
@ 2010-12-13 21:57 Fang, Changpeng
  2010-12-14  0:56 ` Sebastian Pop
                   ` (2 more replies)
  0 siblings, 3 replies; 36+ messages in thread
From: Fang, Changpeng @ 2010-12-13 21:57 UTC (permalink / raw)
  To: gcc-patches

[-- Attachment #1: Type: text/plain, Size: 1233 bytes --]

Hi,

The attached patch adds the logic to disable certain loop optimizations on pre-/post-loops. 

Some loop optimizations (auto-vectorization, loop unrolling, etc) may peel a few iterations
of a loop to form pre- and/or post-loops for various purposes (alignment, loop bounds, etc).
Currently, GCC loop optimizer is unable to recognize that such loops will roll only a few 
iterations and still perform optimizations on them. While this does not hurt the performance in general,
it may significantly increase the compilation time and code size without performance benefit.

This patch adds such logic for the loop optimizer to recognize pre- and/or post loops, and disable
prefetch, unswitch and loop unrolling on them. On polyhedron with -Ofast -funroll-loops -march=amdfam10,
the patch could reduce the compilation time by 28% on average, the reduce the binary size by 20% on
 average (see the atached data).  Note that the small improvement (0.5%) could have been noise, the
code size reduction could possibly improve the performance in some cases (I_cache iprovement?).

The patch passed bootstrap and gcc regression tests on x86_64-unknown-linux-gnu.

Is it OK to commit to trunk?

Thanks,

Changpeng
   

[-- Attachment #2: polyhedron.txt --]
[-- Type: text/plain, Size: 956 bytes --]

Effact of the pre-/post-loop patch on polyhedron 
option: gfortran -Ofast -funroll-loops -march=amdfam10

               compilation     code size      speed
              time reduction   reduction      improvement
                 (%)             (%)            (%)
-----------------------------------------------------------
          ac	-20.54		-17.15		0
      aermod	-15.93		-10.15		2.51
         air	-5.74		-5.45		-0.09
    capacita	-31.35		-18.27		0.08
     channel	-11.32		-10.24		1.22
       doduc	-4.52		-6.12		0.82
     fatigue	-34.51		-15.94		0
     gas_dyn	-45.56		-28.66		2.31
      induct	-3.1		-1.91		0.05
       linpk	-25.55		-27.5		0.26
        mdbx	-24.06		-19.74		1.27
          nf	-60.85		-48.92		-0.77
     protein	-44.73		-24.02		-0.19
      rnflow	-50.55		-36.69		0.47
    test_fpu	-52.49		-41.35		1.18
        tfft	-24.83		-18.29		0.39
-----------------------------------------------------------
     average	-28.48		-20.65		0.59


[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #3: 0001-Don-t-perform-certain-loop-optimizations-on-pre-post.patch --]
[-- Type: text/x-patch; name="0001-Don-t-perform-certain-loop-optimizations-on-pre-post.patch", Size: 8306 bytes --]

From e8636e80de4d6de8ba2dbc8f08bd2daddd02edc3 Mon Sep 17 00:00:00 2001
From: Changpeng Fang <chfang@houghton.(none)>
Date: Mon, 13 Dec 2010 12:01:49 -0800
Subject: [PATCH] Don't perform certain loop optimizations on pre/post loops

	* basic-block.h (bb_flags): Add a new flag BB_PRE_POST_LOOP_HEADER.
	* cfg.c (clear_bb_flags): Keep BB_PRE_POST_LOOP_HEADER marker.
	* cfgloop.h (mark_pre_or_post_loop): New function declaration.
	  (pre_or_post_loop_p): New function declaration.
	* loop-unroll.c (decide_unroll_runtime_iterations): Do not unroll a
	  pre- or post-loop.
	* loop-unswitch.c (unswitch_single_loop): Do not unswitch a pre- or
	  post-loop.
	* tree-ssa-loop-manip.c (tree_transform_and_unroll_loop): Mark the
	  post-loop.
	* tree-ssa-loop-niter.c (mark_pre_or_post_loop): Implement the new
	  function.  (pre_or_post_loop_p): Implement the new function.
	* tree-ssa-loop-prefetch.c (loop_prefetch_arrays): Don't prefetch
	  a pre- or post-loop.
	* tree-ssa-loop-unswitch.c (tree_ssa_unswitch_loops): Do not unswitch
	  a pre- or post-loop.
	* tree-vect-loop-manip.c (vect_do_peeling_for_loop_bound): Mark the
	  post-loop.  (vect_do_peeling_for_alignment): Mark the pre-loop.
---
 gcc/basic-block.h            |    6 +++++-
 gcc/cfg.c                    |    7 ++++---
 gcc/cfgloop.h                |    2 ++
 gcc/loop-unroll.c            |    7 +++++++
 gcc/loop-unswitch.c          |    8 ++++++++
 gcc/tree-ssa-loop-manip.c    |    3 +++
 gcc/tree-ssa-loop-niter.c    |   20 ++++++++++++++++++++
 gcc/tree-ssa-loop-prefetch.c |    7 +++++++
 gcc/tree-ssa-loop-unswitch.c |    8 ++++++++
 gcc/tree-vect-loop-manip.c   |    8 ++++++++
 10 files changed, 72 insertions(+), 4 deletions(-)

diff --git a/gcc/basic-block.h b/gcc/basic-block.h
index be0a1d1..78552fd 100644
--- a/gcc/basic-block.h
+++ b/gcc/basic-block.h
@@ -245,7 +245,11 @@ enum bb_flags
 
   /* Set on blocks that cannot be threaded through.
      Only used in cfgcleanup.c.  */
-  BB_NONTHREADABLE_BLOCK = 1 << 11
+  BB_NONTHREADABLE_BLOCK = 1 << 11,
+
+  /* Set on blocks that are headers of pre- or post-loops.  */
+  BB_PRE_POST_LOOP_HEADER = 1 << 12
+
 };
 
 /* Dummy flag for convenience in the hot/cold partitioning code.  */
diff --git a/gcc/cfg.c b/gcc/cfg.c
index c8ef799..e9b394a 100644
--- a/gcc/cfg.c
+++ b/gcc/cfg.c
@@ -425,8 +425,8 @@ redirect_edge_pred (edge e, basic_block new_pred)
   connect_src (e);
 }
 
-/* Clear all basic block flags, with the exception of partitioning and
-   setjmp_target.  */
+/* Clear all basic block flags, with the exception of partitioning,
+   setjmp_target, and the pre/post loop marker.  */
 void
 clear_bb_flags (void)
 {
@@ -434,7 +434,8 @@ clear_bb_flags (void)
 
   FOR_BB_BETWEEN (bb, ENTRY_BLOCK_PTR, NULL, next_bb)
     bb->flags = (BB_PARTITION (bb)
-		 | (bb->flags & (BB_DISABLE_SCHEDULE + BB_RTL + BB_NON_LOCAL_GOTO_TARGET)));
+		 | (bb->flags & (BB_DISABLE_SCHEDULE + BB_RTL + BB_NON_LOCAL_GOTO_TARGET
+                                 + BB_PRE_POST_LOOP_HEADER)));
 }
 \f
 /* Check the consistency of profile information.  We can't do that
diff --git a/gcc/cfgloop.h b/gcc/cfgloop.h
index bf2614e..ce848cc 100644
--- a/gcc/cfgloop.h
+++ b/gcc/cfgloop.h
@@ -279,6 +279,8 @@ extern rtx doloop_condition_get (rtx);
 void estimate_numbers_of_iterations_loop (struct loop *, bool);
 HOST_WIDE_INT estimated_loop_iterations_int (struct loop *, bool);
 bool estimated_loop_iterations (struct loop *, bool, double_int *);
+void mark_pre_or_post_loop (struct loop *);
+bool pre_or_post_loop_p (struct loop *);
 
 /* Loop manipulation.  */
 extern bool can_duplicate_loop_p (const struct loop *loop);
diff --git a/gcc/loop-unroll.c b/gcc/loop-unroll.c
index 67d6ea0..6f095f6 100644
--- a/gcc/loop-unroll.c
+++ b/gcc/loop-unroll.c
@@ -857,6 +857,13 @@ decide_unroll_runtime_iterations (struct loop *loop, int flags)
 	fprintf (dump_file, ";; Loop iterates constant times\n");
       return;
     }
+ 
+  if (pre_or_post_loop_p (loop))
+    {
+      if (dump_file)
+        fprintf (dump_file, ";; Not unrolling, a pre- or post-loop\n");
+      return;
+    }
 
   /* If we have profile feedback, check whether the loop rolls.  */
   if (loop->header->count && expected_loop_iterations (loop) < 2 * nunroll)
diff --git a/gcc/loop-unswitch.c b/gcc/loop-unswitch.c
index 77524d8..59373bf 100644
--- a/gcc/loop-unswitch.c
+++ b/gcc/loop-unswitch.c
@@ -276,6 +276,14 @@ unswitch_single_loop (struct loop *loop, rtx cond_checked, int num)
       return;
     }
 
+  /* Pre- or post loop usually just roll a few iterations.  */
+  if (pre_or_post_loop_p (loop))
+    {
+      if (dump_file)
+	fprintf (dump_file, ";; Not unswitching, a pre- or post loop\n");
+      return;
+    }
+
   /* We must be able to duplicate loop body.  */
   if (!can_duplicate_loop_p (loop))
     {
diff --git a/gcc/tree-ssa-loop-manip.c b/gcc/tree-ssa-loop-manip.c
index 87b2c0d..f8ddbab 100644
--- a/gcc/tree-ssa-loop-manip.c
+++ b/gcc/tree-ssa-loop-manip.c
@@ -931,6 +931,9 @@ tree_transform_and_unroll_loop (struct loop *loop, unsigned factor,
   gcc_assert (new_loop != NULL);
   update_ssa (TODO_update_ssa);
 
+  /* NEW_LOOP is a post-loop.  */
+  mark_pre_or_post_loop (new_loop);
+
   /* Determine the probability of the exit edge of the unrolled loop.  */
   new_est_niter = est_niter / factor;
 
diff --git a/gcc/tree-ssa-loop-niter.c b/gcc/tree-ssa-loop-niter.c
index ee85f6f..33e8cc3 100644
--- a/gcc/tree-ssa-loop-niter.c
+++ b/gcc/tree-ssa-loop-niter.c
@@ -3011,6 +3011,26 @@ estimate_numbers_of_iterations (bool use_undefined_p)
   fold_undefer_and_ignore_overflow_warnings ();
 }
 
+/* Mark LOOP as a pre- or post loop.  */
+
+void
+mark_pre_or_post_loop (struct loop *loop)
+{
+  gcc_assert (loop && loop->header);
+  loop->header->flags |= BB_PRE_POST_LOOP_HEADER;
+}
+
+/* Return true if LOOP is a pre- or post loop.  */
+
+bool
+pre_or_post_loop_p (struct loop *loop)
+{
+  int masked_flags;
+  gcc_assert (loop && loop->header);
+  masked_flags = (loop->header->flags & BB_PRE_POST_LOOP_HEADER);
+  return (masked_flags != 0);
+}
+
 /* Returns true if statement S1 dominates statement S2.  */
 
 bool
diff --git a/gcc/tree-ssa-loop-prefetch.c b/gcc/tree-ssa-loop-prefetch.c
index 59c65d3..5c9f640 100644
--- a/gcc/tree-ssa-loop-prefetch.c
+++ b/gcc/tree-ssa-loop-prefetch.c
@@ -1793,6 +1793,13 @@ loop_prefetch_arrays (struct loop *loop)
       return false;
     }
 
+  if (pre_or_post_loop_p (loop))
+    {
+      if (dump_file && (dump_flags & TDF_DETAILS))
+	fprintf (dump_file, "  Not Prefetching -- pre- or post loop\n");
+      return false;
+    }
+
   /* FIXME: the time should be weighted by the probabilities of the blocks in
      the loop body.  */
   time = tree_num_loop_insns (loop, &eni_time_weights);
diff --git a/gcc/tree-ssa-loop-unswitch.c b/gcc/tree-ssa-loop-unswitch.c
index b6b32dc..f3b8108 100644
--- a/gcc/tree-ssa-loop-unswitch.c
+++ b/gcc/tree-ssa-loop-unswitch.c
@@ -88,6 +88,14 @@ tree_ssa_unswitch_loops (void)
       if (dump_file && (dump_flags & TDF_DETAILS))
         fprintf (dump_file, ";; Considering loop %d\n", loop->num);
 
+     /* Do not unswitch a pre- or post loop.  */
+     if (pre_or_post_loop_p (loop))
+       {
+          if (dump_file && (dump_flags & TDF_DETAILS))
+            fprintf (dump_file, ";; Not unswitching, a pre- or post loop\n");
+          continue;
+        }
+
       /* Do not unswitch in cold regions. */
       if (optimize_loop_for_size_p (loop))
         {
diff --git a/gcc/tree-vect-loop-manip.c b/gcc/tree-vect-loop-manip.c
index 6ecd304..9a63f7e 100644
--- a/gcc/tree-vect-loop-manip.c
+++ b/gcc/tree-vect-loop-manip.c
@@ -1938,6 +1938,10 @@ vect_do_peeling_for_loop_bound (loop_vec_info loop_vinfo, tree *ratio,
 					    cond_expr, cond_expr_stmt_list);
   gcc_assert (new_loop);
   gcc_assert (loop_num == loop->num);
+  
+  /* NEW_LOOP is a post loop.  */
+  mark_pre_or_post_loop (new_loop);
+
 #ifdef ENABLE_CHECKING
   slpeel_verify_cfg_after_peeling (loop, new_loop);
 #endif
@@ -2191,6 +2195,10 @@ vect_do_peeling_for_alignment (loop_vec_info loop_vinfo)
 				   th, true, NULL_TREE, NULL);
 
   gcc_assert (new_loop);
+
+  /* NEW_LOOP is a pre-loop.  */
+  mark_pre_or_post_loop (new_loop);
+  
 #ifdef ENABLE_CHECKING
   slpeel_verify_cfg_after_peeling (new_loop, loop);
 #endif
-- 
1.6.3.3


^ permalink raw reply	[flat|nested] 36+ messages in thread

end of thread, other threads:[~2011-01-05 14:11 UTC | newest]

Thread overview: 36+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2010-12-13 21:57 [PATCH, Loop optimizer]: Add logic to disable certain loop optimizations on pre-/post-loops Fang, Changpeng
2010-12-14  0:56 ` Sebastian Pop
2010-12-14  8:25 ` Zdenek Dvorak
2010-12-14 20:02   ` Fang, Changpeng
2010-12-14 21:55     ` Zdenek Dvorak
2010-12-15  6:16       ` Richard Guenther
2010-12-15  8:34         ` Fang, Changpeng
2010-12-15  9:22           ` Xinliang David Li
2010-12-15 10:00             ` Zdenek Dvorak
2010-12-15 16:46               ` Fang, Changpeng
2010-12-15 16:47                 ` Zdenek Dvorak
2010-12-15 17:08               ` Xinliang David Li
2010-12-16 12:09               ` Richard Guenther
2010-12-16 12:41                 ` Zdenek Dvorak
2010-12-16 18:26                   ` Fang, Changpeng
2010-12-16 20:06                     ` Zdenek Dvorak
2010-12-17  3:53                       ` Fang, Changpeng
2010-12-17  6:36                         ` Jack Howarth
2010-12-17  9:55                           ` Fang, Changpeng
2010-12-17 16:13                             ` Jack Howarth
2010-12-17 16:48                               ` Fang, Changpeng
2010-12-17 17:20                                 ` Jack Howarth
2010-12-17 18:01                                 ` Jack Howarth
2010-12-17 18:31                                   ` Fang, Changpeng
2011-01-04  3:33                             ` Jack Howarth
2011-01-04 22:40                               ` Fang, Changpeng
2011-01-05 12:07                                 ` Richard Guenther
2011-01-05 14:12                                 ` Jack Howarth
2010-12-17 21:45                           ` Jack Howarth
2010-12-17 22:35                             ` Fang, Changpeng
2010-12-17 22:54                               ` [PATCH, Loop optimizer]: Add logic to disable certain loop optimizations on pre-/post-loopsj Jack Howarth
2010-12-18 11:35                               ` [PATCH, Loop optimizer]: Add logic to disable certain loop optimizations on pre-/post-loops Jack Howarth
2010-12-19  0:29                     ` Richard Guenther
2010-12-21 22:45                       ` Fang, Changpeng
2010-12-14 16:13 ` Jack Howarth
2010-12-14 17:33   ` Fang, Changpeng

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).