public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed
* Re: Preliminary patch for PR23820 and PR24309
@ 2007-10-17 13:16 Ayal Zaks
  2007-10-17 15:47 ` Sebastian Pop
  0 siblings, 1 reply; 26+ messages in thread
From: Ayal Zaks @ 2007-10-17 13:16 UTC (permalink / raw)
  To: Sebastian Pop, Daniel Berlin; +Cc: Zdenek Dvorak, gcc-patches


> On 10/15/07, Zdenek Dvorak <rakdver@kam.mff.cuni.cz> wrote:
> > > After this patch, as we won't have other PRs open on the loop
> > > interchange (otherwise ping me on these and I'll give them a look),
> > > would it be possible to enable loop interchange at -O3 or -O2 level?
> >
> > I think enabling it at -O3 for now would be a good idea,
> >
>
> I just bootstrapped and tested a patch with loop interchange
> enabled at -O3, and there were 6 regressions with it:
>
> gfortran.sum gfortran.dg/alloc_comp_assign_2.f90
> gfortran.sum gfortran.dg/pr19928-2.f90
> gfortran.sum gfortran.fortran-torture/execute/scalarize2.f90
> gfortran.sum gfortran.fortran-torture/execute/scalarize.f90
> gfortran.sum gfortran.fortran-torture/execute/where_1.f90
> gfortran.sum gfortran.fortran-torture/execute/where_6.f90
>
> They are all failing the execution test for -O3.  I'll have a look at
> these fails before proposing a patch to enable loop interchange
> at -O3.

A couple of points to consider/clarify:

1. Are the statistics gathered by gather_interchange_stats() invariant, in
the sense that you can pre-gather them once per loop and cache them,
instead of re-gathering them twice for each pair of loops? This may save
compile time.

2. The priority function
>     if (dependence_steps_i < dependence_steps_j
>         || nb_deps_not_carried_by_i > nb_deps_not_carried_by_j
>         || double_int_ucmp (access_strides_i, access_strides_j) < 0)

is not 'stable' in the sense that you may prefer to interchange i with j,
and if asked again prefer to interchange back. If point 1 above is true,
you are (bubble) sorting the loops according to their priorities, using
unstable comparisons. (Don't iterate until convergence ;-).

3. Can dist
>       else if (dist < 0)
>         (*dependence_steps) += -dist;

be negative?


4. Not too clear an example ...:

>    Example: for the following loop,
>
>    | loop_1 runs 1335 times
>    |   loop_2 runs 1335 times
>    |     A[{{0, +, 1}_1, +, 1335}_2]
>    |     B[{{0, +, 1}_1, +, 1335}_2]
>    |   endloop_2
>    |   A[{0, +, 1336}_1]
>    | endloop_1
>
>    gather_interchange_stats (in loop_1) will return
>    DEPENDENCE_STEPS = 3002
>    NB_DEPS_NOT_CARRIED_BY_LOOP = 5
>    ACCESS_STRIDES = 10694
>
>    gather_interchange_stats (in loop_2) will return
>    DEPENDENCE_STEPS = 3000
>    NB_DEPS_NOT_CARRIED_BY_LOOP = 7
>    ACCESS_STRIDES = 8010

Thanks to Ira and Victor,
Ayal.

^ permalink raw reply	[flat|nested] 26+ messages in thread
* Preliminary patch for PR23820 and PR24309
@ 2007-10-11 21:32 Sebastian Pop
  2007-10-12  1:16 ` Zdenek Dvorak
  0 siblings, 1 reply; 26+ messages in thread
From: Sebastian Pop @ 2007-10-11 21:32 UTC (permalink / raw)
  To: GCC Patches

[-- Attachment #1: Type: text/plain, Size: 520 bytes --]

Hi,

Attached is a patch that fixes both problems with the lambda
framework.  While testing, I observed that ltrans-3.c now fails
because there are more than 3 basic blocks in the outer loop, making
the new implementation of the perfect_nest_p saying that this is not a
perfect nest.

What is the name of the pass that cannonicalizes the CFG removing
empty basic blocks containing only labels, or just a goto, as the
following one?


 bb_9 (preds = {bb_7 }, succs = {bb_10 })
    {
    <bb 9>:

    }

Thanks,
Sebastian

[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #2: 803_pr24309.diff --]
[-- Type: text/x-patch; name="803_pr24309.diff", Size: 8755 bytes --]

Index: tree-loop-linear.c
===================================================================
--- tree-loop-linear.c	(revision 129032)
+++ tree-loop-linear.c	(working copy)
@@ -244,6 +244,46 @@ try_interchange_loops (lambda_trans_matr
   return trans;
 }
 
+/* Return the number of nested loops in LOOP_NEST, or 0 if the loops
+   are not perfectly nested.  */
+
+static unsigned int
+perfect_loop_nest_depth (struct loop *loop_nest)
+{
+  struct loop *temp;
+  unsigned int depth = 1;
+
+  /* If it's not a loop nest, we don't want it.  We also don't handle
+     sibling loops properly, which are loops of the following form:
+
+     | for (i = 0; i < 50; i++)
+     |   {
+     |     for (j = 0; j < 50; j++)
+     |       {
+     |        ...
+     |       }
+     |     for (j = 0; j < 50; j++)
+     |       {
+     |        ...
+     |       }
+     |   }
+  */
+
+  if (!loop_nest->inner || !single_exit (loop_nest))
+    return 0;
+
+  for (temp = loop_nest->inner; temp; temp = temp->inner)
+    {
+      /* If we have a sibling loop or multiple exit edges, jump ship.  */
+      if (temp->next || !single_exit (temp))
+	return 0;
+
+      depth++;
+    }
+
+  return depth;
+}
+
 /* Perform a set of linear transforms on loops.  */
 
 void
@@ -263,47 +303,18 @@ linear_transform_loops (void)
       unsigned int depth = 0;
       VEC (ddr_p, heap) *dependence_relations;
       VEC (data_reference_p, heap) *datarefs;
-      struct loop *temp;
       lambda_loopnest before, after;
       lambda_trans_matrix trans;
-      bool problem = false;
       struct obstack lambda_obstack;
       gcc_obstack_init (&lambda_obstack);
 
-      /* If it's not a loop nest, we don't want it.
-         We also don't handle sibling loops properly, 
-         which are loops of the following form:
-         for (i = 0; i < 50; i++)
-           {
-             for (j = 0; j < 50; j++)
-               {
-	        ...
-               }
-             for (j = 0; j < 50; j++)
-               {
-                ...
-               }
-           } */
-      if (!loop_nest->inner || !single_exit (loop_nest))
+      depth = perfect_loop_nest_depth (loop_nest);
+      if (depth == 0)
 	continue;
+
       VEC_truncate (tree, oldivs, 0);
       VEC_truncate (tree, invariants, 0);
-      depth = 1;
-      for (temp = loop_nest->inner; temp; temp = temp->inner)
-	{
-	  /* If we have a sibling loop or multiple exit edges, jump ship.  */
-	  if (temp->next || !single_exit (temp))
-	    {
-	      problem = true;
-	      break;
-	    }
-	  depth ++;
-	}
-      if (problem)
-	continue;
 
-      /* Analyze data references and dependence relations using scev.  */      
- 
       datarefs = VEC_alloc (data_reference_p, heap, 10);
       dependence_relations = VEC_alloc (ddr_p, heap, 10 * 10);
       compute_data_dependences_for_loop (loop_nest, true, &datarefs,
@@ -353,12 +364,12 @@ linear_transform_loops (void)
 	  print_lambda_loopnest (dump_file, after, 'u');
 	}
 
-      lambda_loopnest_to_gcc_loopnest (loop_nest, oldivs, invariants,
-				       &remove_ivs,
-                                       after, trans, &lambda_obstack);
-      modified = true;
+      modified = lambda_loopnest_to_gcc_loopnest (loop_nest, oldivs, 
+						  invariants, &remove_ivs,
+						  after, trans,
+						  &lambda_obstack);
 
-      if (dump_file)
+      if (modified && dump_file)
 	fprintf (dump_file, "Successfully transformed loop.\n");
 
     free_and_continue:
Index: testsuite/gcc.dg/tree-ssa/pr23820.c
===================================================================
--- testsuite/gcc.dg/tree-ssa/pr23820.c	(revision 0)
+++ testsuite/gcc.dg/tree-ssa/pr23820.c	(revision 0)
@@ -0,0 +1,26 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -ftree-loop-linear" } */
+
+int t [2][4];
+
+void foo (void)
+{
+  int i, j, k, v;
+  float e;
+  for (;;)
+    {
+      v = 0;
+      for (j = 0; j < 2; j ++)
+        {
+          for (k = 2; k < 4; k ++)
+            {
+              e = 0.0;
+              for (i = 0; i < 4; i ++)
+                e += t [j][i];
+              if (e)
+                v = j;
+            }
+        }
+      t [v][0] = 0;
+    }
+}
Index: testsuite/gcc.dg/tree-ssa/pr24309.c
===================================================================
--- testsuite/gcc.dg/tree-ssa/pr24309.c	(revision 0)
+++ testsuite/gcc.dg/tree-ssa/pr24309.c	(revision 0)
@@ -0,0 +1,18 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -ftree-loop-linear" } */
+
+float weight[10];
+void lsp_weight_quant(float *x, char *cdbk)
+{
+   int i,j;
+   float dist;
+   int best_id=0;
+   for (i=0;i<16;i++)
+   {
+      for (j=0;j<10;j++)
+         dist=dist+weight[j];
+      if (dist<0)
+         best_id=i;
+   }
+   x[j] = cdbk[best_id*10+j];
+}
Index: lambda.h
===================================================================
--- lambda.h	(revision 129032)
+++ lambda.h	(working copy)
@@ -204,7 +204,7 @@ lambda_loopnest gcc_loopnest_to_lambda_l
 						 VEC(tree,heap) **,
                                                  VEC(tree,heap) **,
                                                  struct obstack *);
-void lambda_loopnest_to_gcc_loopnest (struct loop *,
+bool lambda_loopnest_to_gcc_loopnest (struct loop *,
 				      VEC(tree,heap) *, VEC(tree,heap) *,
 				      VEC(tree,heap) **,
                                       lambda_loopnest, lambda_trans_matrix,
Index: lambda-code.c
===================================================================
--- lambda-code.c	(revision 129032)
+++ lambda-code.c	(working copy)
@@ -1675,6 +1675,37 @@ remove_iv (tree iv_stmt)
     }
 }
 
+/* Return true if we are sure that the code generation cannot be
+   performed for this loop nest.  */
+
+static bool
+lambda_cannot_generate_loop_nest (VEC (tree, heap) *old_ivs)
+{
+  unsigned i;
+  tree oldiv;
+
+  for (i = 0; VEC_iterate (tree, old_ivs, i, oldiv); i++)
+    {
+      imm_use_iterator imm_iter;
+      tree oldiv_def;
+      tree oldiv_stmt = SSA_NAME_DEF_STMT (oldiv);
+      tree stmt;
+
+      if (TREE_CODE (oldiv_stmt) == PHI_NODE)
+        oldiv_def = PHI_RESULT (oldiv_stmt);
+      else
+	oldiv_def = SINGLE_SSA_TREE_OPERAND (oldiv_stmt, SSA_OP_DEF);
+      gcc_assert (oldiv_def != NULL_TREE);
+
+      FOR_EACH_IMM_USE_STMT (stmt, imm_iter, oldiv_def)
+        {
+	  if (TREE_CODE (stmt) == PHI_NODE)
+	    return true;
+	}
+    }
+
+  return false;
+}
 
 /* Transform a lambda loopnest NEW_LOOPNEST, which had TRANSFORM applied to
    it, back into gcc code.  This changes the
@@ -1686,9 +1717,11 @@ remove_iv (tree iv_stmt)
    INVARIANTS is a vector of loop invariants from the old loopnest.
    NEW_LOOPNEST is the new lambda loopnest to replace OLD_LOOPNEST with.
    TRANSFORM is the matrix transform that was applied to OLD_LOOPNEST to get 
-   NEW_LOOPNEST.  */
+   NEW_LOOPNEST.  
+
+   Returns true when the code generation succeeded.  */
 
-void
+bool
 lambda_loopnest_to_gcc_loopnest (struct loop *old_loopnest,
 				 VEC(tree,heap) *old_ivs,
 				 VEC(tree,heap) *invariants,
@@ -1705,6 +1738,14 @@ lambda_loopnest_to_gcc_loopnest (struct 
   
   block_stmt_iterator bsi;
 
+  if (lambda_cannot_generate_loop_nest (old_ivs))
+    {
+      if (dump_file)
+	fprintf (dump_file, "Lambda code generation failed to produce code.\n");
+
+      return false;
+    }
+
   if (dump_file)
     {
       transform = lambda_trans_matrix_inverse (transform);
@@ -1865,6 +1906,8 @@ lambda_loopnest_to_gcc_loopnest (struct 
       VEC_safe_push (tree, heap, *remove_ivs, oldiv_stmt);
     }
   VEC_free (tree, heap, new_ivs);
+
+  return true;
 }
 
 /* Return TRUE if this is not interesting statement from the perspective of
@@ -1969,18 +2012,30 @@ bool
 perfect_nest_p (struct loop *loop)
 {
   basic_block *bbs;
-  size_t i;
+  size_t i, n_bb = 0;
   tree exit_cond;
 
+  /* Loops at depth 0 are perfect nests.  */
   if (!loop->inner)
     return true;
+
   bbs = get_loop_body (loop);
   exit_cond = get_loop_exit_condition (loop);
+
   for (i = 0; i < loop->num_nodes; i++)
     {
       if (bbs[i]->loop_father == loop)
 	{
 	  block_stmt_iterator bsi;
+
+	  /* A nest is not perfect when there are more than three
+	     basic blocks per non innermost loop.  */
+	  if (n_bb++ >= 3)
+	    {
+	      free (bbs);
+	      return false;
+	    }
+
 	  for (bsi = bsi_start (bbs[i]); !bsi_end_p (bsi); bsi_next (&bsi))
 	    {
 	      tree stmt = bsi_stmt (bsi);
@@ -1993,11 +2048,10 @@ perfect_nest_p (struct loop *loop)
 	    }
 	}
     }
+
   free (bbs);
-  /* See if the inner loops are perfectly nested as well.  */
-  if (loop->inner)    
-    return perfect_nest_p (loop->inner);
-  return true;
+
+  return perfect_nest_p (loop->inner);
 }
 
 /* Replace the USES of X in STMT, or uses with the same step as X with Y.

^ permalink raw reply	[flat|nested] 26+ messages in thread

end of thread, other threads:[~2007-10-22 14:36 UTC | newest]

Thread overview: 26+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2007-10-17 13:16 Preliminary patch for PR23820 and PR24309 Ayal Zaks
2007-10-17 15:47 ` Sebastian Pop
2007-10-18 10:01   ` Ayal Zaks
  -- strict thread matches above, loose matches on Subject: below --
2007-10-11 21:32 Sebastian Pop
2007-10-12  1:16 ` Zdenek Dvorak
2007-10-12 21:08   ` Sebastian Pop
2007-10-12 21:21     ` Richard Guenther
2007-10-12 22:10       ` Zdenek Dvorak
2007-10-13 12:23         ` Richard Guenther
2007-10-12 22:16     ` Zdenek Dvorak
2007-10-12 22:32       ` Sebastian Pop
2007-10-12 23:05         ` Sebastian Pop
2007-10-12 23:18           ` Sebastian Pop
2007-10-13  0:06             ` Zdenek Dvorak
2007-10-13 12:26             ` Richard Guenther
2007-10-13 16:06               ` Sebastian Pop
2007-10-13 17:19                 ` Richard Guenther
2007-10-13 17:35                   ` Richard Guenther
2007-10-13 17:43                     ` Sebastian Pop
2007-10-14  2:25                     ` Zdenek Dvorak
2007-10-15 17:49     ` Sebastian Pop
2007-10-15 18:13       ` Zdenek Dvorak
2007-10-16 21:56         ` Sebastian Pop
2007-10-21 20:58       ` Eric Botcazou
2007-10-22 15:14         ` Sebastian Pop
2007-10-22 15:15           ` Eric Botcazou

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).