public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed
* [PATCH]: Fix PR 17672
@ 2004-10-17  4:36 Daniel Berlin
  2004-10-17 14:02 ` Daniel Berlin
  2004-10-25  4:01 ` Mark Mitchell
  0 siblings, 2 replies; 3+ messages in thread
From: Daniel Berlin @ 2004-10-17  4:36 UTC (permalink / raw)
  To: gcc-patches; +Cc: mark

[-- Attachment #1: Type: text/plain, Size: 3570 bytes --]

The attached patch fixes PR 17672, a failure bootstrapping with -ftree-
loop-linear.  

This patch subsumes the fixes and testcases in
http://gcc.gnu.org/ml/gcc-patches/2004-10/msg01040.html
and adds another testcase.
Besides the explanation for those fixes listed in that message, the
following problems were fixed, all of which were discovered from the
single loop we interchange in caller-save.c :)

build_classic_dist_vector was not setting the size of the vector to the
correct number, causing garbage to be printed in dumps and handed to
clients who thought they were still within the bounds of the vector.

The gcc-tree-to-linear-expression part was incorrectly setting the
constant part of the linear expression for the bounds to the extra
amount added for GT/LT comparisons, instead of adding that to the
existing constant.

So we would claim that a loop like

for (i = 53; i > 3; i--)

Had an upper bound of 53, and a lower bound of 1 (instead of 4).

The tree-optimize portion of this patch is a somewhat of a punt on the
best solution until at least 4.1, unless Mark would like otherwise.
I'm happy to try to add some quick condition testing, but i believe he
may rather want to just document this as a known bug, especially since -
ftree-loop-linear is off by default.

What happens is this:

ivcanon creates a new canonical iv, but does no iv substitution,
resulting in 2 independent iv's for each loop.
linear loop transforms happily decides it's okay to transform these
loops (which it is), but only replaces the one iv it knows about, which
happens to not be the one used to access the arrays.
Bad karma ensues.

The reordering of the passes is actually more correct than we had
before. We want linear loop transforms to run before the vectorizer and
if-conversion, since it may help expose more loops to them.

However, the punting part is that we have simply moved it before ivcanon
so that for the one loop we interchange during gcc bootstrap, we don't
end up with multiple ivs per loop.

A real fix is twofold:
1. Have IVS run before linear loop transforms (but not strength
reduction), so that we hopefully end up with a single iv per loop, which
is what we really want for linear loop transforms.  If multiple ivs is
somehow more optimal for a given loop, we should create them well after
linear loop transforms.
2. Keep an up-to-date list of ivs for each loop, so that linear loop
transform can tell when there are multiple ivs being used to access the
array references it's going to end up transforming (IE when 1 didn't do
what we want), so we can punt.

However, this is too much for 4.0 (especially since i'm about to suggest
an intrusive change to fix another bug :P).

So the state is that unfortunately, I can easily construct a testcase
that fails at runtime (but isn't easy to just scan dump files for in a
way that won't break) even with the attached fix.

It would look something like this:

for (i = 0, ivtmp1 = 0; ivtmp1 < 50; ivtmp1++)
{
   for (j = 0, ivtmp2 = 0; ivtmp2 < 50; ivtmp2++)       
	{
		USE a[i][j];
		j++;
	}
   i++;
}

Mark, again, since this option is off by default, it's your call whether
you want me to 
1. fix the bugs found while fixing this PR that are slightly more
involved through some kludge
or

2. We just want to add something to the release notes saying we know it
can fail in some cases, stick with this patch alone, close 17672, and
create a new pr to track the remaining bugs targeted at 4.1.

Personally, i'd suggest 2, but you are the RM.  I'm happy to do the work
for 1 if that's what you want.



[-- Attachment #2: Type: text/x-patch, Size: 32576 bytes --]

2004-10-16  Daniel Berlin  <dberlin@dberlin.org>

	Fix PR tree-optimization/17672
	
	* lambda-code.c (lambda_lattice_compute_base): Fix reversed
	assert test.
	(gcc_tree_to_linear_expression): Add extra to existing constant.
	(depth_of_nest): Factor out function used in various places.
	(gcc_loop_to_lambda_loop): Clean up code a little bit. No
	functional changes.
	(find_induction_var_from_exit_cond): Stop guessing, and just
	get the right answer :).
	(gcc_loopnest_to_lambda_loopnest): Remove useless pre-allocation.
	Print out message about result of attempt to create perfect nest.
	(lbv_to_gcc_expression): Add type argument, use it to do math
	and induction variable creation.
	(lle_to_gcc_expression): Ditto.
	(lambda_loopnest_to_gcc_loopnest): Create new iv with same type as
	oldiv. Pass type argument to lle_to_gcc_expression and 
	lbv_to_gcc_expression.
	Reset number of iterations after transformation.
	(perfect_nestify): Remove useless pre-allocation, and cleanup
	a small amount.

	* tree-data-ref.c (build_classic_dist_vector): Use correct size
	for vector.
	(find_data_references_in_loop): Use get_loop_body, not testing of
	every single basic block to see if it is in the loop.

	* tree-loop-linear.c (linear_transform_loops): Use DDR_SIZE_VECT.
	Compute immediate uses.

	* tree-optimize.c: Move linear_transform_loops to before ivcanon.

Index: lambda-code.c
===================================================================
RCS file: /cvs/gcc/gcc/gcc/lambda-code.c,v
retrieving revision 2.13
diff -u -p -r2.13 lambda-code.c
--- lambda-code.c	6 Oct 2004 19:40:54 -0000	2.13
+++ lambda-code.c	17 Oct 2004 02:14:09 -0000
@@ -51,7 +51,7 @@
  Keshav Pingali for formal proofs that the various statements below are
  correct. 
 
- A loop iteration space are the points traversed by the loop.  A point in the
+ A loop iteration space represents the points traversed by the loop.  A point in the
  iteration space can be represented by a vector of size <loop depth>.  You can
  therefore represent the iteration space as a integral combinations of a set
  of basis vectors. 
@@ -116,7 +116,6 @@
  of the lattice.  */
 
 
-
 DEF_VEC_GC_P(int);
 
 static bool perfect_nestify (struct loops *, 
@@ -416,7 +415,7 @@ lambda_lattice_compute_base (lambda_loop
 	  /* Otherwise, we need the lower bound expression (which must
 	     be an affine function)  to determine the base.  */
 	  expression = LL_LOWER_BOUND (loop);
-	  gcc_assert (expression && LLE_NEXT (expression) 
+	  gcc_assert (expression && !LLE_NEXT (expression) 
 		      && LLE_DENOMINATOR (expression) == 1);
 
 	  /* The lower triangular portion of the base is going to be the
@@ -1150,7 +1149,7 @@ gcc_tree_to_linear_expression (int depth
 	lle = lambda_linear_expression_new (depth, 2 * depth);
 	LLE_CONSTANT (lle) = TREE_INT_CST_LOW (expr);
 	if (extra != 0)
-	  LLE_CONSTANT (lle) = extra;
+	  LLE_CONSTANT (lle) += extra;
 
 	LLE_DENOMINATOR (lle) = 1;
       }
@@ -1193,6 +1192,21 @@ gcc_tree_to_linear_expression (int depth
   return lle;
 }
 
+/* Return the depth of the loopnest NEST */
+
+static int 
+depth_of_nest (struct loop *nest)
+{
+  size_t depth = 0;
+  while (nest)
+    {
+      depth++;
+      nest = nest->inner;
+    }
+  return depth;
+}
+
+
 /* Return true if OP is invariant in LOOP and all outer loops.  */
 
 static bool
@@ -1236,7 +1250,7 @@ gcc_loop_to_lambda_loop (struct loop *lo
   tree test;
   int stepint;
   int extra = 0;
-  tree lboundvar, uboundvar;
+  tree lboundvar, uboundvar, uboundresult;
   use_optype uses;
 
   /* Find out induction var and exit condition.  */
@@ -1403,18 +1417,16 @@ gcc_loop_to_lambda_loop (struct loop *lo
   else if (TREE_CODE (test) == GT_EXPR)
     extra = -1 * stepint;
 
-  ubound = gcc_tree_to_linear_expression (depth,
-					  uboundvar,
+  ubound = gcc_tree_to_linear_expression (depth, uboundvar,
 					  outerinductionvars,
 					  *invariants, extra);
-  VEC_safe_push (tree, *uboundvars, build (PLUS_EXPR, integer_type_node,
-					uboundvar,
-					build_int_cst (integer_type_node, extra)));
+  uboundresult = build (PLUS_EXPR, TREE_TYPE (uboundvar), uboundvar,
+			build_int_cst (TREE_TYPE (uboundvar), extra));
+  VEC_safe_push (tree, *uboundvars, uboundresult);
   VEC_safe_push (tree, *lboundvars, lboundvar);
   VEC_safe_push (int, *steps, stepint);
   if (!ubound)
     {
-
       if (dump_file && (dump_flags & TDF_DETAILS))
 	fprintf (dump_file,
 		 "Unable to convert loop: Cannot convert upper bound to linear expression\n");
@@ -1444,26 +1456,17 @@ find_induction_var_from_exit_cond (struc
   test = TREE_OPERAND (expr, 0);
   if (!COMPARISON_CLASS_P (test))
     return NULL_TREE;
-  /* This is a guess.  We say that for a <,!=,<= b, a is the induction
-     variable.
-     For >, >=, we guess b is the induction variable.
-     If we are wrong, it'll fail the rest of the induction variable tests, and
-     everything will be fine anyway.  */
-  switch (TREE_CODE (test))
-    {
-    case LT_EXPR:
-    case LE_EXPR:
-    case NE_EXPR:
-      ivarop = TREE_OPERAND (test, 0);
-      break;      
-    case GT_EXPR:
-    case GE_EXPR:
-    case EQ_EXPR:
+
+  /* Find the side that is invariant in this loop. The ivar must be the other
+     side.  */
+  
+  if (expr_invariant_in_loop_p (loop, TREE_OPERAND (test, 0)))
       ivarop = TREE_OPERAND (test, 1);
-      break;
-    default:
-      gcc_unreachable();
-    }
+  else if (expr_invariant_in_loop_p (loop, TREE_OPERAND (test, 1)))
+      ivarop = TREE_OPERAND (test, 0);
+  else
+    return NULL_TREE;
+
   if (TREE_CODE (ivarop) != SSA_NAME)
     return NULL_TREE;
   return ivarop;
@@ -1488,25 +1491,14 @@ gcc_loopnest_to_lambda_loopnest (struct 
   struct loop *temp;
   int depth = 0;
   size_t i;
-  VEC (lambda_loop) *loops;
-  VEC (tree) *uboundvars;
-  VEC (tree) *lboundvars;
-  VEC (int) *steps;
+  VEC (lambda_loop) *loops = NULL;
+  VEC (tree) *uboundvars = NULL;
+  VEC (tree) *lboundvars  = NULL;
+  VEC (int) *steps = NULL;
   lambda_loop newloop;
   tree inductionvar = NULL;
-
-  temp = loop_nest;
-  while (temp)
-    {
-      depth++;
-      temp = temp->inner;
-    }
-  loops = VEC_alloc (lambda_loop, 1);
-  *inductionvars = VEC_alloc (tree, 1);
-  *invariants = VEC_alloc (tree, 1);
-  lboundvars = VEC_alloc (tree, 1);
-  uboundvars = VEC_alloc (tree, 1);
-  steps = VEC_alloc (int, 1);
+  
+  depth = depth_of_nest (loop_nest);
   temp = loop_nest;
   while (temp)
     {
@@ -1520,13 +1512,21 @@ gcc_loopnest_to_lambda_loopnest (struct 
       VEC_safe_push (lambda_loop, loops, newloop);
       temp = temp->inner;
     }
-  if (need_perfect_nest 
-      && !perfect_nestify (currloops, loop_nest, 
-			   lboundvars, uboundvars, steps, *inductionvars))
+  if (need_perfect_nest)
     {
-      if (dump_file)
-	fprintf (dump_file, "Not a perfect nest and couldn't convert to one.\n");    
-      return NULL;
+      if (!perfect_nestify (currloops, loop_nest, 
+			    lboundvars, uboundvars, steps, *inductionvars))
+	{
+	  if (dump_file)
+	    fprintf (dump_file, "Not a perfect loop nest and couldn't convert to one.\n");    
+	  return NULL;
+	}
+      else
+	{
+	  if (dump_file)
+	    fprintf (dump_file, "Successfully converted loop nest to perfect loop nest.\n");
+	}
+      
     }
   ret = lambda_loopnest_new (depth, 2 * depth);
   for (i = 0; VEC_iterate (lambda_loop, loops, i, newloop); i++)
@@ -1536,22 +1536,26 @@ gcc_loopnest_to_lambda_loopnest (struct 
 
 }
 
+
 /* Convert a lambda body vector LBV to a gcc tree, and return the new tree. 
    STMTS_TO_INSERT is a pointer to a tree where the statements we need to be
    inserted for us are stored.  INDUCTION_VARS is the array of induction
-   variables for the loop this LBV is from.  */
+   variables for the loop this LBV is from.  TYPE is the tree type to use for
+   the variables and trees involved.  */
 
 static tree
-lbv_to_gcc_expression (lambda_body_vector lbv,
-		       VEC (tree) *induction_vars, tree * stmts_to_insert)
+lbv_to_gcc_expression (lambda_body_vector lbv, 
+		       tree type, VEC (tree) *induction_vars, 
+		       tree * stmts_to_insert)
 {
   tree stmts, stmt, resvar, name;
+  tree iv;
   size_t i;
   tree_stmt_iterator tsi;
 
   /* Create a statement list and a linear expression temporary.  */
   stmts = alloc_stmt_list ();
-  resvar = create_tmp_var (integer_type_node, "lbvtmp");
+  resvar = create_tmp_var (type, "lbvtmp");
   add_referenced_tmp_var (resvar);
 
   /* Start at 0.  */
@@ -1561,39 +1565,40 @@ lbv_to_gcc_expression (lambda_body_vecto
   tsi = tsi_last (stmts);
   tsi_link_after (&tsi, stmt, TSI_CONTINUE_LINKING);
 
-  for (i = 0; i < VEC_length (tree ,induction_vars) ; i++)
+  for (i = 0; VEC_iterate (tree, induction_vars, i, iv); i++)
     {
       if (LBV_COEFFICIENTS (lbv)[i] != 0)
 	{
 	  tree newname;
-
+	  tree coeffmult;
+	  
 	  /* newname = coefficient * induction_variable */
+	  coeffmult = build_int_cst (type, LBV_COEFFICIENTS (lbv)[i]);
 	  stmt = build (MODIFY_EXPR, void_type_node, resvar,
-			fold (build (MULT_EXPR, integer_type_node,
-				     VEC_index (tree, induction_vars, i),
-				     build_int_cst (integer_type_node,
-						    LBV_COEFFICIENTS (lbv)[i]))));
+			fold (build (MULT_EXPR, type, iv, coeffmult)));
+
 	  newname = make_ssa_name (resvar, stmt);
 	  TREE_OPERAND (stmt, 0) = newname;
 	  tsi = tsi_last (stmts);
 	  tsi_link_after (&tsi, stmt, TSI_CONTINUE_LINKING);
+
 	  /* name = name + newname */
 	  stmt = build (MODIFY_EXPR, void_type_node, resvar,
-			build (PLUS_EXPR, integer_type_node, name, newname));
+			build (PLUS_EXPR, type, name, newname));
 	  name = make_ssa_name (resvar, stmt);
 	  TREE_OPERAND (stmt, 0) = name;
 	  tsi = tsi_last (stmts);
 	  tsi_link_after (&tsi, stmt, TSI_CONTINUE_LINKING);
+
 	}
     }
 
   /* Handle any denominator that occurs.  */
   if (LBV_DENOMINATOR (lbv) != 1)
     {
+      tree denominator = build_int_cst (type, LBV_DENOMINATOR (lbv));
       stmt = build (MODIFY_EXPR, void_type_node, resvar,
-		    build (CEIL_DIV_EXPR, integer_type_node,
-			   name, build_int_cst (integer_type_node,
-						LBV_DENOMINATOR (lbv))));
+		    build (CEIL_DIV_EXPR, type, name, denominator));
       name = make_ssa_name (resvar, stmt);
       TREE_OPERAND (stmt, 0) = name;
       tsi = tsi_last (stmts);
@@ -1608,6 +1613,7 @@ lbv_to_gcc_expression (lambda_body_vecto
    Return the tree that represents the final value of the expression.
    LLE is the linear expression to convert.
    OFFSET is the linear offset to apply to the expression.
+   TYPE is the tree type to use for the variables and math. 
    INDUCTION_VARS is a vector of induction variables for the loops.
    INVARIANTS is a vector of the loop nest invariants.
    WRAP specifies what tree code to wrap the results in, if there is more than
@@ -1618,6 +1624,7 @@ lbv_to_gcc_expression (lambda_body_vecto
 static tree
 lle_to_gcc_expression (lambda_linear_expression lle,
 		       lambda_linear_expression offset,
+		       tree type,
 		       VEC(tree) *induction_vars,
 		       VEC(tree) *invariants,
 		       enum tree_code wrap, tree * stmts_to_insert)
@@ -1625,14 +1632,14 @@ lle_to_gcc_expression (lambda_linear_exp
   tree stmts, stmt, resvar, name;
   size_t i;
   tree_stmt_iterator tsi;
-  VEC(tree) *results;
+  tree iv, invar;
+  VEC(tree) *results = NULL;
 
   name = NULL_TREE;
   /* Create a statement list and a linear expression temporary.  */
   stmts = alloc_stmt_list ();
-  resvar = create_tmp_var (integer_type_node, "lletmp");
+  resvar = create_tmp_var (type, "lletmp");
   add_referenced_tmp_var (resvar);
-  results = VEC_alloc (tree, 1);
 
   /* Build up the linear expressions, and put the variable representing the
      result in the results array.  */
@@ -1648,7 +1655,7 @@ lle_to_gcc_expression (lambda_linear_exp
       /* First do the induction variables.  
          at the end, name = name + all the induction variables added
          together.  */
-      for (i = 0; i < VEC_length (tree ,induction_vars); i++)
+      for (i = 0; VEC_iterate (tree, induction_vars, i, iv); i++)
 	{
 	  if (LLE_COEFFICIENTS (lle)[i] != 0)
 	    {
@@ -1663,11 +1670,9 @@ lle_to_gcc_expression (lambda_linear_exp
 		}
 	      else
 		{
-		  coeff = build_int_cst (integer_type_node,
+		  coeff = build_int_cst (type,
 					 LLE_COEFFICIENTS (lle)[i]);
-		  mult = fold (build (MULT_EXPR, integer_type_node,
-				      VEC_index (tree, induction_vars, i),
-				      coeff));
+		  mult = fold (build (MULT_EXPR, type, iv, coeff));
 		}
 
 	      /* newname = mult */
@@ -1679,8 +1684,7 @@ lle_to_gcc_expression (lambda_linear_exp
 
 	      /* name = name + newname */
 	      stmt = build (MODIFY_EXPR, void_type_node, resvar,
-			    build (PLUS_EXPR, integer_type_node,
-				   name, newname));
+			    build (PLUS_EXPR, type, name, newname));
 	      name = make_ssa_name (resvar, stmt);
 	      TREE_OPERAND (stmt, 0) = name;
 	      tsi = tsi_last (stmts);
@@ -1691,26 +1695,23 @@ lle_to_gcc_expression (lambda_linear_exp
       /* Handle our invariants.
          At the end, we have name = name + result of adding all multiplied
          invariants.  */
-      for (i = 0; i < VEC_length (tree, invariants); i++)
+      for (i = 0; VEC_iterate (tree, invariants, i, invar); i++)
 	{
 	  if (LLE_INVARIANT_COEFFICIENTS (lle)[i] != 0)
 	    {
 	      tree newname;
 	      tree mult;
 	      tree coeff;
-
+	      int invcoeff = LLE_INVARIANT_COEFFICIENTS (lle)[i];
 	      /* mult = invariant * coefficient  */
-	      if (LLE_INVARIANT_COEFFICIENTS (lle)[i] == 1)
+	      if (invcoeff == 1)
 		{
-		  mult = VEC_index (tree, invariants, i);
+		  mult = invar;
 		}
 	      else
 		{
-		  coeff = build_int_cst (integer_type_node,
-					 LLE_INVARIANT_COEFFICIENTS (lle)[i]);
-		  mult = fold (build (MULT_EXPR, integer_type_node,
-				      VEC_index (tree, invariants, i),
-				      coeff));
+		  coeff = build_int_cst (type, invcoeff);
+		  mult = fold (build (MULT_EXPR, type, invar, coeff));
 		}
 
 	      /* newname = mult */
@@ -1722,8 +1723,7 @@ lle_to_gcc_expression (lambda_linear_exp
 
 	      /* name = name + newname */
 	      stmt = build (MODIFY_EXPR, void_type_node, resvar,
-			    build (PLUS_EXPR, integer_type_node,
-				   name, newname));
+			    build (PLUS_EXPR, type, name, newname));
 	      name = make_ssa_name (resvar, stmt);
 	      TREE_OPERAND (stmt, 0) = name;
 	      tsi = tsi_last (stmts);
@@ -1736,9 +1736,8 @@ lle_to_gcc_expression (lambda_linear_exp
       if (LLE_CONSTANT (lle) != 0)
 	{
 	  stmt = build (MODIFY_EXPR, void_type_node, resvar,
-			build (PLUS_EXPR, integer_type_node,
-			       name, build_int_cst (integer_type_node,
-						    LLE_CONSTANT (lle))));
+			build (PLUS_EXPR, type, name, 
+			       build_int_cst (type, LLE_CONSTANT (lle))));
 	  name = make_ssa_name (resvar, stmt);
 	  TREE_OPERAND (stmt, 0) = name;
 	  tsi = tsi_last (stmts);
@@ -1750,9 +1749,8 @@ lle_to_gcc_expression (lambda_linear_exp
       if (LLE_CONSTANT (offset) != 0)
 	{
 	  stmt = build (MODIFY_EXPR, void_type_node, resvar,
-			build (PLUS_EXPR, integer_type_node,
-			       name, build_int_cst (integer_type_node,
-						    LLE_CONSTANT (offset))));
+			build (PLUS_EXPR, type, name, 
+			       build_int_cst (type, LLE_CONSTANT (offset))));
 	  name = make_ssa_name (resvar, stmt);
 	  TREE_OPERAND (stmt, 0) = name;
 	  tsi = tsi_last (stmts);
@@ -1764,14 +1762,12 @@ lle_to_gcc_expression (lambda_linear_exp
 	{
 	  if (wrap == MAX_EXPR)
 	    stmt = build (MODIFY_EXPR, void_type_node, resvar,
-			  build (CEIL_DIV_EXPR, integer_type_node,
-				 name, build_int_cst (integer_type_node,
-						      LLE_DENOMINATOR (lle))));
+			  build (CEIL_DIV_EXPR, type, name, 
+				 build_int_cst (type, LLE_DENOMINATOR (lle))));
 	  else if (wrap == MIN_EXPR)
 	    stmt = build (MODIFY_EXPR, void_type_node, resvar,
-			  build (FLOOR_DIV_EXPR, integer_type_node,
-				 name, build_int_cst (integer_type_node,
-						      LLE_DENOMINATOR (lle))));
+			  build (FLOOR_DIV_EXPR, type, name, 
+				 build_int_cst (type, LLE_DENOMINATOR (lle))));
 	  else
 	    gcc_unreachable();
 
@@ -1794,7 +1790,7 @@ lle_to_gcc_expression (lambda_linear_exp
       tree op1 = VEC_index (tree, results, 0);
       tree op2 = VEC_index (tree, results, 1);
       stmt = build (MODIFY_EXPR, void_type_node, resvar,
-		    build (wrap, integer_type_node, op1, op2));
+		    build (wrap, type, op1, op2));
       name = make_ssa_name (resvar, stmt);
       TREE_OPERAND (stmt, 0) = name;
       tsi = tsi_last (stmts);
@@ -1816,6 +1812,7 @@ lle_to_gcc_expression (lambda_linear_exp
    NEW_LOOPNEST is the new lambda loopnest to replace OLD_LOOPNEST with.
    TRANSFORM is the matrix transform that was applied to OLD_LOOPNEST to get 
    NEW_LOOPNEST.  */
+
 void
 lambda_loopnest_to_gcc_loopnest (struct loop *old_loopnest,
 				 VEC(tree) *old_ivs,
@@ -1827,7 +1824,9 @@ lambda_loopnest_to_gcc_loopnest (struct 
   struct loop *temp;
   size_t i = 0;
   size_t depth = 0;
-  VEC(tree) *new_ivs;
+  VEC(tree) *new_ivs = NULL;
+  tree oldiv;
+  
   block_stmt_iterator bsi;
 
   if (dump_file)
@@ -1836,13 +1835,7 @@ lambda_loopnest_to_gcc_loopnest (struct 
       fprintf (dump_file, "Inverse of transformation matrix:\n");
       print_lambda_trans_matrix (dump_file, transform);
     }
-  temp = old_loopnest;
-  new_ivs = VEC_alloc (tree, 1);
-  while (temp)
-    {
-      temp = temp->inner;
-      depth++;
-    }
+  depth = depth_of_nest (old_loopnest);
   temp = old_loopnest;
 
   while (temp)
@@ -1853,26 +1846,38 @@ lambda_loopnest_to_gcc_loopnest (struct 
       enum tree_code testtype;
       tree newupperbound, newlowerbound;
       lambda_linear_expression offset;
+      tree type;
+
+      oldiv = VEC_index (tree, old_ivs, i);
+      type = TREE_TYPE (oldiv);
+
       /* First, build the new induction variable temporary  */
 
-      ivvar = create_tmp_var (integer_type_node, "lnivtmp");
+      ivvar = create_tmp_var (type, "lnivtmp");
       add_referenced_tmp_var (ivvar);
 
       VEC_safe_push (tree, new_ivs, ivvar);
 
       newloop = LN_LOOPS (new_loopnest)[i];
 
+      /* We have to reset the number of iterations in the loop we are modifying,
+	 so that it recomputes the value.  If this was actually expensive,
+	 we could just compute the answer from the transformation matrix
+	 and the old values.  */
+      temp->nb_iterations = NULL_TREE;
+    
       /* Linear offset is a bit tricky to handle.  Punt on the unhandled
          cases for now.  */
       offset = LL_LINEAR_OFFSET (newloop);
-
+      
       gcc_assert (LLE_DENOMINATOR (offset) == 1 &&
 		  lambda_vector_zerop (LLE_COEFFICIENTS (offset), depth));
-      
+	    
       /* Now build the  new lower bounds, and insert the statements
          necessary to generate it on the loop preheader.  */
       newlowerbound = lle_to_gcc_expression (LL_LOWER_BOUND (newloop),
 					     LL_LINEAR_OFFSET (newloop),
+					     type,
 					     new_ivs,
 					     invariants, MAX_EXPR, &stmts);
       bsi_insert_on_edge (loop_preheader_edge (temp), stmts);
@@ -1881,6 +1886,7 @@ lambda_loopnest_to_gcc_loopnest (struct 
          basic block of the exit condition */
       newupperbound = lle_to_gcc_expression (LL_UPPER_BOUND (newloop),
 					     LL_LINEAR_OFFSET (newloop),
+					     type,
 					     new_ivs,
 					     invariants, MIN_EXPR, &stmts);
       exitcond = get_loop_exit_condition (temp);
@@ -1894,7 +1900,7 @@ lambda_loopnest_to_gcc_loopnest (struct 
       bb = EDGE_PRED (temp->latch, 0)->src;
       bsi = bsi_last (bb);
       create_iv (newlowerbound,
-		 build_int_cst (integer_type_node, LL_STEP (newloop)),
+		 build_int_cst (type, LL_STEP (newloop)),
 		 ivvar, temp, &bsi, false, &ivvar,
 		 &ivvarinced);
 
@@ -1910,14 +1916,13 @@ lambda_loopnest_to_gcc_loopnest (struct 
       i++;
       temp = temp->inner;
     }
-  
+
   /* Rewrite uses of the old ivs so that they are now specified in terms of
      the new ivs.  */
-  temp = old_loopnest;
-  for (i = 0; i < VEC_length (tree, old_ivs); i++)
+
+  for (i = 0; VEC_iterate (tree, old_ivs, i, oldiv); i++)
     {
       int j;
-      tree oldiv = VEC_index (tree, old_ivs, i);
       dataflow_t imm = get_immediate_uses (SSA_NAME_DEF_STMT (oldiv));
       for (j = 0; j < num_immediate_uses (imm); j++)
 	{
@@ -1929,14 +1934,17 @@ lambda_loopnest_to_gcc_loopnest (struct 
 	      if (USE_FROM_PTR (use_p) == oldiv)
 		{
 		  tree newiv, stmts;
-		  lambda_body_vector lbv;
+		  lambda_body_vector lbv, newlbv;
 		  /* Compute the new expression for the induction
 		     variable.  */
 		  depth = VEC_length (tree, new_ivs);
 		  lbv = lambda_body_vector_new (depth);
 		  LBV_COEFFICIENTS (lbv)[i] = 1;
-		  lbv = lambda_body_vector_compute_new (transform, lbv);
-		  newiv = lbv_to_gcc_expression (lbv, new_ivs, &stmts);
+		  
+		  newlbv = lambda_body_vector_compute_new (transform, lbv);
+
+		  newiv = lbv_to_gcc_expression (newlbv, TREE_TYPE (oldiv),
+						 new_ivs, &stmts);
 		  bsi = stmt_for_bsi (stmt);
 		  /* Insert the statements to build that
 		     expression.  */
@@ -2048,6 +2056,8 @@ stmt_is_bumper_for_loop (struct loop *lo
     }
   return false;
 }
+
+
 /* Return true if LOOP is a perfect loop nest.
    Perfect loop nests are those loop nests where all code occurs in the
    innermost loop body.
@@ -2275,14 +2285,12 @@ perfect_nestify (struct loops *loops,
   tree phi;
   tree uboundvar;
   tree stmt;
-  tree ivvar, ivvarinced;
-  VEC (tree) *phis;
+  tree oldivvar, ivvar, ivvarinced;
+  VEC (tree) *phis = NULL;
 
   if (!can_convert_to_perfect_nest (loop, loopivs))
     return false;
 
-  phis = VEC_alloc (tree, 1);
-  
   /* Create the new loop */
 
   olddest = loop->single_exit->dest;
@@ -2354,8 +2362,7 @@ perfect_nestify (struct loops *loops,
   add_referenced_tmp_var (ivvar);
   bsi = bsi_last (EDGE_PRED (newloop->latch, 0)->src);
   create_iv (VEC_index (tree, lbounds, 0),
-	     build_int_cst (integer_type_node, 
-			    VEC_index (int, steps, 0)),
+	     build_int_cst (integer_type_node, VEC_index (int, steps, 0)),
 	     ivvar, newloop, &bsi, false, &ivvar, &ivvarinced);	     
 
   /* Create the new upper bound.  This may be not just a variable, so we copy
@@ -2377,6 +2384,7 @@ perfect_nestify (struct loops *loops,
   bbs = get_loop_body (loop); 
   /* Now replace the induction variable in the moved statements with the
      correct loop induction variable.  */
+  oldivvar = VEC_index (tree, loopivs, 0);
   for (i = 0; i < loop->num_nodes; i++)
     {
       block_stmt_iterator tobsi = bsi_last (bodybb);
@@ -2395,9 +2403,7 @@ perfect_nestify (struct loops *loops,
 		  bsi_next (&bsi);
 		  continue;
 		}
-	      replace_uses_of_x_with_y (stmt, 
-					VEC_index (tree, loopivs, 0),
-					ivvar);
+	      replace_uses_of_x_with_y (stmt, oldivvar, ivvar);
 	      bsi_move_before (&bsi, &tobsi);
 	    }
 	}
Index: tree-data-ref.c
===================================================================
RCS file: /cvs/gcc/gcc/gcc/tree-data-ref.c,v
retrieving revision 2.12
diff -u -p -r2.12 tree-data-ref.c
--- tree-data-ref.c	13 Oct 2004 11:58:10 -0000	2.12
+++ tree-data-ref.c	17 Oct 2004 02:14:09 -0000
@@ -1927,7 +1927,7 @@ build_classic_dist_vector (struct data_d
   }
   
   DDR_DIST_VECT (ddr) = dist_v;
-  DDR_SIZE_VECT (ddr) = nb_loops;
+  DDR_SIZE_VECT (ddr) = nb_loops - first_loop;
 }
 
 /* Compute the classic per loop direction vector.  
@@ -2195,23 +2195,21 @@ compute_all_dependences (varray_type dat
    DATAREFS.  Returns chrec_dont_know when failing to analyze a
    difficult case, returns NULL_TREE otherwise.
    
-   FIXME: This is a "dumb" walker over all the trees in the loop body.
-   Find another technique that avoids this costly walk.  This is
-   acceptable for the moment, since this function is used only for
-   debugging purposes.  */
+   TODO: This function should be made smarter so that it can handle address
+   arithmetic as if they were array accesses, etc.  */
 
 tree 
 find_data_references_in_loop (struct loop *loop, varray_type *datarefs)
 {
   bool dont_know_node_not_inserted = true;
-  basic_block bb;
+  basic_block *bbs;
   block_stmt_iterator bsi;
+  size_t i;
 
-  FOR_EACH_BB (bb)
+  bbs = get_loop_body (loop);
+  for (i = 0; i < loop->num_nodes; i++)
     {
-      if (!flow_bb_inside_loop_p (loop, bb))
-	continue;
-      
+      basic_block bb = bbs[i];
       for (bsi = bsi_start (bb); !bsi_end_p (bsi); bsi_next (&bsi))
         {
 	  tree stmt = bsi_stmt (bsi);
@@ -2224,7 +2222,7 @@ find_data_references_in_loop (struct loo
 	      && !V_MUST_DEF_OPS (ann)
 	      && !V_MAY_DEF_OPS (ann))
 	    continue;
-
+	  
 	  /* In the GIMPLE representation, a modify expression
   	     contains a single load or store to memory.  */
 	  if (TREE_CODE (TREE_OPERAND (stmt, 0)) == ARRAY_REF)
@@ -2236,7 +2234,6 @@ find_data_references_in_loop (struct loo
 	    VARRAY_PUSH_GENERIC_PTR 
 		    (*datarefs, analyze_array (stmt, TREE_OPERAND (stmt, 1), 
 					       true));
-
   	  else
 	    {
 	      if (dont_know_node_not_inserted)
@@ -2249,7 +2246,6 @@ find_data_references_in_loop (struct loo
 		  DR_BASE_NAME (res) = NULL;
 		  DR_IS_READ (res) = false;
 		  VARRAY_PUSH_GENERIC_PTR (*datarefs, res);
-
 		  dont_know_node_not_inserted = false;
 		}
 	    }
@@ -2264,6 +2260,7 @@ find_data_references_in_loop (struct loo
 	compute_estimated_nb_iterations (bb->loop_father);
     }
 
+  free (bbs);
   return dont_know_node_not_inserted ? NULL_TREE : chrec_dont_know;
 }
 
Index: tree-loop-linear.c
===================================================================
RCS file: /cvs/gcc/gcc/gcc/tree-loop-linear.c,v
retrieving revision 2.2
diff -u -p -r2.2 tree-loop-linear.c
--- tree-loop-linear.c	16 Sep 2004 16:16:14 -0000	2.2
+++ tree-loop-linear.c	17 Oct 2004 02:14:09 -0000
@@ -239,7 +239,7 @@ void
 linear_transform_loops (struct loops *loops)
 {
   unsigned int i;
-
+  compute_immediate_uses (TDFA_USE_OPS | TDFA_USE_VOPS, NULL);
   for (i = 1; i < loops->num; i++)
     {
       unsigned int depth = 0;
@@ -247,8 +247,8 @@ linear_transform_loops (struct loops *lo
       varray_type dependence_relations;
       struct loop *loop_nest = loops->parray[i];
       struct loop *temp;
-      VEC (tree) *oldivs;
-      VEC (tree) *invariants;
+      VEC (tree) *oldivs = NULL;
+      VEC (tree) *invariants = NULL;
       lambda_loopnest before, after;
       lambda_trans_matrix trans;
       bool problem = false;
@@ -306,11 +306,11 @@ linear_transform_loops (struct loops *lo
 		{
 		  fprintf (dump_file, "DISTANCE_V (");
 		  print_lambda_vector (dump_file, DDR_DIST_VECT (ddr), 
-				       loops->num);
+				       DDR_SIZE_VECT (ddr));
 		  fprintf (dump_file, ")\n");
 		  fprintf (dump_file, "DIRECTION_V (");
 		  print_lambda_vector (dump_file, DDR_DIR_VECT (ddr), 
-				       loops->num);
+				       DDR_SIZE_VECT (ddr));
 		  fprintf (dump_file, ")\n");
 		}
 	    }
@@ -319,6 +319,7 @@ linear_transform_loops (struct loops *lo
       /* Build the transformation matrix.  */
       trans = lambda_trans_matrix_new (depth, depth);
       lambda_matrix_id (LTM_MATRIX (trans), depth);
+
       trans = try_interchange_loops (trans, depth, dependence_relations,
 				     datarefs, loop_nest->num);
 
@@ -359,6 +360,8 @@ linear_transform_loops (struct loops *lo
 	}
       lambda_loopnest_to_gcc_loopnest (loop_nest, oldivs, invariants,
 				       after, trans);
+      if (dump_file)
+	fprintf (dump_file, "Successfully transformed loop.\n");
       oldivs = NULL;
       invariants = NULL;
       free_dependence_relations (dependence_relations);
Index: tree-optimize.c
===================================================================
RCS file: /cvs/gcc/gcc/gcc/tree-optimize.c,v
retrieving revision 2.57
diff -u -p -r2.57 tree-optimize.c
--- tree-optimize.c	15 Oct 2004 05:00:37 -0000	2.57
+++ tree-optimize.c	17 Oct 2004 02:14:09 -0000
@@ -392,11 +394,12 @@ init_tree_optimization_passes (void)
   NEXT_PASS (pass_loop_init);
   NEXT_PASS (pass_lim);
   NEXT_PASS (pass_unswitch);
-  NEXT_PASS (pass_iv_canon);
   NEXT_PASS (pass_record_bounds);
+  NEXT_PASS (pass_linear_transform);
+  NEXT_PASS (pass_iv_canon);
   NEXT_PASS (pass_if_conversion);
   NEXT_PASS (pass_vectorize);
-  NEXT_PASS (pass_linear_transform);
   NEXT_PASS (pass_complete_unroll);
   NEXT_PASS (pass_iv_optimize);
   NEXT_PASS (pass_loop_done);
Index: testsuite/gcc.dg/tree-ssa/ltrans-1.c
===================================================================
RCS file: testsuite/gcc.dg/tree-ssa/ltrans-1.c
diff -N testsuite/gcc.dg/tree-ssa/ltrans-1.c
--- /dev/null	1 Jan 1970 00:00:00 -0000
+++ testsuite/gcc.dg/tree-ssa/ltrans-1.c	17 Oct 2004 02:14:19 -0000
@@ -0,0 +1,22 @@
+/* { dg-do compile } */ 
+/* { dg-options "-O2 -ftree-loop-linear -fdump-tree-ltrans-all" } */
+
+double u[1782225];
+int foo(int N, int *res)
+{
+  int i, j;
+  double sum = 0.0;
+  /* This loop should be converted to a perfect nest and
+     interchanged. */
+  for (i = 0; i < N; i++)
+    {
+      for (j = 0; j < N; j++)
+	sum = sum + u[i + 1335 * j];
+      
+      u[1336 * i] *= 2;
+    }
+  *res = sum + N;
+}
+/* { dg-final { scan-tree-dump-times "converted loop nest to perfect
+   loop nest" 1 "ltrans"} } */ 
+/* { dg-final { scan-tree-dump-times "transformed loop" 1 "ltrans"} } */ 
Index: testsuite/gcc.dg/tree-ssa/ltrans-2.c
===================================================================
RCS file: testsuite/gcc.dg/tree-ssa/ltrans-2.c
diff -N testsuite/gcc.dg/tree-ssa/ltrans-2.c
--- /dev/null	1 Jan 1970 00:00:00 -0000
+++ testsuite/gcc.dg/tree-ssa/ltrans-2.c	17 Oct 2004 02:14:19 -0000
@@ -0,0 +1,24 @@
+/* { dg-do compile } */ 
+/* { dg-options "-O2 -ftree-loop-linear -fdump-tree-ltrans-all" } */
+
+double u[1782225];
+int foo(int N, int *res)
+{
+  unsigned int i, j;
+  double sum = 0;
+  
+  /* This loop should be converted to a perfect nest and
+     interchanged.  */
+  for (i = 0; i < N; i++)
+    {
+      for (j = 0; j < N; j++)
+	{
+	  sum = sum + u[i + 1335 * j];
+	  if (j == N - 1)
+	    u[1336 * i] *= 2;
+	}
+    }
+  *res = sum + N;
+}
+/* { dg-final { scan-tree-dump-times "transformed loop" 1 "ltrans"} {
+   xfail *-*-*} } */ 
Index: testsuite/gcc.dg/tree-ssa/ltrans-3.c
===================================================================
RCS file: testsuite/gcc.dg/tree-ssa/ltrans-3.c
diff -N testsuite/gcc.dg/tree-ssa/ltrans-3.c
--- /dev/null	1 Jan 1970 00:00:00 -0000
+++ testsuite/gcc.dg/tree-ssa/ltrans-3.c	17 Oct 2004 02:14:19 -0000
@@ -0,0 +1,19 @@
+/* { dg-do compile } */ 
+/* { dg-options "-O2 -ftree-loop-linear -fdump-tree-ltrans-all" } */
+
+double u[1782225];
+int foo(int N, int *res)
+{
+  unsigned int i, j;
+  double sum = 0;
+      for (i = 0; i < N; i++)
+	{
+	  for (j = 0; j < N; j++)
+	    {
+	      sum = sum + u[i + 1335 * j];
+	    }
+	}
+      *res = sum + N;
+}
+
+/* { dg-final { scan-tree-dump-times "transformed loop" 1 "ltrans"} } */ 
Index: testsuite/gcc.dg/tree-ssa/ltrans-4.c
===================================================================
RCS file: testsuite/gcc.dg/tree-ssa/ltrans-4.c
diff -N testsuite/gcc.dg/tree-ssa/ltrans-4.c
--- /dev/null	1 Jan 1970 00:00:00 -0000
+++ testsuite/gcc.dg/tree-ssa/ltrans-4.c	17 Oct 2004 02:14:19 -0000
@@ -0,0 +1,18 @@
+/* { dg-do compile } */ 
+/* { dg-options "-O20 -ftree-loop-linear -fdump-tree-ltrans-all" } */
+
+double u[1782225];
+int foo(int N, int *res)
+{
+  int i, j;
+  double sum = 0;
+  for (i = 0; i < N; i++)	
+    for (j = 0; j < N; j++)
+      sum = sum + u[i + 1335 * j];
+  
+  for (i = 0; i < N; i++)
+    u[1336 * i] *= 2;
+  *res = sum + N;
+}
+
+/* { dg-final { scan-tree-dump-times "transformed loop" 1 "ltrans"} } */ 
Index: testsuite/gcc.dg/tree-ssa/ltrans-5.c
===================================================================
RCS file: testsuite/gcc.dg/tree-ssa/ltrans-5.c
diff -N testsuite/gcc.dg/tree-ssa/ltrans-5.c
--- /dev/null	1 Jan 1970 00:00:00 -0000
+++ testsuite/gcc.dg/tree-ssa/ltrans-5.c	17 Oct 2004 02:14:19 -0000
@@ -0,0 +1,18 @@
+/* { dg-do compile } */ 
+/* { dg-options "-O2 -ftree-loop-linear -fdump-tree-ltrans-all" } */
+typedef struct rtx_
+{
+} *rtx;
+static rtx regno_save_mem[53][16 / 4 + 1];
+extern set_mem_alias_set (rtx, rtx);
+int main(void)
+{
+  int i, j;
+  for (i = 0; i < 53; i++)
+    for (j = (16 / (0 ? 8 : 4)); j > 0; j--)
+      if (regno_save_mem[i][j] != 0)
+        set_mem_alias_set (regno_save_mem[i][j], 0);
+}
+
+/* { dg-final { scan-tree-dump-times "Linear expression:  constant: 1   invariants:   denominator: 1" 1 "ltrans" } } */
+/* { dg-final { scan-tree-dump-times "transformed loop" 1 "ltrans"} } */ 

^ permalink raw reply	[flat|nested] 3+ messages in thread

* Re: [PATCH]: Fix PR 17672
  2004-10-17  4:36 [PATCH]: Fix PR 17672 Daniel Berlin
@ 2004-10-17 14:02 ` Daniel Berlin
  2004-10-25  4:01 ` Mark Mitchell
  1 sibling, 0 replies; 3+ messages in thread
From: Daniel Berlin @ 2004-10-17 14:02 UTC (permalink / raw)
  To: gcc-patches; +Cc: mark

On Sat, 2004-10-16 at 23:05 -0400, Daniel Berlin wrote:
> The attached patch fixes PR 17672, a failure bootstrapping with -ftree-
> loop-linear.  

I forgot to add that the patch was bootstrapped and regtested on i686-
pc-linux-gnu.

--Dan

^ permalink raw reply	[flat|nested] 3+ messages in thread

* Re: [PATCH]: Fix PR 17672
  2004-10-17  4:36 [PATCH]: Fix PR 17672 Daniel Berlin
  2004-10-17 14:02 ` Daniel Berlin
@ 2004-10-25  4:01 ` Mark Mitchell
  1 sibling, 0 replies; 3+ messages in thread
From: Mark Mitchell @ 2004-10-25  4:01 UTC (permalink / raw)
  To: Daniel Berlin; +Cc: gcc-patches

Daniel Berlin wrote:

>Mark, again, since this option is off by default, it's your call whether
>you want me to 
>1. fix the bugs found while fixing this PR that are slightly more
>involved through some kludge
>or
>
>2. We just want to add something to the release notes saying we know it
>can fail in some cases, stick with this patch alone, close 17672, and
>create a new pr to track the remaining bugs targeted at 4.1.
>  
>
Like you, I think (2) is the much more attractive option.

Thank you for bringing this issue to my attention.

-- 
Mark Mitchell
CodeSourcery, LLC
(916) 791-8304
mark@codesourcery.com

^ permalink raw reply	[flat|nested] 3+ messages in thread

end of thread, other threads:[~2004-10-25  3:40 UTC | newest]

Thread overview: 3+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2004-10-17  4:36 [PATCH]: Fix PR 17672 Daniel Berlin
2004-10-17 14:02 ` Daniel Berlin
2004-10-25  4:01 ` Mark Mitchell

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).