[PATCH GCC]Vectorize possible infinite loops by versioning

public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed

* [PATCH GCC]Vectorize possible infinite loops by versioning
@ 2016-06-28  7:08 Bin Cheng
  2016-07-20 21:27 ` Jeff Law
  0 siblings, 1 reply; 3+ messages in thread
From: Bin Cheng @ 2016-06-28  7:08 UTC (permalink / raw)
  To: gcc-patches; +Cc: nd

[-- Attachment #1: Type: text/plain, Size: 3013 bytes --]

Hi,
This patch improves vectorizer in order to handle possible infinite loops by versioning.  Its changes fall in three categories.  
A) Changes in vect_get_loop_niters.  AT the moment, it computes niter using number_of_executions_latch, in this way the assumption is discarded and loop not vectorized.  To fix the issue, we need assumption information from niter analyzer and use that as a break condition to version the loop.  This patch uses newly introduced interface number_of_iterations_exit_assumptions and passes assumptions all the way to vect_analyze_loop_form.  The assumptions will be finally recorded in LOOP_VINFO_NITERS_ASSUMPTIONS.
B) It sets and clears flag LOOP_F_ASSUMPTIONS for loop.  The flag is important because during checking if a loop can be vectorized (with versioning), all data references need to be analyzed by assuming LOOP_VINFO_NITERS_ASSUMPTIONS is TRUE.  Otherwise it's very likely address expression of data reference won't be identified as SCEV and vectorization would fail.  With this flag set to TRUE, niter analyzer will bypass assumptions recorded LOOP_VINFO_NITERS_ASSUMPTIONS.  I also keep this flag for versioned loop because the assumption is guaranteed to be TRUE after versioning.  For now, I didn't copy these flags in copy_loop_info, but I think this can be done so that the flags can be inherited by peeled pre/post loop.  Maybe in follow up patches.  Also it's possible to turn other bool fields into flags in the future?
C) This patch uses existing infrastructure to version a loop against LOOP_VINFO_NITERS_ASSUMPTIONS, just like for alignment or alias check.  The change is straightforward, however, I did refactoring to versioning related macros hoping the code would be cleaner.

Bootstrap and test along with previous niter patches on x86_64 and AArch64.  Is it OK?

Thanks,
bin

2016-06-27  Bin Cheng  <bin.cheng@arm.com>

	PR tree-optimization/57558
	* tree-vect-loop-manip.c (vect_create_cond_for_niters_checks): New
	function.
	(vect_loop_versioning): Support versioning with niter assumptions.
	* tree-vect-loop.c (tree-ssa-loop.h): Include new header file.
	(vect_get_loop_niters): New parameter.  Reimplement to support
	assumptions in loop niter info.
	(vect_analyze_loop_form_1, vect_analyze_loop_form): Ditto.
	(new_loop_vec_info): Init LOOP_VINFO_NITERS_ASSUMPTIONS.
	(vect_estimate_min_profitable_iters): Use LOOP_REQUIRES_VERSIONING.
	* tree-vectorizer.c (vect_free_loop_info_assumptions): New function.
	(vectorize_loops): Free loop niter info for loops with flag
	LOOP_F_ASSUMPTIONS set.
	* tree-vectorizer.h (struct _loorefactoredp_vec_info): New field
	num_iters_assumptions.
	(LOOP_VINFO_NITERS_ASSUMPTIONS): New macro.
	(LOOP_REQUIRES_VERSIONING_FOR_NITERS): New macro.
	(LOOP_REQUIRES_VERSIONING): New macro.
	(vect_free_loop_info_assumptions): New decl.

gcc/testsuite/ChangeLog
2016-06-27  Bin Cheng  <bin.cheng@arm.com>

	* gcc.dg/vect/pr57558-1.c: New test.
	* gcc.dg/vect/pr57558-2.c: New test.

[-- Attachment #2: pr57558-20160625.txt --]
[-- Type: text/plain, Size: 19334 bytes --]

diff --git a/gcc/tree-vect-loop-manip.c b/gcc/tree-vect-loop-manip.c
index 93b29b7..c3d5021 100644
--- a/gcc/tree-vect-loop-manip.c
+++ b/gcc/tree-vect-loop-manip.c
@@ -2074,6 +2074,37 @@ vect_do_peeling_for_alignment (loop_vec_info loop_vinfo, tree ni_name,
   free_original_copy_tables ();
 }
 
+/* Function vect_create_cond_for_niters_checks.
+
+   Create a conditional expression that represents the run-time checks for
+   loop's niter.  The loop is guaranteed to to terminate if the run-time
+   checks hold.
+
+   Input:
+   COND_EXPR  - input conditional expression.  New conditions will be chained
+                with logical AND operation.  If it is NULL, then the function
+                is used to return the number of alias checks.
+   LOOP_VINFO - field LOOP_VINFO_MAY_ALIAS_STMTS contains the list of ddrs
+	        to be checked.
+
+   Output:
+   COND_EXPR - conditional expression.
+
+   The returned COND_EXPR is the conditional expression to be used in the
+   if statement that controls which version of the loop gets executed at
+   runtime.  */
+
+static void
+vect_create_cond_for_niters_checks (loop_vec_info loop_vinfo, tree *cond_expr)
+{
+  tree part_cond_expr = LOOP_VINFO_NITERS_ASSUMPTIONS (loop_vinfo);
+
+  if (*cond_expr)
+    *cond_expr = fold_build2 (TRUTH_AND_EXPR, boolean_type_node,
+			      *cond_expr, part_cond_expr);
+  else
+    *cond_expr = part_cond_expr;
+}
 
 /* Function vect_create_cond_for_align_checks.
 
@@ -2322,7 +2353,7 @@ void
 vect_loop_versioning (loop_vec_info loop_vinfo,
 		      unsigned int th, bool check_profitability)
 {
-  struct loop *loop = LOOP_VINFO_LOOP (loop_vinfo);
+  struct loop *loop = LOOP_VINFO_LOOP (loop_vinfo), *nloop;
   struct loop *scalar_loop = LOOP_VINFO_SCALAR_LOOP (loop_vinfo);
   basic_block condition_bb;
   gphi_iterator gsi;
@@ -2339,14 +2370,19 @@ vect_loop_versioning (loop_vec_info loop_vinfo,
   tree scalar_loop_iters = LOOP_VINFO_NITERS (loop_vinfo);
   bool version_align = LOOP_REQUIRES_VERSIONING_FOR_ALIGNMENT (loop_vinfo);
   bool version_alias = LOOP_REQUIRES_VERSIONING_FOR_ALIAS (loop_vinfo);
+  bool version_niter = LOOP_REQUIRES_VERSIONING_FOR_NITERS (loop_vinfo);
 
   if (check_profitability)
-    {
-      cond_expr = fold_build2 (GT_EXPR, boolean_type_node, scalar_loop_iters,
-			       build_int_cst (TREE_TYPE (scalar_loop_iters), th));
-      cond_expr = force_gimple_operand_1 (cond_expr, &cond_expr_stmt_list,
-					  is_gimple_condexpr, NULL_TREE);
-    }
+    cond_expr = fold_build2 (GT_EXPR, boolean_type_node, scalar_loop_iters,
+			     build_int_cst (TREE_TYPE (scalar_loop_iters),
+						       th));
+
+  if (version_niter)
+    vect_create_cond_for_niters_checks (loop_vinfo, &cond_expr);
+
+  if (cond_expr)
+    cond_expr = force_gimple_operand_1 (cond_expr, &cond_expr_stmt_list,
+					is_gimple_condexpr, NULL_TREE);
 
   if (version_align)
     vect_create_cond_for_align_checks (loop_vinfo, &cond_expr,
@@ -2367,8 +2403,8 @@ vect_loop_versioning (loop_vec_info loop_vinfo,
 
       /* We don't want to scale SCALAR_LOOP's frequencies, we need to
 	 scale LOOP's frequencies instead.  */
-      loop_version (scalar_loop, cond_expr, &condition_bb,
-		    prob, REG_BR_PROB_BASE, REG_BR_PROB_BASE - prob, true);
+      nloop = loop_version (scalar_loop, cond_expr, &condition_bb, prob,
+			    REG_BR_PROB_BASE, REG_BR_PROB_BASE - prob, true);
       scale_loop_frequencies (loop, prob, REG_BR_PROB_BASE);
       /* CONDITION_BB was created above SCALAR_LOOP's preheader,
 	 while we need to move it above LOOP's preheader.  */
@@ -2395,8 +2431,16 @@ vect_loop_versioning (loop_vec_info loop_vinfo,
 			       condition_bb);
     }
   else
-    loop_version (loop, cond_expr, &condition_bb,
-		  prob, prob, REG_BR_PROB_BASE - prob, true);
+    nloop = loop_version (loop, cond_expr, &condition_bb,
+			  prob, prob, REG_BR_PROB_BASE - prob, true);
+
+  if (version_niter)
+    {
+      /* The versioned loop could be infinite, we need to clear existing
+	 niter information which is copied from the original loop.  */
+      gcc_assert (loop_flag_set_p (loop, LOOP_F_ASSUMPTIONS));
+      vect_free_loop_info_assumptions (nloop);
+    }
 
   if (LOCATION_LOCUS (vect_location) != UNKNOWN_LOCATION
       && dump_enabled_p ())
diff --git a/gcc/tree-vect-loop.c b/gcc/tree-vect-loop.c
index 6c0337b..4f65e2e 100644
--- a/gcc/tree-vect-loop.c
+++ b/gcc/tree-vect-loop.c
@@ -41,6 +41,7 @@ along with GCC; see the file COPYING3.  If not see
 #include "tree-ssa-loop-ivopts.h"
 #include "tree-ssa-loop-manip.h"
 #include "tree-ssa-loop-niter.h"
+#include "tree-ssa-loop.h"
 #include "cfgloop.h"
 #include "params.h"
 #include "tree-scalar-evolution.h"
@@ -1002,37 +1003,87 @@ vect_fixup_scalar_cycles_with_patterns (loop_vec_info loop_vinfo)
 
    Determine how many iterations the loop is executed and place it
    in NUMBER_OF_ITERATIONS.  Place the number of latch iterations
-   in NUMBER_OF_ITERATIONSM1.
+   in NUMBER_OF_ITERATIONSM1.  Place the condition under which the
+   niter information holds in ASSUMPTIONS.
 
    Return the loop exit condition.  */
 
 
 static gcond *
-vect_get_loop_niters (struct loop *loop, tree *number_of_iterations,
-		      tree *number_of_iterationsm1)
+vect_get_loop_niters (struct loop *loop, tree *assumptions,
+		      tree *number_of_iterations, tree *number_of_iterationsm1)
 {
-  tree niters;
-
+  edge exit = single_exit (loop);
+  struct tree_niter_desc niter_desc;
+  tree niter_assumptions, niter, may_be_zero;
+  gcond *cond = get_loop_exit_condition (loop);
+
+  *assumptions = boolean_true_node;
+  *number_of_iterationsm1 = chrec_dont_know;
+  *number_of_iterations = chrec_dont_know;
   if (dump_enabled_p ())
     dump_printf_loc (MSG_NOTE, vect_location,
 		     "=== get_loop_niters ===\n");
 
-  niters = number_of_latch_executions (loop);
-  *number_of_iterationsm1 = niters;
+  if (!exit)
+    return cond;
+
+  niter = chrec_dont_know;
+  may_be_zero = NULL_TREE;
+  niter_assumptions = boolean_true_node;
+  if (!number_of_iterations_exit_assumptions (loop, exit, &niter_desc)
+      || chrec_contains_undetermined (niter_desc.niter))
+    return cond;
+
+  niter_assumptions = niter_desc.assumptions;
+  may_be_zero = niter_desc.may_be_zero;
+  niter = niter_desc.niter;
+
+  if (may_be_zero && integer_zerop (may_be_zero))
+    may_be_zero = NULL_TREE;
+
+  if (may_be_zero)
+    {
+      if (COMPARISON_CLASS_P (may_be_zero))
+	{
+	  /* Try to combine may_be_zero with assumptions, this can simplify
+	     computation of niter expression.  */
+	  if (niter_assumptions && !integer_nonzerop (niter_assumptions))
+	    niter_assumptions = fold_build2 (TRUTH_AND_EXPR, boolean_type_node,
+					     niter_assumptions,
+					     fold_build1 (TRUTH_NOT_EXPR,
+							  boolean_type_node,
+							  may_be_zero));
+	  else
+	    niter = fold_build3 (COND_EXPR, TREE_TYPE (niter), may_be_zero,
+				 build_int_cst (TREE_TYPE (niter), 0), niter);
+
+	  may_be_zero = NULL_TREE;
+	}
+      else
+	{
+	  niter = build_int_cst (TREE_TYPE (niter), 0);
+	  *number_of_iterationsm1 = niter;
+	  *number_of_iterations = niter;
+	  return cond;
+	}
+    }
+
+  *assumptions = niter_assumptions;
+  *number_of_iterationsm1 = niter;
 
   /* We want the number of loop header executions which is the number
      of latch executions plus one.
      ???  For UINT_MAX latch executions this number overflows to zero
      for loops like do { n++; } while (n != 0);  */
-  if (niters && !chrec_contains_undetermined (niters))
-    niters = fold_build2 (PLUS_EXPR, TREE_TYPE (niters), unshare_expr (niters),
-			  build_int_cst (TREE_TYPE (niters), 1));
-  *number_of_iterations = niters;
+  if (niter && !chrec_contains_undetermined (niter))
+    niter = fold_build2 (PLUS_EXPR, TREE_TYPE (niter), unshare_expr (niter),
+			  build_int_cst (TREE_TYPE (niter), 1));
+  *number_of_iterations = niter;
 
-  return get_loop_exit_condition (loop);
+  return cond;
 }
 
-
 /* Function bb_in_loop_p
 
    Used as predicate for dfs order traversal of the loop bbs.  */
@@ -1101,6 +1152,7 @@ new_loop_vec_info (struct loop *loop)
   LOOP_VINFO_NITERSM1 (res) = NULL;
   LOOP_VINFO_NITERS (res) = NULL;
   LOOP_VINFO_NITERS_UNCHANGED (res) = NULL;
+  LOOP_VINFO_NITERS_ASSUMPTIONS (res) = NULL;
   LOOP_VINFO_COST_MODEL_THRESHOLD (res) = 0;
   LOOP_VINFO_VECTORIZABLE_P (res) = 0;
   LOOP_VINFO_PEELING_FOR_ALIGNMENT (res) = 0;
@@ -1280,12 +1332,13 @@ vect_compute_single_scalar_iteration_cost (loop_vec_info loop_vinfo)
    Verify that certain CFG restrictions hold, including:
    - the loop has a pre-header
    - the loop has a single entry and exit
-   - the loop exit condition is simple enough, and the number of iterations
-     can be analyzed (a countable loop).  */
+   - the loop exit condition is simple enough
+   - the number of iterations can be analyzed, i.e, a countable loop.  The
+     niter could be analyzed under some assumptions.  */
 
 bool
 vect_analyze_loop_form_1 (struct loop *loop, gcond **loop_cond,
-			  tree *number_of_iterationsm1,
+			  tree *assumptions, tree *number_of_iterationsm1,
 			  tree *number_of_iterations, gcond **inner_loop_cond)
 {
   if (dump_enabled_p ())
@@ -1376,9 +1429,13 @@ vect_analyze_loop_form_1 (struct loop *loop, gcond **loop_cond,
 	}
 
       /* Analyze the inner-loop.  */
-      tree inner_niterm1, inner_niter;
+      tree inner_niterm1, inner_niter, inner_assumptions;
       if (! vect_analyze_loop_form_1 (loop->inner, inner_loop_cond,
-				      &inner_niterm1, &inner_niter, NULL))
+				      &inner_assumptions, &inner_niterm1,
+				      &inner_niter, NULL)
+	  /* Don't support analyzing niter under assumptions for inner
+	     loop.  */
+	  || !integer_onep (inner_assumptions))
 	{
 	  if (dump_enabled_p ())
             dump_printf_loc (MSG_MISSED_OPTIMIZATION, vect_location,
@@ -1447,7 +1504,7 @@ vect_analyze_loop_form_1 (struct loop *loop, gcond **loop_cond,
 	}
     }
 
-  *loop_cond = vect_get_loop_niters (loop, number_of_iterations,
+  *loop_cond = vect_get_loop_niters (loop, assumptions, number_of_iterations,
 				     number_of_iterationsm1);
   if (!*loop_cond)
     {
@@ -1457,7 +1514,8 @@ vect_analyze_loop_form_1 (struct loop *loop, gcond **loop_cond,
       return false;
     }
 
-  if (!*number_of_iterations
+  if (integer_zerop (*assumptions)
+      || !*number_of_iterations
       || chrec_contains_undetermined (*number_of_iterations))
     {
       if (dump_enabled_p ())
@@ -1483,10 +1541,11 @@ vect_analyze_loop_form_1 (struct loop *loop, gcond **loop_cond,
 loop_vec_info
 vect_analyze_loop_form (struct loop *loop)
 {
-  tree number_of_iterations, number_of_iterationsm1;
+  tree assumptions, number_of_iterations, number_of_iterationsm1;
   gcond *loop_cond, *inner_loop_cond = NULL;
 
-  if (! vect_analyze_loop_form_1 (loop, &loop_cond, &number_of_iterationsm1,
+  if (! vect_analyze_loop_form_1 (loop, &loop_cond,
+				  &assumptions, &number_of_iterationsm1,
 				  &number_of_iterations, &inner_loop_cond))
     return NULL;
 
@@ -1494,6 +1553,19 @@ vect_analyze_loop_form (struct loop *loop)
   LOOP_VINFO_NITERSM1 (loop_vinfo) = number_of_iterationsm1;
   LOOP_VINFO_NITERS (loop_vinfo) = number_of_iterations;
   LOOP_VINFO_NITERS_UNCHANGED (loop_vinfo) = number_of_iterations;
+  if (!integer_onep (assumptions))
+    {
+      /* We consider to vectorize this loop by versioning it under
+	 some assumptions.  In order to do this, we need to clear
+	 existing information computed by scev and niter analyzer.  */
+      scev_reset_htab ();
+      free_numbers_of_iterations_estimates_loop (loop);
+      /* Also set flag for this loop so that following scev and niter
+	 analysis are done under the assumptions.  */
+      loop_flag_set (loop, LOOP_F_ASSUMPTIONS);
+      /* Also record the assumptions for versioning.  */
+      LOOP_VINFO_NITERS_ASSUMPTIONS (loop_vinfo) = assumptions;
+    }
 
   if (!LOOP_VINFO_NITERS_KNOWN_P (loop_vinfo))
     {
@@ -2090,8 +2162,7 @@ start_over:
                /* In case of versioning, check if the maximum number of
                   iterations is greater than th.  If they are identical,
                   the epilogue is unnecessary.  */
-	       && ((!LOOP_REQUIRES_VERSIONING_FOR_ALIAS (loop_vinfo)
-	            && !LOOP_REQUIRES_VERSIONING_FOR_ALIGNMENT (loop_vinfo))
+	       && (!LOOP_REQUIRES_VERSIONING (loop_vinfo)
                    || (unsigned HOST_WIDE_INT) max_niter > th)))
     LOOP_VINFO_PEELING_FOR_NITER (loop_vinfo) = true;
 
@@ -3125,8 +3196,18 @@ vect_estimate_min_profitable_iters (loop_vec_info loop_vinfo,
                    "versioning aliasing.\n");
     }
 
-  if (LOOP_REQUIRES_VERSIONING_FOR_ALIGNMENT (loop_vinfo)
-      || LOOP_REQUIRES_VERSIONING_FOR_ALIAS (loop_vinfo))
+  /* Requires loop versioning with niter checks.  */
+  if (LOOP_REQUIRES_VERSIONING_FOR_NITERS (loop_vinfo))
+    {
+      /*  FIXME: Make cost depend on complexity of individual check.  */
+      (void) add_stmt_cost (target_cost_data, 1, vector_stmt, NULL, 0,
+			    vect_prologue);
+      dump_printf (MSG_NOTE,
+                   "cost model: Adding cost of checks for loop "
+                   "versioning niters.\n");
+    }
+
+  if (LOOP_REQUIRES_VERSIONING (loop_vinfo))
     (void) add_stmt_cost (target_cost_data, 1, cond_branch_taken, NULL, 0,
 			  vect_prologue);
 
@@ -3283,12 +3364,10 @@ vect_estimate_min_profitable_iters (loop_vec_info loop_vinfo,
      decide whether to vectorize at compile time.  Hence the scalar version
      do not carry cost model guard costs.  */
   if (!LOOP_VINFO_NITERS_KNOWN_P (loop_vinfo)
-      || LOOP_REQUIRES_VERSIONING_FOR_ALIGNMENT (loop_vinfo)
-      || LOOP_REQUIRES_VERSIONING_FOR_ALIAS (loop_vinfo))
+      || LOOP_REQUIRES_VERSIONING (loop_vinfo))
     {
       /* Cost model check occurs at versioning.  */
-      if (LOOP_REQUIRES_VERSIONING_FOR_ALIGNMENT (loop_vinfo)
-          || LOOP_REQUIRES_VERSIONING_FOR_ALIAS (loop_vinfo))
+      if (LOOP_REQUIRES_VERSIONING (loop_vinfo))
 	scalar_outside_cost += vect_get_stmt_cost (cond_branch_not_taken);
       else
 	{
@@ -6626,8 +6705,7 @@ vect_transform_loop (loop_vec_info loop_vinfo)
   /* Version the loop first, if required, so the profitability check
      comes first.  */
 
-  if (LOOP_REQUIRES_VERSIONING_FOR_ALIGNMENT (loop_vinfo)
-      || LOOP_REQUIRES_VERSIONING_FOR_ALIAS (loop_vinfo))
+  if (LOOP_REQUIRES_VERSIONING (loop_vinfo))
     {
       vect_loop_versioning (loop_vinfo, th, check_profitability);
       check_profitability = false;
diff --git a/gcc/tree-vectorizer.c b/gcc/tree-vectorizer.c
index 2669813..23126f9 100644
--- a/gcc/tree-vectorizer.c
+++ b/gcc/tree-vectorizer.c
@@ -69,6 +69,7 @@ along with GCC; see the file COPYING3.  If not see
 #include "gimple-iterator.h"
 #include "gimple-walk.h"
 #include "tree-ssa-loop-manip.h"
+#include "tree-ssa-loop-niter.h"
 #include "tree-cfg.h"
 #include "cfgloop.h"
 #include "tree-vectorizer.h"
@@ -364,6 +365,20 @@ vect_destroy_datarefs (vec_info *vinfo)
   free_data_refs (vinfo->datarefs);
 }
 
+/* A helper function to free scev and LOOP niter information, as well as
+   clear loop flag LOOP_F_ASSUMPTIONS.  */
+
+void
+vect_free_loop_info_assumptions (struct loop *loop)
+{
+  scev_reset_htab ();
+  /* We need to explicitly reset upper bound information since they are
+     used even after free_numbers_of_iterations_estimates_loop.  */
+  loop->any_upper_bound = false;
+  loop->any_likely_upper_bound = false;
+  free_numbers_of_iterations_estimates_loop (loop);
+  loop_flag_clear (loop, LOOP_F_ASSUMPTIONS);
+}
 
 /* Return whether STMT is inside the region we try to vectorize.  */
 
@@ -533,7 +548,14 @@ vectorize_loops (void)
 	loop->aux = loop_vinfo;
 
 	if (!loop_vinfo || !LOOP_VINFO_VECTORIZABLE_P (loop_vinfo))
-	  continue;
+	  {
+	    /* Free existing information if loop is analyzed with some
+	       assumptions.  */
+	    if (loop_flag_set_p (loop, LOOP_F_ASSUMPTIONS))
+	      vect_free_loop_info_assumptions (loop);
+
+	    continue;
+	  }
 
         if (!dbg_cnt (vect_loop))
 	  {
@@ -541,6 +563,11 @@ vectorize_loops (void)
 	       debug counter.  Set any_ifcvt_loops to visit
 	       them at finalization.  */
 	    any_ifcvt_loops = true;
+	    /* Free existing information if loop is analyzed with some
+	       assumptions.  */
+	    if (loop_flag_set_p (loop, LOOP_F_ASSUMPTIONS))
+	      vect_free_loop_info_assumptions (loop);
+
 	    break;
 	  }
 
diff --git a/gcc/tree-vectorizer.h b/gcc/tree-vectorizer.h
index f15672f..b41738c 100644
--- a/gcc/tree-vectorizer.h
+++ b/gcc/tree-vectorizer.h
@@ -231,6 +231,8 @@ typedef struct _loop_vec_info : public vec_info {
   tree num_iters;
   /* Number of iterations of the original loop.  */
   tree num_iters_unchanged;
+  /* Condition under which this loop is analyzed.  */
+  tree num_iters_assumptions;
 
   /* Threshold of number of iterations below which vectorzation will not be
      performed. It is calculated from MIN_PROFITABLE_ITERS and
@@ -343,6 +345,7 @@ typedef struct _loop_vec_info : public vec_info {
    prologue peeling retain total unchanged scalar loop iterations for
    cost model.  */
 #define LOOP_VINFO_NITERS_UNCHANGED(L)     (L)->num_iters_unchanged
+#define LOOP_VINFO_NITERS_ASSUMPTIONS(L)   (L)->num_iters_assumptions
 #define LOOP_VINFO_COST_MODEL_THRESHOLD(L) (L)->th
 #define LOOP_VINFO_VECTORIZABLE_P(L)       (L)->vectorizable
 #define LOOP_VINFO_VECT_FACTOR(L)          (L)->vectorization_factor
@@ -375,6 +378,12 @@ typedef struct _loop_vec_info : public vec_info {
   ((L)->may_misalign_stmts.length () > 0)
 #define LOOP_REQUIRES_VERSIONING_FOR_ALIAS(L)     \
   ((L)->may_alias_ddrs.length () > 0)
+#define LOOP_REQUIRES_VERSIONING_FOR_NITERS(L)    \
+  (LOOP_VINFO_NITERS_ASSUMPTIONS (L))
+#define LOOP_REQUIRES_VERSIONING(L)               \
+  (LOOP_REQUIRES_VERSIONING_FOR_ALIGNMENT (L)     \
+   || LOOP_REQUIRES_VERSIONING_FOR_ALIAS (L)      \
+   || LOOP_REQUIRES_VERSIONING_FOR_NITERS (L))
 
 #define LOOP_VINFO_NITERS_KNOWN_P(L)          \
   (tree_fits_shwi_p ((L)->num_iters) && tree_to_shwi ((L)->num_iters) > 0)
@@ -1111,5 +1120,6 @@ void vect_pattern_recog (vec_info *);
 unsigned vectorize_loops (void);
 void vect_destroy_datarefs (vec_info *);
 bool vect_stmt_in_region_p (vec_info *, gimple *);
+void vect_free_loop_info_assumptions (struct loop *);
 
 #endif  /* GCC_TREE_VECTORIZER_H  */
diff --git a/gcc/testsuite/gcc.dg/vect/pr57558-1.c b/gcc/testsuite/gcc.dg/vect/pr57558-1.c
new file mode 100644
index 0000000..1b36b75
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/vect/pr57558-1.c
@@ -0,0 +1,12 @@
+/* { dg-do compile } */
+/* { dg-require-effective-target vect_int } */
+
+typedef unsigned long ul;
+void foo (ul* __restrict x, ul* __restrict y, ul n)
+{
+  ul i;
+  for (i=1; i<=n; i++, x++, y++)
+    *x += *y;
+}
+
+/* { dg-final { scan-tree-dump "vectorized 1 loops" "vect" } } */
diff --git a/gcc/testsuite/gcc.dg/vect/pr57558-2.c b/gcc/testsuite/gcc.dg/vect/pr57558-2.c
new file mode 100644
index 0000000..2ed291b
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/vect/pr57558-2.c
@@ -0,0 +1,12 @@
+/* { dg-do compile } */
+/* { dg-require-effective-target vect_int } */
+
+void foo(int *a, int len)
+{
+  unsigned short i;
+
+  for (i = 1; i < (len - 1); i++)
+    a[i] = a[i+1];
+}
+
+/* { dg-final { scan-tree-dump "vectorized 1 loops" "vect" } } */

^ permalink raw reply	[flat|nested] 3+ messages in thread

* Re: [PATCH GCC]Vectorize possible infinite loops by versioning
  2016-06-28  7:08 [PATCH GCC]Vectorize possible infinite loops by versioning Bin Cheng
@ 2016-07-20 21:27 ` Jeff Law
  2016-07-21  8:49   ` Bin.Cheng
  0 siblings, 1 reply; 3+ messages in thread
From: Jeff Law @ 2016-07-20 21:27 UTC (permalink / raw)
  To: Bin Cheng, gcc-patches; +Cc: nd

On 06/28/2016 12:18 AM, Bin Cheng wrote:
> Hi,
> This patch improves vectorizer in order to handle possible infinite loops by versioning.  Its changes fall in three categories.
> A) Changes in vect_get_loop_niters.  AT the moment, it computes niter using number_of_executions_latch, in this way the assumption is discarded and loop not vectorized.  To fix the issue, we need assumption information from niter analyzer and use that as a break condition to version the loop.  This patch uses newly introduced interface number_of_iterations_exit_assumptions and passes assumptions all the way to vect_analyze_loop_form.  The assumptions will be finally recorded in LOOP_VINFO_NITERS_ASSUMPTIONS.
> B) It sets and clears flag LOOP_F_ASSUMPTIONS for loop.  The flag is important because during checking if a loop can be vectorized (with versioning), all data references need to be analyzed by assuming LOOP_VINFO_NITERS_ASSUMPTIONS is TRUE.  Otherwise it's very likely address expression of data reference won't be identified as SCEV and vectorization would fail.  With this flag set to TRUE, niter analyzer will bypass assumptions recorded LOOP_VINFO_NITERS_ASSUMPTIONS.  I also keep this flag for versioned loop because the assumption is guaranteed to be TRUE after versioning.  For now, I didn't copy these flags in copy_loop_info, but I think this can be done so that the flags can be inherited by peeled pre/post loop.  Maybe in follow up patches.  Also it's possible to turn other bool fields into flags in the future?
> C) This patch uses existing infrastructure to version a loop against LOOP_VINFO_NITERS_ASSUMPTIONS, just like for alignment or alias check.  The change is straightforward, however, I did refactoring to versioning related macros hoping the code would be cleaner.
>
> Bootstrap and test along with previous niter patches on x86_64 and AArch64.  Is it OK?
So I have one high level concern -- how (if at all) does this interact 
with Ilya's changes to vectorize loop tails that are just about through 
the review process?

Related -- I see that you throw away the SCEV/iteration knowledge then 
analyze the loop using the given assumptions, then eventually throw that 
information away.  Which sounds generally reasonable -- except for one 
potential issue -- does anything still want to look at the original 
SCEV/iteration information (that we've lost)?  I'm assuming no since you 
didn't try to restore it and we pass the testsuite with your change.


>
> Thanks,
> bin
>
> 2016-06-27  Bin Cheng  <bin.cheng@arm.com>
>
> 	PR tree-optimization/57558
> 	* tree-vect-loop-manip.c (vect_create_cond_for_niters_checks): New
> 	function.
> 	(vect_loop_versioning): Support versioning with niter assumptions.
> 	* tree-vect-loop.c (tree-ssa-loop.h): Include new header file.
> 	(vect_get_loop_niters): New parameter.  Reimplement to support
> 	assumptions in loop niter info.
> 	(vect_analyze_loop_form_1, vect_analyze_loop_form): Ditto.
> 	(new_loop_vec_info): Init LOOP_VINFO_NITERS_ASSUMPTIONS.
> 	(vect_estimate_min_profitable_iters): Use LOOP_REQUIRES_VERSIONING.
> 	* tree-vectorizer.c (vect_free_loop_info_assumptions): New function.
> 	(vectorize_loops): Free loop niter info for loops with flag
> 	LOOP_F_ASSUMPTIONS set.
> 	* tree-vectorizer.h (struct _loorefactoredp_vec_info): New field
Typo in the ChangeLog entry.

> 	num_iters_assumptions.
> 	(LOOP_VINFO_NITERS_ASSUMPTIONS): New macro.
> 	(LOOP_REQUIRES_VERSIONING_FOR_NITERS): New macro.
> 	(LOOP_REQUIRES_VERSIONING): New macro.
> 	(vect_free_loop_info_assumptions): New decl.
>
> gcc/testsuite/ChangeLog
> 2016-06-27  Bin Cheng  <bin.cheng@arm.com>
>
> 	* gcc.dg/vect/pr57558-1.c: New test.
> 	* gcc.dg/vect/pr57558-2.c: New test.
I was rather surprised at how simple supporting this case was.  THe two 
high level questions above are the only things I'm worried about.  Let's 
figure out the answers to those questions before we OK this for the trunk.
Jeff

^ permalink raw reply	[flat|nested] 3+ messages in thread

* Re: [PATCH GCC]Vectorize possible infinite loops by versioning
  2016-07-20 21:27 ` Jeff Law
@ 2016-07-21  8:49   ` Bin.Cheng
  0 siblings, 0 replies; 3+ messages in thread
From: Bin.Cheng @ 2016-07-21  8:49 UTC (permalink / raw)
  To: Jeff Law; +Cc: Bin Cheng, gcc-patches, nd

On Wed, Jul 20, 2016 at 10:27 PM, Jeff Law <law@redhat.com> wrote:
> On 06/28/2016 12:18 AM, Bin Cheng wrote:
>>
>> Hi,
>> This patch improves vectorizer in order to handle possible infinite loops
>> by versioning.  Its changes fall in three categories.
>> A) Changes in vect_get_loop_niters.  AT the moment, it computes niter
>> using number_of_executions_latch, in this way the assumption is discarded
>> and loop not vectorized.  To fix the issue, we need assumption information
>> from niter analyzer and use that as a break condition to version the loop.
>> This patch uses newly introduced interface
>> number_of_iterations_exit_assumptions and passes assumptions all the way to
>> vect_analyze_loop_form.  The assumptions will be finally recorded in
>> LOOP_VINFO_NITERS_ASSUMPTIONS.
>> B) It sets and clears flag LOOP_F_ASSUMPTIONS for loop.  The flag is
>> important because during checking if a loop can be vectorized (with
>> versioning), all data references need to be analyzed by assuming
>> LOOP_VINFO_NITERS_ASSUMPTIONS is TRUE.  Otherwise it's very likely address
>> expression of data reference won't be identified as SCEV and vectorization
>> would fail.  With this flag set to TRUE, niter analyzer will bypass
>> assumptions recorded LOOP_VINFO_NITERS_ASSUMPTIONS.  I also keep this flag
>> for versioned loop because the assumption is guaranteed to be TRUE after
>> versioning.  For now, I didn't copy these flags in copy_loop_info, but I
>> think this can be done so that the flags can be inherited by peeled pre/post
>> loop.  Maybe in follow up patches.  Also it's possible to turn other bool
>> fields into flags in the future?
>> C) This patch uses existing infrastructure to version a loop against
>> LOOP_VINFO_NITERS_ASSUMPTIONS, just like for alignment or alias check.  The
>> change is straightforward, however, I did refactoring to versioning related
>> macros hoping the code would be cleaner.
>>
>> Bootstrap and test along with previous niter patches on x86_64 and
>> AArch64.  Is it OK?
>
Hi Jeff,
Thanks for looking at this.

> So I have one high level concern -- how (if at all) does this interact with
> Ilya's changes to vectorize loop tails that are just about through the
> review process?
Though I haven't look into details of tail-vect patch set, I would not
expect interference between them.  Versioning in this patch is before
tail-vect exactly like existing versioning for alias/alignment.
>
> Related -- I see that you throw away the SCEV/iteration knowledge then
> analyze the loop using the given assumptions, then eventually throw that
> information away.  Which sounds generally reasonable -- except for one
> potential issue -- does anything still want to look at the original
> SCEV/iteration information (that we've lost)?  I'm assuming no since you
> didn't try to restore it and we pass the testsuite with your change.
The niter/scev information discarded are buffered data.  The only
impact is on compile time, that is, scev/niter information will be
computed from scratch rather than queried from buffered (hash)
data-structure, if following customer need these information.  For
successful versioning, I think this is kind of inevitable because
scev/niter information of versioned loop IS different to the original
loop.

>
>
>>
>> Thanks,
>> bin
>>
>> 2016-06-27  Bin Cheng  <bin.cheng@arm.com>
>>
>>         PR tree-optimization/57558
>>         * tree-vect-loop-manip.c (vect_create_cond_for_niters_checks): New
>>         function.
>>         (vect_loop_versioning): Support versioning with niter assumptions.
>>         * tree-vect-loop.c (tree-ssa-loop.h): Include new header file.
>>         (vect_get_loop_niters): New parameter.  Reimplement to support
>>         assumptions in loop niter info.
>>         (vect_analyze_loop_form_1, vect_analyze_loop_form): Ditto.
>>         (new_loop_vec_info): Init LOOP_VINFO_NITERS_ASSUMPTIONS.
>>         (vect_estimate_min_profitable_iters): Use
>> LOOP_REQUIRES_VERSIONING.
>>         * tree-vectorizer.c (vect_free_loop_info_assumptions): New
>> function.
>>         (vectorize_loops): Free loop niter info for loops with flag
>>         LOOP_F_ASSUMPTIONS set.
>>         * tree-vectorizer.h (struct _loorefactoredp_vec_info): New field
>
> Typo in the ChangeLog entry.
At this moment, prerequisite patch set for this one is ongoing
review/rework, I will update this patch after finishing prerequisites.

>
>>         num_iters_assumptions.
>>         (LOOP_VINFO_NITERS_ASSUMPTIONS): New macro.
>>         (LOOP_REQUIRES_VERSIONING_FOR_NITERS): New macro.
>>         (LOOP_REQUIRES_VERSIONING): New macro.
>>         (vect_free_loop_info_assumptions): New decl.
>>
>> gcc/testsuite/ChangeLog
>> 2016-06-27  Bin Cheng  <bin.cheng@arm.com>
>>
>>         * gcc.dg/vect/pr57558-1.c: New test.
>>         * gcc.dg/vect/pr57558-2.c: New test.
>
> I was rather surprised at how simple supporting this case was.  THe two high
> level questions above are the only things I'm worried about.  Let's figure
> out the answers to those questions before we OK this for the trunk.
Yeah, it reuses existing versioning code in vectorizer.  Major part of
work is in niter/scev to support analyzing loops whose counter may
overflow/wrap.

Thanks,
bin
> Jeff
>

^ permalink raw reply	[flat|nested] 3+ messages in thread

end of thread, other threads:[~2016-07-21  8:49 UTC | newest]

Thread overview: 3+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2016-06-28  7:08 [PATCH GCC]Vectorize possible infinite loops by versioning Bin Cheng
2016-07-20 21:27 ` Jeff Law
2016-07-21  8:49   ` Bin.Cheng

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).