public inbox for gcc-patches@gcc.gnu.org
* [00/11] Add a vec_basic_block of scalar statements
@ 2018-07-30 11:36 Richard Sandiford
  2018-07-30 11:37 ` [01/11] Schedule SLP earlier Richard Sandiford
                   ` (10 more replies)
  0 siblings, 11 replies; 20+ messages in thread
From: Richard Sandiford @ 2018-07-30 11:36 UTC (permalink / raw)
  To: gcc-patches

This series puts the statements that need to be vectorised into a
"vec_basic_block" structure of linked stmt_vec_infos, and then puts
pattern statements into this block rather than hanging them off the
original scalar statement.
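
(As a rough illustration only: patch 09, which isn't quoted in this
message, contains the real declaration, and the field names below are
invented.  The idea is roughly:

   /* Hypothetical sketch, not the code from patch 09.  */
   struct vec_basic_block
   {
     basic_block bb;             /* the underlying GIMPLE block */
     stmt_vec_info first_stmt;   /* head of the chain of stmt_vec_infos */
     stmt_vec_info last_stmt;    /* tail of the chain */
   };

with each stmt_vec_info gaining next/prev links, so that pattern
statements can be linked into the chain alongside the original scalar
statements.)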

Partly this is clean-up, since making pattern statements more like
first-class statements removes a lot of indirection.  The diffstat
for the series is:

 7 files changed, 691 insertions(+), 978 deletions(-)

It also makes it easier to do something approaching proper DCE
on the scalar code (patch 10).  However, the main motivation is
to allow the result of an earlier pattern statement to be reused
as the STMT_VINFO_RELATED_STMT for a later (non-pattern) statement.
I have two current uses for this:

(1) The way overwidening detection works means that we can sometimes
    be left with sequences of the form:

      type1 narrowed = ... + ...;   // originally done in type2
      type2 extended = (type2) narrowed;
      type3 truncated = (type3) extended;

    which cast_forwprop can simplify to:

      type1 narrowed = ... + ...;   // originally done in type2
      type3 truncated = (type3) narrowed;

    But if type3 == type1, we really want to replace truncated
    directly with narrowed.  The current representation doesn't
    allow this.

(2) For SVE extending loads, we want to look for:

      type1 narrow = *ptr;
      type2 extended = (type2) narrow; // only use of narrow

    replace narrow with:

      type2 tmp = .LOAD_EXT (ptr, ...);

    and replace extended directly with tmp; a sketch of the intended
    representation follows the list.  (Deleting narrow and replacing
    extended with a .LOAD_EXT would move the location of the load and
    so wouldn't be safe in general.)
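
As a sketch of the representation that (2) is aiming for (hypothetical
stmt_vec_info names, and not something this series implements): narrow's
stmt_vec_info would have the tmp statement as its pattern statement, and
extended's stmt_vec_info would simply reuse the tmp statement as its
STMT_VINFO_RELATED_STMT:

  /* Sketch only.  narrow_info, extended_info and tmp_info stand for the
     stmt_vec_infos of the statements above.  */
  STMT_VINFO_RELATED_STMT (narrow_info) = tmp_info;    /* pattern stmt */
  STMT_VINFO_RELATED_STMT (extended_info) = tmp_info;  /* reused result */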

The series doesn't do either of these things; it just lays the
groundwork.  It applies on top of:

https://gcc.gnu.org/ml/gcc-patches/2018-07/msg01308.html

I tested each individual patch on aarch64-linux-gnu and the series as a
whole on aarch64-linux-gnu with SVE, aarch64_be-elf and x86_64-linux-gnu.
OK to install?

Thanks,
Richard


* [01/11] Schedule SLP earlier
  2018-07-30 11:36 [00/11] Add a vec_basic_block of scalar statements Richard Sandiford
@ 2018-07-30 11:37 ` Richard Sandiford
  2018-08-01 12:49   ` Richard Biener
  2018-07-30 11:37 ` [02/11] Remove vect_schedule_slp return value Richard Sandiford
                   ` (9 subsequent siblings)
  10 siblings, 1 reply; 20+ messages in thread
From: Richard Sandiford @ 2018-07-30 11:37 UTC (permalink / raw)
  To: gcc-patches

vect_transform_loop used to call vect_schedule_slp lazily when it
came across the first SLP statement, but it seems easier to do it
before the main loop.
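
In outline, the transform phase after this patch does the following
(condensed from the diff below, with details elided):

  /* Schedule all SLP instances up front...  */
  if (!loop_vinfo->slp_instances.is_empty ())
    {
      DUMP_VECT_SCOPE ("scheduling SLP instances");
      vect_schedule_slp (loop_vinfo);
    }

  /* ...so that vect_transform_loop_stmt only needs to skip pure SLP
     statements:  */
  if (PURE_SLP_STMT (stmt_info))
    return;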


2018-07-30  Richard Sandiford  <richard.sandiford@arm.com>

gcc/
	* tree-vect-loop.c (vect_transform_loop_stmt): Remove slp_scheduled
	argument.
	(vect_transform_loop): Update calls accordingly.  Schedule SLP
	instances before the main loop, if any exist.

Index: gcc/tree-vect-loop.c
===================================================================
--- gcc/tree-vect-loop.c	2018-07-30 12:32:15.000000000 +0100
+++ gcc/tree-vect-loop.c	2018-07-30 12:32:16.190624704 +0100
@@ -8199,14 +8199,12 @@ scale_profile_for_vect_loop (struct loop
 }
 
 /* Vectorize STMT_INFO if relevant, inserting any new instructions before GSI.
-   When vectorizing STMT_INFO as a store, set *SEEN_STORE to its stmt_vec_info.
-   *SLP_SCHEDULE is a running record of whether we have called
-   vect_schedule_slp.  */
+   When vectorizing STMT_INFO as a store, set *SEEN_STORE to its
+   stmt_vec_info.  */
 
 static void
 vect_transform_loop_stmt (loop_vec_info loop_vinfo, stmt_vec_info stmt_info,
-			  gimple_stmt_iterator *gsi,
-			  stmt_vec_info *seen_store, bool *slp_scheduled)
+			  gimple_stmt_iterator *gsi, stmt_vec_info *seen_store)
 {
   struct loop *loop = LOOP_VINFO_LOOP (loop_vinfo);
   poly_uint64 vf = LOOP_VINFO_VECT_FACTOR (loop_vinfo);
@@ -8237,24 +8235,10 @@ vect_transform_loop_stmt (loop_vec_info
 	dump_printf_loc (MSG_NOTE, vect_location, "multiple-types.\n");
     }
 
-  /* SLP.  Schedule all the SLP instances when the first SLP stmt is
-     reached.  */
-  if (slp_vect_type slptype = STMT_SLP_TYPE (stmt_info))
-    {
-
-      if (!*slp_scheduled)
-	{
-	  *slp_scheduled = true;
-
-	  DUMP_VECT_SCOPE ("scheduling SLP instances");
-
-	  vect_schedule_slp (loop_vinfo);
-	}
-
-      /* Hybrid SLP stmts must be vectorized in addition to SLP.  */
-      if (slptype == pure_slp)
-	return;
-    }
+  /* Pure SLP statements have already been vectorized.  We still need
+     to apply loop vectorization to hybrid SLP statements.  */
+  if (PURE_SLP_STMT (stmt_info))
+    return;
 
   if (dump_enabled_p ())
     dump_printf_loc (MSG_NOTE, vect_location, "transform statement.\n");
@@ -8284,7 +8268,6 @@ vect_transform_loop (loop_vec_info loop_
   tree niters_vector_mult_vf = NULL_TREE;
   poly_uint64 vf = LOOP_VINFO_VECT_FACTOR (loop_vinfo);
   unsigned int lowest_vf = constant_lower_bound (vf);
-  bool slp_scheduled = false;
   gimple *stmt;
   bool check_profitability = false;
   unsigned int th;
@@ -8390,6 +8373,14 @@ vect_transform_loop (loop_vec_info loop_
     /* This will deal with any possible peeling.  */
     vect_prepare_for_masked_peels (loop_vinfo);
 
+  /* Schedule the SLP instances first, then handle loop vectorization
+     below.  */
+  if (!loop_vinfo->slp_instances.is_empty ())
+    {
+      DUMP_VECT_SCOPE ("scheduling SLP instances");
+      vect_schedule_slp (loop_vinfo);
+    }
+
   /* FORNOW: the vectorizer supports only loops which body consist
      of one basic block (header + empty latch). When the vectorizer will
      support more involved loop forms, the order by which the BBs are
@@ -8468,16 +8459,15 @@ vect_transform_loop (loop_vec_info loop_
 			  stmt_vec_info pat_stmt_info
 			    = loop_vinfo->lookup_stmt (gsi_stmt (subsi));
 			  vect_transform_loop_stmt (loop_vinfo, pat_stmt_info,
-						    &si, &seen_store,
-						    &slp_scheduled);
+						    &si, &seen_store);
 			}
 		      stmt_vec_info pat_stmt_info
 			= STMT_VINFO_RELATED_STMT (stmt_info);
 		      vect_transform_loop_stmt (loop_vinfo, pat_stmt_info, &si,
-						&seen_store, &slp_scheduled);
+						&seen_store);
 		    }
 		  vect_transform_loop_stmt (loop_vinfo, stmt_info, &si,
-					    &seen_store, &slp_scheduled);
+					    &seen_store);
 		}
 	      gsi_next (&si);
 	      if (seen_store)


* [02/11] Remove vect_schedule_slp return value
  2018-07-30 11:36 [00/11] Add a vec_basic_block of scalar statements Richard Sandiford
  2018-07-30 11:37 ` [01/11] Schedule SLP earlier Richard Sandiford
@ 2018-07-30 11:37 ` Richard Sandiford
  2018-08-01 12:49   ` Richard Biener
  2018-07-30 11:38 ` [04/11] Add a vect_orig_stmt helper function Richard Sandiford
                   ` (8 subsequent siblings)
  10 siblings, 1 reply; 20+ messages in thread
From: Richard Sandiford @ 2018-07-30 11:37 UTC (permalink / raw)
  To: gcc-patches

Nothing now uses the vect_schedule_slp return value, so it's not worth
propagating the value through vect_schedule_slp_instance.


2018-07-30  Richard Sandiford  <richard.sandiford@arm.com>

gcc/
	* tree-vectorizer.h (vect_schedule_slp): Return void.
	* tree-vect-slp.c (vect_schedule_slp_instance): Likewise.
	(vect_schedule_slp): Likewise.

Index: gcc/tree-vectorizer.h
===================================================================
--- gcc/tree-vectorizer.h	2018-07-30 12:32:09.114687014 +0100
+++ gcc/tree-vectorizer.h	2018-07-30 12:32:19.366596715 +0100
@@ -1575,7 +1575,7 @@ extern bool vect_transform_slp_perm_load
 					  gimple_stmt_iterator *, poly_uint64,
 					  slp_instance, bool, unsigned *);
 extern bool vect_slp_analyze_operations (vec_info *);
-extern bool vect_schedule_slp (vec_info *);
+extern void vect_schedule_slp (vec_info *);
 extern bool vect_analyze_slp (vec_info *, unsigned);
 extern bool vect_make_slp_decision (loop_vec_info);
 extern void vect_detect_hybrid_slp (loop_vec_info);
Index: gcc/tree-vect-slp.c
===================================================================
--- gcc/tree-vect-slp.c	2018-07-30 12:32:09.026687790 +0100
+++ gcc/tree-vect-slp.c	2018-07-30 12:32:19.366596715 +0100
@@ -3849,11 +3849,11 @@ vect_transform_slp_perm_load (slp_tree n
 
 /* Vectorize SLP instance tree in postorder.  */
 
-static bool
+static void
 vect_schedule_slp_instance (slp_tree node, slp_instance instance,
 			    scalar_stmts_to_slp_tree_map_t *bst_map)
 {
-  bool grouped_store, is_store;
+  bool grouped_store;
   gimple_stmt_iterator si;
   stmt_vec_info stmt_info;
   unsigned int group_size;
@@ -3862,14 +3862,14 @@ vect_schedule_slp_instance (slp_tree nod
   slp_tree child;
 
   if (SLP_TREE_DEF_TYPE (node) != vect_internal_def)
-    return false;
+    return;
 
   /* See if we have already vectorized the same set of stmts and reuse their
      vectorized stmts.  */
   if (slp_tree *leader = bst_map->get (SLP_TREE_SCALAR_STMTS (node)))
     {
       SLP_TREE_VEC_STMTS (node).safe_splice (SLP_TREE_VEC_STMTS (*leader));
-      return false;
+      return;
     }
 
   bst_map->put (SLP_TREE_SCALAR_STMTS (node).copy (), node);
@@ -3991,11 +3991,10 @@ vect_schedule_slp_instance (slp_tree nod
 	    }
 	  v0.release ();
 	  v1.release ();
-	  return false;
+	  return;
 	}
     }
-  is_store = vect_transform_stmt (stmt_info, &si, &grouped_store, node,
-				  instance);
+  vect_transform_stmt (stmt_info, &si, &grouped_store, node, instance);
 
   /* Restore stmt def-types.  */
   FOR_EACH_VEC_ELT (SLP_TREE_CHILDREN (node), i, child)
@@ -4005,8 +4004,6 @@ vect_schedule_slp_instance (slp_tree nod
 	FOR_EACH_VEC_ELT (SLP_TREE_SCALAR_STMTS (child), j, child_stmt_info)
 	  STMT_VINFO_DEF_TYPE (child_stmt_info) = vect_internal_def;
       }
-
-  return is_store;
 }
 
 /* Replace scalar calls from SLP node NODE with setting of their lhs to zero.
@@ -4048,14 +4045,12 @@ vect_remove_slp_scalar_calls (slp_tree n
 
 /* Generate vector code for all SLP instances in the loop/basic block.  */
 
-bool
+void
 vect_schedule_slp (vec_info *vinfo)
 {
   vec<slp_instance> slp_instances;
   slp_instance instance;
   unsigned int i;
-  bool is_store = false;
-
 
   scalar_stmts_to_slp_tree_map_t *bst_map
     = new scalar_stmts_to_slp_tree_map_t ();
@@ -4063,8 +4058,8 @@ vect_schedule_slp (vec_info *vinfo)
   FOR_EACH_VEC_ELT (slp_instances, i, instance)
     {
       /* Schedule the tree of INSTANCE.  */
-      is_store = vect_schedule_slp_instance (SLP_INSTANCE_TREE (instance),
-                                             instance, bst_map);
+      vect_schedule_slp_instance (SLP_INSTANCE_TREE (instance),
+				  instance, bst_map);
       if (dump_enabled_p ())
 	dump_printf_loc (MSG_NOTE, vect_location,
                          "vectorizing stmts using SLP.\n");
@@ -4099,6 +4094,4 @@ vect_schedule_slp (vec_info *vinfo)
 	  vinfo->remove_stmt (store_info);
         }
     }
-
-  return is_store;
 }


* [03/11] Remove vect_transform_stmt grouped_store argument
  2018-07-30 11:36 [00/11] Add a vec_basic_block of scalar statements Richard Sandiford
                   ` (2 preceding siblings ...)
  2018-07-30 11:38 ` [04/11] Add a vect_orig_stmt helper function Richard Sandiford
@ 2018-07-30 11:38 ` Richard Sandiford
  2018-08-01 12:49   ` Richard Biener
  2018-07-30 11:39 ` [05/11] Add a vect_stmt_to_vectorize helper function Richard Sandiford
                   ` (6 subsequent siblings)
  10 siblings, 1 reply; 20+ messages in thread
From: Richard Sandiford @ 2018-07-30 11:38 UTC (permalink / raw)
  To: gcc-patches

Nothing now uses the grouped_store value passed back by
vect_transform_stmt, so we might as well remove it.


2018-07-30  Richard Sandiford  <richard.sandiford@arm.com>

gcc/
	* tree-vectorizer.h (vect_transform_stmt): Remove grouped_store
	argument.
	* tree-vect-stmts.c (vect_transform_stmt): Likewise.
	* tree-vect-loop.c (vect_transform_loop_stmt): Update call accordingly.
	(vect_transform_loop): Likewise.
	* tree-vect-slp.c (vect_schedule_slp_instance): Likewise.

Index: gcc/tree-vectorizer.h
===================================================================
--- gcc/tree-vectorizer.h	2018-07-30 12:32:19.366596715 +0100
+++ gcc/tree-vectorizer.h	2018-07-30 12:32:22.718567174 +0100
@@ -1459,7 +1459,7 @@ extern tree vect_init_vector (stmt_vec_i
                               gimple_stmt_iterator *);
 extern tree vect_get_vec_def_for_stmt_copy (vec_info *, tree);
 extern bool vect_transform_stmt (stmt_vec_info, gimple_stmt_iterator *,
-                                 bool *, slp_tree, slp_instance);
+				 slp_tree, slp_instance);
 extern void vect_remove_stores (stmt_vec_info);
 extern bool vect_analyze_stmt (stmt_vec_info, bool *, slp_tree, slp_instance,
 			       stmt_vector_for_cost *);
Index: gcc/tree-vect-stmts.c
===================================================================
--- gcc/tree-vect-stmts.c	2018-07-30 12:32:09.114687014 +0100
+++ gcc/tree-vect-stmts.c	2018-07-30 12:32:22.718567174 +0100
@@ -9662,8 +9662,7 @@ vect_analyze_stmt (stmt_vec_info stmt_in
 
 bool
 vect_transform_stmt (stmt_vec_info stmt_info, gimple_stmt_iterator *gsi,
-		     bool *grouped_store, slp_tree slp_node,
-                     slp_instance slp_node_instance)
+		     slp_tree slp_node, slp_instance slp_node_instance)
 {
   vec_info *vinfo = stmt_info->vinfo;
   bool is_store = false;
@@ -9727,7 +9726,6 @@ vect_transform_stmt (stmt_vec_info stmt_
 	     last store in the chain is reached.  Store stmts before the last
 	     one are skipped, and there vec_stmt_info shouldn't be freed
 	     meanwhile.  */
-	  *grouped_store = true;
 	  stmt_vec_info group_info = DR_GROUP_FIRST_ELEMENT (stmt_info);
 	  if (DR_GROUP_STORE_COUNT (group_info) == DR_GROUP_SIZE (group_info))
 	    is_store = true;
Index: gcc/tree-vect-loop.c
===================================================================
--- gcc/tree-vect-loop.c	2018-07-30 12:32:16.190624704 +0100
+++ gcc/tree-vect-loop.c	2018-07-30 12:32:22.714567210 +0100
@@ -8243,8 +8243,7 @@ vect_transform_loop_stmt (loop_vec_info
   if (dump_enabled_p ())
     dump_printf_loc (MSG_NOTE, vect_location, "transform statement.\n");
 
-  bool grouped_store = false;
-  if (vect_transform_stmt (stmt_info, gsi, &grouped_store, NULL, NULL))
+  if (vect_transform_stmt (stmt_info, gsi, NULL, NULL))
     *seen_store = stmt_info;
 }
 
@@ -8425,7 +8424,7 @@ vect_transform_loop (loop_vec_info loop_
 	    {
 	      if (dump_enabled_p ())
 		dump_printf_loc (MSG_NOTE, vect_location, "transform phi.\n");
-	      vect_transform_stmt (stmt_info, NULL, NULL, NULL, NULL);
+	      vect_transform_stmt (stmt_info, NULL, NULL, NULL);
 	    }
 	}
 
Index: gcc/tree-vect-slp.c
===================================================================
--- gcc/tree-vect-slp.c	2018-07-30 12:32:19.366596715 +0100
+++ gcc/tree-vect-slp.c	2018-07-30 12:32:22.714567210 +0100
@@ -3853,7 +3853,6 @@ vect_transform_slp_perm_load (slp_tree n
 vect_schedule_slp_instance (slp_tree node, slp_instance instance,
 			    scalar_stmts_to_slp_tree_map_t *bst_map)
 {
-  bool grouped_store;
   gimple_stmt_iterator si;
   stmt_vec_info stmt_info;
   unsigned int group_size;
@@ -3945,11 +3944,11 @@ vect_schedule_slp_instance (slp_tree nod
 	  vec<stmt_vec_info> v1;
 	  unsigned j;
 	  tree tmask = NULL_TREE;
-	  vect_transform_stmt (stmt_info, &si, &grouped_store, node, instance);
+	  vect_transform_stmt (stmt_info, &si, node, instance);
 	  v0 = SLP_TREE_VEC_STMTS (node).copy ();
 	  SLP_TREE_VEC_STMTS (node).truncate (0);
 	  gimple_assign_set_rhs_code (stmt, ocode);
-	  vect_transform_stmt (stmt_info, &si, &grouped_store, node, instance);
+	  vect_transform_stmt (stmt_info, &si, node, instance);
 	  gimple_assign_set_rhs_code (stmt, code0);
 	  v1 = SLP_TREE_VEC_STMTS (node).copy ();
 	  SLP_TREE_VEC_STMTS (node).truncate (0);
@@ -3994,7 +3993,7 @@ vect_schedule_slp_instance (slp_tree nod
 	  return;
 	}
     }
-  vect_transform_stmt (stmt_info, &si, &grouped_store, node, instance);
+  vect_transform_stmt (stmt_info, &si, node, instance);
 
   /* Restore stmt def-types.  */
   FOR_EACH_VEC_ELT (SLP_TREE_CHILDREN (node), i, child)


* [04/11] Add a vect_orig_stmt helper function
  2018-07-30 11:36 [00/11] Add a vec_basic_block of scalar statements Richard Sandiford
  2018-07-30 11:37 ` [01/11] Schedule SLP earlier Richard Sandiford
  2018-07-30 11:37 ` [02/11] Remove vect_schedule_slp return value Richard Sandiford
@ 2018-07-30 11:38 ` Richard Sandiford
  2018-08-01 12:50   ` Richard Biener
  2018-07-30 11:38 ` [03/11] Remove vect_transform_stmt grouped_store argument Richard Sandiford
                   ` (7 subsequent siblings)
  10 siblings, 1 reply; 20+ messages in thread
From: Richard Sandiford @ 2018-07-30 11:38 UTC (permalink / raw)
  To: gcc-patches

This patch just adds a helper function for going from a potential
pattern statement to the original scalar statement.


2018-07-30  Richard Sandiford  <richard.sandiford@arm.com>

gcc/
	* tree-vectorizer.h (vect_orig_stmt): New function.
	* tree-vect-data-refs.c (vect_preserves_scalar_order_p): Use it.
	* tree-vect-loop.c (vect_model_reduction_cost): Likewise.
	(vect_create_epilog_for_reduction): Likewise.
	(vectorizable_live_operation): Likewise.
	* tree-vect-slp.c (vect_find_last_scalar_stmt_in_slp): Likewise.
	(vect_detect_hybrid_slp_stmts, vect_schedule_slp): Likewise.
	* tree-vect-stmts.c (vectorizable_call): Likewise.
	(vectorizable_simd_clone_call, vect_remove_stores): Likewise.

Index: gcc/tree-vectorizer.h
===================================================================
--- gcc/tree-vectorizer.h	2018-07-30 12:32:22.718567174 +0100
+++ gcc/tree-vectorizer.h	2018-07-30 12:32:26.218536339 +0100
@@ -1120,6 +1120,17 @@ is_pattern_stmt_p (stmt_vec_info stmt_in
   return stmt_info->pattern_stmt_p;
 }
 
+/* If STMT_INFO is a pattern statement, return the statement that it
+   replaces, otherwise return STMT_INFO itself.  */
+
+inline stmt_vec_info
+vect_orig_stmt (stmt_vec_info stmt_info)
+{
+  if (is_pattern_stmt_p (stmt_info))
+    return STMT_VINFO_RELATED_STMT (stmt_info);
+  return stmt_info;
+}
+
 /* Return true if BB is a loop header.  */
 
 static inline bool
Index: gcc/tree-vect-data-refs.c
===================================================================
--- gcc/tree-vect-data-refs.c	2018-07-30 12:32:08.934688600 +0100
+++ gcc/tree-vect-data-refs.c	2018-07-30 12:32:26.214536374 +0100
@@ -214,10 +214,8 @@ vect_preserves_scalar_order_p (dr_vec_in
      (but could happen later) while reads will happen no later than their
      current position (but could happen earlier).  Reordering is therefore
      only possible if the first access is a write.  */
-  if (is_pattern_stmt_p (stmtinfo_a))
-    stmtinfo_a = STMT_VINFO_RELATED_STMT (stmtinfo_a);
-  if (is_pattern_stmt_p (stmtinfo_b))
-    stmtinfo_b = STMT_VINFO_RELATED_STMT (stmtinfo_b);
+  stmtinfo_a = vect_orig_stmt (stmtinfo_a);
+  stmtinfo_b = vect_orig_stmt (stmtinfo_b);
   stmt_vec_info earlier_stmt_info = get_earlier_stmt (stmtinfo_a, stmtinfo_b);
   return !DR_IS_WRITE (STMT_VINFO_DATA_REF (earlier_stmt_info));
 }
Index: gcc/tree-vect-loop.c
===================================================================
--- gcc/tree-vect-loop.c	2018-07-30 12:32:22.714567210 +0100
+++ gcc/tree-vect-loop.c	2018-07-30 12:32:26.214536374 +0100
@@ -3814,10 +3814,7 @@ vect_model_reduction_cost (stmt_vec_info
 
   vectype = STMT_VINFO_VECTYPE (stmt_info);
   mode = TYPE_MODE (vectype);
-  stmt_vec_info orig_stmt_info = STMT_VINFO_RELATED_STMT (stmt_info);
-
-  if (!orig_stmt_info)
-    orig_stmt_info = stmt_info;
+  stmt_vec_info orig_stmt_info = vect_orig_stmt (stmt_info);
 
   code = gimple_assign_rhs_code (orig_stmt_info->stmt);
 
@@ -4738,13 +4735,8 @@ vect_create_epilog_for_reduction (vec<tr
          Otherwise (it is a regular reduction) - the tree-code and scalar-def
          are taken from STMT.  */
 
-  stmt_vec_info orig_stmt_info = STMT_VINFO_RELATED_STMT (stmt_info);
-  if (!orig_stmt_info)
-    {
-      /* Regular reduction  */
-      orig_stmt_info = stmt_info;
-    }
-  else
+  stmt_vec_info orig_stmt_info = vect_orig_stmt (stmt_info);
+  if (orig_stmt_info != stmt_info)
     {
       /* Reduction pattern  */
       gcc_assert (STMT_VINFO_IN_PATTERN_P (orig_stmt_info));
@@ -5540,11 +5532,7 @@ vect_create_epilog_for_reduction (vec<tr
   if (REDUC_GROUP_FIRST_ELEMENT (stmt_info))
     {
       stmt_vec_info dest_stmt_info
-	= SLP_TREE_SCALAR_STMTS (slp_node)[group_size - 1];
-      /* Handle reduction patterns.  */
-      if (STMT_VINFO_RELATED_STMT (dest_stmt_info))
-	dest_stmt_info = STMT_VINFO_RELATED_STMT (dest_stmt_info);
-
+	= vect_orig_stmt (SLP_TREE_SCALAR_STMTS (slp_node)[group_size - 1]);
       scalar_dest = gimple_assign_lhs (dest_stmt_info->stmt);
       group_size = 1;
     }
@@ -7898,10 +7886,8 @@ vectorizable_live_operation (stmt_vec_in
       return true;
     }
 
-  /* If stmt has a related stmt, then use that for getting the lhs.  */
-  gimple *stmt = (is_pattern_stmt_p (stmt_info)
-		  ? STMT_VINFO_RELATED_STMT (stmt_info)->stmt
-		  : stmt_info->stmt);
+  /* Use the lhs of the original scalar statement.  */
+  gimple *stmt = vect_orig_stmt (stmt_info)->stmt;
 
   lhs = (is_a <gphi *> (stmt)) ? gimple_phi_result (stmt)
 	: gimple_get_lhs (stmt);
Index: gcc/tree-vect-slp.c
===================================================================
--- gcc/tree-vect-slp.c	2018-07-30 12:32:22.714567210 +0100
+++ gcc/tree-vect-slp.c	2018-07-30 12:32:26.218536339 +0100
@@ -1848,8 +1848,7 @@ vect_find_last_scalar_stmt_in_slp (slp_t
 
   for (int i = 0; SLP_TREE_SCALAR_STMTS (node).iterate (i, &stmt_vinfo); i++)
     {
-      if (is_pattern_stmt_p (stmt_vinfo))
-	stmt_vinfo = STMT_VINFO_RELATED_STMT (stmt_vinfo);
+      stmt_vinfo = vect_orig_stmt (stmt_vinfo);
       last = last ? get_later_stmt (stmt_vinfo, last) : stmt_vinfo;
     }
 
@@ -2314,10 +2313,7 @@ vect_detect_hybrid_slp_stmts (slp_tree n
       gcc_checking_assert (PURE_SLP_STMT (stmt_vinfo));
       /* If we get a pattern stmt here we have to use the LHS of the
          original stmt for immediate uses.  */
-      gimple *stmt = stmt_vinfo->stmt;
-      if (! STMT_VINFO_IN_PATTERN_P (stmt_vinfo)
-	  && STMT_VINFO_RELATED_STMT (stmt_vinfo))
-	stmt = STMT_VINFO_RELATED_STMT (stmt_vinfo)->stmt;
+      gimple *stmt = vect_orig_stmt (stmt_vinfo)->stmt;
       tree def;
       if (gimple_code (stmt) == GIMPLE_PHI)
 	def = gimple_phi_result (stmt);
@@ -4087,8 +4083,7 @@ vect_schedule_slp (vec_info *vinfo)
 	  if (!STMT_VINFO_DATA_REF (store_info))
 	    break;
 
-	  if (is_pattern_stmt_p (store_info))
-	    store_info = STMT_VINFO_RELATED_STMT (store_info);
+	  store_info = vect_orig_stmt (store_info);
 	  /* Free the attached stmt_vec_info and remove the stmt.  */
 	  vinfo->remove_stmt (store_info);
         }
Index: gcc/tree-vect-stmts.c
===================================================================
--- gcc/tree-vect-stmts.c	2018-07-30 12:32:22.718567174 +0100
+++ gcc/tree-vect-stmts.c	2018-07-30 12:32:26.218536339 +0100
@@ -3628,8 +3628,7 @@ vectorizable_call (stmt_vec_info stmt_in
   if (slp_node)
     return true;
 
-  if (is_pattern_stmt_p (stmt_info))
-    stmt_info = STMT_VINFO_RELATED_STMT (stmt_info);
+  stmt_info = vect_orig_stmt (stmt_info);
   lhs = gimple_get_lhs (stmt_info->stmt);
 
   gassign *new_stmt
@@ -4364,10 +4363,7 @@ vectorizable_simd_clone_call (stmt_vec_i
   if (scalar_dest)
     {
       type = TREE_TYPE (scalar_dest);
-      if (is_pattern_stmt_p (stmt_info))
-	lhs = gimple_call_lhs (STMT_VINFO_RELATED_STMT (stmt_info)->stmt);
-      else
-	lhs = gimple_call_lhs (stmt);
+      lhs = gimple_call_lhs (vect_orig_stmt (stmt_info)->stmt);
       new_stmt = gimple_build_assign (lhs, build_zero_cst (type));
     }
   else
@@ -9843,8 +9839,7 @@ vect_remove_stores (stmt_vec_info first_
   while (next_stmt_info)
     {
       stmt_vec_info tmp = DR_GROUP_NEXT_ELEMENT (next_stmt_info);
-      if (is_pattern_stmt_p (next_stmt_info))
-	next_stmt_info = STMT_VINFO_RELATED_STMT (next_stmt_info);
+      next_stmt_info = vect_orig_stmt (next_stmt_info);
       /* Free the attached stmt_vec_info and remove the stmt.  */
       vinfo->remove_stmt (next_stmt_info);
       next_stmt_info = tmp;


* [05/11] Add a vect_stmt_to_vectorize helper function
  2018-07-30 11:36 [00/11] Add a vec_basic_block of scalar statements Richard Sandiford
                   ` (3 preceding siblings ...)
  2018-07-30 11:38 ` [03/11] Remove vect_transform_stmt grouped_store argument Richard Sandiford
@ 2018-07-30 11:39 ` Richard Sandiford
  2018-08-01 12:51   ` Richard Biener
  2018-07-30 11:41 ` [06/11] Handle VMAT_INVARIANT separately Richard Sandiford
                   ` (5 subsequent siblings)
  10 siblings, 1 reply; 20+ messages in thread
From: Richard Sandiford @ 2018-07-30 11:39 UTC (permalink / raw)
  To: gcc-patches

This patch adds a helper that does the opposite of vect_orig_stmt:
it goes from the original scalar statement to the statement that
should actually be vectorised.

The uses in the last two hunks of vectorizable_reduction work because
reduc_stmt_info (first hunk) and stmt_info (second hunk) are already
pattern statements if appropriate.
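
As an illustration of how the two helpers relate (this snippet is not
part of the patch): for an original scalar statement S, whether or not
it has been replaced by a pattern statement, the two functions are
inverses in the following sense:

  /* Illustration only.  s_info is the stmt_vec_info of an original
     (non-pattern) scalar statement.  */
  stmt_vec_info to_vectorize = vect_stmt_to_vectorize (s_info);
  gcc_assert (vect_orig_stmt (to_vectorize) == s_info);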


2018-07-30  Richard Sandiford  <richard.sandiford@arm.com>

gcc/
	* tree-vectorizer.h (vect_stmt_to_vectorize): New function.
	* tree-vect-loop.c (vect_update_vf_for_slp): Use it.
	(vectorizable_reduction): Likewise.
	* tree-vect-slp.c (vect_analyze_slp_instance): Likewise.
	(vect_detect_hybrid_slp_stmts): Likewise.
	* tree-vect-stmts.c (vect_is_simple_use): Likewise.

Index: gcc/tree-vectorizer.h
===================================================================
--- gcc/tree-vectorizer.h	2018-07-30 12:32:26.218536339 +0100
+++ gcc/tree-vectorizer.h	2018-07-30 12:32:29.586506669 +0100
@@ -1131,6 +1131,17 @@ vect_orig_stmt (stmt_vec_info stmt_info)
   return stmt_info;
 }
 
+/* If STMT_INFO has been replaced by a pattern statement, return the
+   replacement statement, otherwise return STMT_INFO itself.  */
+
+inline stmt_vec_info
+vect_stmt_to_vectorize (stmt_vec_info stmt_info)
+{
+  if (STMT_VINFO_IN_PATTERN_P (stmt_info))
+    return STMT_VINFO_RELATED_STMT (stmt_info);
+  return stmt_info;
+}
+
 /* Return true if BB is a loop header.  */
 
 static inline bool
Index: gcc/tree-vect-loop.c
===================================================================
--- gcc/tree-vect-loop.c	2018-07-30 12:32:26.214536374 +0100
+++ gcc/tree-vect-loop.c	2018-07-30 12:32:29.586506669 +0100
@@ -1424,9 +1424,7 @@ vect_update_vf_for_slp (loop_vec_info lo
 	   gsi_next (&si))
 	{
 	  stmt_vec_info stmt_info = loop_vinfo->lookup_stmt (gsi_stmt (si));
-	  if (STMT_VINFO_IN_PATTERN_P (stmt_info)
-	      && STMT_VINFO_RELATED_STMT (stmt_info))
-	    stmt_info = STMT_VINFO_RELATED_STMT (stmt_info);
+	  stmt_info = vect_stmt_to_vectorize (stmt_info);
 	  if ((STMT_VINFO_RELEVANT_P (stmt_info)
 	       || VECTORIZABLE_CYCLE_DEF (STMT_VINFO_DEF_TYPE (stmt_info)))
 	      && !PURE_SLP_STMT (stmt_info))
@@ -6111,8 +6109,7 @@ vectorizable_reduction (stmt_vec_info st
 	return true;
 
       stmt_vec_info reduc_stmt_info = STMT_VINFO_REDUC_DEF (stmt_info);
-      if (STMT_VINFO_IN_PATTERN_P (reduc_stmt_info))
-	reduc_stmt_info = STMT_VINFO_RELATED_STMT (reduc_stmt_info);
+      reduc_stmt_info = vect_stmt_to_vectorize (reduc_stmt_info);
 
       if (STMT_VINFO_VEC_REDUCTION_TYPE (reduc_stmt_info)
 	  == EXTRACT_LAST_REDUCTION)
@@ -6145,8 +6142,7 @@ vectorizable_reduction (stmt_vec_info st
       if (ncopies > 1
 	  && STMT_VINFO_RELEVANT (reduc_stmt_info) <= vect_used_only_live
 	  && (use_stmt_info = loop_vinfo->lookup_single_use (phi_result))
-	  && (use_stmt_info == reduc_stmt_info
-	      || STMT_VINFO_RELATED_STMT (use_stmt_info) == reduc_stmt_info))
+	  && vect_stmt_to_vectorize (use_stmt_info) == reduc_stmt_info)
 	single_defuse_cycle = true;
 
       /* Create the destination vector  */
@@ -6915,8 +6911,7 @@ vectorizable_reduction (stmt_vec_info st
   if (ncopies > 1
       && (STMT_VINFO_RELEVANT (stmt_info) <= vect_used_only_live)
       && (use_stmt_info = loop_vinfo->lookup_single_use (reduc_phi_result))
-      && (use_stmt_info == stmt_info
-	  || STMT_VINFO_RELATED_STMT (use_stmt_info) == stmt_info))
+      && vect_stmt_to_vectorize (use_stmt_info) == stmt_info)
     {
       single_defuse_cycle = true;
       epilog_copies = 1;
Index: gcc/tree-vect-slp.c
===================================================================
--- gcc/tree-vect-slp.c	2018-07-30 12:32:26.218536339 +0100
+++ gcc/tree-vect-slp.c	2018-07-30 12:32:29.586506669 +0100
@@ -1969,11 +1969,7 @@ vect_analyze_slp_instance (vec_info *vin
       /* Collect the stores and store them in SLP_TREE_SCALAR_STMTS.  */
       while (next_info)
         {
-	  if (STMT_VINFO_IN_PATTERN_P (next_info)
-	      && STMT_VINFO_RELATED_STMT (next_info))
-	    scalar_stmts.safe_push (STMT_VINFO_RELATED_STMT (next_info));
-	  else
-	    scalar_stmts.safe_push (next_info);
+	  scalar_stmts.safe_push (vect_stmt_to_vectorize (next_info));
 	  next_info = DR_GROUP_NEXT_ELEMENT (next_info);
         }
     }
@@ -1983,11 +1979,7 @@ vect_analyze_slp_instance (vec_info *vin
 	 SLP_TREE_SCALAR_STMTS.  */
       while (next_info)
         {
-	  if (STMT_VINFO_IN_PATTERN_P (next_info)
-	      && STMT_VINFO_RELATED_STMT (next_info))
-	    scalar_stmts.safe_push (STMT_VINFO_RELATED_STMT (next_info));
-	  else
-	    scalar_stmts.safe_push (next_info);
+	  scalar_stmts.safe_push (vect_stmt_to_vectorize (next_info));
 	  next_info = REDUC_GROUP_NEXT_ELEMENT (next_info);
         }
       /* Mark the first element of the reduction chain as reduction to properly
@@ -2325,9 +2317,7 @@ vect_detect_hybrid_slp_stmts (slp_tree n
 	    use_vinfo = loop_vinfo->lookup_stmt (use_stmt);
 	    if (!use_vinfo)
 	      continue;
-	    if (STMT_VINFO_IN_PATTERN_P (use_vinfo)
-		&& STMT_VINFO_RELATED_STMT (use_vinfo))
-	      use_vinfo = STMT_VINFO_RELATED_STMT (use_vinfo);
+	    use_vinfo = vect_stmt_to_vectorize (use_vinfo);
 	    if (!STMT_SLP_TYPE (use_vinfo)
 		&& (STMT_VINFO_RELEVANT (use_vinfo)
 		    || VECTORIZABLE_CYCLE_DEF (STMT_VINFO_DEF_TYPE (use_vinfo)))
Index: gcc/tree-vect-stmts.c
===================================================================
--- gcc/tree-vect-stmts.c	2018-07-30 12:32:26.218536339 +0100
+++ gcc/tree-vect-stmts.c	2018-07-30 12:32:29.586506669 +0100
@@ -10031,11 +10031,8 @@ vect_is_simple_use (tree operand, vec_in
 	*dt = vect_external_def;
       else
 	{
-	  if (STMT_VINFO_IN_PATTERN_P (stmt_vinfo))
-	    {
-	      stmt_vinfo = STMT_VINFO_RELATED_STMT (stmt_vinfo);
-	      def_stmt = stmt_vinfo->stmt;
-	    }
+	  stmt_vinfo = vect_stmt_to_vectorize (stmt_vinfo);
+	  def_stmt = stmt_vinfo->stmt;
 	  switch (gimple_code (def_stmt))
 	    {
 	    case GIMPLE_PHI:


* [06/11] Handle VMAT_INVARIANT separately
  2018-07-30 11:36 [00/11] Add a vec_basic_block of scalar statements Richard Sandiford
                   ` (4 preceding siblings ...)
  2018-07-30 11:39 ` [05/11] Add a vect_stmt_to_vectorize helper function Richard Sandiford
@ 2018-07-30 11:41 ` Richard Sandiford
  2018-08-01 12:52   ` Richard Biener
  2018-07-30 11:42 ` [07/11] Use single basic block array in loop_vec_info Richard Sandiford
                   ` (4 subsequent siblings)
  10 siblings, 1 reply; 20+ messages in thread
From: Richard Sandiford @ 2018-07-30 11:41 UTC (permalink / raw)
  To: gcc-patches

Invariant loads were handled as a variation on the code for contiguous
loads.  We detected whether they were invariant or not as a byproduct of
creating the vector pointer ivs: vect_create_data_ref_ptr passed back an
inv_p to say whether the pointer was invariant.

But vectorised invariant loads just keep the original scalar load,
so this meant that detecting invariant loads had the side-effect of
creating an unwanted vector pointer iv.  The placement of the code
also meant that we'd create a vector load and then not use the result.
In principle this was wrong code, since there was no guarantee that there
was a vector's worth of accessible data at that address, but we relied on
DCE to get rid of the load before any harm was done.

E.g., for an invariant load in an inner loop (which seems like the more
common use case for this code), we'd create:

   vectp_a.6_52 = &a + 4;

   # vectp_a.5_53 = PHI <vectp_a.5_54(9), vectp_a.6_52(2)>

   # vectp_a.5_55 = PHI <vectp_a.5_53(3), vectp_a.5_56(10)>

   vect_next_a_11.7_57 = MEM[(int *)vectp_a.5_55];
   next_a_11 = a[_1];
   vect_cst__58 = {next_a_11, next_a_11, next_a_11, next_a_11};

   vectp_a.5_56 = vectp_a.5_55 + 4;

   vectp_a.5_54 = vectp_a.5_53 + 0;

whereas all we want is:

   next_a_11 = a[_1];
   vect_cst__58 = {next_a_11, next_a_11, next_a_11, next_a_11};

This patch moves the handling to its own block and makes
vect_create_data_ref_ptr assert (when creating a full iv) that the
address isn't invariant.

The ncopies handling is unfortunate, but a preexisting issue.
Richi's suggestion of using a vector of vector statements would
let us reuse one statement for all copies.


2018-07-30  Richard Sandiford  <richard.sandiford@arm.com>

gcc/
	* tree-vectorizer.h (vect_create_data_ref_ptr): Remove inv_p
	parameter.
	* tree-vect-data-refs.c (vect_create_data_ref_ptr): Likewise.
	When creating an iv, assert that the step is not known to be zero.
	(vect_setup_realignment): Update call accordingly.
	* tree-vect-stmts.c (vectorizable_store): Likewise.
	(vectorizable_load): Likewise.  Handle VMAT_INVARIANT separately.

Index: gcc/tree-vectorizer.h
===================================================================
*** gcc/tree-vectorizer.h	2018-07-30 12:32:29.586506669 +0100
--- gcc/tree-vectorizer.h	2018-07-30 12:40:13.000000000 +0100
*************** extern bool vect_analyze_data_refs (vec_
*** 1527,1533 ****
  extern void vect_record_base_alignments (vec_info *);
  extern tree vect_create_data_ref_ptr (stmt_vec_info, tree, struct loop *, tree,
  				      tree *, gimple_stmt_iterator *,
! 				      gimple **, bool, bool *,
  				      tree = NULL_TREE, tree = NULL_TREE);
  extern tree bump_vector_ptr (tree, gimple *, gimple_stmt_iterator *,
  			     stmt_vec_info, tree);
--- 1527,1533 ----
  extern void vect_record_base_alignments (vec_info *);
  extern tree vect_create_data_ref_ptr (stmt_vec_info, tree, struct loop *, tree,
  				      tree *, gimple_stmt_iterator *,
! 				      gimple **, bool,
  				      tree = NULL_TREE, tree = NULL_TREE);
  extern tree bump_vector_ptr (tree, gimple *, gimple_stmt_iterator *,
  			     stmt_vec_info, tree);
Index: gcc/tree-vect-data-refs.c
===================================================================
*** gcc/tree-vect-data-refs.c	2018-07-30 12:32:26.214536374 +0100
--- gcc/tree-vect-data-refs.c	2018-07-30 12:32:32.546480596 +0100
*************** vect_create_addr_base_for_vector_ref (st
*** 4674,4689 ****
  
        Return the increment stmt that updates the pointer in PTR_INCR.
  
!    3. Set INV_P to true if the access pattern of the data reference in the
!       vectorized loop is invariant.  Set it to false otherwise.
! 
!    4. Return the pointer.  */
  
  tree
  vect_create_data_ref_ptr (stmt_vec_info stmt_info, tree aggr_type,
  			  struct loop *at_loop, tree offset,
  			  tree *initial_address, gimple_stmt_iterator *gsi,
! 			  gimple **ptr_incr, bool only_init, bool *inv_p,
  			  tree byte_offset, tree iv_step)
  {
    const char *base_name;
--- 4674,4686 ----
  
        Return the increment stmt that updates the pointer in PTR_INCR.
  
!    3. Return the pointer.  */
  
  tree
  vect_create_data_ref_ptr (stmt_vec_info stmt_info, tree aggr_type,
  			  struct loop *at_loop, tree offset,
  			  tree *initial_address, gimple_stmt_iterator *gsi,
! 			  gimple **ptr_incr, bool only_init,
  			  tree byte_offset, tree iv_step)
  {
    const char *base_name;
*************** vect_create_data_ref_ptr (stmt_vec_info
*** 4705,4711 ****
    bool insert_after;
    tree indx_before_incr, indx_after_incr;
    gimple *incr;
-   tree step;
    bb_vec_info bb_vinfo = STMT_VINFO_BB_VINFO (stmt_info);
  
    gcc_assert (iv_step != NULL_TREE
--- 4702,4707 ----
*************** vect_create_data_ref_ptr (stmt_vec_info
*** 4726,4739 ****
        *ptr_incr = NULL;
      }
  
-   /* Check the step (evolution) of the load in LOOP, and record
-      whether it's invariant.  */
-   step = vect_dr_behavior (dr_info)->step;
-   if (integer_zerop (step))
-     *inv_p = true;
-   else
-     *inv_p = false;
- 
    /* Create an expression for the first address accessed by this load
       in LOOP.  */
    base_name = get_name (DR_BASE_ADDRESS (dr));
--- 4722,4727 ----
*************** vect_create_data_ref_ptr (stmt_vec_info
*** 4849,4863 ****
      aptr = aggr_ptr_init;
    else
      {
        if (iv_step == NULL_TREE)
  	{
! 	  /* The step of the aggregate pointer is the type size.  */
  	  iv_step = TYPE_SIZE_UNIT (aggr_type);
! 	  /* One exception to the above is when the scalar step of the load in
! 	     LOOP is zero. In this case the step here is also zero.  */
! 	  if (*inv_p)
! 	    iv_step = size_zero_node;
! 	  else if (tree_int_cst_sgn (step) == -1)
  	    iv_step = fold_build1 (NEGATE_EXPR, TREE_TYPE (iv_step), iv_step);
  	}
  
--- 4837,4853 ----
      aptr = aggr_ptr_init;
    else
      {
+       /* Accesses to invariant addresses should be handled specially
+ 	 by the caller.  */
+       tree step = vect_dr_behavior (dr_info)->step;
+       gcc_assert (!integer_zerop (step));
+ 
        if (iv_step == NULL_TREE)
  	{
! 	  /* The step of the aggregate pointer is the type size,
! 	     negated for downward accesses.  */
  	  iv_step = TYPE_SIZE_UNIT (aggr_type);
! 	  if (tree_int_cst_sgn (step) == -1)
  	    iv_step = fold_build1 (NEGATE_EXPR, TREE_TYPE (iv_step), iv_step);
  	}
  
*************** vect_setup_realignment (stmt_vec_info st
*** 5462,5468 ****
    gphi *phi_stmt;
    tree msq = NULL_TREE;
    gimple_seq stmts = NULL;
-   bool inv_p;
    bool compute_in_loop = false;
    bool nested_in_vect_loop = false;
    struct loop *containing_loop = (gimple_bb (stmt_info->stmt))->loop_father;
--- 5452,5457 ----
*************** vect_setup_realignment (stmt_vec_info st
*** 5556,5562 ****
        vec_dest = vect_create_destination_var (scalar_dest, vectype);
        ptr = vect_create_data_ref_ptr (stmt_info, vectype,
  				      loop_for_initial_load, NULL_TREE,
! 				      &init_addr, NULL, &inc, true, &inv_p);
        if (TREE_CODE (ptr) == SSA_NAME)
  	new_temp = copy_ssa_name (ptr);
        else
--- 5545,5551 ----
        vec_dest = vect_create_destination_var (scalar_dest, vectype);
        ptr = vect_create_data_ref_ptr (stmt_info, vectype,
  				      loop_for_initial_load, NULL_TREE,
! 				      &init_addr, NULL, &inc, true);
        if (TREE_CODE (ptr) == SSA_NAME)
  	new_temp = copy_ssa_name (ptr);
        else
Index: gcc/tree-vect-stmts.c
===================================================================
*** gcc/tree-vect-stmts.c	2018-07-30 12:32:29.586506669 +0100
--- gcc/tree-vect-stmts.c	2018-07-30 12:40:14.000000000 +0100
*************** vectorizable_store (stmt_vec_info stmt_i
*** 6254,6260 ****
    unsigned int group_size, i;
    vec<tree> oprnds = vNULL;
    vec<tree> result_chain = vNULL;
-   bool inv_p;
    tree offset = NULL_TREE;
    vec<tree> vec_oprnds = vNULL;
    bool slp = (slp_node != NULL);
--- 6254,6259 ----
*************** vectorizable_store (stmt_vec_info stmt_i
*** 7018,7039 ****
  	    {
  	      dataref_ptr = unshare_expr (DR_BASE_ADDRESS (first_dr_info->dr));
  	      dataref_offset = build_int_cst (ref_type, 0);
- 	      inv_p = false;
  	    }
  	  else if (STMT_VINFO_GATHER_SCATTER_P (stmt_info))
! 	    {
! 	      vect_get_gather_scatter_ops (loop, stmt_info, &gs_info,
! 					   &dataref_ptr, &vec_offset);
! 	      inv_p = false;
! 	    }
  	  else
  	    dataref_ptr
  	      = vect_create_data_ref_ptr (first_stmt_info, aggr_type,
  					  simd_lane_access_p ? loop : NULL,
  					  offset, &dummy, gsi, &ptr_incr,
! 					  simd_lane_access_p, &inv_p,
! 					  NULL_TREE, bump);
! 	  gcc_assert (bb_vinfo || !inv_p);
  	}
        else
  	{
--- 7017,7032 ----
  	    {
  	      dataref_ptr = unshare_expr (DR_BASE_ADDRESS (first_dr_info->dr));
  	      dataref_offset = build_int_cst (ref_type, 0);
  	    }
  	  else if (STMT_VINFO_GATHER_SCATTER_P (stmt_info))
! 	    vect_get_gather_scatter_ops (loop, stmt_info, &gs_info,
! 					 &dataref_ptr, &vec_offset);
  	  else
  	    dataref_ptr
  	      = vect_create_data_ref_ptr (first_stmt_info, aggr_type,
  					  simd_lane_access_p ? loop : NULL,
  					  offset, &dummy, gsi, &ptr_incr,
! 					  simd_lane_access_p, NULL_TREE, bump);
  	}
        else
  	{
*************** vectorizable_load (stmt_vec_info stmt_in
*** 7419,7425 ****
    bool grouped_load = false;
    stmt_vec_info first_stmt_info;
    stmt_vec_info first_stmt_info_for_drptr = NULL;
-   bool inv_p;
    bool compute_in_loop = false;
    struct loop *at_loop;
    int vec_num;
--- 7412,7417 ----
*************** vectorizable_load (stmt_vec_info stmt_in
*** 7669,7674 ****
--- 7661,7723 ----
        return true;
      }
  
+   if (memory_access_type == VMAT_INVARIANT)
+     {
+       gcc_assert (!grouped_load && !mask && !bb_vinfo);
+       /* If we have versioned for aliasing or the loop doesn't
+ 	 have any data dependencies that would preclude this,
+ 	 then we are sure this is a loop invariant load and
+ 	 thus we can insert it on the preheader edge.  */
+       bool hoist_p = (LOOP_VINFO_NO_DATA_DEPENDENCIES (loop_vinfo)
+ 		      && !nested_in_vect_loop
+ 		      && hoist_defs_of_uses (stmt_info, loop));
+       if (hoist_p)
+ 	{
+ 	  gassign *stmt = as_a <gassign *> (stmt_info->stmt);
+ 	  if (dump_enabled_p ())
+ 	    {
+ 	      dump_printf_loc (MSG_NOTE, vect_location,
+ 			       "hoisting out of the vectorized loop: ");
+ 	      dump_gimple_stmt (MSG_NOTE, TDF_SLIM, stmt, 0);
+ 	    }
+ 	  scalar_dest = copy_ssa_name (scalar_dest);
+ 	  tree rhs = unshare_expr (gimple_assign_rhs1 (stmt));
+ 	  gsi_insert_on_edge_immediate
+ 	    (loop_preheader_edge (loop),
+ 	     gimple_build_assign (scalar_dest, rhs));
+ 	}
+       /* These copies are all equivalent, but currently the representation
+ 	 requires a separate STMT_VINFO_VEC_STMT for each one.  */
+       prev_stmt_info = NULL;
+       gimple_stmt_iterator gsi2 = *gsi;
+       gsi_next (&gsi2);
+       for (j = 0; j < ncopies; j++)
+ 	{
+ 	  stmt_vec_info new_stmt_info;
+ 	  if (hoist_p)
+ 	    {
+ 	      new_temp = vect_init_vector (stmt_info, scalar_dest,
+ 					   vectype, NULL);
+ 	      gimple *new_stmt = SSA_NAME_DEF_STMT (new_temp);
+ 	      new_stmt_info = vinfo->add_stmt (new_stmt);
+ 	    }
+ 	  else
+ 	    {
+ 	      new_temp = vect_init_vector (stmt_info, scalar_dest,
+ 					   vectype, &gsi2);
+ 	      new_stmt_info = vinfo->lookup_def (new_temp);
+ 	    }
+ 	  if (slp)
+ 	    SLP_TREE_VEC_STMTS (slp_node).quick_push (new_stmt_info);
+ 	  else if (j == 0)
+ 	    STMT_VINFO_VEC_STMT (stmt_info) = *vec_stmt = new_stmt_info;
+ 	  else
+ 	    STMT_VINFO_RELATED_STMT (prev_stmt_info) = new_stmt_info;
+ 	  prev_stmt_info = new_stmt_info;
+ 	}
+       return true;
+     }
+ 
    if (memory_access_type == VMAT_ELEMENTWISE
        || memory_access_type == VMAT_STRIDED_SLP)
      {
*************** vectorizable_load (stmt_vec_info stmt_in
*** 8177,8183 ****
  	    {
  	      dataref_ptr = unshare_expr (DR_BASE_ADDRESS (first_dr_info->dr));
  	      dataref_offset = build_int_cst (ref_type, 0);
- 	      inv_p = false;
  	    }
  	  else if (first_stmt_info_for_drptr
  		   && first_stmt_info != first_stmt_info_for_drptr)
--- 8226,8231 ----
*************** vectorizable_load (stmt_vec_info stmt_in
*** 8186,8192 ****
  		= vect_create_data_ref_ptr (first_stmt_info_for_drptr,
  					    aggr_type, at_loop, offset, &dummy,
  					    gsi, &ptr_incr, simd_lane_access_p,
! 					    &inv_p, byte_offset, bump);
  	      /* Adjust the pointer by the difference to first_stmt.  */
  	      data_reference_p ptrdr
  		= STMT_VINFO_DATA_REF (first_stmt_info_for_drptr);
--- 8234,8240 ----
  		= vect_create_data_ref_ptr (first_stmt_info_for_drptr,
  					    aggr_type, at_loop, offset, &dummy,
  					    gsi, &ptr_incr, simd_lane_access_p,
! 					    byte_offset, bump);
  	      /* Adjust the pointer by the difference to first_stmt.  */
  	      data_reference_p ptrdr
  		= STMT_VINFO_DATA_REF (first_stmt_info_for_drptr);
*************** vectorizable_load (stmt_vec_info stmt_in
*** 8199,8214 ****
  					     stmt_info, diff);
  	    }
  	  else if (STMT_VINFO_GATHER_SCATTER_P (stmt_info))
! 	    {
! 	      vect_get_gather_scatter_ops (loop, stmt_info, &gs_info,
! 					   &dataref_ptr, &vec_offset);
! 	      inv_p = false;
! 	    }
  	  else
  	    dataref_ptr
  	      = vect_create_data_ref_ptr (first_stmt_info, aggr_type, at_loop,
  					  offset, &dummy, gsi, &ptr_incr,
! 					  simd_lane_access_p, &inv_p,
  					  byte_offset, bump);
  	  if (mask)
  	    vec_mask = vect_get_vec_def_for_operand (mask, stmt_info,
--- 8247,8259 ----
  					     stmt_info, diff);
  	    }
  	  else if (STMT_VINFO_GATHER_SCATTER_P (stmt_info))
! 	    vect_get_gather_scatter_ops (loop, stmt_info, &gs_info,
! 					 &dataref_ptr, &vec_offset);
  	  else
  	    dataref_ptr
  	      = vect_create_data_ref_ptr (first_stmt_info, aggr_type, at_loop,
  					  offset, &dummy, gsi, &ptr_incr,
! 					  simd_lane_access_p,
  					  byte_offset, bump);
  	  if (mask)
  	    vec_mask = vect_get_vec_def_for_operand (mask, stmt_info,
*************** vectorizable_load (stmt_vec_info stmt_in
*** 8492,8538 ****
  		    }
  		}
  
- 	      /* 4. Handle invariant-load.  */
- 	      if (inv_p && !bb_vinfo)
- 		{
- 		  gcc_assert (!grouped_load);
- 		  /* If we have versioned for aliasing or the loop doesn't
- 		     have any data dependencies that would preclude this,
- 		     then we are sure this is a loop invariant load and
- 		     thus we can insert it on the preheader edge.  */
- 		  if (LOOP_VINFO_NO_DATA_DEPENDENCIES (loop_vinfo)
- 		      && !nested_in_vect_loop
- 		      && hoist_defs_of_uses (stmt_info, loop))
- 		    {
- 		      gassign *stmt = as_a <gassign *> (stmt_info->stmt);
- 		      if (dump_enabled_p ())
- 			{
- 			  dump_printf_loc (MSG_NOTE, vect_location,
- 					   "hoisting out of the vectorized "
- 					   "loop: ");
- 			  dump_gimple_stmt (MSG_NOTE, TDF_SLIM, stmt, 0);
- 			}
- 		      tree tem = copy_ssa_name (scalar_dest);
- 		      gsi_insert_on_edge_immediate
- 			(loop_preheader_edge (loop),
- 			 gimple_build_assign (tem,
- 					      unshare_expr
- 					        (gimple_assign_rhs1 (stmt))));
- 		      new_temp = vect_init_vector (stmt_info, tem,
- 						   vectype, NULL);
- 		      new_stmt = SSA_NAME_DEF_STMT (new_temp);
- 		      new_stmt_info = vinfo->add_stmt (new_stmt);
- 		    }
- 		  else
- 		    {
- 		      gimple_stmt_iterator gsi2 = *gsi;
- 		      gsi_next (&gsi2);
- 		      new_temp = vect_init_vector (stmt_info, scalar_dest,
- 						   vectype, &gsi2);
- 		      new_stmt_info = vinfo->lookup_def (new_temp);
- 		    }
- 		}
- 
  	      if (memory_access_type == VMAT_CONTIGUOUS_REVERSE)
  		{
  		  tree perm_mask = perm_mask_for_reverse (vectype);
--- 8537,8542 ----


* [07/11] Use single basic block array in loop_vec_info
  2018-07-30 11:36 [00/11] Add a vec_basic_block of scalar statements Richard Sandiford
                   ` (5 preceding siblings ...)
  2018-07-30 11:41 ` [06/11] Handle VMAT_INVARIANT separately Richard Sandiford
@ 2018-07-30 11:42 ` Richard Sandiford
  2018-08-01 12:58   ` Richard Biener
  2018-07-30 11:43 ` [08/11] Make hoist_defs_of_uses use vec_info::lookup_def Richard Sandiford
                   ` (3 subsequent siblings)
  10 siblings, 1 reply; 20+ messages in thread
From: Richard Sandiford @ 2018-07-30 11:42 UTC (permalink / raw)
  To: gcc-patches

_loop_vec_info::_loop_vec_info used get_loop_body to get the
order of the blocks when creating stmt_vec_infos, but then used
dfs_enumerate_from to get the order of the blocks that the rest
of the vectoriser uses.  We should be able to use that order
for creating stmt_vec_infos too.


2018-07-30  Richard Sandiford  <richard.sandiford@arm.com>

gcc/
	* tree-vect-loop.c (_loop_vec_info::_loop_vec_info): Use the
	result of dfs_enumerate_from when constructing stmt_vec_infos,
	instead of additionally calling get_loop_body.

Index: gcc/tree-vect-loop.c
===================================================================
*** gcc/tree-vect-loop.c	2018-07-30 12:40:59.366015643 +0100
--- gcc/tree-vect-loop.c	2018-07-30 12:40:59.362015678 +0100
*************** _loop_vec_info::_loop_vec_info (struct l
*** 834,844 ****
      scalar_loop (NULL),
      orig_loop_info (NULL)
  {
!   /* Create/Update stmt_info for all stmts in the loop.  */
!   basic_block *body = get_loop_body (loop);
!   for (unsigned int i = 0; i < loop->num_nodes; i++)
      {
!       basic_block bb = body[i];
        gimple_stmt_iterator si;
  
        for (si = gsi_start_phis (bb); !gsi_end_p (si); gsi_next (&si))
--- 834,851 ----
      scalar_loop (NULL),
      orig_loop_info (NULL)
  {
!   /* CHECKME: We want to visit all BBs before their successors (except for
!      latch blocks, for which this assertion wouldn't hold).  In the simple
!      case of the loop forms we allow, a dfs order of the BBs would the same
!      as reversed postorder traversal, so we are safe.  */
! 
!   unsigned int nbbs = dfs_enumerate_from (loop->header, 0, bb_in_loop_p,
! 					  bbs, loop->num_nodes, loop);
!   gcc_assert (nbbs == loop->num_nodes);
! 
!   for (unsigned int i = 0; i < nbbs; i++)
      {
!       basic_block bb = bbs[i];
        gimple_stmt_iterator si;
  
        for (si = gsi_start_phis (bb); !gsi_end_p (si); gsi_next (&si))
*************** _loop_vec_info::_loop_vec_info (struct l
*** 855,870 ****
  	  add_stmt (stmt);
  	}
      }
-   free (body);
- 
-   /* CHECKME: We want to visit all BBs before their successors (except for
-      latch blocks, for which this assertion wouldn't hold).  In the simple
-      case of the loop forms we allow, a dfs order of the BBs would the same
-      as reversed postorder traversal, so we are safe.  */
- 
-   unsigned int nbbs = dfs_enumerate_from (loop->header, 0, bb_in_loop_p,
- 					  bbs, loop->num_nodes, loop);
-   gcc_assert (nbbs == loop->num_nodes);
  }
  
  /* Free all levels of MASKS.  */
--- 862,867 ----


* [08/11] Make hoist_defs_of_uses use vec_info::lookup_def
  2018-07-30 11:36 [00/11] Add a vec_basic_block of scalar statements Richard Sandiford
                   ` (6 preceding siblings ...)
  2018-07-30 11:42 ` [07/11] Use single basic block array in loop_vec_info Richard Sandiford
@ 2018-07-30 11:43 ` Richard Sandiford
  2018-08-01 13:01   ` Richard Biener
  2018-07-30 11:46 ` [09/11] Add a vec_basic_block structure Richard Sandiford
                   ` (2 subsequent siblings)
  10 siblings, 1 reply; 20+ messages in thread
From: Richard Sandiford @ 2018-07-30 11:43 UTC (permalink / raw)
  To: gcc-patches

This patch makes hoist_defs_of_uses use vec_info::lookup_def instead of:

      if (!gimple_nop_p (def_stmt)
	  && flow_bb_inside_loop_p (loop, gimple_bb (def_stmt)))

to test whether a feeding scalar statement needs to be hoisted out
of the vectorised loop.  It isn't worth doing in its own right,
but it's a prerequisite for the next patch, which needs to update
the stmt_vec_infos of the hoisted statements.


2018-07-30  Richard Sandiford  <richard.sandiford@arm.com>

gcc/
	* tree-vect-stmts.c (hoist_defs_of_uses): Use vec_info::lookup_def
	instead of gimple_nop_p and flow_bb_inside_loop_p to decide
	whether a statement needs to be hoisted.

Index: gcc/tree-vect-stmts.c
===================================================================
*** gcc/tree-vect-stmts.c	2018-07-30 12:42:35.633169005 +0100
--- gcc/tree-vect-stmts.c	2018-07-30 12:42:35.629169040 +0100
*************** permute_vec_elements (tree x, tree y, tr
*** 7322,7370 ****
  static bool
  hoist_defs_of_uses (stmt_vec_info stmt_info, struct loop *loop)
  {
    ssa_op_iter i;
    tree op;
    bool any = false;
  
    FOR_EACH_SSA_TREE_OPERAND (op, stmt_info->stmt, i, SSA_OP_USE)
!     {
!       gimple *def_stmt = SSA_NAME_DEF_STMT (op);
!       if (!gimple_nop_p (def_stmt)
! 	  && flow_bb_inside_loop_p (loop, gimple_bb (def_stmt)))
! 	{
! 	  /* Make sure we don't need to recurse.  While we could do
! 	     so in simple cases when there are more complex use webs
! 	     we don't have an easy way to preserve stmt order to fulfil
! 	     dependencies within them.  */
! 	  tree op2;
! 	  ssa_op_iter i2;
! 	  if (gimple_code (def_stmt) == GIMPLE_PHI)
  	    return false;
! 	  FOR_EACH_SSA_TREE_OPERAND (op2, def_stmt, i2, SSA_OP_USE)
! 	    {
! 	      gimple *def_stmt2 = SSA_NAME_DEF_STMT (op2);
! 	      if (!gimple_nop_p (def_stmt2)
! 		  && flow_bb_inside_loop_p (loop, gimple_bb (def_stmt2)))
! 		return false;
! 	    }
! 	  any = true;
! 	}
!     }
  
    if (!any)
      return true;
  
    FOR_EACH_SSA_TREE_OPERAND (op, stmt_info->stmt, i, SSA_OP_USE)
!     {
!       gimple *def_stmt = SSA_NAME_DEF_STMT (op);
!       if (!gimple_nop_p (def_stmt)
! 	  && flow_bb_inside_loop_p (loop, gimple_bb (def_stmt)))
! 	{
! 	  gimple_stmt_iterator gsi = gsi_for_stmt (def_stmt);
! 	  gsi_remove (&gsi, false);
! 	  gsi_insert_on_edge_immediate (loop_preheader_edge (loop), def_stmt);
! 	}
!     }
  
    return true;
  }
--- 7322,7360 ----
  static bool
  hoist_defs_of_uses (stmt_vec_info stmt_info, struct loop *loop)
  {
+   vec_info *vinfo = stmt_info->vinfo;
    ssa_op_iter i;
    tree op;
    bool any = false;
  
    FOR_EACH_SSA_TREE_OPERAND (op, stmt_info->stmt, i, SSA_OP_USE)
!     if (stmt_vec_info def_stmt_info = vinfo->lookup_def (op))
!       {
! 	/* Make sure we don't need to recurse.  While we could do
! 	   so in simple cases when there are more complex use webs
! 	   we don't have an easy way to preserve stmt order to fulfil
! 	   dependencies within them.  */
! 	tree op2;
! 	ssa_op_iter i2;
! 	if (gimple_code (def_stmt_info->stmt) == GIMPLE_PHI)
! 	  return false;
! 	FOR_EACH_SSA_TREE_OPERAND (op2, def_stmt_info->stmt, i2, SSA_OP_USE)
! 	  if (vinfo->lookup_def (op2))
  	    return false;
! 	any = true;
!       }
  
    if (!any)
      return true;
  
    FOR_EACH_SSA_TREE_OPERAND (op, stmt_info->stmt, i, SSA_OP_USE)
!     if (stmt_vec_info def_stmt_info = vinfo->lookup_def (op))
!       {
! 	gimple_stmt_iterator gsi = gsi_for_stmt (def_stmt_info->stmt);
! 	gsi_remove (&gsi, false);
! 	gsi_insert_on_edge_immediate (loop_preheader_edge (loop),
! 				      def_stmt_info->stmt);
!       }
  
    return true;
  }


* [10/11] Make the vectoriser do its own DCE
  2018-07-30 11:36 [00/11] Add a vec_basic_block of scalar statements Richard Sandiford
                   ` (8 preceding siblings ...)
  2018-07-30 11:46 ` [09/11] Add a vec_basic_block structure Richard Sandiford
@ 2018-07-30 11:46 ` Richard Sandiford
  2018-07-30 11:47 ` [11/11] Insert pattern statements into vec_basic_blocks Richard Sandiford
  10 siblings, 0 replies; 20+ messages in thread
From: Richard Sandiford @ 2018-07-30 11:46 UTC (permalink / raw)
  To: gcc-patches

The vectoriser normally leaves a later DCE pass to remove the scalar
code, but we've accumulated various bits of code to handle cases that
DCE can't, such as removing the scalar stores that have been replaced
by vector stores and the scalar calls to internal functions.  (The
latter must be removed for correctness, since no underlying scalar
optabs exist for those calls.)

Now that vec_basic_block gives us an easy way of iterating over the
original scalar code (ignoring any new code inserted by the vectoriser),
it seems easier to do the DCE directly.  This involves marking the few
cases in which the vector code needs part of the original scalar code
to be kept around.
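
Very roughly, and purely as an illustration (the liveness check below is
an invented name rather than one of the patch's interfaces): after
vectorisation the original scalar statements can be walked and removed
unless the vector code marked them as needed or their results are still
live:

  /* Sketch only; vect_remove_dead_scalar_stmts in the patch is the
     real implementation.  */
  if (!stmt_info->used_by_vector_code_p
      && !scalar_results_still_used_p (stmt_info))  /* hypothetical */
    vinfo->remove_stmt (stmt_info);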


2018-07-30  Richard Sandiford  <richard.sandiford@arm.com>

gcc/
	* tree-vectorizer.h (_stmt_vec_info::used_by_vector_code_p): New
	member variable.
	(vect_mark_used_by_vector_code): Declare.
	(vect_remove_dead_scalar_stmts): Likewise.
	(vect_transform_stmt): Return void.
	(vect_remove_stores): Delete.
	* tree-vectorizer.c (vec_info::remove_stmt): Handle phis.
	* tree-vect-stmts.c (vect_mark_used_by_vector_code): New function.
	(vectorizable_call, vectorizable_simd_clone_call): Don't remove
	scalar calls here.
	(vectorizable_load): Mark unhoisted scalar loads that feed a
	load-and-broadcast operation as being needed by the vector code.
	(vect_transform_stmt): Return void.
	(vect_remove_stores): Delete.
	(vect_maybe_remove_scalar_stmt): New function.
	(vect_remove_dead_scalar_stmts): Likewise.
	* tree-vect-slp.c (vect_slp_bb): Call vect_remove_dead_scalar_stmts.
	(vect_remove_slp_scalar_calls): Delete.
	(vect_schedule_slp): Don't call it.  Don't remove scalar stores here.
	* tree-vect-loop.c (vectorizable_reduction): Mark scalar phis that
	are retained by the vector code.
	(vectorizable_live_operation): Mark scalar live-out statements that
	are retained by the vector code.
	(vect_transform_loop_stmt): Remove seen_store argument.  Mark gconds
	in nested loops as being needed by the vector code.  Replace the
	outer loop's gcond with a dummy condition.
	(vect_transform_loop): Update calls accordingly.  Don't remove
	scalar stores or calls here; call vect_remove_dead_scalar_stmts
	instead.

Index: gcc/tree-vectorizer.h
===================================================================
--- gcc/tree-vectorizer.h	2018-07-30 12:32:42.790390350 +0100
+++ gcc/tree-vectorizer.h	2018-07-30 12:32:46.658356275 +0100
@@ -925,6 +925,10 @@ struct _stmt_vec_info {
   /* For both loads and stores.  */
   bool simd_lane_access_p;
 
+  /* True if the vectorized code keeps this statement in its current form.
+     Only meaningful for statements that were in the original scalar code.  */
+  bool used_by_vector_code_p;
+
   /* Classifies how the load or store is going to be implemented
      for loop vectorization.  */
   vect_memory_access_type memory_access_type;
@@ -1522,6 +1526,7 @@ extern stmt_vec_info vect_finish_replace
 extern stmt_vec_info vect_finish_stmt_generation (stmt_vec_info, gimple *,
 						  gimple_stmt_iterator *);
 extern bool vect_mark_stmts_to_be_vectorized (loop_vec_info);
+extern void vect_mark_used_by_vector_code (stmt_vec_info);
 extern tree vect_get_store_rhs (stmt_vec_info);
 extern tree vect_get_vec_def_for_operand_1 (stmt_vec_info, enum vect_def_type);
 extern tree vect_get_vec_def_for_operand (tree, stmt_vec_info, tree = NULL);
@@ -1532,9 +1537,8 @@ extern void vect_get_vec_defs_for_stmt_c
 extern tree vect_init_vector (stmt_vec_info, tree, tree,
                               gimple_stmt_iterator *);
 extern tree vect_get_vec_def_for_stmt_copy (vec_info *, tree);
-extern bool vect_transform_stmt (stmt_vec_info, gimple_stmt_iterator *,
+extern void vect_transform_stmt (stmt_vec_info, gimple_stmt_iterator *,
 				 slp_tree, slp_instance);
-extern void vect_remove_stores (stmt_vec_info);
 extern bool vect_analyze_stmt (stmt_vec_info, bool *, slp_tree, slp_instance,
 			       stmt_vector_for_cost *);
 extern bool vectorizable_condition (stmt_vec_info, gimple_stmt_iterator *,
@@ -1554,6 +1558,7 @@ extern gcall *vect_gen_while (tree, tree
 extern tree vect_gen_while_not (gimple_seq *, tree, tree, tree);
 extern bool vect_get_vector_types_for_stmt (stmt_vec_info, tree *, tree *);
 extern tree vect_get_mask_type_for_stmt (stmt_vec_info);
+extern void vect_remove_dead_scalar_stmts (vec_info *);
 
 /* In tree-vect-data-refs.c.  */
 extern bool vect_can_force_dr_alignment_p (const_tree, unsigned int);
Index: gcc/tree-vectorizer.c
===================================================================
--- gcc/tree-vectorizer.c	2018-07-30 12:32:42.790390350 +0100
+++ gcc/tree-vectorizer.c	2018-07-30 12:32:46.658356275 +0100
@@ -653,8 +653,13 @@ vec_info::remove_stmt (stmt_vec_info stm
   set_vinfo_for_stmt (stmt_info->stmt, NULL);
   gimple_stmt_iterator si = gsi_for_stmt (stmt_info->stmt);
   unlink_stmt_vdef (stmt_info->stmt);
-  gsi_remove (&si, true);
-  release_defs (stmt_info->stmt);
+  if (is_a <gphi *> (stmt_info->stmt))
+    remove_phi_node (&si, true);
+  else
+    {
+      gsi_remove (&si, true);
+      release_defs (stmt_info->stmt);
+    }
   stmt_info->block->remove (stmt_info);
   free_stmt_vec_info (stmt_info);
 }
Index: gcc/tree-vect-stmts.c
===================================================================
--- gcc/tree-vect-stmts.c	2018-07-30 12:32:42.790390350 +0100
+++ gcc/tree-vect-stmts.c	2018-07-30 12:32:46.658356275 +0100
@@ -766,6 +766,34 @@ vect_mark_stmts_to_be_vectorized (loop_v
   return true;
 }
 
+/* Record that scalar statement STMT_INFO is needed by the vectorized code.  */
+
+void
+vect_mark_used_by_vector_code (stmt_vec_info stmt_info)
+{
+  vec_info *vinfo = stmt_info->vinfo;
+  auto_vec<stmt_vec_info, 16> worklist;
+  worklist.quick_push (stmt_info);
+  stmt_info->used_by_vector_code_p = true;
+  do
+    {
+      stmt_info = worklist.pop ();
+      ssa_op_iter iter;
+      use_operand_p use;
+      FOR_EACH_PHI_OR_STMT_USE (use, stmt_info->stmt, iter, SSA_OP_USE)
+	{
+	  tree op = USE_FROM_PTR (use);
+	  stmt_vec_info def_stmt_info = vinfo->lookup_def (op);
+	  if (def_stmt_info && !def_stmt_info->used_by_vector_code_p)
+	    {
+	      def_stmt_info->used_by_vector_code_p = 1;
+	      worklist.safe_push (def_stmt_info);
+	    }
+	}
+    }
+  while (!worklist.is_empty ());
+}
+
 /* Compute the prologue cost for invariant or constant operands.  */
 
 static unsigned
@@ -3094,7 +3122,6 @@ vectorizable_call (stmt_vec_info stmt_in
   auto_vec<tree, 8> orig_vargs;
   enum { NARROW, NONE, WIDEN } modifier;
   size_t i, nargs;
-  tree lhs;
 
   if (!STMT_VINFO_RELEVANT_P (stmt_info) && !bb_vinfo)
     return false;
@@ -3592,22 +3619,6 @@ vectorizable_call (stmt_vec_info stmt_in
     return false;
 
   vargs.release ();
-
-  /* The call in STMT might prevent it from being removed in dce.
-     We however cannot remove it here, due to the way the ssa name
-     it defines is mapped to the new definition.  So just replace
-     rhs of the statement with something harmless.  */
-
-  if (slp_node)
-    return true;
-
-  stmt_info = vect_orig_stmt (stmt_info);
-  lhs = gimple_get_lhs (stmt_info->stmt);
-
-  gassign *new_stmt
-    = gimple_build_assign (lhs, build_zero_cst (TREE_TYPE (lhs)));
-  vinfo->replace_stmt (gsi, stmt_info, new_stmt);
-
   return true;
 }
 
@@ -3716,7 +3727,7 @@ vectorizable_simd_clone_call (stmt_vec_i
 {
   tree vec_dest;
   tree scalar_dest;
-  tree op, type;
+  tree op;
   tree vec_oprnd0 = NULL_TREE;
   stmt_vec_info prev_stmt_info;
   tree vectype;
@@ -3730,7 +3741,7 @@ vectorizable_simd_clone_call (stmt_vec_i
   auto_vec<simd_call_arg_info> arginfo;
   vec<tree> vargs = vNULL;
   size_t i, nargs;
-  tree lhs, rtype, ratype;
+  tree rtype, ratype;
   vec<constructor_elt, va_gc> *ret_ctor_elts = NULL;
 
   /* Is STMT a vectorizable call?   */
@@ -4323,27 +4334,6 @@ vectorizable_simd_clone_call (stmt_vec_i
     }
 
   vargs.release ();
-
-  /* The call in STMT might prevent it from being removed in dce.
-     We however cannot remove it here, due to the way the ssa name
-     it defines is mapped to the new definition.  So just replace
-     rhs of the statement with something harmless.  */
-
-  if (slp_node)
-    return true;
-
-  gimple *new_stmt;
-  if (scalar_dest)
-    {
-      type = TREE_TYPE (scalar_dest);
-      lhs = gimple_call_lhs (vect_orig_stmt (stmt_info)->stmt);
-      new_stmt = gimple_build_assign (lhs, build_zero_cst (type));
-    }
-  else
-    new_stmt = gimple_build_nop ();
-  vinfo->replace_stmt (gsi, stmt_info, new_stmt);
-  unlink_stmt_vdef (stmt);
-
   return true;
 }
 
@@ -7650,6 +7640,8 @@ vectorizable_load (stmt_vec_info stmt_in
 	    (loop_preheader_edge (loop),
 	     gimple_build_assign (scalar_dest, rhs));
 	}
+      else
+	vect_mark_used_by_vector_code (stmt_info);
       /* These copies are all equivalent, but currently the representation
 	 requires a separate STMT_VINFO_VEC_STMT for each one.  */
       prev_stmt_info = NULL;
@@ -9625,12 +9617,11 @@ vect_analyze_stmt (stmt_vec_info stmt_in
 
    Create a vectorized stmt to replace STMT_INFO, and insert it at BSI.  */
 
-bool
+void
 vect_transform_stmt (stmt_vec_info stmt_info, gimple_stmt_iterator *gsi,
 		     slp_tree slp_node, slp_instance slp_node_instance)
 {
   vec_info *vinfo = stmt_info->vinfo;
-  bool is_store = false;
   stmt_vec_info vec_stmt = NULL;
   bool done;
 
@@ -9685,18 +9676,6 @@ vect_transform_stmt (stmt_vec_info stmt_
     case store_vec_info_type:
       done = vectorizable_store (stmt_info, gsi, &vec_stmt, slp_node, NULL);
       gcc_assert (done);
-      if (STMT_VINFO_GROUPED_ACCESS (stmt_info) && !slp_node)
-	{
-	  /* In case of interleaving, the whole chain is vectorized when the
-	     last store in the chain is reached.  Store stmts before the last
-	     one are skipped, and there vec_stmt_info shouldn't be freed
-	     meanwhile.  */
-	  stmt_vec_info group_info = DR_GROUP_FIRST_ELEMENT (stmt_info);
-	  if (DR_GROUP_STORE_COUNT (group_info) == DR_GROUP_SIZE (group_info))
-	    is_store = true;
-	}
-      else
-	is_store = true;
       break;
 
     case condition_vec_info_type:
@@ -9791,30 +9770,9 @@ vect_transform_stmt (stmt_vec_info stmt_
 
   if (vec_stmt)
     STMT_VINFO_VEC_STMT (stmt_info) = vec_stmt;
-
-  return is_store;
 }
 
 
-/* Remove a group of stores (for SLP or interleaving), free their
-   stmt_vec_info.  */
-
-void
-vect_remove_stores (stmt_vec_info first_stmt_info)
-{
-  vec_info *vinfo = first_stmt_info->vinfo;
-  stmt_vec_info next_stmt_info = first_stmt_info;
-
-  while (next_stmt_info)
-    {
-      stmt_vec_info tmp = DR_GROUP_NEXT_ELEMENT (next_stmt_info);
-      next_stmt_info = vect_orig_stmt (next_stmt_info);
-      /* Free the attached stmt_vec_info and remove the stmt.  */
-      vinfo->remove_stmt (next_stmt_info);
-      next_stmt_info = tmp;
-    }
-}
-
 /* Function get_vectype_for_scalar_type_and_size.
 
    Returns the vector type corresponding to SCALAR_TYPE  and SIZE as supported
@@ -10852,3 +10810,112 @@ vect_get_mask_type_for_stmt (stmt_vec_in
     }
   return mask_type;
 }
+
+/* Handle vect_remove_dead_scalar_stmts for statement STMT_INFO.  */
+
+static void
+vect_maybe_remove_scalar_stmt (stmt_vec_info stmt_info)
+{
+  vec_info *vinfo = stmt_info->vinfo;
+  bool bb_p = is_a <bb_vec_info> (vinfo);
+
+  /* Keep scalar statements that are needed by the vectorized code,
+     such as the phi in a "fold left" reduction or the load in a
+     load-and-broadcast operation.  */
+  if (stmt_info->used_by_vector_code_p)
+    return;
+
+  tree lhs;
+  if (gphi *phi = dyn_cast <gphi *> (stmt_info->stmt))
+    lhs = gimple_phi_result (phi);
+  else if (is_a <gassign *> (stmt_info->stmt)
+	   || is_a <gcall *> (stmt_info->stmt))
+    lhs = gimple_get_lhs (stmt_info->stmt);
+  else
+    /* Don't remove other types of statement.  */
+    return;
+
+  if (lhs && TREE_CODE (lhs) == SSA_NAME && !has_zero_uses (lhs))
+    {
+      /* Keep the virtual operand phi.  In all other cases,
+	 !used_by_vector_code_p should guarantee that the statement
+	 is dead for loop vectorization.  */
+      if (bb_p || virtual_operand_p (lhs))
+	return;
+
+      /* Check and process all uses of the lhs.  */
+      imm_use_iterator imm_iter;
+      gimple *use_stmt;
+      FOR_EACH_IMM_USE_STMT (use_stmt, imm_iter, lhs)
+	{
+	  if (is_gimple_debug (use_stmt))
+	    continue;
+
+	  /* The use must be either a loop phi (which we'll delete later)
+	     or an exit phi.  */
+	  gphi *use_phi = dyn_cast <gphi *> (use_stmt);
+	  gcc_assert (use_phi);
+	  if (!vinfo->lookup_stmt (use_phi))
+	    {
+	      /* It's an exit phi, and the phi must itself be dead code.  */
+	      gcc_assert (has_zero_uses (gimple_phi_result (use_phi)));
+	      if (dump_enabled_p ())
+		{
+		  dump_printf_loc (MSG_NOTE, vect_location,
+				   "deleting exit phi: ");
+		  dump_gimple_stmt (MSG_NOTE, TDF_SLIM, use_phi, 0);
+		}
+	      gimple_stmt_iterator gsi = gsi_for_stmt (use_phi);
+	      remove_phi_node (&gsi, true);
+	    }
+	}
+    }
+
+  /* The loop vectorizer decides for each statement whether the statement
+     should be vectorized, dropped or kept as-is.  We dealt with the last
+     case above, so anything left can be removed.  However, the region
+     used in BB vectorization includes unrelated statements that we should
+     only drop if we can prove they are dead.  */
+  if (bb_p
+      && ((lhs && TREE_CODE (lhs) != SSA_NAME)
+	  || gimple_vdef (stmt_info->stmt)
+	  || gimple_has_side_effects (stmt_info->stmt)))
+    {
+      stmt_vec_info final_info = vect_stmt_to_vectorize (stmt_info);
+      /* RELEVANT_P is only meaningful if the statement is still part of
+	 an SLP instance.  It can be set on other statements that were
+	 tentatively part of an SLP instance that we had to abandon.  */
+      if (STMT_VINFO_NUM_SLP_USES (final_info) == 0
+	  || !STMT_VINFO_RELEVANT_P (final_info))
+	return;
+    }
+
+  if (dump_enabled_p ())
+    {
+      dump_printf_loc (MSG_NOTE, vect_location, "deleting scalar stmt: ");
+      dump_gimple_stmt (MSG_NOTE, TDF_SLIM, stmt_info->stmt, 0);
+    }
+  vinfo->remove_stmt (stmt_info);
+}
+
+/* Called after vectorizing VINFO.  Remove any scalar statements that
+   are no longer needed.  */
+
+void
+vect_remove_dead_scalar_stmts (vec_info *vinfo)
+{
+  DUMP_VECT_SCOPE ("vect_remove_dead_scalar_stmts");
+
+  unsigned int i;
+  vec_basic_block *vec_bb;
+  FOR_EACH_VEC_ELT_REVERSE (vinfo->blocks, i, vec_bb)
+    {
+      stmt_vec_info prev_stmt_info;
+      for (stmt_vec_info stmt_info = vec_bb->last (); stmt_info;
+	   stmt_info = prev_stmt_info)
+	{
+	  prev_stmt_info = stmt_info->prev;
+	  vect_maybe_remove_scalar_stmt (stmt_info);
+	}
+    }
+}
Index: gcc/tree-vect-slp.c
===================================================================
--- gcc/tree-vect-slp.c	2018-07-30 12:32:42.790390350 +0100
+++ gcc/tree-vect-slp.c	2018-07-30 12:32:46.654356310 +0100
@@ -3024,6 +3024,7 @@ vect_slp_bb (basic_block bb)
 
 	  bb_vinfo->shared->check_datarefs ();
 	  vect_schedule_slp (bb_vinfo);
+	  vect_remove_dead_scalar_stmts (bb_vinfo);
 
 	  unsigned HOST_WIDE_INT bytes;
 	  if (current_vector_size.is_constant (&bytes))
@@ -3986,43 +3987,6 @@ vect_schedule_slp_instance (slp_tree nod
       }
 }
 
-/* Replace scalar calls from SLP node NODE with setting of their lhs to zero.
-   For loop vectorization this is done in vectorizable_call, but for SLP
-   it needs to be deferred until end of vect_schedule_slp, because multiple
-   SLP instances may refer to the same scalar stmt.  */
-
-static void
-vect_remove_slp_scalar_calls (slp_tree node)
-{
-  gimple *new_stmt;
-  gimple_stmt_iterator gsi;
-  int i;
-  slp_tree child;
-  tree lhs;
-  stmt_vec_info stmt_info;
-
-  if (SLP_TREE_DEF_TYPE (node) != vect_internal_def)
-    return;
-
-  FOR_EACH_VEC_ELT (SLP_TREE_CHILDREN (node), i, child)
-    vect_remove_slp_scalar_calls (child);
-
-  FOR_EACH_VEC_ELT (SLP_TREE_SCALAR_STMTS (node), i, stmt_info)
-    {
-      gcall *stmt = dyn_cast <gcall *> (stmt_info->stmt);
-      if (!stmt || gimple_bb (stmt) == NULL)
-	continue;
-      if (is_pattern_stmt_p (stmt_info)
-	  || !PURE_SLP_STMT (stmt_info))
-	continue;
-      lhs = gimple_call_lhs (stmt);
-      new_stmt = gimple_build_assign (lhs, build_zero_cst (TREE_TYPE (lhs)));
-      gsi = gsi_for_stmt (stmt);
-      stmt_info->vinfo->replace_stmt (&gsi, stmt_info, new_stmt);
-      SSA_NAME_DEF_STMT (gimple_assign_lhs (new_stmt)) = new_stmt;
-    }
-}
-
 /* Generate vector code for all SLP instances in the loop/basic block.  */
 
 void
@@ -4045,32 +4009,4 @@ vect_schedule_slp (vec_info *vinfo)
                          "vectorizing stmts using SLP.\n");
     }
   delete bst_map;
-
-  FOR_EACH_VEC_ELT (slp_instances, i, instance)
-    {
-      slp_tree root = SLP_INSTANCE_TREE (instance);
-      stmt_vec_info store_info;
-      unsigned int j;
-
-      /* Remove scalar call stmts.  Do not do this for basic-block
-	 vectorization as not all uses may be vectorized.
-	 ???  Why should this be necessary?  DCE should be able to
-	 remove the stmts itself.
-	 ???  For BB vectorization we can as well remove scalar
-	 stmts starting from the SLP tree root if they have no
-	 uses.  */
-      if (is_a <loop_vec_info> (vinfo))
-	vect_remove_slp_scalar_calls (root);
-
-      for (j = 0; SLP_TREE_SCALAR_STMTS (root).iterate (j, &store_info)
-                  && j < SLP_INSTANCE_GROUP_SIZE (instance); j++)
-        {
-	  if (!STMT_VINFO_DATA_REF (store_info))
-	    break;
-
-	  store_info = vect_orig_stmt (store_info);
-	  /* Free the attached stmt_vec_info and remove the stmt.  */
-	  vinfo->remove_stmt (store_info);
-        }
-    }
 }
Index: gcc/tree-vect-loop.c
===================================================================
--- gcc/tree-vect-loop.c	2018-07-30 12:32:42.786390386 +0100
+++ gcc/tree-vect-loop.c	2018-07-30 12:32:46.654356310 +0100
@@ -6044,18 +6044,24 @@ vectorizable_reduction (stmt_vec_info st
 	}
 
       if (STMT_VINFO_REDUC_TYPE (stmt_info) == FOLD_LEFT_REDUCTION)
-	/* Leave the scalar phi in place.  Note that checking
-	   STMT_VINFO_VEC_REDUCTION_TYPE (as below) only works
-	   for reductions involving a single statement.  */
-	return true;
+	{
+	  /* Leave the scalar phi in place.  Note that checking
+	     STMT_VINFO_VEC_REDUCTION_TYPE (as below) only works
+	     for reductions involving a single statement.  */
+	  vect_mark_used_by_vector_code (stmt_info);
+	  return true;
+	}
 
       stmt_vec_info reduc_stmt_info = STMT_VINFO_REDUC_DEF (stmt_info);
       reduc_stmt_info = vect_stmt_to_vectorize (reduc_stmt_info);
 
       if (STMT_VINFO_VEC_REDUCTION_TYPE (reduc_stmt_info)
 	  == EXTRACT_LAST_REDUCTION)
-	/* Leave the scalar phi in place.  */
-	return true;
+	{
+	  /* Leave the scalar phi in place.  */
+	  vect_mark_used_by_vector_code (stmt_info);
+	  return true;
+	}
 
       gassign *reduc_stmt = as_a <gassign *> (reduc_stmt_info->stmt);
       for (unsigned k = 1; k < gimple_num_ops (reduc_stmt); ++k)
@@ -7748,6 +7754,7 @@ vectorizable_live_operation (stmt_vec_in
 	dump_printf_loc (MSG_NOTE, vect_location,
 			 "statement is simple and uses invariant.  Leaving in "
 			 "place.\n");
+      vect_mark_used_by_vector_code (stmt_info);
       return true;
     }
 
@@ -8120,13 +8127,12 @@ scale_profile_for_vect_loop (struct loop
     scale_bbs_frequencies (&loop->latch, 1, exit_l->probability / prob);
 }
 
-/* Vectorize STMT_INFO if relevant, inserting any new instructions before GSI.
-   When vectorizing STMT_INFO as a store, set *SEEN_STORE to its
-   stmt_vec_info.  */
+/* Vectorize STMT_INFO if relevant, inserting any new instructions
+   before GSI.  */
 
 static void
 vect_transform_loop_stmt (loop_vec_info loop_vinfo, stmt_vec_info stmt_info,
-			  gimple_stmt_iterator *gsi, stmt_vec_info *seen_store)
+			  gimple_stmt_iterator *gsi)
 {
   struct loop *loop = LOOP_VINFO_LOOP (loop_vinfo);
   poly_uint64 vf = LOOP_VINFO_VECT_FACTOR (loop_vinfo);
@@ -8141,6 +8147,21 @@ vect_transform_loop_stmt (loop_vec_info
   if (MAY_HAVE_DEBUG_BIND_STMTS && !STMT_VINFO_LIVE_P (stmt_info))
     vect_loop_kill_debug_uses (loop, stmt_info);
 
+  if (gcond *cond = dyn_cast <gcond *> (stmt_info->stmt))
+    {
+      if (nested_in_vect_loop_p (loop, stmt_info))
+	/* The gconds for inner loops aren't changed by vectorization.  */
+	vect_mark_used_by_vector_code (stmt_info);
+      else
+	{
+	  /* Replace the scalar gcond with a dummy condition for now,
+	     so that all the scalar inputs to it are dead.  We'll later
+	     replace the condition with the new vector-based one.  */
+	  gimple_cond_make_false (cond);
+	  update_stmt (cond);
+	}
+    }
+
   if (!STMT_VINFO_RELEVANT_P (stmt_info)
       && !STMT_VINFO_LIVE_P (stmt_info))
     return;
@@ -8165,8 +8186,7 @@ vect_transform_loop_stmt (loop_vec_info
   if (dump_enabled_p ())
     dump_printf_loc (MSG_NOTE, vect_location, "transform statement.\n");
 
-  if (vect_transform_stmt (stmt_info, gsi, NULL, NULL))
-    *seen_store = stmt_info;
+  vect_transform_stmt (stmt_info, gsi, NULL, NULL);
 }
 
 /* Function vect_transform_loop.
@@ -8351,7 +8371,6 @@ vect_transform_loop (loop_vec_info loop_
 	    loop_vinfo->remove_stmt (stmt_info);
 	  else
 	    {
-	      stmt_vec_info seen_store = NULL;
 	      gimple_stmt_iterator si = gsi_for_stmt (stmt_info->stmt);
 	      if (STMT_VINFO_IN_PATTERN_P (stmt_info))
 		{
@@ -8362,49 +8381,19 @@ vect_transform_loop (loop_vec_info loop_
 		      stmt_vec_info pat_stmt_info
 			= loop_vinfo->lookup_stmt (gsi_stmt (subsi));
 		      vect_transform_loop_stmt (loop_vinfo, pat_stmt_info,
-						&si, &seen_store);
+						&si);
 		    }
 		  stmt_vec_info pat_stmt_info
 		    = STMT_VINFO_RELATED_STMT (stmt_info);
-		  vect_transform_loop_stmt (loop_vinfo, pat_stmt_info, &si,
-					    &seen_store);
-		}
-	      vect_transform_loop_stmt (loop_vinfo, stmt_info, &si,
-					&seen_store);
-	      if (seen_store)
-		{
-		  if (STMT_VINFO_GROUPED_ACCESS (seen_store))
-		    /* Interleaving.  If IS_STORE is TRUE, the
-		       vectorization of the interleaving chain was
-		       completed - free all the stores in the chain.  */
-		    vect_remove_stores (DR_GROUP_FIRST_ELEMENT (seen_store));
-		  else
-		    /* Free the attached stmt_vec_info and remove the stmt.  */
-		    loop_vinfo->remove_stmt (stmt_info);
-		}
-	    }
-	}
-
-      /* Stub out scalar statements that must not survive vectorization.
-	 Doing this here helps with grouped statements, or statements that
-	 are involved in patterns.  */
-      for (gimple_stmt_iterator gsi = gsi_start_bb (vec_bb->bb ());
-	   !gsi_end_p (gsi); gsi_next (&gsi))
-	{
-	  gcall *call = dyn_cast <gcall *> (gsi_stmt (gsi));
-	  if (call && gimple_call_internal_p (call, IFN_MASK_LOAD))
-	    {
-	      tree lhs = gimple_get_lhs (call);
-	      if (!VECTOR_TYPE_P (TREE_TYPE (lhs)))
-		{
-		  tree zero = build_zero_cst (TREE_TYPE (lhs));
-		  gimple *new_stmt = gimple_build_assign (lhs, zero);
-		  gsi_replace (&gsi, new_stmt, true);
+		  vect_transform_loop_stmt (loop_vinfo, pat_stmt_info, &si);
 		}
+	      vect_transform_loop_stmt (loop_vinfo, stmt_info, &si);
 	    }
 	}
     }				/* BBs in loop */
 
+  vect_remove_dead_scalar_stmts (loop_vinfo);
+
   /* The vectorization factor is always > 1, so if we use an IV increment of 1.
      a zero NITERS becomes a nonzero NITERS_VECTOR.  */
   if (integer_onep (step_vector))

^ permalink raw reply	[flat|nested] 20+ messages in thread

* [09/11] Add a vec_basic_block structure
  2018-07-30 11:36 [00/11] Add a vec_basic_block of scalar statements Richard Sandiford
                   ` (7 preceding siblings ...)
  2018-07-30 11:43 ` [08/11] Make hoist_defs_of_uses use vec_info::lookup_def Richard Sandiford
@ 2018-07-30 11:46 ` Richard Sandiford
  2018-07-30 11:46 ` [10/11] Make the vectoriser do its own DCE Richard Sandiford
  2018-07-30 11:47 ` [11/11] Insert pattern statements into vec_basic_blocks Richard Sandiford
  10 siblings, 0 replies; 20+ messages in thread
From: Richard Sandiford @ 2018-07-30 11:46 UTC (permalink / raw)
  To: gcc-patches

This patch adds a vec_basic_block that records the scalar phis and
scalar statements that we need to vectorise.  This is a slight
simplification in its own right, since it avoids unnecessary statement
lookups and shaves >50 LOC.  But the main reason for doing it is
to allow the final patch in the series to treat pattern statements
less specially.

Putting phis (which are logically parallel) and normal statements
(which are logically serial) into a single list might seem dangerous,
but I think in practice it should be fine.  Very little vectoriser
code needs to handle the parallel nature of phis specially, and code
that does can still do so.  Having a single list simplifies code that
wants to look at every scalar phi or stmt in isolation.
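
For reference, the iteration pattern that the new list and macros allow
looks roughly like this (a sketch rather than a verbatim excerpt from
the patch; "examine" is just a placeholder for the per-statement work):

      unsigned int i;
      vec_basic_block *vec_bb;
      FOR_EACH_VEC_ELT (vinfo->blocks, i, vec_bb)
        FOR_EACH_VEC_BB_STMT (vec_bb, stmt_info)
          if (STMT_VINFO_VECTORIZABLE (stmt_info))
            examine (stmt_info);

This replaces the previous pattern of walking gsi_start_phis and
gsi_start_bb separately for each basic block and looking up the
stmt_vec_info for every statement.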


2018-07-30  Richard Sandiford  <richard.sandiford@arm.com>

gcc/
	* tree-vectorizer.h (vec_basic_block): New structure.
	(vec_info::blocks, _stmt_vec_info::block, _stmt_vec_info::prev)
	(_stmt_vec_info::next): New member variables.
	(FOR_EACH_VEC_BB_STMT, FOR_EACH_VEC_BB_STMT_REVERSE): New macros.
	(vec_basic_block::vec_basic_block): New function.
	* tree-vectorizer.c (vec_basic_block::add_to_end): Likewise.
	(vec_basic_block::add_before): Likewise.
	(vec_basic_block::remove): Likewise.
	(vec_info::~vec_info): Free the vec_basic_blocks.
	(vec_info::remove_stmt): Remove the statement from the containing
	vec_basic_block.
	* tree-vect-patterns.c (vect_determine_precisions)
	(vect_pattern_recog): Iterate over vec_basic_blocks.
	* tree-vect-loop.c (vect_determine_vectorization_factor)
	(vect_compute_single_scalar_iteration_cost, vect_update_vf_for_slp)
	(vect_analyze_loop_operations, vect_transform_loop): Likewise.
	(_loop_vec_info::_loop_vec_info): Construct vec_basic_blocks.
	* tree-vect-slp.c (_bb_vec_info::_bb_vec_info): Likewise.
	(vect_detect_hybrid_slp): Iterate over vec_basic_blocks.
	* tree-vect-stmts.c (vect_mark_stmts_to_be_vectorized): Likewise.
	(vect_finish_replace_stmt, vectorizable_condition): Remove the original
	statement from the containing block.
	(hoist_defs_of_uses): Likewise the statement that we're hoisting.

Index: gcc/tree-vectorizer.h
===================================================================
*** gcc/tree-vectorizer.h	2018-07-30 12:43:34.512651826 +0100
--- gcc/tree-vectorizer.h	2018-07-30 12:43:34.508651861 +0100
*************** #define SLP_TREE_LOAD_PERMUTATION(S)
*** 171,177 ****
--- 171,200 ----
  #define SLP_TREE_TWO_OPERATORS(S)		 (S)->two_operators
  #define SLP_TREE_DEF_TYPE(S)			 (S)->def_type
  
+ /* Information about the phis and statements in a block that we're trying
+    to vectorize, in their original order.  */
+ class vec_basic_block
+ {
+ public:
+   vec_basic_block (basic_block);
+ 
+   void add_to_end (stmt_vec_info);
+   void add_before (stmt_vec_info, stmt_vec_info);
+   void remove (stmt_vec_info);
+ 
+   basic_block bb () const { return m_bb; }
+   stmt_vec_info first () const { return m_first; }
+   stmt_vec_info last () const { return m_last; }
+ 
+ private:
+   /* The block itself.  */
+   basic_block m_bb;
  
+   /* The first and last statements in the block, forming a double-linked list.
+      The list includes both phis and true statements.  */
+   stmt_vec_info m_first;
+   stmt_vec_info m_last;
+ };
  
  /* Describes two objects whose addresses must be unequal for the vectorized
     loop to be valid.  */
*************** struct vec_info {
*** 249,254 ****
--- 272,280 ----
    /* Cost data used by the target cost model.  */
    void *target_cost_data;
  
+   /* The basic blocks in the vectorization region.  */
+   auto_vec<vec_basic_block *, 5> blocks;
+ 
  private:
    stmt_vec_info new_stmt_vec_info (gimple *stmt);
    void set_vinfo_for_stmt (gimple *, stmt_vec_info);
*************** struct dr_vec_info {
*** 776,781 ****
--- 802,812 ----
  typedef struct data_reference *dr_p;
  
  struct _stmt_vec_info {
+   /* The block to which the statement belongs, or null if none.  */
+   vec_basic_block *block;
+ 
+   /* Link chains for the previous and next statements in BLOCK.  */
+   stmt_vec_info prev, next;
  
    enum stmt_vec_info_type type;
  
*************** #define VECT_SCALAR_BOOLEAN_TYPE_P(TYPE)
*** 1072,1077 ****
--- 1103,1129 ----
         && TYPE_PRECISION (TYPE) == 1		\
         && TYPE_UNSIGNED (TYPE)))
  
+ /* Make STMT_INFO iterate over each statement in vec_basic_block VEC_BB
+    in forward order.  */
+ 
+ #define FOR_EACH_VEC_BB_STMT(VEC_BB, STMT_INFO) \
+   for (stmt_vec_info STMT_INFO = (VEC_BB)->first (); STMT_INFO; \
+        STMT_INFO = STMT_INFO->next)
+ 
+ /* Make STMT_INFO iterate over each statement in vec_basic_block VEC_BB
+    in backward order.  */
+ 
+ #define FOR_EACH_VEC_BB_STMT_REVERSE(VEC_BB, STMT_INFO) \
+   for (stmt_vec_info STMT_INFO = (VEC_BB)->last (); STMT_INFO; \
+        STMT_INFO = STMT_INFO->prev)
+ 
+ /* Construct a vec_basic_block for BB.  */
+ 
+ inline vec_basic_block::vec_basic_block (basic_block bb)
+   : m_bb (bb), m_first (NULL), m_last (NULL)
+ {
+ }
+ 
  static inline bool
  nested_in_vect_loop_p (struct loop *loop, stmt_vec_info stmt_info)
  {
Index: gcc/tree-vectorizer.c
===================================================================
*** gcc/tree-vectorizer.c	2018-07-30 12:43:34.512651826 +0100
--- gcc/tree-vectorizer.c	2018-07-30 12:43:34.508651861 +0100
*************** note_simd_array_uses (hash_table<simd_ar
*** 444,449 ****
--- 444,504 ----
    delete simd_array_to_simduid_htab;
  }
  \f
+ /* Add STMT_INFO to the end of the block.  */
+ 
+ void
+ vec_basic_block::add_to_end (stmt_vec_info stmt_info)
+ {
+   gcc_checking_assert (!stmt_info->block
+ 		       && !stmt_info->prev
+ 		       && !stmt_info->next);
+   if (m_last)
+     m_last->next = stmt_info;
+   else
+     m_first = stmt_info;
+   stmt_info->block = this;
+   stmt_info->prev = m_last;
+   m_last = stmt_info;
+ }
+ 
+ /* Add STMT_INFO to the block, inserting it before NEXT_STMT_INFO.  */
+ 
+ void
+ vec_basic_block::add_before (stmt_vec_info stmt_info,
+ 			     stmt_vec_info next_stmt_info)
+ {
+   gcc_checking_assert (!stmt_info->block
+ 		       && !stmt_info->prev
+ 		       && !stmt_info->next
+ 		       && next_stmt_info->block == this);
+   if (next_stmt_info->prev)
+     next_stmt_info->prev->next = stmt_info;
+   else
+     m_first = stmt_info;
+   stmt_info->block = this;
+   stmt_info->prev = next_stmt_info->prev;
+   stmt_info->next = next_stmt_info;
+   next_stmt_info->prev = stmt_info;
+ }
+ 
+ /* Remove STMT_INFO from the block.  */
+ 
+ void
+ vec_basic_block::remove (stmt_vec_info stmt_info)
+ {
+   gcc_checking_assert (stmt_info->block == this);
+   if (stmt_info->prev)
+     stmt_info->prev->next = stmt_info->next;
+   else
+     m_first = stmt_info->next;
+   if (stmt_info->next)
+     stmt_info->next->prev = stmt_info->prev;
+   else
+     m_last = stmt_info->prev;
+   stmt_info->block = NULL;
+   stmt_info->prev = stmt_info->next = NULL;
+ }
+ \f
  /* Initialize the vec_info with kind KIND_IN and target cost data
     TARGET_COST_DATA_IN.  */
  
*************** vec_info::vec_info (vec_info::vec_kind k
*** 459,466 ****
--- 514,525 ----
  vec_info::~vec_info ()
  {
    slp_instance instance;
+   vec_basic_block *vec_bb;
    unsigned int i;
  
+   FOR_EACH_VEC_ELT (blocks, i, vec_bb)
+     delete vec_bb;
+ 
    FOR_EACH_VEC_ELT (slp_instances, i, instance)
      vect_free_slp_instance (instance, true);
  
*************** vec_info::remove_stmt (stmt_vec_info stm
*** 596,601 ****
--- 655,661 ----
    unlink_stmt_vdef (stmt_info->stmt);
    gsi_remove (&si, true);
    release_defs (stmt_info->stmt);
+   stmt_info->block->remove (stmt_info);
    free_stmt_vec_info (stmt_info);
  }
  
Index: gcc/tree-vect-patterns.c
===================================================================
*** gcc/tree-vect-patterns.c	2018-07-30 12:43:34.512651826 +0100
--- gcc/tree-vect-patterns.c	2018-07-30 12:43:34.504651897 +0100
*************** vect_determine_precisions (vec_info *vin
*** 4631,4669 ****
  {
    DUMP_VECT_SCOPE ("vect_determine_precisions");
  
!   if (loop_vec_info loop_vinfo = dyn_cast <loop_vec_info> (vinfo))
!     {
!       struct loop *loop = LOOP_VINFO_LOOP (loop_vinfo);
!       basic_block *bbs = LOOP_VINFO_BBS (loop_vinfo);
!       unsigned int nbbs = loop->num_nodes;
! 
!       for (unsigned int i = 0; i < nbbs; i++)
! 	{
! 	  basic_block bb = bbs[nbbs - i - 1];
! 	  for (gimple_stmt_iterator si = gsi_last_bb (bb);
! 	       !gsi_end_p (si); gsi_prev (&si))
! 	    vect_determine_stmt_precisions
! 	      (vinfo->lookup_stmt (gsi_stmt (si)));
! 	}
!     }
!   else
!     {
!       bb_vec_info bb_vinfo = as_a <bb_vec_info> (vinfo);
!       gimple_stmt_iterator si = bb_vinfo->region_end;
!       gimple *stmt;
!       do
! 	{
! 	  if (!gsi_stmt (si))
! 	    si = gsi_last_bb (bb_vinfo->bb);
! 	  else
! 	    gsi_prev (&si);
! 	  stmt = gsi_stmt (si);
! 	  stmt_vec_info stmt_info = vinfo->lookup_stmt (stmt);
! 	  if (stmt_info && STMT_VINFO_VECTORIZABLE (stmt_info))
! 	    vect_determine_stmt_precisions (stmt_info);
! 	}
!       while (stmt != gsi_stmt (bb_vinfo->region_begin));
!     }
  }
  
  typedef gimple *(*vect_recog_func_ptr) (stmt_vec_info, tree *);
--- 4631,4641 ----
  {
    DUMP_VECT_SCOPE ("vect_determine_precisions");
  
!   unsigned int i;
!   vec_basic_block *vec_bb;
!   FOR_EACH_VEC_ELT_REVERSE (vinfo->blocks, i, vec_bb)
!     FOR_EACH_VEC_BB_STMT_REVERSE (vec_bb, stmt_info)
!       vect_determine_stmt_precisions (stmt_info);
  }
  
  typedef gimple *(*vect_recog_func_ptr) (stmt_vec_info, tree *);
*************** vect_pattern_recog_1 (vect_recog_func *r
*** 4923,4973 ****
  void
  vect_pattern_recog (vec_info *vinfo)
  {
-   struct loop *loop;
-   basic_block *bbs;
-   unsigned int nbbs;
-   gimple_stmt_iterator si;
-   unsigned int i, j;
- 
    vect_determine_precisions (vinfo);
  
    DUMP_VECT_SCOPE ("vect_pattern_recog");
  
!   if (loop_vec_info loop_vinfo = dyn_cast <loop_vec_info> (vinfo))
!     {
!       loop = LOOP_VINFO_LOOP (loop_vinfo);
!       bbs = LOOP_VINFO_BBS (loop_vinfo);
!       nbbs = loop->num_nodes;
! 
!       /* Scan through the loop stmts, applying the pattern recognition
! 	 functions starting at each stmt visited:  */
!       for (i = 0; i < nbbs; i++)
! 	{
! 	  basic_block bb = bbs[i];
! 	  for (si = gsi_start_bb (bb); !gsi_end_p (si); gsi_next (&si))
! 	    {
! 	      stmt_vec_info stmt_info = vinfo->lookup_stmt (gsi_stmt (si));
! 	      /* Scan over all generic vect_recog_xxx_pattern functions.  */
! 	      for (j = 0; j < NUM_PATTERNS; j++)
! 		vect_pattern_recog_1 (&vect_vect_recog_func_ptrs[j],
! 				      stmt_info);
! 	    }
! 	}
!     }
!   else
!     {
!       bb_vec_info bb_vinfo = as_a <bb_vec_info> (vinfo);
!       for (si = bb_vinfo->region_begin;
! 	   gsi_stmt (si) != gsi_stmt (bb_vinfo->region_end); gsi_next (&si))
! 	{
! 	  gimple *stmt = gsi_stmt (si);
! 	  stmt_vec_info stmt_info = bb_vinfo->lookup_stmt (stmt);
! 	  if (stmt_info && !STMT_VINFO_VECTORIZABLE (stmt_info))
! 	    continue;
! 
! 	  /* Scan over all generic vect_recog_xxx_pattern functions.  */
! 	  for (j = 0; j < NUM_PATTERNS; j++)
! 	    vect_pattern_recog_1 (&vect_vect_recog_func_ptrs[j], stmt_info);
! 	}
!     }
  }
--- 4895,4910 ----
  void
  vect_pattern_recog (vec_info *vinfo)
  {
    vect_determine_precisions (vinfo);
  
    DUMP_VECT_SCOPE ("vect_pattern_recog");
  
!   unsigned int i;
!   vec_basic_block *vec_bb;
!   FOR_EACH_VEC_ELT (vinfo->blocks, i, vec_bb)
!     FOR_EACH_VEC_BB_STMT (vec_bb, stmt_info)
!       if (STMT_VINFO_VECTORIZABLE (stmt_info))
! 	/* Scan over all generic vect_recog_xxx_pattern functions.  */
! 	for (unsigned int j = 0; j < NUM_PATTERNS; j++)
! 	  vect_pattern_recog_1 (&vect_vect_recog_func_ptrs[j], stmt_info);
  }
Index: gcc/tree-vect-loop.c
===================================================================
*** gcc/tree-vect-loop.c	2018-07-30 12:43:34.512651826 +0100
--- gcc/tree-vect-loop.c	2018-07-30 12:43:34.500651932 +0100
*************** vect_determine_vf_for_stmt (stmt_vec_inf
*** 286,321 ****
  static bool
  vect_determine_vectorization_factor (loop_vec_info loop_vinfo)
  {
-   struct loop *loop = LOOP_VINFO_LOOP (loop_vinfo);
-   basic_block *bbs = LOOP_VINFO_BBS (loop_vinfo);
-   unsigned nbbs = loop->num_nodes;
    poly_uint64 vectorization_factor = 1;
    tree scalar_type = NULL_TREE;
-   gphi *phi;
    tree vectype;
    stmt_vec_info stmt_info;
    unsigned i;
    auto_vec<stmt_vec_info> mask_producers;
  
    DUMP_VECT_SCOPE ("vect_determine_vectorization_factor");
  
!   for (i = 0; i < nbbs; i++)
!     {
!       basic_block bb = bbs[i];
! 
!       for (gphi_iterator si = gsi_start_phis (bb); !gsi_end_p (si);
! 	   gsi_next (&si))
  	{
- 	  phi = si.phi ();
- 	  stmt_info = loop_vinfo->lookup_stmt (phi);
  	  if (dump_enabled_p ())
  	    {
  	      dump_printf_loc (MSG_NOTE, vect_location, "==> examining phi: ");
  	      dump_gimple_stmt (MSG_NOTE, TDF_SLIM, phi, 0);
  	    }
  
- 	  gcc_assert (stmt_info);
- 
  	  if (STMT_VINFO_RELEVANT_P (stmt_info)
  	      || STMT_VINFO_LIVE_P (stmt_info))
              {
--- 286,311 ----
  static bool
  vect_determine_vectorization_factor (loop_vec_info loop_vinfo)
  {
    poly_uint64 vectorization_factor = 1;
    tree scalar_type = NULL_TREE;
    tree vectype;
    stmt_vec_info stmt_info;
    unsigned i;
    auto_vec<stmt_vec_info> mask_producers;
+   vec_basic_block *vec_bb;
  
    DUMP_VECT_SCOPE ("vect_determine_vectorization_factor");
  
!   FOR_EACH_VEC_ELT (loop_vinfo->blocks, i, vec_bb)
!     FOR_EACH_VEC_BB_STMT (vec_bb, stmt_info)
!       if (gphi *phi = dyn_cast <gphi *> (stmt_info->stmt))
  	{
  	  if (dump_enabled_p ())
  	    {
  	      dump_printf_loc (MSG_NOTE, vect_location, "==> examining phi: ");
  	      dump_gimple_stmt (MSG_NOTE, TDF_SLIM, phi, 0);
  	    }
  
  	  if (STMT_VINFO_RELEVANT_P (stmt_info)
  	      || STMT_VINFO_LIVE_P (stmt_info))
              {
*************** vect_determine_vectorization_factor (loo
*** 363,378 ****
  	      vect_update_max_nunits (&vectorization_factor, vectype);
  	    }
  	}
! 
!       for (gimple_stmt_iterator si = gsi_start_bb (bb); !gsi_end_p (si);
! 	   gsi_next (&si))
! 	{
! 	  stmt_info = loop_vinfo->lookup_stmt (gsi_stmt (si));
! 	  if (!vect_determine_vf_for_stmt (stmt_info, &vectorization_factor,
! 					   &mask_producers))
! 	    return false;
!         }
!     }
  
    /* TODO: Analyze cost. Decide if worth while to vectorize.  */
    if (dump_enabled_p ())
--- 353,361 ----
  	      vect_update_max_nunits (&vectorization_factor, vectype);
  	    }
  	}
!       else if (!vect_determine_vf_for_stmt (stmt_info, &vectorization_factor,
! 					    &mask_producers))
! 	return false;
  
    /* TODO: Analyze cost. Decide if worth while to vectorize.  */
    if (dump_enabled_p ())
*************** _loop_vec_info::_loop_vec_info (struct l
*** 846,866 ****
    for (unsigned int i = 0; i < nbbs; i++)
      {
        basic_block bb = bbs[i];
        gimple_stmt_iterator si;
  
        for (si = gsi_start_phis (bb); !gsi_end_p (si); gsi_next (&si))
  	{
  	  gimple *phi = gsi_stmt (si);
  	  gimple_set_uid (phi, 0);
! 	  add_stmt (phi);
  	}
  
        for (si = gsi_start_bb (bb); !gsi_end_p (si); gsi_next (&si))
  	{
  	  gimple *stmt = gsi_stmt (si);
  	  gimple_set_uid (stmt, 0);
! 	  add_stmt (stmt);
  	}
      }
  }
  
--- 829,851 ----
    for (unsigned int i = 0; i < nbbs; i++)
      {
        basic_block bb = bbs[i];
+       vec_basic_block *vec_bb = new vec_basic_block (bb);
        gimple_stmt_iterator si;
  
        for (si = gsi_start_phis (bb); !gsi_end_p (si); gsi_next (&si))
  	{
  	  gimple *phi = gsi_stmt (si);
  	  gimple_set_uid (phi, 0);
! 	  vec_bb->add_to_end (add_stmt (phi));
  	}
  
        for (si = gsi_start_bb (bb); !gsi_end_p (si); gsi_next (&si))
  	{
  	  gimple *stmt = gsi_stmt (si);
  	  gimple_set_uid (stmt, 0);
! 	  vec_bb->add_to_end (add_stmt (stmt));
  	}
+       blocks.safe_push (vec_bb);
      }
  }
  
*************** vect_verify_full_masking (loop_vec_info
*** 1066,1074 ****
  vect_compute_single_scalar_iteration_cost (loop_vec_info loop_vinfo)
  {
    struct loop *loop = LOOP_VINFO_LOOP (loop_vinfo);
!   basic_block *bbs = LOOP_VINFO_BBS (loop_vinfo);
!   int nbbs = loop->num_nodes, factor;
!   int innerloop_iters, i;
  
    /* Gather costs for statements in the scalar loop.  */
  
--- 1051,1058 ----
  vect_compute_single_scalar_iteration_cost (loop_vec_info loop_vinfo)
  {
    struct loop *loop = LOOP_VINFO_LOOP (loop_vinfo);
!   int factor, innerloop_iters;
!   unsigned int i;
  
    /* Gather costs for statements in the scalar loop.  */
  
*************** vect_compute_single_scalar_iteration_cos
*** 1077,1099 ****
    if (loop->inner)
      innerloop_iters = 50; /* FIXME */
  
!   for (i = 0; i < nbbs; i++)
      {
!       gimple_stmt_iterator si;
!       basic_block bb = bbs[i];
! 
!       if (bb->loop_father == loop->inner)
          factor = innerloop_iters;
        else
          factor = 1;
  
!       for (si = gsi_start_bb (bb); !gsi_end_p (si); gsi_next (&si))
!         {
! 	  gimple *stmt = gsi_stmt (si);
! 	  stmt_vec_info stmt_info = loop_vinfo->lookup_stmt (stmt);
! 
!           if (!is_gimple_assign (stmt) && !is_gimple_call (stmt))
!             continue;
  
            /* Skip stmts that are not vectorized inside the loop.  */
            if (stmt_info
--- 1061,1079 ----
    if (loop->inner)
      innerloop_iters = 50; /* FIXME */
  
!   vec_basic_block *vec_bb;
!   FOR_EACH_VEC_ELT (loop_vinfo->blocks, i, vec_bb)
      {
!       if (vec_bb->bb ()->loop_father == loop->inner)
          factor = innerloop_iters;
        else
          factor = 1;
  
!       FOR_EACH_VEC_BB_STMT (vec_bb, stmt_info)
! 	{
! 	  if (!is_gimple_assign (stmt_info->stmt)
! 	      && !is_gimple_call (stmt_info->stmt))
! 	    continue;
  
            /* Skip stmts that are not vectorized inside the loop.  */
            if (stmt_info
*************** vect_analyze_loop_form (struct loop *loo
*** 1397,1407 ****
  static void
  vect_update_vf_for_slp (loop_vec_info loop_vinfo)
  {
-   struct loop *loop = LOOP_VINFO_LOOP (loop_vinfo);
-   basic_block *bbs = LOOP_VINFO_BBS (loop_vinfo);
-   int nbbs = loop->num_nodes;
    poly_uint64 vectorization_factor;
!   int i;
  
    DUMP_VECT_SCOPE ("vect_update_vf_for_slp");
  
--- 1377,1384 ----
  static void
  vect_update_vf_for_slp (loop_vec_info loop_vinfo)
  {
    poly_uint64 vectorization_factor;
!   unsigned int i;
  
    DUMP_VECT_SCOPE ("vect_update_vf_for_slp");
  
*************** vect_update_vf_for_slp (loop_vec_info lo
*** 1414,1434 ****
       perform pure SLP on loop - cross iteration parallelism is not
       exploited.  */
    bool only_slp_in_loop = true;
!   for (i = 0; i < nbbs; i++)
!     {
!       basic_block bb = bbs[i];
!       for (gimple_stmt_iterator si = gsi_start_bb (bb); !gsi_end_p (si);
! 	   gsi_next (&si))
! 	{
! 	  stmt_vec_info stmt_info = loop_vinfo->lookup_stmt (gsi_stmt (si));
! 	  stmt_info = vect_stmt_to_vectorize (stmt_info);
! 	  if ((STMT_VINFO_RELEVANT_P (stmt_info)
! 	       || VECTORIZABLE_CYCLE_DEF (STMT_VINFO_DEF_TYPE (stmt_info)))
! 	      && !PURE_SLP_STMT (stmt_info))
! 	    /* STMT needs both SLP and loop-based vectorization.  */
! 	    only_slp_in_loop = false;
! 	}
!     }
  
    if (only_slp_in_loop)
      {
--- 1391,1407 ----
       perform pure SLP on loop - cross iteration parallelism is not
       exploited.  */
    bool only_slp_in_loop = true;
!   vec_basic_block *vec_bb;
!   FOR_EACH_VEC_ELT (loop_vinfo->blocks, i, vec_bb)
!     FOR_EACH_VEC_BB_STMT (vec_bb, stmt_info)
!       {
! 	stmt_vec_info final_info = vect_stmt_to_vectorize (stmt_info);
! 	if ((STMT_VINFO_RELEVANT_P (final_info)
! 	     || VECTORIZABLE_CYCLE_DEF (STMT_VINFO_DEF_TYPE (final_info)))
! 	    && !PURE_SLP_STMT (final_info))
! 	  /* STMT needs both SLP and loop-based vectorization.  */
! 	  only_slp_in_loop = false;
!       }
  
    if (only_slp_in_loop)
      {
*************** vect_active_double_reduction_p (stmt_vec
*** 1491,1501 ****
  static bool
  vect_analyze_loop_operations (loop_vec_info loop_vinfo)
  {
!   struct loop *loop = LOOP_VINFO_LOOP (loop_vinfo);
!   basic_block *bbs = LOOP_VINFO_BBS (loop_vinfo);
!   int nbbs = loop->num_nodes;
!   int i;
!   stmt_vec_info stmt_info;
    bool need_to_vectorize = false;
    bool ok;
  
--- 1464,1470 ----
  static bool
  vect_analyze_loop_operations (loop_vec_info loop_vinfo)
  {
!   unsigned int i;
    bool need_to_vectorize = false;
    bool ok;
  
*************** vect_analyze_loop_operations (loop_vec_i
*** 1504,1520 ****
    stmt_vector_for_cost cost_vec;
    cost_vec.create (2);
  
!   for (i = 0; i < nbbs; i++)
!     {
!       basic_block bb = bbs[i];
! 
!       for (gphi_iterator si = gsi_start_phis (bb); !gsi_end_p (si);
! 	   gsi_next (&si))
          {
-           gphi *phi = si.phi ();
            ok = true;
- 
- 	  stmt_info = loop_vinfo->lookup_stmt (phi);
            if (dump_enabled_p ())
              {
                dump_printf_loc (MSG_NOTE, vect_location, "examining phi: ");
--- 1473,1484 ----
    stmt_vector_for_cost cost_vec;
    cost_vec.create (2);
  
!   vec_basic_block *vec_bb;
!   FOR_EACH_VEC_ELT (loop_vinfo->blocks, i, vec_bb)
!     FOR_EACH_VEC_BB_STMT (vec_bb, stmt_info)
!       if (gphi *phi = dyn_cast <gphi *> (stmt_info->stmt))
          {
            ok = true;
            if (dump_enabled_p ())
              {
                dump_printf_loc (MSG_NOTE, vect_location, "examining phi: ");
*************** vect_analyze_loop_operations (loop_vec_i
*** 1525,1531 ****
  
            /* Inner-loop loop-closed exit phi in outer-loop vectorization
               (i.e., a phi in the tail of the outer-loop).  */
!           if (! is_loop_header_bb_p (bb))
              {
                /* FORNOW: we currently don't support the case that these phis
                   are not used in the outerloop (unless it is double reduction,
--- 1489,1495 ----
  
            /* Inner-loop loop-closed exit phi in outer-loop vectorization
               (i.e., a phi in the tail of the outer-loop).  */
!           if (! is_loop_header_bb_p (vec_bb->bb ()))
              {
                /* FORNOW: we currently don't support the case that these phis
                   are not used in the outerloop (unless it is double reduction,
*************** vect_analyze_loop_operations (loop_vec_i
*** 1564,1571 ****
                continue;
              }
  
-           gcc_assert (stmt_info);
- 
            if ((STMT_VINFO_RELEVANT (stmt_info) == vect_used_in_scope
                 || STMT_VINFO_LIVE_P (stmt_info))
                && STMT_VINFO_DEF_TYPE (stmt_info) != vect_induction_def)
--- 1528,1533 ----
*************** vect_analyze_loop_operations (loop_vec_i
*** 1610,1627 ****
  	      return false;
              }
          }
! 
!       for (gimple_stmt_iterator si = gsi_start_bb (bb); !gsi_end_p (si);
! 	   gsi_next (&si))
!         {
! 	  gimple *stmt = gsi_stmt (si);
! 	  if (!gimple_clobber_p (stmt)
! 	      && !vect_analyze_stmt (loop_vinfo->lookup_stmt (stmt),
! 				     &need_to_vectorize,
! 				     NULL, NULL, &cost_vec))
! 	    return false;
!         }
!     } /* bbs */
  
    add_stmt_costs (loop_vinfo->target_cost_data, &cost_vec);
    cost_vec.release ();
--- 1572,1581 ----
  	      return false;
              }
          }
!       else if (!gimple_clobber_p (stmt_info->stmt)
! 	       && !vect_analyze_stmt (stmt_info, &need_to_vectorize,
! 				      NULL, NULL, &cost_vec))
! 	return false;
  
    add_stmt_costs (loop_vinfo->target_cost_data, &cost_vec);
    cost_vec.release ();
*************** vect_analyze_loop_2 (loop_vec_info loop_
*** 2207,2238 ****
      vect_free_slp_instance (instance, false);
    LOOP_VINFO_SLP_INSTANCES (loop_vinfo).release ();
    /* Reset SLP type to loop_vect on all stmts.  */
!   for (i = 0; i < LOOP_VINFO_LOOP (loop_vinfo)->num_nodes; ++i)
!     {
!       basic_block bb = LOOP_VINFO_BBS (loop_vinfo)[i];
!       for (gimple_stmt_iterator si = gsi_start_phis (bb);
! 	   !gsi_end_p (si); gsi_next (&si))
! 	{
! 	  stmt_vec_info stmt_info = loop_vinfo->lookup_stmt (gsi_stmt (si));
! 	  STMT_SLP_TYPE (stmt_info) = loop_vect;
! 	}
!       for (gimple_stmt_iterator si = gsi_start_bb (bb);
! 	   !gsi_end_p (si); gsi_next (&si))
! 	{
! 	  stmt_vec_info stmt_info = loop_vinfo->lookup_stmt (gsi_stmt (si));
! 	  STMT_SLP_TYPE (stmt_info) = loop_vect;
! 	  if (STMT_VINFO_IN_PATTERN_P (stmt_info))
! 	    {
! 	      gimple *pattern_def_seq = STMT_VINFO_PATTERN_DEF_SEQ (stmt_info);
! 	      stmt_info = STMT_VINFO_RELATED_STMT (stmt_info);
! 	      STMT_SLP_TYPE (stmt_info) = loop_vect;
! 	      for (gimple_stmt_iterator pi = gsi_start (pattern_def_seq);
! 		   !gsi_end_p (pi); gsi_next (&pi))
! 		STMT_SLP_TYPE (loop_vinfo->lookup_stmt (gsi_stmt (pi)))
! 		  = loop_vect;
! 	    }
! 	}
!     }
    /* Free optimized alias test DDRS.  */
    LOOP_VINFO_LOWER_BOUNDS (loop_vinfo).truncate (0);
    LOOP_VINFO_COMP_ALIAS_DDRS (loop_vinfo).release ();
--- 2161,2182 ----
      vect_free_slp_instance (instance, false);
    LOOP_VINFO_SLP_INSTANCES (loop_vinfo).release ();
    /* Reset SLP type to loop_vect on all stmts.  */
!   vec_basic_block *vec_bb;
!   FOR_EACH_VEC_ELT (loop_vinfo->blocks, i, vec_bb)
!     FOR_EACH_VEC_BB_STMT (vec_bb, stmt_info)
!       {
! 	STMT_SLP_TYPE (stmt_info) = loop_vect;
! 	if (STMT_VINFO_IN_PATTERN_P (stmt_info))
! 	  {
! 	    gimple *pattern_def_seq = STMT_VINFO_PATTERN_DEF_SEQ (stmt_info);
! 	    STMT_SLP_TYPE (STMT_VINFO_RELATED_STMT (stmt_info)) = loop_vect;
! 	    for (gimple_stmt_iterator pi = gsi_start (pattern_def_seq);
! 		 !gsi_end_p (pi); gsi_next (&pi))
! 	      STMT_SLP_TYPE (loop_vinfo->lookup_stmt (gsi_stmt (pi)))
! 		= loop_vect;
! 	  }
!       }
! 
    /* Free optimized alias test DDRS.  */
    LOOP_VINFO_LOWER_BOUNDS (loop_vinfo).truncate (0);
    LOOP_VINFO_COMP_ALIAS_DDRS (loop_vinfo).release ();
*************** vect_transform_loop (loop_vec_info loop_
*** 8237,8251 ****
  {
    struct loop *loop = LOOP_VINFO_LOOP (loop_vinfo);
    struct loop *epilogue = NULL;
!   basic_block *bbs = LOOP_VINFO_BBS (loop_vinfo);
!   int nbbs = loop->num_nodes;
!   int i;
    tree niters_vector = NULL_TREE;
    tree step_vector = NULL_TREE;
    tree niters_vector_mult_vf = NULL_TREE;
    poly_uint64 vf = LOOP_VINFO_VECT_FACTOR (loop_vinfo);
    unsigned int lowest_vf = constant_lower_bound (vf);
-   gimple *stmt;
    bool check_profitability = false;
    unsigned int th;
  
--- 8181,8192 ----
  {
    struct loop *loop = LOOP_VINFO_LOOP (loop_vinfo);
    struct loop *epilogue = NULL;
!   unsigned int i;
    tree niters_vector = NULL_TREE;
    tree step_vector = NULL_TREE;
    tree niters_vector_mult_vf = NULL_TREE;
    poly_uint64 vf = LOOP_VINFO_VECT_FACTOR (loop_vinfo);
    unsigned int lowest_vf = constant_lower_bound (vf);
    bool check_profitability = false;
    unsigned int th;
  
*************** vect_transform_loop (loop_vec_info loop_
*** 8363,8452 ****
       support more involved loop forms, the order by which the BBs are
       traversed need to be reconsidered.  */
  
!   for (i = 0; i < nbbs; i++)
      {
!       basic_block bb = bbs[i];
!       stmt_vec_info stmt_info;
! 
!       for (gphi_iterator si = gsi_start_phis (bb); !gsi_end_p (si);
! 	   gsi_next (&si))
!         {
! 	  gphi *phi = si.phi ();
! 	  if (dump_enabled_p ())
  	    {
! 	      dump_printf_loc (MSG_NOTE, vect_location,
!                                "------>vectorizing phi: ");
! 	      dump_gimple_stmt (MSG_NOTE, TDF_SLIM, phi, 0);
! 	    }
! 	  stmt_info = loop_vinfo->lookup_stmt (phi);
! 	  if (!stmt_info)
! 	    continue;
  
! 	  if (MAY_HAVE_DEBUG_BIND_STMTS && !STMT_VINFO_LIVE_P (stmt_info))
! 	    vect_loop_kill_debug_uses (loop, stmt_info);
  
! 	  if (!STMT_VINFO_RELEVANT_P (stmt_info)
! 	      && !STMT_VINFO_LIVE_P (stmt_info))
! 	    continue;
! 
! 	  if (STMT_VINFO_VECTYPE (stmt_info)
! 	      && (maybe_ne
! 		  (TYPE_VECTOR_SUBPARTS (STMT_VINFO_VECTYPE (stmt_info)), vf))
! 	      && dump_enabled_p ())
! 	    dump_printf_loc (MSG_NOTE, vect_location, "multiple-types.\n");
! 
! 	  if ((STMT_VINFO_DEF_TYPE (stmt_info) == vect_induction_def
! 	       || STMT_VINFO_DEF_TYPE (stmt_info) == vect_reduction_def
! 	       || STMT_VINFO_DEF_TYPE (stmt_info) == vect_nested_cycle)
! 	      && ! PURE_SLP_STMT (stmt_info))
! 	    {
! 	      if (dump_enabled_p ())
! 		dump_printf_loc (MSG_NOTE, vect_location, "transform phi.\n");
! 	      vect_transform_stmt (stmt_info, NULL, NULL, NULL);
  	    }
- 	}
- 
-       for (gimple_stmt_iterator si = gsi_start_bb (bb);
- 	   !gsi_end_p (si);)
- 	{
- 	  stmt = gsi_stmt (si);
  	  /* During vectorization remove existing clobber stmts.  */
! 	  if (gimple_clobber_p (stmt))
! 	    {
! 	      unlink_stmt_vdef (stmt);
! 	      gsi_remove (&si, true);
! 	      release_defs (stmt);
! 	    }
  	  else
  	    {
- 	      stmt_info = loop_vinfo->lookup_stmt (stmt);
- 
- 	      /* vector stmts created in the outer-loop during vectorization of
- 		 stmts in an inner-loop may not have a stmt_info, and do not
- 		 need to be vectorized.  */
  	      stmt_vec_info seen_store = NULL;
! 	      if (stmt_info)
  		{
! 		  if (STMT_VINFO_IN_PATTERN_P (stmt_info))
  		    {
- 		      gimple *def_seq = STMT_VINFO_PATTERN_DEF_SEQ (stmt_info);
- 		      for (gimple_stmt_iterator subsi = gsi_start (def_seq);
- 			   !gsi_end_p (subsi); gsi_next (&subsi))
- 			{
- 			  stmt_vec_info pat_stmt_info
- 			    = loop_vinfo->lookup_stmt (gsi_stmt (subsi));
- 			  vect_transform_loop_stmt (loop_vinfo, pat_stmt_info,
- 						    &si, &seen_store);
- 			}
  		      stmt_vec_info pat_stmt_info
! 			= STMT_VINFO_RELATED_STMT (stmt_info);
! 		      vect_transform_loop_stmt (loop_vinfo, pat_stmt_info, &si,
! 						&seen_store);
  		    }
! 		  vect_transform_loop_stmt (loop_vinfo, stmt_info, &si,
  					    &seen_store);
  		}
! 	      gsi_next (&si);
  	      if (seen_store)
  		{
  		  if (STMT_VINFO_GROUPED_ACCESS (seen_store))
--- 8304,8376 ----
       support more involved loop forms, the order by which the BBs are
       traversed need to be reconsidered.  */
  
!   vec_basic_block *vec_bb;
!   FOR_EACH_VEC_ELT (loop_vinfo->blocks, i, vec_bb)
      {
!       stmt_vec_info next_stmt_info;
!       for (stmt_vec_info stmt_info = vec_bb->first (); stmt_info;
! 	   stmt_info = next_stmt_info)
! 	{
! 	  next_stmt_info = stmt_info->next;
! 	  if (gphi *phi = dyn_cast <gphi *> (stmt_info->stmt))
  	    {
! 	      if (dump_enabled_p ())
! 		{
! 		  dump_printf_loc (MSG_NOTE, vect_location,
! 				   "------>vectorizing phi: ");
! 		  dump_gimple_stmt (MSG_NOTE, TDF_SLIM, phi, 0);
! 		}
  
! 	      if (MAY_HAVE_DEBUG_BIND_STMTS && !STMT_VINFO_LIVE_P (stmt_info))
! 		vect_loop_kill_debug_uses (loop, stmt_info);
  
! 	      if (!STMT_VINFO_RELEVANT_P (stmt_info)
! 		  && !STMT_VINFO_LIVE_P (stmt_info))
! 		continue;
! 
! 	      if (STMT_VINFO_VECTYPE (stmt_info)
! 		  && (maybe_ne
! 		      (TYPE_VECTOR_SUBPARTS (STMT_VINFO_VECTYPE (stmt_info)),
! 		       vf))
! 		  && dump_enabled_p ())
! 		dump_printf_loc (MSG_NOTE, vect_location, "multiple-types.\n");
! 
! 	      if ((STMT_VINFO_DEF_TYPE (stmt_info) == vect_induction_def
! 		   || STMT_VINFO_DEF_TYPE (stmt_info) == vect_reduction_def
! 		   || STMT_VINFO_DEF_TYPE (stmt_info) == vect_nested_cycle)
! 		  && ! PURE_SLP_STMT (stmt_info))
! 		{
! 		  if (dump_enabled_p ())
! 		    dump_printf_loc (MSG_NOTE, vect_location,
! 				     "transform phi.\n");
! 		  vect_transform_stmt (stmt_info, NULL, NULL, NULL);
! 		}
  	    }
  	  /* During vectorization remove existing clobber stmts.  */
! 	  else if (gimple_clobber_p (stmt_info->stmt))
! 	    loop_vinfo->remove_stmt (stmt_info);
  	  else
  	    {
  	      stmt_vec_info seen_store = NULL;
! 	      gimple_stmt_iterator si = gsi_for_stmt (stmt_info->stmt);
! 	      if (STMT_VINFO_IN_PATTERN_P (stmt_info))
  		{
! 		  gimple *def_seq = STMT_VINFO_PATTERN_DEF_SEQ (stmt_info);
! 		  for (gimple_stmt_iterator subsi = gsi_start (def_seq);
! 		       !gsi_end_p (subsi); gsi_next (&subsi))
  		    {
  		      stmt_vec_info pat_stmt_info
! 			= loop_vinfo->lookup_stmt (gsi_stmt (subsi));
! 		      vect_transform_loop_stmt (loop_vinfo, pat_stmt_info,
! 						&si, &seen_store);
  		    }
! 		  stmt_vec_info pat_stmt_info
! 		    = STMT_VINFO_RELATED_STMT (stmt_info);
! 		  vect_transform_loop_stmt (loop_vinfo, pat_stmt_info, &si,
  					    &seen_store);
  		}
! 	      vect_transform_loop_stmt (loop_vinfo, stmt_info, &si,
! 					&seen_store);
  	      if (seen_store)
  		{
  		  if (STMT_VINFO_GROUPED_ACCESS (seen_store))
*************** vect_transform_loop (loop_vec_info loop_
*** 8464,8470 ****
        /* Stub out scalar statements that must not survive vectorization.
  	 Doing this here helps with grouped statements, or statements that
  	 are involved in patterns.  */
!       for (gimple_stmt_iterator gsi = gsi_start_bb (bb);
  	   !gsi_end_p (gsi); gsi_next (&gsi))
  	{
  	  gcall *call = dyn_cast <gcall *> (gsi_stmt (gsi));
--- 8388,8394 ----
        /* Stub out scalar statements that must not survive vectorization.
  	 Doing this here helps with grouped statements, or statements that
  	 are involved in patterns.  */
!       for (gimple_stmt_iterator gsi = gsi_start_bb (vec_bb->bb ());
  	   !gsi_end_p (gsi); gsi_next (&gsi))
  	{
  	  gcall *call = dyn_cast <gcall *> (gsi_stmt (gsi));
Index: gcc/tree-vect-slp.c
===================================================================
*** gcc/tree-vect-slp.c	2018-07-30 12:43:34.512651826 +0100
--- gcc/tree-vect-slp.c	2018-07-30 12:43:34.504651897 +0100
*************** vect_detect_hybrid_slp (loop_vec_info lo
*** 2408,2436 ****
  
    /* First walk all pattern stmt in the loop and mark defs of uses as
       hybrid because immediate uses in them are not recorded.  */
!   for (i = 0; i < LOOP_VINFO_LOOP (loop_vinfo)->num_nodes; ++i)
!     {
!       basic_block bb = LOOP_VINFO_BBS (loop_vinfo)[i];
!       for (gimple_stmt_iterator gsi = gsi_start_bb (bb); !gsi_end_p (gsi);
! 	   gsi_next (&gsi))
  	{
! 	  gimple *stmt = gsi_stmt (gsi);
! 	  stmt_vec_info stmt_info = loop_vinfo->lookup_stmt (stmt);
! 	  if (STMT_VINFO_IN_PATTERN_P (stmt_info))
! 	    {
! 	      walk_stmt_info wi;
! 	      memset (&wi, 0, sizeof (wi));
! 	      wi.info = loop_vinfo;
! 	      gimple_stmt_iterator gsi2
! 		= gsi_for_stmt (STMT_VINFO_RELATED_STMT (stmt_info)->stmt);
! 	      walk_gimple_stmt (&gsi2, vect_detect_hybrid_slp_2,
! 				vect_detect_hybrid_slp_1, &wi);
! 	      walk_gimple_seq (STMT_VINFO_PATTERN_DEF_SEQ (stmt_info),
! 			       vect_detect_hybrid_slp_2,
! 			       vect_detect_hybrid_slp_1, &wi);
! 	    }
  	}
-     }
  
    /* Then walk the SLP instance trees marking stmts with uses in
       non-SLP stmts as hybrid, also propagating hybrid down the
--- 2408,2429 ----
  
    /* First walk all pattern stmt in the loop and mark defs of uses as
       hybrid because immediate uses in them are not recorded.  */
!   vec_basic_block *vec_bb;
!   FOR_EACH_VEC_ELT (loop_vinfo->blocks, i, vec_bb)
!     FOR_EACH_VEC_BB_STMT (vec_bb, stmt_info)
!       if (STMT_VINFO_IN_PATTERN_P (stmt_info))
  	{
! 	  walk_stmt_info wi;
! 	  memset (&wi, 0, sizeof (wi));
! 	  wi.info = loop_vinfo;
! 	  gimple_stmt_iterator gsi2
! 	    = gsi_for_stmt (STMT_VINFO_RELATED_STMT (stmt_info)->stmt);
! 	  walk_gimple_stmt (&gsi2, vect_detect_hybrid_slp_2,
! 			    vect_detect_hybrid_slp_1, &wi);
! 	  walk_gimple_seq (STMT_VINFO_PATTERN_DEF_SEQ (stmt_info),
! 			   vect_detect_hybrid_slp_2,
! 			   vect_detect_hybrid_slp_1, &wi);
  	}
  
    /* Then walk the SLP instance trees marking stmts with uses in
       non-SLP stmts as hybrid, also propagating hybrid down the
*************** _bb_vec_info::_bb_vec_info (gimple_stmt_
*** 2457,2469 ****
  {
    gimple_stmt_iterator gsi;
  
    for (gsi = region_begin; gsi_stmt (gsi) != gsi_stmt (region_end);
         gsi_next (&gsi))
      {
        gimple *stmt = gsi_stmt (gsi);
        gimple_set_uid (stmt, 0);
!       add_stmt (stmt);
      }
  
    bb->aux = this;
  }
--- 2450,2464 ----
  {
    gimple_stmt_iterator gsi;
  
+   vec_basic_block *vec_bb = new vec_basic_block (bb);
    for (gsi = region_begin; gsi_stmt (gsi) != gsi_stmt (region_end);
         gsi_next (&gsi))
      {
        gimple *stmt = gsi_stmt (gsi);
        gimple_set_uid (stmt, 0);
!       vec_bb->add_to_end (add_stmt (stmt));
      }
+   blocks.quick_push (vec_bb);
  
    bb->aux = this;
  }
Index: gcc/tree-vect-stmts.c
===================================================================
*** gcc/tree-vect-stmts.c	2018-07-30 12:43:34.512651826 +0100
--- gcc/tree-vect-stmts.c	2018-07-30 12:43:34.504651897 +0100
*************** process_use (stmt_vec_info stmt_vinfo, t
*** 612,623 ****
  bool
  vect_mark_stmts_to_be_vectorized (loop_vec_info loop_vinfo)
  {
-   struct loop *loop = LOOP_VINFO_LOOP (loop_vinfo);
-   basic_block *bbs = LOOP_VINFO_BBS (loop_vinfo);
-   unsigned int nbbs = loop->num_nodes;
-   gimple_stmt_iterator si;
    unsigned int i;
-   basic_block bb;
    bool live_p;
    enum vect_relevant relevant;
  
--- 612,618 ----
*************** vect_mark_stmts_to_be_vectorized (loop_v
*** 626,659 ****
    auto_vec<stmt_vec_info, 64> worklist;
  
    /* 1. Init worklist.  */
!   for (i = 0; i < nbbs; i++)
!     {
!       bb = bbs[i];
!       for (si = gsi_start_phis (bb); !gsi_end_p (si); gsi_next (&si))
! 	{
! 	  stmt_vec_info phi_info = loop_vinfo->lookup_stmt (gsi_stmt (si));
! 	  if (dump_enabled_p ())
! 	    {
! 	      dump_printf_loc (MSG_NOTE, vect_location, "init: phi relevant? ");
! 	      dump_gimple_stmt (MSG_NOTE, TDF_SLIM, phi_info->stmt, 0);
! 	    }
! 
! 	  if (vect_stmt_relevant_p (phi_info, loop_vinfo, &relevant, &live_p))
! 	    vect_mark_relevant (&worklist, phi_info, relevant, live_p);
! 	}
!       for (si = gsi_start_bb (bb); !gsi_end_p (si); gsi_next (&si))
! 	{
! 	  stmt_vec_info stmt_info = loop_vinfo->lookup_stmt (gsi_stmt (si));
! 	  if (dump_enabled_p ())
! 	    {
! 	      dump_printf_loc (MSG_NOTE, vect_location, "init: stmt relevant? ");
! 	      dump_gimple_stmt (MSG_NOTE, TDF_SLIM, stmt_info->stmt, 0);
! 	    }
! 
! 	  if (vect_stmt_relevant_p (stmt_info, loop_vinfo, &relevant, &live_p))
! 	    vect_mark_relevant (&worklist, stmt_info, relevant, live_p);
! 	}
!     }
  
    /* 2. Process_worklist */
    while (worklist.length () > 0)
--- 621,631 ----
    auto_vec<stmt_vec_info, 64> worklist;
  
    /* 1. Init worklist.  */
!   vec_basic_block *vec_bb;
!   FOR_EACH_VEC_ELT (loop_vinfo->blocks, i, vec_bb)
!     FOR_EACH_VEC_BB_STMT (vec_bb, stmt_info)
!       if (vect_stmt_relevant_p (stmt_info, loop_vinfo, &relevant, &live_p))
! 	vect_mark_relevant (&worklist, stmt_info, relevant, live_p);
  
    /* 2. Process_worklist */
    while (worklist.length () > 0)
*************** vect_finish_replace_stmt (stmt_vec_info
*** 1753,1758 ****
--- 1725,1731 ----
  
    gimple_stmt_iterator gsi = gsi_for_stmt (stmt_info->stmt);
    gsi_replace (&gsi, vec_stmt, false);
+   stmt_info->block->remove (stmt_info);
  
    return vect_finish_stmt_generation_1 (stmt_info, vec_stmt);
  }
*************** hoist_defs_of_uses (stmt_vec_info stmt_i
*** 7352,7357 ****
--- 7325,7331 ----
        {
  	gimple_stmt_iterator gsi = gsi_for_stmt (def_stmt_info->stmt);
  	gsi_remove (&gsi, false);
+ 	def_stmt_info->block->remove (def_stmt_info);
  	gsi_insert_on_edge_immediate (loop_preheader_edge (loop),
  				      def_stmt_info->stmt);
        }
*************** vectorizable_condition (stmt_vec_info st
*** 9066,9071 ****
--- 9040,9046 ----
  		  gimple_stmt_iterator old_gsi
  		    = gsi_for_stmt (stmt_info->stmt);
  		  gsi_remove (&old_gsi, true);
+ 		  stmt_info->block->remove (stmt_info);
  		  new_stmt_info
  		    = vect_finish_stmt_generation (stmt_info, new_stmt, gsi);
  		}


* [11/11] Insert pattern statements into vec_basic_blocks
  2018-07-30 11:36 [00/11] Add a vec_basic_block of scalar statements Richard Sandiford
                   ` (9 preceding siblings ...)
  2018-07-30 11:46 ` [10/11] Make the vectoriser do its own DCE Richard Sandiford
@ 2018-07-30 11:47 ` Richard Sandiford
  10 siblings, 0 replies; 20+ messages in thread
From: Richard Sandiford @ 2018-07-30 11:47 UTC (permalink / raw)
  To: gcc-patches

The point of this patch is to put pattern statements in the same
vec_basic_block as the statements they replace, with the pattern
statements for S coming between S and S's original predecessor.
This removes the need to handle them specially in various places.
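
As an illustrative sketch of the layout this gives (not part of the
patch itself, and using made-up statement names): if a block originally
contains scalar statements S1, S2 and S3, and S2 is later replaced by
pattern statements P1 and P2, with P2 acting as the main pattern
statement, the vec_basic_block chain becomes:

      S1 -> P1 -> P2 -> S2 -> S3

so that vect_orig_stmt (P1) == vect_orig_stmt (P2) == S2 and
STMT_VINFO_RELATED_STMT (S2) == P2.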


2018-07-30  Richard Sandiford  <richard.sandiford@arm.com>

gcc/
	* tree-vectorizer.h (vec_basic_block): Expand comment.
	(_stmt_vec_info::pattern_def_seq): Delete.
	(STMT_VINFO_PATTERN_DEF_SEQ): Likewise.
	(is_main_pattern_stmt_p): New function.
	* tree-vect-loop.c (vect_determine_vf_for_stmt_1): Rename to...
	(vect_determine_vf_for_stmt): ...this, deleting the original
	function with this name.  Remove vectype_maybe_set_p argument
	and test is_pattern_stmt_p instead.  Retain the "examining..."
	message from the previous vect_determine_vf_for_stmt.
	(vect_compute_single_scalar_iteration_cost, vect_update_vf_for_slp)
	(vect_analyze_loop_2): Don't treat pattern statements specially.
	(vect_transform_loop): Likewise.  Use vect_orig_stmt to find the
	insertion point.
	* tree-vect-slp.c (vect_detect_hybrid_slp): Expect pattern statements
	to be in the statement list, without needing to follow
	STMT_VINFO_RELATED_STMT.  Remove PATTERN_DEF_SEQ handling.
	* tree-vect-stmts.c (vect_analyze_stmt): Don't handle pattern
	statements specially.
	(vect_remove_dead_scalar_stmts): Ignore pattern statements.
	* tree-vect-patterns.c (vect_set_pattern_stmt): Insert the pattern
	statement into the vec_basic_block immediately before the statement
	it replaces.
	(append_pattern_def_seq): Likewise.  If the original statement is
	itself a pattern statement, associate the new one with the original
	statement.
	(vect_split_statement): Use append_pattern_def_seq to insert the
	first pattern statement.
	(vect_recog_vector_vector_shift_pattern): Remove mention of
	STMT_VINFO_PATTERN_DEF_SEQ.
	(adjust_bool_stmts): Get the last pattern statement from the
	stmt_vec_info chain.
	(vect_mark_pattern_stmts): Rename to...
	(vect_replace_stmt_with_pattern): ...this.  Remove the
	PATTERN_DEF_SEQ handling and process only the pattern statement given.
	Use append_pattern_def_seq when replacing a pattern statement with
	another pattern statement, and use vec_basic_block::remove instead
	of gsi_remove to remove the old one.
	(vect_pattern_recog_1): Update accordingly.  Remove PATTERN_DEF_SEQ
	handling.  On failure, remove any half-formed pattern sequence from
	the vec_basic_block.  Install the vector type in pattern statements
	that don't yet have one.
	(vect_pattern_recog): Iterate over statements that are added
	by previous recognizers, but skipping those that have already
	been replaced, or the main pattern statement in such a replacement.

Index: gcc/tree-vectorizer.h
===================================================================
*** gcc/tree-vectorizer.h	2018-07-30 12:32:46.658356275 +0100
--- gcc/tree-vectorizer.h	2018-07-30 12:32:49.898327734 +0100
*************** #define SLP_TREE_TWO_OPERATORS(S)		 (S)-
*** 172,178 ****
  #define SLP_TREE_DEF_TYPE(S)			 (S)->def_type
  
  /* Information about the phis and statements in a block that we're trying
!    to vectorize, in their original order.  */
  class vec_basic_block
  {
  public:
--- 172,184 ----
  #define SLP_TREE_DEF_TYPE(S)			 (S)->def_type
  
  /* Information about the phis and statements in a block that we're trying
!    to vectorize.  This includes the phis and statements that were in the
!    original scalar code, in their original order.  It also includes any
!    pattern statements that the vectorizer has created to replace some
!    of the scalar ones.  Such pattern statements come immediately before
!    the statement that they replace; that is, all pattern statements P for
!    which vect_orig_stmt (P) == S form a sequence that comes immediately
!    before S.  */
  class vec_basic_block
  {
  public:
*************** struct _stmt_vec_info {
*** 870,880 ****
          pattern).  */
    stmt_vec_info related_stmt;
  
-   /* Used to keep a sequence of def stmts of a pattern stmt if such exists.
-      The sequence is attached to the original statement rather than the
-      pattern statement.  */
-   gimple_seq pattern_def_seq;
- 
    /* List of datarefs that are known to have the same alignment as the dataref
       of this stmt.  */
    vec<dr_p> same_align_refs;
--- 876,881 ----
*************** #define STMT_VINFO_DR_INFO(S) \
*** 1048,1054 ****
  
  #define STMT_VINFO_IN_PATTERN_P(S)         (S)->in_pattern_p
  #define STMT_VINFO_RELATED_STMT(S)         (S)->related_stmt
- #define STMT_VINFO_PATTERN_DEF_SEQ(S)      (S)->pattern_def_seq
  #define STMT_VINFO_SAME_ALIGN_REFS(S)      (S)->same_align_refs
  #define STMT_VINFO_SIMD_CLONE_INFO(S)	   (S)->simd_clone_info
  #define STMT_VINFO_DEF_TYPE(S)             (S)->def_type
--- 1049,1054 ----
*************** is_pattern_stmt_p (stmt_vec_info stmt_in
*** 1176,1181 ****
--- 1176,1192 ----
    return stmt_info->pattern_stmt_p;
  }
  
+ /* Return TRUE if a statement represented by STMT_INFO is the final
+    statement in a pattern.  */
+ 
+ static inline bool
+ is_main_pattern_stmt_p (stmt_vec_info stmt_info)
+ {
+   stmt_vec_info orig_stmt_info = STMT_VINFO_RELATED_STMT (stmt_info);
+   return (is_pattern_stmt_p (stmt_info)
+ 	  && STMT_VINFO_RELATED_STMT (orig_stmt_info) == stmt_info);
+ }
+ 
  /* If STMT_INFO is a pattern statement, return the statement that it
     replaces, otherwise return STMT_INFO itself.  */
  
Index: gcc/tree-vect-loop.c
===================================================================
*** gcc/tree-vect-loop.c	2018-07-30 12:32:46.654356310 +0100
--- gcc/tree-vect-loop.c	2018-07-30 12:32:49.894327770 +0100
*************** Software Foundation; either version 3, o
*** 155,172 ****
  
  static void vect_estimate_min_profitable_iters (loop_vec_info, int *, int *);
  
! /* Subroutine of vect_determine_vf_for_stmt that handles only one
!    statement.  VECTYPE_MAYBE_SET_P is true if STMT_VINFO_VECTYPE
!    may already be set for general statements (not just data refs).  */
  
  static bool
! vect_determine_vf_for_stmt_1 (stmt_vec_info stmt_info,
! 			      bool vectype_maybe_set_p,
! 			      poly_uint64 *vf,
! 			      vec<stmt_vec_info > *mask_producers)
  {
    gimple *stmt = stmt_info->stmt;
  
    if ((!STMT_VINFO_RELEVANT_P (stmt_info)
         && !STMT_VINFO_LIVE_P (stmt_info))
        || gimple_clobber_p (stmt))
--- 155,178 ----
  
  static void vect_estimate_min_profitable_iters (loop_vec_info, int *, int *);
  
! /* Subroutine of vect_determine_vectorization_factor.  Set the vector
!    type of STMT_INFO and update the vectorization factor VF accordingly.
!    If the statement produces a mask result whose vector type can only be
!    calculated later, add it to MASK_PRODUCERS.  Return true on success
!    or false if something prevented vectorization.  */
  
  static bool
! vect_determine_vf_for_stmt (stmt_vec_info stmt_info, poly_uint64 *vf,
! 			    vec<stmt_vec_info > *mask_producers)
  {
    gimple *stmt = stmt_info->stmt;
  
+   if (dump_enabled_p ())
+     {
+       dump_printf_loc (MSG_NOTE, vect_location, "==> examining statement: ");
+       dump_gimple_stmt (MSG_NOTE, TDF_SLIM, stmt_info->stmt, 0);
+     }
+ 
    if ((!STMT_VINFO_RELEVANT_P (stmt_info)
         && !STMT_VINFO_LIVE_P (stmt_info))
        || gimple_clobber_p (stmt))
*************** vect_determine_vf_for_stmt_1 (stmt_vec_i
*** 188,194 ****
  	   that contain a data ref, or for "pattern-stmts" (stmts generated
  	   by the vectorizer to represent/replace a certain idiom).  */
  	gcc_assert ((STMT_VINFO_DATA_REF (stmt_info)
! 		     || vectype_maybe_set_p)
  		    && STMT_VINFO_VECTYPE (stmt_info) == stmt_vectype);
        else if (stmt_vectype == boolean_type_node)
  	mask_producers->safe_push (stmt_info);
--- 194,200 ----
  	   that contain a data ref, or for "pattern-stmts" (stmts generated
  	   by the vectorizer to represent/replace a certain idiom).  */
  	gcc_assert ((STMT_VINFO_DATA_REF (stmt_info)
! 		     || is_pattern_stmt_p (stmt_info))
  		    && STMT_VINFO_VECTYPE (stmt_info) == stmt_vectype);
        else if (stmt_vectype == boolean_type_node)
  	mask_producers->safe_push (stmt_info);
*************** vect_determine_vf_for_stmt_1 (stmt_vec_i
*** 202,263 ****
    return true;
  }
  
- /* Subroutine of vect_determine_vectorization_factor.  Set the vector
-    types of STMT_INFO and all attached pattern statements and update
-    the vectorization factor VF accordingly.  If some of the statements
-    produce a mask result whose vector type can only be calculated later,
-    add them to MASK_PRODUCERS.  Return true on success or false if
-    something prevented vectorization.  */
- 
- static bool
- vect_determine_vf_for_stmt (stmt_vec_info stmt_info, poly_uint64 *vf,
- 			    vec<stmt_vec_info > *mask_producers)
- {
-   vec_info *vinfo = stmt_info->vinfo;
-   if (dump_enabled_p ())
-     {
-       dump_printf_loc (MSG_NOTE, vect_location, "==> examining statement: ");
-       dump_gimple_stmt (MSG_NOTE, TDF_SLIM, stmt_info->stmt, 0);
-     }
-   if (!vect_determine_vf_for_stmt_1 (stmt_info, false, vf, mask_producers))
-     return false;
- 
-   if (STMT_VINFO_IN_PATTERN_P (stmt_info)
-       && STMT_VINFO_RELATED_STMT (stmt_info))
-     {
-       gimple *pattern_def_seq = STMT_VINFO_PATTERN_DEF_SEQ (stmt_info);
-       stmt_info = STMT_VINFO_RELATED_STMT (stmt_info);
- 
-       /* If a pattern statement has def stmts, analyze them too.  */
-       for (gimple_stmt_iterator si = gsi_start (pattern_def_seq);
- 	   !gsi_end_p (si); gsi_next (&si))
- 	{
- 	  stmt_vec_info def_stmt_info = vinfo->lookup_stmt (gsi_stmt (si));
- 	  if (dump_enabled_p ())
- 	    {
- 	      dump_printf_loc (MSG_NOTE, vect_location,
- 			       "==> examining pattern def stmt: ");
- 	      dump_gimple_stmt (MSG_NOTE, TDF_SLIM,
- 				def_stmt_info->stmt, 0);
- 	    }
- 	  if (!vect_determine_vf_for_stmt_1 (def_stmt_info, true,
- 					     vf, mask_producers))
- 	    return false;
- 	}
- 
-       if (dump_enabled_p ())
- 	{
- 	  dump_printf_loc (MSG_NOTE, vect_location,
- 			   "==> examining pattern statement: ");
- 	  dump_gimple_stmt (MSG_NOTE, TDF_SLIM, stmt_info->stmt, 0);
- 	}
-       if (!vect_determine_vf_for_stmt_1 (stmt_info, true, vf, mask_producers))
- 	return false;
-     }
- 
-   return true;
- }
- 
  /* Function vect_determine_vectorization_factor
  
     Determine the vectorization factor (VF).  VF is the number of data elements
--- 208,213 ----
*************** vect_compute_single_scalar_iteration_cos
*** 1078,1086 ****
            /* Skip stmts that are not vectorized inside the loop.  */
            if (stmt_info
                && !STMT_VINFO_RELEVANT_P (stmt_info)
!               && (!STMT_VINFO_LIVE_P (stmt_info)
!                   || !VECTORIZABLE_CYCLE_DEF (STMT_VINFO_DEF_TYPE (stmt_info)))
! 	      && !STMT_VINFO_IN_PATTERN_P (stmt_info))
              continue;
  
  	  vect_cost_for_stmt kind;
--- 1028,1035 ----
            /* Skip stmts that are not vectorized inside the loop.  */
            if (stmt_info
                && !STMT_VINFO_RELEVANT_P (stmt_info)
! 	      && (!VECTORIZABLE_CYCLE_DEF (STMT_VINFO_DEF_TYPE (stmt_info))
! 		  || !STMT_VINFO_LIVE_P (stmt_info)))
              continue;
  
  	  vect_cost_for_stmt kind;
*************** vect_update_vf_for_slp (loop_vec_info lo
*** 1394,1407 ****
    vec_basic_block *vec_bb;
    FOR_EACH_VEC_ELT (loop_vinfo->blocks, i, vec_bb)
      FOR_EACH_VEC_BB_STMT (vec_bb, stmt_info)
!       {
! 	stmt_vec_info final_info = vect_stmt_to_vectorize (stmt_info);
! 	if ((STMT_VINFO_RELEVANT_P (final_info)
! 	     || VECTORIZABLE_CYCLE_DEF (STMT_VINFO_DEF_TYPE (final_info)))
! 	    && !PURE_SLP_STMT (final_info))
! 	  /* STMT needs both SLP and loop-based vectorization.  */
! 	  only_slp_in_loop = false;
!       }
  
    if (only_slp_in_loop)
      {
--- 1343,1353 ----
    vec_basic_block *vec_bb;
    FOR_EACH_VEC_ELT (loop_vinfo->blocks, i, vec_bb)
      FOR_EACH_VEC_BB_STMT (vec_bb, stmt_info)
!       if ((STMT_VINFO_RELEVANT_P (stmt_info)
! 	   || VECTORIZABLE_CYCLE_DEF (STMT_VINFO_DEF_TYPE (stmt_info)))
! 	  && !PURE_SLP_STMT (stmt_info))
! 	/* STMT needs both SLP and loop-based vectorization.  */
! 	only_slp_in_loop = false;
  
    if (only_slp_in_loop)
      {
*************** vect_analyze_loop_2 (loop_vec_info loop_
*** 2164,2181 ****
    vec_basic_block *vec_bb;
    FOR_EACH_VEC_ELT (loop_vinfo->blocks, i, vec_bb)
      FOR_EACH_VEC_BB_STMT (vec_bb, stmt_info)
!       {
! 	STMT_SLP_TYPE (stmt_info) = loop_vect;
! 	if (STMT_VINFO_IN_PATTERN_P (stmt_info))
! 	  {
! 	    gimple *pattern_def_seq = STMT_VINFO_PATTERN_DEF_SEQ (stmt_info);
! 	    STMT_SLP_TYPE (STMT_VINFO_RELATED_STMT (stmt_info)) = loop_vect;
! 	    for (gimple_stmt_iterator pi = gsi_start (pattern_def_seq);
! 		 !gsi_end_p (pi); gsi_next (&pi))
! 	      STMT_SLP_TYPE (loop_vinfo->lookup_stmt (gsi_stmt (pi)))
! 		= loop_vect;
! 	  }
!       }
  
    /* Free optimized alias test DDRS.  */
    LOOP_VINFO_LOWER_BOUNDS (loop_vinfo).truncate (0);
--- 2110,2116 ----
    vec_basic_block *vec_bb;
    FOR_EACH_VEC_ELT (loop_vinfo->blocks, i, vec_bb)
      FOR_EACH_VEC_BB_STMT (vec_bb, stmt_info)
!       STMT_SLP_TYPE (stmt_info) = loop_vect;
  
    /* Free optimized alias test DDRS.  */
    LOOP_VINFO_LOWER_BOUNDS (loop_vinfo).truncate (0);
*************** vect_transform_loop (loop_vec_info loop_
*** 8371,8392 ****
  	    loop_vinfo->remove_stmt (stmt_info);
  	  else
  	    {
! 	      gimple_stmt_iterator si = gsi_for_stmt (stmt_info->stmt);
! 	      if (STMT_VINFO_IN_PATTERN_P (stmt_info))
! 		{
! 		  gimple *def_seq = STMT_VINFO_PATTERN_DEF_SEQ (stmt_info);
! 		  for (gimple_stmt_iterator subsi = gsi_start (def_seq);
! 		       !gsi_end_p (subsi); gsi_next (&subsi))
! 		    {
! 		      stmt_vec_info pat_stmt_info
! 			= loop_vinfo->lookup_stmt (gsi_stmt (subsi));
! 		      vect_transform_loop_stmt (loop_vinfo, pat_stmt_info,
! 						&si);
! 		    }
! 		  stmt_vec_info pat_stmt_info
! 		    = STMT_VINFO_RELATED_STMT (stmt_info);
! 		  vect_transform_loop_stmt (loop_vinfo, pat_stmt_info, &si);
! 		}
  	      vect_transform_loop_stmt (loop_vinfo, stmt_info, &si);
  	    }
  	}
--- 8306,8313 ----
  	    loop_vinfo->remove_stmt (stmt_info);
  	  else
  	    {
! 	      stmt_vec_info place = vect_orig_stmt (stmt_info);
! 	      gimple_stmt_iterator si = gsi_for_stmt (place->stmt);
  	      vect_transform_loop_stmt (loop_vinfo, stmt_info, &si);
  	    }
  	}
Index: gcc/tree-vect-slp.c
===================================================================
*** gcc/tree-vect-slp.c	2018-07-30 12:32:46.654356310 +0100
--- gcc/tree-vect-slp.c	2018-07-30 12:32:49.894327770 +0100
*************** vect_detect_hybrid_slp (loop_vec_info lo
*** 2411,2428 ****
    vec_basic_block *vec_bb;
    FOR_EACH_VEC_ELT (loop_vinfo->blocks, i, vec_bb)
      FOR_EACH_VEC_BB_STMT (vec_bb, stmt_info)
!       if (STMT_VINFO_IN_PATTERN_P (stmt_info))
  	{
  	  walk_stmt_info wi;
  	  memset (&wi, 0, sizeof (wi));
  	  wi.info = loop_vinfo;
! 	  gimple_stmt_iterator gsi2
! 	    = gsi_for_stmt (STMT_VINFO_RELATED_STMT (stmt_info)->stmt);
! 	  walk_gimple_stmt (&gsi2, vect_detect_hybrid_slp_2,
  			    vect_detect_hybrid_slp_1, &wi);
- 	  walk_gimple_seq (STMT_VINFO_PATTERN_DEF_SEQ (stmt_info),
- 			   vect_detect_hybrid_slp_2,
- 			   vect_detect_hybrid_slp_1, &wi);
  	}
  
    /* Then walk the SLP instance trees marking stmts with uses in
--- 2411,2424 ----
    vec_basic_block *vec_bb;
    FOR_EACH_VEC_ELT (loop_vinfo->blocks, i, vec_bb)
      FOR_EACH_VEC_BB_STMT (vec_bb, stmt_info)
!       if (is_pattern_stmt_p (stmt_info))
  	{
  	  walk_stmt_info wi;
  	  memset (&wi, 0, sizeof (wi));
  	  wi.info = loop_vinfo;
! 	  gimple_stmt_iterator gsi = gsi_for_stmt (stmt_info->stmt);
! 	  walk_gimple_stmt (&gsi, vect_detect_hybrid_slp_2,
  			    vect_detect_hybrid_slp_1, &wi);
  	}
  
    /* Then walk the SLP instance trees marking stmts with uses in
Index: gcc/tree-vect-stmts.c
===================================================================
*** gcc/tree-vect-stmts.c	2018-07-30 12:32:46.658356275 +0100
--- gcc/tree-vect-stmts.c	2018-07-30 12:32:49.898327734 +0100
*************** vect_analyze_stmt (stmt_vec_info stmt_in
*** 9384,9394 ****
  		   slp_tree node, slp_instance node_instance,
  		   stmt_vector_for_cost *cost_vec)
  {
-   vec_info *vinfo = stmt_info->vinfo;
    bb_vec_info bb_vinfo = STMT_VINFO_BB_VINFO (stmt_info);
    enum vect_relevant relevance = STMT_VINFO_RELEVANT (stmt_info);
    bool ok;
-   gimple_seq pattern_def_seq;
  
    if (dump_enabled_p ())
      {
--- 9384,9392 ----
*************** vect_analyze_stmt (stmt_vec_info stmt_in
*** 9405,9498 ****
        return false;
      }
  
-   if (STMT_VINFO_IN_PATTERN_P (stmt_info)
-       && node == NULL
-       && (pattern_def_seq = STMT_VINFO_PATTERN_DEF_SEQ (stmt_info)))
-     {
-       gimple_stmt_iterator si;
- 
-       for (si = gsi_start (pattern_def_seq); !gsi_end_p (si); gsi_next (&si))
- 	{
- 	  stmt_vec_info pattern_def_stmt_info
- 	    = vinfo->lookup_stmt (gsi_stmt (si));
- 	  if (STMT_VINFO_RELEVANT_P (pattern_def_stmt_info)
- 	      || STMT_VINFO_LIVE_P (pattern_def_stmt_info))
- 	    {
- 	      /* Analyze def stmt of STMT if it's a pattern stmt.  */
- 	      if (dump_enabled_p ())
- 		{
- 		  dump_printf_loc (MSG_NOTE, vect_location,
- 				   "==> examining pattern def statement: ");
- 		  dump_gimple_stmt (MSG_NOTE, TDF_SLIM,
- 				    pattern_def_stmt_info->stmt, 0);
- 		}
- 
- 	      if (!vect_analyze_stmt (pattern_def_stmt_info,
- 				      need_to_vectorize, node, node_instance,
- 				      cost_vec))
- 		return false;
- 	    }
- 	}
-     }
- 
    /* Skip stmts that do not need to be vectorized. In loops this is expected
       to include:
       - the COND_EXPR which is the loop exit condition
       - any LABEL_EXPRs in the loop
       - computations that are used only for array indexing or loop control.
       In basic blocks we only analyze statements that are a part of some SLP
!      instance, therefore, all the statements are relevant.
! 
!      Pattern statement needs to be analyzed instead of the original statement
!      if the original statement is not relevant.  Otherwise, we analyze both
!      statements.  In basic blocks we are called from some SLP instance
!      traversal, don't analyze pattern stmts instead, the pattern stmts
!      already will be part of SLP instance.  */
  
-   stmt_vec_info pattern_stmt_info = STMT_VINFO_RELATED_STMT (stmt_info);
    if (!STMT_VINFO_RELEVANT_P (stmt_info)
        && !STMT_VINFO_LIVE_P (stmt_info))
      {
-       if (STMT_VINFO_IN_PATTERN_P (stmt_info)
- 	  && pattern_stmt_info
- 	  && (STMT_VINFO_RELEVANT_P (pattern_stmt_info)
- 	      || STMT_VINFO_LIVE_P (pattern_stmt_info)))
-         {
-           /* Analyze PATTERN_STMT instead of the original stmt.  */
- 	  stmt_info = pattern_stmt_info;
-           if (dump_enabled_p ())
-             {
-               dump_printf_loc (MSG_NOTE, vect_location,
-                                "==> examining pattern statement: ");
- 	      dump_gimple_stmt (MSG_NOTE, TDF_SLIM, stmt_info->stmt, 0);
-             }
-         }
-       else
-         {
-           if (dump_enabled_p ())
-             dump_printf_loc (MSG_NOTE, vect_location, "irrelevant.\n");
- 
-           return true;
-         }
-     }
-   else if (STMT_VINFO_IN_PATTERN_P (stmt_info)
- 	   && node == NULL
- 	   && pattern_stmt_info
- 	   && (STMT_VINFO_RELEVANT_P (pattern_stmt_info)
- 	       || STMT_VINFO_LIVE_P (pattern_stmt_info)))
-     {
-       /* Analyze PATTERN_STMT too.  */
        if (dump_enabled_p ())
!         {
!           dump_printf_loc (MSG_NOTE, vect_location,
!                            "==> examining pattern statement: ");
! 	  dump_gimple_stmt (MSG_NOTE, TDF_SLIM, pattern_stmt_info->stmt, 0);
!         }
! 
!       if (!vect_analyze_stmt (pattern_stmt_info, need_to_vectorize, node,
! 			      node_instance, cost_vec))
!         return false;
!    }
  
    switch (STMT_VINFO_DEF_TYPE (stmt_info))
      {
--- 9403,9423 ----
        return false;
      }
  
    /* Skip stmts that do not need to be vectorized. In loops this is expected
       to include:
       - the COND_EXPR which is the loop exit condition
       - any LABEL_EXPRs in the loop
       - computations that are used only for array indexing or loop control.
       In basic blocks we only analyze statements that are a part of some SLP
!      instance, therefore, all the statements are relevant.  */
  
    if (!STMT_VINFO_RELEVANT_P (stmt_info)
        && !STMT_VINFO_LIVE_P (stmt_info))
      {
        if (dump_enabled_p ())
! 	dump_printf_loc (MSG_NOTE, vect_location, "irrelevant.\n");
!       return true;
!     }
  
    switch (STMT_VINFO_DEF_TYPE (stmt_info))
      {
*************** vect_remove_dead_scalar_stmts (vec_info
*** 10915,10921 ****
  	   stmt_info = prev_stmt_info)
  	{
  	  prev_stmt_info = stmt_info->prev;
! 	  vect_maybe_remove_scalar_stmt (stmt_info);
  	}
      }
  }
--- 10840,10847 ----
  	   stmt_info = prev_stmt_info)
  	{
  	  prev_stmt_info = stmt_info->prev;
! 	  if (!is_pattern_stmt_p (stmt_info))
! 	    vect_maybe_remove_scalar_stmt (stmt_info);
  	}
      }
  }
Index: gcc/tree-vect-patterns.c
===================================================================
*** gcc/tree-vect-patterns.c	2018-07-30 12:32:42.786390386 +0100
--- gcc/tree-vect-patterns.c	2018-07-30 12:32:49.894327770 +0100
*************** vect_init_pattern_stmt (gimple *pattern_
*** 125,151 ****
  vect_set_pattern_stmt (gimple *pattern_stmt, stmt_vec_info orig_stmt_info,
  		       tree vectype)
  {
!   STMT_VINFO_IN_PATTERN_P (orig_stmt_info) = true;
!   STMT_VINFO_RELATED_STMT (orig_stmt_info)
      = vect_init_pattern_stmt (pattern_stmt, orig_stmt_info, vectype);
  }
  
! /* Add NEW_STMT to STMT_INFO's pattern definition statements.  If VECTYPE
!    is nonnull, record that NEW_STMT's vector type is VECTYPE, which might
!    be different from the vector type of the final pattern statement.  */
  
  static inline void
  append_pattern_def_seq (stmt_vec_info stmt_info, gimple *new_stmt,
  			tree vectype = NULL_TREE)
  {
!   vec_info *vinfo = stmt_info->vinfo;
!   if (vectype)
!     {
!       stmt_vec_info new_stmt_info = vinfo->add_stmt (new_stmt);
!       STMT_VINFO_VECTYPE (new_stmt_info) = vectype;
!     }
!   gimple_seq_add_stmt_without_update (&STMT_VINFO_PATTERN_DEF_SEQ (stmt_info),
! 				      new_stmt);
  }
  
  /* The caller wants to perform new operations on vect_external variable
--- 125,150 ----
  vect_set_pattern_stmt (gimple *pattern_stmt, stmt_vec_info orig_stmt_info,
  		       tree vectype)
  {
!   stmt_vec_info pattern_stmt_info
      = vect_init_pattern_stmt (pattern_stmt, orig_stmt_info, vectype);
+   orig_stmt_info->block->add_before (pattern_stmt_info, orig_stmt_info);
+   STMT_VINFO_IN_PATTERN_P (orig_stmt_info) = true;
+   STMT_VINFO_RELATED_STMT (orig_stmt_info) = pattern_stmt_info;
  }
  
! /* Add NEW_STMT to the pattern statements that replace STMT_INFO.
!    If VECTYPE is nonnull, record that NEW_STMT's vector type is VECTYPE,
!    which might be different from the vector type of the final pattern
!    statement.  */
  
  static inline void
  append_pattern_def_seq (stmt_vec_info stmt_info, gimple *new_stmt,
  			tree vectype = NULL_TREE)
  {
!   stmt_vec_info orig_stmt_info = vect_orig_stmt (stmt_info);
!   stmt_vec_info new_stmt_info
!     = vect_init_pattern_stmt (new_stmt, orig_stmt_info, vectype);
!   stmt_info->block->add_before (new_stmt_info, stmt_info);
  }
  
  /* The caller wants to perform new operations on vect_external variable
*************** vect_split_statement (stmt_vec_info stmt
*** 633,643 ****
  {
    if (is_pattern_stmt_p (stmt2_info))
      {
-       /* STMT2_INFO is part of a pattern.  Get the statement to which
- 	 the pattern is attached.  */
-       stmt_vec_info orig_stmt2_info = STMT_VINFO_RELATED_STMT (stmt2_info);
-       vect_init_pattern_stmt (stmt1, orig_stmt2_info, vectype);
- 
        if (dump_enabled_p ())
  	{
  	  dump_printf_loc (MSG_NOTE, vect_location,
--- 632,637 ----
*************** vect_split_statement (stmt_vec_info stmt
*** 645,650 ****
--- 639,647 ----
  	  dump_gimple_stmt (MSG_NOTE, TDF_SLIM, stmt2_info->stmt, 0);
  	}
  
+       /* Insert STMT1_INFO before STMT2_INFO.  */
+       append_pattern_def_seq (stmt2_info, stmt1, vectype);
+ 
        /* Since STMT2_INFO is a pattern statement, we can change it
  	 in-situ without worrying about changing the code for the
  	 containing block.  */
*************** vect_split_statement (stmt_vec_info stmt
*** 658,675 ****
  	  dump_gimple_stmt (MSG_NOTE, TDF_SLIM, stmt2_info->stmt, 0);
  	}
  
-       gimple_seq *def_seq = &STMT_VINFO_PATTERN_DEF_SEQ (orig_stmt2_info);
-       if (STMT_VINFO_RELATED_STMT (orig_stmt2_info) == stmt2_info)
- 	/* STMT2_INFO is the actual pattern statement.  Add STMT1
- 	   to the end of the definition sequence.  */
- 	gimple_seq_add_stmt_without_update (def_seq, stmt1);
-       else
- 	{
- 	  /* STMT2_INFO belongs to the definition sequence.  Insert STMT1
- 	     before it.  */
- 	  gimple_stmt_iterator gsi = gsi_for_stmt (stmt2_info->stmt, def_seq);
- 	  gsi_insert_before_without_update (&gsi, stmt1, GSI_SAME_STMT);
- 	}
        return true;
      }
    else
--- 655,660 ----
*************** vect_split_statement (stmt_vec_info stmt
*** 689,698 ****
  	  dump_gimple_stmt (MSG_NOTE, TDF_SLIM, stmt2_info->stmt, 0);
  	}
  
!       /* Add STMT1 as a singleton pattern definition sequence.  */
!       gimple_seq *def_seq = &STMT_VINFO_PATTERN_DEF_SEQ (stmt2_info);
!       vect_init_pattern_stmt (stmt1, stmt2_info, vectype);
!       gimple_seq_add_stmt_without_update (def_seq, stmt1);
  
        /* Build the second of the two pattern statements.  */
        tree new_lhs = vect_recog_temp_ssa_var (lhs_type, NULL);
--- 674,681 ----
  	  dump_gimple_stmt (MSG_NOTE, TDF_SLIM, stmt2_info->stmt, 0);
  	}
  
!       /* Insert STMT1_INFO before STMT2_INFO.  */
!       append_pattern_def_seq (stmt2_info, stmt1, vectype);
  
        /* Build the second of the two pattern statements.  */
        tree new_lhs = vect_recog_temp_ssa_var (lhs_type, NULL);
*************** vect_recog_rotate_pattern (stmt_vec_info
*** 2164,2170 ****
      i.e. the shift/rotate stmt.  The original stmt (S3) is replaced
      with a shift/rotate which has same type on both operands, in the
      second case just b_T op c_T, in the first case with added cast
!     from a_t to c_T in STMT_VINFO_PATTERN_DEF_SEQ.
  
    Output:
  
--- 2147,2153 ----
      i.e. the shift/rotate stmt.  The original stmt (S3) is replaced
      with a shift/rotate which has same type on both operands, in the
      second case just b_T op c_T, in the first case with added cast
!     from a_t to c_T beforehand.
  
    Output:
  
*************** adjust_bool_stmts (hash_set <gimple *> &
*** 3518,3526 ****
      adjust_bool_pattern (gimple_assign_lhs (bool_stmts[i]),
  			 out_type, stmt_info, defs);
  
!   /* Pop the last pattern seq stmt and install it as pattern root for STMT.  */
!   gimple *pattern_stmt
!     = gimple_seq_last_stmt (STMT_VINFO_PATTERN_DEF_SEQ (stmt_info));
    return gimple_assign_lhs (pattern_stmt);
  }
  
--- 3501,3508 ----
      adjust_bool_pattern (gimple_assign_lhs (bool_stmts[i]),
  			 out_type, stmt_info, defs);
  
!   /* Return the result of the last statement we emitted.  */
!   gimple *pattern_stmt = stmt_info->prev->stmt;
    return gimple_assign_lhs (pattern_stmt);
  }
  
*************** static vect_recog_func vect_vect_recog_f
*** 4676,4689 ****
  
  const unsigned int NUM_PATTERNS = ARRAY_SIZE (vect_vect_recog_func_ptrs);
  
! /* Mark statements that are involved in a pattern.  */
  
  static inline void
! vect_mark_pattern_stmts (stmt_vec_info orig_stmt_info, gimple *pattern_stmt,
!                          tree pattern_vectype)
  {
-   gimple *def_seq = STMT_VINFO_PATTERN_DEF_SEQ (orig_stmt_info);
- 
    gimple *orig_pattern_stmt = NULL;
    if (is_pattern_stmt_p (orig_stmt_info))
      {
--- 4658,4671 ----
  
  const unsigned int NUM_PATTERNS = ARRAY_SIZE (vect_vect_recog_func_ptrs);
  
! /* Replace ORIG_STMT_INFO with PATTERN_STMT, using PATTERN_VECTYPE as
!    the vector type for PATTERN_STMT.  */
  
  static inline void
! vect_replace_stmt_with_pattern (stmt_vec_info orig_stmt_info,
! 				gimple *pattern_stmt,
! 				tree pattern_vectype)
  {
    gimple *orig_pattern_stmt = NULL;
    if (is_pattern_stmt_p (orig_stmt_info))
      {
*************** vect_mark_pattern_stmts (stmt_vec_info o
*** 4710,4741 ****
  	  dump_gimple_stmt (MSG_NOTE, TDF_SLIM, pattern_stmt, 0);
  	}
  
-       /* Switch to the statement that ORIG replaces.  */
-       orig_stmt_info = STMT_VINFO_RELATED_STMT (orig_stmt_info);
- 
        /* We shouldn't be replacing the main pattern statement.  */
!       gcc_assert (STMT_VINFO_RELATED_STMT (orig_stmt_info)->stmt
! 		  != orig_pattern_stmt);
!     }
  
!   if (def_seq)
!     for (gimple_stmt_iterator si = gsi_start (def_seq);
! 	 !gsi_end_p (si); gsi_next (&si))
!       vect_init_pattern_stmt (gsi_stmt (si), orig_stmt_info, pattern_vectype);
! 
!   if (orig_pattern_stmt)
!     {
!       vect_init_pattern_stmt (pattern_stmt, orig_stmt_info, pattern_vectype);
! 
!       /* Insert all the new pattern statements before the original one.  */
!       gimple_seq *orig_def_seq = &STMT_VINFO_PATTERN_DEF_SEQ (orig_stmt_info);
!       gimple_stmt_iterator gsi = gsi_for_stmt (orig_pattern_stmt,
! 					       orig_def_seq);
!       gsi_insert_seq_before_without_update (&gsi, def_seq, GSI_SAME_STMT);
!       gsi_insert_before_without_update (&gsi, pattern_stmt, GSI_SAME_STMT);
  
        /* Remove the pattern statement that this new pattern replaces.  */
!       gsi_remove (&gsi, false);
      }
    else
      vect_set_pattern_stmt (pattern_stmt, orig_stmt_info, pattern_vectype);
--- 4692,4705 ----
  	  dump_gimple_stmt (MSG_NOTE, TDF_SLIM, pattern_stmt, 0);
  	}
  
        /* We shouldn't be replacing the main pattern statement.  */
!       gcc_assert (!is_main_pattern_stmt_p (orig_stmt_info));
  
!       /* Insert the new pattern statement before the original one.  */
!       append_pattern_def_seq (orig_stmt_info, pattern_stmt, pattern_vectype);
  
        /* Remove the pattern statement that this new pattern replaces.  */
!       orig_stmt_info->block->remove (orig_stmt_info);
      }
    else
      vect_set_pattern_stmt (pattern_stmt, orig_stmt_info, pattern_vectype);
*************** vect_mark_pattern_stmts (stmt_vec_info o
*** 4762,4791 ****
  static void
  vect_pattern_recog_1 (vect_recog_func *recog_func, stmt_vec_info stmt_info)
  {
-   vec_info *vinfo = stmt_info->vinfo;
    gimple *pattern_stmt;
    loop_vec_info loop_vinfo;
    tree pattern_vectype;
  
!   /* If this statement has already been replaced with pattern statements,
!      leave the original statement alone, since the first match wins.
!      Instead try to match against the definition statements that feed
!      the main pattern statement.  */
!   if (STMT_VINFO_IN_PATTERN_P (stmt_info))
!     {
!       gimple_stmt_iterator gsi;
!       for (gsi = gsi_start (STMT_VINFO_PATTERN_DEF_SEQ (stmt_info));
! 	   !gsi_end_p (gsi); gsi_next (&gsi))
! 	vect_pattern_recog_1 (recog_func, vinfo->lookup_stmt (gsi_stmt (gsi)));
!       return;
!     }
! 
!   gcc_assert (!STMT_VINFO_PATTERN_DEF_SEQ (stmt_info));
    pattern_stmt = recog_func->fn (stmt_info, &pattern_vectype);
    if (!pattern_stmt)
      {
!       /* Clear any half-formed pattern definition sequence.  */
!       STMT_VINFO_PATTERN_DEF_SEQ (stmt_info) = NULL;
        return;
      }
  
--- 4726,4742 ----
  static void
  vect_pattern_recog_1 (vect_recog_func *recog_func, stmt_vec_info stmt_info)
  {
    gimple *pattern_stmt;
    loop_vec_info loop_vinfo;
    tree pattern_vectype;
  
!   stmt_vec_info prev_stmt_info = stmt_info->prev;
    pattern_stmt = recog_func->fn (stmt_info, &pattern_vectype);
    if (!pattern_stmt)
      {
!       /* Delete any half-formed pattern sequence.  */
!       while (stmt_info->prev != prev_stmt_info)
! 	stmt_info->block->remove (prev_stmt_info);
        return;
      }
  
*************** vect_pattern_recog_1 (vect_recog_func *r
*** 4800,4807 ****
        dump_gimple_stmt (MSG_NOTE, TDF_SLIM, pattern_stmt, 0);
      }
  
    /* Mark the stmts that are involved in the pattern. */
!   vect_mark_pattern_stmts (stmt_info, pattern_stmt, pattern_vectype);
  
    /* Patterns cannot be vectorized using SLP, because they change the order of
       computation.  */
--- 4751,4765 ----
        dump_gimple_stmt (MSG_NOTE, TDF_SLIM, pattern_stmt, 0);
      }
  
+   /* Install the vector type in pattern definition statements that
+      don't yet have one.  */
+   for (stmt_vec_info pat_stmt_info = stmt_info->prev;
+        pat_stmt_info != prev_stmt_info; pat_stmt_info = pat_stmt_info->prev)
+     if (!STMT_VINFO_VECTYPE (pat_stmt_info))
+       STMT_VINFO_VECTYPE (pat_stmt_info) = pattern_vectype;
+ 
    /* Mark the stmts that are involved in the pattern. */
!   vect_replace_stmt_with_pattern (stmt_info, pattern_stmt, pattern_vectype);
  
    /* Patterns cannot be vectorized using SLP, because they change the order of
       computation.  */
*************** vect_pattern_recog (vec_info *vinfo)
*** 4903,4910 ****
    vec_basic_block *vec_bb;
    FOR_EACH_VEC_ELT (vinfo->blocks, i, vec_bb)
      FOR_EACH_VEC_BB_STMT (vec_bb, stmt_info)
!       if (STMT_VINFO_VECTORIZABLE (stmt_info))
! 	/* Scan over all generic vect_recog_xxx_pattern functions.  */
! 	for (unsigned int j = 0; j < NUM_PATTERNS; j++)
! 	  vect_pattern_recog_1 (&vect_vect_recog_func_ptrs[j], stmt_info);
  }
--- 4861,4887 ----
    vec_basic_block *vec_bb;
    FOR_EACH_VEC_ELT (vinfo->blocks, i, vec_bb)
      FOR_EACH_VEC_BB_STMT (vec_bb, stmt_info)
!       {
! 	stmt_vec_info begin_prev = stmt_info->prev;
! 	if (STMT_VINFO_VECTORIZABLE (stmt_info))
! 	  /* Scan over all generic vect_recog_xxx_pattern functions.  */
! 	  for (unsigned int j = 0; j < NUM_PATTERNS; j++)
! 	    {
! 	      stmt_vec_info curr_prev;
! 	      /* Scan over STMT_INFO and any pattern definition statements
! 		 that were introduced by previous recognizers.  */
! 	      for (stmt_vec_info curr_info = stmt_info;
! 		   curr_info != begin_prev; curr_info = curr_prev)
! 		{
! 		  curr_prev = curr_info->prev;
! 		  /* The first match wins, so skip statements that have
! 		     already been replaced, and the final statement with
! 		     which they were replaced.  */
! 		  if (!STMT_VINFO_IN_PATTERN_P (curr_info)
! 		      && !is_main_pattern_stmt_p (curr_info))
! 		    vect_pattern_recog_1 (&vect_vect_recog_func_ptrs[j],
! 					  curr_info);
! 		}
! 	    }
!       }
  }


* Re: [01/11] Schedule SLP earlier
  2018-07-30 11:37 ` [01/11] Schedule SLP earlier Richard Sandiford
@ 2018-08-01 12:49   ` Richard Biener
  0 siblings, 0 replies; 20+ messages in thread
From: Richard Biener @ 2018-08-01 12:49 UTC (permalink / raw)
  To: GCC Patches, richard.sandiford

On Mon, Jul 30, 2018 at 1:37 PM Richard Sandiford
<richard.sandiford@arm.com> wrote:
>
> vect_transform_loop used to call vect_schedule_slp lazily when it
> came across the first SLP statement, but it seems easier to do it
> before the main loop.

Indeed.

OK.
Richard.

>
> 2018-07-30  Richard Sandiford  <richard.sandiford@arm.com>
>
> gcc/
>         * tree-vect-loop.c (vect_transform_loop_stmt): Remove slp_scheduled
>         argument.
>         (vect_transform_loop): Update calls accordingly.  Schedule SLP
>         instances before the main loop, if any exist.
>
> Index: gcc/tree-vect-loop.c
> ===================================================================
> --- gcc/tree-vect-loop.c        2018-07-30 12:32:15.000000000 +0100
> +++ gcc/tree-vect-loop.c        2018-07-30 12:32:16.190624704 +0100
> @@ -8199,14 +8199,12 @@ scale_profile_for_vect_loop (struct loop
>  }
>
>  /* Vectorize STMT_INFO if relevant, inserting any new instructions before GSI.
> -   When vectorizing STMT_INFO as a store, set *SEEN_STORE to its stmt_vec_info.
> -   *SLP_SCHEDULE is a running record of whether we have called
> -   vect_schedule_slp.  */
> +   When vectorizing STMT_INFO as a store, set *SEEN_STORE to its
> +   stmt_vec_info.  */
>
>  static void
>  vect_transform_loop_stmt (loop_vec_info loop_vinfo, stmt_vec_info stmt_info,
> -                         gimple_stmt_iterator *gsi,
> -                         stmt_vec_info *seen_store, bool *slp_scheduled)
> +                         gimple_stmt_iterator *gsi, stmt_vec_info *seen_store)
>  {
>    struct loop *loop = LOOP_VINFO_LOOP (loop_vinfo);
>    poly_uint64 vf = LOOP_VINFO_VECT_FACTOR (loop_vinfo);
> @@ -8237,24 +8235,10 @@ vect_transform_loop_stmt (loop_vec_info
>         dump_printf_loc (MSG_NOTE, vect_location, "multiple-types.\n");
>      }
>
> -  /* SLP.  Schedule all the SLP instances when the first SLP stmt is
> -     reached.  */
> -  if (slp_vect_type slptype = STMT_SLP_TYPE (stmt_info))
> -    {
> -
> -      if (!*slp_scheduled)
> -       {
> -         *slp_scheduled = true;
> -
> -         DUMP_VECT_SCOPE ("scheduling SLP instances");
> -
> -         vect_schedule_slp (loop_vinfo);
> -       }
> -
> -      /* Hybrid SLP stmts must be vectorized in addition to SLP.  */
> -      if (slptype == pure_slp)
> -       return;
> -    }
> +  /* Pure SLP statements have already been vectorized.  We still need
> +     to apply loop vectorization to hybrid SLP statements.  */
> +  if (PURE_SLP_STMT (stmt_info))
> +    return;
>
>    if (dump_enabled_p ())
>      dump_printf_loc (MSG_NOTE, vect_location, "transform statement.\n");
> @@ -8284,7 +8268,6 @@ vect_transform_loop (loop_vec_info loop_
>    tree niters_vector_mult_vf = NULL_TREE;
>    poly_uint64 vf = LOOP_VINFO_VECT_FACTOR (loop_vinfo);
>    unsigned int lowest_vf = constant_lower_bound (vf);
> -  bool slp_scheduled = false;
>    gimple *stmt;
>    bool check_profitability = false;
>    unsigned int th;
> @@ -8390,6 +8373,14 @@ vect_transform_loop (loop_vec_info loop_
>      /* This will deal with any possible peeling.  */
>      vect_prepare_for_masked_peels (loop_vinfo);
>
> +  /* Schedule the SLP instances first, then handle loop vectorization
> +     below.  */
> +  if (!loop_vinfo->slp_instances.is_empty ())
> +    {
> +      DUMP_VECT_SCOPE ("scheduling SLP instances");
> +      vect_schedule_slp (loop_vinfo);
> +    }
> +
>    /* FORNOW: the vectorizer supports only loops which body consist
>       of one basic block (header + empty latch). When the vectorizer will
>       support more involved loop forms, the order by which the BBs are
> @@ -8468,16 +8459,15 @@ vect_transform_loop (loop_vec_info loop_
>                           stmt_vec_info pat_stmt_info
>                             = loop_vinfo->lookup_stmt (gsi_stmt (subsi));
>                           vect_transform_loop_stmt (loop_vinfo, pat_stmt_info,
> -                                                   &si, &seen_store,
> -                                                   &slp_scheduled);
> +                                                   &si, &seen_store);
>                         }
>                       stmt_vec_info pat_stmt_info
>                         = STMT_VINFO_RELATED_STMT (stmt_info);
>                       vect_transform_loop_stmt (loop_vinfo, pat_stmt_info, &si,
> -                                               &seen_store, &slp_scheduled);
> +                                               &seen_store);
>                     }
>                   vect_transform_loop_stmt (loop_vinfo, stmt_info, &si,
> -                                           &seen_store, &slp_scheduled);
> +                                           &seen_store);
>                 }
>               gsi_next (&si);
>               if (seen_store)


* Re: [03/11] Remove vect_transform_stmt grouped_store argument
  2018-07-30 11:38 ` [03/11] Remove vect_transform_stmt grouped_store argument Richard Sandiford
@ 2018-08-01 12:49   ` Richard Biener
  0 siblings, 0 replies; 20+ messages in thread
From: Richard Biener @ 2018-08-01 12:49 UTC (permalink / raw)
  To: GCC Patches, richard.sandiford

On Mon, Jul 30, 2018 at 1:38 PM Richard Sandiford
<richard.sandiford@arm.com> wrote:
>
> Nothing now uses the grouped_store value passed back by
> vect_transform_stmt, so we might as well remove it.

OK.

>
> 2018-07-30  Richard Sandiford  <richard.sandiford@arm.com>
>
> gcc/
>         * tree-vectorizer.h (vect_transform_stmt): Remove grouped_store
>         argument.
>         * tree-vect-stmts.c (vect_transform_stmt): Likewise.
>         * tree-vect-loop.c (vect_transform_loop_stmt): Update call accordingly.
>         (vect_transform_loop): Likewise.
>         * tree-vect-slp.c (vect_schedule_slp_instance): Likewise.
>
> Index: gcc/tree-vectorizer.h
> ===================================================================
> --- gcc/tree-vectorizer.h       2018-07-30 12:32:19.366596715 +0100
> +++ gcc/tree-vectorizer.h       2018-07-30 12:32:22.718567174 +0100
> @@ -1459,7 +1459,7 @@ extern tree vect_init_vector (stmt_vec_i
>                                gimple_stmt_iterator *);
>  extern tree vect_get_vec_def_for_stmt_copy (vec_info *, tree);
>  extern bool vect_transform_stmt (stmt_vec_info, gimple_stmt_iterator *,
> -                                 bool *, slp_tree, slp_instance);
> +                                slp_tree, slp_instance);
>  extern void vect_remove_stores (stmt_vec_info);
>  extern bool vect_analyze_stmt (stmt_vec_info, bool *, slp_tree, slp_instance,
>                                stmt_vector_for_cost *);
> Index: gcc/tree-vect-stmts.c
> ===================================================================
> --- gcc/tree-vect-stmts.c       2018-07-30 12:32:09.114687014 +0100
> +++ gcc/tree-vect-stmts.c       2018-07-30 12:32:22.718567174 +0100
> @@ -9662,8 +9662,7 @@ vect_analyze_stmt (stmt_vec_info stmt_in
>
>  bool
>  vect_transform_stmt (stmt_vec_info stmt_info, gimple_stmt_iterator *gsi,
> -                    bool *grouped_store, slp_tree slp_node,
> -                     slp_instance slp_node_instance)
> +                    slp_tree slp_node, slp_instance slp_node_instance)
>  {
>    vec_info *vinfo = stmt_info->vinfo;
>    bool is_store = false;
> @@ -9727,7 +9726,6 @@ vect_transform_stmt (stmt_vec_info stmt_
>              last store in the chain is reached.  Store stmts before the last
>              one are skipped, and there vec_stmt_info shouldn't be freed
>              meanwhile.  */
> -         *grouped_store = true;
>           stmt_vec_info group_info = DR_GROUP_FIRST_ELEMENT (stmt_info);
>           if (DR_GROUP_STORE_COUNT (group_info) == DR_GROUP_SIZE (group_info))
>             is_store = true;
> Index: gcc/tree-vect-loop.c
> ===================================================================
> --- gcc/tree-vect-loop.c        2018-07-30 12:32:16.190624704 +0100
> +++ gcc/tree-vect-loop.c        2018-07-30 12:32:22.714567210 +0100
> @@ -8243,8 +8243,7 @@ vect_transform_loop_stmt (loop_vec_info
>    if (dump_enabled_p ())
>      dump_printf_loc (MSG_NOTE, vect_location, "transform statement.\n");
>
> -  bool grouped_store = false;
> -  if (vect_transform_stmt (stmt_info, gsi, &grouped_store, NULL, NULL))
> +  if (vect_transform_stmt (stmt_info, gsi, NULL, NULL))
>      *seen_store = stmt_info;
>  }
>
> @@ -8425,7 +8424,7 @@ vect_transform_loop (loop_vec_info loop_
>             {
>               if (dump_enabled_p ())
>                 dump_printf_loc (MSG_NOTE, vect_location, "transform phi.\n");
> -             vect_transform_stmt (stmt_info, NULL, NULL, NULL, NULL);
> +             vect_transform_stmt (stmt_info, NULL, NULL, NULL);
>             }
>         }
>
> Index: gcc/tree-vect-slp.c
> ===================================================================
> --- gcc/tree-vect-slp.c 2018-07-30 12:32:19.366596715 +0100
> +++ gcc/tree-vect-slp.c 2018-07-30 12:32:22.714567210 +0100
> @@ -3853,7 +3853,6 @@ vect_transform_slp_perm_load (slp_tree n
>  vect_schedule_slp_instance (slp_tree node, slp_instance instance,
>                             scalar_stmts_to_slp_tree_map_t *bst_map)
>  {
> -  bool grouped_store;
>    gimple_stmt_iterator si;
>    stmt_vec_info stmt_info;
>    unsigned int group_size;
> @@ -3945,11 +3944,11 @@ vect_schedule_slp_instance (slp_tree nod
>           vec<stmt_vec_info> v1;
>           unsigned j;
>           tree tmask = NULL_TREE;
> -         vect_transform_stmt (stmt_info, &si, &grouped_store, node, instance);
> +         vect_transform_stmt (stmt_info, &si, node, instance);
>           v0 = SLP_TREE_VEC_STMTS (node).copy ();
>           SLP_TREE_VEC_STMTS (node).truncate (0);
>           gimple_assign_set_rhs_code (stmt, ocode);
> -         vect_transform_stmt (stmt_info, &si, &grouped_store, node, instance);
> +         vect_transform_stmt (stmt_info, &si, node, instance);
>           gimple_assign_set_rhs_code (stmt, code0);
>           v1 = SLP_TREE_VEC_STMTS (node).copy ();
>           SLP_TREE_VEC_STMTS (node).truncate (0);
> @@ -3994,7 +3993,7 @@ vect_schedule_slp_instance (slp_tree nod
>           return;
>         }
>      }
> -  vect_transform_stmt (stmt_info, &si, &grouped_store, node, instance);
> +  vect_transform_stmt (stmt_info, &si, node, instance);
>
>    /* Restore stmt def-types.  */
>    FOR_EACH_VEC_ELT (SLP_TREE_CHILDREN (node), i, child)


* Re: [02/11] Remove vect_schedule_slp return value
  2018-07-30 11:37 ` [02/11] Remove vect_schedule_slp return value Richard Sandiford
@ 2018-08-01 12:49   ` Richard Biener
  0 siblings, 0 replies; 20+ messages in thread
From: Richard Biener @ 2018-08-01 12:49 UTC (permalink / raw)
  To: GCC Patches, richard.sandiford

On Mon, Jul 30, 2018 at 1:37 PM Richard Sandiford
<richard.sandiford@arm.com> wrote:
>
> Nothing now uses the vect_schedule_slp return value, so it's not worth
> propagating the value through vect_schedule_slp_instance.

OK.

>
> 2018-07-30  Richard Sandiford  <richard.sandiford@arm.com>
>
> gcc/
>         * tree-vectorizer.h (vect_schedule_slp): Return void.
>         * tree-vect-slp.c (vect_schedule_slp_instance): Likewise.
>         (vect_schedule_slp): Likewise.
>
> Index: gcc/tree-vectorizer.h
> ===================================================================
> --- gcc/tree-vectorizer.h       2018-07-30 12:32:09.114687014 +0100
> +++ gcc/tree-vectorizer.h       2018-07-30 12:32:19.366596715 +0100
> @@ -1575,7 +1575,7 @@ extern bool vect_transform_slp_perm_load
>                                           gimple_stmt_iterator *, poly_uint64,
>                                           slp_instance, bool, unsigned *);
>  extern bool vect_slp_analyze_operations (vec_info *);
> -extern bool vect_schedule_slp (vec_info *);
> +extern void vect_schedule_slp (vec_info *);
>  extern bool vect_analyze_slp (vec_info *, unsigned);
>  extern bool vect_make_slp_decision (loop_vec_info);
>  extern void vect_detect_hybrid_slp (loop_vec_info);
> Index: gcc/tree-vect-slp.c
> ===================================================================
> --- gcc/tree-vect-slp.c 2018-07-30 12:32:09.026687790 +0100
> +++ gcc/tree-vect-slp.c 2018-07-30 12:32:19.366596715 +0100
> @@ -3849,11 +3849,11 @@ vect_transform_slp_perm_load (slp_tree n
>
>  /* Vectorize SLP instance tree in postorder.  */
>
> -static bool
> +static void
>  vect_schedule_slp_instance (slp_tree node, slp_instance instance,
>                             scalar_stmts_to_slp_tree_map_t *bst_map)
>  {
> -  bool grouped_store, is_store;
> +  bool grouped_store;
>    gimple_stmt_iterator si;
>    stmt_vec_info stmt_info;
>    unsigned int group_size;
> @@ -3862,14 +3862,14 @@ vect_schedule_slp_instance (slp_tree nod
>    slp_tree child;
>
>    if (SLP_TREE_DEF_TYPE (node) != vect_internal_def)
> -    return false;
> +    return;
>
>    /* See if we have already vectorized the same set of stmts and reuse their
>       vectorized stmts.  */
>    if (slp_tree *leader = bst_map->get (SLP_TREE_SCALAR_STMTS (node)))
>      {
>        SLP_TREE_VEC_STMTS (node).safe_splice (SLP_TREE_VEC_STMTS (*leader));
> -      return false;
> +      return;
>      }
>
>    bst_map->put (SLP_TREE_SCALAR_STMTS (node).copy (), node);
> @@ -3991,11 +3991,10 @@ vect_schedule_slp_instance (slp_tree nod
>             }
>           v0.release ();
>           v1.release ();
> -         return false;
> +         return;
>         }
>      }
> -  is_store = vect_transform_stmt (stmt_info, &si, &grouped_store, node,
> -                                 instance);
> +  vect_transform_stmt (stmt_info, &si, &grouped_store, node, instance);
>
>    /* Restore stmt def-types.  */
>    FOR_EACH_VEC_ELT (SLP_TREE_CHILDREN (node), i, child)
> @@ -4005,8 +4004,6 @@ vect_schedule_slp_instance (slp_tree nod
>         FOR_EACH_VEC_ELT (SLP_TREE_SCALAR_STMTS (child), j, child_stmt_info)
>           STMT_VINFO_DEF_TYPE (child_stmt_info) = vect_internal_def;
>        }
> -
> -  return is_store;
>  }
>
>  /* Replace scalar calls from SLP node NODE with setting of their lhs to zero.
> @@ -4048,14 +4045,12 @@ vect_remove_slp_scalar_calls (slp_tree n
>
>  /* Generate vector code for all SLP instances in the loop/basic block.  */
>
> -bool
> +void
>  vect_schedule_slp (vec_info *vinfo)
>  {
>    vec<slp_instance> slp_instances;
>    slp_instance instance;
>    unsigned int i;
> -  bool is_store = false;
> -
>
>    scalar_stmts_to_slp_tree_map_t *bst_map
>      = new scalar_stmts_to_slp_tree_map_t ();
> @@ -4063,8 +4058,8 @@ vect_schedule_slp (vec_info *vinfo)
>    FOR_EACH_VEC_ELT (slp_instances, i, instance)
>      {
>        /* Schedule the tree of INSTANCE.  */
> -      is_store = vect_schedule_slp_instance (SLP_INSTANCE_TREE (instance),
> -                                             instance, bst_map);
> +      vect_schedule_slp_instance (SLP_INSTANCE_TREE (instance),
> +                                 instance, bst_map);
>        if (dump_enabled_p ())
>         dump_printf_loc (MSG_NOTE, vect_location,
>                           "vectorizing stmts using SLP.\n");
> @@ -4099,6 +4094,4 @@ vect_schedule_slp (vec_info *vinfo)
>           vinfo->remove_stmt (store_info);
>          }
>      }
> -
> -  return is_store;
>  }

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [04/11] Add a vect_orig_stmt helper function
  2018-07-30 11:38 ` [04/11] Add a vect_orig_stmt helper function Richard Sandiford
@ 2018-08-01 12:50   ` Richard Biener
  0 siblings, 0 replies; 20+ messages in thread
From: Richard Biener @ 2018-08-01 12:50 UTC (permalink / raw)
  To: GCC Patches, richard.sandiford

On Mon, Jul 30, 2018 at 1:38 PM Richard Sandiford
<richard.sandiford@arm.com> wrote:
>
> This patch just adds a helper function for going from a potential
> pattern statement to the original scalar statement.

OK.

Richard.

>
> 2018-07-30  Richard Sandiford  <richard.sandiford@arm.com>
>
> gcc/
>         * tree-vectorizer.h (vect_orig_stmt): New function.
>         * tree-vect-data-refs.c (vect_preserves_scalar_order_p): Use it.
>         * tree-vect-loop.c (vect_model_reduction_cost): Likewise.
>         (vect_create_epilog_for_reduction): Likewise.
>         (vectorizable_live_operation): Likewise.
>         * tree-vect-slp.c (vect_find_last_scalar_stmt_in_slp): Likewise.
>         (vect_detect_hybrid_slp_stmts, vect_schedule_slp): Likewise.
>         * tree-vect-stmts.c (vectorizable_call): Likewise.
>         (vectorizable_simd_clone_call, vect_remove_stores): Likewise.
>
> Index: gcc/tree-vectorizer.h
> ===================================================================
> --- gcc/tree-vectorizer.h       2018-07-30 12:32:22.718567174 +0100
> +++ gcc/tree-vectorizer.h       2018-07-30 12:32:26.218536339 +0100
> @@ -1120,6 +1120,17 @@ is_pattern_stmt_p (stmt_vec_info stmt_in
>    return stmt_info->pattern_stmt_p;
>  }
>
> +/* If STMT_INFO is a pattern statement, return the statement that it
> +   replaces, otherwise return STMT_INFO itself.  */
> +
> +inline stmt_vec_info
> +vect_orig_stmt (stmt_vec_info stmt_info)
> +{
> +  if (is_pattern_stmt_p (stmt_info))
> +    return STMT_VINFO_RELATED_STMT (stmt_info);
> +  return stmt_info;
> +}
> +
>  /* Return true if BB is a loop header.  */
>
>  static inline bool
> Index: gcc/tree-vect-data-refs.c
> ===================================================================
> --- gcc/tree-vect-data-refs.c   2018-07-30 12:32:08.934688600 +0100
> +++ gcc/tree-vect-data-refs.c   2018-07-30 12:32:26.214536374 +0100
> @@ -214,10 +214,8 @@ vect_preserves_scalar_order_p (dr_vec_in
>       (but could happen later) while reads will happen no later than their
>       current position (but could happen earlier).  Reordering is therefore
>       only possible if the first access is a write.  */
> -  if (is_pattern_stmt_p (stmtinfo_a))
> -    stmtinfo_a = STMT_VINFO_RELATED_STMT (stmtinfo_a);
> -  if (is_pattern_stmt_p (stmtinfo_b))
> -    stmtinfo_b = STMT_VINFO_RELATED_STMT (stmtinfo_b);
> +  stmtinfo_a = vect_orig_stmt (stmtinfo_a);
> +  stmtinfo_b = vect_orig_stmt (stmtinfo_b);
>    stmt_vec_info earlier_stmt_info = get_earlier_stmt (stmtinfo_a, stmtinfo_b);
>    return !DR_IS_WRITE (STMT_VINFO_DATA_REF (earlier_stmt_info));
>  }
> Index: gcc/tree-vect-loop.c
> ===================================================================
> --- gcc/tree-vect-loop.c        2018-07-30 12:32:22.714567210 +0100
> +++ gcc/tree-vect-loop.c        2018-07-30 12:32:26.214536374 +0100
> @@ -3814,10 +3814,7 @@ vect_model_reduction_cost (stmt_vec_info
>
>    vectype = STMT_VINFO_VECTYPE (stmt_info);
>    mode = TYPE_MODE (vectype);
> -  stmt_vec_info orig_stmt_info = STMT_VINFO_RELATED_STMT (stmt_info);
> -
> -  if (!orig_stmt_info)
> -    orig_stmt_info = stmt_info;
> +  stmt_vec_info orig_stmt_info = vect_orig_stmt (stmt_info);
>
>    code = gimple_assign_rhs_code (orig_stmt_info->stmt);
>
> @@ -4738,13 +4735,8 @@ vect_create_epilog_for_reduction (vec<tr
>           Otherwise (it is a regular reduction) - the tree-code and scalar-def
>           are taken from STMT.  */
>
> -  stmt_vec_info orig_stmt_info = STMT_VINFO_RELATED_STMT (stmt_info);
> -  if (!orig_stmt_info)
> -    {
> -      /* Regular reduction  */
> -      orig_stmt_info = stmt_info;
> -    }
> -  else
> +  stmt_vec_info orig_stmt_info = vect_orig_stmt (stmt_info);
> +  if (orig_stmt_info != stmt_info)
>      {
>        /* Reduction pattern  */
>        gcc_assert (STMT_VINFO_IN_PATTERN_P (orig_stmt_info));
> @@ -5540,11 +5532,7 @@ vect_create_epilog_for_reduction (vec<tr
>    if (REDUC_GROUP_FIRST_ELEMENT (stmt_info))
>      {
>        stmt_vec_info dest_stmt_info
> -       = SLP_TREE_SCALAR_STMTS (slp_node)[group_size - 1];
> -      /* Handle reduction patterns.  */
> -      if (STMT_VINFO_RELATED_STMT (dest_stmt_info))
> -       dest_stmt_info = STMT_VINFO_RELATED_STMT (dest_stmt_info);
> -
> +       = vect_orig_stmt (SLP_TREE_SCALAR_STMTS (slp_node)[group_size - 1]);
>        scalar_dest = gimple_assign_lhs (dest_stmt_info->stmt);
>        group_size = 1;
>      }
> @@ -7898,10 +7886,8 @@ vectorizable_live_operation (stmt_vec_in
>        return true;
>      }
>
> -  /* If stmt has a related stmt, then use that for getting the lhs.  */
> -  gimple *stmt = (is_pattern_stmt_p (stmt_info)
> -                 ? STMT_VINFO_RELATED_STMT (stmt_info)->stmt
> -                 : stmt_info->stmt);
> +  /* Use the lhs of the original scalar statement.  */
> +  gimple *stmt = vect_orig_stmt (stmt_info)->stmt;
>
>    lhs = (is_a <gphi *> (stmt)) ? gimple_phi_result (stmt)
>         : gimple_get_lhs (stmt);
> Index: gcc/tree-vect-slp.c
> ===================================================================
> --- gcc/tree-vect-slp.c 2018-07-30 12:32:22.714567210 +0100
> +++ gcc/tree-vect-slp.c 2018-07-30 12:32:26.218536339 +0100
> @@ -1848,8 +1848,7 @@ vect_find_last_scalar_stmt_in_slp (slp_t
>
>    for (int i = 0; SLP_TREE_SCALAR_STMTS (node).iterate (i, &stmt_vinfo); i++)
>      {
> -      if (is_pattern_stmt_p (stmt_vinfo))
> -       stmt_vinfo = STMT_VINFO_RELATED_STMT (stmt_vinfo);
> +      stmt_vinfo = vect_orig_stmt (stmt_vinfo);
>        last = last ? get_later_stmt (stmt_vinfo, last) : stmt_vinfo;
>      }
>
> @@ -2314,10 +2313,7 @@ vect_detect_hybrid_slp_stmts (slp_tree n
>        gcc_checking_assert (PURE_SLP_STMT (stmt_vinfo));
>        /* If we get a pattern stmt here we have to use the LHS of the
>           original stmt for immediate uses.  */
> -      gimple *stmt = stmt_vinfo->stmt;
> -      if (! STMT_VINFO_IN_PATTERN_P (stmt_vinfo)
> -         && STMT_VINFO_RELATED_STMT (stmt_vinfo))
> -       stmt = STMT_VINFO_RELATED_STMT (stmt_vinfo)->stmt;
> +      gimple *stmt = vect_orig_stmt (stmt_vinfo)->stmt;
>        tree def;
>        if (gimple_code (stmt) == GIMPLE_PHI)
>         def = gimple_phi_result (stmt);
> @@ -4087,8 +4083,7 @@ vect_schedule_slp (vec_info *vinfo)
>           if (!STMT_VINFO_DATA_REF (store_info))
>             break;
>
> -         if (is_pattern_stmt_p (store_info))
> -           store_info = STMT_VINFO_RELATED_STMT (store_info);
> +         store_info = vect_orig_stmt (store_info);
>           /* Free the attached stmt_vec_info and remove the stmt.  */
>           vinfo->remove_stmt (store_info);
>          }
> Index: gcc/tree-vect-stmts.c
> ===================================================================
> --- gcc/tree-vect-stmts.c       2018-07-30 12:32:22.718567174 +0100
> +++ gcc/tree-vect-stmts.c       2018-07-30 12:32:26.218536339 +0100
> @@ -3628,8 +3628,7 @@ vectorizable_call (stmt_vec_info stmt_in
>    if (slp_node)
>      return true;
>
> -  if (is_pattern_stmt_p (stmt_info))
> -    stmt_info = STMT_VINFO_RELATED_STMT (stmt_info);
> +  stmt_info = vect_orig_stmt (stmt_info);
>    lhs = gimple_get_lhs (stmt_info->stmt);
>
>    gassign *new_stmt
> @@ -4364,10 +4363,7 @@ vectorizable_simd_clone_call (stmt_vec_i
>    if (scalar_dest)
>      {
>        type = TREE_TYPE (scalar_dest);
> -      if (is_pattern_stmt_p (stmt_info))
> -       lhs = gimple_call_lhs (STMT_VINFO_RELATED_STMT (stmt_info)->stmt);
> -      else
> -       lhs = gimple_call_lhs (stmt);
> +      lhs = gimple_call_lhs (vect_orig_stmt (stmt_info)->stmt);
>        new_stmt = gimple_build_assign (lhs, build_zero_cst (type));
>      }
>    else
> @@ -9843,8 +9839,7 @@ vect_remove_stores (stmt_vec_info first_
>    while (next_stmt_info)
>      {
>        stmt_vec_info tmp = DR_GROUP_NEXT_ELEMENT (next_stmt_info);
> -      if (is_pattern_stmt_p (next_stmt_info))
> -       next_stmt_info = STMT_VINFO_RELATED_STMT (next_stmt_info);
> +      next_stmt_info = vect_orig_stmt (next_stmt_info);
>        /* Free the attached stmt_vec_info and remove the stmt.  */
>        vinfo->remove_stmt (next_stmt_info);
>        next_stmt_info = tmp;

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [05/11] Add a vect_stmt_to_vectorize helper function
  2018-07-30 11:39 ` [05/11] Add a vect_stmt_to_vectorize helper function Richard Sandiford
@ 2018-08-01 12:51   ` Richard Biener
  0 siblings, 0 replies; 20+ messages in thread
From: Richard Biener @ 2018-08-01 12:51 UTC (permalink / raw)
  To: GCC Patches, richard.sandiford

On Mon, Jul 30, 2018 at 1:39 PM Richard Sandiford
<richard.sandiford@arm.com> wrote:
>
> This patch adds a helper that does the opposite of vect_orig_stmt:
> go from the original scalar statement to the statement that should
> actually be vectorised.
>
> The uses in the last two hunks of vectorizable_reduction are there because
> reduc_stmt_info (first hunk) and stmt_info (second hunk) are already
> pattern statements if appropriate.

OK.

Richard.

>
> 2018-07-30  Richard Sandiford  <richard.sandiford@arm.com>
>
> gcc/
>         * tree-vectorizer.h (vect_stmt_to_vectorize): New function.
>         * tree-vect-loop.c (vect_update_vf_for_slp): Use it.
>         (vectorizable_reduction): Likewise.
>         * tree-vect-slp.c (vect_analyze_slp_instance): Likewise.
>         (vect_detect_hybrid_slp_stmts): Likewise.
>         * tree-vect-stmts.c (vect_is_simple_use): Likewise.
>
> Index: gcc/tree-vectorizer.h
> ===================================================================
> --- gcc/tree-vectorizer.h       2018-07-30 12:32:26.218536339 +0100
> +++ gcc/tree-vectorizer.h       2018-07-30 12:32:29.586506669 +0100
> @@ -1131,6 +1131,17 @@ vect_orig_stmt (stmt_vec_info stmt_info)
>    return stmt_info;
>  }
>
> +/* If STMT_INFO has been replaced by a pattern statement, return the
> +   replacement statement, otherwise return STMT_INFO itself.  */
> +
> +inline stmt_vec_info
> +vect_stmt_to_vectorize (stmt_vec_info stmt_info)
> +{
> +  if (STMT_VINFO_IN_PATTERN_P (stmt_info))
> +    return STMT_VINFO_RELATED_STMT (stmt_info);
> +  return stmt_info;
> +}
> +
>  /* Return true if BB is a loop header.  */
>
>  static inline bool
> Index: gcc/tree-vect-loop.c
> ===================================================================
> --- gcc/tree-vect-loop.c        2018-07-30 12:32:26.214536374 +0100
> +++ gcc/tree-vect-loop.c        2018-07-30 12:32:29.586506669 +0100
> @@ -1424,9 +1424,7 @@ vect_update_vf_for_slp (loop_vec_info lo
>            gsi_next (&si))
>         {
>           stmt_vec_info stmt_info = loop_vinfo->lookup_stmt (gsi_stmt (si));
> -         if (STMT_VINFO_IN_PATTERN_P (stmt_info)
> -             && STMT_VINFO_RELATED_STMT (stmt_info))
> -           stmt_info = STMT_VINFO_RELATED_STMT (stmt_info);
> +         stmt_info = vect_stmt_to_vectorize (stmt_info);
>           if ((STMT_VINFO_RELEVANT_P (stmt_info)
>                || VECTORIZABLE_CYCLE_DEF (STMT_VINFO_DEF_TYPE (stmt_info)))
>               && !PURE_SLP_STMT (stmt_info))
> @@ -6111,8 +6109,7 @@ vectorizable_reduction (stmt_vec_info st
>         return true;
>
>        stmt_vec_info reduc_stmt_info = STMT_VINFO_REDUC_DEF (stmt_info);
> -      if (STMT_VINFO_IN_PATTERN_P (reduc_stmt_info))
> -       reduc_stmt_info = STMT_VINFO_RELATED_STMT (reduc_stmt_info);
> +      reduc_stmt_info = vect_stmt_to_vectorize (reduc_stmt_info);
>
>        if (STMT_VINFO_VEC_REDUCTION_TYPE (reduc_stmt_info)
>           == EXTRACT_LAST_REDUCTION)
> @@ -6145,8 +6142,7 @@ vectorizable_reduction (stmt_vec_info st
>        if (ncopies > 1
>           && STMT_VINFO_RELEVANT (reduc_stmt_info) <= vect_used_only_live
>           && (use_stmt_info = loop_vinfo->lookup_single_use (phi_result))
> -         && (use_stmt_info == reduc_stmt_info
> -             || STMT_VINFO_RELATED_STMT (use_stmt_info) == reduc_stmt_info))
> +         && vect_stmt_to_vectorize (use_stmt_info) == reduc_stmt_info)
>         single_defuse_cycle = true;
>
>        /* Create the destination vector  */
> @@ -6915,8 +6911,7 @@ vectorizable_reduction (stmt_vec_info st
>    if (ncopies > 1
>        && (STMT_VINFO_RELEVANT (stmt_info) <= vect_used_only_live)
>        && (use_stmt_info = loop_vinfo->lookup_single_use (reduc_phi_result))
> -      && (use_stmt_info == stmt_info
> -         || STMT_VINFO_RELATED_STMT (use_stmt_info) == stmt_info))
> +      && vect_stmt_to_vectorize (use_stmt_info) == stmt_info)
>      {
>        single_defuse_cycle = true;
>        epilog_copies = 1;
> Index: gcc/tree-vect-slp.c
> ===================================================================
> --- gcc/tree-vect-slp.c 2018-07-30 12:32:26.218536339 +0100
> +++ gcc/tree-vect-slp.c 2018-07-30 12:32:29.586506669 +0100
> @@ -1969,11 +1969,7 @@ vect_analyze_slp_instance (vec_info *vin
>        /* Collect the stores and store them in SLP_TREE_SCALAR_STMTS.  */
>        while (next_info)
>          {
> -         if (STMT_VINFO_IN_PATTERN_P (next_info)
> -             && STMT_VINFO_RELATED_STMT (next_info))
> -           scalar_stmts.safe_push (STMT_VINFO_RELATED_STMT (next_info));
> -         else
> -           scalar_stmts.safe_push (next_info);
> +         scalar_stmts.safe_push (vect_stmt_to_vectorize (next_info));
>           next_info = DR_GROUP_NEXT_ELEMENT (next_info);
>          }
>      }
> @@ -1983,11 +1979,7 @@ vect_analyze_slp_instance (vec_info *vin
>          SLP_TREE_SCALAR_STMTS.  */
>        while (next_info)
>          {
> -         if (STMT_VINFO_IN_PATTERN_P (next_info)
> -             && STMT_VINFO_RELATED_STMT (next_info))
> -           scalar_stmts.safe_push (STMT_VINFO_RELATED_STMT (next_info));
> -         else
> -           scalar_stmts.safe_push (next_info);
> +         scalar_stmts.safe_push (vect_stmt_to_vectorize (next_info));
>           next_info = REDUC_GROUP_NEXT_ELEMENT (next_info);
>          }
>        /* Mark the first element of the reduction chain as reduction to properly
> @@ -2325,9 +2317,7 @@ vect_detect_hybrid_slp_stmts (slp_tree n
>             use_vinfo = loop_vinfo->lookup_stmt (use_stmt);
>             if (!use_vinfo)
>               continue;
> -           if (STMT_VINFO_IN_PATTERN_P (use_vinfo)
> -               && STMT_VINFO_RELATED_STMT (use_vinfo))
> -             use_vinfo = STMT_VINFO_RELATED_STMT (use_vinfo);
> +           use_vinfo = vect_stmt_to_vectorize (use_vinfo);
>             if (!STMT_SLP_TYPE (use_vinfo)
>                 && (STMT_VINFO_RELEVANT (use_vinfo)
>                     || VECTORIZABLE_CYCLE_DEF (STMT_VINFO_DEF_TYPE (use_vinfo)))
> Index: gcc/tree-vect-stmts.c
> ===================================================================
> --- gcc/tree-vect-stmts.c       2018-07-30 12:32:26.218536339 +0100
> +++ gcc/tree-vect-stmts.c       2018-07-30 12:32:29.586506669 +0100
> @@ -10031,11 +10031,8 @@ vect_is_simple_use (tree operand, vec_in
>         *dt = vect_external_def;
>        else
>         {
> -         if (STMT_VINFO_IN_PATTERN_P (stmt_vinfo))
> -           {
> -             stmt_vinfo = STMT_VINFO_RELATED_STMT (stmt_vinfo);
> -             def_stmt = stmt_vinfo->stmt;
> -           }
> +         stmt_vinfo = vect_stmt_to_vectorize (stmt_vinfo);
> +         def_stmt = stmt_vinfo->stmt;
>           switch (gimple_code (def_stmt))
>             {
>             case GIMPLE_PHI:

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [06/11] Handle VMAT_INVARIANT separately
  2018-07-30 11:41 ` [06/11] Handle VMAT_INVARIANT separately Richard Sandiford
@ 2018-08-01 12:52   ` Richard Biener
  0 siblings, 0 replies; 20+ messages in thread
From: Richard Biener @ 2018-08-01 12:52 UTC (permalink / raw)
  To: GCC Patches, richard.sandiford

On Mon, Jul 30, 2018 at 1:41 PM Richard Sandiford
<richard.sandiford@arm.com> wrote:
>
> Invariant loads were handled as a variation on the code for contiguous
> loads.  We detected whether they were invariant or not as a byproduct of
> creating the vector pointer ivs: vect_create_data_ref_ptr passed back an
> inv_p to say whether the pointer was invariant.
>
> But vectorised invariant loads just keep the original scalar load,
> so this meant that detecting invariant loads had the side-effect of
> creating an unwanted vector pointer iv.  The placement of the code
> also meant that we'd create a vector load and then not use the result.
> In principle this is wrong code, since there's no guarantee that there's
> a vector's worth of accessible data at that address, but we rely on DCE
> to get rid of the load before any harm is done.
>
> E.g., for an invariant load in an inner loop (which seems like the more
> common use case for this code), we'd create:
>
>    vectp_a.6_52 = &a + 4;
>
>    # vectp_a.5_53 = PHI <vectp_a.5_54(9), vectp_a.6_52(2)>
>
>    # vectp_a.5_55 = PHI <vectp_a.5_53(3), vectp_a.5_56(10)>
>
>    vect_next_a_11.7_57 = MEM[(int *)vectp_a.5_55];
>    next_a_11 = a[_1];
>    vect_cst__58 = {next_a_11, next_a_11, next_a_11, next_a_11};
>
>    vectp_a.5_56 = vectp_a.5_55 + 4;
>
>    vectp_a.5_54 = vectp_a.5_53 + 0;
>
> whereas all we want is:
>
>    next_a_11 = a[_1];
>    vect_cst__58 = {next_a_11, next_a_11, next_a_11, next_a_11};
>
> This patch moves the handling to its own block and makes
> vect_create_data_ref_ptr assert (when creating a full iv) that the
> address isn't invariant.
>
> The ncopies handling is unfortunate, but a preexisting issue.
> Richi's suggestion of using a vector of vector statements would
> let us reuse one statement for all copies.

OK.

Richard.

>
> 2018-07-30  Richard Sandiford  <richard.sandiford@arm.com>
>
> gcc/
>         * tree-vectorizer.h (vect_create_data_ref_ptr): Remove inv_p
>         parameter.
>         * tree-vect-data-refs.c (vect_create_data_ref_ptr): Likewise.
>         When creating an iv, assert that the step is not known to be zero.
>         (vect_setup_realignment): Update call accordingly.
>         * tree-vect-stmts.c (vectorizable_store): Likewise.
>         (vectorizable_load): Likewise.  Handle VMAT_INVARIANT separately.
>
> Index: gcc/tree-vectorizer.h
> ===================================================================
> *** gcc/tree-vectorizer.h       2018-07-30 12:32:29.586506669 +0100
> --- gcc/tree-vectorizer.h       2018-07-30 12:40:13.000000000 +0100
> *************** extern bool vect_analyze_data_refs (vec_
> *** 1527,1533 ****
>   extern void vect_record_base_alignments (vec_info *);
>   extern tree vect_create_data_ref_ptr (stmt_vec_info, tree, struct loop *, tree,
>                                       tree *, gimple_stmt_iterator *,
> !                                     gimple **, bool, bool *,
>                                       tree = NULL_TREE, tree = NULL_TREE);
>   extern tree bump_vector_ptr (tree, gimple *, gimple_stmt_iterator *,
>                              stmt_vec_info, tree);
> --- 1527,1533 ----
>   extern void vect_record_base_alignments (vec_info *);
>   extern tree vect_create_data_ref_ptr (stmt_vec_info, tree, struct loop *, tree,
>                                       tree *, gimple_stmt_iterator *,
> !                                     gimple **, bool,
>                                       tree = NULL_TREE, tree = NULL_TREE);
>   extern tree bump_vector_ptr (tree, gimple *, gimple_stmt_iterator *,
>                              stmt_vec_info, tree);
> Index: gcc/tree-vect-data-refs.c
> ===================================================================
> *** gcc/tree-vect-data-refs.c   2018-07-30 12:32:26.214536374 +0100
> --- gcc/tree-vect-data-refs.c   2018-07-30 12:32:32.546480596 +0100
> *************** vect_create_addr_base_for_vector_ref (st
> *** 4674,4689 ****
>
>         Return the increment stmt that updates the pointer in PTR_INCR.
>
> !    3. Set INV_P to true if the access pattern of the data reference in the
> !       vectorized loop is invariant.  Set it to false otherwise.
> !
> !    4. Return the pointer.  */
>
>   tree
>   vect_create_data_ref_ptr (stmt_vec_info stmt_info, tree aggr_type,
>                           struct loop *at_loop, tree offset,
>                           tree *initial_address, gimple_stmt_iterator *gsi,
> !                         gimple **ptr_incr, bool only_init, bool *inv_p,
>                           tree byte_offset, tree iv_step)
>   {
>     const char *base_name;
> --- 4674,4686 ----
>
>         Return the increment stmt that updates the pointer in PTR_INCR.
>
> !    3. Return the pointer.  */
>
>   tree
>   vect_create_data_ref_ptr (stmt_vec_info stmt_info, tree aggr_type,
>                           struct loop *at_loop, tree offset,
>                           tree *initial_address, gimple_stmt_iterator *gsi,
> !                         gimple **ptr_incr, bool only_init,
>                           tree byte_offset, tree iv_step)
>   {
>     const char *base_name;
> *************** vect_create_data_ref_ptr (stmt_vec_info
> *** 4705,4711 ****
>     bool insert_after;
>     tree indx_before_incr, indx_after_incr;
>     gimple *incr;
> -   tree step;
>     bb_vec_info bb_vinfo = STMT_VINFO_BB_VINFO (stmt_info);
>
>     gcc_assert (iv_step != NULL_TREE
> --- 4702,4707 ----
> *************** vect_create_data_ref_ptr (stmt_vec_info
> *** 4726,4739 ****
>         *ptr_incr = NULL;
>       }
>
> -   /* Check the step (evolution) of the load in LOOP, and record
> -      whether it's invariant.  */
> -   step = vect_dr_behavior (dr_info)->step;
> -   if (integer_zerop (step))
> -     *inv_p = true;
> -   else
> -     *inv_p = false;
> -
>     /* Create an expression for the first address accessed by this load
>        in LOOP.  */
>     base_name = get_name (DR_BASE_ADDRESS (dr));
> --- 4722,4727 ----
> *************** vect_create_data_ref_ptr (stmt_vec_info
> *** 4849,4863 ****
>       aptr = aggr_ptr_init;
>     else
>       {
>         if (iv_step == NULL_TREE)
>         {
> !         /* The step of the aggregate pointer is the type size.  */
>           iv_step = TYPE_SIZE_UNIT (aggr_type);
> !         /* One exception to the above is when the scalar step of the load in
> !            LOOP is zero. In this case the step here is also zero.  */
> !         if (*inv_p)
> !           iv_step = size_zero_node;
> !         else if (tree_int_cst_sgn (step) == -1)
>             iv_step = fold_build1 (NEGATE_EXPR, TREE_TYPE (iv_step), iv_step);
>         }
>
> --- 4837,4853 ----
>       aptr = aggr_ptr_init;
>     else
>       {
> +       /* Accesses to invariant addresses should be handled specially
> +        by the caller.  */
> +       tree step = vect_dr_behavior (dr_info)->step;
> +       gcc_assert (!integer_zerop (step));
> +
>         if (iv_step == NULL_TREE)
>         {
> !         /* The step of the aggregate pointer is the type size,
> !            negated for downward accesses.  */
>           iv_step = TYPE_SIZE_UNIT (aggr_type);
> !         if (tree_int_cst_sgn (step) == -1)
>             iv_step = fold_build1 (NEGATE_EXPR, TREE_TYPE (iv_step), iv_step);
>         }
>
> *************** vect_setup_realignment (stmt_vec_info st
> *** 5462,5468 ****
>     gphi *phi_stmt;
>     tree msq = NULL_TREE;
>     gimple_seq stmts = NULL;
> -   bool inv_p;
>     bool compute_in_loop = false;
>     bool nested_in_vect_loop = false;
>     struct loop *containing_loop = (gimple_bb (stmt_info->stmt))->loop_father;
> --- 5452,5457 ----
> *************** vect_setup_realignment (stmt_vec_info st
> *** 5556,5562 ****
>         vec_dest = vect_create_destination_var (scalar_dest, vectype);
>         ptr = vect_create_data_ref_ptr (stmt_info, vectype,
>                                       loop_for_initial_load, NULL_TREE,
> !                                     &init_addr, NULL, &inc, true, &inv_p);
>         if (TREE_CODE (ptr) == SSA_NAME)
>         new_temp = copy_ssa_name (ptr);
>         else
> --- 5545,5551 ----
>         vec_dest = vect_create_destination_var (scalar_dest, vectype);
>         ptr = vect_create_data_ref_ptr (stmt_info, vectype,
>                                       loop_for_initial_load, NULL_TREE,
> !                                     &init_addr, NULL, &inc, true);
>         if (TREE_CODE (ptr) == SSA_NAME)
>         new_temp = copy_ssa_name (ptr);
>         else
> Index: gcc/tree-vect-stmts.c
> ===================================================================
> *** gcc/tree-vect-stmts.c       2018-07-30 12:32:29.586506669 +0100
> --- gcc/tree-vect-stmts.c       2018-07-30 12:40:14.000000000 +0100
> *************** vectorizable_store (stmt_vec_info stmt_i
> *** 6254,6260 ****
>     unsigned int group_size, i;
>     vec<tree> oprnds = vNULL;
>     vec<tree> result_chain = vNULL;
> -   bool inv_p;
>     tree offset = NULL_TREE;
>     vec<tree> vec_oprnds = vNULL;
>     bool slp = (slp_node != NULL);
> --- 6254,6259 ----
> *************** vectorizable_store (stmt_vec_info stmt_i
> *** 7018,7039 ****
>             {
>               dataref_ptr = unshare_expr (DR_BASE_ADDRESS (first_dr_info->dr));
>               dataref_offset = build_int_cst (ref_type, 0);
> -             inv_p = false;
>             }
>           else if (STMT_VINFO_GATHER_SCATTER_P (stmt_info))
> !           {
> !             vect_get_gather_scatter_ops (loop, stmt_info, &gs_info,
> !                                          &dataref_ptr, &vec_offset);
> !             inv_p = false;
> !           }
>           else
>             dataref_ptr
>               = vect_create_data_ref_ptr (first_stmt_info, aggr_type,
>                                           simd_lane_access_p ? loop : NULL,
>                                           offset, &dummy, gsi, &ptr_incr,
> !                                         simd_lane_access_p, &inv_p,
> !                                         NULL_TREE, bump);
> !         gcc_assert (bb_vinfo || !inv_p);
>         }
>         else
>         {
> --- 7017,7032 ----
>             {
>               dataref_ptr = unshare_expr (DR_BASE_ADDRESS (first_dr_info->dr));
>               dataref_offset = build_int_cst (ref_type, 0);
>             }
>           else if (STMT_VINFO_GATHER_SCATTER_P (stmt_info))
> !           vect_get_gather_scatter_ops (loop, stmt_info, &gs_info,
> !                                        &dataref_ptr, &vec_offset);
>           else
>             dataref_ptr
>               = vect_create_data_ref_ptr (first_stmt_info, aggr_type,
>                                           simd_lane_access_p ? loop : NULL,
>                                           offset, &dummy, gsi, &ptr_incr,
> !                                         simd_lane_access_p, NULL_TREE, bump);
>         }
>         else
>         {
> *************** vectorizable_load (stmt_vec_info stmt_in
> *** 7419,7425 ****
>     bool grouped_load = false;
>     stmt_vec_info first_stmt_info;
>     stmt_vec_info first_stmt_info_for_drptr = NULL;
> -   bool inv_p;
>     bool compute_in_loop = false;
>     struct loop *at_loop;
>     int vec_num;
> --- 7412,7417 ----
> *************** vectorizable_load (stmt_vec_info stmt_in
> *** 7669,7674 ****
> --- 7661,7723 ----
>         return true;
>       }
>
> +   if (memory_access_type == VMAT_INVARIANT)
> +     {
> +       gcc_assert (!grouped_load && !mask && !bb_vinfo);
> +       /* If we have versioned for aliasing or the loop doesn't
> +        have any data dependencies that would preclude this,
> +        then we are sure this is a loop invariant load and
> +        thus we can insert it on the preheader edge.  */
> +       bool hoist_p = (LOOP_VINFO_NO_DATA_DEPENDENCIES (loop_vinfo)
> +                     && !nested_in_vect_loop
> +                     && hoist_defs_of_uses (stmt_info, loop));
> +       if (hoist_p)
> +       {
> +         gassign *stmt = as_a <gassign *> (stmt_info->stmt);
> +         if (dump_enabled_p ())
> +           {
> +             dump_printf_loc (MSG_NOTE, vect_location,
> +                              "hoisting out of the vectorized loop: ");
> +             dump_gimple_stmt (MSG_NOTE, TDF_SLIM, stmt, 0);
> +           }
> +         scalar_dest = copy_ssa_name (scalar_dest);
> +         tree rhs = unshare_expr (gimple_assign_rhs1 (stmt));
> +         gsi_insert_on_edge_immediate
> +           (loop_preheader_edge (loop),
> +            gimple_build_assign (scalar_dest, rhs));
> +       }
> +       /* These copies are all equivalent, but currently the representation
> +        requires a separate STMT_VINFO_VEC_STMT for each one.  */
> +       prev_stmt_info = NULL;
> +       gimple_stmt_iterator gsi2 = *gsi;
> +       gsi_next (&gsi2);
> +       for (j = 0; j < ncopies; j++)
> +       {
> +         stmt_vec_info new_stmt_info;
> +         if (hoist_p)
> +           {
> +             new_temp = vect_init_vector (stmt_info, scalar_dest,
> +                                          vectype, NULL);
> +             gimple *new_stmt = SSA_NAME_DEF_STMT (new_temp);
> +             new_stmt_info = vinfo->add_stmt (new_stmt);
> +           }
> +         else
> +           {
> +             new_temp = vect_init_vector (stmt_info, scalar_dest,
> +                                          vectype, &gsi2);
> +             new_stmt_info = vinfo->lookup_def (new_temp);
> +           }
> +         if (slp)
> +           SLP_TREE_VEC_STMTS (slp_node).quick_push (new_stmt_info);
> +         else if (j == 0)
> +           STMT_VINFO_VEC_STMT (stmt_info) = *vec_stmt = new_stmt_info;
> +         else
> +           STMT_VINFO_RELATED_STMT (prev_stmt_info) = new_stmt_info;
> +         prev_stmt_info = new_stmt_info;
> +       }
> +       return true;
> +     }
> +
>     if (memory_access_type == VMAT_ELEMENTWISE
>         || memory_access_type == VMAT_STRIDED_SLP)
>       {
> *************** vectorizable_load (stmt_vec_info stmt_in
> *** 8177,8183 ****
>             {
>               dataref_ptr = unshare_expr (DR_BASE_ADDRESS (first_dr_info->dr));
>               dataref_offset = build_int_cst (ref_type, 0);
> -             inv_p = false;
>             }
>           else if (first_stmt_info_for_drptr
>                    && first_stmt_info != first_stmt_info_for_drptr)
> --- 8226,8231 ----
> *************** vectorizable_load (stmt_vec_info stmt_in
> *** 8186,8192 ****
>                 = vect_create_data_ref_ptr (first_stmt_info_for_drptr,
>                                             aggr_type, at_loop, offset, &dummy,
>                                             gsi, &ptr_incr, simd_lane_access_p,
> !                                           &inv_p, byte_offset, bump);
>               /* Adjust the pointer by the difference to first_stmt.  */
>               data_reference_p ptrdr
>                 = STMT_VINFO_DATA_REF (first_stmt_info_for_drptr);
> --- 8234,8240 ----
>                 = vect_create_data_ref_ptr (first_stmt_info_for_drptr,
>                                             aggr_type, at_loop, offset, &dummy,
>                                             gsi, &ptr_incr, simd_lane_access_p,
> !                                           byte_offset, bump);
>               /* Adjust the pointer by the difference to first_stmt.  */
>               data_reference_p ptrdr
>                 = STMT_VINFO_DATA_REF (first_stmt_info_for_drptr);
> *************** vectorizable_load (stmt_vec_info stmt_in
> *** 8199,8214 ****
>                                              stmt_info, diff);
>             }
>           else if (STMT_VINFO_GATHER_SCATTER_P (stmt_info))
> !           {
> !             vect_get_gather_scatter_ops (loop, stmt_info, &gs_info,
> !                                          &dataref_ptr, &vec_offset);
> !             inv_p = false;
> !           }
>           else
>             dataref_ptr
>               = vect_create_data_ref_ptr (first_stmt_info, aggr_type, at_loop,
>                                           offset, &dummy, gsi, &ptr_incr,
> !                                         simd_lane_access_p, &inv_p,
>                                           byte_offset, bump);
>           if (mask)
>             vec_mask = vect_get_vec_def_for_operand (mask, stmt_info,
> --- 8247,8259 ----
>                                              stmt_info, diff);
>             }
>           else if (STMT_VINFO_GATHER_SCATTER_P (stmt_info))
> !           vect_get_gather_scatter_ops (loop, stmt_info, &gs_info,
> !                                        &dataref_ptr, &vec_offset);
>           else
>             dataref_ptr
>               = vect_create_data_ref_ptr (first_stmt_info, aggr_type, at_loop,
>                                           offset, &dummy, gsi, &ptr_incr,
> !                                         simd_lane_access_p,
>                                           byte_offset, bump);
>           if (mask)
>             vec_mask = vect_get_vec_def_for_operand (mask, stmt_info,
> *************** vectorizable_load (stmt_vec_info stmt_in
> *** 8492,8538 ****
>                     }
>                 }
>
> -             /* 4. Handle invariant-load.  */
> -             if (inv_p && !bb_vinfo)
> -               {
> -                 gcc_assert (!grouped_load);
> -                 /* If we have versioned for aliasing or the loop doesn't
> -                    have any data dependencies that would preclude this,
> -                    then we are sure this is a loop invariant load and
> -                    thus we can insert it on the preheader edge.  */
> -                 if (LOOP_VINFO_NO_DATA_DEPENDENCIES (loop_vinfo)
> -                     && !nested_in_vect_loop
> -                     && hoist_defs_of_uses (stmt_info, loop))
> -                   {
> -                     gassign *stmt = as_a <gassign *> (stmt_info->stmt);
> -                     if (dump_enabled_p ())
> -                       {
> -                         dump_printf_loc (MSG_NOTE, vect_location,
> -                                          "hoisting out of the vectorized "
> -                                          "loop: ");
> -                         dump_gimple_stmt (MSG_NOTE, TDF_SLIM, stmt, 0);
> -                       }
> -                     tree tem = copy_ssa_name (scalar_dest);
> -                     gsi_insert_on_edge_immediate
> -                       (loop_preheader_edge (loop),
> -                        gimple_build_assign (tem,
> -                                             unshare_expr
> -                                               (gimple_assign_rhs1 (stmt))));
> -                     new_temp = vect_init_vector (stmt_info, tem,
> -                                                  vectype, NULL);
> -                     new_stmt = SSA_NAME_DEF_STMT (new_temp);
> -                     new_stmt_info = vinfo->add_stmt (new_stmt);
> -                   }
> -                 else
> -                   {
> -                     gimple_stmt_iterator gsi2 = *gsi;
> -                     gsi_next (&gsi2);
> -                     new_temp = vect_init_vector (stmt_info, scalar_dest,
> -                                                  vectype, &gsi2);
> -                     new_stmt_info = vinfo->lookup_def (new_temp);
> -                   }
> -               }
> -
>               if (memory_access_type == VMAT_CONTIGUOUS_REVERSE)
>                 {
>                   tree perm_mask = perm_mask_for_reverse (vectype);
> --- 8537,8542 ----
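
As a concrete source-level illustration of the inner-loop invariant load
discussed above: in the sketch below, a[j] does not depend on the inner
induction variable i, so when the inner loop is vectorized only the scalar
load should remain (hoisted to the preheader when that is safe), with the
loaded value broadcast into a vector.  The function and array names are
made up for illustration, not taken from the GCC testsuite.

  void
  f (int *__restrict out, int *__restrict a, int n, int m)
  {
    for (int j = 0; j < m; ++j)
      for (int i = 0; i < n; ++i)
        /* a[j] is invariant in the i loop.  */
        out[j * n + i] = a[j];
  }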

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [07/11] Use single basic block array in loop_vec_info
  2018-07-30 11:42 ` [07/11] Use single basic block array in loop_vec_info Richard Sandiford
@ 2018-08-01 12:58   ` Richard Biener
  0 siblings, 0 replies; 20+ messages in thread
From: Richard Biener @ 2018-08-01 12:58 UTC (permalink / raw)
  To: GCC Patches, richard.sandiford

On Mon, Jul 30, 2018 at 1:42 PM Richard Sandiford
<richard.sandiford@arm.com> wrote:
>
> _loop_vec_info::_loop_vec_info used get_loop_body to get the
> order of the blocks when creating stmt_vec_infos, but then used
> dfs_enumerate_from to get the order of the blocks that the rest
> of the vectoriser uses.  We should be able to use that order
> for creating stmt_vec_infos too.

OK.  Note I have rev_post_order_and_mark_dfs_back_seme for a patch I'm working
on (RPO order on a single-entry multiple-exit region).  I'll try to
remember that "fixme".

Richard.

>
> 2018-07-30  Richard Sandiford  <richard.sandiford@arm.com>
>
> gcc/
>         * tree-vect-loop.c (_loop_vec_info::_loop_vec_info): Use the
>         result of dfs_enumerate_from when constructing stmt_vec_infos,
>         instead of additionally calling get_loop_body.
>
> Index: gcc/tree-vect-loop.c
> ===================================================================
> *** gcc/tree-vect-loop.c        2018-07-30 12:40:59.366015643 +0100
> --- gcc/tree-vect-loop.c        2018-07-30 12:40:59.362015678 +0100
> *************** _loop_vec_info::_loop_vec_info (struct l
> *** 834,844 ****
>       scalar_loop (NULL),
>       orig_loop_info (NULL)
>   {
> !   /* Create/Update stmt_info for all stmts in the loop.  */
> !   basic_block *body = get_loop_body (loop);
> !   for (unsigned int i = 0; i < loop->num_nodes; i++)
>       {
> !       basic_block bb = body[i];
>         gimple_stmt_iterator si;
>
>         for (si = gsi_start_phis (bb); !gsi_end_p (si); gsi_next (&si))
> --- 834,851 ----
>       scalar_loop (NULL),
>       orig_loop_info (NULL)
>   {
> !   /* CHECKME: We want to visit all BBs before their successors (except for
> !      latch blocks, for which this assertion wouldn't hold).  In the simple
> !      case of the loop forms we allow, a dfs order of the BBs would the same
> !      as reversed postorder traversal, so we are safe.  */
> !
> !   unsigned int nbbs = dfs_enumerate_from (loop->header, 0, bb_in_loop_p,
> !                                         bbs, loop->num_nodes, loop);
> !   gcc_assert (nbbs == loop->num_nodes);
> !
> !   for (unsigned int i = 0; i < nbbs; i++)
>       {
> !       basic_block bb = bbs[i];
>         gimple_stmt_iterator si;
>
>         for (si = gsi_start_phis (bb); !gsi_end_p (si); gsi_next (&si))
> *************** _loop_vec_info::_loop_vec_info (struct l
> *** 855,870 ****
>           add_stmt (stmt);
>         }
>       }
> -   free (body);
> -
> -   /* CHECKME: We want to visit all BBs before their successors (except for
> -      latch blocks, for which this assertion wouldn't hold).  In the simple
> -      case of the loop forms we allow, a dfs order of the BBs would the same
> -      as reversed postorder traversal, so we are safe.  */
> -
> -   unsigned int nbbs = dfs_enumerate_from (loop->header, 0, bb_in_loop_p,
> -                                         bbs, loop->num_nodes, loop);
> -   gcc_assert (nbbs == loop->num_nodes);
>   }
>
>   /* Free all levels of MASKS.  */
> --- 862,867 ----

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [08/11] Make hoist_defs_of_uses use vec_info::lookup_def
  2018-07-30 11:43 ` [08/11] Make hoist_defs_of_uses use vec_info::lookup_def Richard Sandiford
@ 2018-08-01 13:01   ` Richard Biener
  0 siblings, 0 replies; 20+ messages in thread
From: Richard Biener @ 2018-08-01 13:01 UTC (permalink / raw)
  To: GCC Patches, richard.sandiford

On Mon, Jul 30, 2018 at 1:43 PM Richard Sandiford
<richard.sandiford@arm.com> wrote:
>
> This patch makes hoist_defs_of_uses use vec_info::lookup_def instead of:
>
>       if (!gimple_nop_p (def_stmt)
>           && flow_bb_inside_loop_p (loop, gimple_bb (def_stmt)))
>
> to test whether a feeding scalar statement needs to be hoisted out
> of the vectorised loop.  It isn't worth doing in its own right,
> but it's a prerequisite for the next patch, which needs to update
> the stmt_vec_infos of the hoisted statements.

OK.

>
> 2018-07-30  Richard Sandiford  <richard.sandiford@arm.com>
>
> gcc/
>         * tree-vect-stmts.c (hoist_defs_of_uses): Use vec_info::lookup_def
>         instead of gimple_nop_p and flow_bb_inside_loop_p to decide
>         whether a statement needs to be hoisted.
>
> Index: gcc/tree-vect-stmts.c
> ===================================================================
> *** gcc/tree-vect-stmts.c       2018-07-30 12:42:35.633169005 +0100
> --- gcc/tree-vect-stmts.c       2018-07-30 12:42:35.629169040 +0100
> *************** permute_vec_elements (tree x, tree y, tr
> *** 7322,7370 ****
>   static bool
>   hoist_defs_of_uses (stmt_vec_info stmt_info, struct loop *loop)
>   {
>     ssa_op_iter i;
>     tree op;
>     bool any = false;
>
>     FOR_EACH_SSA_TREE_OPERAND (op, stmt_info->stmt, i, SSA_OP_USE)
> !     {
> !       gimple *def_stmt = SSA_NAME_DEF_STMT (op);
> !       if (!gimple_nop_p (def_stmt)
> !         && flow_bb_inside_loop_p (loop, gimple_bb (def_stmt)))
> !       {
> !         /* Make sure we don't need to recurse.  While we could do
> !            so in simple cases when there are more complex use webs
> !            we don't have an easy way to preserve stmt order to fulfil
> !            dependencies within them.  */
> !         tree op2;
> !         ssa_op_iter i2;
> !         if (gimple_code (def_stmt) == GIMPLE_PHI)
>             return false;
> !         FOR_EACH_SSA_TREE_OPERAND (op2, def_stmt, i2, SSA_OP_USE)
> !           {
> !             gimple *def_stmt2 = SSA_NAME_DEF_STMT (op2);
> !             if (!gimple_nop_p (def_stmt2)
> !                 && flow_bb_inside_loop_p (loop, gimple_bb (def_stmt2)))
> !               return false;
> !           }
> !         any = true;
> !       }
> !     }
>
>     if (!any)
>       return true;
>
>     FOR_EACH_SSA_TREE_OPERAND (op, stmt_info->stmt, i, SSA_OP_USE)
> !     {
> !       gimple *def_stmt = SSA_NAME_DEF_STMT (op);
> !       if (!gimple_nop_p (def_stmt)
> !         && flow_bb_inside_loop_p (loop, gimple_bb (def_stmt)))
> !       {
> !         gimple_stmt_iterator gsi = gsi_for_stmt (def_stmt);
> !         gsi_remove (&gsi, false);
> !         gsi_insert_on_edge_immediate (loop_preheader_edge (loop), def_stmt);
> !       }
> !     }
>
>     return true;
>   }
> --- 7322,7360 ----
>   static bool
>   hoist_defs_of_uses (stmt_vec_info stmt_info, struct loop *loop)
>   {
> +   vec_info *vinfo = stmt_info->vinfo;
>     ssa_op_iter i;
>     tree op;
>     bool any = false;
>
>     FOR_EACH_SSA_TREE_OPERAND (op, stmt_info->stmt, i, SSA_OP_USE)
> !     if (stmt_vec_info def_stmt_info = vinfo->lookup_def (op))
> !       {
> !       /* Make sure we don't need to recurse.  While we could do
> !          so in simple cases when there are more complex use webs
> !          we don't have an easy way to preserve stmt order to fulfil
> !          dependencies within them.  */
> !       tree op2;
> !       ssa_op_iter i2;
> !       if (gimple_code (def_stmt_info->stmt) == GIMPLE_PHI)
> !         return false;
> !       FOR_EACH_SSA_TREE_OPERAND (op2, def_stmt_info->stmt, i2, SSA_OP_USE)
> !         if (vinfo->lookup_def (op2))
>             return false;
> !       any = true;
> !       }
>
>     if (!any)
>       return true;
>
>     FOR_EACH_SSA_TREE_OPERAND (op, stmt_info->stmt, i, SSA_OP_USE)
> !     if (stmt_vec_info def_stmt_info = vinfo->lookup_def (op))
> !       {
> !       gimple_stmt_iterator gsi = gsi_for_stmt (def_stmt_info->stmt);
> !       gsi_remove (&gsi, false);
> !       gsi_insert_on_edge_immediate (loop_preheader_edge (loop),
> !                                     def_stmt_info->stmt);
> !       }
>
>     return true;
>   }

^ permalink raw reply	[flat|nested] 20+ messages in thread

Thread overview: 20+ messages
2018-07-30 11:36 [00/11] Add a vec_basic_block of scalar statements Richard Sandiford
2018-07-30 11:37 ` [01/11] Schedule SLP earlier Richard Sandiford
2018-08-01 12:49   ` Richard Biener
2018-07-30 11:37 ` [02/11] Remove vect_schedule_slp return value Richard Sandiford
2018-08-01 12:49   ` Richard Biener
2018-07-30 11:38 ` [04/11] Add a vect_orig_stmt helper function Richard Sandiford
2018-08-01 12:50   ` Richard Biener
2018-07-30 11:38 ` [03/11] Remove vect_transform_stmt grouped_store argument Richard Sandiford
2018-08-01 12:49   ` Richard Biener
2018-07-30 11:39 ` [05/11] Add a vect_stmt_to_vectorize helper function Richard Sandiford
2018-08-01 12:51   ` Richard Biener
2018-07-30 11:41 ` [06/11] Handle VMAT_INVARIANT separately Richard Sandiford
2018-08-01 12:52   ` Richard Biener
2018-07-30 11:42 ` [07/11] Use single basic block array in loop_vec_info Richard Sandiford
2018-08-01 12:58   ` Richard Biener
2018-07-30 11:43 ` [08/11] Make hoist_defs_of_uses use vec_info::lookup_def Richard Sandiford
2018-08-01 13:01   ` Richard Biener
2018-07-30 11:46 ` [09/11] Add a vec_basic_block structure Richard Sandiford
2018-07-30 11:46 ` [10/11] Make the vectoriser do its own DCE Richard Sandiford
2018-07-30 11:47 ` [11/11] Insert pattern statements into vec_basic_blocks Richard Sandiford
