public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed
* [0/6] Make vector pattern statements less special
@ 2018-08-28 11:19 Richard Sandiford
  2018-08-28 11:20 ` [1/6] Handle gphis in gimple_get_lhs Richard Sandiford
                   ` (5 more replies)
  0 siblings, 6 replies; 22+ messages in thread
From: Richard Sandiford @ 2018-08-28 11:19 UTC (permalink / raw)
  To: gcc-patches

The goal of this series is to make vector pattern statements
less special compared to normal bb statements, and thus make
them easier to work with.  It picks up the tail end of:

  https://gcc.gnu.org/ml/gcc-patches/2018-07/msg01821.html

Patch 08/11 from that series turned out to be wrong, for the
reason shown by a test in 3/6.  The new series also contains a
couple of other bug fixes and three new patches (1/6, 2/6 and 6/6).

A side benefit is that it becomes as easy for the vectoriser to
do its own DCE as it is to do the half-DCE it currently does (4/6).
This might make:

          PUSH_INSERT_PASSES_WITHIN (pass_vectorize)
	      NEXT_PASS (pass_dce);
          POP_INSERT_PASSES ()

redundant -- I can experiment with that as a follow-on if the
series is OK.

The series is a prerequisite to supporting extending loads and
truncating stores.

Tested on aarch64-linux-gnu (with and without SVE), aarch64_be-elf
and x86_64-linux-gnu.  Also tested on SPEC2k6 and SPEC2017 for
SVE and x86_64-linux-gnu.

Richard

^ permalink raw reply	[flat|nested] 22+ messages in thread

* [1/6] Handle gphis in gimple_get_lhs
  2018-08-28 11:19 [0/6] Make vector pattern statements less special Richard Sandiford
@ 2018-08-28 11:20 ` Richard Sandiford
  2018-08-28 18:22   ` Jeff Law
  2018-08-28 11:21 ` [2/6] Make vec_info::lookup_single_use take a stmt_vec_info Richard Sandiford
                   ` (4 subsequent siblings)
  5 siblings, 1 reply; 22+ messages in thread
From: Richard Sandiford @ 2018-08-28 11:20 UTC (permalink / raw)
  To: gcc-patches

Several callers of gimple_get_lhs deal with statements that might
be phis.  This patch makes gimple_get_lhs handle that case too,
so that the callers don't have to.


2018-08-28  Richard Sandiford  <richard.sandiford@arm.com>

gcc/
	* gimple.c (gimple_get_lhs): Handle gphis.
	* tree-ssa-phionlycprop.c (get_lhs_or_phi_result): Delete and...
	(propagate_rhs_into_lhs, eliminate_const_or_copy): ...use
	gimple_get_lhs instead.
	* tree-ssa-dom.c (eliminate_redundant_computations): Don't handle
	phis specially before calling gimple_get_lhs.
	* tree-ssa-scopedtables.c (avail_exprs_stack::lookup_avail_expr):
	Likewise.
	* tree-vect-loop.c (vectorizable_live_operation): Likewise.
	* tree-vect-slp.c (vect_get_slp_vect_defs): Likewise.
	(vect_get_slp_defs): Likewise.
	* tree-vect-stmts.c (vect_get_vec_def_for_operand_1): Likewise.
	(vect_get_vec_def_for_stmt_copy, vect_transform_stmt): Likewise.

Index: gcc/gimple.c
===================================================================
--- gcc/gimple.c	2018-08-28 11:25:46.162880564 +0100
+++ gcc/gimple.c	2018-08-28 12:05:06.203027323 +0100
@@ -1757,12 +1757,12 @@ gimple_assign_set_rhs_with_ops (gimple_s
 tree
 gimple_get_lhs (const gimple *stmt)
 {
-  enum gimple_code code = gimple_code (stmt);
-
-  if (code == GIMPLE_ASSIGN)
-    return gimple_assign_lhs (stmt);
-  else if (code == GIMPLE_CALL)
-    return gimple_call_lhs (stmt);
+  if (const gphi *phi = dyn_cast <const gphi *> (stmt))
+    return gimple_phi_result (phi);
+  else if (const gassign *assign = dyn_cast <const gassign *> (stmt))
+    return gimple_assign_lhs (assign);
+  else if (const gcall *call = dyn_cast <const gcall *> (stmt))
+    return gimple_call_lhs (call);
   else
     return NULL_TREE;
 }
Index: gcc/tree-ssa-phionlycprop.c
===================================================================
--- gcc/tree-ssa-phionlycprop.c	2018-05-02 08:38:14.413364283 +0100
+++ gcc/tree-ssa-phionlycprop.c	2018-08-28 12:05:06.203027323 +0100
@@ -72,20 +72,6 @@ get_rhs_or_phi_arg (gimple *stmt)
 }
 
 
-/* Given a statement STMT, which is either a PHI node or an assignment,
-   return the "lhs" of the node.  */
-
-static tree
-get_lhs_or_phi_result (gimple *stmt)
-{
-  if (gimple_code (stmt) == GIMPLE_PHI)
-    return gimple_phi_result (stmt);
-  else if (is_gimple_assign (stmt))
-    return gimple_assign_lhs (stmt);
-  else
-    gcc_unreachable ();
-}
-
 /* Propagate RHS into all uses of LHS (when possible).
 
    RHS and LHS are derived from STMT, which is passed in solely so
@@ -186,7 +172,7 @@ propagate_rhs_into_lhs (gimple *stmt, tr
 		  print_gimple_stmt (dump_file, use_stmt, 0, dump_flags);
 		}
 
-	      result = get_lhs_or_phi_result (use_stmt);
+	      result = gimple_get_lhs (use_stmt);
 	      bitmap_set_bit (interesting_names, SSA_NAME_VERSION (result));
 	      continue;
 	    }
@@ -240,7 +226,7 @@ propagate_rhs_into_lhs (gimple *stmt, tr
               && (TREE_CODE (gimple_assign_rhs1 (use_stmt)) == SSA_NAME
                   || is_gimple_min_invariant (gimple_assign_rhs1 (use_stmt))))
             {
-	      tree result = get_lhs_or_phi_result (use_stmt);
+	      tree result = gimple_get_lhs (use_stmt);
 	      bitmap_set_bit (interesting_names, SSA_NAME_VERSION (result));
 	    }
 
@@ -345,7 +331,7 @@ propagate_rhs_into_lhs (gimple *stmt, tr
 eliminate_const_or_copy (gimple *stmt, bitmap interesting_names,
 			 bitmap need_eh_cleanup)
 {
-  tree lhs = get_lhs_or_phi_result (stmt);
+  tree lhs = gimple_get_lhs (stmt);
   tree rhs;
   int version = SSA_NAME_VERSION (lhs);
   bool cfg_altered = false;
Index: gcc/tree-ssa-dom.c
===================================================================
--- gcc/tree-ssa-dom.c	2018-08-28 11:25:45.842883317 +0100
+++ gcc/tree-ssa-dom.c	2018-08-28 12:05:06.203027323 +0100
@@ -1491,10 +1491,7 @@ eliminate_redundant_computations (gimple
 
   gimple *stmt = gsi_stmt (*gsi);
 
-  if (gimple_code (stmt) == GIMPLE_PHI)
-    def = gimple_phi_result (stmt);
-  else
-    def = gimple_get_lhs (stmt);
+  def = gimple_get_lhs (stmt);
 
   /* Certain expressions on the RHS can be optimized away, but can not
      themselves be entered into the hash tables.  */
Index: gcc/tree-ssa-scopedtables.c
===================================================================
--- gcc/tree-ssa-scopedtables.c	2018-06-14 12:27:39.536036781 +0100
+++ gcc/tree-ssa-scopedtables.c	2018-08-28 12:05:06.203027323 +0100
@@ -236,11 +236,7 @@ avail_exprs_stack::lookup_avail_expr (gi
   expr_hash_elt **slot;
   tree lhs;
 
-  /* Get LHS of phi, assignment, or call; else NULL_TREE.  */
-  if (gimple_code (stmt) == GIMPLE_PHI)
-    lhs = gimple_phi_result (stmt);
-  else
-    lhs = gimple_get_lhs (stmt);
+  lhs = gimple_get_lhs (stmt);
 
   class expr_hash_elt element (stmt, lhs);
 
Index: gcc/tree-vect-loop.c
===================================================================
--- gcc/tree-vect-loop.c	2018-08-28 11:25:46.022881769 +0100
+++ gcc/tree-vect-loop.c	2018-08-28 12:05:06.207027289 +0100
@@ -7924,8 +7924,7 @@ vectorizable_live_operation (stmt_vec_in
   /* Use the lhs of the original scalar statement.  */
   gimple *stmt = vect_orig_stmt (stmt_info)->stmt;
 
-  lhs = (is_a <gphi *> (stmt)) ? gimple_phi_result (stmt)
-	: gimple_get_lhs (stmt);
+  lhs = gimple_get_lhs (stmt);
   lhs_type = TREE_TYPE (lhs);
 
   bitsize = (VECTOR_BOOLEAN_TYPE_P (vectype)
@@ -7941,10 +7940,7 @@ vectorizable_live_operation (stmt_vec_in
 
       /* Get the correct slp vectorized stmt.  */
       gimple *vec_stmt = SLP_TREE_VEC_STMTS (slp_node)[vec_entry]->stmt;
-      if (gphi *phi = dyn_cast <gphi *> (vec_stmt))
-	vec_lhs = gimple_phi_result (phi);
-      else
-	vec_lhs = gimple_get_lhs (vec_stmt);
+      vec_lhs = gimple_get_lhs (vec_stmt);
 
       /* Get entry to use.  */
       bitstart = bitsize_int (vec_index);
Index: gcc/tree-vect-slp.c
===================================================================
--- gcc/tree-vect-slp.c	2018-08-24 14:05:30.699753857 +0100
+++ gcc/tree-vect-slp.c	2018-08-28 12:05:06.207027289 +0100
@@ -3477,7 +3477,6 @@ vect_get_constant_vectors (tree op, slp_
 static void
 vect_get_slp_vect_defs (slp_tree slp_node, vec<tree> *vec_oprnds)
 {
-  tree vec_oprnd;
   stmt_vec_info vec_def_stmt_info;
   unsigned int i;
 
@@ -3486,11 +3485,7 @@ vect_get_slp_vect_defs (slp_tree slp_nod
   FOR_EACH_VEC_ELT (SLP_TREE_VEC_STMTS (slp_node), i, vec_def_stmt_info)
     {
       gcc_assert (vec_def_stmt_info);
-      if (gphi *vec_def_phi = dyn_cast <gphi *> (vec_def_stmt_info->stmt))
-	vec_oprnd = gimple_phi_result (vec_def_phi);
-      else
-	vec_oprnd = gimple_get_lhs (vec_def_stmt_info->stmt);
-      vec_oprnds->quick_push (vec_oprnd);
+      vec_oprnds->quick_push (gimple_get_lhs (vec_def_stmt_info->stmt));
     }
 }
 
@@ -3535,10 +3530,7 @@ vect_get_slp_defs (vec<tree> ops, slp_tr
 	      stmt_vec_info related = STMT_VINFO_RELATED_STMT (first_def_info);
 	      tree first_def_op;
 
-	      if (gphi *first_def = dyn_cast <gphi *> (first_def_info->stmt))
-		first_def_op = gimple_phi_result (first_def);
-	      else
-		first_def_op = gimple_get_lhs (first_def_info->stmt);
+	      first_def_op = gimple_get_lhs (first_def_info->stmt);
 	      if (operand_equal_p (oprnd, first_def_op, 0)
 		  || (related
 		      && operand_equal_p (oprnd,
Index: gcc/tree-vect-stmts.c
===================================================================
--- gcc/tree-vect-stmts.c	2018-08-28 11:26:04.000000000 +0100
+++ gcc/tree-vect-stmts.c	2018-08-28 12:05:06.211027256 +0100
@@ -1465,7 +1465,6 @@ vect_init_vector (stmt_vec_info stmt_inf
 vect_get_vec_def_for_operand_1 (stmt_vec_info def_stmt_info,
 				enum vect_def_type dt)
 {
-  tree vec_oprnd;
   stmt_vec_info vec_stmt_info;
 
   switch (dt)
@@ -1488,11 +1487,7 @@ vect_get_vec_def_for_operand_1 (stmt_vec
 	  vec_stmt_info = (STMT_VINFO_VEC_STMT
 			   (STMT_VINFO_RELATED_STMT (def_stmt_info)));
 	gcc_assert (vec_stmt_info);
-	if (gphi *phi = dyn_cast <gphi *> (vec_stmt_info->stmt))
-	  vec_oprnd = PHI_RESULT (phi);
-	else
-	  vec_oprnd = gimple_get_lhs (vec_stmt_info->stmt);
-	return vec_oprnd;
+	return gimple_get_lhs (vec_stmt_info->stmt);
       }
 
     /* operand is defined by a loop header phi.  */
@@ -1505,11 +1500,7 @@ vect_get_vec_def_for_operand_1 (stmt_vec
 
 	/* Get the def from the vectorized stmt.  */
 	vec_stmt_info = STMT_VINFO_VEC_STMT (def_stmt_info);
-	if (gphi *phi = dyn_cast <gphi *> (vec_stmt_info->stmt))
-	  vec_oprnd = PHI_RESULT (phi);
-	else
-	  vec_oprnd = gimple_get_lhs (vec_stmt_info->stmt);
-	return vec_oprnd;
+	return gimple_get_lhs (vec_stmt_info->stmt);
       }
 
     default:
@@ -1642,11 +1633,7 @@ vect_get_vec_def_for_stmt_copy (vec_info
 
   def_stmt_info = STMT_VINFO_RELATED_STMT (def_stmt_info);
   gcc_assert (def_stmt_info);
-  if (gphi *phi = dyn_cast <gphi *> (def_stmt_info->stmt))
-    vec_oprnd = PHI_RESULT (phi);
-  else
-    vec_oprnd = gimple_get_lhs (def_stmt_info->stmt);
-  return vec_oprnd;
+  return gimple_get_lhs (def_stmt_info->stmt);
 }
 
 
@@ -9806,11 +9793,7 @@ vect_transform_stmt (stmt_vec_info stmt_
       /* Find the relevant loop-exit phi-node, and reord the vec_stmt there
         (to be used when vectorizing outer-loop stmts that use the DEF of
         STMT).  */
-      if (gimple_code (stmt) == GIMPLE_PHI)
-        scalar_dest = PHI_RESULT (stmt);
-      else
-        scalar_dest = gimple_get_lhs (stmt);
-
+      scalar_dest = gimple_get_lhs (stmt);
       FOR_EACH_IMM_USE_FAST (use_p, imm_iter, scalar_dest)
 	if (!flow_bb_inside_loop_p (innerloop, gimple_bb (USE_STMT (use_p))))
 	  {

^ permalink raw reply	[flat|nested] 22+ messages in thread

* [2/6] Make vec_info::lookup_single_use take a stmt_vec_info
  2018-08-28 11:19 [0/6] Make vector pattern statements less special Richard Sandiford
  2018-08-28 11:20 ` [1/6] Handle gphis in gimple_get_lhs Richard Sandiford
@ 2018-08-28 11:21 ` Richard Sandiford
  2018-08-28 18:25   ` Jeff Law
  2018-08-28 11:22 ` [3/6] Add a vec_basic_block structure Richard Sandiford
                   ` (3 subsequent siblings)
  5 siblings, 1 reply; 22+ messages in thread
From: Richard Sandiford @ 2018-08-28 11:21 UTC (permalink / raw)
  To: gcc-patches

All callers to vec_info::lookup_single_use are asking about the lhs of a
statement that they're already examining.  It makes more sense to pass
that statement instead of the SSA name, since it is then easier to
handle statements that have been replaced by pattern statements.

A later patch adds support for pattern statements.  This one just
changes the interface to make that possible.


2018-08-28  Richard Sandiford  <richard.sandiford@arm.com>

gcc/
	* tree-vectorizer.h (vec_info::lookup_single_use): Take a
	stmt_vec_info instead of an SSA name.
	* tree-vectorizer.c (vec_info::lookup_single_use): Likewise.
	* tree-vect-loop.c (vectorizable_reduction): Update call
	accordingly.
	* tree-vect-stmts.c (supportable_widening_operation): Likewise.

Index: gcc/tree-vectorizer.h
===================================================================
--- gcc/tree-vectorizer.h	2018-08-01 16:14:45.107097179 +0100
+++ gcc/tree-vectorizer.h	2018-08-28 12:05:08.743005901 +0100
@@ -220,7 +220,7 @@ struct vec_info {
   stmt_vec_info add_stmt (gimple *);
   stmt_vec_info lookup_stmt (gimple *);
   stmt_vec_info lookup_def (tree);
-  stmt_vec_info lookup_single_use (tree);
+  stmt_vec_info lookup_single_use (stmt_vec_info);
   struct dr_vec_info *lookup_dr (data_reference *);
   void move_dr (stmt_vec_info, stmt_vec_info);
   void remove_stmt (stmt_vec_info);
Index: gcc/tree-vectorizer.c
===================================================================
--- gcc/tree-vectorizer.c	2018-08-21 14:47:07.795167843 +0100
+++ gcc/tree-vectorizer.c	2018-08-28 12:05:08.743005901 +0100
@@ -544,13 +544,14 @@ vec_info::lookup_def (tree name)
   return NULL;
 }
 
-/* See whether there is a single non-debug statement that uses LHS and
-   whether that statement has an associated stmt_vec_info.  Return the
-   stmt_vec_info if so, otherwise return null.  */
+/* See whether there is a single non-debug use of the lhs of STMT_INFO
+   and whether the containing statement has an associated stmt_vec_info.
+   Return the stmt_vec_info if so, otherwise return null.  */
 
 stmt_vec_info
-vec_info::lookup_single_use (tree lhs)
+vec_info::lookup_single_use (stmt_vec_info stmt_info)
 {
+  tree lhs = gimple_get_lhs (stmt_info->stmt);
   use_operand_p dummy;
   gimple *use_stmt;
   if (single_imm_use (lhs, &dummy, &use_stmt))
Index: gcc/tree-vect-loop.c
===================================================================
--- gcc/tree-vect-loop.c	2018-08-28 12:05:06.207027289 +0100
+++ gcc/tree-vect-loop.c	2018-08-28 12:05:08.743005901 +0100
@@ -6180,7 +6180,7 @@ vectorizable_reduction (stmt_vec_info st
       stmt_vec_info use_stmt_info;
       if (ncopies > 1
 	  && STMT_VINFO_RELEVANT (reduc_stmt_info) <= vect_used_only_live
-	  && (use_stmt_info = loop_vinfo->lookup_single_use (phi_result))
+	  && (use_stmt_info = loop_vinfo->lookup_single_use (stmt_info))
 	  && vect_stmt_to_vectorize (use_stmt_info) == reduc_stmt_info)
 	single_defuse_cycle = true;
 
@@ -6947,10 +6947,9 @@ vectorizable_reduction (stmt_vec_info st
    in vectorizable_reduction and there are no intermediate stmts
    participating.  */
   stmt_vec_info use_stmt_info;
-  tree reduc_phi_result = gimple_phi_result (reduc_def_phi);
   if (ncopies > 1
       && (STMT_VINFO_RELEVANT (stmt_info) <= vect_used_only_live)
-      && (use_stmt_info = loop_vinfo->lookup_single_use (reduc_phi_result))
+      && (use_stmt_info = loop_vinfo->lookup_single_use (reduc_def_info))
       && vect_stmt_to_vectorize (use_stmt_info) == stmt_info)
     {
       single_defuse_cycle = true;
Index: gcc/tree-vect-stmts.c
===================================================================
--- gcc/tree-vect-stmts.c	2018-08-28 12:05:06.211027256 +0100
+++ gcc/tree-vect-stmts.c	2018-08-28 12:05:08.743005901 +0100
@@ -10237,8 +10237,8 @@ supportable_widening_operation (enum tre
              same operation.  One such an example is s += a * b, where elements
              in a and b cannot be reordered.  Here we check if the vector defined
              by STMT is only directly used in the reduction statement.  */
-	  tree lhs = gimple_assign_lhs (stmt_info->stmt);
-	  stmt_vec_info use_stmt_info = loop_info->lookup_single_use (lhs);
+	  stmt_vec_info use_stmt_info
+	    = loop_info->lookup_single_use (stmt_info);
 	  if (use_stmt_info
 	      && STMT_VINFO_DEF_TYPE (use_stmt_info) == vect_reduction_def)
 	    return true;

^ permalink raw reply	[flat|nested] 22+ messages in thread

* [3/6] Add a vec_basic_block structure
  2018-08-28 11:19 [0/6] Make vector pattern statements less special Richard Sandiford
  2018-08-28 11:20 ` [1/6] Handle gphis in gimple_get_lhs Richard Sandiford
  2018-08-28 11:21 ` [2/6] Make vec_info::lookup_single_use take a stmt_vec_info Richard Sandiford
@ 2018-08-28 11:22 ` Richard Sandiford
  2018-08-28 22:38   ` Jeff Law
  2018-08-28 11:23 ` [4/6] Make the vectoriser do its own DCE Richard Sandiford
                   ` (2 subsequent siblings)
  5 siblings, 1 reply; 22+ messages in thread
From: Richard Sandiford @ 2018-08-28 11:22 UTC (permalink / raw)
  To: gcc-patches

This patch adds a vec_basic_block that records the scalar phis and
scalar statements that we need to vectorise.  This is a slight
simplification in its own right, since it avoids unnecesary statement
lookups and shaves >50 LOC.  But the main reason for doing it is
to allow the rest of the series to treat pattern statements less
specially.

Putting phis (which are logically parallel) and normal statements
(which are logically serial) into a single list might seem dangerous,
but I think in practice it should be fine.  Very little vectoriser
code needs to handle the parallel nature of phis specially, and code
that does can still do so.  Having a single list simplifies code that
wants to look at every scalar phi or stmt in isolation.

The new test is for cases in which we try to hoist the same expression
twice, which I broke with an earlier version of the patch.


2018-08-28  Richard Sandiford  <richard.sandiford@arm.com>

gcc/
	* tree-vectorizer.h (vec_basic_block): New structure.
	(vec_info::blocks, _stmt_vec_info::block, _stmt_vec_info::prev)
	(_stmt_vec_info::next): New member variables.
	(FOR_EACH_VEC_BB_STMT, FOR_EACH_VEC_BB_STMT_REVERSE): New macros.
	(vec_basic_block::vec_basic_block): New function.
	* tree-vectorizer.c (vec_basic_block::add_to_end): Likewise.
	(vec_basic_block::add_before): Likewise.
	(vec_basic_block::remove): Likewise.
	(vec_info::~vec_info): Free the vec_basic_blocks.
	(vec_info::remove_stmt): Remove the statement from the containing
	vec_basic_block.
	* tree-vect-patterns.c (vect_determine_precisions)
	(vect_pattern_recog): Iterate over vec_basic_blocks.
	* tree-vect-loop.c (vect_determine_vectorization_factor)
	(vect_compute_single_scalar_iteration_cost, vect_update_vf_for_slp)
	(vect_analyze_loop_operations, vect_transform_loop): Likewise.
	(_loop_vec_info::_loop_vec_info): Construct vec_basic_blocks.
	* tree-vect-slp.c (_bb_vec_info::_bb_vec_info): Likewise.
	(vect_detect_hybrid_slp): Iterate over vec_basic_blocks.
	* tree-vect-stmts.c (vect_mark_stmts_to_be_vectorized): Likewise.
	(vect_finish_replace_stmt, vectorizable_condition): Remove the original
	statement from the containing block.
	(hoist_defs_of_uses): Likewise the statement that we're hoisting.

gcc/testsuite/
	* gcc.dg/vect/vect-hoist-1.c: New test.

Index: gcc/tree-vectorizer.h
===================================================================
--- gcc/tree-vectorizer.h	2018-08-28 12:05:08.743005901 +0100
+++ gcc/tree-vectorizer.h	2018-08-28 12:05:11.466982927 +0100
@@ -171,7 +171,30 @@ #define SLP_TREE_LOAD_PERMUTATION(S)
 #define SLP_TREE_TWO_OPERATORS(S)		 (S)->two_operators
 #define SLP_TREE_DEF_TYPE(S)			 (S)->def_type
 
+/* Information about the phis and statements in a block that we're trying
+   to vectorize, in their original order.  */
+class vec_basic_block
+{
+public:
+  vec_basic_block (basic_block);
+
+  void add_to_end (stmt_vec_info);
+  void add_before (stmt_vec_info, stmt_vec_info);
+  void remove (stmt_vec_info);
+
+  basic_block bb () const { return m_bb; }
+  stmt_vec_info first () const { return m_first; }
+  stmt_vec_info last () const { return m_last; }
+
+private:
+  /* The block itself.  */
+  basic_block m_bb;
 
+  /* The first and last statements in the block, forming a doubly-linked list.
+     The list includes both phis and true statements.  */
+  stmt_vec_info m_first;
+  stmt_vec_info m_last;
+};
 
 /* Describes two objects whose addresses must be unequal for the vectorized
    loop to be valid.  */
@@ -249,6 +272,9 @@ struct vec_info {
   /* Cost data used by the target cost model.  */
   void *target_cost_data;
 
+  /* The basic blocks in the vectorization region.  */
+  auto_vec<vec_basic_block *, 5> blocks;
+
 private:
   stmt_vec_info new_stmt_vec_info (gimple *stmt);
   void set_vinfo_for_stmt (gimple *, stmt_vec_info);
@@ -776,6 +802,11 @@ struct dr_vec_info {
 typedef struct data_reference *dr_p;
 
 struct _stmt_vec_info {
+  /* The block to which the statement belongs, or null if none.  */
+  vec_basic_block *block;
+
+  /* Link chains for the previous and next statements in BLOCK.  */
+  stmt_vec_info prev, next;
 
   enum stmt_vec_info_type type;
 
@@ -1072,6 +1103,27 @@ #define VECT_SCALAR_BOOLEAN_TYPE_P(TYPE)
        && TYPE_PRECISION (TYPE) == 1		\
        && TYPE_UNSIGNED (TYPE)))
 
+/* Make STMT_INFO iterate over each statement in vec_basic_block VEC_BB
+   in forward order.  */
+
+#define FOR_EACH_VEC_BB_STMT(VEC_BB, STMT_INFO) \
+  for (stmt_vec_info STMT_INFO = (VEC_BB)->first (); STMT_INFO; \
+       STMT_INFO = STMT_INFO->next)
+
+/* Make STMT_INFO iterate over each statement in vec_basic_block VEC_BB
+   in backward order.  */
+
+#define FOR_EACH_VEC_BB_STMT_REVERSE(VEC_BB, STMT_INFO) \
+  for (stmt_vec_info STMT_INFO = (VEC_BB)->last (); STMT_INFO; \
+       STMT_INFO = STMT_INFO->prev)
+
+/* Construct a vec_basic_block for BB.  */
+
+inline vec_basic_block::vec_basic_block (basic_block bb)
+  : m_bb (bb), m_first (NULL), m_last (NULL)
+{
+}
+
 static inline bool
 nested_in_vect_loop_p (struct loop *loop, stmt_vec_info stmt_info)
 {
Index: gcc/tree-vectorizer.c
===================================================================
--- gcc/tree-vectorizer.c	2018-08-28 12:05:08.743005901 +0100
+++ gcc/tree-vectorizer.c	2018-08-28 12:05:11.466982927 +0100
@@ -444,6 +444,61 @@ note_simd_array_uses (hash_table<simd_ar
   delete simd_array_to_simduid_htab;
 }
 \f
+/* Add STMT_INFO to the end of the block.  */
+
+void
+vec_basic_block::add_to_end (stmt_vec_info stmt_info)
+{
+  gcc_checking_assert (!stmt_info->block
+		       && !stmt_info->prev
+		       && !stmt_info->next);
+  if (m_last)
+    m_last->next = stmt_info;
+  else
+    m_first = stmt_info;
+  stmt_info->block = this;
+  stmt_info->prev = m_last;
+  m_last = stmt_info;
+}
+
+/* Add STMT_INFO to the block, inserting it before NEXT_STMT_INFO.  */
+
+void
+vec_basic_block::add_before (stmt_vec_info stmt_info,
+			     stmt_vec_info next_stmt_info)
+{
+  gcc_checking_assert (!stmt_info->block
+		       && !stmt_info->prev
+		       && !stmt_info->next
+		       && next_stmt_info->block == this);
+  if (next_stmt_info->prev)
+    next_stmt_info->prev->next = stmt_info;
+  else
+    m_first = stmt_info;
+  stmt_info->block = this;
+  stmt_info->prev = next_stmt_info->prev;
+  stmt_info->next = next_stmt_info;
+  next_stmt_info->prev = stmt_info;
+}
+
+/* Remove STMT_INFO from the block.  */
+
+void
+vec_basic_block::remove (stmt_vec_info stmt_info)
+{
+  gcc_checking_assert (stmt_info->block == this);
+  if (stmt_info->prev)
+    stmt_info->prev->next = stmt_info->next;
+  else
+    m_first = stmt_info->next;
+  if (stmt_info->next)
+    stmt_info->next->prev = stmt_info->prev;
+  else
+    m_last = stmt_info->prev;
+  stmt_info->block = NULL;
+  stmt_info->prev = stmt_info->next = NULL;
+}
+\f
 /* Initialize the vec_info with kind KIND_IN and target cost data
    TARGET_COST_DATA_IN.  */
 
@@ -459,8 +514,12 @@ vec_info::vec_info (vec_info::vec_kind k
 vec_info::~vec_info ()
 {
   slp_instance instance;
+  vec_basic_block *vec_bb;
   unsigned int i;
 
+  FOR_EACH_VEC_ELT (blocks, i, vec_bb)
+    delete vec_bb;
+
   FOR_EACH_VEC_ELT (slp_instances, i, instance)
     vect_free_slp_instance (instance, true);
 
@@ -597,6 +656,7 @@ vec_info::remove_stmt (stmt_vec_info stm
   unlink_stmt_vdef (stmt_info->stmt);
   gsi_remove (&si, true);
   release_defs (stmt_info->stmt);
+  stmt_info->block->remove (stmt_info);
   free_stmt_vec_info (stmt_info);
 }
 
Index: gcc/tree-vect-patterns.c
===================================================================
--- gcc/tree-vect-patterns.c	2018-08-01 15:40:18.277228757 +0100
+++ gcc/tree-vect-patterns.c	2018-08-28 12:05:11.462982962 +0100
@@ -4639,39 +4639,11 @@ vect_determine_precisions (vec_info *vin
 {
   DUMP_VECT_SCOPE ("vect_determine_precisions");
 
-  if (loop_vec_info loop_vinfo = dyn_cast <loop_vec_info> (vinfo))
-    {
-      struct loop *loop = LOOP_VINFO_LOOP (loop_vinfo);
-      basic_block *bbs = LOOP_VINFO_BBS (loop_vinfo);
-      unsigned int nbbs = loop->num_nodes;
-
-      for (unsigned int i = 0; i < nbbs; i++)
-	{
-	  basic_block bb = bbs[nbbs - i - 1];
-	  for (gimple_stmt_iterator si = gsi_last_bb (bb);
-	       !gsi_end_p (si); gsi_prev (&si))
-	    vect_determine_stmt_precisions
-	      (vinfo->lookup_stmt (gsi_stmt (si)));
-	}
-    }
-  else
-    {
-      bb_vec_info bb_vinfo = as_a <bb_vec_info> (vinfo);
-      gimple_stmt_iterator si = bb_vinfo->region_end;
-      gimple *stmt;
-      do
-	{
-	  if (!gsi_stmt (si))
-	    si = gsi_last_bb (bb_vinfo->bb);
-	  else
-	    gsi_prev (&si);
-	  stmt = gsi_stmt (si);
-	  stmt_vec_info stmt_info = vinfo->lookup_stmt (stmt);
-	  if (stmt_info && STMT_VINFO_VECTORIZABLE (stmt_info))
-	    vect_determine_stmt_precisions (stmt_info);
-	}
-      while (stmt != gsi_stmt (bb_vinfo->region_begin));
-    }
+  unsigned int i;
+  vec_basic_block *vec_bb;
+  FOR_EACH_VEC_ELT_REVERSE (vinfo->blocks, i, vec_bb)
+    FOR_EACH_VEC_BB_STMT_REVERSE (vec_bb, stmt_info)
+      vect_determine_stmt_precisions (stmt_info);
 }
 
 typedef gimple *(*vect_recog_func_ptr) (stmt_vec_info, tree *);
@@ -4931,51 +4903,16 @@ vect_pattern_recog_1 (vect_recog_func *r
 void
 vect_pattern_recog (vec_info *vinfo)
 {
-  struct loop *loop;
-  basic_block *bbs;
-  unsigned int nbbs;
-  gimple_stmt_iterator si;
-  unsigned int i, j;
-
   vect_determine_precisions (vinfo);
 
   DUMP_VECT_SCOPE ("vect_pattern_recog");
 
-  if (loop_vec_info loop_vinfo = dyn_cast <loop_vec_info> (vinfo))
-    {
-      loop = LOOP_VINFO_LOOP (loop_vinfo);
-      bbs = LOOP_VINFO_BBS (loop_vinfo);
-      nbbs = loop->num_nodes;
-
-      /* Scan through the loop stmts, applying the pattern recognition
-	 functions starting at each stmt visited:  */
-      for (i = 0; i < nbbs; i++)
-	{
-	  basic_block bb = bbs[i];
-	  for (si = gsi_start_bb (bb); !gsi_end_p (si); gsi_next (&si))
-	    {
-	      stmt_vec_info stmt_info = vinfo->lookup_stmt (gsi_stmt (si));
-	      /* Scan over all generic vect_recog_xxx_pattern functions.  */
-	      for (j = 0; j < NUM_PATTERNS; j++)
-		vect_pattern_recog_1 (&vect_vect_recog_func_ptrs[j],
-				      stmt_info);
-	    }
-	}
-    }
-  else
-    {
-      bb_vec_info bb_vinfo = as_a <bb_vec_info> (vinfo);
-      for (si = bb_vinfo->region_begin;
-	   gsi_stmt (si) != gsi_stmt (bb_vinfo->region_end); gsi_next (&si))
-	{
-	  gimple *stmt = gsi_stmt (si);
-	  stmt_vec_info stmt_info = bb_vinfo->lookup_stmt (stmt);
-	  if (stmt_info && !STMT_VINFO_VECTORIZABLE (stmt_info))
-	    continue;
-
-	  /* Scan over all generic vect_recog_xxx_pattern functions.  */
-	  for (j = 0; j < NUM_PATTERNS; j++)
-	    vect_pattern_recog_1 (&vect_vect_recog_func_ptrs[j], stmt_info);
-	}
-    }
+  unsigned int i;
+  vec_basic_block *vec_bb;
+  FOR_EACH_VEC_ELT (vinfo->blocks, i, vec_bb)
+    FOR_EACH_VEC_BB_STMT (vec_bb, stmt_info)
+      if (STMT_VINFO_VECTORIZABLE (stmt_info))
+	/* Scan over all generic vect_recog_xxx_pattern functions.  */
+	for (unsigned int j = 0; j < NUM_PATTERNS; j++)
+	  vect_pattern_recog_1 (&vect_vect_recog_func_ptrs[j], stmt_info);
 }
Index: gcc/tree-vect-loop.c
===================================================================
--- gcc/tree-vect-loop.c	2018-08-28 12:05:08.743005901 +0100
+++ gcc/tree-vect-loop.c	2018-08-28 12:05:11.458982995 +0100
@@ -286,36 +286,26 @@ vect_determine_vf_for_stmt (stmt_vec_inf
 static bool
 vect_determine_vectorization_factor (loop_vec_info loop_vinfo)
 {
-  struct loop *loop = LOOP_VINFO_LOOP (loop_vinfo);
-  basic_block *bbs = LOOP_VINFO_BBS (loop_vinfo);
-  unsigned nbbs = loop->num_nodes;
   poly_uint64 vectorization_factor = 1;
   tree scalar_type = NULL_TREE;
-  gphi *phi;
   tree vectype;
   stmt_vec_info stmt_info;
   unsigned i;
   auto_vec<stmt_vec_info> mask_producers;
+  vec_basic_block *vec_bb;
 
   DUMP_VECT_SCOPE ("vect_determine_vectorization_factor");
 
-  for (i = 0; i < nbbs; i++)
-    {
-      basic_block bb = bbs[i];
-
-      for (gphi_iterator si = gsi_start_phis (bb); !gsi_end_p (si);
-	   gsi_next (&si))
+  FOR_EACH_VEC_ELT (loop_vinfo->blocks, i, vec_bb)
+    FOR_EACH_VEC_BB_STMT (vec_bb, stmt_info)
+      if (gphi *phi = dyn_cast <gphi *> (stmt_info->stmt))
 	{
-	  phi = si.phi ();
-	  stmt_info = loop_vinfo->lookup_stmt (phi);
 	  if (dump_enabled_p ())
 	    {
 	      dump_printf_loc (MSG_NOTE, vect_location, "==> examining phi: ");
 	      dump_gimple_stmt (MSG_NOTE, TDF_SLIM, phi, 0);
 	    }
 
-	  gcc_assert (stmt_info);
-
 	  if (STMT_VINFO_RELEVANT_P (stmt_info)
 	      || STMT_VINFO_LIVE_P (stmt_info))
             {
@@ -363,16 +353,9 @@ vect_determine_vectorization_factor (loo
 	      vect_update_max_nunits (&vectorization_factor, vectype);
 	    }
 	}
-
-      for (gimple_stmt_iterator si = gsi_start_bb (bb); !gsi_end_p (si);
-	   gsi_next (&si))
-	{
-	  stmt_info = loop_vinfo->lookup_stmt (gsi_stmt (si));
-	  if (!vect_determine_vf_for_stmt (stmt_info, &vectorization_factor,
-					   &mask_producers))
-	    return false;
-        }
-    }
+      else if (!vect_determine_vf_for_stmt (stmt_info, &vectorization_factor,
+					    &mask_producers))
+	return false;
 
   /* TODO: Analyze cost. Decide if worth while to vectorize.  */
   if (dump_enabled_p ())
@@ -881,21 +864,23 @@ _loop_vec_info::_loop_vec_info (struct l
   for (unsigned int i = 0; i < nbbs; i++)
     {
       basic_block bb = bbs[i];
+      vec_basic_block *vec_bb = new vec_basic_block (bb);
       gimple_stmt_iterator si;
 
       for (si = gsi_start_phis (bb); !gsi_end_p (si); gsi_next (&si))
 	{
 	  gimple *phi = gsi_stmt (si);
 	  gimple_set_uid (phi, 0);
-	  add_stmt (phi);
+	  vec_bb->add_to_end (add_stmt (phi));
 	}
 
       for (si = gsi_start_bb (bb); !gsi_end_p (si); gsi_next (&si))
 	{
 	  gimple *stmt = gsi_stmt (si);
 	  gimple_set_uid (stmt, 0);
-	  add_stmt (stmt);
+	  vec_bb->add_to_end (add_stmt (stmt));
 	}
+      blocks.safe_push (vec_bb);
     }
 }
 
@@ -1101,9 +1086,8 @@ vect_verify_full_masking (loop_vec_info
 vect_compute_single_scalar_iteration_cost (loop_vec_info loop_vinfo)
 {
   struct loop *loop = LOOP_VINFO_LOOP (loop_vinfo);
-  basic_block *bbs = LOOP_VINFO_BBS (loop_vinfo);
-  int nbbs = loop->num_nodes, factor;
-  int innerloop_iters, i;
+  int factor, innerloop_iters;
+  unsigned int i;
 
   /* Gather costs for statements in the scalar loop.  */
 
@@ -1112,23 +1096,19 @@ vect_compute_single_scalar_iteration_cos
   if (loop->inner)
     innerloop_iters = 50; /* FIXME */
 
-  for (i = 0; i < nbbs; i++)
+  vec_basic_block *vec_bb;
+  FOR_EACH_VEC_ELT (loop_vinfo->blocks, i, vec_bb)
     {
-      gimple_stmt_iterator si;
-      basic_block bb = bbs[i];
-
-      if (bb->loop_father == loop->inner)
+      if (vec_bb->bb ()->loop_father == loop->inner)
         factor = innerloop_iters;
       else
         factor = 1;
 
-      for (si = gsi_start_bb (bb); !gsi_end_p (si); gsi_next (&si))
-        {
-	  gimple *stmt = gsi_stmt (si);
-	  stmt_vec_info stmt_info = loop_vinfo->lookup_stmt (stmt);
-
-          if (!is_gimple_assign (stmt) && !is_gimple_call (stmt))
-            continue;
+      FOR_EACH_VEC_BB_STMT (vec_bb, stmt_info)
+	{
+	  if (!is_gimple_assign (stmt_info->stmt)
+	      && !is_gimple_call (stmt_info->stmt))
+	    continue;
 
           /* Skip stmts that are not vectorized inside the loop.  */
           if (stmt_info
@@ -1432,11 +1412,8 @@ vect_analyze_loop_form (struct loop *loo
 static void
 vect_update_vf_for_slp (loop_vec_info loop_vinfo)
 {
-  struct loop *loop = LOOP_VINFO_LOOP (loop_vinfo);
-  basic_block *bbs = LOOP_VINFO_BBS (loop_vinfo);
-  int nbbs = loop->num_nodes;
   poly_uint64 vectorization_factor;
-  int i;
+  unsigned int i;
 
   DUMP_VECT_SCOPE ("vect_update_vf_for_slp");
 
@@ -1449,21 +1426,17 @@ vect_update_vf_for_slp (loop_vec_info lo
      perform pure SLP on loop - cross iteration parallelism is not
      exploited.  */
   bool only_slp_in_loop = true;
-  for (i = 0; i < nbbs; i++)
-    {
-      basic_block bb = bbs[i];
-      for (gimple_stmt_iterator si = gsi_start_bb (bb); !gsi_end_p (si);
-	   gsi_next (&si))
-	{
-	  stmt_vec_info stmt_info = loop_vinfo->lookup_stmt (gsi_stmt (si));
-	  stmt_info = vect_stmt_to_vectorize (stmt_info);
-	  if ((STMT_VINFO_RELEVANT_P (stmt_info)
-	       || VECTORIZABLE_CYCLE_DEF (STMT_VINFO_DEF_TYPE (stmt_info)))
-	      && !PURE_SLP_STMT (stmt_info))
-	    /* STMT needs both SLP and loop-based vectorization.  */
-	    only_slp_in_loop = false;
-	}
-    }
+  vec_basic_block *vec_bb;
+  FOR_EACH_VEC_ELT (loop_vinfo->blocks, i, vec_bb)
+    FOR_EACH_VEC_BB_STMT (vec_bb, stmt_info)
+      {
+	stmt_vec_info final_info = vect_stmt_to_vectorize (stmt_info);
+	if ((STMT_VINFO_RELEVANT_P (final_info)
+	     || VECTORIZABLE_CYCLE_DEF (STMT_VINFO_DEF_TYPE (final_info)))
+	    && !PURE_SLP_STMT (final_info))
+	  /* STMT needs both SLP and loop-based vectorization.  */
+	  only_slp_in_loop = false;
+      }
 
   if (only_slp_in_loop)
     {
@@ -1526,11 +1499,7 @@ vect_active_double_reduction_p (stmt_vec
 static bool
 vect_analyze_loop_operations (loop_vec_info loop_vinfo)
 {
-  struct loop *loop = LOOP_VINFO_LOOP (loop_vinfo);
-  basic_block *bbs = LOOP_VINFO_BBS (loop_vinfo);
-  int nbbs = loop->num_nodes;
-  int i;
-  stmt_vec_info stmt_info;
+  unsigned int i;
   bool need_to_vectorize = false;
   bool ok;
 
@@ -1539,17 +1508,12 @@ vect_analyze_loop_operations (loop_vec_i
   stmt_vector_for_cost cost_vec;
   cost_vec.create (2);
 
-  for (i = 0; i < nbbs; i++)
-    {
-      basic_block bb = bbs[i];
-
-      for (gphi_iterator si = gsi_start_phis (bb); !gsi_end_p (si);
-	   gsi_next (&si))
+  vec_basic_block *vec_bb;
+  FOR_EACH_VEC_ELT (loop_vinfo->blocks, i, vec_bb)
+    FOR_EACH_VEC_BB_STMT (vec_bb, stmt_info)
+      if (gphi *phi = dyn_cast <gphi *> (stmt_info->stmt))
         {
-          gphi *phi = si.phi ();
           ok = true;
-
-	  stmt_info = loop_vinfo->lookup_stmt (phi);
           if (dump_enabled_p ())
             {
               dump_printf_loc (MSG_NOTE, vect_location, "examining phi: ");
@@ -1560,7 +1524,7 @@ vect_analyze_loop_operations (loop_vec_i
 
           /* Inner-loop loop-closed exit phi in outer-loop vectorization
              (i.e., a phi in the tail of the outer-loop).  */
-          if (! is_loop_header_bb_p (bb))
+          if (! is_loop_header_bb_p (vec_bb->bb ()))
             {
               /* FORNOW: we currently don't support the case that these phis
                  are not used in the outerloop (unless it is double reduction,
@@ -1599,8 +1563,6 @@ vect_analyze_loop_operations (loop_vec_i
               continue;
             }
 
-          gcc_assert (stmt_info);
-
           if ((STMT_VINFO_RELEVANT (stmt_info) == vect_used_in_scope
                || STMT_VINFO_LIVE_P (stmt_info))
               && STMT_VINFO_DEF_TYPE (stmt_info) != vect_induction_def)
@@ -1645,18 +1607,10 @@ vect_analyze_loop_operations (loop_vec_i
 	      return false;
             }
         }
-
-      for (gimple_stmt_iterator si = gsi_start_bb (bb); !gsi_end_p (si);
-	   gsi_next (&si))
-        {
-	  gimple *stmt = gsi_stmt (si);
-	  if (!gimple_clobber_p (stmt)
-	      && !vect_analyze_stmt (loop_vinfo->lookup_stmt (stmt),
-				     &need_to_vectorize,
-				     NULL, NULL, &cost_vec))
-	    return false;
-        }
-    } /* bbs */
+      else if (!gimple_clobber_p (stmt_info->stmt)
+	       && !vect_analyze_stmt (stmt_info, &need_to_vectorize,
+				      NULL, NULL, &cost_vec))
+	return false;
 
   add_stmt_costs (loop_vinfo->target_cost_data, &cost_vec);
   cost_vec.release ();
@@ -2242,32 +2196,22 @@ vect_analyze_loop_2 (loop_vec_info loop_
     vect_free_slp_instance (instance, false);
   LOOP_VINFO_SLP_INSTANCES (loop_vinfo).release ();
   /* Reset SLP type to loop_vect on all stmts.  */
-  for (i = 0; i < LOOP_VINFO_LOOP (loop_vinfo)->num_nodes; ++i)
-    {
-      basic_block bb = LOOP_VINFO_BBS (loop_vinfo)[i];
-      for (gimple_stmt_iterator si = gsi_start_phis (bb);
-	   !gsi_end_p (si); gsi_next (&si))
-	{
-	  stmt_vec_info stmt_info = loop_vinfo->lookup_stmt (gsi_stmt (si));
-	  STMT_SLP_TYPE (stmt_info) = loop_vect;
-	}
-      for (gimple_stmt_iterator si = gsi_start_bb (bb);
-	   !gsi_end_p (si); gsi_next (&si))
-	{
-	  stmt_vec_info stmt_info = loop_vinfo->lookup_stmt (gsi_stmt (si));
-	  STMT_SLP_TYPE (stmt_info) = loop_vect;
-	  if (STMT_VINFO_IN_PATTERN_P (stmt_info))
-	    {
-	      gimple *pattern_def_seq = STMT_VINFO_PATTERN_DEF_SEQ (stmt_info);
-	      stmt_info = STMT_VINFO_RELATED_STMT (stmt_info);
-	      STMT_SLP_TYPE (stmt_info) = loop_vect;
-	      for (gimple_stmt_iterator pi = gsi_start (pattern_def_seq);
-		   !gsi_end_p (pi); gsi_next (&pi))
-		STMT_SLP_TYPE (loop_vinfo->lookup_stmt (gsi_stmt (pi)))
-		  = loop_vect;
-	    }
-	}
-    }
+  vec_basic_block *vec_bb;
+  FOR_EACH_VEC_ELT (loop_vinfo->blocks, i, vec_bb)
+    FOR_EACH_VEC_BB_STMT (vec_bb, stmt_info)
+      {
+	STMT_SLP_TYPE (stmt_info) = loop_vect;
+	if (STMT_VINFO_IN_PATTERN_P (stmt_info))
+	  {
+	    gimple *pattern_def_seq = STMT_VINFO_PATTERN_DEF_SEQ (stmt_info);
+	    STMT_SLP_TYPE (STMT_VINFO_RELATED_STMT (stmt_info)) = loop_vect;
+	    for (gimple_stmt_iterator pi = gsi_start (pattern_def_seq);
+		 !gsi_end_p (pi); gsi_next (&pi))
+	      STMT_SLP_TYPE (loop_vinfo->lookup_stmt (gsi_stmt (pi)))
+		= loop_vect;
+	  }
+      }
+
   /* Free optimized alias test DDRS.  */
   LOOP_VINFO_LOWER_BOUNDS (loop_vinfo).truncate (0);
   LOOP_VINFO_COMP_ALIAS_DDRS (loop_vinfo).release ();
@@ -8275,15 +8219,12 @@ vect_transform_loop (loop_vec_info loop_
 {
   struct loop *loop = LOOP_VINFO_LOOP (loop_vinfo);
   struct loop *epilogue = NULL;
-  basic_block *bbs = LOOP_VINFO_BBS (loop_vinfo);
-  int nbbs = loop->num_nodes;
-  int i;
+  unsigned int i;
   tree niters_vector = NULL_TREE;
   tree step_vector = NULL_TREE;
   tree niters_vector_mult_vf = NULL_TREE;
   poly_uint64 vf = LOOP_VINFO_VECT_FACTOR (loop_vinfo);
   unsigned int lowest_vf = constant_lower_bound (vf);
-  gimple *stmt;
   bool check_profitability = false;
   unsigned int th;
 
@@ -8401,90 +8342,73 @@ vect_transform_loop (loop_vec_info loop_
      support more involved loop forms, the order by which the BBs are
      traversed need to be reconsidered.  */
 
-  for (i = 0; i < nbbs; i++)
+  vec_basic_block *vec_bb;
+  FOR_EACH_VEC_ELT (loop_vinfo->blocks, i, vec_bb)
     {
-      basic_block bb = bbs[i];
-      stmt_vec_info stmt_info;
-
-      for (gphi_iterator si = gsi_start_phis (bb); !gsi_end_p (si);
-	   gsi_next (&si))
-        {
-	  gphi *phi = si.phi ();
-	  if (dump_enabled_p ())
+      stmt_vec_info next_stmt_info;
+      for (stmt_vec_info stmt_info = vec_bb->first (); stmt_info;
+	   stmt_info = next_stmt_info)
+	{
+	  next_stmt_info = stmt_info->next;
+	  if (gphi *phi = dyn_cast <gphi *> (stmt_info->stmt))
 	    {
-	      dump_printf_loc (MSG_NOTE, vect_location,
-                               "------>vectorizing phi: ");
-	      dump_gimple_stmt (MSG_NOTE, TDF_SLIM, phi, 0);
-	    }
-	  stmt_info = loop_vinfo->lookup_stmt (phi);
-	  if (!stmt_info)
-	    continue;
+	      if (dump_enabled_p ())
+		{
+		  dump_printf_loc (MSG_NOTE, vect_location,
+				   "------>vectorizing phi: ");
+		  dump_gimple_stmt (MSG_NOTE, TDF_SLIM, phi, 0);
+		}
 
-	  if (MAY_HAVE_DEBUG_BIND_STMTS && !STMT_VINFO_LIVE_P (stmt_info))
-	    vect_loop_kill_debug_uses (loop, stmt_info);
+	      if (MAY_HAVE_DEBUG_BIND_STMTS && !STMT_VINFO_LIVE_P (stmt_info))
+		vect_loop_kill_debug_uses (loop, stmt_info);
 
-	  if (!STMT_VINFO_RELEVANT_P (stmt_info)
-	      && !STMT_VINFO_LIVE_P (stmt_info))
-	    continue;
-
-	  if (STMT_VINFO_VECTYPE (stmt_info)
-	      && (maybe_ne
-		  (TYPE_VECTOR_SUBPARTS (STMT_VINFO_VECTYPE (stmt_info)), vf))
-	      && dump_enabled_p ())
-	    dump_printf_loc (MSG_NOTE, vect_location, "multiple-types.\n");
-
-	  if ((STMT_VINFO_DEF_TYPE (stmt_info) == vect_induction_def
-	       || STMT_VINFO_DEF_TYPE (stmt_info) == vect_reduction_def
-	       || STMT_VINFO_DEF_TYPE (stmt_info) == vect_nested_cycle)
-	      && ! PURE_SLP_STMT (stmt_info))
-	    {
-	      if (dump_enabled_p ())
-		dump_printf_loc (MSG_NOTE, vect_location, "transform phi.\n");
-	      vect_transform_stmt (stmt_info, NULL, NULL, NULL);
+	      if (!STMT_VINFO_RELEVANT_P (stmt_info)
+		  && !STMT_VINFO_LIVE_P (stmt_info))
+		continue;
+
+	      if (STMT_VINFO_VECTYPE (stmt_info)
+		  && (maybe_ne
+		      (TYPE_VECTOR_SUBPARTS (STMT_VINFO_VECTYPE (stmt_info)),
+		       vf))
+		  && dump_enabled_p ())
+		dump_printf_loc (MSG_NOTE, vect_location, "multiple-types.\n");
+
+	      if ((STMT_VINFO_DEF_TYPE (stmt_info) == vect_induction_def
+		   || STMT_VINFO_DEF_TYPE (stmt_info) == vect_reduction_def
+		   || STMT_VINFO_DEF_TYPE (stmt_info) == vect_nested_cycle)
+		  && ! PURE_SLP_STMT (stmt_info))
+		{
+		  if (dump_enabled_p ())
+		    dump_printf_loc (MSG_NOTE, vect_location,
+				     "transform phi.\n");
+		  vect_transform_stmt (stmt_info, NULL, NULL, NULL);
+		}
 	    }
-	}
-
-      for (gimple_stmt_iterator si = gsi_start_bb (bb);
-	   !gsi_end_p (si);)
-	{
-	  stmt = gsi_stmt (si);
 	  /* During vectorization remove existing clobber stmts.  */
-	  if (gimple_clobber_p (stmt))
-	    {
-	      unlink_stmt_vdef (stmt);
-	      gsi_remove (&si, true);
-	      release_defs (stmt);
-	    }
+	  else if (gimple_clobber_p (stmt_info->stmt))
+	    loop_vinfo->remove_stmt (stmt_info);
 	  else
 	    {
-	      stmt_info = loop_vinfo->lookup_stmt (stmt);
-
-	      /* vector stmts created in the outer-loop during vectorization of
-		 stmts in an inner-loop may not have a stmt_info, and do not
-		 need to be vectorized.  */
 	      stmt_vec_info seen_store = NULL;
-	      if (stmt_info)
+	      gimple_stmt_iterator si = gsi_for_stmt (stmt_info->stmt);
+	      if (STMT_VINFO_IN_PATTERN_P (stmt_info))
 		{
-		  if (STMT_VINFO_IN_PATTERN_P (stmt_info))
+		  gimple *def_seq = STMT_VINFO_PATTERN_DEF_SEQ (stmt_info);
+		  for (gimple_stmt_iterator subsi = gsi_start (def_seq);
+		       !gsi_end_p (subsi); gsi_next (&subsi))
 		    {
-		      gimple *def_seq = STMT_VINFO_PATTERN_DEF_SEQ (stmt_info);
-		      for (gimple_stmt_iterator subsi = gsi_start (def_seq);
-			   !gsi_end_p (subsi); gsi_next (&subsi))
-			{
-			  stmt_vec_info pat_stmt_info
-			    = loop_vinfo->lookup_stmt (gsi_stmt (subsi));
-			  vect_transform_loop_stmt (loop_vinfo, pat_stmt_info,
-						    &si, &seen_store);
-			}
 		      stmt_vec_info pat_stmt_info
-			= STMT_VINFO_RELATED_STMT (stmt_info);
-		      vect_transform_loop_stmt (loop_vinfo, pat_stmt_info, &si,
-						&seen_store);
+			= loop_vinfo->lookup_stmt (gsi_stmt (subsi));
+		      vect_transform_loop_stmt (loop_vinfo, pat_stmt_info,
+						&si, &seen_store);
 		    }
-		  vect_transform_loop_stmt (loop_vinfo, stmt_info, &si,
+		  stmt_vec_info pat_stmt_info
+		    = STMT_VINFO_RELATED_STMT (stmt_info);
+		  vect_transform_loop_stmt (loop_vinfo, pat_stmt_info, &si,
 					    &seen_store);
 		}
-	      gsi_next (&si);
+	      vect_transform_loop_stmt (loop_vinfo, stmt_info, &si,
+					&seen_store);
 	      if (seen_store)
 		{
 		  if (STMT_VINFO_GROUPED_ACCESS (seen_store))
@@ -8502,7 +8426,7 @@ vect_transform_loop (loop_vec_info loop_
       /* Stub out scalar statements that must not survive vectorization.
 	 Doing this here helps with grouped statements, or statements that
 	 are involved in patterns.  */
-      for (gimple_stmt_iterator gsi = gsi_start_bb (bb);
+      for (gimple_stmt_iterator gsi = gsi_start_bb (vec_bb->bb ());
 	   !gsi_end_p (gsi); gsi_next (&gsi))
 	{
 	  gcall *call = dyn_cast <gcall *> (gsi_stmt (gsi));
Index: gcc/tree-vect-slp.c
===================================================================
--- gcc/tree-vect-slp.c	2018-08-28 12:05:06.207027289 +0100
+++ gcc/tree-vect-slp.c	2018-08-28 12:05:11.462982962 +0100
@@ -2359,29 +2359,22 @@ vect_detect_hybrid_slp (loop_vec_info lo
 
   /* First walk all pattern stmt in the loop and mark defs of uses as
      hybrid because immediate uses in them are not recorded.  */
-  for (i = 0; i < LOOP_VINFO_LOOP (loop_vinfo)->num_nodes; ++i)
-    {
-      basic_block bb = LOOP_VINFO_BBS (loop_vinfo)[i];
-      for (gimple_stmt_iterator gsi = gsi_start_bb (bb); !gsi_end_p (gsi);
-	   gsi_next (&gsi))
+  vec_basic_block *vec_bb;
+  FOR_EACH_VEC_ELT (loop_vinfo->blocks, i, vec_bb)
+    FOR_EACH_VEC_BB_STMT (vec_bb, stmt_info)
+      if (STMT_VINFO_IN_PATTERN_P (stmt_info))
 	{
-	  gimple *stmt = gsi_stmt (gsi);
-	  stmt_vec_info stmt_info = loop_vinfo->lookup_stmt (stmt);
-	  if (STMT_VINFO_IN_PATTERN_P (stmt_info))
-	    {
-	      walk_stmt_info wi;
-	      memset (&wi, 0, sizeof (wi));
-	      wi.info = loop_vinfo;
-	      gimple_stmt_iterator gsi2
-		= gsi_for_stmt (STMT_VINFO_RELATED_STMT (stmt_info)->stmt);
-	      walk_gimple_stmt (&gsi2, vect_detect_hybrid_slp_2,
-				vect_detect_hybrid_slp_1, &wi);
-	      walk_gimple_seq (STMT_VINFO_PATTERN_DEF_SEQ (stmt_info),
-			       vect_detect_hybrid_slp_2,
-			       vect_detect_hybrid_slp_1, &wi);
-	    }
+	  walk_stmt_info wi;
+	  memset (&wi, 0, sizeof (wi));
+	  wi.info = loop_vinfo;
+	  gimple_stmt_iterator gsi2
+	    = gsi_for_stmt (STMT_VINFO_RELATED_STMT (stmt_info)->stmt);
+	  walk_gimple_stmt (&gsi2, vect_detect_hybrid_slp_2,
+			    vect_detect_hybrid_slp_1, &wi);
+	  walk_gimple_seq (STMT_VINFO_PATTERN_DEF_SEQ (stmt_info),
+			   vect_detect_hybrid_slp_2,
+			   vect_detect_hybrid_slp_1, &wi);
 	}
-    }
 
   /* Then walk the SLP instance trees marking stmts with uses in
      non-SLP stmts as hybrid, also propagating hybrid down the
@@ -2408,13 +2401,15 @@ _bb_vec_info::_bb_vec_info (gimple_stmt_
 {
   gimple_stmt_iterator gsi;
 
+  vec_basic_block *vec_bb = new vec_basic_block (bb);
   for (gsi = region_begin; gsi_stmt (gsi) != gsi_stmt (region_end);
        gsi_next (&gsi))
     {
       gimple *stmt = gsi_stmt (gsi);
       gimple_set_uid (stmt, 0);
-      add_stmt (stmt);
+      vec_bb->add_to_end (add_stmt (stmt));
     }
+  blocks.quick_push (vec_bb);
 
   bb->aux = this;
 }
Index: gcc/tree-vect-stmts.c
===================================================================
--- gcc/tree-vect-stmts.c	2018-08-28 12:05:08.743005901 +0100
+++ gcc/tree-vect-stmts.c	2018-08-28 12:05:11.466982927 +0100
@@ -612,12 +612,7 @@ process_use (stmt_vec_info stmt_vinfo, t
 bool
 vect_mark_stmts_to_be_vectorized (loop_vec_info loop_vinfo)
 {
-  struct loop *loop = LOOP_VINFO_LOOP (loop_vinfo);
-  basic_block *bbs = LOOP_VINFO_BBS (loop_vinfo);
-  unsigned int nbbs = loop->num_nodes;
-  gimple_stmt_iterator si;
   unsigned int i;
-  basic_block bb;
   bool live_p;
   enum vect_relevant relevant;
 
@@ -626,34 +621,11 @@ vect_mark_stmts_to_be_vectorized (loop_v
   auto_vec<stmt_vec_info, 64> worklist;
 
   /* 1. Init worklist.  */
-  for (i = 0; i < nbbs; i++)
-    {
-      bb = bbs[i];
-      for (si = gsi_start_phis (bb); !gsi_end_p (si); gsi_next (&si))
-	{
-	  stmt_vec_info phi_info = loop_vinfo->lookup_stmt (gsi_stmt (si));
-	  if (dump_enabled_p ())
-	    {
-	      dump_printf_loc (MSG_NOTE, vect_location, "init: phi relevant? ");
-	      dump_gimple_stmt (MSG_NOTE, TDF_SLIM, phi_info->stmt, 0);
-	    }
-
-	  if (vect_stmt_relevant_p (phi_info, loop_vinfo, &relevant, &live_p))
-	    vect_mark_relevant (&worklist, phi_info, relevant, live_p);
-	}
-      for (si = gsi_start_bb (bb); !gsi_end_p (si); gsi_next (&si))
-	{
-	  stmt_vec_info stmt_info = loop_vinfo->lookup_stmt (gsi_stmt (si));
-	  if (dump_enabled_p ())
-	    {
-	      dump_printf_loc (MSG_NOTE, vect_location, "init: stmt relevant? ");
-	      dump_gimple_stmt (MSG_NOTE, TDF_SLIM, stmt_info->stmt, 0);
-	    }
-
-	  if (vect_stmt_relevant_p (stmt_info, loop_vinfo, &relevant, &live_p))
-	    vect_mark_relevant (&worklist, stmt_info, relevant, live_p);
-	}
-    }
+  vec_basic_block *vec_bb;
+  FOR_EACH_VEC_ELT (loop_vinfo->blocks, i, vec_bb)
+    FOR_EACH_VEC_BB_STMT (vec_bb, stmt_info)
+      if (vect_stmt_relevant_p (stmt_info, loop_vinfo, &relevant, &live_p))
+	vect_mark_relevant (&worklist, stmt_info, relevant, live_p);
 
   /* 2. Process_worklist */
   while (worklist.length () > 0)
@@ -1740,6 +1712,7 @@ vect_finish_replace_stmt (stmt_vec_info
 
   gimple_stmt_iterator gsi = gsi_for_stmt (stmt_info->stmt);
   gsi_replace (&gsi, vec_stmt, false);
+  stmt_info->block->remove (stmt_info);
 
   return vect_finish_stmt_generation_1 (stmt_info, vec_stmt);
 }
@@ -7309,6 +7282,7 @@ permute_vec_elements (tree x, tree y, tr
 static bool
 hoist_defs_of_uses (stmt_vec_info stmt_info, struct loop *loop)
 {
+  vec_info *vinfo = stmt_info->vinfo;
   ssa_op_iter i;
   tree op;
   bool any = false;
@@ -7349,6 +7323,8 @@ hoist_defs_of_uses (stmt_vec_info stmt_i
 	{
 	  gimple_stmt_iterator gsi = gsi_for_stmt (def_stmt);
 	  gsi_remove (&gsi, false);
+	  if (stmt_vec_info def_stmt_info = vinfo->lookup_stmt (def_stmt))
+	    def_stmt_info->block->remove (def_stmt_info);
 	  gsi_insert_on_edge_immediate (loop_preheader_edge (loop), def_stmt);
 	}
     }
@@ -9068,6 +9044,7 @@ vectorizable_condition (stmt_vec_info st
 		  gimple_stmt_iterator old_gsi
 		    = gsi_for_stmt (stmt_info->stmt);
 		  gsi_remove (&old_gsi, true);
+		  stmt_info->block->remove (stmt_info);
 		  new_stmt_info
 		    = vect_finish_stmt_generation (stmt_info, new_stmt, gsi);
 		}
Index: gcc/testsuite/gcc.dg/vect/vect-hoist-1.c
===================================================================
--- /dev/null	2018-07-26 10:26:13.137955424 +0100
+++ gcc/testsuite/gcc.dg/vect/vect-hoist-1.c	2018-08-28 12:05:11.458982995 +0100
@@ -0,0 +1,14 @@
+/* { dg-do compile } */
+/* { dg-options "-O1" } */
+
+int c[20];
+
+void
+f (int *a, int *b, int j)
+{
+  for (int i = 0; i < 100; ++i)
+    {
+      a[i] = a[i] + c[j + 4];
+      b[i] = b[i] + c[j + 4];
+    }
+}

^ permalink raw reply	[flat|nested] 22+ messages in thread

* [4/6] Make the vectoriser do its own DCE
  2018-08-28 11:19 [0/6] Make vector pattern statements less special Richard Sandiford
                   ` (2 preceding siblings ...)
  2018-08-28 11:22 ` [3/6] Add a vec_basic_block structure Richard Sandiford
@ 2018-08-28 11:23 ` Richard Sandiford
  2018-08-28 23:01   ` Jeff Law
  2018-08-28 11:25 ` [6/6] Link imm uses for pattern stmts Richard Sandiford
  2018-08-28 11:25 ` [5/6] Insert pattern statements into vec_basic_blocks Richard Sandiford
  5 siblings, 1 reply; 22+ messages in thread
From: Richard Sandiford @ 2018-08-28 11:23 UTC (permalink / raw)
  To: gcc-patches

The vectoriser normally leaves a later DCE pass to remove the scalar
code, but we've accumulated various special cases for things that
DCE can't handle, such as removing the scalar stores that have been
replaced by vector stores, and the scalar calls to internal functions.
(The latter must be removed for correctness, since no underlying scalar
optabs exist for those calls.)

Now that vec_basic_block gives us an easy way of iterating over the
original scalar code (ignoring any new code inserted by the vectoriser),
it seems easier to do the DCE directly.  This involves marking the few
cases in which the vector code needs part of the original scalar code
to be kept around.


2018-08-28  Richard Sandiford  <richard.sandiford@arm.com>

gcc/
	* tree-vectorizer.h (_stmt_vec_info::used_by_vector_code_p): New
	member variable.
	(vect_mark_used_by_vector_code): Declare.
	(vect_remove_dead_scalar_stmts): Likewise.
	(vect_transform_stmt): Return void.
	(vect_remove_stores): Delete.
	* tree-vectorizer.c (vec_info::remove_stmt): Handle phis.
	* tree-vect-stmts.c (vect_mark_used_by_vector_code): New function.
	(vectorizable_call, vectorizable_simd_clone_call): Don't remove
	scalar calls here.
	(vectorizable_shift): Mark shift amounts in a vector-scalar shift
	as being used by the vector code.
	(vectorizable_load): Mark unhoisted scalar loads that feed a
	load-and-broadcast operation as being needed by the vector code.
	(vect_transform_stmt): Return void.
	(vect_remove_stores): Delete.
	(vect_maybe_remove_scalar_stmt): New function.
	(vect_remove_dead_scalar_stmts): Likewise.
	* tree-vect-slp.c (vect_slp_bb): Call vect_remove_dead_scalar_stmts.
	(vect_remove_slp_scalar_calls): Delete.
	(vect_schedule_slp): Don't call it.  Don't remove scalar stores here.
	* tree-vect-loop.c (vectorizable_reduction): Mark scalar phis that
	are retained by the vector code.
	(vectorizable_live_operation): Mark scalar live-out statements that
	are retained by the vector code.
	(vect_transform_loop_stmt): Remove seen_store argument.  Mark gconds
	in nested loops as being needed by the vector code.  Replace the
	outer loop's gcond with a dummy condition.
	(vect_transform_loop): Update calls accordingly.  Don't remove
	scalar stores or calls here; call vect_remove_dead_scalar_stmts
	instead.
	* tree-vect-loop-manip.c (vect_update_ivs_after_vectorizer): Don't
	remove PHIs that were classified as reductions but never actually
	vectorized.

Index: gcc/tree-vectorizer.h
===================================================================
--- gcc/tree-vectorizer.h	2018-08-28 12:05:11.466982927 +0100
+++ gcc/tree-vectorizer.h	2018-08-28 12:05:14.014961439 +0100
@@ -925,6 +925,10 @@ struct _stmt_vec_info {
   /* For both loads and stores.  */
   bool simd_lane_access_p;
 
+  /* True if the vectorized code keeps this statement in its current form.
+     Only meaningful for statements that were in the original scalar code.  */
+  bool used_by_vector_code_p;
+
   /* Classifies how the load or store is going to be implemented
      for loop vectorization.  */
   vect_memory_access_type memory_access_type;
@@ -1522,6 +1526,7 @@ extern stmt_vec_info vect_finish_replace
 extern stmt_vec_info vect_finish_stmt_generation (stmt_vec_info, gimple *,
 						  gimple_stmt_iterator *);
 extern bool vect_mark_stmts_to_be_vectorized (loop_vec_info);
+extern void vect_mark_used_by_vector_code (stmt_vec_info);
 extern tree vect_get_store_rhs (stmt_vec_info);
 extern tree vect_get_vec_def_for_operand_1 (stmt_vec_info, enum vect_def_type);
 extern tree vect_get_vec_def_for_operand (tree, stmt_vec_info, tree = NULL);
@@ -1532,9 +1537,8 @@ extern void vect_get_vec_defs_for_stmt_c
 extern tree vect_init_vector (stmt_vec_info, tree, tree,
                               gimple_stmt_iterator *);
 extern tree vect_get_vec_def_for_stmt_copy (vec_info *, tree);
-extern bool vect_transform_stmt (stmt_vec_info, gimple_stmt_iterator *,
+extern void vect_transform_stmt (stmt_vec_info, gimple_stmt_iterator *,
 				 slp_tree, slp_instance);
-extern void vect_remove_stores (stmt_vec_info);
 extern bool vect_analyze_stmt (stmt_vec_info, bool *, slp_tree, slp_instance,
 			       stmt_vector_for_cost *);
 extern bool vectorizable_condition (stmt_vec_info, gimple_stmt_iterator *,
@@ -1554,6 +1558,7 @@ extern gcall *vect_gen_while (tree, tree
 extern tree vect_gen_while_not (gimple_seq *, tree, tree, tree);
 extern bool vect_get_vector_types_for_stmt (stmt_vec_info, tree *, tree *);
 extern tree vect_get_mask_type_for_stmt (stmt_vec_info);
+extern void vect_remove_dead_scalar_stmts (vec_info *);
 
 /* In tree-vect-data-refs.c.  */
 extern bool vect_can_force_dr_alignment_p (const_tree, unsigned int);
Index: gcc/tree-vectorizer.c
===================================================================
--- gcc/tree-vectorizer.c	2018-08-28 12:05:11.466982927 +0100
+++ gcc/tree-vectorizer.c	2018-08-28 12:05:14.014961439 +0100
@@ -654,8 +654,13 @@ vec_info::remove_stmt (stmt_vec_info stm
   set_vinfo_for_stmt (stmt_info->stmt, NULL);
   gimple_stmt_iterator si = gsi_for_stmt (stmt_info->stmt);
   unlink_stmt_vdef (stmt_info->stmt);
-  gsi_remove (&si, true);
-  release_defs (stmt_info->stmt);
+  if (is_a <gphi *> (stmt_info->stmt))
+    remove_phi_node (&si, true);
+  else
+    {
+      gsi_remove (&si, true);
+      release_defs (stmt_info->stmt);
+    }
   stmt_info->block->remove (stmt_info);
   free_stmt_vec_info (stmt_info);
 }
Index: gcc/tree-vect-stmts.c
===================================================================
--- gcc/tree-vect-stmts.c	2018-08-28 12:05:11.466982927 +0100
+++ gcc/tree-vect-stmts.c	2018-08-28 12:05:14.014961439 +0100
@@ -766,6 +766,34 @@ vect_mark_stmts_to_be_vectorized (loop_v
   return true;
 }
 
+/* Record that scalar statement STMT_INFO is needed by the vectorized code.  */
+
+void
+vect_mark_used_by_vector_code (stmt_vec_info stmt_info)
+{
+  vec_info *vinfo = stmt_info->vinfo;
+  auto_vec<stmt_vec_info, 16> worklist;
+  worklist.quick_push (stmt_info);
+  stmt_info->used_by_vector_code_p = true;
+  do
+    {
+      stmt_info = worklist.pop ();
+      ssa_op_iter iter;
+      use_operand_p use;
+      FOR_EACH_PHI_OR_STMT_USE (use, stmt_info->stmt, iter, SSA_OP_USE)
+	{
+	  tree op = USE_FROM_PTR (use);
+	  stmt_vec_info def_stmt_info = vinfo->lookup_def (op);
+	  if (def_stmt_info && !def_stmt_info->used_by_vector_code_p)
+	    {
+	      def_stmt_info->used_by_vector_code_p = 1;
+	      worklist.safe_push (def_stmt_info);
+	    }
+	}
+    }
+  while (!worklist.is_empty ());
+}
+
 /* Compute the prologue cost for invariant or constant operands.  */
 
 static unsigned
@@ -3081,7 +3109,6 @@ vectorizable_call (stmt_vec_info stmt_in
   auto_vec<tree, 8> orig_vargs;
   enum { NARROW, NONE, WIDEN } modifier;
   size_t i, nargs;
-  tree lhs;
 
   if (!STMT_VINFO_RELEVANT_P (stmt_info) && !bb_vinfo)
     return false;
@@ -3579,22 +3606,6 @@ vectorizable_call (stmt_vec_info stmt_in
     return false;
 
   vargs.release ();
-
-  /* The call in STMT might prevent it from being removed in dce.
-     We however cannot remove it here, due to the way the ssa name
-     it defines is mapped to the new definition.  So just replace
-     rhs of the statement with something harmless.  */
-
-  if (slp_node)
-    return true;
-
-  stmt_info = vect_orig_stmt (stmt_info);
-  lhs = gimple_get_lhs (stmt_info->stmt);
-
-  gassign *new_stmt
-    = gimple_build_assign (lhs, build_zero_cst (TREE_TYPE (lhs)));
-  vinfo->replace_stmt (gsi, stmt_info, new_stmt);
-
   return true;
 }
 
@@ -3703,7 +3714,7 @@ vectorizable_simd_clone_call (stmt_vec_i
 {
   tree vec_dest;
   tree scalar_dest;
-  tree op, type;
+  tree op;
   tree vec_oprnd0 = NULL_TREE;
   stmt_vec_info prev_stmt_info;
   tree vectype;
@@ -3717,7 +3728,7 @@ vectorizable_simd_clone_call (stmt_vec_i
   auto_vec<simd_call_arg_info> arginfo;
   vec<tree> vargs = vNULL;
   size_t i, nargs;
-  tree lhs, rtype, ratype;
+  tree rtype, ratype;
   vec<constructor_elt, va_gc> *ret_ctor_elts = NULL;
 
   /* Is STMT a vectorizable call?   */
@@ -4310,27 +4321,6 @@ vectorizable_simd_clone_call (stmt_vec_i
     }
 
   vargs.release ();
-
-  /* The call in STMT might prevent it from being removed in dce.
-     We however cannot remove it here, due to the way the ssa name
-     it defines is mapped to the new definition.  So just replace
-     rhs of the statement with something harmless.  */
-
-  if (slp_node)
-    return true;
-
-  gimple *new_stmt;
-  if (scalar_dest)
-    {
-      type = TREE_TYPE (scalar_dest);
-      lhs = gimple_call_lhs (vect_orig_stmt (stmt_info)->stmt);
-      new_stmt = gimple_build_assign (lhs, build_zero_cst (type));
-    }
-  else
-    new_stmt = gimple_build_nop ();
-  vinfo->replace_stmt (gsi, vect_orig_stmt (stmt_info), new_stmt);
-  unlink_stmt_vdef (stmt);
-
   return true;
 }
 
@@ -5643,6 +5633,9 @@ vectorizable_shift (stmt_vec_info stmt_i
     dump_printf_loc (MSG_NOTE, vect_location,
                      "transform binary/unary operation.\n");
 
+  if (scalar_shift_arg && op1_def_stmt_info)
+    vect_mark_used_by_vector_code (op1_def_stmt_info);
+
   /* Handle def.  */
   vec_dest = vect_create_destination_var (scalar_dest, vectype);
 
@@ -7649,6 +7642,8 @@ vectorizable_load (stmt_vec_info stmt_in
 	    (loop_preheader_edge (loop),
 	     gimple_build_assign (scalar_dest, rhs));
 	}
+      else
+	vect_mark_used_by_vector_code (stmt_info);
       /* These copies are all equivalent, but currently the representation
 	 requires a separate STMT_VINFO_VEC_STMT for each one.  */
       prev_stmt_info = NULL;
@@ -9629,12 +9624,11 @@ vect_analyze_stmt (stmt_vec_info stmt_in
 
    Create a vectorized stmt to replace STMT_INFO, and insert it at BSI.  */
 
-bool
+void
 vect_transform_stmt (stmt_vec_info stmt_info, gimple_stmt_iterator *gsi,
 		     slp_tree slp_node, slp_instance slp_node_instance)
 {
   vec_info *vinfo = stmt_info->vinfo;
-  bool is_store = false;
   stmt_vec_info vec_stmt = NULL;
   bool done;
 
@@ -9689,18 +9683,6 @@ vect_transform_stmt (stmt_vec_info stmt_
     case store_vec_info_type:
       done = vectorizable_store (stmt_info, gsi, &vec_stmt, slp_node, NULL);
       gcc_assert (done);
-      if (STMT_VINFO_GROUPED_ACCESS (stmt_info) && !slp_node)
-	{
-	  /* In case of interleaving, the whole chain is vectorized when the
-	     last store in the chain is reached.  Store stmts before the last
-	     one are skipped, and there vec_stmt_info shouldn't be freed
-	     meanwhile.  */
-	  stmt_vec_info group_info = DR_GROUP_FIRST_ELEMENT (stmt_info);
-	  if (DR_GROUP_STORE_COUNT (group_info) == DR_GROUP_SIZE (group_info))
-	    is_store = true;
-	}
-      else
-	is_store = true;
       break;
 
     case condition_vec_info_type:
@@ -9791,30 +9773,9 @@ vect_transform_stmt (stmt_vec_info stmt_
 
   if (vec_stmt)
     STMT_VINFO_VEC_STMT (stmt_info) = vec_stmt;
-
-  return is_store;
 }
 
 
-/* Remove a group of stores (for SLP or interleaving), free their
-   stmt_vec_info.  */
-
-void
-vect_remove_stores (stmt_vec_info first_stmt_info)
-{
-  vec_info *vinfo = first_stmt_info->vinfo;
-  stmt_vec_info next_stmt_info = first_stmt_info;
-
-  while (next_stmt_info)
-    {
-      stmt_vec_info tmp = DR_GROUP_NEXT_ELEMENT (next_stmt_info);
-      next_stmt_info = vect_orig_stmt (next_stmt_info);
-      /* Free the attached stmt_vec_info and remove the stmt.  */
-      vinfo->remove_stmt (next_stmt_info);
-      next_stmt_info = tmp;
-    }
-}
-
 /* Function get_vectype_for_scalar_type_and_size.
 
    Returns the vector type corresponding to SCALAR_TYPE  and SIZE as supported
@@ -10852,3 +10813,118 @@ vect_get_mask_type_for_stmt (stmt_vec_in
     }
   return mask_type;
 }
+
+/* Handle vect_remove_dead_scalar_stmts for statement STMT_INFO.  */
+
+static void
+vect_maybe_remove_scalar_stmt (stmt_vec_info stmt_info)
+{
+  vec_info *vinfo = stmt_info->vinfo;
+  bool bb_p = is_a <bb_vec_info> (vinfo);
+
+  /* Keep scalar statements that are needed by the vectorized code,
+     such as the phi in a "fold left" reduction or the load in a
+     load-and-broadcast operation.  */
+  if (stmt_info->used_by_vector_code_p)
+    return;
+
+  /* Check for types of statement that we can handle.  */
+  if (!is_a <gphi *> (stmt_info->stmt)
+      && !is_a <gassign *> (stmt_info->stmt)
+      && !is_a <gcall *> (stmt_info->stmt))
+    return;
+
+  /* Check whether the function still contains uses of the statement.  */
+  tree lhs = gimple_get_lhs (stmt_info->stmt);
+  if (lhs && TREE_CODE (lhs) == SSA_NAME && !has_zero_uses (lhs))
+    {
+      /* We now have to decide whether the statement is dead despite
+	 still having uses.  We cannot prove this for BB vectorization,
+	 since the vectorization region includes unrelated statements
+	 that might well not be dead.  We also need to keep virtual
+	 operand phis, since except in degenerate cases, the vector
+	 loop will contain virtual operands whenever the scalar loop did.
+
+	 In all other cases, !used_by_vector_code_p guarantees that
+	 the statement is dead for loop vectorization.  */
+      if (bb_p || virtual_operand_p (lhs))
+	return;
+
+      /* Check and process all uses of the lhs.  */
+      imm_use_iterator imm_iter;
+      gimple *use_stmt;
+      FOR_EACH_IMM_USE_STMT (use_stmt, imm_iter, lhs)
+	{
+	  if (is_gimple_debug (use_stmt))
+	    continue;
+
+	  /* The use must be either a loop phi (which we'll delete later)
+	     or an exit phi.  */
+	  gphi *use_phi = dyn_cast <gphi *> (use_stmt);
+	  gcc_assert (use_phi);
+	  if (!vinfo->lookup_stmt (use_phi))
+	    {
+	      /* It's an exit phi, and the phi must itself be dead code.  */
+	      gcc_assert (has_zero_uses (gimple_phi_result (use_phi)));
+	      if (dump_enabled_p ())
+		{
+		  dump_printf_loc (MSG_NOTE, vect_location,
+				   "deleting exit phi: ");
+		  dump_gimple_stmt (MSG_NOTE, TDF_SLIM, use_phi, 0);
+		}
+	      gimple_stmt_iterator gsi = gsi_for_stmt (use_phi);
+	      remove_phi_node (&gsi, true);
+	    }
+	}
+    }
+
+  /* The loop vectorizer decides for each statement whether the statement
+     should be vectorized, dropped or kept as-is.  We dealt with the last
+     case above, so anything left can be removed.  However, the region
+     used in BB vectorization includes unrelated statements that we should
+     only drop if we can prove they are dead.  */
+  if (bb_p
+      && ((lhs && TREE_CODE (lhs) != SSA_NAME)
+	  || gimple_vdef (stmt_info->stmt)
+	  || gimple_has_side_effects (stmt_info->stmt)))
+    {
+      stmt_vec_info final_info = vect_stmt_to_vectorize (stmt_info);
+      /* Keep the statement unless it was replaced by a vector version.
+
+	 Note that RELEVANT_P is only meaningful if the statement is still
+	 part of an SLP instance.  It can be set on other statements that
+	 were tentatively part of an SLP instance that we had to abandon.  */
+      if (STMT_VINFO_NUM_SLP_USES (final_info) == 0
+	  || !STMT_VINFO_RELEVANT_P (final_info))
+	return;
+    }
+
+  if (dump_enabled_p ())
+    {
+      dump_printf_loc (MSG_NOTE, vect_location, "deleting scalar stmt: ");
+      dump_gimple_stmt (MSG_NOTE, TDF_SLIM, stmt_info->stmt, 0);
+    }
+  vinfo->remove_stmt (stmt_info);
+}
+
+/* Called after vectorizing VINFO.  Remove any scalar statements that
+   are no longer needed.  */
+
+void
+vect_remove_dead_scalar_stmts (vec_info *vinfo)
+{
+  DUMP_VECT_SCOPE ("vect_remove_dead_scalar_stmts");
+
+  unsigned int i;
+  vec_basic_block *vec_bb;
+  FOR_EACH_VEC_ELT_REVERSE (vinfo->blocks, i, vec_bb)
+    {
+      stmt_vec_info prev_stmt_info;
+      for (stmt_vec_info stmt_info = vec_bb->last (); stmt_info;
+	   stmt_info = prev_stmt_info)
+	{
+	  prev_stmt_info = stmt_info->prev;
+	  vect_maybe_remove_scalar_stmt (stmt_info);
+	}
+    }
+}
Index: gcc/tree-vect-slp.c
===================================================================
--- gcc/tree-vect-slp.c	2018-08-28 12:05:11.462982962 +0100
+++ gcc/tree-vect-slp.c	2018-08-28 12:05:14.010961472 +0100
@@ -2975,6 +2975,7 @@ vect_slp_bb (basic_block bb)
 
 	  bb_vinfo->shared->check_datarefs ();
 	  vect_schedule_slp (bb_vinfo);
+	  vect_remove_dead_scalar_stmts (bb_vinfo);
 
 	  unsigned HOST_WIDE_INT bytes;
 	  if (current_vector_size.is_constant (&bytes))
@@ -3954,43 +3955,6 @@ vect_schedule_slp_instance (slp_tree nod
       }
 }
 
-/* Replace scalar calls from SLP node NODE with setting of their lhs to zero.
-   For loop vectorization this is done in vectorizable_call, but for SLP
-   it needs to be deferred until end of vect_schedule_slp, because multiple
-   SLP instances may refer to the same scalar stmt.  */
-
-static void
-vect_remove_slp_scalar_calls (slp_tree node)
-{
-  gimple *new_stmt;
-  gimple_stmt_iterator gsi;
-  int i;
-  slp_tree child;
-  tree lhs;
-  stmt_vec_info stmt_info;
-
-  if (SLP_TREE_DEF_TYPE (node) != vect_internal_def)
-    return;
-
-  FOR_EACH_VEC_ELT (SLP_TREE_CHILDREN (node), i, child)
-    vect_remove_slp_scalar_calls (child);
-
-  FOR_EACH_VEC_ELT (SLP_TREE_SCALAR_STMTS (node), i, stmt_info)
-    {
-      gcall *stmt = dyn_cast <gcall *> (stmt_info->stmt);
-      if (!stmt || gimple_bb (stmt) == NULL)
-	continue;
-      if (is_pattern_stmt_p (stmt_info)
-	  || !PURE_SLP_STMT (stmt_info))
-	continue;
-      lhs = gimple_call_lhs (stmt);
-      new_stmt = gimple_build_assign (lhs, build_zero_cst (TREE_TYPE (lhs)));
-      gsi = gsi_for_stmt (stmt);
-      stmt_info->vinfo->replace_stmt (&gsi, stmt_info, new_stmt);
-      SSA_NAME_DEF_STMT (gimple_assign_lhs (new_stmt)) = new_stmt;
-    }
-}
-
 /* Generate vector code for all SLP instances in the loop/basic block.  */
 
 void
@@ -4013,32 +3977,4 @@ vect_schedule_slp (vec_info *vinfo)
                          "vectorizing stmts using SLP.\n");
     }
   delete bst_map;
-
-  FOR_EACH_VEC_ELT (slp_instances, i, instance)
-    {
-      slp_tree root = SLP_INSTANCE_TREE (instance);
-      stmt_vec_info store_info;
-      unsigned int j;
-
-      /* Remove scalar call stmts.  Do not do this for basic-block
-	 vectorization as not all uses may be vectorized.
-	 ???  Why should this be necessary?  DCE should be able to
-	 remove the stmts itself.
-	 ???  For BB vectorization we can as well remove scalar
-	 stmts starting from the SLP tree root if they have no
-	 uses.  */
-      if (is_a <loop_vec_info> (vinfo))
-	vect_remove_slp_scalar_calls (root);
-
-      for (j = 0; SLP_TREE_SCALAR_STMTS (root).iterate (j, &store_info)
-                  && j < SLP_INSTANCE_GROUP_SIZE (instance); j++)
-        {
-	  if (!STMT_VINFO_DATA_REF (store_info))
-	    break;
-
-	  store_info = vect_orig_stmt (store_info);
-	  /* Free the attached stmt_vec_info and remove the stmt.  */
-	  vinfo->remove_stmt (store_info);
-        }
-    }
 }
Index: gcc/tree-vect-loop.c
===================================================================
--- gcc/tree-vect-loop.c	2018-08-28 12:05:11.458982995 +0100
+++ gcc/tree-vect-loop.c	2018-08-28 12:05:14.010961472 +0100
@@ -6086,18 +6086,24 @@ vectorizable_reduction (stmt_vec_info st
 	}
 
       if (STMT_VINFO_REDUC_TYPE (stmt_info) == FOLD_LEFT_REDUCTION)
-	/* Leave the scalar phi in place.  Note that checking
-	   STMT_VINFO_VEC_REDUCTION_TYPE (as below) only works
-	   for reductions involving a single statement.  */
-	return true;
+	{
+	  /* Leave the scalar phi in place.  Note that checking
+	     STMT_VINFO_VEC_REDUCTION_TYPE (as below) only works
+	     for reductions involving a single statement.  */
+	  vect_mark_used_by_vector_code (stmt_info);
+	  return true;
+	}
 
       stmt_vec_info reduc_stmt_info = STMT_VINFO_REDUC_DEF (stmt_info);
       reduc_stmt_info = vect_stmt_to_vectorize (reduc_stmt_info);
 
       if (STMT_VINFO_VEC_REDUCTION_TYPE (reduc_stmt_info)
 	  == EXTRACT_LAST_REDUCTION)
-	/* Leave the scalar phi in place.  */
-	return true;
+	{
+	  /* Leave the scalar phi in place.  */
+	  vect_mark_used_by_vector_code (stmt_info);
+	  return true;
+	}
 
       gassign *reduc_stmt = as_a <gassign *> (reduc_stmt_info->stmt);
       for (unsigned k = 1; k < gimple_num_ops (reduc_stmt); ++k)
@@ -7790,6 +7796,7 @@ vectorizable_live_operation (stmt_vec_in
 	dump_printf_loc (MSG_NOTE, vect_location,
 			 "statement is simple and uses invariant.  Leaving in "
 			 "place.\n");
+      vect_mark_used_by_vector_code (stmt_info);
       return true;
     }
 
@@ -8158,13 +8165,12 @@ scale_profile_for_vect_loop (struct loop
     scale_bbs_frequencies (&loop->latch, 1, exit_l->probability / prob);
 }
 
-/* Vectorize STMT_INFO if relevant, inserting any new instructions before GSI.
-   When vectorizing STMT_INFO as a store, set *SEEN_STORE to its
-   stmt_vec_info.  */
+/* Vectorize STMT_INFO if relevant, inserting any new instructions
+   before GSI.  */
 
 static void
 vect_transform_loop_stmt (loop_vec_info loop_vinfo, stmt_vec_info stmt_info,
-			  gimple_stmt_iterator *gsi, stmt_vec_info *seen_store)
+			  gimple_stmt_iterator *gsi)
 {
   struct loop *loop = LOOP_VINFO_LOOP (loop_vinfo);
   poly_uint64 vf = LOOP_VINFO_VECT_FACTOR (loop_vinfo);
@@ -8179,6 +8185,21 @@ vect_transform_loop_stmt (loop_vec_info
   if (MAY_HAVE_DEBUG_BIND_STMTS && !STMT_VINFO_LIVE_P (stmt_info))
     vect_loop_kill_debug_uses (loop, stmt_info);
 
+  if (gcond *cond = dyn_cast <gcond *> (stmt_info->stmt))
+    {
+      if (nested_in_vect_loop_p (loop, stmt_info))
+	/* The gconds for inner loops aren't changed by vectorization.  */
+	vect_mark_used_by_vector_code (stmt_info);
+      else
+	{
+	  /* Replace the scalar gcond with a dummy condition for now,
+	     so that all the scalar inputs to it are dead.  We'll later
+	     replace the condition with the new vector-based one.  */
+	  gimple_cond_make_false (cond);
+	  update_stmt (cond);
+	}
+    }
+
   if (!STMT_VINFO_RELEVANT_P (stmt_info)
       && !STMT_VINFO_LIVE_P (stmt_info))
     return;
@@ -8203,8 +8224,7 @@ vect_transform_loop_stmt (loop_vec_info
   if (dump_enabled_p ())
     dump_printf_loc (MSG_NOTE, vect_location, "transform statement.\n");
 
-  if (vect_transform_stmt (stmt_info, gsi, NULL, NULL))
-    *seen_store = stmt_info;
+  vect_transform_stmt (stmt_info, gsi, NULL, NULL);
 }
 
 /* Function vect_transform_loop.
@@ -8389,7 +8409,6 @@ vect_transform_loop (loop_vec_info loop_
 	    loop_vinfo->remove_stmt (stmt_info);
 	  else
 	    {
-	      stmt_vec_info seen_store = NULL;
 	      gimple_stmt_iterator si = gsi_for_stmt (stmt_info->stmt);
 	      if (STMT_VINFO_IN_PATTERN_P (stmt_info))
 		{
@@ -8400,49 +8419,19 @@ vect_transform_loop (loop_vec_info loop_
 		      stmt_vec_info pat_stmt_info
 			= loop_vinfo->lookup_stmt (gsi_stmt (subsi));
 		      vect_transform_loop_stmt (loop_vinfo, pat_stmt_info,
-						&si, &seen_store);
+						&si);
 		    }
 		  stmt_vec_info pat_stmt_info
 		    = STMT_VINFO_RELATED_STMT (stmt_info);
-		  vect_transform_loop_stmt (loop_vinfo, pat_stmt_info, &si,
-					    &seen_store);
-		}
-	      vect_transform_loop_stmt (loop_vinfo, stmt_info, &si,
-					&seen_store);
-	      if (seen_store)
-		{
-		  if (STMT_VINFO_GROUPED_ACCESS (seen_store))
-		    /* Interleaving.  If IS_STORE is TRUE, the
-		       vectorization of the interleaving chain was
-		       completed - free all the stores in the chain.  */
-		    vect_remove_stores (DR_GROUP_FIRST_ELEMENT (seen_store));
-		  else
-		    /* Free the attached stmt_vec_info and remove the stmt.  */
-		    loop_vinfo->remove_stmt (stmt_info);
-		}
-	    }
-	}
-
-      /* Stub out scalar statements that must not survive vectorization.
-	 Doing this here helps with grouped statements, or statements that
-	 are involved in patterns.  */
-      for (gimple_stmt_iterator gsi = gsi_start_bb (vec_bb->bb ());
-	   !gsi_end_p (gsi); gsi_next (&gsi))
-	{
-	  gcall *call = dyn_cast <gcall *> (gsi_stmt (gsi));
-	  if (call && gimple_call_internal_p (call, IFN_MASK_LOAD))
-	    {
-	      tree lhs = gimple_get_lhs (call);
-	      if (!VECTOR_TYPE_P (TREE_TYPE (lhs)))
-		{
-		  tree zero = build_zero_cst (TREE_TYPE (lhs));
-		  gimple *new_stmt = gimple_build_assign (lhs, zero);
-		  gsi_replace (&gsi, new_stmt, true);
+		  vect_transform_loop_stmt (loop_vinfo, pat_stmt_info, &si);
 		}
+	      vect_transform_loop_stmt (loop_vinfo, stmt_info, &si);
 	    }
 	}
     }				/* BBs in loop */
 
+  vect_remove_dead_scalar_stmts (loop_vinfo);
+
   /* The vectorization factor is always > 1, so if we use an IV increment of 1.
      a zero NITERS becomes a nonzero NITERS_VECTOR.  */
   if (integer_onep (step_vector))
Index: gcc/tree-vect-loop-manip.c
===================================================================
--- gcc/tree-vect-loop-manip.c	2018-07-31 15:26:15.986653204 +0100
+++ gcc/tree-vect-loop-manip.c	2018-08-28 12:05:14.010961472 +0100
@@ -1515,6 +1515,14 @@ vect_update_ivs_after_vectorizer (loop_v
       /* Skip reduction and virtual phis.  */
       if (!iv_phi_p (phi_info))
 	{
+	  /* ??? We can sometimes see dead PHI loops that get classified
+	     as reductions but are never actually vectorized.  In this case
+	     it should be safe to delete the code or use an arbitrary
+	     initial value, but the safest thing is just to keep it
+	     and let a later pass clean it up.  */
+	  if (!virtual_operand_p (PHI_RESULT (phi))
+	      && !STMT_VINFO_RELEVANT_P (phi_info))
+	    vect_mark_used_by_vector_code (phi_info);
 	  if (dump_enabled_p ())
 	    dump_printf_loc (MSG_NOTE, vect_location,
 			     "reduc or virtual phi. skip.\n");

^ permalink raw reply	[flat|nested] 22+ messages in thread

* [6/6] Link imm uses for pattern stmts
  2018-08-28 11:19 [0/6] Make vector pattern statements less special Richard Sandiford
                   ` (3 preceding siblings ...)
  2018-08-28 11:23 ` [4/6] Make the vectoriser do its own DCE Richard Sandiford
@ 2018-08-28 11:25 ` Richard Sandiford
  2018-08-29  7:43   ` Richard Biener
  2018-08-28 11:25 ` [5/6] Insert pattern statements into vec_basic_blocks Richard Sandiford
  5 siblings, 1 reply; 22+ messages in thread
From: Richard Sandiford @ 2018-08-28 11:25 UTC (permalink / raw)
  To: gcc-patches

One of the warts of the vectoriser IR is that it doesn't link SSA name
uses for pattern statements, leading to complicated special cases in
vect_mark_stmts_to_be_vectorized and (especially) vect_detect_hybrid_slp.
It also makes it harder to check how an SSA name is used after pattern
replacement (something I need for a later patch).

This patch adds a mode in which tree-ssa-operands.c can update
statements in the same non-invasive way as for debug statements.
It then uses this mode to update pattern statements when adding
them to a vec_basic_block, so that pattern statements become
even more like statements that existed from the outset.


2018-08-28  Richard Sandiford  <richard.sandiford@arm.com>

gcc/
	* tree-ssa-operands.h (update_stmt_operands): Add a transparent_p
	argument.
	* tree-ssa-operands.c (opf_transparent, opf_sticky): New macros.
	(get_mem_ref_operands, get_tmr_operands): Preserve opf_sticky
	rather than just opf_no_vops.
	(get_expr_operands): Preserve opf_sticky bits in the use flags.
	Assert that opf_no_vops and opf_transparent are already set
	for the debug statements.  Use opf_transparent rather than
	is_gimple_debug when deciding whether to mark something as
	having its address taken.
	(parse_ssa_operands): Add a transparent_p argument.  Set the
	opf_no_vops and opf_transparent flags when the argument is true,
	or when dealing with debug statements.  Check opf_no_vops before
	adding vuses and vdefs.
	(build_ssa_operands): Add a transparent_p argument and pass it to
	parse_ssa_operands.
	(verify_ssa_operands): Update call to parse_ssa_operands.
	(update_stmt_operands): Add a transparent_p argument and pass it to
	build_ssa_operands.
	* gimple-ssa.h (update_stmt, update_stmt_if_modified)
	(update_stmt_fn): Add an optional transparent_p parameter and
	update call to update_stmt_operands.
	* tree-vect-slp.c (vect_detect_hybrid_slp_1): Delete.
	(vect_detect_hybrid_slp_2): Likewise.
	(vect_detect_hybrid_slp): Don't treat pattern statements specially.
	* tree-vect-stmts.c (vect_mark_stmts_to_be_vectorized): Likewise.
	(vect_remove_dead_scalar_stmts): Remove pattern statements from
	the containing vec_info.
	* tree-vectorizer.h (vec_info::add_pattern_stmt_to_block): Declare.
	* tree-vectorizer.c (vec_basic_block::add_to_end)
	(vec_basic_block::add_before): Call add_pattern_stmt_to_block.
	(vec_basic_block::remove, vec_info::remove_stmt): Call
	remove_pattern_stmt_from_block.
	(vec_basic_block::add_pattern_stmt_to_block): New function.
	(remove_pattern_stmt_from_block): Likewise.
	(vec_info::free_stmt_vec_info): Handle pattern statements.
	(vec_info::lookup_single_use): Accept pattern statements
	as well as original statements.  Ignore uses in statements
	that have been replaced by a pattern statement.
	* tree-vect-patterns.c (vect_init_pattern_stmt): Don't call
	gimple_set_bb.
	(vect_look_through_possible_promotion): Use vinfo->lookup_single_use
	instead of has_single_use.  Track single uses for pattern statements
	too.

Index: gcc/tree-ssa-operands.h
===================================================================
--- gcc/tree-ssa-operands.h	2018-05-02 08:37:32.405761509 +0100
+++ gcc/tree-ssa-operands.h	2018-08-28 12:05:19.262917177 +0100
@@ -94,7 +94,7 @@ extern void init_ssa_operands (struct fu
 extern void fini_ssa_operands (struct function *);
 extern bool verify_ssa_operands (struct function *, gimple *stmt);
 extern void free_stmt_operands (struct function *, gimple *);
-extern void update_stmt_operands (struct function *, gimple *);
+extern void update_stmt_operands (struct function *, gimple *, bool);
 extern void swap_ssa_operands (gimple *, tree *, tree *);
 extern bool verify_imm_links (FILE *f, tree var);
 
Index: gcc/tree-ssa-operands.c
===================================================================
--- gcc/tree-ssa-operands.c	2018-08-28 11:25:46.242879876 +0100
+++ gcc/tree-ssa-operands.c	2018-08-28 12:05:19.262917177 +0100
@@ -99,6 +99,14 @@ #define opf_not_non_addressable (1 << 4)
 /* Operand is having its address taken.  */
 #define opf_address_taken (1 << 5)
 
+/* Operand must have no effect on code generation.  This is used for
+   debug statements, and also for statements that a pass has no intention
+   of adding to the block in their current form.  */
+#define opf_transparent (1 << 6)
+
+/* Flags that must never be dropped.  */
+#define opf_sticky (opf_no_vops | opf_transparent)
+
 /* Array for building all the use operands.  */
 static vec<tree *> build_uses;
 
@@ -590,7 +598,7 @@ get_mem_ref_operands (struct function *f
   /* If requested, add a USE operand for the base pointer.  */
   get_expr_operands (fn, stmt, pptr,
 		     opf_non_addressable | opf_use
-		     | (flags & (opf_no_vops|opf_not_non_addressable)));
+		     | (flags & (opf_sticky | opf_not_non_addressable)));
 }
 
 
@@ -605,11 +613,11 @@ get_tmr_operands (struct function *fn, g
 
   /* First record the real operands.  */
   get_expr_operands (fn, stmt,
-		     &TMR_BASE (expr), opf_use | (flags & opf_no_vops));
+		     &TMR_BASE (expr), opf_use | (flags & opf_sticky));
   get_expr_operands (fn, stmt,
-		     &TMR_INDEX (expr), opf_use | (flags & opf_no_vops));
+		     &TMR_INDEX (expr), opf_use | (flags & opf_sticky));
   get_expr_operands (fn, stmt,
-		     &TMR_INDEX2 (expr), opf_use | (flags & opf_no_vops));
+		     &TMR_INDEX2 (expr), opf_use | (flags & opf_sticky));
 
   add_virtual_operand (fn, stmt, flags);
 }
@@ -703,14 +711,14 @@ get_expr_operands (struct function *fn,
   enum tree_code code;
   enum tree_code_class codeclass;
   tree expr = *expr_p;
-  int uflags = opf_use;
+  int uflags = opf_use | (flags & opf_sticky);
+  gcc_checking_assert (!is_gimple_debug (stmt)
+		       || ((flags & opf_no_vops)
+			   && (flags & opf_transparent)));
 
   if (expr == NULL)
     return;
 
-  if (is_gimple_debug (stmt))
-    uflags |= (flags & opf_no_vops);
-
   code = TREE_CODE (expr);
   codeclass = TREE_CODE_CLASS (code);
 
@@ -723,7 +731,7 @@ get_expr_operands (struct function *fn,
 	 resolution).  */
       if ((!(flags & opf_non_addressable)
 	   || (flags & opf_not_non_addressable))
-	  && !is_gimple_debug (stmt))
+	  && !(flags & opf_transparent))
 	mark_address_taken (TREE_OPERAND (expr, 0));
 
       /* Otherwise, there may be variables referenced inside but there
@@ -885,43 +893,50 @@ get_expr_operands (struct function *fn,
 
 
 /* Parse STMT looking for operands.  When finished, the various
-   build_* operand vectors will have potential operands in them.  */
+   build_* operand vectors will have potential operands in them.
+   TRANSPARENT_P as for update_stmt_operands.  */
 
 static void
-parse_ssa_operands (struct function *fn, gimple *stmt)
+parse_ssa_operands (struct function *fn, gimple *stmt, bool transparent_p)
 {
   enum gimple_code code = gimple_code (stmt);
   size_t i, n, start = 0;
+  int flags = (transparent_p || code == GIMPLE_DEBUG
+	       ? opf_no_vops | opf_transparent : 0);
 
   switch (code)
     {
     case GIMPLE_ASM:
+      /* Not supported yet (but could be if needed).  */
+      gcc_assert (!transparent_p);
       get_asm_stmt_operands (fn, as_a <gasm *> (stmt));
       break;
 
     case GIMPLE_TRANSACTION:
       /* The start of a transaction is a memory barrier.  */
-      add_virtual_operand (fn, stmt, opf_def | opf_use);
+      add_virtual_operand (fn, stmt, opf_def | opf_use | flags);
       break;
 
     case GIMPLE_DEBUG:
       if (gimple_debug_bind_p (stmt)
 	  && gimple_debug_bind_has_value_p (stmt))
 	get_expr_operands (fn, stmt, gimple_debug_bind_get_value_ptr (stmt),
-			   opf_use | opf_no_vops);
+			   opf_use | flags);
       break;
 
     case GIMPLE_RETURN:
-      append_vuse (gimple_vop (fn));
+      if (!(flags & opf_no_vops))
+	append_vuse (gimple_vop (fn));
       goto do_default;
 
     case GIMPLE_CALL:
       /* Add call-clobbered operands, if needed.  */
-      maybe_add_call_vops (fn, as_a <gcall *> (stmt));
+      if (!(flags & opf_no_vops))
+	maybe_add_call_vops (fn, as_a <gcall *> (stmt));
       /* FALLTHRU */
 
     case GIMPLE_ASSIGN:
-      get_expr_operands (fn, stmt, gimple_op_ptr (stmt, 0), opf_def);
+      get_expr_operands (fn, stmt, gimple_op_ptr (stmt, 0), opf_def | flags);
       start = 1;
       /* FALLTHRU */
 
@@ -929,22 +944,23 @@ parse_ssa_operands (struct function *fn,
     do_default:
       n = gimple_num_ops (stmt);
       for (i = start; i < n; i++)
-	get_expr_operands (fn, stmt, gimple_op_ptr (stmt, i), opf_use);
+	get_expr_operands (fn, stmt, gimple_op_ptr (stmt, i), opf_use | flags);
       break;
     }
 }
 
 
-/* Create an operands cache for STMT.  */
+/* Create an operands cache for STMT.  TRANSPARENT_P as for
+   update_stmt_operands.  */
 
 static void
-build_ssa_operands (struct function *fn, gimple *stmt)
+build_ssa_operands (struct function *fn, gimple *stmt, bool transparent_p)
 {
   /* Initially assume that the statement has no volatile operands.  */
   gimple_set_has_volatile_ops (stmt, false);
 
   start_ssa_stmt_operands ();
-  parse_ssa_operands (fn, stmt);
+  parse_ssa_operands (fn, stmt, transparent_p);
   finalize_ssa_stmt_operands (fn, stmt);
 }
 
@@ -963,7 +979,7 @@ verify_ssa_operands (struct function *fn
   /* build_ssa_operands w/o finalizing them.  */
   gimple_set_has_volatile_ops (stmt, false);
   start_ssa_stmt_operands ();
-  parse_ssa_operands (fn, stmt);
+  parse_ssa_operands (fn, stmt, false);
 
   /* Now verify the built operands are the same as present in STMT.  */
   def = gimple_vdef (stmt);
@@ -1065,10 +1081,11 @@ free_stmt_operands (struct function *fn,
 }
 
 
-/* Get the operands of statement STMT.  */
+/* Get the operands of statement STMT.  TRANSPARENT_P says that opf_transparent
+   semantics should be used whatever form STMT happens to have.  */
 
 void
-update_stmt_operands (struct function *fn, gimple *stmt)
+update_stmt_operands (struct function *fn, gimple *stmt, bool transparent_p)
 {
   /* If update_stmt_operands is called before SSA is initialized, do
      nothing.  */
@@ -1078,7 +1095,7 @@ update_stmt_operands (struct function *f
   timevar_push (TV_TREE_OPS);
 
   gcc_assert (gimple_modified_p (stmt));
-  build_ssa_operands (fn, stmt);
+  build_ssa_operands (fn, stmt, transparent_p);
   gimple_set_modified (stmt, false);
 
   timevar_pop (TV_TREE_OPS);
Index: gcc/gimple-ssa.h
===================================================================
--- gcc/gimple-ssa.h	2018-05-02 08:37:33.501751141 +0100
+++ gcc/gimple-ssa.h	2018-08-28 12:05:19.262917177 +0100
@@ -164,36 +164,42 @@ gimple_vdef_op (gimple *g)
   return NULL_DEF_OPERAND_P;
 }
 
-/* Mark statement S as modified, and update it.  */
+/* Mark statement S as modified, and update it.  TRANSPARENT_P is true
+   if the update must have no effect on code generation, in much the
+   same way as for debug statements.  This means in particular that the
+   statement should not cause things to be marked addressable and should
+   not use virtual operands.  */
 
 static inline void
-update_stmt (gimple *s)
+update_stmt (gimple *s, bool transparent_p = false)
 {
   if (gimple_has_ops (s))
     {
       gimple_set_modified (s, true);
-      update_stmt_operands (cfun, s);
+      update_stmt_operands (cfun, s, transparent_p);
     }
 }
 
-/* Update statement S if it has been optimized.  */
+/* Update statement S if it has been optimized.  TRANSPARENT_P is as for
+   update_stmt.  */
 
 static inline void
-update_stmt_if_modified (gimple *s)
+update_stmt_if_modified (gimple *s, bool transparent_p = false)
 {
   if (gimple_modified_p (s))
-    update_stmt_operands (cfun, s);
+    update_stmt_operands (cfun, s, transparent_p);
 }
 
-/* Mark statement S as modified, and update it.  */
+/* Mark statement S as modified, and update it.  TRANSPARENT_P is as for
+   update_stmt.  */
 
 static inline void
-update_stmt_fn (struct function *fn, gimple *s)
+update_stmt_fn (struct function *fn, gimple *s, bool transparent_p = false)
 {
   if (gimple_has_ops (s))
     {
       gimple_set_modified (s, true);
-      update_stmt_operands (fn, s);
+      update_stmt_operands (fn, s, transparent_p);
     }
 }
 
Index: gcc/tree-vect-slp.c
===================================================================
--- gcc/tree-vect-slp.c	2018-08-28 12:05:16.522940287 +0100
+++ gcc/tree-vect-slp.c	2018-08-28 12:05:19.262917177 +0100
@@ -2302,50 +2302,6 @@ vect_detect_hybrid_slp_stmts (slp_tree n
       vect_detect_hybrid_slp_stmts (child, i, stype);
 }
 
-/* Helpers for vect_detect_hybrid_slp walking pattern stmt uses.  */
-
-static tree
-vect_detect_hybrid_slp_1 (tree *tp, int *, void *data)
-{
-  walk_stmt_info *wi = (walk_stmt_info *)data;
-  loop_vec_info loop_vinfo = (loop_vec_info) wi->info;
-
-  if (wi->is_lhs)
-    return NULL_TREE;
-
-  stmt_vec_info def_stmt_info = loop_vinfo->lookup_def (*tp);
-  if (def_stmt_info && PURE_SLP_STMT (def_stmt_info))
-    {
-      if (dump_enabled_p ())
-	{
-	  dump_printf_loc (MSG_NOTE, vect_location, "marking hybrid: ");
-	  dump_gimple_stmt (MSG_NOTE, TDF_SLIM, def_stmt_info->stmt, 0);
-	}
-      STMT_SLP_TYPE (def_stmt_info) = hybrid;
-    }
-
-  return NULL_TREE;
-}
-
-static tree
-vect_detect_hybrid_slp_2 (gimple_stmt_iterator *gsi, bool *handled,
-			  walk_stmt_info *wi)
-{
-  loop_vec_info loop_vinfo = (loop_vec_info) wi->info;
-  stmt_vec_info use_vinfo = loop_vinfo->lookup_stmt (gsi_stmt (*gsi));
-  /* If the stmt is in a SLP instance then this isn't a reason
-     to mark use definitions in other SLP instances as hybrid.  */
-  if (! STMT_SLP_TYPE (use_vinfo)
-      && (STMT_VINFO_RELEVANT (use_vinfo)
-	  || VECTORIZABLE_CYCLE_DEF (STMT_VINFO_DEF_TYPE (use_vinfo)))
-      && ! (gimple_code (gsi_stmt (*gsi)) == GIMPLE_PHI
-	    && STMT_VINFO_DEF_TYPE (use_vinfo) == vect_reduction_def))
-    ;
-  else
-    *handled = true;
-  return NULL_TREE;
-}
-
 /* Find stmts that must be both vectorized and SLPed.  */
 
 void
@@ -2357,22 +2313,7 @@ vect_detect_hybrid_slp (loop_vec_info lo
 
   DUMP_VECT_SCOPE ("vect_detect_hybrid_slp");
 
-  /* First walk all pattern stmt in the loop and mark defs of uses as
-     hybrid because immediate uses in them are not recorded.  */
-  vec_basic_block *vec_bb;
-  FOR_EACH_VEC_ELT (loop_vinfo->blocks, i, vec_bb)
-    FOR_EACH_VEC_BB_STMT (vec_bb, stmt_info)
-      if (is_pattern_stmt_p (stmt_info))
-	{
-	  walk_stmt_info wi;
-	  memset (&wi, 0, sizeof (wi));
-	  wi.info = loop_vinfo;
-	  gimple_stmt_iterator gsi = gsi_for_stmt (stmt_info->stmt);
-	  walk_gimple_stmt (&gsi, vect_detect_hybrid_slp_2,
-			    vect_detect_hybrid_slp_1, &wi);
-	}
-
-  /* Then walk the SLP instance trees marking stmts with uses in
+  /* Walk the SLP instance trees marking stmts with uses in
      non-SLP stmts as hybrid, also propagating hybrid down the
      SLP tree, collecting the above info on-the-fly.  */
   FOR_EACH_VEC_ELT (slp_instances, i, instance)
Index: gcc/tree-vect-stmts.c
===================================================================
--- gcc/tree-vect-stmts.c	2018-08-28 12:05:16.522940287 +0100
+++ gcc/tree-vect-stmts.c	2018-08-28 12:05:19.266917143 +0100
@@ -703,54 +703,13 @@ vect_mark_stmts_to_be_vectorized (loop_v
             break;
         }
 
-      if (is_pattern_stmt_p (stmt_vinfo))
-        {
-          /* Pattern statements are not inserted into the code, so
-             FOR_EACH_PHI_OR_STMT_USE optimizes their operands out, and we
-             have to scan the RHS or function arguments instead.  */
-	  if (gassign *assign = dyn_cast <gassign *> (stmt_vinfo->stmt))
-	    {
-	      enum tree_code rhs_code = gimple_assign_rhs_code (assign);
-	      tree op = gimple_assign_rhs1 (assign);
-
-	      i = 1;
-	      if (rhs_code == COND_EXPR && COMPARISON_CLASS_P (op))
-		{
-		  if (!process_use (stmt_vinfo, TREE_OPERAND (op, 0),
-				    loop_vinfo, relevant, &worklist, false)
-		      || !process_use (stmt_vinfo, TREE_OPERAND (op, 1),
-				       loop_vinfo, relevant, &worklist, false))
-		    return false;
-		  i = 2;
-		}
-	      for (; i < gimple_num_ops (assign); i++)
-		{
-		  op = gimple_op (assign, i);
-                  if (TREE_CODE (op) == SSA_NAME
-		      && !process_use (stmt_vinfo, op, loop_vinfo, relevant,
-				       &worklist, false))
-                    return false;
-                 }
-            }
-	  else if (gcall *call = dyn_cast <gcall *> (stmt_vinfo->stmt))
-	    {
-	      for (i = 0; i < gimple_call_num_args (call); i++)
-		{
-		  tree arg = gimple_call_arg (call, i);
-		  if (!process_use (stmt_vinfo, arg, loop_vinfo, relevant,
-				    &worklist, false))
-                    return false;
-		}
-	    }
-        }
-      else
-	FOR_EACH_PHI_OR_STMT_USE (use_p, stmt_vinfo->stmt, iter, SSA_OP_USE)
-          {
-            tree op = USE_FROM_PTR (use_p);
-	    if (!process_use (stmt_vinfo, op, loop_vinfo, relevant,
-			      &worklist, false))
-              return false;
-          }
+      FOR_EACH_PHI_OR_STMT_USE (use_p, stmt_vinfo->stmt, iter, SSA_OP_USE)
+	{
+	  tree op = USE_FROM_PTR (use_p);
+	  if (!process_use (stmt_vinfo, op, loop_vinfo, relevant,
+			    &worklist, false))
+	    return false;
+	}
 
       if (STMT_VINFO_GATHER_SCATTER_P (stmt_vinfo))
 	{
@@ -10849,7 +10808,9 @@ vect_remove_dead_scalar_stmts (vec_info
 	   stmt_info = prev_stmt_info)
 	{
 	  prev_stmt_info = stmt_info->prev;
-	  if (!is_pattern_stmt_p (stmt_info))
+	  if (is_pattern_stmt_p (stmt_info))
+	    vinfo->remove_stmt (stmt_info);
+	  else
 	    vect_maybe_remove_scalar_stmt (stmt_info);
 	}
     }
Index: gcc/tree-vectorizer.h
===================================================================
--- gcc/tree-vectorizer.h	2018-08-28 12:05:16.522940287 +0100
+++ gcc/tree-vectorizer.h	2018-08-28 12:05:19.266917143 +0100
@@ -193,6 +193,8 @@ #define SLP_TREE_DEF_TYPE(S)			 (S)->def
   stmt_vec_info last () const { return m_last; }
 
 private:
+  void add_pattern_stmt_to_block (stmt_vec_info);
+
   /* The block itself.  */
   basic_block m_bb;
 
Index: gcc/tree-vectorizer.c
===================================================================
--- gcc/tree-vectorizer.c	2018-08-28 12:05:14.014961439 +0100
+++ gcc/tree-vectorizer.c	2018-08-28 12:05:19.266917143 +0100
@@ -459,6 +459,9 @@ vec_basic_block::add_to_end (stmt_vec_in
   stmt_info->block = this;
   stmt_info->prev = m_last;
   m_last = stmt_info;
+
+  if (is_pattern_stmt_p (stmt_info))
+    add_pattern_stmt_to_block (stmt_info);
 }
 
 /* Add STMT_INFO to the block, inserting it before NEXT_STMT_INFO.  */
@@ -479,6 +482,34 @@ vec_basic_block::add_before (stmt_vec_in
   stmt_info->prev = next_stmt_info->prev;
   stmt_info->next = next_stmt_info;
   next_stmt_info->prev = stmt_info;
+
+  if (is_pattern_stmt_p (stmt_info))
+    add_pattern_stmt_to_block (stmt_info);
+}
+
+/* Record that pattern statement STMT_INFO has just been added to the
+   vec_basic_block.  Adding it to the underlying basic_block would be
+   problematic because we need to be able to duplicate the original
+   scalar code in the middle of vectorization.  Instead we just set the
+   statement's gimple_bb and link its SSA uses.  */
+
+void
+vec_basic_block::add_pattern_stmt_to_block (stmt_vec_info stmt_info)
+{
+  gimple_set_bb (stmt_info->stmt, m_bb);
+  update_stmt (stmt_info->stmt, true);
+  gcc_assert (!gimple_vdef (stmt_info->stmt));
+  gcc_assert (!gimple_vuse (stmt_info->stmt));
+}
+
+/* Record that STMT_INFO has been removed from its vec_basic_block.
+   Undo the effect of add_pattern_stmt_to_block.  */
+
+static void
+remove_pattern_stmt_from_block (stmt_vec_info stmt_info)
+{
+  gimple_set_bb (stmt_info->stmt, NULL);
+  delink_stmt_imm_use (stmt_info->stmt);
 }
 
 /* Remove STMT_INFO from the block.  */
@@ -497,6 +528,9 @@ vec_basic_block::remove (stmt_vec_info s
     m_last = stmt_info->prev;
   stmt_info->block = NULL;
   stmt_info->prev = stmt_info->next = NULL;
+
+  if (is_pattern_stmt_p (stmt_info))
+    remove_pattern_stmt_from_block (stmt_info);
 }
 \f
 /* Initialize the vec_info with kind KIND_IN and target cost data
@@ -610,12 +644,37 @@ vec_info::lookup_def (tree name)
 stmt_vec_info
 vec_info::lookup_single_use (stmt_vec_info stmt_info)
 {
-  tree lhs = gimple_get_lhs (stmt_info->stmt);
-  use_operand_p dummy;
-  gimple *use_stmt;
-  if (single_imm_use (lhs, &dummy, &use_stmt))
-    return lookup_stmt (use_stmt);
-  return NULL;
+  gimple *stmts[2] = { stmt_info->stmt, NULL };
+  unsigned int num_stmts = 1;
+  /* If STMT_INFO replaces one of the original scalar statements,
+     check for uses of that statement's results too, to simulate the
+     effect of vect_stmt_to_be_vectorized.  */
+  if (is_main_pattern_stmt_p (stmt_info))
+    stmts[num_stmts++] = STMT_VINFO_RELATED_STMT (stmt_info)->stmt;
+  stmt_vec_info result = NULL;
+  for (unsigned int i = 0; i < num_stmts; ++i)
+    {
+      use_operand_p use_p;
+      imm_use_iterator imm_iter;
+      tree lhs = gimple_get_lhs (stmts[i]);
+      FOR_EACH_IMM_USE_FAST (use_p, imm_iter, lhs)
+	{
+	  gimple *use_stmt = USE_STMT (use_p);
+	  if (is_gimple_debug (use_stmt))
+	    continue;
+	  stmt_vec_info use_stmt_info = lookup_stmt (use_stmt);
+	  if (!use_stmt_info)
+	    return NULL;
+	  /* Ignore statements that have been replaced by a pattern
+	     and consider only the replacement statement.  */
+	  if (STMT_VINFO_IN_PATTERN_P (use_stmt_info))
+	    continue;
+	  if (result)
+	    return NULL;
+	  result = use_stmt_info;
+	}
+    }
+  return result;
 }
 
 /* Return vectorization information about DR.  */
@@ -650,16 +709,18 @@ vec_info::move_dr (stmt_vec_info new_stm
 void
 vec_info::remove_stmt (stmt_vec_info stmt_info)
 {
-  gcc_assert (!stmt_info->pattern_stmt_p);
   set_vinfo_for_stmt (stmt_info->stmt, NULL);
   gimple_stmt_iterator si = gsi_for_stmt (stmt_info->stmt);
   unlink_stmt_vdef (stmt_info->stmt);
-  if (is_a <gphi *> (stmt_info->stmt))
-    remove_phi_node (&si, true);
-  else
+  if (!is_pattern_stmt_p (stmt_info))
     {
-      gsi_remove (&si, true);
-      release_defs (stmt_info->stmt);
+      if (is_a <gphi *> (stmt_info->stmt))
+	remove_phi_node (&si, true);
+      else
+	{
+	  gsi_remove (&si, true);
+	  release_defs (stmt_info->stmt);
+	}
     }
   stmt_info->block->remove (stmt_info);
   free_stmt_vec_info (stmt_info);
@@ -749,12 +810,13 @@ vec_info::free_stmt_vec_infos (void)
 void
 vec_info::free_stmt_vec_info (stmt_vec_info stmt_info)
 {
-  if (stmt_info->pattern_stmt_p)
+  if (is_pattern_stmt_p (stmt_info))
     {
-      gimple_set_bb (stmt_info->stmt, NULL);
       tree lhs = gimple_get_lhs (stmt_info->stmt);
       if (lhs && TREE_CODE (lhs) == SSA_NAME)
 	release_ssa_name (lhs);
+      if (stmt_info->block)
+	remove_pattern_stmt_from_block (stmt_info);
     }
 
   STMT_VINFO_SAME_ALIGN_REFS (stmt_info).release ();
Index: gcc/tree-vect-patterns.c
===================================================================
--- gcc/tree-vect-patterns.c	2018-08-28 12:05:16.518940320 +0100
+++ gcc/tree-vect-patterns.c	2018-08-28 12:05:19.262917177 +0100
@@ -106,7 +106,6 @@ vect_init_pattern_stmt (gimple *pattern_
   stmt_vec_info pattern_stmt_info = vinfo->lookup_stmt (pattern_stmt);
   if (pattern_stmt_info == NULL)
     pattern_stmt_info = orig_stmt_info->vinfo->add_stmt (pattern_stmt);
-  gimple_set_bb (pattern_stmt, gimple_bb (orig_stmt_info->stmt));
 
   pattern_stmt_info->pattern_stmt_p = true;
   STMT_VINFO_RELATED_STMT (pattern_stmt_info) = orig_stmt_info;
@@ -412,11 +411,9 @@ vect_look_through_possible_promotion (ve
 	break;
       caster = def_stmt_info;
 
-      /* Ignore pattern statements, since we don't link uses for them.  */
       if (caster
 	  && single_use_p
-	  && !STMT_VINFO_RELATED_STMT (caster)
-	  && !has_single_use (res))
+	  && !vinfo->lookup_single_use (caster))
 	*single_use_p = false;
 
       gassign *assign = dyn_cast <gassign *> (def_stmt);

^ permalink raw reply	[flat|nested] 22+ messages in thread

* [5/6] Insert pattern statements into vec_basic_blocks
  2018-08-28 11:19 [0/6] Make vector pattern statements less special Richard Sandiford
                   ` (4 preceding siblings ...)
  2018-08-28 11:25 ` [6/6] Link imm uses for pattern stmts Richard Sandiford
@ 2018-08-28 11:25 ` Richard Sandiford
  2018-08-28 23:16   ` Jeff Law
  2018-08-29  7:55   ` Jakub Jelinek
  5 siblings, 2 replies; 22+ messages in thread
From: Richard Sandiford @ 2018-08-28 11:25 UTC (permalink / raw)
  To: gcc-patches

The point of this patch is to put pattern statements in the same
vec_basic_block as the statements they replace, with the pattern
statements for S coming between S and S's original predecessor.
This removes the need to handle them specially in various places.


2018-08-28  Richard Sandiford  <richard.sandiford@arm.com>

gcc/
	* tree-vectorizer.h (vec_basic_block): Expand comment.
	(_stmt_vec_info::pattern_def_seq): Delete.
	(STMT_VINFO_PATTERN_DEF_SEQ): Likewise.
	(is_main_pattern_stmt_p): New function.
	* tree-vect-loop.c (vect_determine_vf_for_stmt_1): Rename to...
	(vect_determine_vf_for_stmt): ...this, deleting the original
	function with this name.  Remove vectype_maybe_set_p argument
	and test is_pattern_stmt_p instead.  Retain the "examining..."
	message from the previous vect_determine_vf_for_stmt.
	(vect_compute_single_scalar_iteration_cost, vect_update_vf_for_slp)
	(vect_analyze_loop_2): Don't treat pattern statements specially.
	(vect_transform_loop): Likewise.  Use vect_orig_stmt to find the
	insertion point.
	* tree-vect-slp.c (vect_detect_hybrid_slp): Expect pattern statements
	to be in the statement list, without needing to follow
	STMT_VINFO_RELATED_STMT.  Remove PATTERN_DEF_SEQ handling.
	* tree-vect-stmts.c (vect_analyze_stmt): Don't handle pattern
	statements specially.
	(vect_remove_dead_scalar_stmts): Ignore pattern statements.
	* tree-vect-patterns.c (vect_set_pattern_stmt): Insert the pattern
	statement into the vec_basic_block immediately before the statement
	it replaces.
	(append_pattern_def_seq): Likewise.  If the original statement is
	itself a pattern statement, associate the new one with the original
	statement.
	(vect_split_statement): Use append_pattern_def_seq to insert the
	first pattern statement.
	(vect_recog_vector_vector_shift_pattern): Remove mention of
	STMT_VINFO_PATTERN_DEF_SEQ.
	(adjust_bool_stmts): Get the last pattern statement from the
	stmt_vec_info chain.
	(vect_mark_pattern_stmts): Rename to...
	(vect_replace_stmt_with_pattern): ...this.  Remove the
	PATTERN_DEF_SEQ handling and process only the pattern statement given.
	Use append_pattern_def_seq when replacing a pattern statement with
	another pattern statement, and use vec_basic_block::remove instead
	of gsi_remove to remove the old one.
	(vect_pattern_recog_1): Update accordingly.  Remove PATTERN_DEF_SEQ
	handling.  On failure, remove any half-formed pattern sequence from
	the vec_basic_block.  Install the vector type in pattern statements
	that don't yet have one.
	(vect_pattern_recog): Iterate over statements that are added
	by previous recognizers, but skipping those that have already
	been replaced, or the main pattern statement in such a replacement.

Index: gcc/tree-vectorizer.h
===================================================================
--- gcc/tree-vectorizer.h	2018-08-28 12:05:14.014961439 +0100
+++ gcc/tree-vectorizer.h	2018-08-28 12:05:16.522940287 +0100
@@ -172,7 +172,13 @@ #define SLP_TREE_TWO_OPERATORS(S)		 (S)-
 #define SLP_TREE_DEF_TYPE(S)			 (S)->def_type
 
 /* Information about the phis and statements in a block that we're trying
-   to vectorize, in their original order.  */
+   to vectorize.  This includes the phis and statements that were in the
+   original scalar code, in their original order.  It also includes any
+   pattern statements that the vectorizer has created to replace some
+   of the scalar ones.  Such pattern statements come immediately before
+   the statement that they replace; that is, all pattern statements P for
+   which vect_orig_stmt (P) == S form a sequence that comes immediately
+   before S.  */
 class vec_basic_block
 {
 public:
@@ -870,11 +876,6 @@ struct _stmt_vec_info {
         pattern).  */
   stmt_vec_info related_stmt;
 
-  /* Used to keep a sequence of def stmts of a pattern stmt if such exists.
-     The sequence is attached to the original statement rather than the
-     pattern statement.  */
-  gimple_seq pattern_def_seq;
-
   /* List of datarefs that are known to have the same alignment as the dataref
      of this stmt.  */
   vec<dr_p> same_align_refs;
@@ -1048,7 +1049,6 @@ #define STMT_VINFO_DR_INFO(S) \
 
 #define STMT_VINFO_IN_PATTERN_P(S)         (S)->in_pattern_p
 #define STMT_VINFO_RELATED_STMT(S)         (S)->related_stmt
-#define STMT_VINFO_PATTERN_DEF_SEQ(S)      (S)->pattern_def_seq
 #define STMT_VINFO_SAME_ALIGN_REFS(S)      (S)->same_align_refs
 #define STMT_VINFO_SIMD_CLONE_INFO(S)	   (S)->simd_clone_info
 #define STMT_VINFO_DEF_TYPE(S)             (S)->def_type
@@ -1176,6 +1176,17 @@ is_pattern_stmt_p (stmt_vec_info stmt_in
   return stmt_info->pattern_stmt_p;
 }
 
+/* Return TRUE if a statement represented by STMT_INFO is the final
+   statement in a pattern.  */
+
+static inline bool
+is_main_pattern_stmt_p (stmt_vec_info stmt_info)
+{
+  stmt_vec_info orig_stmt_info = STMT_VINFO_RELATED_STMT (stmt_info);
+  return (is_pattern_stmt_p (stmt_info)
+	  && STMT_VINFO_RELATED_STMT (orig_stmt_info) == stmt_info);
+}
+
 /* If STMT_INFO is a pattern statement, return the statement that it
    replaces, otherwise return STMT_INFO itself.  */
 
Index: gcc/tree-vect-loop.c
===================================================================
--- gcc/tree-vect-loop.c	2018-08-28 12:05:14.010961472 +0100
+++ gcc/tree-vect-loop.c	2018-08-28 12:05:16.518940320 +0100
@@ -155,18 +155,24 @@ Software Foundation; either version 3, o
 
 static void vect_estimate_min_profitable_iters (loop_vec_info, int *, int *);
 
-/* Subroutine of vect_determine_vf_for_stmt that handles only one
-   statement.  VECTYPE_MAYBE_SET_P is true if STMT_VINFO_VECTYPE
-   may already be set for general statements (not just data refs).  */
+/* Subroutine of vect_determine_vectorization_factor.  Set the vector
+   type of STMT_INFO and update the vectorization factor VF accordingly.
+   If the statement produces a mask result whose vector type can only be
+   calculated later, add it to MASK_PRODUCERS.  Return true on success
+   or false if something prevented vectorization.  */
 
 static bool
-vect_determine_vf_for_stmt_1 (stmt_vec_info stmt_info,
-			      bool vectype_maybe_set_p,
-			      poly_uint64 *vf,
-			      vec<stmt_vec_info > *mask_producers)
+vect_determine_vf_for_stmt (stmt_vec_info stmt_info, poly_uint64 *vf,
+			    vec<stmt_vec_info > *mask_producers)
 {
   gimple *stmt = stmt_info->stmt;
 
+  if (dump_enabled_p ())
+    {
+      dump_printf_loc (MSG_NOTE, vect_location, "==> examining statement: ");
+      dump_gimple_stmt (MSG_NOTE, TDF_SLIM, stmt_info->stmt, 0);
+    }
+
   if ((!STMT_VINFO_RELEVANT_P (stmt_info)
        && !STMT_VINFO_LIVE_P (stmt_info))
       || gimple_clobber_p (stmt))
@@ -188,7 +194,7 @@ vect_determine_vf_for_stmt_1 (stmt_vec_i
 	   that contain a data ref, or for "pattern-stmts" (stmts generated
 	   by the vectorizer to represent/replace a certain idiom).  */
 	gcc_assert ((STMT_VINFO_DATA_REF (stmt_info)
-		     || vectype_maybe_set_p)
+		     || is_pattern_stmt_p (stmt_info))
 		    && STMT_VINFO_VECTYPE (stmt_info) == stmt_vectype);
       else if (stmt_vectype == boolean_type_node)
 	mask_producers->safe_push (stmt_info);
@@ -202,62 +208,6 @@ vect_determine_vf_for_stmt_1 (stmt_vec_i
   return true;
 }
 
-/* Subroutine of vect_determine_vectorization_factor.  Set the vector
-   types of STMT_INFO and all attached pattern statements and update
-   the vectorization factor VF accordingly.  If some of the statements
-   produce a mask result whose vector type can only be calculated later,
-   add them to MASK_PRODUCERS.  Return true on success or false if
-   something prevented vectorization.  */
-
-static bool
-vect_determine_vf_for_stmt (stmt_vec_info stmt_info, poly_uint64 *vf,
-			    vec<stmt_vec_info > *mask_producers)
-{
-  vec_info *vinfo = stmt_info->vinfo;
-  if (dump_enabled_p ())
-    {
-      dump_printf_loc (MSG_NOTE, vect_location, "==> examining statement: ");
-      dump_gimple_stmt (MSG_NOTE, TDF_SLIM, stmt_info->stmt, 0);
-    }
-  if (!vect_determine_vf_for_stmt_1 (stmt_info, false, vf, mask_producers))
-    return false;
-
-  if (STMT_VINFO_IN_PATTERN_P (stmt_info)
-      && STMT_VINFO_RELATED_STMT (stmt_info))
-    {
-      gimple *pattern_def_seq = STMT_VINFO_PATTERN_DEF_SEQ (stmt_info);
-      stmt_info = STMT_VINFO_RELATED_STMT (stmt_info);
-
-      /* If a pattern statement has def stmts, analyze them too.  */
-      for (gimple_stmt_iterator si = gsi_start (pattern_def_seq);
-	   !gsi_end_p (si); gsi_next (&si))
-	{
-	  stmt_vec_info def_stmt_info = vinfo->lookup_stmt (gsi_stmt (si));
-	  if (dump_enabled_p ())
-	    {
-	      dump_printf_loc (MSG_NOTE, vect_location,
-			       "==> examining pattern def stmt: ");
-	      dump_gimple_stmt (MSG_NOTE, TDF_SLIM,
-				def_stmt_info->stmt, 0);
-	    }
-	  if (!vect_determine_vf_for_stmt_1 (def_stmt_info, true,
-					     vf, mask_producers))
-	    return false;
-	}
-
-      if (dump_enabled_p ())
-	{
-	  dump_printf_loc (MSG_NOTE, vect_location,
-			   "==> examining pattern statement: ");
-	  dump_gimple_stmt (MSG_NOTE, TDF_SLIM, stmt_info->stmt, 0);
-	}
-      if (!vect_determine_vf_for_stmt_1 (stmt_info, true, vf, mask_producers))
-	return false;
-    }
-
-  return true;
-}
-
 /* Function vect_determine_vectorization_factor
 
    Determine the vectorization factor (VF).  VF is the number of data elements
@@ -1113,9 +1063,8 @@ vect_compute_single_scalar_iteration_cos
           /* Skip stmts that are not vectorized inside the loop.  */
           if (stmt_info
               && !STMT_VINFO_RELEVANT_P (stmt_info)
-              && (!STMT_VINFO_LIVE_P (stmt_info)
-                  || !VECTORIZABLE_CYCLE_DEF (STMT_VINFO_DEF_TYPE (stmt_info)))
-	      && !STMT_VINFO_IN_PATTERN_P (stmt_info))
+	      && (!VECTORIZABLE_CYCLE_DEF (STMT_VINFO_DEF_TYPE (stmt_info))
+		  || !STMT_VINFO_LIVE_P (stmt_info)))
             continue;
 
 	  vect_cost_for_stmt kind;
@@ -1429,14 +1378,11 @@ vect_update_vf_for_slp (loop_vec_info lo
   vec_basic_block *vec_bb;
   FOR_EACH_VEC_ELT (loop_vinfo->blocks, i, vec_bb)
     FOR_EACH_VEC_BB_STMT (vec_bb, stmt_info)
-      {
-	stmt_vec_info final_info = vect_stmt_to_vectorize (stmt_info);
-	if ((STMT_VINFO_RELEVANT_P (final_info)
-	     || VECTORIZABLE_CYCLE_DEF (STMT_VINFO_DEF_TYPE (final_info)))
-	    && !PURE_SLP_STMT (final_info))
-	  /* STMT needs both SLP and loop-based vectorization.  */
-	  only_slp_in_loop = false;
-      }
+      if ((STMT_VINFO_RELEVANT_P (stmt_info)
+	   || VECTORIZABLE_CYCLE_DEF (STMT_VINFO_DEF_TYPE (stmt_info)))
+	  && !PURE_SLP_STMT (stmt_info))
+	/* STMT needs both SLP and loop-based vectorization.  */
+	only_slp_in_loop = false;
 
   if (only_slp_in_loop)
     {
@@ -2199,18 +2145,7 @@ vect_analyze_loop_2 (loop_vec_info loop_
   vec_basic_block *vec_bb;
   FOR_EACH_VEC_ELT (loop_vinfo->blocks, i, vec_bb)
     FOR_EACH_VEC_BB_STMT (vec_bb, stmt_info)
-      {
-	STMT_SLP_TYPE (stmt_info) = loop_vect;
-	if (STMT_VINFO_IN_PATTERN_P (stmt_info))
-	  {
-	    gimple *pattern_def_seq = STMT_VINFO_PATTERN_DEF_SEQ (stmt_info);
-	    STMT_SLP_TYPE (STMT_VINFO_RELATED_STMT (stmt_info)) = loop_vect;
-	    for (gimple_stmt_iterator pi = gsi_start (pattern_def_seq);
-		 !gsi_end_p (pi); gsi_next (&pi))
-	      STMT_SLP_TYPE (loop_vinfo->lookup_stmt (gsi_stmt (pi)))
-		= loop_vect;
-	  }
-      }
+      STMT_SLP_TYPE (stmt_info) = loop_vect;
 
   /* Free optimized alias test DDRS.  */
   LOOP_VINFO_LOWER_BOUNDS (loop_vinfo).truncate (0);
@@ -8409,22 +8344,8 @@ vect_transform_loop (loop_vec_info loop_
 	    loop_vinfo->remove_stmt (stmt_info);
 	  else
 	    {
-	      gimple_stmt_iterator si = gsi_for_stmt (stmt_info->stmt);
-	      if (STMT_VINFO_IN_PATTERN_P (stmt_info))
-		{
-		  gimple *def_seq = STMT_VINFO_PATTERN_DEF_SEQ (stmt_info);
-		  for (gimple_stmt_iterator subsi = gsi_start (def_seq);
-		       !gsi_end_p (subsi); gsi_next (&subsi))
-		    {
-		      stmt_vec_info pat_stmt_info
-			= loop_vinfo->lookup_stmt (gsi_stmt (subsi));
-		      vect_transform_loop_stmt (loop_vinfo, pat_stmt_info,
-						&si);
-		    }
-		  stmt_vec_info pat_stmt_info
-		    = STMT_VINFO_RELATED_STMT (stmt_info);
-		  vect_transform_loop_stmt (loop_vinfo, pat_stmt_info, &si);
-		}
+	      stmt_vec_info place = vect_orig_stmt (stmt_info);
+	      gimple_stmt_iterator si = gsi_for_stmt (place->stmt);
 	      vect_transform_loop_stmt (loop_vinfo, stmt_info, &si);
 	    }
 	}
Index: gcc/tree-vect-slp.c
===================================================================
--- gcc/tree-vect-slp.c	2018-08-28 12:05:14.010961472 +0100
+++ gcc/tree-vect-slp.c	2018-08-28 12:05:16.522940287 +0100
@@ -2362,18 +2362,14 @@ vect_detect_hybrid_slp (loop_vec_info lo
   vec_basic_block *vec_bb;
   FOR_EACH_VEC_ELT (loop_vinfo->blocks, i, vec_bb)
     FOR_EACH_VEC_BB_STMT (vec_bb, stmt_info)
-      if (STMT_VINFO_IN_PATTERN_P (stmt_info))
+      if (is_pattern_stmt_p (stmt_info))
 	{
 	  walk_stmt_info wi;
 	  memset (&wi, 0, sizeof (wi));
 	  wi.info = loop_vinfo;
-	  gimple_stmt_iterator gsi2
-	    = gsi_for_stmt (STMT_VINFO_RELATED_STMT (stmt_info)->stmt);
-	  walk_gimple_stmt (&gsi2, vect_detect_hybrid_slp_2,
+	  gimple_stmt_iterator gsi = gsi_for_stmt (stmt_info->stmt);
+	  walk_gimple_stmt (&gsi, vect_detect_hybrid_slp_2,
 			    vect_detect_hybrid_slp_1, &wi);
-	  walk_gimple_seq (STMT_VINFO_PATTERN_DEF_SEQ (stmt_info),
-			   vect_detect_hybrid_slp_2,
-			   vect_detect_hybrid_slp_1, &wi);
 	}
 
   /* Then walk the SLP instance trees marking stmts with uses in
Index: gcc/tree-vect-stmts.c
===================================================================
--- gcc/tree-vect-stmts.c	2018-08-28 12:05:14.014961439 +0100
+++ gcc/tree-vect-stmts.c	2018-08-28 12:05:16.522940287 +0100
@@ -9391,11 +9391,9 @@ vect_analyze_stmt (stmt_vec_info stmt_in
 		   slp_tree node, slp_instance node_instance,
 		   stmt_vector_for_cost *cost_vec)
 {
-  vec_info *vinfo = stmt_info->vinfo;
   bb_vec_info bb_vinfo = STMT_VINFO_BB_VINFO (stmt_info);
   enum vect_relevant relevance = STMT_VINFO_RELEVANT (stmt_info);
   bool ok;
-  gimple_seq pattern_def_seq;
 
   if (dump_enabled_p ())
     {
@@ -9412,94 +9410,21 @@ vect_analyze_stmt (stmt_vec_info stmt_in
       return false;
     }
 
-  if (STMT_VINFO_IN_PATTERN_P (stmt_info)
-      && node == NULL
-      && (pattern_def_seq = STMT_VINFO_PATTERN_DEF_SEQ (stmt_info)))
-    {
-      gimple_stmt_iterator si;
-
-      for (si = gsi_start (pattern_def_seq); !gsi_end_p (si); gsi_next (&si))
-	{
-	  stmt_vec_info pattern_def_stmt_info
-	    = vinfo->lookup_stmt (gsi_stmt (si));
-	  if (STMT_VINFO_RELEVANT_P (pattern_def_stmt_info)
-	      || STMT_VINFO_LIVE_P (pattern_def_stmt_info))
-	    {
-	      /* Analyze def stmt of STMT if it's a pattern stmt.  */
-	      if (dump_enabled_p ())
-		{
-		  dump_printf_loc (MSG_NOTE, vect_location,
-				   "==> examining pattern def statement: ");
-		  dump_gimple_stmt (MSG_NOTE, TDF_SLIM,
-				    pattern_def_stmt_info->stmt, 0);
-		}
-
-	      if (!vect_analyze_stmt (pattern_def_stmt_info,
-				      need_to_vectorize, node, node_instance,
-				      cost_vec))
-		return false;
-	    }
-	}
-    }
-
   /* Skip stmts that do not need to be vectorized. In loops this is expected
      to include:
      - the COND_EXPR which is the loop exit condition
      - any LABEL_EXPRs in the loop
      - computations that are used only for array indexing or loop control.
      In basic blocks we only analyze statements that are a part of some SLP
-     instance, therefore, all the statements are relevant.
-
-     Pattern statement needs to be analyzed instead of the original statement
-     if the original statement is not relevant.  Otherwise, we analyze both
-     statements.  In basic blocks we are called from some SLP instance
-     traversal, don't analyze pattern stmts instead, the pattern stmts
-     already will be part of SLP instance.  */
+     instance, therefore, all the statements are relevant.  */
 
-  stmt_vec_info pattern_stmt_info = STMT_VINFO_RELATED_STMT (stmt_info);
   if (!STMT_VINFO_RELEVANT_P (stmt_info)
       && !STMT_VINFO_LIVE_P (stmt_info))
     {
-      if (STMT_VINFO_IN_PATTERN_P (stmt_info)
-	  && pattern_stmt_info
-	  && (STMT_VINFO_RELEVANT_P (pattern_stmt_info)
-	      || STMT_VINFO_LIVE_P (pattern_stmt_info)))
-        {
-          /* Analyze PATTERN_STMT instead of the original stmt.  */
-	  stmt_info = pattern_stmt_info;
-          if (dump_enabled_p ())
-            {
-              dump_printf_loc (MSG_NOTE, vect_location,
-                               "==> examining pattern statement: ");
-	      dump_gimple_stmt (MSG_NOTE, TDF_SLIM, stmt_info->stmt, 0);
-            }
-        }
-      else
-        {
-          if (dump_enabled_p ())
-            dump_printf_loc (MSG_NOTE, vect_location, "irrelevant.\n");
-
-          return true;
-        }
-    }
-  else if (STMT_VINFO_IN_PATTERN_P (stmt_info)
-	   && node == NULL
-	   && pattern_stmt_info
-	   && (STMT_VINFO_RELEVANT_P (pattern_stmt_info)
-	       || STMT_VINFO_LIVE_P (pattern_stmt_info)))
-    {
-      /* Analyze PATTERN_STMT too.  */
       if (dump_enabled_p ())
-        {
-          dump_printf_loc (MSG_NOTE, vect_location,
-                           "==> examining pattern statement: ");
-	  dump_gimple_stmt (MSG_NOTE, TDF_SLIM, pattern_stmt_info->stmt, 0);
-        }
-
-      if (!vect_analyze_stmt (pattern_stmt_info, need_to_vectorize, node,
-			      node_instance, cost_vec))
-        return false;
-   }
+	dump_printf_loc (MSG_NOTE, vect_location, "irrelevant.\n");
+      return true;
+    }
 
   switch (STMT_VINFO_DEF_TYPE (stmt_info))
     {
@@ -10924,7 +10849,8 @@ vect_remove_dead_scalar_stmts (vec_info
 	   stmt_info = prev_stmt_info)
 	{
 	  prev_stmt_info = stmt_info->prev;
-	  vect_maybe_remove_scalar_stmt (stmt_info);
+	  if (!is_pattern_stmt_p (stmt_info))
+	    vect_maybe_remove_scalar_stmt (stmt_info);
 	}
     }
 }
Index: gcc/tree-vect-patterns.c
===================================================================
--- gcc/tree-vect-patterns.c	2018-08-28 12:05:11.462982962 +0100
+++ gcc/tree-vect-patterns.c	2018-08-28 12:05:16.518940320 +0100
@@ -125,27 +125,26 @@ vect_init_pattern_stmt (gimple *pattern_
 vect_set_pattern_stmt (gimple *pattern_stmt, stmt_vec_info orig_stmt_info,
 		       tree vectype)
 {
-  STMT_VINFO_IN_PATTERN_P (orig_stmt_info) = true;
-  STMT_VINFO_RELATED_STMT (orig_stmt_info)
+  stmt_vec_info pattern_stmt_info
     = vect_init_pattern_stmt (pattern_stmt, orig_stmt_info, vectype);
+  orig_stmt_info->block->add_before (pattern_stmt_info, orig_stmt_info);
+  STMT_VINFO_IN_PATTERN_P (orig_stmt_info) = true;
+  STMT_VINFO_RELATED_STMT (orig_stmt_info) = pattern_stmt_info;
 }
 
-/* Add NEW_STMT to STMT_INFO's pattern definition statements.  If VECTYPE
-   is nonnull, record that NEW_STMT's vector type is VECTYPE, which might
-   be different from the vector type of the final pattern statement.  */
+/* Add NEW_STMT to the pattern statements that replace STMT_INFO.
+   If VECTYPE is nonnull, record that NEW_STMT's vector type is VECTYPE,
+   which might be different from the vector type of the final pattern
+   statement.  */
 
 static inline void
 append_pattern_def_seq (stmt_vec_info stmt_info, gimple *new_stmt,
 			tree vectype = NULL_TREE)
 {
-  vec_info *vinfo = stmt_info->vinfo;
-  if (vectype)
-    {
-      stmt_vec_info new_stmt_info = vinfo->add_stmt (new_stmt);
-      STMT_VINFO_VECTYPE (new_stmt_info) = vectype;
-    }
-  gimple_seq_add_stmt_without_update (&STMT_VINFO_PATTERN_DEF_SEQ (stmt_info),
-				      new_stmt);
+  stmt_vec_info orig_stmt_info = vect_orig_stmt (stmt_info);
+  stmt_vec_info new_stmt_info
+    = vect_init_pattern_stmt (new_stmt, orig_stmt_info, vectype);
+  stmt_info->block->add_before (new_stmt_info, stmt_info);
 }
 
 /* The caller wants to perform new operations on vect_external variable
@@ -633,11 +632,6 @@ vect_split_statement (stmt_vec_info stmt
 {
   if (is_pattern_stmt_p (stmt2_info))
     {
-      /* STMT2_INFO is part of a pattern.  Get the statement to which
-	 the pattern is attached.  */
-      stmt_vec_info orig_stmt2_info = STMT_VINFO_RELATED_STMT (stmt2_info);
-      vect_init_pattern_stmt (stmt1, orig_stmt2_info, vectype);
-
       if (dump_enabled_p ())
 	{
 	  dump_printf_loc (MSG_NOTE, vect_location,
@@ -645,6 +639,9 @@ vect_split_statement (stmt_vec_info stmt
 	  dump_gimple_stmt (MSG_NOTE, TDF_SLIM, stmt2_info->stmt, 0);
 	}
 
+      /* Insert STMT1_INFO before STMT2_INFO.  */
+      append_pattern_def_seq (stmt2_info, stmt1, vectype);
+
       /* Since STMT2_INFO is a pattern statement, we can change it
 	 in-situ without worrying about changing the code for the
 	 containing block.  */
@@ -658,18 +655,6 @@ vect_split_statement (stmt_vec_info stmt
 	  dump_gimple_stmt (MSG_NOTE, TDF_SLIM, stmt2_info->stmt, 0);
 	}
 
-      gimple_seq *def_seq = &STMT_VINFO_PATTERN_DEF_SEQ (orig_stmt2_info);
-      if (STMT_VINFO_RELATED_STMT (orig_stmt2_info) == stmt2_info)
-	/* STMT2_INFO is the actual pattern statement.  Add STMT1
-	   to the end of the definition sequence.  */
-	gimple_seq_add_stmt_without_update (def_seq, stmt1);
-      else
-	{
-	  /* STMT2_INFO belongs to the definition sequence.  Insert STMT1
-	     before it.  */
-	  gimple_stmt_iterator gsi = gsi_for_stmt (stmt2_info->stmt, def_seq);
-	  gsi_insert_before_without_update (&gsi, stmt1, GSI_SAME_STMT);
-	}
       return true;
     }
   else
@@ -689,10 +674,8 @@ vect_split_statement (stmt_vec_info stmt
 	  dump_gimple_stmt (MSG_NOTE, TDF_SLIM, stmt2_info->stmt, 0);
 	}
 
-      /* Add STMT1 as a singleton pattern definition sequence.  */
-      gimple_seq *def_seq = &STMT_VINFO_PATTERN_DEF_SEQ (stmt2_info);
-      vect_init_pattern_stmt (stmt1, stmt2_info, vectype);
-      gimple_seq_add_stmt_without_update (def_seq, stmt1);
+      /* Insert STMT1_INFO before STMT2_INFO.  */
+      append_pattern_def_seq (stmt2_info, stmt1, vectype);
 
       /* Build the second of the two pattern statements.  */
       tree new_lhs = vect_recog_temp_ssa_var (lhs_type, NULL);
@@ -2164,7 +2147,7 @@ vect_recog_rotate_pattern (stmt_vec_info
     i.e. the shift/rotate stmt.  The original stmt (S3) is replaced
     with a shift/rotate which has same type on both operands, in the
     second case just b_T op c_T, in the first case with added cast
-    from a_t to c_T in STMT_VINFO_PATTERN_DEF_SEQ.
+    from a_t to c_T beforehand.
 
   Output:
 
@@ -3518,9 +3501,8 @@ adjust_bool_stmts (hash_set <gimple *> &
     adjust_bool_pattern (gimple_assign_lhs (bool_stmts[i]),
 			 out_type, stmt_info, defs);
 
-  /* Pop the last pattern seq stmt and install it as pattern root for STMT.  */
-  gimple *pattern_stmt
-    = gimple_seq_last_stmt (STMT_VINFO_PATTERN_DEF_SEQ (stmt_info));
+  /* Return the result of the last statement we emitted.  */
+  gimple *pattern_stmt = stmt_info->prev->stmt;
   return gimple_assign_lhs (pattern_stmt);
 }
 
@@ -4684,14 +4666,14 @@ static vect_recog_func vect_vect_recog_f
 
 const unsigned int NUM_PATTERNS = ARRAY_SIZE (vect_vect_recog_func_ptrs);
 
-/* Mark statements that are involved in a pattern.  */
+/* Replace ORIG_STMT_INFO with PATTERN_STMT, using PATTERN_VECTYPE as
+   the vector type for PATTERN_STMT.  */
 
 static inline void
-vect_mark_pattern_stmts (stmt_vec_info orig_stmt_info, gimple *pattern_stmt,
-                         tree pattern_vectype)
+vect_replace_stmt_with_pattern (stmt_vec_info orig_stmt_info,
+				gimple *pattern_stmt,
+				tree pattern_vectype)
 {
-  gimple *def_seq = STMT_VINFO_PATTERN_DEF_SEQ (orig_stmt_info);
-
   gimple *orig_pattern_stmt = NULL;
   if (is_pattern_stmt_p (orig_stmt_info))
     {
@@ -4718,32 +4700,14 @@ vect_mark_pattern_stmts (stmt_vec_info o
 	  dump_gimple_stmt (MSG_NOTE, TDF_SLIM, pattern_stmt, 0);
 	}
 
-      /* Switch to the statement that ORIG replaces.  */
-      orig_stmt_info = STMT_VINFO_RELATED_STMT (orig_stmt_info);
-
       /* We shouldn't be replacing the main pattern statement.  */
-      gcc_assert (STMT_VINFO_RELATED_STMT (orig_stmt_info)->stmt
-		  != orig_pattern_stmt);
-    }
+      gcc_assert (!is_main_pattern_stmt_p (orig_stmt_info));
 
-  if (def_seq)
-    for (gimple_stmt_iterator si = gsi_start (def_seq);
-	 !gsi_end_p (si); gsi_next (&si))
-      vect_init_pattern_stmt (gsi_stmt (si), orig_stmt_info, pattern_vectype);
-
-  if (orig_pattern_stmt)
-    {
-      vect_init_pattern_stmt (pattern_stmt, orig_stmt_info, pattern_vectype);
-
-      /* Insert all the new pattern statements before the original one.  */
-      gimple_seq *orig_def_seq = &STMT_VINFO_PATTERN_DEF_SEQ (orig_stmt_info);
-      gimple_stmt_iterator gsi = gsi_for_stmt (orig_pattern_stmt,
-					       orig_def_seq);
-      gsi_insert_seq_before_without_update (&gsi, def_seq, GSI_SAME_STMT);
-      gsi_insert_before_without_update (&gsi, pattern_stmt, GSI_SAME_STMT);
+      /* Insert the new pattern statement before the original one.  */
+      append_pattern_def_seq (orig_stmt_info, pattern_stmt, pattern_vectype);
 
       /* Remove the pattern statement that this new pattern replaces.  */
-      gsi_remove (&gsi, false);
+      orig_stmt_info->block->remove (orig_stmt_info);
     }
   else
     vect_set_pattern_stmt (pattern_stmt, orig_stmt_info, pattern_vectype);
@@ -4770,30 +4734,17 @@ vect_mark_pattern_stmts (stmt_vec_info o
 static void
 vect_pattern_recog_1 (vect_recog_func *recog_func, stmt_vec_info stmt_info)
 {
-  vec_info *vinfo = stmt_info->vinfo;
   gimple *pattern_stmt;
   loop_vec_info loop_vinfo;
   tree pattern_vectype;
 
-  /* If this statement has already been replaced with pattern statements,
-     leave the original statement alone, since the first match wins.
-     Instead try to match against the definition statements that feed
-     the main pattern statement.  */
-  if (STMT_VINFO_IN_PATTERN_P (stmt_info))
-    {
-      gimple_stmt_iterator gsi;
-      for (gsi = gsi_start (STMT_VINFO_PATTERN_DEF_SEQ (stmt_info));
-	   !gsi_end_p (gsi); gsi_next (&gsi))
-	vect_pattern_recog_1 (recog_func, vinfo->lookup_stmt (gsi_stmt (gsi)));
-      return;
-    }
-
-  gcc_assert (!STMT_VINFO_PATTERN_DEF_SEQ (stmt_info));
+  stmt_vec_info prev_stmt_info = stmt_info->prev;
   pattern_stmt = recog_func->fn (stmt_info, &pattern_vectype);
   if (!pattern_stmt)
     {
-      /* Clear any half-formed pattern definition sequence.  */
-      STMT_VINFO_PATTERN_DEF_SEQ (stmt_info) = NULL;
+      /* Delete any half-formed pattern sequence.  */
+      while (stmt_info->prev != prev_stmt_info)
+	stmt_info->block->remove (stmt_info->prev);
       return;
     }
 
@@ -4808,8 +4759,15 @@ vect_pattern_recog_1 (vect_recog_func *r
       dump_gimple_stmt (MSG_NOTE, TDF_SLIM, pattern_stmt, 0);
     }
 
+  /* Install the vector type in pattern definition statements that
+     don't yet have one.  */
+  for (stmt_vec_info pat_stmt_info = stmt_info->prev;
+       pat_stmt_info != prev_stmt_info; pat_stmt_info = pat_stmt_info->prev)
+    if (!STMT_VINFO_VECTYPE (pat_stmt_info))
+      STMT_VINFO_VECTYPE (pat_stmt_info) = pattern_vectype;
+
   /* Mark the stmts that are involved in the pattern. */
-  vect_mark_pattern_stmts (stmt_info, pattern_stmt, pattern_vectype);
+  vect_replace_stmt_with_pattern (stmt_info, pattern_stmt, pattern_vectype);
 
   /* Patterns cannot be vectorized using SLP, because they change the order of
      computation.  */
@@ -4911,8 +4869,27 @@ vect_pattern_recog (vec_info *vinfo)
   vec_basic_block *vec_bb;
   FOR_EACH_VEC_ELT (vinfo->blocks, i, vec_bb)
     FOR_EACH_VEC_BB_STMT (vec_bb, stmt_info)
-      if (STMT_VINFO_VECTORIZABLE (stmt_info))
-	/* Scan over all generic vect_recog_xxx_pattern functions.  */
-	for (unsigned int j = 0; j < NUM_PATTERNS; j++)
-	  vect_pattern_recog_1 (&vect_vect_recog_func_ptrs[j], stmt_info);
+      {
+	stmt_vec_info begin_prev = stmt_info->prev;
+	if (STMT_VINFO_VECTORIZABLE (stmt_info))
+	  /* Scan over all generic vect_recog_xxx_pattern functions.  */
+	  for (unsigned int j = 0; j < NUM_PATTERNS; j++)
+	    {
+	      stmt_vec_info curr_prev;
+	      /* Scan over STMT_INFO and any pattern definition statements
+		 that were introduced by previous recognizers.  */
+	      for (stmt_vec_info curr_info = stmt_info;
+		   curr_info != begin_prev; curr_info = curr_prev)
+		{
+		  curr_prev = curr_info->prev;
+		  /* The first match wins, so skip statements that have
+		     already been replaced, and the final statement with
+		     which they were replaced.  */
+		  if (!STMT_VINFO_IN_PATTERN_P (curr_info)
+		      && !is_main_pattern_stmt_p (curr_info))
+		    vect_pattern_recog_1 (&vect_vect_recog_func_ptrs[j],
+					  curr_info);
+		}
+	    }
+      }
 }

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [1/6] Handle gphis in gimple_get_lhs
  2018-08-28 11:20 ` [1/6] Handle gphis in gimple_get_lhs Richard Sandiford
@ 2018-08-28 18:22   ` Jeff Law
  0 siblings, 0 replies; 22+ messages in thread
From: Jeff Law @ 2018-08-28 18:22 UTC (permalink / raw)
  To: gcc-patches, richard.sandiford

On 08/28/2018 05:20 AM, Richard Sandiford wrote:
> Several callers of gimple_get_lhs deal with statements that might
> be phis.  This patch makes gimple_get_lhs handle that case too,
> so that the callers don't have to.
> 
> 
> 2018-08-28  Richard Sandiford  <richard.sandiford@arm.com>
> 
> gcc/
> 	* gimple.c (gimple_get_lhs): Handle gphis.
> 	* tree-ssa-phionlycprop.c (get_lhs_or_phi_result): Delete and...
> 	(propagate_rhs_into_lhs, eliminate_const_or_copy): ...use
> 	gimple_get_lhs instead.
> 	* tree-ssa-dom.c (eliminate_redundant_computations): Don't handle
> 	phis specially before calling gimple_get_lhs.
> 	* tree-ssa-scopedtables.c (avail_exprs_stack::lookup_avail_expr):
> 	Likewise.
> 	* tree-vect-loop.c (vectorizable_live_operation): Likewise.
> 	* tree-vect-slp.c (vect_get_slp_vect_defs): Likewise.
> 	(vect_get_slp_defs): Likewise.
> 	* tree-vect-stmts.c (vect_get_vec_def_for_operand_1): Likewise.
> 	(vect_get_vec_def_for_stmt_copy, vect_transform_stmt): Likewise.
I pondered doing this repeatedly through the years.  So I'm all for it.
Changes look fine to me.

jeff

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [2/6] Make vec_info::lookup_single_use take a stmt_vec_info
  2018-08-28 11:21 ` [2/6] Make vec_info::lookup_single_use take a stmt_vec_info Richard Sandiford
@ 2018-08-28 18:25   ` Jeff Law
  0 siblings, 0 replies; 22+ messages in thread
From: Jeff Law @ 2018-08-28 18:25 UTC (permalink / raw)
  To: gcc-patches, richard.sandiford

On 08/28/2018 05:21 AM, Richard Sandiford wrote:
> All callers to vec_info::lookup_single_use are asking about the lhs of a
> statement that they're already examining.  It makes more sense to pass
> that statement instead of the SSA name, since it is then easier to
> handle statements that have been replaced by pattern statements.
> 
> A later patch adds support for pattern statements.  This one just
> changes the interface to make that possible.
> 
> 
> 2018-08-28  Richard Sandiford  <richard.sandiford@arm.com>
> 
> gcc/
> 	* tree-vectorizer.h (vec_info::lookup_single_use): Take a
> 	stmt_vec_info instead of an SSA name.
> 	* tree-vectorizer.c (vec_info::lookup_single_use): Likewise.
> 	* tree-vect-loop.c (vectorizable_reduction): Update call
> 	accordingly.
> 	* tree-vect-stmts.c (supportable_widening_operation): Likewise.
OK
jeff

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [3/6] Add a vec_basic_block structure
  2018-08-28 11:22 ` [3/6] Add a vec_basic_block structure Richard Sandiford
@ 2018-08-28 22:38   ` Jeff Law
  0 siblings, 0 replies; 22+ messages in thread
From: Jeff Law @ 2018-08-28 22:38 UTC (permalink / raw)
  To: gcc-patches, richard.sandiford

On 08/28/2018 05:22 AM, Richard Sandiford wrote:
> This patch adds a vec_basic_block that records the scalar phis and
> scalar statements that we need to vectorise.  This is a slight
> simplification in its own right, since it avoids unnecesary statement
> lookups and shaves >50 LOC.  But the main reason for doing it is
> to allow the rest of the series to treat pattern statements less
> specially.
> 
> Putting phis (which are logically parallel) and normal statements
> (which are logically serial) into a single list might seem dangerous,
> but I think in practice it should be fine.  Very little vectoriser
> code needs to handle the parallel nature of phis specially, and code
> that does can still do so.  Having a single list simplifies code that
> wants to look at every scalar phi or stmt in isolation.
> 
> The new test is for cases in which we try to hoist the same expression
> twice, which I broke with an earlier version of the patch.
> 
> 
> 2018-08-28  Richard Sandiford  <richard.sandiford@arm.com>
> 
> gcc/
> 	* tree-vectorizer.h (vec_basic_block): New structure.
> 	(vec_info::blocks, _stmt_vec_info::block, _stmt_vec_info::prev)
> 	(_stmt_vec_info::next): New member variables.
> 	(FOR_EACH_VEC_BB_STMT, FOR_EACH_VEC_BB_STMT_REVERSE): New macros.
> 	(vec_basic_block::vec_basic_block): New function.
> 	* tree-vectorizer.c (vec_basic_block::add_to_end): Likewise.
> 	(vec_basic_block::add_before): Likewise.
> 	(vec_basic_block::remove): Likewise.
> 	(vec_info::~vec_info): Free the vec_basic_blocks.
> 	(vec_info::remove_stmt): Remove the statement from the containing
> 	vec_basic_block.
> 	* tree-vect-patterns.c (vect_determine_precisions)
> 	(vect_pattern_recog): Iterate over vec_basic_blocks.
> 	* tree-vect-loop.c (vect_determine_vectorization_factor)
> 	(vect_compute_single_scalar_iteration_cost, vect_update_vf_for_slp)
> 	(vect_analyze_loop_operations, vect_transform_loop): Likewise.
> 	(_loop_vec_info::_loop_vec_info): Construct vec_basic_blocks.
> 	* tree-vect-slp.c (_bb_vec_info::_bb_vec_info): Likewise.
> 	(vect_detect_hybrid_slp): Iterate over vec_basic_blocks.
> 	* tree-vect-stmts.c (vect_mark_stmts_to_be_vectorized): Likewise.
> 	(vect_finish_replace_stmt, vectorizable_condition): Remove the original
> 	statement from the containing block.
> 	(hoist_defs_of_uses): Likewise the statement that we're hoisting.
> 
> gcc/testsuite/
> 	* gcc.dg/vect/vect-hoist-1.c: New test.
> 
I'm generally not a fan of references to "this" like you do when you add
STMT_INFOs to a VEC_BASIC_BLOCK.  I'm going to trust you're not leaving
dangling pointers in the STMT_INFOs and that storing a pointer to the
containing VEC_BASIC_BLOCK within the STMT_INFO is the best way to get
at the data you want.

Firstly standard doubly linked list routines, lots of fairly mechanical
chunks get simplified a bit.   Overall it looks quite reasonable.

OK for the trunk.

jeff

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [4/6] Make the vectoriser do its own DCE
  2018-08-28 11:23 ` [4/6] Make the vectoriser do its own DCE Richard Sandiford
@ 2018-08-28 23:01   ` Jeff Law
  2018-08-29  7:16     ` Richard Biener
  0 siblings, 1 reply; 22+ messages in thread
From: Jeff Law @ 2018-08-28 23:01 UTC (permalink / raw)
  To: gcc-patches, richard.sandiford

On 08/28/2018 05:23 AM, Richard Sandiford wrote:
> The vectoriser normally leaves a later DCE pass to remove the scalar
> code, but we've accumulated various special cases for things that
> DCE can't handle, such as removing the scalar stores that have been
> replaced by vector stores, and the scalar calls to internal functions.
> (The latter must be removed for correctness, since no underlying scalar
> optabs exist for those calls.)
> 
> Now that vec_basic_block gives us an easy way of iterating over the
> original scalar code (ignoring any new code inserted by the vectoriser),
> it seems easier to do the DCE directly.  This involves marking the few
> cases in which the vector code needs part of the original scalar code
> to be kept around.
> 
> 
> 2018-08-28  Richard Sandiford  <richard.sandiford@arm.com>
> 
> gcc/
> 	* tree-vectorizer.h (_stmt_vec_info::used_by_vector_code_p): New
> 	member variable.
> 	(vect_mark_used_by_vector_code): Declare.
> 	(vect_remove_dead_scalar_stmts): Likewise.
> 	(vect_transform_stmt): Return void.
> 	(vect_remove_stores): Delete.
> 	* tree-vectorizer.c (vec_info::remove_stmt): Handle phis.
> 	* tree-vect-stmts.c (vect_mark_used_by_vector_code): New function.
> 	(vectorizable_call, vectorizable_simd_clone_call): Don't remove
> 	scalar calls here.
> 	(vectorizable_shift): Mark shift amounts in a vector-scalar shift
> 	as being used by the vector code.
> 	(vectorizable_load): Mark unhoisted scalar loads that feed a
> 	load-and-broadcast operation as being needed by the vector code.
> 	(vect_transform_stmt): Return void.
> 	(vect_remove_stores): Delete.
> 	(vect_maybe_remove_scalar_stmt): New function.
> 	(vect_remove_dead_scalar_stmts): Likewise.
> 	* tree-vect-slp.c (vect_slp_bb): Call vect_remove_dead_scalar_stmts.
> 	(vect_remove_slp_scalar_calls): Delete.
> 	(vect_schedule_slp): Don't call it.  Don't remove scalar stores here.
> 	* tree-vect-loop.c (vectorizable_reduction): Mark scalar phis that
> 	are retained by the vector code.
> 	(vectorizable_live_operation): Mark scalar live-out statements that
> 	are retained by the vector code.
> 	(vect_transform_loop_stmt): Remove seen_store argument.  Mark gconds
> 	in nested loops as being needed by the vector code.  Replace the
> 	outer loop's gcond with a dummy condition.
> 	(vect_transform_loop): Update calls accordingly.  Don't remove
> 	scalar stores or calls here; call vect_remove_dead_scalar_stmts
> 	instead.
> 	* tree-vect-loop-manip.c (vect_update_ivs_after_vectorizer): Don't
> 	remove PHIs that were classified as reductions but never actually
> 	vectorized.
Presumably you'll look at killing the actual DCE pass we're running as a
follow-up.

I'm hoping this will help the tendency of the vectorizer to leak
SSA_NAMEs.  Though it's late enough in the pipeline that leaking isn't
that bad.





> Index: gcc/tree-vectorizer.c
> ===================================================================
> --- gcc/tree-vectorizer.c	2018-08-28 12:05:11.466982927 +0100
> +++ gcc/tree-vectorizer.c	2018-08-28 12:05:14.014961439 +0100
> @@ -654,8 +654,13 @@ vec_info::remove_stmt (stmt_vec_info stm
>    set_vinfo_for_stmt (stmt_info->stmt, NULL);
>    gimple_stmt_iterator si = gsi_for_stmt (stmt_info->stmt);
>    unlink_stmt_vdef (stmt_info->stmt);
> -  gsi_remove (&si, true);
> -  release_defs (stmt_info->stmt);
> +  if (is_a <gphi *> (stmt_info->stmt))
> +    remove_phi_node (&si, true);
> +  else
> +    {
> +      gsi_remove (&si, true);
> +      release_defs (stmt_info->stmt);
> +    }
More than once I've wondered if we should factor this into its own
little function.  I'm pretty sure I've seen this style of code
elsewhere.  Your call whether or not to clean that up.

OK with or without creating a little helper for the code above.
jeff

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [5/6] Insert pattern statements into vec_basic_blocks
  2018-08-28 11:25 ` [5/6] Insert pattern statements into vec_basic_blocks Richard Sandiford
@ 2018-08-28 23:16   ` Jeff Law
  2018-08-29  7:18     ` Richard Biener
  2018-08-29  7:55   ` Jakub Jelinek
  1 sibling, 1 reply; 22+ messages in thread
From: Jeff Law @ 2018-08-28 23:16 UTC (permalink / raw)
  To: gcc-patches, richard.sandiford

On 08/28/2018 05:24 AM, Richard Sandiford wrote:
> The point of this patch is to put pattern statements in the same
> vec_basic_block as the statements they replace, with the pattern
> statements for S coming between S and S's original predecessor.
> This removes the need to handle them specially in various places.
> 
> 
> 2018-08-28  Richard Sandiford  <richard.sandiford@arm.com>
> 
> gcc/
> 	* tree-vectorizer.h (vec_basic_block): Expand comment.
> 	(_stmt_vec_info::pattern_def_seq): Delete.
> 	(STMT_VINFO_PATTERN_DEF_SEQ): Likewise.
> 	(is_main_pattern_stmt_p): New function.
> 	* tree-vect-loop.c (vect_determine_vf_for_stmt_1): Rename to...
> 	(vect_determine_vf_for_stmt): ...this, deleting the original
> 	function with this name.  Remove vectype_maybe_set_p argument
> 	and test is_pattern_stmt_p instead.  Retain the "examining..."
> 	message from the previous vect_determine_vf_for_stmt.
> 	(vect_compute_single_scalar_iteration_cost, vect_update_vf_for_slp)
> 	(vect_analyze_loop_2): Don't treat pattern statements specially.
> 	(vect_transform_loop): Likewise.  Use vect_orig_stmt to find the
> 	insertion point.
> 	* tree-vect-slp.c (vect_detect_hybrid_slp): Expect pattern statements
> 	to be in the statement list, without needing to follow
> 	STMT_VINFO_RELATED_STMT.  Remove PATTERN_DEF_SEQ handling.
> 	* tree-vect-stmts.c (vect_analyze_stmt): Don't handle pattern
> 	statements specially.
> 	(vect_remove_dead_scalar_stmts): Ignore pattern statements.
> 	* tree-vect-patterns.c (vect_set_pattern_stmt): Insert the pattern
> 	statement into the vec_basic_block immediately before the statement
> 	it replaces.
> 	(append_pattern_def_seq): Likewise.  If the original statement is
> 	itself a pattern statement, associate the new one with the original
> 	statement.
> 	(vect_split_statement): Use append_pattern_def_seq to insert the
> 	first pattern statement.
> 	(vect_recog_vector_vector_shift_pattern): Remove mention of
> 	STMT_VINFO_PATTERN_DEF_SEQ.
> 	(adjust_bool_stmts): Get the last pattern statement from the
> 	stmt_vec_info chain.
> 	(vect_mark_pattern_stmts): Rename to...
> 	(vect_replace_stmt_with_pattern): ...this.  Remove the
> 	PATTERN_DEF_SEQ handling and process only the pattern statement given.
> 	Use append_pattern_def_seq when replacing a pattern statement with
> 	another pattern statement, and use vec_basic_block::remove instead
> 	of gsi_remove to remove the old one.
> 	(vect_pattern_recog_1): Update accordingly.  Remove PATTERN_DEF_SEQ
> 	handling.  On failure, remove any half-formed pattern sequence from
> 	the vec_basic_block.  Install the vector type in pattern statements
> 	that don't yet have one.
> 	(vect_pattern_recog): Iterate over statements that are added
> 	by previous recognizers, but skipping those that have already
> 	been replaced, or the main pattern statement in such a replacement.
Nice cleanup.  OK.
jeff

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [4/6] Make the vectoriser do its own DCE
  2018-08-28 23:01   ` Jeff Law
@ 2018-08-29  7:16     ` Richard Biener
  0 siblings, 0 replies; 22+ messages in thread
From: Richard Biener @ 2018-08-29  7:16 UTC (permalink / raw)
  To: Jeff Law; +Cc: GCC Patches, Richard Sandiford

On Wed, Aug 29, 2018 at 1:01 AM Jeff Law <law@redhat.com> wrote:
>
> On 08/28/2018 05:23 AM, Richard Sandiford wrote:
> > The vectoriser normally leaves a later DCE pass to remove the scalar
> > code, but we've accumulated various special cases for things that
> > DCE can't handle, such as removing the scalar stores that have been
> > replaced by vector stores, and the scalar calls to internal functions.
> > (The latter must be removed for correctness, since no underlying scalar
> > optabs exist for those calls.)
> >
> > Now that vec_basic_block gives us an easy way of iterating over the
> > original scalar code (ignoring any new code inserted by the vectoriser),
> > it seems easier to do the DCE directly.  This involves marking the few
> > cases in which the vector code needs part of the original scalar code
> > to be kept around.
> >
> >
> > 2018-08-28  Richard Sandiford  <richard.sandiford@arm.com>
> >
> > gcc/
> >       * tree-vectorizer.h (_stmt_vec_info::used_by_vector_code_p): New
> >       member variable.
> >       (vect_mark_used_by_vector_code): Declare.
> >       (vect_remove_dead_scalar_stmts): Likewise.
> >       (vect_transform_stmt): Return void.
> >       (vect_remove_stores): Delete.
> >       * tree-vectorizer.c (vec_info::remove_stmt): Handle phis.
> >       * tree-vect-stmts.c (vect_mark_used_by_vector_code): New function.
> >       (vectorizable_call, vectorizable_simd_clone_call): Don't remove
> >       scalar calls here.
> >       (vectorizable_shift): Mark shift amounts in a vector-scalar shift
> >       as being used by the vector code.
> >       (vectorizable_load): Mark unhoisted scalar loads that feed a
> >       load-and-broadcast operation as being needed by the vector code.
> >       (vect_transform_stmt): Return void.
> >       (vect_remove_stores): Delete.
> >       (vect_maybe_remove_scalar_stmt): New function.
> >       (vect_remove_dead_scalar_stmts): Likewise.
> >       * tree-vect-slp.c (vect_slp_bb): Call vect_remove_dead_scalar_stmts.
> >       (vect_remove_slp_scalar_calls): Delete.
> >       (vect_schedule_slp): Don't call it.  Don't remove scalar stores here.
> >       * tree-vect-loop.c (vectorizable_reduction): Mark scalar phis that
> >       are retained by the vector code.
> >       (vectorizable_live_operation): Mark scalar live-out statements that
> >       are retained by the vector code.
> >       (vect_transform_loop_stmt): Remove seen_store argument.  Mark gconds
> >       in nested loops as being needed by the vector code.  Replace the
> >       outer loop's gcond with a dummy condition.
> >       (vect_transform_loop): Update calls accordingly.  Don't remove
> >       scalar stores or calls here; call vect_remove_dead_scalar_stmts
> >       instead.
> >       * tree-vect-loop-manip.c (vect_update_ivs_after_vectorizer): Don't
> >       remove PHIs that were classified as reductions but never actually
> >       vectorized.
> Presumably you'll look at killing the actual DCE pass we're running as a
> follow-up.
>
> I'm hoping this will help the tendency of the vectorizer to leak
> SSA_NAMEs.  Though it's late enough in the pipeline that leaking isn't
> that bad.
>
>
>
>
>
> > Index: gcc/tree-vectorizer.c
> > ===================================================================
> > --- gcc/tree-vectorizer.c     2018-08-28 12:05:11.466982927 +0100
> > +++ gcc/tree-vectorizer.c     2018-08-28 12:05:14.014961439 +0100
> > @@ -654,8 +654,13 @@ vec_info::remove_stmt (stmt_vec_info stm
> >    set_vinfo_for_stmt (stmt_info->stmt, NULL);
> >    gimple_stmt_iterator si = gsi_for_stmt (stmt_info->stmt);
> >    unlink_stmt_vdef (stmt_info->stmt);
> > -  gsi_remove (&si, true);
> > -  release_defs (stmt_info->stmt);
> > +  if (is_a <gphi *> (stmt_info->stmt))
> > +    remove_phi_node (&si, true);
> > +  else
> > +    {
> > +      gsi_remove (&si, true);
> > +      release_defs (stmt_info->stmt);
> > +    }
> More than once I've wondered if we should factor this into its own
> little function.  I'm pretty sure I've seen this style of code
> elsewhere.  Your call whether or not to clean that up.

I think "merging" remove_phi_node and gsi_remove would make more sense.
There's the non-obvious semantic difference of the boolean argument so that
would need cleaning up.  Usually removing also involves calling
unlink_stmt_vdef.

Richard.

> OK with or without creating a little helper for the code above.
> jeff

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [5/6] Insert pattern statements into vec_basic_blocks
  2018-08-28 23:16   ` Jeff Law
@ 2018-08-29  7:18     ` Richard Biener
  0 siblings, 0 replies; 22+ messages in thread
From: Richard Biener @ 2018-08-29  7:18 UTC (permalink / raw)
  To: Jeff Law; +Cc: GCC Patches, Richard Sandiford

On Wed, Aug 29, 2018 at 1:16 AM Jeff Law <law@redhat.com> wrote:
>
> On 08/28/2018 05:24 AM, Richard Sandiford wrote:
> > The point of this patch is to put pattern statements in the same
> > vec_basic_block as the statements they replace, with the pattern
> > statements for S coming between S and S's original predecessor.
> > This removes the need to handle them specially in various places.
> >
> >
> > 2018-08-28  Richard Sandiford  <richard.sandiford@arm.com>
> >
> > gcc/
> >       * tree-vectorizer.h (vec_basic_block): Expand comment.
> >       (_stmt_vec_info::pattern_def_seq): Delete.
> >       (STMT_VINFO_PATTERN_DEF_SEQ): Likewise.
> >       (is_main_pattern_stmt_p): New function.
> >       * tree-vect-loop.c (vect_determine_vf_for_stmt_1): Rename to...
> >       (vect_determine_vf_for_stmt): ...this, deleting the original
> >       function with this name.  Remove vectype_maybe_set_p argument
> >       and test is_pattern_stmt_p instead.  Retain the "examining..."
> >       message from the previous vect_determine_vf_for_stmt.
> >       (vect_compute_single_scalar_iteration_cost, vect_update_vf_for_slp)
> >       (vect_analyze_loop_2): Don't treat pattern statements specially.
> >       (vect_transform_loop): Likewise.  Use vect_orig_stmt to find the
> >       insertion point.
> >       * tree-vect-slp.c (vect_detect_hybrid_slp): Expect pattern statements
> >       to be in the statement list, without needing to follow
> >       STMT_VINFO_RELATED_STMT.  Remove PATTERN_DEF_SEQ handling.
> >       * tree-vect-stmts.c (vect_analyze_stmt): Don't handle pattern
> >       statements specially.
> >       (vect_remove_dead_scalar_stmts): Ignore pattern statements.
> >       * tree-vect-patterns.c (vect_set_pattern_stmt): Insert the pattern
> >       statement into the vec_basic_block immediately before the statement
> >       it replaces.
> >       (append_pattern_def_seq): Likewise.  If the original statement is
> >       itself a pattern statement, associate the new one with the original
> >       statement.
> >       (vect_split_statement): Use append_pattern_def_seq to insert the
> >       first pattern statement.
> >       (vect_recog_vector_vector_shift_pattern): Remove mention of
> >       STMT_VINFO_PATTERN_DEF_SEQ.
> >       (adjust_bool_stmts): Get the last pattern statement from the
> >       stmt_vec_info chain.
> >       (vect_mark_pattern_stmts): Rename to...
> >       (vect_replace_stmt_with_pattern): ...this.  Remove the
> >       PATTERN_DEF_SEQ handling and process only the pattern statement given.
> >       Use append_pattern_def_seq when replacing a pattern statement with
> >       another pattern statement, and use vec_basic_block::remove instead
> >       of gsi_remove to remove the old one.
> >       (vect_pattern_recog_1): Update accordingly.  Remove PATTERN_DEF_SEQ
> >       handling.  On failure, remove any half-formed pattern sequence from
> >       the vec_basic_block.  Install the vector type in pattern statements
> >       that don't yet have one.
> >       (vect_pattern_recog): Iterate over statements that are added
> >       by previous recognizers, but skipping those that have already
> >       been replaced, or the main pattern statement in such a replacement.
> Nice cleanup.  OK.

What I wonder is if this direction eventually allows us to get rid of all the
code that deals with pattern stmts not having SSA operands or immediate
uses?  Or is that already gone...?

Richard.

> jeff

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [6/6] Link imm uses for pattern stmts
  2018-08-28 11:25 ` [6/6] Link imm uses for pattern stmts Richard Sandiford
@ 2018-08-29  7:43   ` Richard Biener
  2018-08-29  9:25     ` Richard Sandiford
  0 siblings, 1 reply; 22+ messages in thread
From: Richard Biener @ 2018-08-29  7:43 UTC (permalink / raw)
  To: GCC Patches, Richard Sandiford, Michael Matz

On Tue, Aug 28, 2018 at 1:25 PM Richard Sandiford
<richard.sandiford@arm.com> wrote:
>
> One of the warts of the vectoriser IR is that it doesn't link SSA name
> uses for pattern statements, leading to complicated special cases in
> vect_mark_stmts_to_be_vectorized and (especially) vect_detect_hybrid_slp.
> It also makes it harder to check how an SSA name is used after pattern
> replacement (something I need for a later patch).
>
> This patch adds a mode in which tree-ssa-operands.c can update
> statements in the same non-invasive way as for debug statements.
> It then uses this mode to update pattern statements when adding
> them to a vec_basic_block, so that pattern statements become
> even more like statements that existed from the outset.

Without reviewing in detail yet I think putting them in the same
ballpark as debug stmts or documenting they have no effect on
code generation is misleading - this is because immediate use
handling _does_ affect code generation unless you mark those
stmts in some way and handle them like debug-uses in all of
the immediate use routines.  For example match-and-simplify
foldings guarded by :s will see the transparent stmts as uses.

So the real effect is to not add virtual operands or marking things
addressable?  IMHO rather than adding a transparent_p flag
we eventually should allow passing down a oep flag to update_stmt?

Do vector pattern stmts cause any of the issues?  I don't remember
seeing things that would cause TREE_ADDRESSABLE.  I know
we have patterns for stores so we'd get extra VDEFs.

IIRC we historically didn't add pattern stmts to the IL simply because
pattern recognition was supposed to be analysis phase and building
and eventually tearing down pattern stmts was thought as to be too
expensive.  That is also because we'd need to copy the loop to
retain a copy w/o those stmts.

So - would update_stmt (stmt, opf_no_vops|opf_non_addressable)
work?  I suppose we'd need to pass down a second sticky_flags
arg.

I know Micha has some cleanup in the tree-ssa-operands.c area
sitting for too long on his disk so maybe he can at least share
if he got all that opf_non_addressable & opf_not_non_addressable
stuff simplified somehow...

Richard.

> 2018-08-28  Richard Sandiford  <richard.sandiford@arm.com>
>
> gcc/
>         * tree-ssa-operands.h (update_stmt_operands): Add a transparent_p
>         argument.
>         * tree-ssa-operands.c (opf_transparent, opf_sticky): New macros.
>         (get_mem_ref_operands, get_tmr_operands): Preserve opf_sticky
>         rather than just opf_no_vops.
>         (get_expr_operands): Preserve opf_sticky bits in the use flags.
>         Assert that opf_no_vops and opf_transparent are already set
>         for the debug statements.  Use opf_transparent rather than
>         is_gimple_debug when deciding whether to mark something as
>         having its address taken.
>         (parse_ssa_operands): Add a transparent_p argument.  Set the
>         opf_no_vops and opf_transparent flags when the argument is true,
>         or when dealing with debug statements.  Check opf_no_vops before
>         adding vuses and vdefs.
>         (build_ssa_operands): Add a transparent_p argument and pass it to
>         parse_ssa_operands.
>         (verify_ssa_operands): Update call to parse_ssa_operands.
>         (update_stmt_operands): Add a transparent_p argument and pass it to
>         build_ssa_operands.
>         * gimple-ssa.h (update_stmt, update_stmt_if_modified)
>         (update_stmt_fn): Add an optional transparent_p parameter and
>         update call to update_stmt_operands.
>         * tree-vect-slp.c (vect_detect_hybrid_slp_1): Delete.
>         (vect_detect_hybrid_slp_2): Likewise.
>         (vect_detect_hybrid_slp): Don't treat pattern statements specially.
>         * tree-vect-stmts.c (vect_mark_stmts_to_be_vectorized): Likewise.
>         (vect_remove_dead_scalar_stmts): Remove pattern statements from
>         the containing vec_info.
>         * tree-vectorizer.h (vec_info::add_pattern_stmt_to_block): Declare.
>         * tree-vectorizer.c (vec_basic_block::add_to_end)
>         (vec_basic_block::add_before): Call add_pattern_stmt_to_block.
>         (vec_basic_block::remove, vec_info::remove_stmt): Call
>         remove_pattern_stmt_from_block.
>         (vec_basic_block::add_pattern_stmt_to_block): New function.
>         (remove_pattern_stmt_from_block): Likewise.
>         (vec_info::free_stmt_vec_info): Handle pattern statements.
>         (vec_info::lookup_single_use): Accept pattern statements
>         as well as original statements.  Ignore uses in statements
>         that have been replaced by a pattern statement.
>         * tree-vect-patterns.c (vect_init_pattern_stmt): Don't call
>         gimple_set_bb.
>         (vect_look_through_possible_promotion): Use vinfo->lookup_single_use
>         instead of has_single_use.  Track single uses for pattern statements
>         too.
>
> Index: gcc/tree-ssa-operands.h
> ===================================================================
> --- gcc/tree-ssa-operands.h     2018-05-02 08:37:32.405761509 +0100
> +++ gcc/tree-ssa-operands.h     2018-08-28 12:05:19.262917177 +0100
> @@ -94,7 +94,7 @@ extern void init_ssa_operands (struct fu
>  extern void fini_ssa_operands (struct function *);
>  extern bool verify_ssa_operands (struct function *, gimple *stmt);
>  extern void free_stmt_operands (struct function *, gimple *);
> -extern void update_stmt_operands (struct function *, gimple *);
> +extern void update_stmt_operands (struct function *, gimple *, bool);
>  extern void swap_ssa_operands (gimple *, tree *, tree *);
>  extern bool verify_imm_links (FILE *f, tree var);
>
> Index: gcc/tree-ssa-operands.c
> ===================================================================
> --- gcc/tree-ssa-operands.c     2018-08-28 11:25:46.242879876 +0100
> +++ gcc/tree-ssa-operands.c     2018-08-28 12:05:19.262917177 +0100
> @@ -99,6 +99,14 @@ #define opf_not_non_addressable (1 << 4)
>  /* Operand is having its address taken.  */
>  #define opf_address_taken (1 << 5)
>
> +/* Operand must have no effect on code generation.  This is used for
> +   debug statements, and also for statements that a pass has no intention
> +   of adding to the block in their current form.  */
> +#define opf_transparent (1 << 6)
> +
> +/* Flags that must never be dropped.  */
> +#define opf_sticky (opf_no_vops | opf_transparent)
> +
>  /* Array for building all the use operands.  */
>  static vec<tree *> build_uses;
>
> @@ -590,7 +598,7 @@ get_mem_ref_operands (struct function *f
>    /* If requested, add a USE operand for the base pointer.  */
>    get_expr_operands (fn, stmt, pptr,
>                      opf_non_addressable | opf_use
> -                    | (flags & (opf_no_vops|opf_not_non_addressable)));
> +                    | (flags & (opf_sticky | opf_not_non_addressable)));
>  }
>
>
> @@ -605,11 +613,11 @@ get_tmr_operands (struct function *fn, g
>
>    /* First record the real operands.  */
>    get_expr_operands (fn, stmt,
> -                    &TMR_BASE (expr), opf_use | (flags & opf_no_vops));
> +                    &TMR_BASE (expr), opf_use | (flags & opf_sticky));
>    get_expr_operands (fn, stmt,
> -                    &TMR_INDEX (expr), opf_use | (flags & opf_no_vops));
> +                    &TMR_INDEX (expr), opf_use | (flags & opf_sticky));
>    get_expr_operands (fn, stmt,
> -                    &TMR_INDEX2 (expr), opf_use | (flags & opf_no_vops));
> +                    &TMR_INDEX2 (expr), opf_use | (flags & opf_sticky));
>
>    add_virtual_operand (fn, stmt, flags);
>  }
> @@ -703,14 +711,14 @@ get_expr_operands (struct function *fn,
>    enum tree_code code;
>    enum tree_code_class codeclass;
>    tree expr = *expr_p;
> -  int uflags = opf_use;
> +  int uflags = opf_use | (flags & opf_sticky);
> +  gcc_checking_assert (!is_gimple_debug (stmt)
> +                      || ((flags & opf_no_vops)
> +                          && (flags & opf_transparent)));
>
>    if (expr == NULL)
>      return;
>
> -  if (is_gimple_debug (stmt))
> -    uflags |= (flags & opf_no_vops);
> -
>    code = TREE_CODE (expr);
>    codeclass = TREE_CODE_CLASS (code);
>
> @@ -723,7 +731,7 @@ get_expr_operands (struct function *fn,
>          resolution).  */
>        if ((!(flags & opf_non_addressable)
>            || (flags & opf_not_non_addressable))
> -         && !is_gimple_debug (stmt))
> +         && !(flags & opf_transparent))
>         mark_address_taken (TREE_OPERAND (expr, 0));
>
>        /* Otherwise, there may be variables referenced inside but there
> @@ -885,43 +893,50 @@ get_expr_operands (struct function *fn,
>
>
>  /* Parse STMT looking for operands.  When finished, the various
> -   build_* operand vectors will have potential operands in them.  */
> +   build_* operand vectors will have potential operands in them.
> +   TRANSPARENT_P as for update_stmt_operands.  */
>
>  static void
> -parse_ssa_operands (struct function *fn, gimple *stmt)
> +parse_ssa_operands (struct function *fn, gimple *stmt, bool transparent_p)
>  {
>    enum gimple_code code = gimple_code (stmt);
>    size_t i, n, start = 0;
> +  int flags = (transparent_p || code == GIMPLE_DEBUG
> +              ? opf_no_vops | opf_transparent : 0);
>
>    switch (code)
>      {
>      case GIMPLE_ASM:
> +      /* Not supported yet (but could be if needed).  */
> +      gcc_assert (!transparent_p);
>        get_asm_stmt_operands (fn, as_a <gasm *> (stmt));
>        break;
>
>      case GIMPLE_TRANSACTION:
>        /* The start of a transaction is a memory barrier.  */
> -      add_virtual_operand (fn, stmt, opf_def | opf_use);
> +      add_virtual_operand (fn, stmt, opf_def | opf_use | flags);
>        break;
>
>      case GIMPLE_DEBUG:
>        if (gimple_debug_bind_p (stmt)
>           && gimple_debug_bind_has_value_p (stmt))
>         get_expr_operands (fn, stmt, gimple_debug_bind_get_value_ptr (stmt),
> -                          opf_use | opf_no_vops);
> +                          opf_use | flags);
>        break;
>
>      case GIMPLE_RETURN:
> -      append_vuse (gimple_vop (fn));
> +      if (!(flags & opf_no_vops))
> +       append_vuse (gimple_vop (fn));
>        goto do_default;
>
>      case GIMPLE_CALL:
>        /* Add call-clobbered operands, if needed.  */
> -      maybe_add_call_vops (fn, as_a <gcall *> (stmt));
> +      if (!(flags & opf_no_vops))
> +       maybe_add_call_vops (fn, as_a <gcall *> (stmt));
>        /* FALLTHRU */
>
>      case GIMPLE_ASSIGN:
> -      get_expr_operands (fn, stmt, gimple_op_ptr (stmt, 0), opf_def);
> +      get_expr_operands (fn, stmt, gimple_op_ptr (stmt, 0), opf_def | flags);
>        start = 1;
>        /* FALLTHRU */
>
> @@ -929,22 +944,23 @@ parse_ssa_operands (struct function *fn,
>      do_default:
>        n = gimple_num_ops (stmt);
>        for (i = start; i < n; i++)
> -       get_expr_operands (fn, stmt, gimple_op_ptr (stmt, i), opf_use);
> +       get_expr_operands (fn, stmt, gimple_op_ptr (stmt, i), opf_use | flags);
>        break;
>      }
>  }
>
>
> -/* Create an operands cache for STMT.  */
> +/* Create an operands cache for STMT.  TRANSPARENT_P as for
> +   update_stmt_operands.  */
>
>  static void
> -build_ssa_operands (struct function *fn, gimple *stmt)
> +build_ssa_operands (struct function *fn, gimple *stmt, bool transparent_p)
>  {
>    /* Initially assume that the statement has no volatile operands.  */
>    gimple_set_has_volatile_ops (stmt, false);
>
>    start_ssa_stmt_operands ();
> -  parse_ssa_operands (fn, stmt);
> +  parse_ssa_operands (fn, stmt, transparent_p);
>    finalize_ssa_stmt_operands (fn, stmt);
>  }
>
> @@ -963,7 +979,7 @@ verify_ssa_operands (struct function *fn
>    /* build_ssa_operands w/o finalizing them.  */
>    gimple_set_has_volatile_ops (stmt, false);
>    start_ssa_stmt_operands ();
> -  parse_ssa_operands (fn, stmt);
> +  parse_ssa_operands (fn, stmt, false);
>
>    /* Now verify the built operands are the same as present in STMT.  */
>    def = gimple_vdef (stmt);
> @@ -1065,10 +1081,11 @@ free_stmt_operands (struct function *fn,
>  }
>
>
> -/* Get the operands of statement STMT.  */
> +/* Get the operands of statement STMT.  TRANSPARENT_P says that opf_transparent
> +   semantics should be used whatever form STMT happens to have.  */
>
>  void
> -update_stmt_operands (struct function *fn, gimple *stmt)
> +update_stmt_operands (struct function *fn, gimple *stmt, bool transparent_p)
>  {
>    /* If update_stmt_operands is called before SSA is initialized, do
>       nothing.  */
> @@ -1078,7 +1095,7 @@ update_stmt_operands (struct function *f
>    timevar_push (TV_TREE_OPS);
>
>    gcc_assert (gimple_modified_p (stmt));
> -  build_ssa_operands (fn, stmt);
> +  build_ssa_operands (fn, stmt, transparent_p);
>    gimple_set_modified (stmt, false);
>
>    timevar_pop (TV_TREE_OPS);
> Index: gcc/gimple-ssa.h
> ===================================================================
> --- gcc/gimple-ssa.h    2018-05-02 08:37:33.501751141 +0100
> +++ gcc/gimple-ssa.h    2018-08-28 12:05:19.262917177 +0100
> @@ -164,36 +164,42 @@ gimple_vdef_op (gimple *g)
>    return NULL_DEF_OPERAND_P;
>  }
>
> -/* Mark statement S as modified, and update it.  */
> +/* Mark statement S as modified, and update it.  TRANSPARENT_P is true
> +   if the update must have no effect on code generation, in much the
> +   same way as for debug statements.  This means in particular that the
> +   statement should not cause things to be marked addressable and should
> +   not use virtual operands.  */
>
>  static inline void
> -update_stmt (gimple *s)
> +update_stmt (gimple *s, bool transparent_p = false)
>  {
>    if (gimple_has_ops (s))
>      {
>        gimple_set_modified (s, true);
> -      update_stmt_operands (cfun, s);
> +      update_stmt_operands (cfun, s, transparent_p);
>      }
>  }
>
> -/* Update statement S if it has been optimized.  */
> +/* Update statement S if it has been optimized.  TRANSPARENT_P is as for
> +   update_stmt.  */
>
>  static inline void
> -update_stmt_if_modified (gimple *s)
> +update_stmt_if_modified (gimple *s, bool transparent_p = false)
>  {
>    if (gimple_modified_p (s))
> -    update_stmt_operands (cfun, s);
> +    update_stmt_operands (cfun, s, transparent_p);
>  }
>
> -/* Mark statement S as modified, and update it.  */
> +/* Mark statement S as modified, and update it.  TRANSPARENT_P is as for
> +   update_stmt.  */
>
>  static inline void
> -update_stmt_fn (struct function *fn, gimple *s)
> +update_stmt_fn (struct function *fn, gimple *s, bool transparent_p = false)
>  {
>    if (gimple_has_ops (s))
>      {
>        gimple_set_modified (s, true);
> -      update_stmt_operands (fn, s);
> +      update_stmt_operands (fn, s, transparent_p);
>      }
>  }
>
> Index: gcc/tree-vect-slp.c
> ===================================================================
> --- gcc/tree-vect-slp.c 2018-08-28 12:05:16.522940287 +0100
> +++ gcc/tree-vect-slp.c 2018-08-28 12:05:19.262917177 +0100
> @@ -2302,50 +2302,6 @@ vect_detect_hybrid_slp_stmts (slp_tree n
>        vect_detect_hybrid_slp_stmts (child, i, stype);
>  }
>
> -/* Helpers for vect_detect_hybrid_slp walking pattern stmt uses.  */
> -
> -static tree
> -vect_detect_hybrid_slp_1 (tree *tp, int *, void *data)
> -{
> -  walk_stmt_info *wi = (walk_stmt_info *)data;
> -  loop_vec_info loop_vinfo = (loop_vec_info) wi->info;
> -
> -  if (wi->is_lhs)
> -    return NULL_TREE;
> -
> -  stmt_vec_info def_stmt_info = loop_vinfo->lookup_def (*tp);
> -  if (def_stmt_info && PURE_SLP_STMT (def_stmt_info))
> -    {
> -      if (dump_enabled_p ())
> -       {
> -         dump_printf_loc (MSG_NOTE, vect_location, "marking hybrid: ");
> -         dump_gimple_stmt (MSG_NOTE, TDF_SLIM, def_stmt_info->stmt, 0);
> -       }
> -      STMT_SLP_TYPE (def_stmt_info) = hybrid;
> -    }
> -
> -  return NULL_TREE;
> -}
> -
> -static tree
> -vect_detect_hybrid_slp_2 (gimple_stmt_iterator *gsi, bool *handled,
> -                         walk_stmt_info *wi)
> -{
> -  loop_vec_info loop_vinfo = (loop_vec_info) wi->info;
> -  stmt_vec_info use_vinfo = loop_vinfo->lookup_stmt (gsi_stmt (*gsi));
> -  /* If the stmt is in a SLP instance then this isn't a reason
> -     to mark use definitions in other SLP instances as hybrid.  */
> -  if (! STMT_SLP_TYPE (use_vinfo)
> -      && (STMT_VINFO_RELEVANT (use_vinfo)
> -         || VECTORIZABLE_CYCLE_DEF (STMT_VINFO_DEF_TYPE (use_vinfo)))
> -      && ! (gimple_code (gsi_stmt (*gsi)) == GIMPLE_PHI
> -           && STMT_VINFO_DEF_TYPE (use_vinfo) == vect_reduction_def))
> -    ;
> -  else
> -    *handled = true;
> -  return NULL_TREE;
> -}
> -
>  /* Find stmts that must be both vectorized and SLPed.  */
>
>  void
> @@ -2357,22 +2313,7 @@ vect_detect_hybrid_slp (loop_vec_info lo
>
>    DUMP_VECT_SCOPE ("vect_detect_hybrid_slp");
>
> -  /* First walk all pattern stmt in the loop and mark defs of uses as
> -     hybrid because immediate uses in them are not recorded.  */
> -  vec_basic_block *vec_bb;
> -  FOR_EACH_VEC_ELT (loop_vinfo->blocks, i, vec_bb)
> -    FOR_EACH_VEC_BB_STMT (vec_bb, stmt_info)
> -      if (is_pattern_stmt_p (stmt_info))
> -       {
> -         walk_stmt_info wi;
> -         memset (&wi, 0, sizeof (wi));
> -         wi.info = loop_vinfo;
> -         gimple_stmt_iterator gsi = gsi_for_stmt (stmt_info->stmt);
> -         walk_gimple_stmt (&gsi, vect_detect_hybrid_slp_2,
> -                           vect_detect_hybrid_slp_1, &wi);
> -       }
> -
> -  /* Then walk the SLP instance trees marking stmts with uses in
> +  /* Walk the SLP instance trees marking stmts with uses in
>       non-SLP stmts as hybrid, also propagating hybrid down the
>       SLP tree, collecting the above info on-the-fly.  */
>    FOR_EACH_VEC_ELT (slp_instances, i, instance)
> Index: gcc/tree-vect-stmts.c
> ===================================================================
> --- gcc/tree-vect-stmts.c       2018-08-28 12:05:16.522940287 +0100
> +++ gcc/tree-vect-stmts.c       2018-08-28 12:05:19.266917143 +0100
> @@ -703,54 +703,13 @@ vect_mark_stmts_to_be_vectorized (loop_v
>              break;
>          }
>
> -      if (is_pattern_stmt_p (stmt_vinfo))
> -        {
> -          /* Pattern statements are not inserted into the code, so
> -             FOR_EACH_PHI_OR_STMT_USE optimizes their operands out, and we
> -             have to scan the RHS or function arguments instead.  */
> -         if (gassign *assign = dyn_cast <gassign *> (stmt_vinfo->stmt))
> -           {
> -             enum tree_code rhs_code = gimple_assign_rhs_code (assign);
> -             tree op = gimple_assign_rhs1 (assign);
> -
> -             i = 1;
> -             if (rhs_code == COND_EXPR && COMPARISON_CLASS_P (op))
> -               {
> -                 if (!process_use (stmt_vinfo, TREE_OPERAND (op, 0),
> -                                   loop_vinfo, relevant, &worklist, false)
> -                     || !process_use (stmt_vinfo, TREE_OPERAND (op, 1),
> -                                      loop_vinfo, relevant, &worklist, false))
> -                   return false;
> -                 i = 2;
> -               }
> -             for (; i < gimple_num_ops (assign); i++)
> -               {
> -                 op = gimple_op (assign, i);
> -                  if (TREE_CODE (op) == SSA_NAME
> -                     && !process_use (stmt_vinfo, op, loop_vinfo, relevant,
> -                                      &worklist, false))
> -                    return false;
> -                 }
> -            }
> -         else if (gcall *call = dyn_cast <gcall *> (stmt_vinfo->stmt))
> -           {
> -             for (i = 0; i < gimple_call_num_args (call); i++)
> -               {
> -                 tree arg = gimple_call_arg (call, i);
> -                 if (!process_use (stmt_vinfo, arg, loop_vinfo, relevant,
> -                                   &worklist, false))
> -                    return false;
> -               }
> -           }
> -        }
> -      else
> -       FOR_EACH_PHI_OR_STMT_USE (use_p, stmt_vinfo->stmt, iter, SSA_OP_USE)
> -          {
> -            tree op = USE_FROM_PTR (use_p);
> -           if (!process_use (stmt_vinfo, op, loop_vinfo, relevant,
> -                             &worklist, false))
> -              return false;
> -          }
> +      FOR_EACH_PHI_OR_STMT_USE (use_p, stmt_vinfo->stmt, iter, SSA_OP_USE)
> +       {
> +         tree op = USE_FROM_PTR (use_p);
> +         if (!process_use (stmt_vinfo, op, loop_vinfo, relevant,
> +                           &worklist, false))
> +           return false;
> +       }
>
>        if (STMT_VINFO_GATHER_SCATTER_P (stmt_vinfo))
>         {
> @@ -10849,7 +10808,9 @@ vect_remove_dead_scalar_stmts (vec_info
>            stmt_info = prev_stmt_info)
>         {
>           prev_stmt_info = stmt_info->prev;
> -         if (!is_pattern_stmt_p (stmt_info))
> +         if (is_pattern_stmt_p (stmt_info))
> +           vinfo->remove_stmt (stmt_info);
> +         else
>             vect_maybe_remove_scalar_stmt (stmt_info);
>         }
>      }
> Index: gcc/tree-vectorizer.h
> ===================================================================
> --- gcc/tree-vectorizer.h       2018-08-28 12:05:16.522940287 +0100
> +++ gcc/tree-vectorizer.h       2018-08-28 12:05:19.266917143 +0100
> @@ -193,6 +193,8 @@ #define SLP_TREE_DEF_TYPE(S)                         (S)->def
>    stmt_vec_info last () const { return m_last; }
>
>  private:
> +  void add_pattern_stmt_to_block (stmt_vec_info);
> +
>    /* The block itself.  */
>    basic_block m_bb;
>
> Index: gcc/tree-vectorizer.c
> ===================================================================
> --- gcc/tree-vectorizer.c       2018-08-28 12:05:14.014961439 +0100
> +++ gcc/tree-vectorizer.c       2018-08-28 12:05:19.266917143 +0100
> @@ -459,6 +459,9 @@ vec_basic_block::add_to_end (stmt_vec_in
>    stmt_info->block = this;
>    stmt_info->prev = m_last;
>    m_last = stmt_info;
> +
> +  if (is_pattern_stmt_p (stmt_info))
> +    add_pattern_stmt_to_block (stmt_info);
>  }
>
>  /* Add STMT_INFO to the block, inserting it before NEXT_STMT_INFO.  */
> @@ -479,6 +482,34 @@ vec_basic_block::add_before (stmt_vec_in
>    stmt_info->prev = next_stmt_info->prev;
>    stmt_info->next = next_stmt_info;
>    next_stmt_info->prev = stmt_info;
> +
> +  if (is_pattern_stmt_p (stmt_info))
> +    add_pattern_stmt_to_block (stmt_info);
> +}
> +
> +/* Record that pattern statement STMT_INFO has just been added to the
> +   vec_basic_block.  Adding it to the underlying basic_block would be
> +   problematic because we need to be able to duplicate the original
> +   scalar code in the middle of vectorization.  Instead we just set the
> +   statement's gimple_bb and link its SSA uses.  */
> +
> +void
> +vec_basic_block::add_pattern_stmt_to_block (stmt_vec_info stmt_info)
> +{
> +  gimple_set_bb (stmt_info->stmt, m_bb);
> +  update_stmt (stmt_info->stmt, true);
> +  gcc_assert (!gimple_vdef (stmt_info->stmt));
> +  gcc_assert (!gimple_vuse (stmt_info->stmt));
> +}
> +
> +/* Record that STMT_INFO has been removed from its vec_basic_block.
> +   Undo the effect of add_pattern_stmt_to_block.  */
> +
> +static void
> +remove_pattern_stmt_from_block (stmt_vec_info stmt_info)
> +{
> +  gimple_set_bb (stmt_info->stmt, NULL);
> +  delink_stmt_imm_use (stmt_info->stmt);
>  }
>
>  /* Remove STMT_INFO from the block.  */
> @@ -497,6 +528,9 @@ vec_basic_block::remove (stmt_vec_info s
>      m_last = stmt_info->prev;
>    stmt_info->block = NULL;
>    stmt_info->prev = stmt_info->next = NULL;
> +
> +  if (is_pattern_stmt_p (stmt_info))
> +    remove_pattern_stmt_from_block (stmt_info);
>  }
>
>  /* Initialize the vec_info with kind KIND_IN and target cost data
> @@ -610,12 +644,37 @@ vec_info::lookup_def (tree name)
>  stmt_vec_info
>  vec_info::lookup_single_use (stmt_vec_info stmt_info)
>  {
> -  tree lhs = gimple_get_lhs (stmt_info->stmt);
> -  use_operand_p dummy;
> -  gimple *use_stmt;
> -  if (single_imm_use (lhs, &dummy, &use_stmt))
> -    return lookup_stmt (use_stmt);
> -  return NULL;
> +  gimple *stmts[2] = { stmt_info->stmt, NULL };
> +  unsigned int num_stmts = 1;
> +  /* If STMT_INFO replaces one of the original scalar statements,
> +     check for uses of that statement's results too, to simulate the
> +     effect of vect_stmt_to_be_vectorized.  */
> +  if (is_main_pattern_stmt_p (stmt_info))
> +    stmts[num_stmts++] = STMT_VINFO_RELATED_STMT (stmt_info)->stmt;
> +  stmt_vec_info result = NULL;
> +  for (unsigned int i = 0; i < num_stmts; ++i)
> +    {
> +      use_operand_p use_p;
> +      imm_use_iterator imm_iter;
> +      tree lhs = gimple_get_lhs (stmts[i]);
> +      FOR_EACH_IMM_USE_FAST (use_p, imm_iter, lhs)
> +       {
> +         gimple *use_stmt = USE_STMT (use_p);
> +         if (is_gimple_debug (use_stmt))
> +           continue;
> +         stmt_vec_info use_stmt_info = lookup_stmt (use_stmt);
> +         if (!use_stmt_info)
> +           return NULL;
> +         /* Ignore statements that have been replaced by a pattern
> +            and consider only the replacement statement.  */
> +         if (STMT_VINFO_IN_PATTERN_P (use_stmt_info))
> +           continue;
> +         if (result)
> +           return NULL;
> +         result = use_stmt_info;
> +       }
> +    }
> +  return result;
>  }
>
>  /* Return vectorization information about DR.  */
> @@ -650,16 +709,18 @@ vec_info::move_dr (stmt_vec_info new_stm
>  void
>  vec_info::remove_stmt (stmt_vec_info stmt_info)
>  {
> -  gcc_assert (!stmt_info->pattern_stmt_p);
>    set_vinfo_for_stmt (stmt_info->stmt, NULL);
>    gimple_stmt_iterator si = gsi_for_stmt (stmt_info->stmt);
>    unlink_stmt_vdef (stmt_info->stmt);
> -  if (is_a <gphi *> (stmt_info->stmt))
> -    remove_phi_node (&si, true);
> -  else
> +  if (!is_pattern_stmt_p (stmt_info))
>      {
> -      gsi_remove (&si, true);
> -      release_defs (stmt_info->stmt);
> +      if (is_a <gphi *> (stmt_info->stmt))
> +       remove_phi_node (&si, true);
> +      else
> +       {
> +         gsi_remove (&si, true);
> +         release_defs (stmt_info->stmt);
> +       }
>      }
>    stmt_info->block->remove (stmt_info);
>    free_stmt_vec_info (stmt_info);
> @@ -749,12 +810,13 @@ vec_info::free_stmt_vec_infos (void)
>  void
>  vec_info::free_stmt_vec_info (stmt_vec_info stmt_info)
>  {
> -  if (stmt_info->pattern_stmt_p)
> +  if (is_pattern_stmt_p (stmt_info))
>      {
> -      gimple_set_bb (stmt_info->stmt, NULL);
>        tree lhs = gimple_get_lhs (stmt_info->stmt);
>        if (lhs && TREE_CODE (lhs) == SSA_NAME)
>         release_ssa_name (lhs);
> +      if (stmt_info->block)
> +       remove_pattern_stmt_from_block (stmt_info);
>      }
>
>    STMT_VINFO_SAME_ALIGN_REFS (stmt_info).release ();
> Index: gcc/tree-vect-patterns.c
> ===================================================================
> --- gcc/tree-vect-patterns.c    2018-08-28 12:05:16.518940320 +0100
> +++ gcc/tree-vect-patterns.c    2018-08-28 12:05:19.262917177 +0100
> @@ -106,7 +106,6 @@ vect_init_pattern_stmt (gimple *pattern_
>    stmt_vec_info pattern_stmt_info = vinfo->lookup_stmt (pattern_stmt);
>    if (pattern_stmt_info == NULL)
>      pattern_stmt_info = orig_stmt_info->vinfo->add_stmt (pattern_stmt);
> -  gimple_set_bb (pattern_stmt, gimple_bb (orig_stmt_info->stmt));
>
>    pattern_stmt_info->pattern_stmt_p = true;
>    STMT_VINFO_RELATED_STMT (pattern_stmt_info) = orig_stmt_info;
> @@ -412,11 +411,9 @@ vect_look_through_possible_promotion (ve
>         break;
>        caster = def_stmt_info;
>
> -      /* Ignore pattern statements, since we don't link uses for them.  */
>        if (caster
>           && single_use_p
> -         && !STMT_VINFO_RELATED_STMT (caster)
> -         && !has_single_use (res))
> +         && !vinfo->lookup_single_use (caster))
>         *single_use_p = false;
>
>        gassign *assign = dyn_cast <gassign *> (def_stmt);

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [5/6] Insert pattern statements into vec_basic_blocks
  2018-08-28 11:25 ` [5/6] Insert pattern statements into vec_basic_blocks Richard Sandiford
  2018-08-28 23:16   ` Jeff Law
@ 2018-08-29  7:55   ` Jakub Jelinek
  2018-08-29  8:59     ` Richard Sandiford
  1 sibling, 1 reply; 22+ messages in thread
From: Jakub Jelinek @ 2018-08-29  7:55 UTC (permalink / raw)
  To: gcc-patches, richard.sandiford

On Tue, Aug 28, 2018 at 12:24:06PM +0100, Richard Sandiford wrote:
> The point of this patch is to put pattern statements in the same
> vec_basic_block as the statements they replace, with the pattern
> statements for S coming between S and S's original predecessor.
> This removes the need to handle them specially in various places.

My preferred way to handle pattern stmts would be to do what we do in
tree-if-conversion, i.e. whenever creating first pattern stmt for certain
loop, duplicate that loop directly in the IL guarded with an ifn
and modify directly one copy of the loop (the one meant to be vectorized).
If we aren't cycling over multiple vectorization factors, that could be even
done directly on the tree-if-conversion created vector loop copy, otherwise
we'd need more.

For tree-if-conversion, we have:
  if (LOOP_VECTORIZED (num))
    loop;
  else
    loop; // scalar loop, to use if vectorization fails
so for multiple vfs we could have:
  if (LOOP_VECTORIZED (num))
    {
      if (LOOP_VECTORIZED (num, 8))
	loop; // if vf is 8
      else if (LOOP_VECTORIZED (num, 4))
	loop; // if vf is 4
      else
	loop; // otherwise
    }
  else
    loop; // scalar loop, to use if vectorization fails

	Jakub

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [5/6] Insert pattern statements into vec_basic_blocks
  2018-08-29  7:55   ` Jakub Jelinek
@ 2018-08-29  8:59     ` Richard Sandiford
  2018-08-29  9:10       ` Jakub Jelinek
  0 siblings, 1 reply; 22+ messages in thread
From: Richard Sandiford @ 2018-08-29  8:59 UTC (permalink / raw)
  To: Jakub Jelinek; +Cc: gcc-patches

Jakub Jelinek <jakub@redhat.com> writes:
> On Tue, Aug 28, 2018 at 12:24:06PM +0100, Richard Sandiford wrote:
>> The point of this patch is to put pattern statements in the same
>> vec_basic_block as the statements they replace, with the pattern
>> statements for S coming between S and S's original predecessor.
>> This removes the need to handle them specially in various places.
>
> My preferred way to handle pattern stmts would be to do what we do in
> tree-if-conversion, i.e. whenever creating first pattern stmt for certain
> loop, duplicate that loop directly in the IL guarded with an ifn
> and modify directly one copy of the loop (the one meant to be vectorized).
> If we aren't cycling over multiple vectorization factors, that could be even
> done directly on the tree-if-conversion created vector loop copy, otherwise
> we'd need more.

I'd originally tried adding the pattern stmts to the gimple bb as well
as the vec_basic_block, but the problem is that we create pattern stmts
before duplicating the scalar loop for peeling.  So I think we'd need to
copy the loop even for the single-size case, or arrange some other way of
temporarily restoring the original code.

And like you say, there's the problem of trying multiple vector sizes.
We'd need to either copy the IL for each size or make sure that we
can reliably roll back to the original scalar code.  Rolling back would
make it harder to try all available vector sizes and pick the cheapest,
since we'd have to reroll the pattern statements for the chosen size.

Thanks,
Richard

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [5/6] Insert pattern statements into vec_basic_blocks
  2018-08-29  8:59     ` Richard Sandiford
@ 2018-08-29  9:10       ` Jakub Jelinek
  2018-08-29  9:22         ` Richard Biener
  0 siblings, 1 reply; 22+ messages in thread
From: Jakub Jelinek @ 2018-08-29  9:10 UTC (permalink / raw)
  To: gcc-patches, richard.sandiford

On Wed, Aug 29, 2018 at 09:59:07AM +0100, Richard Sandiford wrote:
> Jakub Jelinek <jakub@redhat.com> writes:
> > On Tue, Aug 28, 2018 at 12:24:06PM +0100, Richard Sandiford wrote:
> >> The point of this patch is to put pattern statements in the same
> >> vec_basic_block as the statements they replace, with the pattern
> >> statements for S coming between S and S's original predecessor.
> >> This removes the need to handle them specially in various places.
> >
> > My preferred way to handle pattern stmts would be to do what we do in
> > tree-if-conversion, i.e. whenever creating first pattern stmt for certain
> > loop, duplicate that loop directly in the IL guarded with an ifn
> > and modify directly one copy of the loop (the one meant to be vectorized).
> > If we aren't cycling over multiple vectorization factors, that could be even
> > done directly on the tree-if-conversion created vector loop copy, otherwise
> > we'd need more.
> 
> I'd originally tried adding the pattern stmts to the gimple bb as well
> as the vec_basic_block, but the problem is that we create pattern stmts
> before duplicating the scalar loop for peeling.  So I think we'd need to
> copy the loop even for the single-size case, or arrange some other way of
> temporarily restoring the original code.

Sure, if tree-if-conversion doesn't copy the loop and for single-size case,
we'd need to do that copy (once per loop), if it already copied the loop,
there is already the scalar and for vectorization only loop.
With those copies you don't need to roll back anything, just ensure the
guarding internal fns fold to constants when the vectorization is over based
on if the loop was vectorized and with what vf, and cfg cleanup will do the
rest.
The vectorizer already has code to find the scalar loop in the
LOOP_VECTORIZED condition guarded else block.

	Jakub

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [5/6] Insert pattern statements into vec_basic_blocks
  2018-08-29  9:10       ` Jakub Jelinek
@ 2018-08-29  9:22         ` Richard Biener
  2018-08-29  9:38           ` Richard Sandiford
  0 siblings, 1 reply; 22+ messages in thread
From: Richard Biener @ 2018-08-29  9:22 UTC (permalink / raw)
  To: Jakub Jelinek; +Cc: GCC Patches, Richard Sandiford

On Wed, Aug 29, 2018 at 11:10 AM Jakub Jelinek <jakub@redhat.com> wrote:
>
> On Wed, Aug 29, 2018 at 09:59:07AM +0100, Richard Sandiford wrote:
> > Jakub Jelinek <jakub@redhat.com> writes:
> > > On Tue, Aug 28, 2018 at 12:24:06PM +0100, Richard Sandiford wrote:
> > >> The point of this patch is to put pattern statements in the same
> > >> vec_basic_block as the statements they replace, with the pattern
> > >> statements for S coming between S and S's original predecessor.
> > >> This removes the need to handle them specially in various places.
> > >
> > > My preferred way to handle pattern stmts would be to do what we do in
> > > tree-if-conversion, i.e. whenever creating first pattern stmt for certain
> > > loop, duplicate that loop directly in the IL guarded with an ifn
> > > and modify directly one copy of the loop (the one meant to be vectorized).
> > > If we aren't cycling over multiple vectorization factors, that could be even
> > > done directly on the tree-if-conversion created vector loop copy, otherwise
> > > we'd need more.
> >
> > I'd originally tried adding the pattern stmts to the gimple bb as well
> > as the vec_basic_block, but the problem is that we create pattern stmts
> > before duplicating the scalar loop for peeling.  So I think we'd need to
> > copy the loop even for the single-size case, or arrange some other way of
> > temporarily restoring the original code.
>
> Sure, if tree-if-conversion doesn't copy the loop and for single-size case,
> we'd need to do that copy (once per loop), if it already copied the loop,
> there is already the scalar and for vectorization only loop.
> With those copies you don't need to roll back anything, just ensure the
> guarding internal fns fold to constants when the vectorization is over based
> on if the loop was vectorized and with what vf, and cfg cleanup will do the
> rest.
> The vectorizer already has code to find the scalar loop in the
> LOOP_VECTORIZED condition guarded else block.

My hope is still that we can eventually get rid of pattern stmts - I see how
they make things easier, esp. for the issue of computing a vectorization
factor ...

Note the duplication comes at a cost - we do not version loops for if-conversion
before we know we'll apply if-conversion and it will succeed.  It would be
bad if we start to duplicate each loop just because we start vectorization
analysis (and pattern recog is done very early).

Richard.

>         Jakub

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [6/6] Link imm uses for pattern stmts
  2018-08-29  7:43   ` Richard Biener
@ 2018-08-29  9:25     ` Richard Sandiford
  2018-08-30 10:24       ` Richard Biener
  0 siblings, 1 reply; 22+ messages in thread
From: Richard Sandiford @ 2018-08-29  9:25 UTC (permalink / raw)
  To: Richard Biener; +Cc: GCC Patches, Michael Matz

Richard Biener <richard.guenther@gmail.com> writes:
> On Tue, Aug 28, 2018 at 1:25 PM Richard Sandiford
> <richard.sandiford@arm.com> wrote:
>>
>> One of the warts of the vectoriser IR is that it doesn't link SSA name
>> uses for pattern statements, leading to complicated special cases in
>> vect_mark_stmts_to_be_vectorized and (especially) vect_detect_hybrid_slp.
>> It also makes it harder to check how an SSA name is used after pattern
>> replacement (something I need for a later patch).
>>
>> This patch adds a mode in which tree-ssa-operands.c can update
>> statements in the same non-invasive way as for debug statements.
>> It then uses this mode to update pattern statements when adding
>> them to a vec_basic_block, so that pattern statements become
>> even more like statements that existed from the outset.
>
> Without reviewing in detail yet I think putting them in the same
> ballpark as debug stmts or documenting they have no effect on
> code generation is misleading - this is because immediate use
> handling _does_ affect code generation unless you mark those
> stmts in some way and handle them like debug-uses in all of
> the immediate use routines.  For example match-and-simplify
> foldings guarded by :s will see the transparent stmts as uses.

Yeah, true.  So the comment was plain wrong.  I guess what I meant to
say was that adding the statement and removing it again must leave the
IL the same as before.

> So the real effect is to not add virtual operands or marking things
> addressable?  IMHO rather than adding a transparent_p flag
> we eventually should allow passing down a oep flag to update_stmt?
>
> Do vector pattern stmts cause any of the issues?  I don't remember
> seeing things that would cause TREE_ADDRESSABLE.  I know
> we have patterns for stores so we'd get extra VDEFs.

Yeah, it was the vop case that motivated the patch.  I don't know of
any specific examples in which pattern statements could cause the
addressable flag to be set, but in principle we should handle that
case like debug statements.

The series was building up to adding support for extending loads and
truncating stores, which would be another source of potential VUSEs and VDEFs.

> IIRC we historically didn't add pattern stmts to the IL simply because
> pattern recognition was supposed to be analysis phase and building
> and eventually tearing down pattern stmts was thought as to be too
> expensive.  That is also because we'd need to copy the loop to
> retain a copy w/o those stmts.
>
> So - would update_stmt (stmt, opf_no_vops|opf_non_addressable)
> work?  I suppose we'd need to pass down a second sticky_flags
> arg.

I don't mind making the opf_* flags part of the public interface
if you think that's cleaner, but I'd still prefer to have a name
that abstracts the semantics.  Using opf_no_vops|opf_non_addressable
directly feels a bit low-level.

Thanks,
Richard

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [5/6] Insert pattern statements into vec_basic_blocks
  2018-08-29  9:22         ` Richard Biener
@ 2018-08-29  9:38           ` Richard Sandiford
  0 siblings, 0 replies; 22+ messages in thread
From: Richard Sandiford @ 2018-08-29  9:38 UTC (permalink / raw)
  To: Richard Biener; +Cc: Jakub Jelinek, GCC Patches

Richard Biener <richard.guenther@gmail.com> writes:
> On Wed, Aug 29, 2018 at 11:10 AM Jakub Jelinek <jakub@redhat.com> wrote:
>>
>> On Wed, Aug 29, 2018 at 09:59:07AM +0100, Richard Sandiford wrote:
>> > Jakub Jelinek <jakub@redhat.com> writes:
>> > > On Tue, Aug 28, 2018 at 12:24:06PM +0100, Richard Sandiford wrote:
>> > >> The point of this patch is to put pattern statements in the same
>> > >> vec_basic_block as the statements they replace, with the pattern
>> > >> statements for S coming between S and S's original predecessor.
>> > >> This removes the need to handle them specially in various places.
>> > >
>> > > My preferred way to handle pattern stmts would be to do what we do in
>> > > tree-if-conversion, i.e. whenever creating first pattern stmt for certain
>> > > loop, duplicate that loop directly in the IL guarded with an ifn
>> > > and modify directly one copy of the loop (the one meant to be vectorized).
>> > > If we aren't cycling over multiple vectorization factors, that
>> > > could be even
>> > > done directly on the tree-if-conversion created vector loop copy,
>> > > otherwise
>> > > we'd need more.
>> >
>> > I'd originally tried adding the pattern stmts to the gimple bb as well
>> > as the vec_basic_block, but the problem is that we create pattern stmts
>> > before duplicating the scalar loop for peeling.  So I think we'd need to
>> > copy the loop even for the single-size case, or arrange some other way of
>> > temporarily restoring the original code.
>>
>> Sure, if tree-if-conversion doesn't copy the loop and for single-size case,
>> we'd need to do that copy (once per loop), if it already copied the loop,
>> there is already the scalar and for vectorization only loop.
>> With those copies you don't need to roll back anything, just ensure the
>> guarding internal fns fold to constants when the vectorization is over based
>> on if the loop was vectorized and with what vf, and cfg cleanup will do the
>> rest.
>> The vectorizer already has code to find the scalar loop in the
>> LOOP_VECTORIZED condition guarded else block.
>
> My hope is still that we can eventually get rid of pattern stmts - I see how
> they make things easier, esp. for the issue of computing a vectorization
> factor ...

They seem like a really nice feature to me :-)  Doing pattern recognition
earlier (e.g. in a prepass) would be a problem because not all vector
sizes provide the same features.  And doing pattern recognition on the
fly during stmt analysis would make it harder to implement some of the
more global transforms.

At the moment pattern stmts seem pretty light-weight too.

Thanks,
Richard

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [6/6] Link imm uses for pattern stmts
  2018-08-29  9:25     ` Richard Sandiford
@ 2018-08-30 10:24       ` Richard Biener
  0 siblings, 0 replies; 22+ messages in thread
From: Richard Biener @ 2018-08-30 10:24 UTC (permalink / raw)
  To: GCC Patches, Michael Matz, Richard Sandiford

On Wed, Aug 29, 2018 at 11:25 AM Richard Sandiford
<richard.sandiford@arm.com> wrote:
>
> Richard Biener <richard.guenther@gmail.com> writes:
> > On Tue, Aug 28, 2018 at 1:25 PM Richard Sandiford
> > <richard.sandiford@arm.com> wrote:
> >>
> >> One of the warts of the vectoriser IR is that it doesn't link SSA name
> >> uses for pattern statements, leading to complicated special cases in
> >> vect_mark_stmts_to_be_vectorized and (especially) vect_detect_hybrid_slp.
> >> It also makes it harder to check how an SSA name is used after pattern
> >> replacement (something I need for a later patch).
> >>
> >> This patch adds a mode in which tree-ssa-operands.c can update
> >> statements in the same non-invasive way as for debug statements.
> >> It then uses this mode to update pattern statements when adding
> >> them to a vec_basic_block, so that pattern statements become
> >> even more like statements that existed from the outset.
> >
> > Without reviewing in detail yet I think putting them in the same
> > ballpark as debug stmts or documenting they have no effect on
> > code generation is misleading - this is because immediate use
> > handling _does_ affect code generation unless you mark those
> > stmts in some way and handle them like debug-uses in all of
> > the immediate use routines.  For example match-and-simplify
> > foldings guarded by :s will see the transparent stmts as uses.
>
> Yeah, true.  So the comment was plain wrong.  I guess what I meant to
> say was that adding the statement and removing it again must leave the
> IL the same as before.
>
> > So the real effect is to not add virtual operands or marking things
> > addressable?  IMHO rather than adding a transparent_p flag
> > we eventually should allow passing down a oep flag to update_stmt?
> >
> > Do vector pattern stmts cause any of the issues?  I don't remember
> > seeing things that would cause TREE_ADDRESSABLE.  I know
> > we have patterns for stores so we'd get extra VDEFs.
>
> Yeah, it was the vop case that motivated the patch.  I don't know of
> any specific examples in which pattern statements could cause the
> addressable flag to be set, but in principle we should handle that
> case like debug statements.
>
> The series was building up to adding support for extending loads and
> truncating stores, which would be another source of potential VUSEs and VDEFs.
>
> > IIRC we historically didn't add pattern stmts to the IL simply because
> > pattern recognition was supposed to be analysis phase and building
> > and eventually tearing down pattern stmts was thought as to be too
> > expensive.  That is also because we'd need to copy the loop to
> > retain a copy w/o those stmts.
> >
> > So - would update_stmt (stmt, opf_no_vops|opf_non_addressable)
> > work?  I suppose we'd need to pass down a second sticky_flags
> > arg.
>
> I don't mind making the opf_* flags part of the public interface
> if you think that's cleaner, but I'd still prefer to have a name
> that abstracts the semantics.  Using opf_no_vops|opf_non_addressable
> directly feels a bit low-level.

True.  I guess not exposing the opf_ flags would be better.  Maybe
really just clall the flag like_debug_stmt_p ;)  That makes the semantics
obvious as well - you can of course document that, for example, it
doesn't make things addressable or add virtual operands.

I suppose the patch is OK with that mere documentation change.

Richard.

>
> Thanks,
> Richard

^ permalink raw reply	[flat|nested] 22+ messages in thread

end of thread, other threads:[~2018-08-30 10:24 UTC | newest]

Thread overview: 22+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-08-28 11:19 [0/6] Make vector pattern statements less special Richard Sandiford
2018-08-28 11:20 ` [1/6] Handle gphis in gimple_get_lhs Richard Sandiford
2018-08-28 18:22   ` Jeff Law
2018-08-28 11:21 ` [2/6] Make vec_info::lookup_single_use take a stmt_vec_info Richard Sandiford
2018-08-28 18:25   ` Jeff Law
2018-08-28 11:22 ` [3/6] Add a vec_basic_block structure Richard Sandiford
2018-08-28 22:38   ` Jeff Law
2018-08-28 11:23 ` [4/6] Make the vectoriser do its own DCE Richard Sandiford
2018-08-28 23:01   ` Jeff Law
2018-08-29  7:16     ` Richard Biener
2018-08-28 11:25 ` [6/6] Link imm uses for pattern stmts Richard Sandiford
2018-08-29  7:43   ` Richard Biener
2018-08-29  9:25     ` Richard Sandiford
2018-08-30 10:24       ` Richard Biener
2018-08-28 11:25 ` [5/6] Insert pattern statements into vec_basic_blocks Richard Sandiford
2018-08-28 23:16   ` Jeff Law
2018-08-29  7:18     ` Richard Biener
2018-08-29  7:55   ` Jakub Jelinek
2018-08-29  8:59     ` Richard Sandiford
2018-08-29  9:10       ` Jakub Jelinek
2018-08-29  9:22         ` Richard Biener
2018-08-29  9:38           ` Richard Sandiford

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).