From: Tamar Christina <Tamar.Christina@arm.com>
To: Richard Biener <rguenther@suse.de>
Cc: "gcc-patches@gcc.gnu.org" <gcc-patches@gcc.gnu.org>,
	nd <nd@arm.com>, "jlaw@ventanamicro.com" <jlaw@ventanamicro.com>
Subject: RE: [PATCH 9/21]middle-end: implement vectorizable_early_exit for codegen of exit code
Date: Wed, 6 Dec 2023 04:37:30 +0000	[thread overview]
Message-ID: <VI1PR08MB53254186A2A0585B263FF72AFF84A@VI1PR08MB5325.eurprd08.prod.outlook.com> (raw)
In-Reply-To: <85570n66-1540-0r07-7q80-269p3o133585@fhfr.qr>

> > > +
> > > +  tree truth_type = truth_type_for (vectype_op);  machine_mode mode =
> > > + TYPE_MODE (truth_type);  int ncopies;
> > > +
> 
> more line break issues ... (also below, check yourself)
> 
> shouldn't STMT_VINFO_VECTYPE already match truth_type here?  If not
> it looks to be set wrongly (or shouldn't be set at all)
> 

Fixed, I now leverage the existing vect_recog_bool_pattern to update the types
if needed and determine the initial type in vect_get_vector_types_for_stmt.
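
For illustration, the type derivation now boils down to these calls
(a condensed sketch stitched together from the hunks below; it uses the
GCC-internal API, so it is not a standalone program):

  /* Sketch only: deriving the vectype for a gcond.  */
  tree scalar_type = TREE_TYPE (gimple_cond_lhs (cond));
  tree vectype = get_vectype_for_scalar_type (vinfo, scalar_type,
					      group_size);
  /* e.g. int -> V4SI; then get the mask type with V4SI's layout.  */
  vectype = truth_type_for (vectype);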

> > > +  if (slp_node)
> > > +    ncopies = 1;
> > > +  else
> > > +    ncopies = vect_get_num_copies (loop_vinfo, truth_type);
> > > +
> > > +  vec_loop_masks *masks = &LOOP_VINFO_MASKS (loop_vinfo);  bool
> > > + masked_loop_p = LOOP_VINFO_FULLY_MASKED_P (loop_vinfo);
> > > +
> 
> what about with_len?

Should be easy to add, but I don't know how with_len works.

> 
> > > +  /* Analyze only.  */
> > > +  if (!vec_stmt)
> > > +    {
> > > +      if (direct_optab_handler (cbranch_optab, mode) == CODE_FOR_nothing)
> > > +	{
> > > +	  if (dump_enabled_p ())
> > > +	      dump_printf_loc (MSG_MISSED_OPTIMIZATION, vect_location,
> > > +			       "can't vectorize early exit because the "
> > > +			       "target doesn't support flag setting vector "
> > > +			       "comparisons.\n");
> > > +	  return false;
> > > +	}
> > > +
> > > +      if (!expand_vec_cmp_expr_p (vectype_op, truth_type, NE_EXPR))
> 
> Why NE_EXPR?  This looks wrong.  Or vectype_op is wrong if you're
> emitting
> 
>  mask = op0 CMP op1;
>  if (mask != 0)
> 
> I think you need to check for CMP, not NE_EXPR.

Well, CMP is checked by vectorizable_comparison_1, but I realized this
check wasn't testing what I wanted and that the cbranch requirements
already cover it.  So I removed it.

> 
> > > +	{
> > > +	  if (dump_enabled_p ())
> > > +	      dump_printf_loc (MSG_MISSED_OPTIMIZATION, vect_location,
> > > +			       "can't vectorize early exit because the "
> > > +			       "target does not support boolean vector "
> > > +			       "comparisons for type %T.\n", truth_type);
> > > +	  return false;
> > > +	}
> > > +
> > > +      if (ncopies > 1
> > > +	  && direct_optab_handler (ior_optab, mode) == CODE_FOR_nothing)
> > > +	{
> > > +	  if (dump_enabled_p ())
> > > +	      dump_printf_loc (MSG_MISSED_OPTIMIZATION, vect_location,
> > > +			       "can't vectorize early exit because the "
> > > +			       "target does not support boolean vector OR for "
> > > +			       "type %T.\n", truth_type);
> > > +	  return false;
> > > +	}
> > > +
> > > +      if (!vectorizable_comparison_1 (vinfo, truth_type, stmt_info, code, gsi,
> > > +				      vec_stmt, slp_node, cost_vec))
> > > +	return false;
> 
> I suppose vectorizable_comparison_1 will check this again, so the above
> is redundant?
> 

The IOR?  No, vectorizable_comparison_1 doesn't reduce, so it may not
check it depending on the condition.

> > > +  /* Determine if we need to reduce the final value.  */
> > > +  if (stmts.length () > 1)
> > > +    {
> > > +      /* We build the reductions in a way to maintain as much parallelism as
> > > +	 possible.  */
> > > +      auto_vec<tree> workset (stmts.length ());
> > > +      workset.splice (stmts);
> > > +      while (workset.length () > 1)
> > > +	{
> > > +	  new_temp = make_temp_ssa_name (truth_type, NULL,
> > > "vexit_reduc");
> > > +	  tree arg0 = workset.pop ();
> > > +	  tree arg1 = workset.pop ();
> > > +	  new_stmt = gimple_build_assign (new_temp, BIT_IOR_EXPR, arg0,
> > > arg1);
> > > +	  vect_finish_stmt_generation (loop_vinfo, stmt_info, new_stmt,
> > > +				       &cond_gsi);
> > > +	  if (slp_node)
> > > +	    slp_node->push_vec_def (new_stmt);
> > > +	  else
> > > +	    STMT_VINFO_VEC_STMTS (stmt_info).safe_push (new_stmt);
> > > +	  workset.quick_insert (0, new_temp);
> 
> Reduction epilogue handling has similar code to reduce a set of vectors
> to a single one with an operation.  I think we want to share that code.
> 

I've taken a look, but that code isn't suitable here since the constraints
differ.  I don't require an in-order reduction: for the comparison all we
care about is whether any bit in a lane is set or not.  This means:

1. we can reduce using a fast operation like IOR.
2. we can reduce with as much parallelism as possible.

The comparison is on the critical path for the loop now, unlike live
reductions which are always at the end, so using the live reduction code
resulted in a slowdown since it creates a longer dependency chain (see the
sketch below).
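
To make the depth argument concrete, here is a minimal standalone model
of that workset loop (plain C++, not GCC code; mask_t stands in for one
vector boolean mask):

  #include <cstdint>
  #include <deque>

  using mask_t = uint64_t;	/* stand-in for one vector boolean mask */

  /* Pop two defs off the back, IOR them, and queue the result at the
     front, mirroring the workset loop in the patch.  For N inputs this
     builds a tree of depth ceil(log2(N)) instead of the depth N-1 chain
     an in-order reduction would create.  */
  static mask_t
  reduce_ior (std::deque<mask_t> workset)
  {
    while (workset.size () > 1)
      {
	mask_t arg0 = workset.back (); workset.pop_back ();
	mask_t arg1 = workset.back (); workset.pop_back ();
	workset.push_front (arg0 | arg1);	/* the BIT_IOR_EXPR stmt */
      }
    return workset.front ();
  }

E.g. for four masks this emits t1 = m3|m2 and t2 = m1|m0, which are
independent, then t3 = t1|t2, so the chain length is 2 instead of 3.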

> > > +	}
> > > +    }
> > > +  else
> > > +    new_temp = stmts[0];
> > > +
> > > +  gcc_assert (new_temp);
> > > +
> > > +  tree cond = new_temp;
> > > +  if (masked_loop_p)
> > > +    {
> > > +      tree mask = vect_get_loop_mask (loop_vinfo, gsi, masks, ncopies,
> > > truth_type, 0);
> > > +      cond = prepare_vec_mask (loop_vinfo, TREE_TYPE (mask), mask, cond,
> > > +			       &cond_gsi);
> 
> I don't think this is correct when 'stmts' had more than one vector?
> 

It is, because even when VLA, partial vectors are disabled since we only
support counted loops.  And it looks like --param vect-partial-vector-usage=1
cannot force them on.

In principle I suppose I could mask the individual stmts; that should handle
the future case when this is relaxed to support non-fixed-length buffers?
A sketch of what the emitted exit test computes follows below.
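
Something like the following models what the emitted exit test computes
(plain C++ with a scalar stand-in for the vector ops; the AND with the
loop mask is what prepare_vec_mask arranges):

  #include <cstdint>

  /* Each bit models one lane.  In a fully-masked loop the reduced
     comparison result is ANDed with the loop mask so that inactive
     lanes cannot trigger the exit; the gcond then tests whether any
     lane is set, i.e. 'if (cond != 0)'.  */
  static bool
  early_exit_taken (uint64_t cmp_result, uint64_t loop_mask,
		    bool masked_loop_p)
  {
    uint64_t cond = cmp_result;
    if (masked_loop_p)
      cond &= loop_mask;
    return cond != 0;
  }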

Bootstrapped Regtested on aarch64-none-linux-gnu and no issues.

Ok for master?

Thanks,
Tamar

gcc/ChangeLog:

	* tree-vect-patterns.cc (vect_init_pattern_stmt): Support gconds.
	(check_bool_pattern, adjust_bool_pattern, adjust_bool_stmts,
	vect_recog_bool_pattern): Support gconds type analysis.
	* tree-vect-stmts.cc (vectorizable_comparison_1): Support stmts without
	lhs.
	(vectorizable_early_exit): New.
	(vect_analyze_stmt, vect_transform_stmt): Use it.
	(vect_is_simple_use, vect_get_vector_types_for_stmt): Support gcond.
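
As a reference, the kind of loop this patch enables vectorizing looks
roughly like this (hypothetical example, not taken from the testsuite):

  int
  first_match (int *x, int n)
  {
    int i;
    for (i = 0; i < n; i++)
      if (x[i] == 42)	/* the early-exit gcond handled above */
	break;
    return i;
  }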

--- inline copy of patch ---

diff --git a/gcc/tree-vect-patterns.cc b/gcc/tree-vect-patterns.cc
index 7debe7f0731673cd1bf25cd39d55e23990a73d0e..c6cedf4fe7c1f1e1126ce166a059a4b2a2b49cbd 100644
--- a/gcc/tree-vect-patterns.cc
+++ b/gcc/tree-vect-patterns.cc
@@ -132,6 +132,7 @@ vect_init_pattern_stmt (vec_info *vinfo, gimple *pattern_stmt,
   if (!STMT_VINFO_VECTYPE (pattern_stmt_info))
     {
       gcc_assert (!vectype
+		  || is_a <gcond *> (pattern_stmt)
 		  || (VECTOR_BOOLEAN_TYPE_P (vectype)
 		      == vect_use_mask_type_p (orig_stmt_info)));
       STMT_VINFO_VECTYPE (pattern_stmt_info) = vectype;
@@ -5210,19 +5211,27 @@ vect_recog_mixed_size_cond_pattern (vec_info *vinfo,
    true if bool VAR can and should be optimized that way.  Assume it shouldn't
    in case it's a result of a comparison which can be directly vectorized into
    a vector comparison.  Fills in STMTS with all stmts visited during the
-   walk.  */
+   walk.  If COND then a gcond is inspected instead of a normal COND.  */
 
 static bool
-check_bool_pattern (tree var, vec_info *vinfo, hash_set<gimple *> &stmts)
+check_bool_pattern (tree var, vec_info *vinfo, hash_set<gimple *> &stmts,
+		    gcond *cond)
 {
   tree rhs1;
   enum tree_code rhs_code;
+  gassign *def_stmt = NULL;
 
   stmt_vec_info def_stmt_info = vect_get_internal_def (vinfo, var);
-  if (!def_stmt_info)
+  if (!def_stmt_info && !cond)
     return false;
+  else if (!def_stmt_info)
+    /* If we're a gcond we won't be codegen-ing the statements and are only
+       interested in whether the types match.  In that case we can accept
+       loop invariant values.  */
+    def_stmt = dyn_cast <gassign *> (SSA_NAME_DEF_STMT (var));
+  else
+    def_stmt = dyn_cast <gassign *> (def_stmt_info->stmt);
 
-  gassign *def_stmt = dyn_cast <gassign *> (def_stmt_info->stmt);
   if (!def_stmt)
     return false;
 
@@ -5234,27 +5243,28 @@ check_bool_pattern (tree var, vec_info *vinfo, hash_set<gimple *> &stmts)
   switch (rhs_code)
     {
     case SSA_NAME:
-      if (! check_bool_pattern (rhs1, vinfo, stmts))
+      if (! check_bool_pattern (rhs1, vinfo, stmts, cond))
 	return false;
       break;
 
     CASE_CONVERT:
       if (!VECT_SCALAR_BOOLEAN_TYPE_P (TREE_TYPE (rhs1)))
 	return false;
-      if (! check_bool_pattern (rhs1, vinfo, stmts))
+      if (! check_bool_pattern (rhs1, vinfo, stmts, cond))
 	return false;
       break;
 
     case BIT_NOT_EXPR:
-      if (! check_bool_pattern (rhs1, vinfo, stmts))
+      if (! check_bool_pattern (rhs1, vinfo, stmts, cond))
 	return false;
       break;
 
     case BIT_AND_EXPR:
     case BIT_IOR_EXPR:
     case BIT_XOR_EXPR:
-      if (! check_bool_pattern (rhs1, vinfo, stmts)
-	  || ! check_bool_pattern (gimple_assign_rhs2 (def_stmt), vinfo, stmts))
+      if (! check_bool_pattern (rhs1, vinfo, stmts, cond)
+	  || ! check_bool_pattern (gimple_assign_rhs2 (def_stmt), vinfo, stmts,
+				   cond))
 	return false;
       break;
 
@@ -5275,6 +5285,7 @@ check_bool_pattern (tree var, vec_info *vinfo, hash_set<gimple *> &stmts)
 	  tree mask_type = get_mask_type_for_scalar_type (vinfo,
 							  TREE_TYPE (rhs1));
 	  if (mask_type
+	      && !cond
 	      && expand_vec_cmp_expr_p (comp_vectype, mask_type, rhs_code))
 	    return false;
 
@@ -5324,11 +5335,13 @@ adjust_bool_pattern_cast (vec_info *vinfo,
    VAR is an SSA_NAME that should be transformed from bool to a wider integer
    type, OUT_TYPE is the desired final integer type of the whole pattern.
    STMT_INFO is the info of the pattern root and is where pattern stmts should
-   be associated with.  DEFS is a map of pattern defs.  */
+   be associated with.  DEFS is a map of pattern defs.  If TYPE_ONLY then don't
+   create new pattern statements and instead only fill LAST_STMT and DEFS.  */
 
 static void
 adjust_bool_pattern (vec_info *vinfo, tree var, tree out_type,
-		     stmt_vec_info stmt_info, hash_map <tree, tree> &defs)
+		     stmt_vec_info stmt_info, hash_map <tree, tree> &defs,
+		     gimple *&last_stmt, bool type_only)
 {
   gimple *stmt = SSA_NAME_DEF_STMT (var);
   enum tree_code rhs_code, def_rhs_code;
@@ -5492,8 +5505,10 @@ adjust_bool_pattern (vec_info *vinfo, tree var, tree out_type,
     }
 
   gimple_set_location (pattern_stmt, loc);
-  append_pattern_def_seq (vinfo, stmt_info, pattern_stmt,
-			  get_vectype_for_scalar_type (vinfo, itype));
+  if (!type_only)
+    append_pattern_def_seq (vinfo, stmt_info, pattern_stmt,
+			    get_vectype_for_scalar_type (vinfo, itype));
+  last_stmt = pattern_stmt;
   defs.put (var, gimple_assign_lhs (pattern_stmt));
 }
 
@@ -5509,11 +5524,14 @@ sort_after_uid (const void *p1, const void *p2)
 
 /* Create pattern stmts for all stmts participating in the bool pattern
    specified by BOOL_STMT_SET and its root STMT_INFO with the desired type
-   OUT_TYPE.  Return the def of the pattern root.  */
+   OUT_TYPE.  Return the def of the pattern root.  If TYPE_ONLY the new
+   statements are not emitted as pattern statements and the tree returned is
+   only useful for type queries.  */
 
 static tree
 adjust_bool_stmts (vec_info *vinfo, hash_set <gimple *> &bool_stmt_set,
-		   tree out_type, stmt_vec_info stmt_info)
+		   tree out_type, stmt_vec_info stmt_info,
+		   bool type_only = false)
 {
   /* Gather original stmts in the bool pattern in their order of appearance
      in the IL.  */
@@ -5523,16 +5541,16 @@ adjust_bool_stmts (vec_info *vinfo, hash_set <gimple *> &bool_stmt_set,
     bool_stmts.quick_push (*i);
   bool_stmts.qsort (sort_after_uid);
 
+  gimple *last_stmt = NULL;
+
   /* Now process them in that order, producing pattern stmts.  */
   hash_map <tree, tree> defs;
   for (unsigned i = 0; i < bool_stmts.length (); ++i)
     adjust_bool_pattern (vinfo, gimple_assign_lhs (bool_stmts[i]),
-			 out_type, stmt_info, defs);
+			 out_type, stmt_info, defs, last_stmt, type_only);
 
   /* Pop the last pattern seq stmt and install it as pattern root for STMT.  */
-  gimple *pattern_stmt
-    = gimple_seq_last_stmt (STMT_VINFO_PATTERN_DEF_SEQ (stmt_info));
-  return gimple_assign_lhs (pattern_stmt);
+  return gimple_assign_lhs (last_stmt);
 }
 
 /* Return the proper type for converting bool VAR into
@@ -5608,13 +5626,22 @@ vect_recog_bool_pattern (vec_info *vinfo,
   enum tree_code rhs_code;
   tree var, lhs, rhs, vectype;
   gimple *pattern_stmt;
-
-  if (!is_gimple_assign (last_stmt))
+  gcond *cond = NULL;
+  if (!is_gimple_assign (last_stmt)
+      && !(cond = dyn_cast <gcond *> (last_stmt)))
     return NULL;
 
-  var = gimple_assign_rhs1 (last_stmt);
-  lhs = gimple_assign_lhs (last_stmt);
-  rhs_code = gimple_assign_rhs_code (last_stmt);
+  if (is_gimple_assign (last_stmt))
+    {
+      var = gimple_assign_rhs1 (last_stmt);
+      lhs = gimple_assign_lhs (last_stmt);
+      rhs_code = gimple_assign_rhs_code (last_stmt);
+    }
+  else
+    {
+      lhs = var = gimple_cond_lhs (last_stmt);
+      rhs_code = gimple_cond_code (last_stmt);
+    }
 
   if (rhs_code == VIEW_CONVERT_EXPR)
     var = TREE_OPERAND (var, 0);
@@ -5632,7 +5659,7 @@ vect_recog_bool_pattern (vec_info *vinfo,
 	return NULL;
       vectype = get_vectype_for_scalar_type (vinfo, TREE_TYPE (lhs));
 
-      if (check_bool_pattern (var, vinfo, bool_stmts))
+      if (check_bool_pattern (var, vinfo, bool_stmts, cond))
 	{
 	  rhs = adjust_bool_stmts (vinfo, bool_stmts,
 				   TREE_TYPE (lhs), stmt_vinfo);
@@ -5680,7 +5707,7 @@ vect_recog_bool_pattern (vec_info *vinfo,
 
       return pattern_stmt;
     }
-  else if (rhs_code == COND_EXPR
+  else if ((rhs_code == COND_EXPR || cond)
 	   && TREE_CODE (var) == SSA_NAME)
     {
       vectype = get_vectype_for_scalar_type (vinfo, TREE_TYPE (lhs));
@@ -5700,18 +5727,31 @@ vect_recog_bool_pattern (vec_info *vinfo,
       if (get_vectype_for_scalar_type (vinfo, type) == NULL_TREE)
 	return NULL;
 
-      if (check_bool_pattern (var, vinfo, bool_stmts))
+      if (check_bool_pattern (var, vinfo, bool_stmts, cond))
 	var = adjust_bool_stmts (vinfo, bool_stmts, type, stmt_vinfo);
       else if (integer_type_for_mask (var, vinfo))
 	return NULL;
 
-      lhs = vect_recog_temp_ssa_var (TREE_TYPE (lhs), NULL);
-      pattern_stmt 
-	= gimple_build_assign (lhs, COND_EXPR,
-			       build2 (NE_EXPR, boolean_type_node,
-				       var, build_int_cst (TREE_TYPE (var), 0)),
-			       gimple_assign_rhs2 (last_stmt),
-			       gimple_assign_rhs3 (last_stmt));
+      if (!cond)
+	{
+	  lhs = vect_recog_temp_ssa_var (TREE_TYPE (lhs), NULL);
+	  pattern_stmt
+	    = gimple_build_assign (lhs, COND_EXPR,
+				   build2 (NE_EXPR, boolean_type_node, var,
+					   build_int_cst (TREE_TYPE (var), 0)),
+				   gimple_assign_rhs2 (last_stmt),
+				   gimple_assign_rhs3 (last_stmt));
+	}
+      else
+	{
+	  pattern_stmt
+	    = gimple_build_cond (gimple_cond_code (cond), gimple_cond_lhs (cond),
+				 gimple_cond_rhs (cond),
+				 gimple_cond_true_label (cond),
+				 gimple_cond_false_label (cond));
+	  vectype = get_vectype_for_scalar_type (vinfo, TREE_TYPE (var));
+	  vectype = truth_type_for (vectype);
+	}
       *type_out = vectype;
       vect_pattern_detected ("vect_recog_bool_pattern", last_stmt);
 
@@ -5725,7 +5765,7 @@ vect_recog_bool_pattern (vec_info *vinfo,
       if (!vectype || !VECTOR_MODE_P (TYPE_MODE (vectype)))
 	return NULL;
 
-      if (check_bool_pattern (var, vinfo, bool_stmts))
+      if (check_bool_pattern (var, vinfo, bool_stmts, cond))
 	rhs = adjust_bool_stmts (vinfo, bool_stmts,
 				 TREE_TYPE (vectype), stmt_vinfo);
       else
diff --git a/gcc/tree-vect-stmts.cc b/gcc/tree-vect-stmts.cc
index 582c5e678fad802d6e76300fe3c939b9f2978f17..d801b72a149ebe6aa4d1f2942324b042d07be530 100644
--- a/gcc/tree-vect-stmts.cc
+++ b/gcc/tree-vect-stmts.cc
@@ -12489,7 +12489,7 @@ vectorizable_comparison_1 (vec_info *vinfo, tree vectype,
   vec<tree> vec_oprnds0 = vNULL;
   vec<tree> vec_oprnds1 = vNULL;
   tree mask_type;
-  tree mask;
+  tree mask = NULL_TREE;
 
   if (!STMT_VINFO_RELEVANT_P (stmt_info) && !bb_vinfo)
     return false;
@@ -12629,8 +12629,9 @@ vectorizable_comparison_1 (vec_info *vinfo, tree vectype,
   /* Transform.  */
 
   /* Handle def.  */
-  lhs = gimple_assign_lhs (STMT_VINFO_STMT (stmt_info));
-  mask = vect_create_destination_var (lhs, mask_type);
+  lhs = gimple_get_lhs (STMT_VINFO_STMT (stmt_info));
+  if (lhs)
+    mask = vect_create_destination_var (lhs, mask_type);
 
   vect_get_vec_defs (vinfo, stmt_info, slp_node, ncopies,
 		     rhs1, &vec_oprnds0, vectype,
@@ -12644,7 +12645,10 @@ vectorizable_comparison_1 (vec_info *vinfo, tree vectype,
       gimple *new_stmt;
       vec_rhs2 = vec_oprnds1[i];
 
-      new_temp = make_ssa_name (mask);
+      if (lhs)
+	new_temp = make_ssa_name (mask);
+      else
+	new_temp = make_temp_ssa_name (mask_type, NULL, "cmp");
       if (bitop1 == NOP_EXPR)
 	{
 	  new_stmt = gimple_build_assign (new_temp, code,
@@ -12723,6 +12727,176 @@ vectorizable_comparison (vec_info *vinfo,
   return true;
 }
 
+/* Check to see if the current early break given in STMT_INFO is valid for
+   vectorization.  */
+
+static bool
+vectorizable_early_exit (vec_info *vinfo, stmt_vec_info stmt_info,
+			 gimple_stmt_iterator *gsi, gimple **vec_stmt,
+			 slp_tree slp_node, stmt_vector_for_cost *cost_vec)
+{
+  loop_vec_info loop_vinfo = dyn_cast <loop_vec_info> (vinfo);
+  if (!loop_vinfo
+      || !is_a <gcond *> (STMT_VINFO_STMT (stmt_info)))
+    return false;
+
+  if (STMT_VINFO_DEF_TYPE (stmt_info) != vect_condition_def)
+    return false;
+
+  if (!STMT_VINFO_RELEVANT_P (stmt_info))
+    return false;
+
+  auto code = gimple_cond_code (STMT_VINFO_STMT (stmt_info));
+  tree vectype = STMT_VINFO_VECTYPE (stmt_info);
+  gcc_assert (vectype);
+
+  tree vectype_op0 = NULL_TREE;
+  slp_tree slp_op0;
+  tree op0;
+  enum vect_def_type dt0;
+  if (!vect_is_simple_use (vinfo, stmt_info, slp_node, 0, &op0, &slp_op0, &dt0,
+			   &vectype_op0))
+    {
+      if (dump_enabled_p ())
+	dump_printf_loc (MSG_MISSED_OPTIMIZATION, vect_location,
+			 "use not simple.\n");
+      return false;
+    }
+
+  machine_mode mode = TYPE_MODE (vectype);
+  int ncopies;
+
+  if (slp_node)
+    ncopies = 1;
+  else
+    ncopies = vect_get_num_copies (loop_vinfo, vectype);
+
+  vec_loop_masks *masks = &LOOP_VINFO_MASKS (loop_vinfo);
+  bool masked_loop_p = LOOP_VINFO_FULLY_MASKED_P (loop_vinfo);
+
+  /* Analyze only.  */
+  if (!vec_stmt)
+    {
+      if (direct_optab_handler (cbranch_optab, mode) == CODE_FOR_nothing)
+	{
+	  if (dump_enabled_p ())
+	      dump_printf_loc (MSG_MISSED_OPTIMIZATION, vect_location,
+			       "can't vectorize early exit because the "
+			       "target doesn't support flag setting vector "
+			       "comparisons.\n");
+	  return false;
+	}
+
+      if (ncopies > 1
+	  && direct_optab_handler (ior_optab, mode) == CODE_FOR_nothing)
+	{
+	  if (dump_enabled_p ())
+	      dump_printf_loc (MSG_MISSED_OPTIMIZATION, vect_location,
+			       "can't vectorize early exit because the "
+			       "target does not support boolean vector OR for "
+			       "type %T.\n", vectype);
+	  return false;
+	}
+
+      if (!vectorizable_comparison_1 (vinfo, vectype, stmt_info, code, gsi,
+				      vec_stmt, slp_node, cost_vec))
+	return false;
+
+      if (LOOP_VINFO_CAN_USE_PARTIAL_VECTORS_P (loop_vinfo))
+	{
+	  if (direct_internal_fn_supported_p (IFN_VCOND_MASK_LEN, vectype,
+					      OPTIMIZE_FOR_SPEED))
+	    return false;
+	  else
+	    vect_record_loop_mask (loop_vinfo, masks, ncopies, vectype, NULL);
+	}
+
+      return true;
+    }
+
+  /* Transform.  */
+
+  tree new_temp = NULL_TREE;
+  gimple *new_stmt = NULL;
+
+  if (dump_enabled_p ())
+    dump_printf_loc (MSG_NOTE, vect_location, "transform early-exit.\n");
+
+  if (!vectorizable_comparison_1 (vinfo, vectype, stmt_info, code, gsi,
+				  vec_stmt, slp_node, cost_vec))
+    gcc_unreachable ();
+
+  gimple *stmt = STMT_VINFO_STMT (stmt_info);
+  basic_block cond_bb = gimple_bb (stmt);
+  gimple_stmt_iterator  cond_gsi = gsi_last_bb (cond_bb);
+
+  auto_vec<tree> stmts;
+
+  if (slp_node)
+    stmts.safe_splice (SLP_TREE_VEC_DEFS (slp_node));
+  else
+    {
+      auto vec_stmts = STMT_VINFO_VEC_STMTS (stmt_info);
+      stmts.reserve_exact (vec_stmts.length ());
+      for (auto stmt : vec_stmts)
+	stmts.quick_push (gimple_assign_lhs (stmt));
+    }
+
+  /* Determine if we need to reduce the final value.  */
+  if (stmts.length () > 1)
+    {
+      /* We build the reductions in a way to maintain as much parallelism as
+	 possible.  */
+      auto_vec<tree> workset (stmts.length ());
+      workset.splice (stmts);
+      while (workset.length () > 1)
+	{
+	  new_temp = make_temp_ssa_name (vectype, NULL, "vexit_reduc");
+	  tree arg0 = workset.pop ();
+	  tree arg1 = workset.pop ();
+	  new_stmt = gimple_build_assign (new_temp, BIT_IOR_EXPR, arg0, arg1);
+	  vect_finish_stmt_generation (loop_vinfo, stmt_info, new_stmt,
+				       &cond_gsi);
+	  workset.quick_insert (0, new_temp);
+	}
+    }
+  else
+    new_temp = stmts[0];
+
+  gcc_assert (new_temp);
+
+  tree cond = new_temp;
+  /* If we have multiple statements after reduction we should check all the
+     lanes and treat it as a full vector.  */
+  if (masked_loop_p)
+    {
+      tree mask = vect_get_loop_mask (loop_vinfo, gsi, masks, ncopies,
+				      vectype, 0);
+      cond = prepare_vec_mask (loop_vinfo, TREE_TYPE (mask), mask, cond,
+			       &cond_gsi);
+    }
+
+  /* Now build the new conditional.  Pattern gimple_conds get dropped during
+     codegen so we must replace the original insn.  */
+  stmt = STMT_VINFO_STMT (vect_orig_stmt (stmt_info));
+  gcond *cond_stmt = as_a <gcond *> (stmt);
+  gimple_cond_set_condition (cond_stmt, NE_EXPR, cond,
+			     build_zero_cst (vectype));
+  update_stmt (stmt);
+
+  if (slp_node)
+    SLP_TREE_VEC_DEFS (slp_node).truncate (0);
+  else
+    STMT_VINFO_VEC_STMTS (stmt_info).truncate (0);
+
+  if (!slp_node)
+    *vec_stmt = stmt;
+
+  return true;
+}
+
 /* If SLP_NODE is nonnull, return true if vectorizable_live_operation
    can handle all live statements in the node.  Otherwise return true
    if STMT_INFO is not live or if vectorizable_live_operation can handle it.
@@ -12949,7 +13123,9 @@ vect_analyze_stmt (vec_info *vinfo,
 	  || vectorizable_lc_phi (as_a <loop_vec_info> (vinfo),
 				  stmt_info, NULL, node)
 	  || vectorizable_recurr (as_a <loop_vec_info> (vinfo),
-				   stmt_info, NULL, node, cost_vec));
+				   stmt_info, NULL, node, cost_vec)
+	  || vectorizable_early_exit (vinfo, stmt_info, NULL, NULL, node,
+				      cost_vec));
   else
     {
       if (bb_vinfo)
@@ -12972,7 +13148,10 @@ vect_analyze_stmt (vec_info *vinfo,
 					 NULL, NULL, node, cost_vec)
 	      || vectorizable_comparison (vinfo, stmt_info, NULL, NULL, node,
 					  cost_vec)
-	      || vectorizable_phi (vinfo, stmt_info, NULL, node, cost_vec));
+	      || vectorizable_phi (vinfo, stmt_info, NULL, node, cost_vec)
+	      || vectorizable_early_exit (vinfo, stmt_info, NULL, NULL, node,
+					  cost_vec));
+
     }
 
   if (node)
@@ -13131,6 +13310,12 @@ vect_transform_stmt (vec_info *vinfo,
       gcc_assert (done);
       break;
 
+    case loop_exit_ctrl_vec_info_type:
+      done = vectorizable_early_exit (vinfo, stmt_info, gsi, &vec_stmt,
+				      slp_node, NULL);
+      gcc_assert (done);
+      break;
+
     default:
       if (!STMT_VINFO_LIVE_P (stmt_info))
 	{
@@ -14321,10 +14506,19 @@ vect_get_vector_types_for_stmt (vec_info *vinfo, stmt_vec_info stmt_info,
     }
   else
     {
+      gcond *cond = NULL;
       if (data_reference *dr = STMT_VINFO_DATA_REF (stmt_info))
 	scalar_type = TREE_TYPE (DR_REF (dr));
       else if (gimple_call_internal_p (stmt, IFN_MASK_STORE))
 	scalar_type = TREE_TYPE (gimple_call_arg (stmt, 3));
+      else if ((cond = dyn_cast <gcond *> (stmt)))
+	{
+	  /* We can't convert the scalar type to boolean yet, since booleans
+	     have a single bit precision and we need the vector boolean to be
+	     a representation of the integer mask.  So set the correct integer
+	     type and convert to boolean vector once we have a vectype.  */
+	  scalar_type = TREE_TYPE (gimple_cond_lhs (cond));
+	}
       else
 	scalar_type = TREE_TYPE (gimple_get_lhs (stmt));
 
@@ -14339,12 +14533,18 @@ vect_get_vector_types_for_stmt (vec_info *vinfo, stmt_vec_info stmt_info,
 			     "get vectype for scalar type: %T\n", scalar_type);
 	}
       vectype = get_vectype_for_scalar_type (vinfo, scalar_type, group_size);
       if (!vectype)
 	return opt_result::failure_at (stmt,
 				       "not vectorized:"
 				       " unsupported data-type %T\n",
 				       scalar_type);
 
+      /* If we were a gcond, convert the resulting type to a vector boolean
+	 type now that we have the correct integer mask type.  */
+      if (cond)
+	vectype = truth_type_for (vectype);
+
       if (dump_enabled_p ())
 	dump_printf_loc (MSG_NOTE, vect_location, "vectype: %T\n", vectype);
     }
