RE: [PATCH 1/4]middle-end Vect: Add support for dot-product where the sign for the multiplicant changes.

public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed

From: Tamar Christina <Tamar.Christina@arm.com>
To: Richard Sandiford <Richard.Sandiford@arm.com>
Cc: Richard Biener <rguenther@suse.de>, nd <nd@arm.com>,
	"gcc-patches@gcc.gnu.org" <gcc-patches@gcc.gnu.org>
Subject: RE: [PATCH 1/4]middle-end Vect: Add support for dot-product where the sign for the multiplicant changes.
Date: Mon, 12 Jul 2021 12:29:23 +0000	[thread overview]
Message-ID: <VI1PR08MB532570940234239E9F607285FF159@VI1PR08MB5325.eurprd08.prod.outlook.com> (raw)
In-Reply-To: <mpt5yxfj0bp.fsf@arm.com>

[-- Attachment #1: Type: text/plain, Size: 5496 bytes --]



> -----Original Message-----
> From: Richard Sandiford <richard.sandiford@arm.com>
> Sent: Monday, July 12, 2021 11:26 AM
> To: Tamar Christina <Tamar.Christina@arm.com>
> Cc: Richard Biener <rguenther@suse.de>; nd <nd@arm.com>; gcc-
> patches@gcc.gnu.org
> Subject: Re: [PATCH 1/4]middle-end Vect: Add support for dot-product
> where the sign for the multiplicant changes.
> 
> Tamar Christina <Tamar.Christina@arm.com> writes:
> >> -----Original Message-----
> >> From: Richard Sandiford <richard.sandiford@arm.com>
> >> Sent: Monday, July 12, 2021 10:39 AM
> >> To: Tamar Christina <Tamar.Christina@arm.com>
> >> Cc: Richard Biener <rguenther@suse.de>; nd <nd@arm.com>; gcc-
> >> patches@gcc.gnu.org
> >> Subject: Re: [PATCH 1/4]middle-end Vect: Add support for dot-product
> >> where the sign for the multiplicant changes.
> >>
> >> Tamar Christina <Tamar.Christina@arm.com> writes:
> >> > Hi,
> >> >
> >> >> Richard Sandiford <richard.sandiford@arm.com> writes:
> >> >> >> @@ -992,21 +1029,27 @@ vect_recog_dot_prod_pattern (vec_info
> >> >> *vinfo,
> >> >> >>    /* FORNOW.  Can continue analyzing the def-use chain when
> >> >> >> this stmt in
> >> >> a phi
> >> >> >>       inside the loop (in case we are analyzing an outer-loop).  */
> >> >> >>    vect_unpromoted_value unprom0[2];
> >> >> >> +  enum optab_subtype subtype = optab_vector;
> >> >> >>    if (!vect_widened_op_tree (vinfo, mult_vinfo, MULT_EXPR,
> >> >> WIDEN_MULT_EXPR,
> >> >> >> -			     false, 2, unprom0, &half_type))
> >> >> >> +			     false, 2, unprom0, &half_type, &subtype))
> >> >> >> +    return NULL;
> >> >> >> +
> >> >> >> +  if (subtype == optab_vector_mixed_sign
> >> >> >> +      && TYPE_UNSIGNED (unprom_mult.type)
> >> >> >> +      && TYPE_PRECISION (half_type) * 4 > TYPE_PRECISION
> >> >> >> + (unprom_mult.type))
> >> >> >>      return NULL;
> >> >> >
> >> >> > Isn't the final condition here instead that TYPE1 is narrower than
> TYPE2?
> >> >> > I.e. we need to reject the case in which we multiply a signed
> >> >> > and an unsigned value to get a (logically) signed result, but
> >> >> > then zero-extend it (rather than sign-extend it) to the
> >> >> > precision of the
> >> addition.
> >> >> >
> >> >> > That would make the test:
> >> >> >
> >> >> >   if (subtype == optab_vector_mixed_sign
> >> >> >       && TYPE_UNSIGNED (unprom_mult.type)
> >> >> >       && TYPE_PRECISION (unprom_mult.type) < TYPE_PRECISION
> (type))
> >> >> >     return NULL;
> >> >> >
> >> >> > instead.
> >> >>
> >> >> And folding that into the existing test gives:
> >> >>
> >> >>   /* If there are two widening operations, make sure they agree on
> >> >> the
> >> sign
> >> >>      of the extension.  The result of an optab_vector_mixed_sign
> operation
> >> >>      is signed; otherwise, the result has the same sign as the operands.
> */
> >> >>   if (TYPE_PRECISION (unprom_mult.type) != TYPE_PRECISION (type)
> >> >>       && (subtype == optab_vector_mixed_sign
> >> >> 	  ? TYPE_UNSIGNED (unprom_mult.type)
> >> >> 	  : TYPE_SIGN (unprom_mult.type) != TYPE_SIGN (half_type)))
> >> >>     return NULL;
> >> >>
> >> >
> >> > I went with the first one which doesn't add the extra constraints
> >> > for the normal dotproduct as that makes it too restrictive. It's
> >> > the type of the multiplication that determines the operation so
> >> > dotproduct can be used a bit more than where we currently do.
> >> >
> >> > This was relaxed in an earlier patch.
> >>
> >> I didn't mean that we should add extra constraints to the normal case
> though.
> >> The existing test I was referring to above was:
> >>
> >>   /* If there are two widening operations, make sure they agree on
> >>      the sign of the extension.  */
> >>   if (TYPE_PRECISION (unprom_mult.type) != TYPE_PRECISION (type)
> >>       && TYPE_SIGN (unprom_mult.type) != TYPE_SIGN (half_type))
> >>     return NULL;
> >
> > But as I mentioned, this restriction is unneeded and has been removed
> hence why it's not in my patchset's diff.
> > It's removed by
> > https://gcc.gnu.org/pipermail/gcc-patches/2021-May/569851.html which
> Richi conditioned on the rest of these patches being approved.
> >
> > This change needlessly blocks test vect-reduc-dot-[2,3,6,7].c from
> > being dotproducts for instance
> >
> > It's also part of the deficiency between GCC codegen and Clang
> > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88492#c6
> 
> Hmm, OK.  Just removing the check regresses:
> 
> unsigned long __attribute__ ((noipa))
> f (signed short *x, signed short *y)
> {
>   unsigned long res = 0;
>   for (int i = 0; i < 100; ++i)
>     res += (unsigned int) x[i] * (unsigned int) y[i];
>   return res;
> }
> 
> int
> main (void)
> {
>   signed short x[100], y[100];
>   for (int i = 0; i < 100; ++i)
>     {
>       x[i] = -1;
>       y[i] = 1;
>     }
>   if (f (x, y) != 0x6400000000ULL - 100)
>     __builtin_abort ();
>   return 0;
> }
> 
> on SVE.  We then use SDOT even though the result of the multiplication is
> zero- rather than sign-extended to 64 bits.  Does something else in the series
> stop that from that happening?

No, and I hadn't noticed it before because it looks like the mid-end tests that are execution test don't turn on dot-product for arm targets :/ 

I'll look at it separately, for now I've then added the check back in.

Ok for trunk now?

Thanks,
Tamar

> 
> Richard

[-- Attachment #2: rb14433.patch --]
[-- Type: application/octet-stream, Size: 16746 bytes --]

diff --git a/gcc/doc/md.texi b/gcc/doc/md.texi
index 1b91814433057b1b377283fd1f40cb970dc3d243..323ba8eab78e2b2e582fa0633752930182e83ee5 100644
--- a/gcc/doc/md.texi
+++ b/gcc/doc/md.texi
@@ -5446,13 +5446,55 @@ Like @samp{fold_left_plus_@var{m}}, but takes an additional mask operand
 
 @cindex @code{sdot_prod@var{m}} instruction pattern
 @item @samp{sdot_prod@var{m}}
+
+Compute the sum of the products of two signed elements.
+Operand 1 and operand 2 are of the same mode. Their
+product, which is of a wider mode, is computed and added to operand 3.
+Operand 3 is of a mode equal or wider than the mode of the product. The
+result is placed in operand 0, which is of the same mode as operand 3.
+
+Semantically the expressions perform the multiplication in the following signs
+
+@smallexample
+sdot<signed c, signed a, signed b> ==
+   res = sign-ext (a) * sign-ext (b) + c
+@dots{}
+@end smallexample
+
 @cindex @code{udot_prod@var{m}} instruction pattern
-@itemx @samp{udot_prod@var{m}}
-Compute the sum of the products of two signed/unsigned elements.
-Operand 1 and operand 2 are of the same mode. Their product, which is of a
-wider mode, is computed and added to operand 3. Operand 3 is of a mode equal or
-wider than the mode of the product. The result is placed in operand 0, which
-is of the same mode as operand 3.
+@item @samp{udot_prod@var{m}}
+
+Compute the sum of the products of two unsigned elements.
+Operand 1 and operand 2 are of the same mode. Their
+product, which is of a wider mode, is computed and added to operand 3.
+Operand 3 is of a mode equal or wider than the mode of the product. The
+result is placed in operand 0, which is of the same mode as operand 3.
+
+Semantically the expressions perform the multiplication in the following signs
+
+@smallexample
+udot<unsigned c, unsigned a, unsigned b> ==
+   res = zero-ext (a) * zero-ext (b) + c
+@dots{}
+@end smallexample
+
+
+
+@cindex @code{usdot_prod@var{m}} instruction pattern
+@item @samp{usdot_prod@var{m}}
+Compute the sum of the products of elements of different signs.
+Operand 1 must be unsigned and operand 2 signed. Their
+product, which is of a wider mode, is computed and added to operand 3.
+Operand 3 is of a mode equal or wider than the mode of the product. The
+result is placed in operand 0, which is of the same mode as operand 3.
+
+Semantically the expressions perform the multiplication in the following signs
+
+@smallexample
+usdot<unsigned c, unsigned a, signed b> ==
+   res = ((unsigned-conv) sign-ext (a)) * zero-ext (b) + c
+@dots{}
+@end smallexample
 
 @cindex @code{ssad@var{m}} instruction pattern
 @item @samp{ssad@var{m}}
diff --git a/gcc/optabs-tree.h b/gcc/optabs-tree.h
index c3aaa1a416991e856d3e24da45968a92ebada82c..fbd2b06b8dbfd560dfb66b314830e6b564b37abb 100644
--- a/gcc/optabs-tree.h
+++ b/gcc/optabs-tree.h
@@ -29,7 +29,8 @@ enum optab_subtype
 {
   optab_default,
   optab_scalar,
-  optab_vector
+  optab_vector,
+  optab_vector_mixed_sign
 };
 
 /* Return the optab used for computing the given operation on the type given by
diff --git a/gcc/optabs-tree.c b/gcc/optabs-tree.c
index 95ffe397c23e80c105afea52e9d47216bf52f55a..eeb5aeed3202cc6971b6447994bc5311e9c010bb 100644
--- a/gcc/optabs-tree.c
+++ b/gcc/optabs-tree.c
@@ -127,7 +127,12 @@ optab_for_tree_code (enum tree_code code, const_tree type,
       return TYPE_UNSIGNED (type) ? usum_widen_optab : ssum_widen_optab;
 
     case DOT_PROD_EXPR:
-      return TYPE_UNSIGNED (type) ? udot_prod_optab : sdot_prod_optab;
+      {
+	if (subtype == optab_vector_mixed_sign)
+	  return usdot_prod_optab;
+
+	return (TYPE_UNSIGNED (type) ? udot_prod_optab : sdot_prod_optab);
+      }
 
     case SAD_EXPR:
       return TYPE_UNSIGNED (type) ? usad_optab : ssad_optab;
diff --git a/gcc/optabs.c b/gcc/optabs.c
index 62a6bdb4c59bf8263c499245795576199606d372..14d8ad2f33fd75388435fe912380e177f8f3c54b 100644
--- a/gcc/optabs.c
+++ b/gcc/optabs.c
@@ -262,6 +262,11 @@ expand_widen_pattern_expr (sepops ops, rtx op0, rtx op1, rtx wide_op,
   bool sbool = false;
 
   oprnd0 = ops->op0;
+  if (nops >= 2)
+    oprnd1 = ops->op1;
+  if (nops >= 3)
+    oprnd2 = ops->op2;
+
   tmode0 = TYPE_MODE (TREE_TYPE (oprnd0));
   if (ops->code == VEC_UNPACK_FIX_TRUNC_HI_EXPR
       || ops->code == VEC_UNPACK_FIX_TRUNC_LO_EXPR)
@@ -285,6 +290,27 @@ expand_widen_pattern_expr (sepops ops, rtx op0, rtx op1, rtx wide_op,
 	   ? vec_unpacks_sbool_hi_optab : vec_unpacks_sbool_lo_optab);
       sbool = true;
     }
+  else if (ops->code == DOT_PROD_EXPR)
+    {
+      enum optab_subtype subtype = optab_default;
+      signop sign1 = TYPE_SIGN (TREE_TYPE (oprnd0));
+      signop sign2 = TYPE_SIGN (TREE_TYPE (oprnd1));
+      if (sign1 == sign2)
+	;
+      else if (sign1 == SIGNED && sign2 == UNSIGNED)
+	{
+	  subtype = optab_vector_mixed_sign;
+	  /* Same as optab_vector_mixed_sign but flip the operands.  */
+	  std::swap (op0, op1);
+	}
+      else if (sign1 == UNSIGNED && sign2 == SIGNED)
+	subtype = optab_vector_mixed_sign;
+      else
+	gcc_unreachable ();
+
+      widen_pattern_optab
+	= optab_for_tree_code (ops->code, TREE_TYPE (oprnd0), subtype);
+    }
   else
     widen_pattern_optab
       = optab_for_tree_code (ops->code, TREE_TYPE (oprnd0), optab_default);
@@ -298,10 +324,7 @@ expand_widen_pattern_expr (sepops ops, rtx op0, rtx op1, rtx wide_op,
   gcc_assert (icode != CODE_FOR_nothing);
 
   if (nops >= 2)
-    {
-      oprnd1 = ops->op1;
-      tmode1 = TYPE_MODE (TREE_TYPE (oprnd1));
-    }
+    tmode1 = TYPE_MODE (TREE_TYPE (oprnd1));
   else if (sbool)
     {
       nops = 2;
@@ -316,7 +339,6 @@ expand_widen_pattern_expr (sepops ops, rtx op0, rtx op1, rtx wide_op,
     {
       gcc_assert (tmode1 == tmode0);
       gcc_assert (op1);
-      oprnd2 = ops->op2;
       wmode = TYPE_MODE (TREE_TYPE (oprnd2));
     }
 
diff --git a/gcc/optabs.def b/gcc/optabs.def
index 41ab2598eb6c32c003cbed490796abf25d2ee315..574d355b6b3092cf893f5ab0e8ae0f6d9ffcefbd 100644
--- a/gcc/optabs.def
+++ b/gcc/optabs.def
@@ -352,6 +352,7 @@ OPTAB_D (uavg_ceil_optab, "uavg$a3_ceil")
 OPTAB_D (sdot_prod_optab, "sdot_prod$I$a")
 OPTAB_D (ssum_widen_optab, "widen_ssum$I$a3")
 OPTAB_D (udot_prod_optab, "udot_prod$I$a")
+OPTAB_D (usdot_prod_optab, "usdot_prod$I$a")
 OPTAB_D (usum_widen_optab, "widen_usum$I$a3")
 OPTAB_D (usad_optab, "usad$I$a")
 OPTAB_D (ssad_optab, "ssad$I$a")
diff --git a/gcc/tree-cfg.c b/gcc/tree-cfg.c
index c73e1cbdda6b9380190b03de66caee48c4e173e3..3750d2881cbb7fd1e71c0eb8c0d4929925fd4152 100644
--- a/gcc/tree-cfg.c
+++ b/gcc/tree-cfg.c
@@ -4434,7 +4434,8 @@ verify_gimple_assign_ternary (gassign *stmt)
 		  && !SCALAR_FLOAT_TYPE_P (rhs1_type))
 		 || (!INTEGRAL_TYPE_P (lhs_type)
 		     && !SCALAR_FLOAT_TYPE_P (lhs_type))))
-	    || !types_compatible_p (rhs1_type, rhs2_type)
+	    /* rhs1_type and rhs2_type may differ in sign.  */
+	    || !tree_nop_conversion_p (rhs1_type, rhs2_type)
 	    || !useless_type_conversion_p (lhs_type, rhs3_type)
 	    || maybe_lt (GET_MODE_SIZE (element_mode (rhs3_type)),
 			 2 * GET_MODE_SIZE (element_mode (rhs1_type))))
diff --git a/gcc/tree-vect-loop.c b/gcc/tree-vect-loop.c
index 51a46a6d852fb342278bb9513d013702cff4b868..4e63e84cc70ca60c706c19367ccf256ea3f851b5 100644
--- a/gcc/tree-vect-loop.c
+++ b/gcc/tree-vect-loop.c
@@ -6663,6 +6663,12 @@ vectorizable_reduction (loop_vec_info loop_vinfo,
   bool lane_reduc_code_p
     = (code == DOT_PROD_EXPR || code == WIDEN_SUM_EXPR || code == SAD_EXPR);
   int op_type = TREE_CODE_LENGTH (code);
+  enum optab_subtype optab_query_kind = optab_vector;
+  if (code == DOT_PROD_EXPR
+      && TYPE_SIGN (TREE_TYPE (gimple_assign_rhs1 (stmt)))
+	   != TYPE_SIGN (TREE_TYPE (gimple_assign_rhs2 (stmt))))
+    optab_query_kind = optab_vector_mixed_sign;
+
 
   scalar_dest = gimple_assign_lhs (stmt);
   scalar_type = TREE_TYPE (scalar_dest);
@@ -7190,7 +7196,7 @@ vectorizable_reduction (loop_vec_info loop_vinfo,
       bool ok = true;
 
       /* 4.1. check support for the operation in the loop  */
-      optab optab = optab_for_tree_code (code, vectype_in, optab_vector);
+      optab optab = optab_for_tree_code (code, vectype_in, optab_query_kind);
       if (!optab)
 	{
 	  if (dump_enabled_p ())
diff --git a/gcc/tree-vect-patterns.c b/gcc/tree-vect-patterns.c
index b2e7fc2cc7adad72697b8d76deb0448d0b03e0a8..71533e61c934c63dd05a33c8f7159185e9b11a1b 100644
--- a/gcc/tree-vect-patterns.c
+++ b/gcc/tree-vect-patterns.c
@@ -191,9 +191,9 @@ vect_get_external_def_edge (vec_info *vinfo, tree var)
 }
 
 /* Return true if the target supports a vector version of CODE,
-   where CODE is known to map to a direct optab.  ITYPE specifies
-   the type of (some of) the scalar inputs and OTYPE specifies the
-   type of the scalar result.
+   where CODE is known to map to a direct optab with the given SUBTYPE.
+   ITYPE specifies the type of (some of) the scalar inputs and OTYPE
+   specifies the type of the scalar result.
 
    If CODE allows the inputs and outputs to have different type
    (such as for WIDEN_SUM_EXPR), it is the input mode rather
@@ -208,7 +208,8 @@ vect_get_external_def_edge (vec_info *vinfo, tree var)
 static bool
 vect_supportable_direct_optab_p (vec_info *vinfo, tree otype, tree_code code,
 				 tree itype, tree *vecotype_out,
-				 tree *vecitype_out = NULL)
+				 tree *vecitype_out = NULL,
+				 enum optab_subtype subtype = optab_default)
 {
   tree vecitype = get_vectype_for_scalar_type (vinfo, itype);
   if (!vecitype)
@@ -218,7 +219,7 @@ vect_supportable_direct_optab_p (vec_info *vinfo, tree otype, tree_code code,
   if (!vecotype)
     return false;
 
-  optab optab = optab_for_tree_code (code, vecitype, optab_default);
+  optab optab = optab_for_tree_code (code, vecitype, subtype);
   if (!optab)
     return false;
 
@@ -521,6 +522,7 @@ vect_joust_widened_type (tree type, tree new_type, tree *common_type)
   unsigned int precision = MAX (TYPE_PRECISION (*common_type),
 				TYPE_PRECISION (new_type));
   precision *= 2;
+
   if (precision * 2 > TYPE_PRECISION (type))
     return false;
 
@@ -539,6 +541,10 @@ vect_joust_widened_type (tree type, tree new_type, tree *common_type)
    to a type that (a) is narrower than the result of STMT_INFO and
    (b) can hold all leaf operand values.
 
+   If SUBTYPE then allow that the signs of the operands
+   may differ in signs but not in precision.  SUBTYPE is updated to reflect
+   this.
+
    Return 0 if STMT_INFO isn't such a tree, or if no such COMMON_TYPE
    exists.  */
 
@@ -546,7 +552,8 @@ static unsigned int
 vect_widened_op_tree (vec_info *vinfo, stmt_vec_info stmt_info, tree_code code,
 		      tree_code widened_code, bool shift_p,
 		      unsigned int max_nops,
-		      vect_unpromoted_value *unprom, tree *common_type)
+		      vect_unpromoted_value *unprom, tree *common_type,
+		      enum optab_subtype *subtype = NULL)
 {
   /* Check for an integer operation with the right code.  */
   gassign *assign = dyn_cast <gassign *> (stmt_info->stmt);
@@ -607,7 +614,8 @@ vect_widened_op_tree (vec_info *vinfo, stmt_vec_info stmt_info, tree_code code,
 		= vinfo->lookup_def (this_unprom->op);
 	      nops = vect_widened_op_tree (vinfo, def_stmt_info, code,
 					   widened_code, shift_p, max_nops,
-					   this_unprom, common_type);
+					   this_unprom, common_type,
+					   subtype);
 	      if (nops == 0)
 		return 0;
 
@@ -625,7 +633,18 @@ vect_widened_op_tree (vec_info *vinfo, stmt_vec_info stmt_info, tree_code code,
 		*common_type = this_unprom->type;
 	      else if (!vect_joust_widened_type (type, this_unprom->type,
 						 common_type))
-		return 0;
+		{
+		  if (subtype)
+		    {
+		      /* See if we can sign extend the smaller type.  */
+		      if (TYPE_PRECISION (this_unprom->type)
+			  > TYPE_PRECISION (*common_type))
+			*common_type = this_unprom->type;
+		      *subtype = optab_vector_mixed_sign;
+		    }
+		  else
+		    return 0;
+		}
 	    }
 	}
       next_op += nops;
@@ -725,12 +744,22 @@ vect_split_statement (vec_info *vinfo, stmt_vec_info stmt2_info, tree new_rhs,
 
 /* Convert UNPROM to TYPE and return the result, adding new statements
    to STMT_INFO's pattern definition statements if no better way is
-   available.  VECTYPE is the vector form of TYPE.  */
+   available.  VECTYPE is the vector form of TYPE.
+
+   If SUBTYPE then convert the type based on the subtype.  */
 
 static tree
 vect_convert_input (vec_info *vinfo, stmt_vec_info stmt_info, tree type,
-		    vect_unpromoted_value *unprom, tree vectype)
+		    vect_unpromoted_value *unprom, tree vectype,
+		    enum optab_subtype subtype = optab_default)
 {
+
+  /* Update the type if the signs differ.  */
+  if (subtype == optab_vector_mixed_sign
+      && TYPE_SIGN (type) != TYPE_SIGN (TREE_TYPE (unprom->op)))
+    type = build_nonstandard_integer_type (TYPE_PRECISION (type),
+					   TYPE_SIGN (unprom->type));
+
   /* Check for a no-op conversion.  */
   if (types_compatible_p (type, TREE_TYPE (unprom->op)))
     return unprom->op;
@@ -806,12 +835,14 @@ vect_convert_input (vec_info *vinfo, stmt_vec_info stmt_info, tree type,
 }
 
 /* Invoke vect_convert_input for N elements of UNPROM and store the
-   result in the corresponding elements of RESULT.  */
+   result in the corresponding elements of RESULT.
+
+   If SUBTYPE then convert the type based on the subtype.  */
 
 static void
 vect_convert_inputs (vec_info *vinfo, stmt_vec_info stmt_info, unsigned int n,
 		     tree *result, tree type, vect_unpromoted_value *unprom,
-		     tree vectype)
+		     tree vectype, enum optab_subtype subtype = optab_default)
 {
   for (unsigned int i = 0; i < n; ++i)
     {
@@ -819,11 +850,12 @@ vect_convert_inputs (vec_info *vinfo, stmt_vec_info stmt_info, unsigned int n,
       for (j = 0; j < i; ++j)
 	if (unprom[j].op == unprom[i].op)
 	  break;
+
       if (j < i)
 	result[i] = result[j];
       else
 	result[i] = vect_convert_input (vinfo, stmt_info,
-					type, &unprom[i], vectype);
+					type, &unprom[i], vectype, subtype);
     }
 }
 
@@ -895,7 +927,8 @@ vect_reassociating_reduction_p (vec_info *vinfo,
 
    Try to find the following pattern:
 
-     type x_t, y_t;
+     type1a x_t
+     type1b y_t;
      TYPE1 prod;
      TYPE2 sum = init;
    loop:
@@ -908,9 +941,9 @@ vect_reassociating_reduction_p (vec_info *vinfo,
      [S6  prod = (TYPE2) prod;  #optional]
      S7  sum_1 = prod + sum_0;
 
-   where 'TYPE1' is exactly double the size of type 'type', and 'TYPE2' is the
-   same size of 'TYPE1' or bigger. This is a special case of a reduction
-   computation.
+   where 'TYPE1' is exactly double the size of type 'type1a' and 'type1b',
+   the sign of 'TYPE1' must be one of 'type1a' or 'type1b' but the sign of
+   'type1a' and 'type1b' can differ.
 
    Input:
 
@@ -953,7 +986,8 @@ vect_recog_dot_prod_pattern (vec_info *vinfo,
      In which
      - DX is double the size of X
      - DY is double the size of Y
-     - DX, DY, DPROD all have the same type
+     - DX, DY, DPROD all have the same type but the sign
+       between X, Y and DPROD can differ.
      - sum is the same size of DPROD or bigger
      - sum has been recognized as a reduction variable.
 
@@ -991,8 +1025,18 @@ vect_recog_dot_prod_pattern (vec_info *vinfo,
   /* FORNOW.  Can continue analyzing the def-use chain when this stmt in a phi
      inside the loop (in case we are analyzing an outer-loop).  */
   vect_unpromoted_value unprom0[2];
+  enum optab_subtype subtype = optab_vector;
   if (!vect_widened_op_tree (vinfo, mult_vinfo, MULT_EXPR, WIDEN_MULT_EXPR,
-			     false, 2, unprom0, &half_type))
+			     false, 2, unprom0, &half_type, &subtype))
+    return NULL;
+
+  /* If there are two widening operations, make sure they agree on the sign
+     of the extension.  The result of an optab_vector_mixed_sign operation
+     is signed; otherwise, the result has the same sign as the operands.  */
+  if (TYPE_PRECISION (unprom_mult.type) != TYPE_PRECISION (type)
+      && (subtype == optab_vector_mixed_sign
+	? TYPE_UNSIGNED (unprom_mult.type)
+	: TYPE_SIGN (unprom_mult.type) != TYPE_SIGN (half_type)))
     return NULL;
 
   /* If there are two widening operations, make sure they agree on
@@ -1005,13 +1049,13 @@ vect_recog_dot_prod_pattern (vec_info *vinfo,
 
   tree half_vectype;
   if (!vect_supportable_direct_optab_p (vinfo, type, DOT_PROD_EXPR, half_type,
-					type_out, &half_vectype))
+					type_out, &half_vectype, subtype))
     return NULL;
 
   /* Get the inputs in the appropriate types.  */
   tree mult_oprnd[2];
   vect_convert_inputs (vinfo, stmt_vinfo, 2, mult_oprnd, half_type,
-		       unprom0, half_vectype);
+		       unprom0, half_vectype, subtype);
 
   var = vect_recog_temp_ssa_var (type, NULL);
   pattern_stmt = gimple_build_assign (var, DOT_PROD_EXPR,

next prev parent reply	other threads:[~2021-07-12 12:29 UTC|newest]

Thread overview: 35+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-05-05 17:38 Tamar Christina
2021-05-05 17:38 ` [PATCH 2/4]AArch64: Add support for sign differing dot-product usdot for NEON and SVE Tamar Christina
2021-05-10 16:49   ` Richard Sandiford
2021-05-25 14:57     ` Tamar Christina
2021-05-26  8:50       ` Richard Sandiford
2021-05-05 17:39 ` [PATCH 3/4][AArch32]: Add support for sign differing dot-product usdot for NEON Tamar Christina
2021-05-05 17:42   ` FW: " Tamar Christina
     [not found]     ` <VI1PR08MB5325B832EE3BB6139886C0E9FF259@VI1PR08MB5325.eurprd08.prod.outlook.com>
2021-05-25 15:02       ` Tamar Christina
2021-05-26 10:45         ` Kyrylo Tkachov
2021-05-06  9:23   ` Christophe Lyon
2021-05-06  9:27     ` Tamar Christina
2021-05-05 17:39 ` [PATCH 4/4]middle-end: Add tests middle end generic tests for sign differing dotproduct Tamar Christina
     [not found]   ` <VI1PR08MB532511701573C18A33AC6291FF259@VI1PR08MB5325.eurprd08.prod.outlook.com>
2021-05-25 15:01     ` FW: " Tamar Christina
     [not found]     ` <11s2181-8856-30rq-26or-84q8o7qrr2o@fhfr.qr>
2021-05-26  8:48       ` Tamar Christina
2021-06-14 12:08       ` Tamar Christina
2021-05-07 11:45 ` [PATCH 1/4]middle-end Vect: Add support for dot-product where the sign for the multiplicant changes Richard Biener
2021-05-07 12:42   ` Tamar Christina
2021-05-10 11:39     ` Richard Biener
2021-05-10 12:58       ` Tamar Christina
2021-05-10 13:29         ` Richard Biener
2021-05-25 14:57           ` Tamar Christina
2021-05-26  8:56             ` Richard Biener
2021-06-02  9:28               ` Tamar Christina
2021-06-04 10:12                 ` Tamar Christina
2021-06-07 10:10                   ` Richard Sandiford
2021-06-14 12:06                     ` Tamar Christina
2021-06-21  8:11                       ` Tamar Christina
2021-06-22 10:56                       ` Richard Sandiford
2021-06-22 11:16                         ` Richard Sandiford
2021-07-12  9:18                           ` Tamar Christina
2021-07-12  9:39                             ` Richard Sandiford
2021-07-12  9:56                               ` Tamar Christina
2021-07-12 10:25                                 ` Richard Sandiford
2021-07-12 12:29                                   ` Tamar Christina [this message]
2021-07-12 14:55                                     ` Richard Sandiford

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=VI1PR08MB532570940234239E9F607285FF159@VI1PR08MB5325.eurprd08.prod.outlook.com \
    --to=tamar.christina@arm.com \
    --cc=Richard.Sandiford@arm.com \
    --cc=gcc-patches@gcc.gnu.org \
    --cc=nd@arm.com \
    --cc=rguenther@suse.de \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).