public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed
* [PATCH] vect: Don't set excess bits in unform masks
@ 2023-10-20 15:48 Andrew Stubbs
  2023-10-23 10:43 ` Richard Biener
  0 siblings, 1 reply; 4+ messages in thread
From: Andrew Stubbs @ 2023-10-20 15:48 UTC (permalink / raw)
  To: gcc-patches, Richard Biener

[-- Attachment #1: Type: text/plain, Size: 977 bytes --]

This patch fixes a wrong-code bug on amdgcn in which the excess "ones" 
in the mask enable extra lanes that were supposed to be unused and are 
therefore undefined.

Richi suggested an alternative approach involving narrower types and 
then a zero-extend to the actual mask type.  This solved the problem for 
the specific test case that I had, but I'm not sure if it would work 
with V2 and V4 modes (not that I've observed bad behaviour from them 
anyway, but still).  There were some other caveats involving "two-lane 
division" that I don't fully understand, so I went with the simpler 
implementation.

This patch does have the disadvantage of an additional "and" instruction 
in the non-constant case even for machines that don't need it. I'm not 
sure how to fix that without an additional target hook. (If GCC could 
use the 64-lane vectors more effectively without the assistance of 
artificially reduced sizes then this problem wouldn't exist.)

OK to commit?

Andrew

[-- Attachment #2: 231020-uniform-masks.patch --]
[-- Type: text/plain, Size: 1487 bytes --]

vect: Don't set excess bits in unform masks

AVX ignores any excess bits in the mask, but AMD GCN magically uses a larger
vector than was intended (the smaller sizes are "fake"), leading to wrong-code.

gcc/ChangeLog:

	* expr.cc (store_constructor): Add "and" operation to uniform mask
	generation.

diff --git a/gcc/expr.cc b/gcc/expr.cc
index 4220cbd9f8f..fb4609f616e 100644
--- a/gcc/expr.cc
+++ b/gcc/expr.cc
@@ -7440,7 +7440,7 @@ store_constructor (tree exp, rtx target, int cleared, poly_int64 size,
 	    break;
 	  }
 	/* Use sign-extension for uniform boolean vectors with
-	   integer modes.  */
+	   integer modes.  Effectively "vec_duplicate" for bitmasks.  */
 	if (!TREE_SIDE_EFFECTS (exp)
 	    && VECTOR_BOOLEAN_TYPE_P (type)
 	    && SCALAR_INT_MODE_P (mode)
@@ -7449,7 +7449,21 @@ store_constructor (tree exp, rtx target, int cleared, poly_int64 size,
 	  {
 	    rtx op0 = force_reg (TYPE_MODE (TREE_TYPE (elt)),
 				 expand_normal (elt));
-	    convert_move (target, op0, 0);
+	    rtx tmp = gen_reg_rtx (mode);
+	    convert_move (tmp, op0, 0);
+
+	    if (known_ne (TYPE_VECTOR_SUBPARTS (type),
+			  GET_MODE_PRECISION (mode)))
+	      {
+		/* Ensure no excess bits are set.
+		   GCN needs this, AVX does not.  */
+		expand_binop (mode, and_optab, tmp,
+			      GEN_INT ((1 << (TYPE_VECTOR_SUBPARTS (type)
+					      .to_constant())) - 1),
+			      target, true, OPTAB_DIRECT);
+	      }
+	    else
+	      emit_move_insn (target, tmp);
 	    break;
 	  }
 

^ permalink raw reply	[flat|nested] 4+ messages in thread

end of thread, other threads:[~2023-11-10 11:53 UTC | newest]

Thread overview: 4+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2023-10-20 15:48 [PATCH] vect: Don't set excess bits in unform masks Andrew Stubbs
2023-10-23 10:43 ` Richard Biener
2023-11-10 11:45   ` Andrew Stubbs
2023-11-10 11:53     ` Richard Biener

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).