public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed
From: Xionghu Luo <luoxhu@linux.ibm.com>
To: Segher Boessenkool <segher@kernel.crashing.org>
Cc: gcc-patches@gcc.gnu.org, dje.gcc@gmail.com,
	wschmidt@linux.ibm.com, guojiufu@linux.ibm.com,
	linkw@gcc.gnu.org
Subject: Re: [PATCH] rs6000: Fix wrong code generation for vec_sel [PR94613]
Date: Fri, 14 May 2021 14:57:10 +0800	[thread overview]
Message-ID: <973f14d9-c626-89b4-753f-e5c613eef6f2@linux.ibm.com> (raw)
In-Reply-To: <20210513104931.GG10366@gate.crashing.org>

[-- Attachment #1: Type: text/plain, Size: 6436 bytes --]

Hi,

On 2021/5/13 18:49, Segher Boessenkool wrote:
> Hi!
> 
> On Fri, Apr 30, 2021 at 01:32:58AM -0500, Xionghu Luo wrote:
>> The vsel instruction is a bit-wise select instruction.  Using an
>> IF_THEN_ELSE to express it in RTL is wrong and leads to wrong code
>> being generated in the combine pass.  Per element selection is a
>> subset of per bit-wise selection,with the patch the pattern is
>> written using bit operations.  But there are 8 different patterns
>> to define "op0 := (op1 & ~op3) | (op2 & op3)":
>>
>> (~op3&op1) | (op3&op2),
>> (~op3&op1) | (op2&op3),
>> (op3&op2) | (~op3&op1),
>> (op2&op3) | (~op3&op1),
>> (op1&~op3) | (op3&op2),
>> (op1&~op3) | (op2&op3),
>> (op3&op2) | (op1&~op3),
>> (op2&op3) | (op1&~op3),
>>
>> Combine pass will swap (op1&~op3) to (~op3&op1) due to commutative
>> canonical, which could reduce it to the FIRST 4 patterns, but it won't
>> swap (op2&op3) | (~op3&op1) to (~op3&op1) | (op2&op3), so this patch
>> handles it with two patterns with different NOT op3 position and check
>> equality inside it.
> 
> Yup, that latter case does not have canonicalisation rules.  Btw, not
> only combine does this canonicalisation: everything should,
> non-canonical RTL is invalid RTL (in the instruction stream, you can do
> everything in temporary code of course, as long as the RTL isn't
> malformed).
> 
>> -(define_insn "*altivec_vsel<mode>"
>> +(define_insn "altivec_vsel<mode>"
>>     [(set (match_operand:VM 0 "altivec_register_operand" "=v")
>> -	(if_then_else:VM
>> -	 (ne:CC (match_operand:VM 1 "altivec_register_operand" "v")
>> -		(match_operand:VM 4 "zero_constant" ""))
>> -	 (match_operand:VM 2 "altivec_register_operand" "v")
>> -	 (match_operand:VM 3 "altivec_register_operand" "v")))]
>> -  "VECTOR_MEM_ALTIVEC_P (<MODE>mode)"
>> -  "vsel %0,%3,%2,%1"
>> +	(ior:VM
>> +	 (and:VM
>> +	  (not:VM (match_operand:VM 3 "altivec_register_operand" "v"))
>> +	  (match_operand:VM 1 "altivec_register_operand" "v"))
>> +	 (and:VM
>> +	  (match_operand:VM 2 "altivec_register_operand" "v")
>> +	  (match_operand:VM 4 "altivec_register_operand" "v"))))]
>> +  "VECTOR_UNIT_ALTIVEC_OR_VSX_P (<MODE>mode)
>> +  && (rtx_equal_p (operands[2], operands[3])
>> +  || rtx_equal_p (operands[4], operands[3]))"
>> +  {
>> +    if (rtx_equal_p (operands[2], operands[3]))
>> +      return "vsel %0,%1,%4,%3";
>> +    else
>> +      return "vsel %0,%1,%2,%3";
>> +  }
>>     [(set_attr "type" "vecmove")])
> 
> That rtx_equal_p stuff is nice and tricky, but it is a bit too tricky I
> think.  So please write this as two patterns (and keep the expand if
> that helps).

I was a bit concerned that there would be a lot of duplicate code if we
write two patterns for each vsel, totally 4 similar patterns in
altivec.md and another 4 in vsx.md make it difficult to maintain, however
I updated it since you prefer this way, as you pointed out the xxsel in
vsx.md could be folded by later patch.

> 
>> +(define_insn "altivec_vsel<mode>2"
> 
> (same here of course).
> 
>>   ;; Fused multiply add.
>> diff --git a/gcc/config/rs6000/rs6000-call.c b/gcc/config/rs6000/rs6000-call.c
>> index f5676255387..d65bdc01055 100644
>> --- a/gcc/config/rs6000/rs6000-call.c
>> +++ b/gcc/config/rs6000/rs6000-call.c
>> @@ -3362,11 +3362,11 @@ const struct altivec_builtin_types altivec_overloaded_builtins[] = {
>>       RS6000_BTI_V2DI, RS6000_BTI_V2DI, RS6000_BTI_V2DI, RS6000_BTI_unsigned_V2DI },
>>     { ALTIVEC_BUILTIN_VEC_SEL, ALTIVEC_BUILTIN_VSEL_2DI,
>>       RS6000_BTI_V2DI, RS6000_BTI_V2DI, RS6000_BTI_V2DI, RS6000_BTI_V2DI },
>> -  { ALTIVEC_BUILTIN_VEC_SEL, ALTIVEC_BUILTIN_VSEL_2DI,
>> +  { ALTIVEC_BUILTIN_VEC_SEL, ALTIVEC_BUILTIN_VSEL_2DI_UNS,
> 
> Are the _uns things still used for anything?  But, let's not change
> this until Bill's stuff is in :-)
> 
> Why do you want to change this here, btw?  I don't understand.

OK, they are actually "unsigned type" overload builtin functions, change
it or not so far won't cause functionality issue, I will revert this change
in the updated patch.

> 
>> +  if (target == 0
>> +      || GET_MODE (target) != tmode
>> +      || ! (*insn_data[icode].operand[0].predicate) (target, tmode))
> 
> No space after ! and other unary operators (except for casts and other
> operators you write with alphanumerics, like "sizeof").  I know you
> copied this code, but :-)

OK, thanks.

> 
>> @@ -15608,8 +15606,6 @@ rs6000_emit_vector_cond_expr (rtx dest, rtx op_true, rtx op_false,
>>       case GEU:
>>       case LTU:
>>       case LEU:
>> -      /* Mark unsigned tests with CCUNSmode.  */
>> -      cc_mode = CCUNSmode;
>>   
>>         /* Invert condition to avoid compound test if necessary.  */
>>         if (rcode == GEU || rcode == LEU)
> 
> So this is related to the _uns thing.  Could you split off that change?
> Probably as an earlier patch (but either works for me).

Not related to the ALTIVEC_BUILTIN_VSEL_2DI_UNS things, previously cc_mode
is a parameter to generate the condition for IF_THEN_ELSE instruction, now
we don't need it again as we use IOR (AND... AND...) style, remove it to avoid
build error.


-  cond2 = gen_rtx_fmt_ee (NE, cc_mode, gen_lowpart (dest_mode, mask),
-                         CONST0_RTX (dest_mode));
-  emit_insn (gen_rtx_SET (dest,
-                         gen_rtx_IF_THEN_ELSE (dest_mode,
-                                               cond2,
-                                               op_true,
-                                               op_false)));
+  rtx tmp = gen_rtx_IOR (dest_mode,
+                        gen_rtx_AND (dest_mode, gen_rtx_NOT (dest_mode, mask),
+                                     op_false),
+                        gen_rtx_AND (dest_mode, mask, op_true));
+  emit_insn (gen_rtx_SET (dest, tmp));


> 
>> @@ -15629,6 +15625,9 @@ rs6000_emit_vector_cond_expr (rtx dest, rtx op_true, rtx op_false,
>>     if (!mask)
>>       return 0;
>>   
>> +  if (mask_mode != dest_mode)
>> +      mask = simplify_gen_subreg (dest_mode, mask, mask_mode, 0);
> 
> Indent just two characters please: line continuations (usually) align,
> but indents do not.> 
> 
> Can you fold vsel and xxsel together completely?  They have exactly the
> same semantics!  This does not have to be in this patch of course.

I noticed that vperm/xxperm are folded together, do you mean fold vsel/xxsel
like them?  It's attached as:
0002-rs6000-Fold-xxsel-to-vsel-since-they-have-same-seman.patch
  

Thanks,
Xionghu

[-- Attachment #2: 0001-rs6000-Fix-wrong-code-generation-for-vec_sel-PR94613.patch --]
[-- Type: text/plain, Size: 16722 bytes --]

From 68cb56305caea6aaee083d814fd4d11590cbc766 Mon Sep 17 00:00:00 2001
From: Xionghu Luo <luoxhu@linux.ibm.com>
Date: Tue, 27 Apr 2021 01:07:25 -0500
Subject: [PATCH 1/2] rs6000: Fix wrong code generation for vec_sel [PR94613]

The vsel instruction is a bit-wise select instruction.  Using an
IF_THEN_ELSE to express it in RTL is wrong and leads to wrong code
being generated in the combine pass.  Per element selection is a
subset of per bit-wise selection,with the patch the pattern is
written using bit operations.  But there are 8 different patterns
to define "op0 := (op1 & ~op3) | (op2 & op3)":

(~op3&op1) | (op3&op2),
(~op3&op1) | (op2&op3),
(op3&op2) | (~op3&op1),
(op2&op3) | (~op3&op1),
(op1&~op3) | (op3&op2),
(op1&~op3) | (op2&op3),
(op3&op2) | (op1&~op3),
(op2&op3) | (op1&~op3),

Combine pass will swap (op1&~op3) to (~op3&op1) due to commutative
canonical, which could reduce it to the FIRST 4 patterns, but it won't
swap (op2&op3) | (~op3&op1) to (~op3&op1) | (op2&op3), so this patch
handles it with two patterns with different NOT op3 position and check
equality inside it.

Tested pass on Power8LE, any comments?

gcc/ChangeLog:

	* config/rs6000/altivec.md (*altivec_vsel<mode>): Change to ...
	(altivec_vsel<mode>): ... this and update define.
	(*altivec_vsel<mode>_uns): Delete.
	(altivec_vsel<mode>2): New define_insn.
	(altivec_vsel<mode>3): Likewise.
	(altivec_vsel<mode>4): Likewise.
	* config/rs6000/rs6000-call.c (altivec_expand_vec_sel_builtin):
	New.
	(altivec_expand_builtin): Call altivec_expand_vec_sel_builtin to
	expand vel_sel.
	* config/rs6000/rs6000.c (rs6000_emit_vector_cond_expr): Use
	bit-wise selection instead of per element.
	* config/rs6000/vector.md:
	* config/rs6000/vsx.md (*vsx_xxsel<mode>): Change to ...
	(vsx_xxsel<mode>): ... this and update define.
	(*vsx_xxsel<mode>_uns): Delete.
	(vsx_xxsel<mode>2): New define_insn.
	(vsx_xxsel<mode>3): Likewise.
	(vsx_xxsel<mode>4): Likewise.

gcc/testsuite/ChangeLog:

	* gcc.target/powerpc/pr94613.c: New test.
---
 gcc/config/rs6000/altivec.md               | 66 ++++++++++++++++-----
 gcc/config/rs6000/rs6000-call.c            | 62 ++++++++++++++++++++
 gcc/config/rs6000/rs6000.c                 | 19 +++---
 gcc/config/rs6000/vector.md                | 26 ++++-----
 gcc/config/rs6000/vsx.md                   | 68 +++++++++++++++++-----
 gcc/testsuite/gcc.target/powerpc/pr94613.c | 38 ++++++++++++
 6 files changed, 222 insertions(+), 57 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/powerpc/pr94613.c

diff --git a/gcc/config/rs6000/altivec.md b/gcc/config/rs6000/altivec.md
index 1351dafbc41..df8f24701d0 100644
--- a/gcc/config/rs6000/altivec.md
+++ b/gcc/config/rs6000/altivec.md
@@ -667,26 +667,60 @@ (define_insn "*altivec_gev4sf"
   "vcmpgefp %0,%1,%2"
   [(set_attr "type" "veccmp")])
 
-(define_insn "*altivec_vsel<mode>"
+(define_insn "altivec_vsel<mode>"
   [(set (match_operand:VM 0 "altivec_register_operand" "=v")
-	(if_then_else:VM
-	 (ne:CC (match_operand:VM 1 "altivec_register_operand" "v")
-		(match_operand:VM 4 "zero_constant" ""))
-	 (match_operand:VM 2 "altivec_register_operand" "v")
-	 (match_operand:VM 3 "altivec_register_operand" "v")))]
-  "VECTOR_MEM_ALTIVEC_P (<MODE>mode)"
-  "vsel %0,%3,%2,%1"
+	(ior:VM
+	 (and:VM
+	  (not:VM (match_operand:VM 3 "altivec_register_operand" "v"))
+	  (match_operand:VM 1 "altivec_register_operand" "v"))
+	 (and:VM
+	  (match_operand:VM 2 "altivec_register_operand" "v")
+	  (match_operand:VM 4 "altivec_register_operand" "v"))))]
+  "VECTOR_UNIT_ALTIVEC_OR_VSX_P (<MODE>mode)
+  && rtx_equal_p (operands[2], operands[3])"
+  "vsel %0,%1,%4,%3"
   [(set_attr "type" "vecmove")])
 
-(define_insn "*altivec_vsel<mode>_uns"
+(define_insn "altivec_vsel<mode>2"
   [(set (match_operand:VM 0 "altivec_register_operand" "=v")
-	(if_then_else:VM
-	 (ne:CCUNS (match_operand:VM 1 "altivec_register_operand" "v")
-		   (match_operand:VM 4 "zero_constant" ""))
-	 (match_operand:VM 2 "altivec_register_operand" "v")
-	 (match_operand:VM 3 "altivec_register_operand" "v")))]
-  "VECTOR_MEM_ALTIVEC_P (<MODE>mode)"
-  "vsel %0,%3,%2,%1"
+	(ior:VM
+	 (and:VM
+	  (not:VM (match_operand:VM 3 "altivec_register_operand" "v"))
+	  (match_operand:VM 1 "altivec_register_operand" "v"))
+	 (and:VM
+	  (match_operand:VM 2 "altivec_register_operand" "v")
+	  (match_operand:VM 4 "altivec_register_operand" "v"))))]
+  "VECTOR_UNIT_ALTIVEC_OR_VSX_P (<MODE>mode)
+  && rtx_equal_p (operands[4], operands[3])"
+  "vsel %0,%1,%2,%3"
+  [(set_attr "type" "vecmove")])
+
+(define_insn "altivec_vsel<mode>3"
+  [(set (match_operand:VM 0 "altivec_register_operand" "=v")
+	(ior:VM
+	 (and:VM
+	  (match_operand:VM 1 "altivec_register_operand" "v")
+	  (match_operand:VM 2 "altivec_register_operand" "v"))
+	 (and:VM
+	  (not:VM (match_operand:VM 3 "altivec_register_operand" "v"))
+	  (match_operand:VM 4 "altivec_register_operand" "v"))))]
+  "VECTOR_UNIT_ALTIVEC_OR_VSX_P (<MODE>mode)
+  && rtx_equal_p (operands[1], operands[3])"
+  "vsel %0,%4,%2,%3"
+  [(set_attr "type" "vecmove")])
+
+(define_insn "altivec_vsel<mode>4"
+  [(set (match_operand:VM 0 "altivec_register_operand" "=v")
+	(ior:VM
+	 (and:VM
+	  (match_operand:VM 1 "altivec_register_operand" "v")
+	  (match_operand:VM 2 "altivec_register_operand" "v"))
+	 (and:VM
+	  (not:VM (match_operand:VM 3 "altivec_register_operand" "v"))
+	  (match_operand:VM 4 "altivec_register_operand" "v"))))]
+  "VECTOR_UNIT_ALTIVEC_OR_VSX_P (<MODE>mode)
+  && rtx_equal_p (operands[2], operands[3])"
+  "vsel %0,%4,%1,%3"
   [(set_attr "type" "vecmove")])
 
 ;; Fused multiply add.
diff --git a/gcc/config/rs6000/rs6000-call.c b/gcc/config/rs6000/rs6000-call.c
index f5676255387..94593be0d2e 100644
--- a/gcc/config/rs6000/rs6000-call.c
+++ b/gcc/config/rs6000/rs6000-call.c
@@ -10996,6 +10996,45 @@ altivec_expand_vec_ext_builtin (tree exp, rtx target)
   return target;
 }
 
+/* Expand vec_sel builtin.  */
+static rtx
+altivec_expand_vec_sel_builtin (enum insn_code icode, tree exp, rtx target)
+{
+  rtx op0, op1, op2, pat;
+  tree arg0, arg1, arg2;
+
+  arg0 = CALL_EXPR_ARG (exp, 0);
+  op0 = expand_normal (arg0);
+  arg1 = CALL_EXPR_ARG (exp, 1);
+  op1 = expand_normal (arg1);
+  arg2 = CALL_EXPR_ARG (exp, 2);
+  op2 = expand_normal (arg2);
+
+  machine_mode tmode = insn_data[icode].operand[0].mode;
+  machine_mode mode0 = insn_data[icode].operand[1].mode;
+  machine_mode mode1 = insn_data[icode].operand[2].mode;
+  machine_mode mode2 = insn_data[icode].operand[3].mode;
+
+  if (target == 0 || GET_MODE (target) != tmode
+      || !(*insn_data[icode].operand[0].predicate) (target, tmode))
+    target = gen_reg_rtx (tmode);
+
+  if (!(*insn_data[icode].operand[1].predicate) (op0, mode0))
+    op0 = copy_to_mode_reg (mode0, op0);
+  if (!(*insn_data[icode].operand[2].predicate) (op1, mode1))
+    op1 = copy_to_mode_reg (mode1, op1);
+  if (!(*insn_data[icode].operand[3].predicate) (op2, mode2))
+    op2 = copy_to_mode_reg (mode2, op2);
+
+  pat = GEN_FCN (icode) (target, op0, op1, op2, op2);
+  if (pat)
+    emit_insn (pat);
+  else
+    return NULL_RTX;
+
+  return target;
+}
+
 /* Expand the builtin in EXP and store the result in TARGET.  Store
    true in *EXPANDEDP if we found a builtin to expand.  */
 static rtx
@@ -11181,6 +11220,29 @@ altivec_expand_builtin (tree exp, rtx target, bool *expandedp)
 	emit_insn (pat);
       return NULL_RTX;
 
+    case ALTIVEC_BUILTIN_VSEL_2DF:
+      return altivec_expand_vec_sel_builtin (CODE_FOR_altivec_vselv2df, exp,
+					     target);
+    case ALTIVEC_BUILTIN_VSEL_2DI:
+    case ALTIVEC_BUILTIN_VSEL_2DI_UNS:
+      return altivec_expand_vec_sel_builtin (CODE_FOR_altivec_vselv2di, exp,
+					     target);
+    case ALTIVEC_BUILTIN_VSEL_4SF:
+      return altivec_expand_vec_sel_builtin (CODE_FOR_altivec_vselv4sf, exp,
+					     target);
+    case ALTIVEC_BUILTIN_VSEL_4SI:
+    case ALTIVEC_BUILTIN_VSEL_4SI_UNS:
+      return altivec_expand_vec_sel_builtin (CODE_FOR_altivec_vselv4si, exp,
+					     target);
+    case ALTIVEC_BUILTIN_VSEL_8HI:
+    case ALTIVEC_BUILTIN_VSEL_8HI_UNS:
+      return altivec_expand_vec_sel_builtin (CODE_FOR_altivec_vselv8hi, exp,
+					     target);
+    case ALTIVEC_BUILTIN_VSEL_16QI:
+    case ALTIVEC_BUILTIN_VSEL_16QI_UNS:
+      return altivec_expand_vec_sel_builtin (CODE_FOR_altivec_vselv16qi, exp,
+					     target);
+
     case ALTIVEC_BUILTIN_DSSALL:
       emit_insn (gen_altivec_dssall ());
       return NULL_RTX;
diff --git a/gcc/config/rs6000/rs6000.c b/gcc/config/rs6000/rs6000.c
index befab53031c..5a5202c455b 100644
--- a/gcc/config/rs6000/rs6000.c
+++ b/gcc/config/rs6000/rs6000.c
@@ -15566,9 +15566,7 @@ rs6000_emit_vector_cond_expr (rtx dest, rtx op_true, rtx op_false,
   machine_mode dest_mode = GET_MODE (dest);
   machine_mode mask_mode = GET_MODE (cc_op0);
   enum rtx_code rcode = GET_CODE (cond);
-  machine_mode cc_mode = CCmode;
   rtx mask;
-  rtx cond2;
   bool invert_move = false;
 
   if (VECTOR_UNIT_NONE_P (dest_mode))
@@ -15608,8 +15606,6 @@ rs6000_emit_vector_cond_expr (rtx dest, rtx op_true, rtx op_false,
     case GEU:
     case LTU:
     case LEU:
-      /* Mark unsigned tests with CCUNSmode.  */
-      cc_mode = CCUNSmode;
 
       /* Invert condition to avoid compound test if necessary.  */
       if (rcode == GEU || rcode == LEU)
@@ -15629,6 +15625,9 @@ rs6000_emit_vector_cond_expr (rtx dest, rtx op_true, rtx op_false,
   if (!mask)
     return 0;
 
+  if (mask_mode != dest_mode)
+    mask = simplify_gen_subreg (dest_mode, mask, mask_mode, 0);
+
   if (invert_move)
     std::swap (op_true, op_false);
 
@@ -15668,13 +15667,11 @@ rs6000_emit_vector_cond_expr (rtx dest, rtx op_true, rtx op_false,
   if (!REG_P (op_false) && !SUBREG_P (op_false))
     op_false = force_reg (dest_mode, op_false);
 
-  cond2 = gen_rtx_fmt_ee (NE, cc_mode, gen_lowpart (dest_mode, mask),
-			  CONST0_RTX (dest_mode));
-  emit_insn (gen_rtx_SET (dest,
-			  gen_rtx_IF_THEN_ELSE (dest_mode,
-						cond2,
-						op_true,
-						op_false)));
+  rtx tmp = gen_rtx_IOR (dest_mode,
+			 gen_rtx_AND (dest_mode, gen_rtx_NOT (dest_mode, mask),
+				      op_false),
+			 gen_rtx_AND (dest_mode, mask, op_true));
+  emit_insn (gen_rtx_SET (dest, tmp));
   return 1;
 }
 
diff --git a/gcc/config/rs6000/vector.md b/gcc/config/rs6000/vector.md
index 3446b03d40d..85ffcf6a763 100644
--- a/gcc/config/rs6000/vector.md
+++ b/gcc/config/rs6000/vector.md
@@ -862,23 +862,21 @@ (define_insn_and_split "vector_<code><mode>"
 ;; which is in the reverse order that we want
 (define_expand "vector_select_<mode>"
   [(set (match_operand:VEC_L 0 "vlogical_operand")
-	(if_then_else:VEC_L
-	 (ne:CC (match_operand:VEC_L 3 "vlogical_operand")
-		(match_dup 4))
-	 (match_operand:VEC_L 2 "vlogical_operand")
-	 (match_operand:VEC_L 1 "vlogical_operand")))]
-  "VECTOR_UNIT_ALTIVEC_OR_VSX_P (<MODE>mode)"
-  "operands[4] = CONST0_RTX (<MODE>mode);")
+    (ior:VEC_L
+     (and:VEC_L (not:VEC_L (match_operand:VEC_L 3 "vlogical_operand"))
+      (match_operand:VEC_L 1 "vlogical_operand"))
+     (and:VEC_L (match_dup 3)
+     (match_operand:VEC_L 2 "vlogical_operand"))))]
+  "VECTOR_UNIT_ALTIVEC_OR_VSX_P (<MODE>mode)")
 
 (define_expand "vector_select_<mode>_uns"
   [(set (match_operand:VEC_L 0 "vlogical_operand")
-	(if_then_else:VEC_L
-	 (ne:CCUNS (match_operand:VEC_L 3 "vlogical_operand")
-		   (match_dup 4))
-	 (match_operand:VEC_L 2 "vlogical_operand")
-	 (match_operand:VEC_L 1 "vlogical_operand")))]
-  "VECTOR_UNIT_ALTIVEC_OR_VSX_P (<MODE>mode)"
-  "operands[4] = CONST0_RTX (<MODE>mode);")
+    (ior:VEC_L
+     (and:VEC_L (not:VEC_L (match_operand:VEC_L 3 "vlogical_operand"))
+      (match_operand:VEC_L 1 "vlogical_operand"))
+     (and:VEC_L (match_dup 3)
+     (match_operand:VEC_L 2 "vlogical_operand"))))]
+  "VECTOR_UNIT_ALTIVEC_OR_VSX_P (<MODE>mode)")
 
 ;; Expansions that compare vectors producing a vector result and a predicate,
 ;; setting CR6 to indicate a combined status
diff --git a/gcc/config/rs6000/vsx.md b/gcc/config/rs6000/vsx.md
index bcb92be2f5c..f53de7bf34c 100644
--- a/gcc/config/rs6000/vsx.md
+++ b/gcc/config/rs6000/vsx.md
@@ -2105,30 +2105,66 @@ (define_insn "*vsx_ge_<mode>_p"
   [(set_attr "type" "<VStype_simple>")])
 
 ;; Vector select
-(define_insn "*vsx_xxsel<mode>"
+(define_insn "vsx_xxsel<mode>"
   [(set (match_operand:VSX_L 0 "vsx_register_operand" "=<VSr>,?wa")
-	(if_then_else:VSX_L
-	 (ne:CC (match_operand:VSX_L 1 "vsx_register_operand" "<VSr>,wa")
-		(match_operand:VSX_L 4 "zero_constant" ""))
-	 (match_operand:VSX_L 2 "vsx_register_operand" "<VSr>,wa")
-	 (match_operand:VSX_L 3 "vsx_register_operand" "<VSr>,wa")))]
-  "VECTOR_MEM_VSX_P (<MODE>mode)"
-  "xxsel %x0,%x3,%x2,%x1"
+    (ior:VSX_L
+     (and:VSX_L
+      (not:VSX_L (match_operand:VSX_L 3 "vsx_register_operand" "<VSr>,wa"))
+      (match_operand:VSX_L 1 "vsx_register_operand" "<VSr>,wa"))
+     (and:VSX_L
+      (match_operand:VSX_L 2 "vsx_register_operand" "<VSr>,wa")
+      (match_operand:VSX_L 4 "vsx_register_operand" "<VSr>,wa"))))]
+  "VECTOR_MEM_VSX_P (<MODE>mode)
+  && rtx_equal_p (operands[2], operands[3])"
+  "xxsel %x0,%x1,%x4,%x3"
   [(set_attr "type" "vecmove")
    (set_attr "isa" "<VSisa>")])
 
-(define_insn "*vsx_xxsel<mode>_uns"
+(define_insn "vsx_xxsel<mode>2"
   [(set (match_operand:VSX_L 0 "vsx_register_operand" "=<VSr>,?wa")
-	(if_then_else:VSX_L
-	 (ne:CCUNS (match_operand:VSX_L 1 "vsx_register_operand" "<VSr>,wa")
-		   (match_operand:VSX_L 4 "zero_constant" ""))
-	 (match_operand:VSX_L 2 "vsx_register_operand" "<VSr>,wa")
-	 (match_operand:VSX_L 3 "vsx_register_operand" "<VSr>,wa")))]
-  "VECTOR_MEM_VSX_P (<MODE>mode)"
-  "xxsel %x0,%x3,%x2,%x1"
+    (ior:VSX_L
+     (and:VSX_L
+      (not:VSX_L (match_operand:VSX_L 3 "vsx_register_operand" "<VSr>,wa"))
+      (match_operand:VSX_L 1 "vsx_register_operand" "<VSr>,wa"))
+     (and:VSX_L
+      (match_operand:VSX_L 2 "vsx_register_operand" "<VSr>,wa")
+      (match_operand:VSX_L 4 "vsx_register_operand" "<VSr>,wa"))))]
+  "VECTOR_MEM_VSX_P (<MODE>mode)
+  && rtx_equal_p (operands[4], operands[3])"
+  "xxsel %x0,%x1,%x2,%x3"
   [(set_attr "type" "vecmove")
    (set_attr "isa" "<VSisa>")])
 
+(define_insn "vsx_xxsel<mode>3"
+ [(set (match_operand:VSX_L 0 "vsx_register_operand" "=<VSr>,?wa")
+   (ior:VSX_L
+    (and:VSX_L
+     (match_operand:VSX_L 1 "vsx_register_operand" "<VSr>,wa")
+     (match_operand:VSX_L 2 "vsx_register_operand" "<VSr>,wa"))
+    (and:VSX_L
+     (not:VSX_L (match_operand:VSX_L 3 "vsx_register_operand" "<VSr>,wa"))
+     (match_operand:VSX_L 4 "vsx_register_operand" "<VSr>,wa"))))]
+ "VECTOR_MEM_VSX_P (<MODE>mode)
+  && rtx_equal_p (operands[1], operands[3])"
+  "xxsel %x0,%x4,%x2,%x3"
+ [(set_attr "type" "vecmove")
+ (set_attr "isa" "<VSisa>")])
+
+(define_insn "vsx_xxsel<mode>4"
+ [(set (match_operand:VSX_L 0 "vsx_register_operand" "=<VSr>,?wa")
+   (ior:VSX_L
+    (and:VSX_L
+     (match_operand:VSX_L 1 "vsx_register_operand" "<VSr>,wa")
+     (match_operand:VSX_L 2 "vsx_register_operand" "<VSr>,wa"))
+    (and:VSX_L
+     (not:VSX_L (match_operand:VSX_L 3 "vsx_register_operand" "<VSr>,wa"))
+     (match_operand:VSX_L 4 "vsx_register_operand" "<VSr>,wa"))))]
+ "VECTOR_MEM_VSX_P (<MODE>mode)
+ && rtx_equal_p (operands[2], operands[3])"
+ "xxsel %x0,%x4,%x1,%x3"
+ [(set_attr "type" "vecmove")
+ (set_attr "isa" "<VSisa>")])
+
 ;; Copy sign
 (define_insn "vsx_copysign<mode>3"
   [(set (match_operand:VSX_F 0 "vsx_register_operand" "=wa")
diff --git a/gcc/testsuite/gcc.target/powerpc/pr94613.c b/gcc/testsuite/gcc.target/powerpc/pr94613.c
new file mode 100644
index 00000000000..06699088f9a
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/pr94613.c
@@ -0,0 +1,38 @@
+/* { dg-do run } */
+/* { dg-require-effective-target vmx_hw } */
+/* { dg-options "-O3 -maltivec" } */
+
+#include <altivec.h>
+
+/* The initial implementation of vec_sel used an IF_THEN_ELSE rtx.
+   This did NOT match what the vsel instruction does.  vsel is a
+   bit-wise operation.  Using IF_THEN_ELSE made the + operation to be
+   simplified away in combine.  A plus operation affects other bits in
+   the same element. Hence per-element simplifications are wrong for
+   vsel.  */
+vector unsigned char __attribute__((noinline))
+foo (vector unsigned char a, vector unsigned char b, vector unsigned char c)
+{
+  return vec_sel (a + b, c, a);
+}
+
+vector unsigned char __attribute__((noinline))
+bar (vector unsigned char a, vector unsigned char b, vector unsigned char c)
+{
+  return vec_sel (a | b, c, a);
+}
+
+int
+main ()
+{
+  vector unsigned char v = (vector unsigned char){ 1 };
+
+  if (foo (v, v, v)[0] != 3)
+      __builtin_abort ();
+
+  if (bar (v, v, v)[0] != 1)
+    __builtin_abort ();
+
+  return 0;
+}
+
-- 
2.25.1


[-- Attachment #3: 0002-rs6000-Fold-xxsel-to-vsel-since-they-have-same-seman.patch --]
[-- Type: text/plain, Size: 7301 bytes --]

From 74cf1fd298e4923c106deaba3192423d48049559 Mon Sep 17 00:00:00 2001
From: Xionghu Luo <luoxhu@linux.ibm.com>
Date: Fri, 14 May 2021 01:21:06 -0500
Subject: [PATCH 2/2] rs6000: Fold xxsel to vsel since they have same semantics

gcc/ChangeLog:

	* config/rs6000/altivec.md: Add vsx register constraints.
	* config/rs6000/vsx.md (vsx_xxsel<mode>): Delete.
	(vsx_xxsel<mode>2): Likewise.
	(vsx_xxsel<mode>3): Likewise.
	(vsx_xxsel<mode>4): Likewise.
---
 gcc/config/rs6000/altivec.md | 56 +++++++++++++++++++--------------
 gcc/config/rs6000/vsx.md     | 61 ------------------------------------
 2 files changed, 32 insertions(+), 85 deletions(-)

diff --git a/gcc/config/rs6000/altivec.md b/gcc/config/rs6000/altivec.md
index df8f24701d0..6ce3eb9dbc8 100644
--- a/gcc/config/rs6000/altivec.md
+++ b/gcc/config/rs6000/altivec.md
@@ -668,59 +668,67 @@ (define_insn "*altivec_gev4sf"
   [(set_attr "type" "veccmp")])
 
 (define_insn "altivec_vsel<mode>"
-  [(set (match_operand:VM 0 "altivec_register_operand" "=v")
+  [(set (match_operand:VM 0 "altivec_register_operand" "=wa,v")
 	(ior:VM
 	 (and:VM
-	  (not:VM (match_operand:VM 3 "altivec_register_operand" "v"))
-	  (match_operand:VM 1 "altivec_register_operand" "v"))
+	  (not:VM (match_operand:VM 3 "altivec_register_operand" "wa,v"))
+	  (match_operand:VM 1 "altivec_register_operand" "wa,v"))
 	 (and:VM
-	  (match_operand:VM 2 "altivec_register_operand" "v")
-	  (match_operand:VM 4 "altivec_register_operand" "v"))))]
+	  (match_operand:VM 2 "altivec_register_operand" "wa,v")
+	  (match_operand:VM 4 "altivec_register_operand" "wa,v"))))]
   "VECTOR_UNIT_ALTIVEC_OR_VSX_P (<MODE>mode)
   && rtx_equal_p (operands[2], operands[3])"
-  "vsel %0,%1,%4,%3"
+  "@
+  xxsel %x0,%x1,%x4,%x3
+  vsel %0,%1,%4,%3"
   [(set_attr "type" "vecmove")])
 
 (define_insn "altivec_vsel<mode>2"
-  [(set (match_operand:VM 0 "altivec_register_operand" "=v")
+  [(set (match_operand:VM 0 "altivec_register_operand" "=wa,v")
 	(ior:VM
 	 (and:VM
-	  (not:VM (match_operand:VM 3 "altivec_register_operand" "v"))
-	  (match_operand:VM 1 "altivec_register_operand" "v"))
+	  (not:VM (match_operand:VM 3 "altivec_register_operand" "wa,v"))
+	  (match_operand:VM 1 "altivec_register_operand" "wa,v"))
 	 (and:VM
-	  (match_operand:VM 2 "altivec_register_operand" "v")
-	  (match_operand:VM 4 "altivec_register_operand" "v"))))]
+	  (match_operand:VM 2 "altivec_register_operand" "wa,v")
+	  (match_operand:VM 4 "altivec_register_operand" "wa,v"))))]
   "VECTOR_UNIT_ALTIVEC_OR_VSX_P (<MODE>mode)
   && rtx_equal_p (operands[4], operands[3])"
-  "vsel %0,%1,%2,%3"
+  "@
+  xxsel %x0,%x1,%x2,%x3
+  vsel %0,%1,%2,%3"
   [(set_attr "type" "vecmove")])
 
 (define_insn "altivec_vsel<mode>3"
-  [(set (match_operand:VM 0 "altivec_register_operand" "=v")
+  [(set (match_operand:VM 0 "altivec_register_operand" "=wa,v")
 	(ior:VM
 	 (and:VM
-	  (match_operand:VM 1 "altivec_register_operand" "v")
-	  (match_operand:VM 2 "altivec_register_operand" "v"))
+	  (match_operand:VM 1 "altivec_register_operand" "wa,v")
+	  (match_operand:VM 2 "altivec_register_operand" "wa,v"))
 	 (and:VM
-	  (not:VM (match_operand:VM 3 "altivec_register_operand" "v"))
-	  (match_operand:VM 4 "altivec_register_operand" "v"))))]
+	  (not:VM (match_operand:VM 3 "altivec_register_operand" "wa,v"))
+	  (match_operand:VM 4 "altivec_register_operand" "wa,v"))))]
   "VECTOR_UNIT_ALTIVEC_OR_VSX_P (<MODE>mode)
   && rtx_equal_p (operands[1], operands[3])"
-  "vsel %0,%4,%2,%3"
+  "@
+  xxsel %x0,%x4,%x2,%x3
+  vsel %0,%4,%2,%3"
   [(set_attr "type" "vecmove")])
 
 (define_insn "altivec_vsel<mode>4"
-  [(set (match_operand:VM 0 "altivec_register_operand" "=v")
+  [(set (match_operand:VM 0 "altivec_register_operand" "=wa,v")
 	(ior:VM
 	 (and:VM
-	  (match_operand:VM 1 "altivec_register_operand" "v")
-	  (match_operand:VM 2 "altivec_register_operand" "v"))
+	  (match_operand:VM 1 "altivec_register_operand" "wa,v")
+	  (match_operand:VM 2 "altivec_register_operand" "wa,v"))
 	 (and:VM
-	  (not:VM (match_operand:VM 3 "altivec_register_operand" "v"))
-	  (match_operand:VM 4 "altivec_register_operand" "v"))))]
+	  (not:VM (match_operand:VM 3 "altivec_register_operand" "wa,v"))
+	  (match_operand:VM 4 "altivec_register_operand" "wa,v"))))]
   "VECTOR_UNIT_ALTIVEC_OR_VSX_P (<MODE>mode)
   && rtx_equal_p (operands[2], operands[3])"
-  "vsel %0,%4,%1,%3"
+  "@
+  xxsel %x0,%x4,%x1,%x3
+  vsel %0,%4,%1,%3"
   [(set_attr "type" "vecmove")])
 
 ;; Fused multiply add.
diff --git a/gcc/config/rs6000/vsx.md b/gcc/config/rs6000/vsx.md
index f53de7bf34c..55d830d0f20 100644
--- a/gcc/config/rs6000/vsx.md
+++ b/gcc/config/rs6000/vsx.md
@@ -2104,67 +2104,6 @@ (define_insn "*vsx_ge_<mode>_p"
   "xvcmpge<sd>p. %x0,%x1,%x2"
   [(set_attr "type" "<VStype_simple>")])
 
-;; Vector select
-(define_insn "vsx_xxsel<mode>"
-  [(set (match_operand:VSX_L 0 "vsx_register_operand" "=<VSr>,?wa")
-    (ior:VSX_L
-     (and:VSX_L
-      (not:VSX_L (match_operand:VSX_L 3 "vsx_register_operand" "<VSr>,wa"))
-      (match_operand:VSX_L 1 "vsx_register_operand" "<VSr>,wa"))
-     (and:VSX_L
-      (match_operand:VSX_L 2 "vsx_register_operand" "<VSr>,wa")
-      (match_operand:VSX_L 4 "vsx_register_operand" "<VSr>,wa"))))]
-  "VECTOR_MEM_VSX_P (<MODE>mode)
-  && rtx_equal_p (operands[2], operands[3])"
-  "xxsel %x0,%x1,%x4,%x3"
-  [(set_attr "type" "vecmove")
-   (set_attr "isa" "<VSisa>")])
-
-(define_insn "vsx_xxsel<mode>2"
-  [(set (match_operand:VSX_L 0 "vsx_register_operand" "=<VSr>,?wa")
-    (ior:VSX_L
-     (and:VSX_L
-      (not:VSX_L (match_operand:VSX_L 3 "vsx_register_operand" "<VSr>,wa"))
-      (match_operand:VSX_L 1 "vsx_register_operand" "<VSr>,wa"))
-     (and:VSX_L
-      (match_operand:VSX_L 2 "vsx_register_operand" "<VSr>,wa")
-      (match_operand:VSX_L 4 "vsx_register_operand" "<VSr>,wa"))))]
-  "VECTOR_MEM_VSX_P (<MODE>mode)
-  && rtx_equal_p (operands[4], operands[3])"
-  "xxsel %x0,%x1,%x2,%x3"
-  [(set_attr "type" "vecmove")
-   (set_attr "isa" "<VSisa>")])
-
-(define_insn "vsx_xxsel<mode>3"
- [(set (match_operand:VSX_L 0 "vsx_register_operand" "=<VSr>,?wa")
-   (ior:VSX_L
-    (and:VSX_L
-     (match_operand:VSX_L 1 "vsx_register_operand" "<VSr>,wa")
-     (match_operand:VSX_L 2 "vsx_register_operand" "<VSr>,wa"))
-    (and:VSX_L
-     (not:VSX_L (match_operand:VSX_L 3 "vsx_register_operand" "<VSr>,wa"))
-     (match_operand:VSX_L 4 "vsx_register_operand" "<VSr>,wa"))))]
- "VECTOR_MEM_VSX_P (<MODE>mode)
-  && rtx_equal_p (operands[1], operands[3])"
-  "xxsel %x0,%x4,%x2,%x3"
- [(set_attr "type" "vecmove")
- (set_attr "isa" "<VSisa>")])
-
-(define_insn "vsx_xxsel<mode>4"
- [(set (match_operand:VSX_L 0 "vsx_register_operand" "=<VSr>,?wa")
-   (ior:VSX_L
-    (and:VSX_L
-     (match_operand:VSX_L 1 "vsx_register_operand" "<VSr>,wa")
-     (match_operand:VSX_L 2 "vsx_register_operand" "<VSr>,wa"))
-    (and:VSX_L
-     (not:VSX_L (match_operand:VSX_L 3 "vsx_register_operand" "<VSr>,wa"))
-     (match_operand:VSX_L 4 "vsx_register_operand" "<VSr>,wa"))))]
- "VECTOR_MEM_VSX_P (<MODE>mode)
- && rtx_equal_p (operands[2], operands[3])"
- "xxsel %x0,%x4,%x1,%x3"
- [(set_attr "type" "vecmove")
- (set_attr "isa" "<VSisa>")])
-
 ;; Copy sign
 (define_insn "vsx_copysign<mode>3"
   [(set (match_operand:VSX_F 0 "vsx_register_operand" "=wa")
-- 
2.25.1


  reply	other threads:[~2021-05-14  6:57 UTC|newest]

Thread overview: 11+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-04-30  6:32 Xionghu Luo
2021-05-13  1:18 ` *Ping*: " Xionghu Luo
2021-05-13 10:49 ` Segher Boessenkool
2021-05-14  6:57   ` Xionghu Luo [this message]
2021-06-07  2:15     ` Ping: " Xionghu Luo
2021-06-30  1:42     ` Xionghu Luo
2021-09-06  0:52       ` Ping ^ 2: " Xionghu Luo
2021-09-15  7:50         ` Ping ^ 3: " Xionghu Luo
2021-09-15 13:11           ` David Edelsohn
2021-09-17  5:43             ` Xionghu Luo
2021-09-15 14:14     ` Segher Boessenkool

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=973f14d9-c626-89b4-753f-e5c613eef6f2@linux.ibm.com \
    --to=luoxhu@linux.ibm.com \
    --cc=dje.gcc@gmail.com \
    --cc=gcc-patches@gcc.gnu.org \
    --cc=guojiufu@linux.ibm.com \
    --cc=linkw@gcc.gnu.org \
    --cc=segher@kernel.crashing.org \
    --cc=wschmidt@linux.ibm.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).