public inbox for gcc-patches@gcc.gnu.org
* [PATCH] rs6000: Fix wrong code generation for vec_sel [PR94613]
@ 2021-04-30  6:32 Xionghu Luo
  2021-05-13  1:18 ` *Ping*: " Xionghu Luo
  2021-05-13 10:49 ` Segher Boessenkool
  0 siblings, 2 replies; 11+ messages in thread
From: Xionghu Luo @ 2021-04-30  6:32 UTC (permalink / raw)
  To: gcc-patches; +Cc: segher, dje.gcc, wschmidt, guojiufu, linkw, Xionghu Luo

The vsel instruction is a bit-wise select instruction.  Using an
IF_THEN_ELSE to express it in RTL is wrong and leads to wrong code
being generated in the combine pass.  Per-element selection is a
subset of bit-wise selection, so with this patch the pattern is
written using bit operations.  But there are 8 different ways
to write "op0 := (op1 & ~op3) | (op2 & op3)":

(~op3&op1) | (op3&op2),
(~op3&op1) | (op2&op3),
(op3&op2) | (~op3&op1),
(op2&op3) | (~op3&op1),
(op1&~op3) | (op3&op2),
(op1&~op3) | (op2&op3),
(op3&op2) | (op1&~op3),
(op2&op3) | (op1&~op3),

The combine pass will swap (op1&~op3) to (~op3&op1) due to commutative
canonicalization, which reduces these to the first 4 forms, but it
won't swap (op2&op3) | (~op3&op1) to (~op3&op1) | (op2&op3), so this
patch handles them with two patterns whose NOT of op3 is in different
positions, checking operand equality inside the insn condition.

Testing passed on Power8 LE.  Any comments?

gcc/ChangeLog:

	* config/rs6000/altivec.md (*altivec_vsel<mode>): Change to ...
	(altivec_vsel<mode>): ... this and update define.
	(*altivec_vsel<mode>_uns): Delete.
	(altivec_vsel<mode>2): New define_insn.
	* config/rs6000/rs6000-call.c (altivec_expand_vec_sel_builtin):
	New.
	(altivec_expand_builtin): Call altivec_expand_vec_sel_builtin to
	expand vec_sel.
	* config/rs6000/rs6000.c (rs6000_emit_vector_cond_expr): Use
	bit-wise selection instead of per element.
	* config/rs6000/vector.md (vector_select_<mode>,
	vector_select_<mode>_uns): Rewrite using bit-wise operations.
	* config/rs6000/vsx.md (*vsx_xxsel<mode>): Change to ...
	(vsx_xxsel<mode>): ... this and update define.
	(*vsx_xxsel<mode>_uns): Delete.
	(vsx_xxsel<mode>2): New define_insn.

gcc/testsuite/ChangeLog:

	* gcc.target/powerpc/pr94613.c: New test.
---
 gcc/config/rs6000/altivec.md               | 50 ++++++++-----
 gcc/config/rs6000/rs6000-call.c            | 81 +++++++++++++++++++---
 gcc/config/rs6000/rs6000.c                 | 19 +++--
 gcc/config/rs6000/vector.md                | 26 ++++---
 gcc/config/rs6000/vsx.md                   | 56 ++++++++++-----
 gcc/testsuite/gcc.target/powerpc/pr94613.c | 38 ++++++++++
 6 files changed, 201 insertions(+), 69 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/powerpc/pr94613.c

diff --git a/gcc/config/rs6000/altivec.md b/gcc/config/rs6000/altivec.md
index 1351dafbc41..f874f975519 100644
--- a/gcc/config/rs6000/altivec.md
+++ b/gcc/config/rs6000/altivec.md
@@ -667,26 +667,44 @@ (define_insn "*altivec_gev4sf"
   "vcmpgefp %0,%1,%2"
   [(set_attr "type" "veccmp")])
 
-(define_insn "*altivec_vsel<mode>"
+(define_insn "altivec_vsel<mode>"
   [(set (match_operand:VM 0 "altivec_register_operand" "=v")
-	(if_then_else:VM
-	 (ne:CC (match_operand:VM 1 "altivec_register_operand" "v")
-		(match_operand:VM 4 "zero_constant" ""))
-	 (match_operand:VM 2 "altivec_register_operand" "v")
-	 (match_operand:VM 3 "altivec_register_operand" "v")))]
-  "VECTOR_MEM_ALTIVEC_P (<MODE>mode)"
-  "vsel %0,%3,%2,%1"
+	(ior:VM
+	 (and:VM
+	  (not:VM (match_operand:VM 3 "altivec_register_operand" "v"))
+	  (match_operand:VM 1 "altivec_register_operand" "v"))
+	 (and:VM
+	  (match_operand:VM 2 "altivec_register_operand" "v")
+	  (match_operand:VM 4 "altivec_register_operand" "v"))))]
+  "VECTOR_UNIT_ALTIVEC_OR_VSX_P (<MODE>mode)
+  && (rtx_equal_p (operands[2], operands[3])
+  || rtx_equal_p (operands[4], operands[3]))"
+  {
+    if (rtx_equal_p (operands[2], operands[3]))
+      return "vsel %0,%1,%4,%3";
+    else
+      return "vsel %0,%1,%2,%3";
+  }
   [(set_attr "type" "vecmove")])
 
-(define_insn "*altivec_vsel<mode>_uns"
+(define_insn "altivec_vsel<mode>2"
   [(set (match_operand:VM 0 "altivec_register_operand" "=v")
-	(if_then_else:VM
-	 (ne:CCUNS (match_operand:VM 1 "altivec_register_operand" "v")
-		   (match_operand:VM 4 "zero_constant" ""))
-	 (match_operand:VM 2 "altivec_register_operand" "v")
-	 (match_operand:VM 3 "altivec_register_operand" "v")))]
-  "VECTOR_MEM_ALTIVEC_P (<MODE>mode)"
-  "vsel %0,%3,%2,%1"
+	(ior:VM
+	 (and:VM
+	  (match_operand:VM 1 "altivec_register_operand" "v")
+	  (match_operand:VM 2 "altivec_register_operand" "v"))
+	 (and:VM
+	  (not:VM (match_operand:VM 3 "altivec_register_operand" "v"))
+	  (match_operand:VM 4 "altivec_register_operand" "v"))))]
+  "VECTOR_UNIT_ALTIVEC_OR_VSX_P (<MODE>mode)
+  && (rtx_equal_p (operands[1], operands[3])
+  || rtx_equal_p (operands[2], operands[3]))"
+  {
+    if (rtx_equal_p (operands[1], operands[3]))
+      return "vsel %0,%4,%2,%3";
+    else
+      return "vsel %0,%4,%1,%3";
+ }
   [(set_attr "type" "vecmove")])
 
 ;; Fused multiply add.
diff --git a/gcc/config/rs6000/rs6000-call.c b/gcc/config/rs6000/rs6000-call.c
index f5676255387..d65bdc01055 100644
--- a/gcc/config/rs6000/rs6000-call.c
+++ b/gcc/config/rs6000/rs6000-call.c
@@ -3362,11 +3362,11 @@ const struct altivec_builtin_types altivec_overloaded_builtins[] = {
     RS6000_BTI_V2DI, RS6000_BTI_V2DI, RS6000_BTI_V2DI, RS6000_BTI_unsigned_V2DI },
   { ALTIVEC_BUILTIN_VEC_SEL, ALTIVEC_BUILTIN_VSEL_2DI,
     RS6000_BTI_V2DI, RS6000_BTI_V2DI, RS6000_BTI_V2DI, RS6000_BTI_V2DI },
-  { ALTIVEC_BUILTIN_VEC_SEL, ALTIVEC_BUILTIN_VSEL_2DI,
+  { ALTIVEC_BUILTIN_VEC_SEL, ALTIVEC_BUILTIN_VSEL_2DI_UNS,
     RS6000_BTI_unsigned_V2DI, RS6000_BTI_unsigned_V2DI, RS6000_BTI_unsigned_V2DI, RS6000_BTI_bool_V2DI },
-  { ALTIVEC_BUILTIN_VEC_SEL, ALTIVEC_BUILTIN_VSEL_2DI,
+  { ALTIVEC_BUILTIN_VEC_SEL, ALTIVEC_BUILTIN_VSEL_2DI_UNS,
     RS6000_BTI_unsigned_V2DI, RS6000_BTI_unsigned_V2DI, RS6000_BTI_unsigned_V2DI, RS6000_BTI_unsigned_V2DI },
-  { ALTIVEC_BUILTIN_VEC_SEL, ALTIVEC_BUILTIN_VSEL_2DI,
+  { ALTIVEC_BUILTIN_VEC_SEL, ALTIVEC_BUILTIN_VSEL_2DI_UNS,
     RS6000_BTI_unsigned_V2DI, RS6000_BTI_unsigned_V2DI, RS6000_BTI_unsigned_V2DI, RS6000_BTI_V2DI },
   { ALTIVEC_BUILTIN_VEC_SEL, ALTIVEC_BUILTIN_VSEL_2DI,
     RS6000_BTI_bool_V2DI, RS6000_BTI_bool_V2DI, RS6000_BTI_bool_V2DI,
@@ -3386,9 +3386,9 @@ const struct altivec_builtin_types altivec_overloaded_builtins[] = {
     RS6000_BTI_V4SI, RS6000_BTI_V4SI, RS6000_BTI_V4SI, RS6000_BTI_bool_V4SI },
   { ALTIVEC_BUILTIN_VEC_SEL, ALTIVEC_BUILTIN_VSEL_4SI,
     RS6000_BTI_V4SI, RS6000_BTI_V4SI, RS6000_BTI_V4SI, RS6000_BTI_unsigned_V4SI },
-  { ALTIVEC_BUILTIN_VEC_SEL, ALTIVEC_BUILTIN_VSEL_4SI,
+  { ALTIVEC_BUILTIN_VEC_SEL, ALTIVEC_BUILTIN_VSEL_4SI_UNS,
     RS6000_BTI_unsigned_V4SI, RS6000_BTI_unsigned_V4SI, RS6000_BTI_unsigned_V4SI, RS6000_BTI_bool_V4SI },
-  { ALTIVEC_BUILTIN_VEC_SEL, ALTIVEC_BUILTIN_VSEL_4SI,
+  { ALTIVEC_BUILTIN_VEC_SEL, ALTIVEC_BUILTIN_VSEL_4SI_UNS,
     RS6000_BTI_unsigned_V4SI, RS6000_BTI_unsigned_V4SI, RS6000_BTI_unsigned_V4SI, RS6000_BTI_unsigned_V4SI },
   { ALTIVEC_BUILTIN_VEC_SEL, ALTIVEC_BUILTIN_VSEL_4SI,
     RS6000_BTI_bool_V4SI, RS6000_BTI_bool_V4SI, RS6000_BTI_bool_V4SI, RS6000_BTI_bool_V4SI },
@@ -3398,9 +3398,9 @@ const struct altivec_builtin_types altivec_overloaded_builtins[] = {
     RS6000_BTI_V8HI, RS6000_BTI_V8HI, RS6000_BTI_V8HI, RS6000_BTI_bool_V8HI },
   { ALTIVEC_BUILTIN_VEC_SEL, ALTIVEC_BUILTIN_VSEL_8HI,
     RS6000_BTI_V8HI, RS6000_BTI_V8HI, RS6000_BTI_V8HI, RS6000_BTI_unsigned_V8HI },
-  { ALTIVEC_BUILTIN_VEC_SEL, ALTIVEC_BUILTIN_VSEL_8HI,
+  { ALTIVEC_BUILTIN_VEC_SEL, ALTIVEC_BUILTIN_VSEL_8HI_UNS,
     RS6000_BTI_unsigned_V8HI, RS6000_BTI_unsigned_V8HI, RS6000_BTI_unsigned_V8HI, RS6000_BTI_bool_V8HI },
-  { ALTIVEC_BUILTIN_VEC_SEL, ALTIVEC_BUILTIN_VSEL_8HI,
+  { ALTIVEC_BUILTIN_VEC_SEL, ALTIVEC_BUILTIN_VSEL_8HI_UNS,
     RS6000_BTI_unsigned_V8HI, RS6000_BTI_unsigned_V8HI, RS6000_BTI_unsigned_V8HI, RS6000_BTI_unsigned_V8HI },
   { ALTIVEC_BUILTIN_VEC_SEL, ALTIVEC_BUILTIN_VSEL_8HI,
     RS6000_BTI_bool_V8HI, RS6000_BTI_bool_V8HI, RS6000_BTI_bool_V8HI, RS6000_BTI_bool_V8HI },
@@ -3410,9 +3410,9 @@ const struct altivec_builtin_types altivec_overloaded_builtins[] = {
     RS6000_BTI_V16QI, RS6000_BTI_V16QI, RS6000_BTI_V16QI, RS6000_BTI_bool_V16QI },
   { ALTIVEC_BUILTIN_VEC_SEL, ALTIVEC_BUILTIN_VSEL_16QI,
     RS6000_BTI_V16QI, RS6000_BTI_V16QI, RS6000_BTI_V16QI, RS6000_BTI_unsigned_V16QI },
-  { ALTIVEC_BUILTIN_VEC_SEL, ALTIVEC_BUILTIN_VSEL_16QI,
+  { ALTIVEC_BUILTIN_VEC_SEL, ALTIVEC_BUILTIN_VSEL_16QI_UNS,
     RS6000_BTI_unsigned_V16QI, RS6000_BTI_unsigned_V16QI, RS6000_BTI_unsigned_V16QI, RS6000_BTI_bool_V16QI },
-  { ALTIVEC_BUILTIN_VEC_SEL, ALTIVEC_BUILTIN_VSEL_16QI,
+  { ALTIVEC_BUILTIN_VEC_SEL, ALTIVEC_BUILTIN_VSEL_16QI_UNS,
     RS6000_BTI_unsigned_V16QI, RS6000_BTI_unsigned_V16QI, RS6000_BTI_unsigned_V16QI, RS6000_BTI_unsigned_V16QI },
   { ALTIVEC_BUILTIN_VEC_SEL, ALTIVEC_BUILTIN_VSEL_16QI,
     RS6000_BTI_bool_V16QI, RS6000_BTI_bool_V16QI, RS6000_BTI_bool_V16QI, RS6000_BTI_bool_V16QI },
@@ -10996,6 +10996,46 @@ altivec_expand_vec_ext_builtin (tree exp, rtx target)
   return target;
 }
 
+/* Expand vec_sel builtin.  */
+static rtx
+altivec_expand_vec_sel_builtin (enum insn_code icode, tree exp, rtx target)
+{
+  rtx op0, op1, op2, pat;
+  tree arg0, arg1, arg2;
+
+  arg0 = CALL_EXPR_ARG (exp, 0);
+  op0 = expand_normal (arg0);
+  arg1 = CALL_EXPR_ARG (exp, 1);
+  op1 = expand_normal (arg1);
+  arg2 = CALL_EXPR_ARG (exp, 2);
+  op2 = expand_normal (arg2);
+
+  machine_mode tmode = insn_data[icode].operand[0].mode;
+  machine_mode mode0 = insn_data[icode].operand[1].mode;
+  machine_mode mode1 = insn_data[icode].operand[2].mode;
+  machine_mode mode2 = insn_data[icode].operand[3].mode;
+
+  if (target == 0
+      || GET_MODE (target) != tmode
+      || ! (*insn_data[icode].operand[0].predicate) (target, tmode))
+    target = gen_reg_rtx (tmode);
+
+  if (! (*insn_data[icode].operand[1].predicate) (op0, mode0))
+    op0 = copy_to_mode_reg (mode0, op0);
+  if (! (*insn_data[icode].operand[2].predicate) (op1, mode1))
+    op1 = copy_to_mode_reg (mode1, op1);
+  if (! (*insn_data[icode].operand[3].predicate) (op2, mode2))
+    op2 = copy_to_mode_reg (mode2, op2);
+
+  pat = GEN_FCN (icode) (target, op0, op1, op2, op2);
+  if (pat)
+    emit_insn (pat);
+  else
+    return NULL_RTX;
+
+  return target;
+}
+
 /* Expand the builtin in EXP and store the result in TARGET.  Store
    true in *EXPANDEDP if we found a builtin to expand.  */
 static rtx
@@ -11181,6 +11221,29 @@ altivec_expand_builtin (tree exp, rtx target, bool *expandedp)
 	emit_insn (pat);
       return NULL_RTX;
 
+    case ALTIVEC_BUILTIN_VSEL_2DF:
+      return altivec_expand_vec_sel_builtin (CODE_FOR_altivec_vselv2df, exp,
+					     target);
+    case ALTIVEC_BUILTIN_VSEL_2DI:
+    case ALTIVEC_BUILTIN_VSEL_2DI_UNS:
+      return altivec_expand_vec_sel_builtin (CODE_FOR_altivec_vselv2di, exp,
+					     target);
+    case ALTIVEC_BUILTIN_VSEL_4SF:
+      return altivec_expand_vec_sel_builtin (CODE_FOR_altivec_vselv4sf, exp,
+					     target);
+    case ALTIVEC_BUILTIN_VSEL_4SI:
+    case ALTIVEC_BUILTIN_VSEL_4SI_UNS:
+      return altivec_expand_vec_sel_builtin (CODE_FOR_altivec_vselv4si, exp,
+					     target);
+    case ALTIVEC_BUILTIN_VSEL_8HI:
+    case ALTIVEC_BUILTIN_VSEL_8HI_UNS:
+      return altivec_expand_vec_sel_builtin (CODE_FOR_altivec_vselv8hi, exp,
+					     target);
+    case ALTIVEC_BUILTIN_VSEL_16QI:
+    case ALTIVEC_BUILTIN_VSEL_16QI_UNS:
+      return altivec_expand_vec_sel_builtin (CODE_FOR_altivec_vselv16qi, exp,
+					     target);
+
     case ALTIVEC_BUILTIN_DSSALL:
       emit_insn (gen_altivec_dssall ());
       return NULL_RTX;
diff --git a/gcc/config/rs6000/rs6000.c b/gcc/config/rs6000/rs6000.c
index befab53031c..329c6e7e5fb 100644
--- a/gcc/config/rs6000/rs6000.c
+++ b/gcc/config/rs6000/rs6000.c
@@ -15566,9 +15566,7 @@ rs6000_emit_vector_cond_expr (rtx dest, rtx op_true, rtx op_false,
   machine_mode dest_mode = GET_MODE (dest);
   machine_mode mask_mode = GET_MODE (cc_op0);
   enum rtx_code rcode = GET_CODE (cond);
-  machine_mode cc_mode = CCmode;
   rtx mask;
-  rtx cond2;
   bool invert_move = false;
 
   if (VECTOR_UNIT_NONE_P (dest_mode))
@@ -15608,8 +15606,6 @@ rs6000_emit_vector_cond_expr (rtx dest, rtx op_true, rtx op_false,
     case GEU:
     case LTU:
     case LEU:
-      /* Mark unsigned tests with CCUNSmode.  */
-      cc_mode = CCUNSmode;
 
       /* Invert condition to avoid compound test if necessary.  */
       if (rcode == GEU || rcode == LEU)
@@ -15629,6 +15625,9 @@ rs6000_emit_vector_cond_expr (rtx dest, rtx op_true, rtx op_false,
   if (!mask)
     return 0;
 
+  if (mask_mode != dest_mode)
+      mask = simplify_gen_subreg (dest_mode, mask, mask_mode, 0);
+
   if (invert_move)
     std::swap (op_true, op_false);
 
@@ -15668,13 +15667,11 @@ rs6000_emit_vector_cond_expr (rtx dest, rtx op_true, rtx op_false,
   if (!REG_P (op_false) && !SUBREG_P (op_false))
     op_false = force_reg (dest_mode, op_false);
 
-  cond2 = gen_rtx_fmt_ee (NE, cc_mode, gen_lowpart (dest_mode, mask),
-			  CONST0_RTX (dest_mode));
-  emit_insn (gen_rtx_SET (dest,
-			  gen_rtx_IF_THEN_ELSE (dest_mode,
-						cond2,
-						op_true,
-						op_false)));
+  rtx tmp = gen_rtx_IOR (dest_mode,
+			 gen_rtx_AND (dest_mode, gen_rtx_NOT (dest_mode, mask),
+				      op_false),
+			 gen_rtx_AND (dest_mode, mask, op_true));
+  emit_insn (gen_rtx_SET (dest, tmp));
   return 1;
 }
 
diff --git a/gcc/config/rs6000/vector.md b/gcc/config/rs6000/vector.md
index 3446b03d40d..85ffcf6a763 100644
--- a/gcc/config/rs6000/vector.md
+++ b/gcc/config/rs6000/vector.md
@@ -862,23 +862,21 @@ (define_insn_and_split "vector_<code><mode>"
 ;; which is in the reverse order that we want
 (define_expand "vector_select_<mode>"
   [(set (match_operand:VEC_L 0 "vlogical_operand")
-	(if_then_else:VEC_L
-	 (ne:CC (match_operand:VEC_L 3 "vlogical_operand")
-		(match_dup 4))
-	 (match_operand:VEC_L 2 "vlogical_operand")
-	 (match_operand:VEC_L 1 "vlogical_operand")))]
-  "VECTOR_UNIT_ALTIVEC_OR_VSX_P (<MODE>mode)"
-  "operands[4] = CONST0_RTX (<MODE>mode);")
+    (ior:VEC_L
+     (and:VEC_L (not:VEC_L (match_operand:VEC_L 3 "vlogical_operand"))
+      (match_operand:VEC_L 1 "vlogical_operand"))
+     (and:VEC_L (match_dup 3)
+     (match_operand:VEC_L 2 "vlogical_operand"))))]
+  "VECTOR_UNIT_ALTIVEC_OR_VSX_P (<MODE>mode)")
 
 (define_expand "vector_select_<mode>_uns"
   [(set (match_operand:VEC_L 0 "vlogical_operand")
-	(if_then_else:VEC_L
-	 (ne:CCUNS (match_operand:VEC_L 3 "vlogical_operand")
-		   (match_dup 4))
-	 (match_operand:VEC_L 2 "vlogical_operand")
-	 (match_operand:VEC_L 1 "vlogical_operand")))]
-  "VECTOR_UNIT_ALTIVEC_OR_VSX_P (<MODE>mode)"
-  "operands[4] = CONST0_RTX (<MODE>mode);")
+    (ior:VEC_L
+     (and:VEC_L (not:VEC_L (match_operand:VEC_L 3 "vlogical_operand"))
+      (match_operand:VEC_L 1 "vlogical_operand"))
+     (and:VEC_L (match_dup 3)
+     (match_operand:VEC_L 2 "vlogical_operand"))))]
+  "VECTOR_UNIT_ALTIVEC_OR_VSX_P (<MODE>mode)")
 
 ;; Expansions that compare vectors producing a vector result and a predicate,
 ;; setting CR6 to indicate a combined status
diff --git a/gcc/config/rs6000/vsx.md b/gcc/config/rs6000/vsx.md
index bcb92be2f5c..e4817bd3060 100644
--- a/gcc/config/rs6000/vsx.md
+++ b/gcc/config/rs6000/vsx.md
@@ -2105,29 +2105,47 @@ (define_insn "*vsx_ge_<mode>_p"
   [(set_attr "type" "<VStype_simple>")])
 
 ;; Vector select
-(define_insn "*vsx_xxsel<mode>"
+(define_insn "vsx_xxsel<mode>"
   [(set (match_operand:VSX_L 0 "vsx_register_operand" "=<VSr>,?wa")
-	(if_then_else:VSX_L
-	 (ne:CC (match_operand:VSX_L 1 "vsx_register_operand" "<VSr>,wa")
-		(match_operand:VSX_L 4 "zero_constant" ""))
-	 (match_operand:VSX_L 2 "vsx_register_operand" "<VSr>,wa")
-	 (match_operand:VSX_L 3 "vsx_register_operand" "<VSr>,wa")))]
-  "VECTOR_MEM_VSX_P (<MODE>mode)"
-  "xxsel %x0,%x3,%x2,%x1"
+    (ior:VSX_L
+     (and:VSX_L
+      (not:VSX_L (match_operand:VSX_L 3 "vsx_register_operand" "<VSr>,wa"))
+      (match_operand:VSX_L 1 "vsx_register_operand" "<VSr>,wa"))
+     (and:VSX_L
+      (match_operand:VSX_L 2 "vsx_register_operand" "<VSr>,wa")
+      (match_operand:VSX_L 4 "vsx_register_operand" "<VSr>,wa"))))]
+  "VECTOR_MEM_VSX_P (<MODE>mode)
+  && (rtx_equal_p (operands[2], operands[3])
+  || rtx_equal_p (operands[4], operands[3]))"
+  {
+    if (rtx_equal_p (operands[2], operands[3]))
+      return "xxsel %x0,%x1,%x4,%x3";
+    else
+      return "xxsel %x0,%x1,%x2,%x3";
+  }
   [(set_attr "type" "vecmove")
    (set_attr "isa" "<VSisa>")])
 
-(define_insn "*vsx_xxsel<mode>_uns"
-  [(set (match_operand:VSX_L 0 "vsx_register_operand" "=<VSr>,?wa")
-	(if_then_else:VSX_L
-	 (ne:CCUNS (match_operand:VSX_L 1 "vsx_register_operand" "<VSr>,wa")
-		   (match_operand:VSX_L 4 "zero_constant" ""))
-	 (match_operand:VSX_L 2 "vsx_register_operand" "<VSr>,wa")
-	 (match_operand:VSX_L 3 "vsx_register_operand" "<VSr>,wa")))]
-  "VECTOR_MEM_VSX_P (<MODE>mode)"
-  "xxsel %x0,%x3,%x2,%x1"
-  [(set_attr "type" "vecmove")
-   (set_attr "isa" "<VSisa>")])
+(define_insn "vsx_xxsel<mode>2"
+ [(set (match_operand:VSX_L 0 "vsx_register_operand" "=<VSr>,?wa")
+   (ior:VSX_L
+    (and:VSX_L
+     (match_operand:VSX_L 1 "vsx_register_operand" "<VSr>,wa")
+     (match_operand:VSX_L 2 "vsx_register_operand" "<VSr>,wa"))
+    (and:VSX_L
+     (not:VSX_L (match_operand:VSX_L 3 "vsx_register_operand" "<VSr>,wa"))
+     (match_operand:VSX_L 4 "vsx_register_operand" "<VSr>,wa"))))]
+ "VECTOR_MEM_VSX_P (<MODE>mode)
+  && (rtx_equal_p (operands[1], operands[3])
+  || rtx_equal_p (operands[2], operands[3]))"
+  {
+    if (rtx_equal_p (operands[1], operands[3]))
+      return "xxsel %x0,%x4,%x2,%x3";
+    else
+      return "xxsel %x0,%x4,%x1,%x3";
+ }
+ [(set_attr "type" "vecmove")
+ (set_attr "isa" "<VSisa>")])
 
 ;; Copy sign
 (define_insn "vsx_copysign<mode>3"
diff --git a/gcc/testsuite/gcc.target/powerpc/pr94613.c b/gcc/testsuite/gcc.target/powerpc/pr94613.c
new file mode 100644
index 00000000000..06699088f9a
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/pr94613.c
@@ -0,0 +1,38 @@
+/* { dg-do run } */
+/* { dg-require-effective-target vmx_hw } */
+/* { dg-options "-O3 -maltivec" } */
+
+#include <altivec.h>
+
+/* The initial implementation of vec_sel used an IF_THEN_ELSE rtx.
+   This did NOT match what the vsel instruction does.  vsel is a
+   bit-wise operation.  Using IF_THEN_ELSE allowed the + operation to
+   be simplified away in combine.  An addition affects other bits in
+   the same element, so per-element simplifications are wrong for
+   vsel.  */
+vector unsigned char __attribute__((noinline))
+foo (vector unsigned char a, vector unsigned char b, vector unsigned char c)
+{
+  return vec_sel (a + b, c, a);
+}
+
+vector unsigned char __attribute__((noinline))
+bar (vector unsigned char a, vector unsigned char b, vector unsigned char c)
+{
+  return vec_sel (a | b, c, a);
+}
+
+int
+main ()
+{
+  vector unsigned char v = (vector unsigned char){ 1 };
+
+  if (foo (v, v, v)[0] != 3)
+      __builtin_abort ();
+
+  if (bar (v, v, v)[0] != 1)
+    __builtin_abort ();
+
+  return 0;
+}
+
-- 
2.25.1



* *Ping*: [PATCH] rs6000: Fix wrong code generation for vec_sel [PR94613]
  2021-04-30  6:32 [PATCH] rs6000: Fix wrong code generation for vec_sel [PR94613] Xionghu Luo
@ 2021-05-13  1:18 ` Xionghu Luo
  2021-05-13 10:49 ` Segher Boessenkool
  1 sibling, 0 replies; 11+ messages in thread
From: Xionghu Luo @ 2021-05-13  1:18 UTC (permalink / raw)
  To: gcc-patches; +Cc: segher, dje.gcc, wschmidt, guojiufu, linkw



> -  "operands[4] = CONST0_RTX (<MODE>mode);")
> +    (ior:VEC_L
> +     (and:VEC_L (not:VEC_L (match_operand:VEC_L 3 "vlogical_operand"))
> +      (match_operand:VEC_L 1 "vlogical_operand"))
> +     (and:VEC_L (match_dup 3)
> +     (match_operand:VEC_L 2 "vlogical_operand"))))]
> +  "VECTOR_UNIT_ALTIVEC_OR_VSX_P (<MODE>mode)")
>   
>   (define_expand "vector_select_<mode>_uns"
>     [(set (match_operand:VEC_L 0 "vlogical_operand")
> -	(if_then_else:VEC_L
> -	 (ne:CCUNS (match_operand:VEC_L 3 "vlogical_operand")
> -		   (match_dup 4))
> -	 (match_operand:VEC_L 2 "vlogical_operand")
> -	 (match_operand:VEC_L 1 "vlogical_operand")))]
> -  "VECTOR_UNIT_ALTIVEC_OR_VSX_P (<MODE>mode)"
> -  "operands[4] = CONST0_RTX (<MODE>mode);")
> +    (ior:VEC_L
> +     (and:VEC_L (not:VEC_L (match_operand:VEC_L 3 "vlogical_operand"))
> +      (match_operand:VEC_L 1 "vlogical_operand"))
> +     (and:VEC_L (match_dup 3)
> +     (match_operand:VEC_L 2 "vlogical_operand"))))]
> +  "VECTOR_UNIT_ALTIVEC_OR_VSX_P (<MODE>mode)")
>   
>   ;; Expansions that compare vectors producing a vector result and a predicate,
>   ;; setting CR6 to indicate a combined status
> diff --git a/gcc/config/rs6000/vsx.md b/gcc/config/rs6000/vsx.md
> index bcb92be2f5c..e4817bd3060 100644
> --- a/gcc/config/rs6000/vsx.md
> +++ b/gcc/config/rs6000/vsx.md
> @@ -2105,29 +2105,47 @@ (define_insn "*vsx_ge_<mode>_p"
>     [(set_attr "type" "<VStype_simple>")])
>   
>   ;; Vector select
> -(define_insn "*vsx_xxsel<mode>"
> +(define_insn "vsx_xxsel<mode>"
>     [(set (match_operand:VSX_L 0 "vsx_register_operand" "=<VSr>,?wa")
> -	(if_then_else:VSX_L
> -	 (ne:CC (match_operand:VSX_L 1 "vsx_register_operand" "<VSr>,wa")
> -		(match_operand:VSX_L 4 "zero_constant" ""))
> -	 (match_operand:VSX_L 2 "vsx_register_operand" "<VSr>,wa")
> -	 (match_operand:VSX_L 3 "vsx_register_operand" "<VSr>,wa")))]
> -  "VECTOR_MEM_VSX_P (<MODE>mode)"
> -  "xxsel %x0,%x3,%x2,%x1"
> +    (ior:VSX_L
> +     (and:VSX_L
> +      (not:VSX_L (match_operand:VSX_L 3 "vsx_register_operand" "<VSr>,wa"))
> +      (match_operand:VSX_L 1 "vsx_register_operand" "<VSr>,wa"))
> +     (and:VSX_L
> +      (match_operand:VSX_L 2 "vsx_register_operand" "<VSr>,wa")
> +      (match_operand:VSX_L 4 "vsx_register_operand" "<VSr>,wa"))))]
> +  "VECTOR_MEM_VSX_P (<MODE>mode)
> +  && (rtx_equal_p (operands[2], operands[3])
> +  || rtx_equal_p (operands[4], operands[3]))"
> +  {
> +    if (rtx_equal_p (operands[2], operands[3]))
> +      return "xxsel %x0,%x1,%x4,%x3";
> +    else
> +      return "xxsel %x0,%x1,%x2,%x3";
> +  }
>     [(set_attr "type" "vecmove")
>      (set_attr "isa" "<VSisa>")])
>   
> -(define_insn "*vsx_xxsel<mode>_uns"
> -  [(set (match_operand:VSX_L 0 "vsx_register_operand" "=<VSr>,?wa")
> -	(if_then_else:VSX_L
> -	 (ne:CCUNS (match_operand:VSX_L 1 "vsx_register_operand" "<VSr>,wa")
> -		   (match_operand:VSX_L 4 "zero_constant" ""))
> -	 (match_operand:VSX_L 2 "vsx_register_operand" "<VSr>,wa")
> -	 (match_operand:VSX_L 3 "vsx_register_operand" "<VSr>,wa")))]
> -  "VECTOR_MEM_VSX_P (<MODE>mode)"
> -  "xxsel %x0,%x3,%x2,%x1"
> -  [(set_attr "type" "vecmove")
> -   (set_attr "isa" "<VSisa>")])
> +(define_insn "vsx_xxsel<mode>2"
> + [(set (match_operand:VSX_L 0 "vsx_register_operand" "=<VSr>,?wa")
> +   (ior:VSX_L
> +    (and:VSX_L
> +     (match_operand:VSX_L 1 "vsx_register_operand" "<VSr>,wa")
> +     (match_operand:VSX_L 2 "vsx_register_operand" "<VSr>,wa"))
> +    (and:VSX_L
> +     (not:VSX_L (match_operand:VSX_L 3 "vsx_register_operand" "<VSr>,wa"))
> +     (match_operand:VSX_L 4 "vsx_register_operand" "<VSr>,wa"))))]
> + "VECTOR_MEM_VSX_P (<MODE>mode)
> +  && (rtx_equal_p (operands[1], operands[3])
> +  || rtx_equal_p (operands[2], operands[3]))"
> +  {
> +    if (rtx_equal_p (operands[1], operands[3]))
> +      return "xxsel %x0,%x4,%x2,%x3";
> +    else
> +      return "xxsel %x0,%x4,%x1,%x3";
> + }
> + [(set_attr "type" "vecmove")
> + (set_attr "isa" "<VSisa>")])
>   
>   ;; Copy sign
>   (define_insn "vsx_copysign<mode>3"
> diff --git a/gcc/testsuite/gcc.target/powerpc/pr94613.c b/gcc/testsuite/gcc.target/powerpc/pr94613.c
> new file mode 100644
> index 00000000000..06699088f9a
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/powerpc/pr94613.c
> @@ -0,0 +1,38 @@
> +/* { dg-do run } */
> +/* { dg-require-effective-target vmx_hw } */
> +/* { dg-options "-O3 -maltivec" } */
> +
> +#include <altivec.h>
> +
> +/* The initial implementation of vec_sel used an IF_THEN_ELSE rtx.
> +   This did NOT match what the vsel instruction does.  vsel is a
> +   bit-wise operation.  Using IF_THEN_ELSE caused the + operation to be
> +   simplified away in combine.  A plus operation affects other bits in
> +   the same element.  Hence per-element simplifications are wrong for
> +   vsel.  */
> +vector unsigned char __attribute__((noinline))
> +foo (vector unsigned char a, vector unsigned char b, vector unsigned char c)
> +{
> +  return vec_sel (a + b, c, a);
> +}
> +
> +vector unsigned char __attribute__((noinline))
> +bar (vector unsigned char a, vector unsigned char b, vector unsigned char c)
> +{
> +  return vec_sel (a | b, c, a);
> +}
> +
> +int
> +main ()
> +{
> +  vector unsigned char v = (vector unsigned char){ 1 };
> +
> +  if (foo (v, v, v)[0] != 3)
> +      __builtin_abort ();
> +
> +  if (bar (v, v, v)[0] != 1)
> +    __builtin_abort ();
> +
> +  return 0;
> +}
> +
> 

-- 
Thanks,
Xionghu

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH] rs6000: Fix wrong code generation for vec_sel [PR94613]
  2021-04-30  6:32 [PATCH] rs6000: Fix wrong code generation for vec_sel [PR94613] Xionghu Luo
  2021-05-13  1:18 ` *Ping*: " Xionghu Luo
@ 2021-05-13 10:49 ` Segher Boessenkool
  2021-05-14  6:57   ` Xionghu Luo
  1 sibling, 1 reply; 11+ messages in thread
From: Segher Boessenkool @ 2021-05-13 10:49 UTC (permalink / raw)
  To: Xionghu Luo; +Cc: gcc-patches, dje.gcc, wschmidt, guojiufu, linkw

Hi!

On Fri, Apr 30, 2021 at 01:32:58AM -0500, Xionghu Luo wrote:
> The vsel instruction is a bit-wise select instruction.  Using an
> IF_THEN_ELSE to express it in RTL is wrong and leads to wrong code
> being generated in the combine pass.  Per-element selection is a
> subset of bit-wise selection; with this patch the pattern is
> written using bit operations.  But there are 8 different patterns
> to define "op0 := (op1 & ~op3) | (op2 & op3)":
> 
> (~op3&op1) | (op3&op2),
> (~op3&op1) | (op2&op3),
> (op3&op2) | (~op3&op1),
> (op2&op3) | (~op3&op1),
> (op1&~op3) | (op3&op2),
> (op1&~op3) | (op2&op3),
> (op3&op2) | (op1&~op3),
> (op2&op3) | (op1&~op3),
> 
> The combine pass will swap (op1&~op3) to (~op3&op1) due to commutative
> canonicalization, which reduces the list to the FIRST 4 patterns, but it
> won't swap (op2&op3) | (~op3&op1) to (~op3&op1) | (op2&op3), so this
> patch handles that case with two patterns that place the NOT of op3
> differently and check operand equality inside them.

Yup, that latter case does not have canonicalisation rules.  Btw, not
only combine does this canonicalisation: everything should;
non-canonical RTL is invalid RTL (in the instruction stream, you can do
everything in temporary code of course, as long as the RTL isn't
malformed).

> -(define_insn "*altivec_vsel<mode>"
> +(define_insn "altivec_vsel<mode>"
>    [(set (match_operand:VM 0 "altivec_register_operand" "=v")
> -	(if_then_else:VM
> -	 (ne:CC (match_operand:VM 1 "altivec_register_operand" "v")
> -		(match_operand:VM 4 "zero_constant" ""))
> -	 (match_operand:VM 2 "altivec_register_operand" "v")
> -	 (match_operand:VM 3 "altivec_register_operand" "v")))]
> -  "VECTOR_MEM_ALTIVEC_P (<MODE>mode)"
> -  "vsel %0,%3,%2,%1"
> +	(ior:VM
> +	 (and:VM
> +	  (not:VM (match_operand:VM 3 "altivec_register_operand" "v"))
> +	  (match_operand:VM 1 "altivec_register_operand" "v"))
> +	 (and:VM
> +	  (match_operand:VM 2 "altivec_register_operand" "v")
> +	  (match_operand:VM 4 "altivec_register_operand" "v"))))]
> +  "VECTOR_UNIT_ALTIVEC_OR_VSX_P (<MODE>mode)
> +  && (rtx_equal_p (operands[2], operands[3])
> +  || rtx_equal_p (operands[4], operands[3]))"
> +  {
> +    if (rtx_equal_p (operands[2], operands[3]))
> +      return "vsel %0,%1,%4,%3";
> +    else
> +      return "vsel %0,%1,%2,%3";
> +  }
>    [(set_attr "type" "vecmove")])

That rtx_equal_p stuff is nice and tricky, but it is a bit too tricky I
think.  So please write this as two patterns (and keep the expand if
that helps).

> +(define_insn "altivec_vsel<mode>2"

(same here of course).

>  ;; Fused multiply add.
> diff --git a/gcc/config/rs6000/rs6000-call.c b/gcc/config/rs6000/rs6000-call.c
> index f5676255387..d65bdc01055 100644
> --- a/gcc/config/rs6000/rs6000-call.c
> +++ b/gcc/config/rs6000/rs6000-call.c
> @@ -3362,11 +3362,11 @@ const struct altivec_builtin_types altivec_overloaded_builtins[] = {
>      RS6000_BTI_V2DI, RS6000_BTI_V2DI, RS6000_BTI_V2DI, RS6000_BTI_unsigned_V2DI },
>    { ALTIVEC_BUILTIN_VEC_SEL, ALTIVEC_BUILTIN_VSEL_2DI,
>      RS6000_BTI_V2DI, RS6000_BTI_V2DI, RS6000_BTI_V2DI, RS6000_BTI_V2DI },
> -  { ALTIVEC_BUILTIN_VEC_SEL, ALTIVEC_BUILTIN_VSEL_2DI,
> +  { ALTIVEC_BUILTIN_VEC_SEL, ALTIVEC_BUILTIN_VSEL_2DI_UNS,

Are the _uns things still used for anything?  But, let's not change
this until Bill's stuff is in :-)

Why do you want to change this here, btw?  I don't understand.

> +  if (target == 0
> +      || GET_MODE (target) != tmode
> +      || ! (*insn_data[icode].operand[0].predicate) (target, tmode))

No space after ! and other unary operators (except for casts and other
operators you write with alphanumerics, like "sizeof").  I know you
copied this code, but :-)

> @@ -15608,8 +15606,6 @@ rs6000_emit_vector_cond_expr (rtx dest, rtx op_true, rtx op_false,
>      case GEU:
>      case LTU:
>      case LEU:
> -      /* Mark unsigned tests with CCUNSmode.  */
> -      cc_mode = CCUNSmode;
>  
>        /* Invert condition to avoid compound test if necessary.  */
>        if (rcode == GEU || rcode == LEU)

So this is related to the _uns thing.  Could you split off that change?
Probably as an earlier patch (but either works for me).

> @@ -15629,6 +15625,9 @@ rs6000_emit_vector_cond_expr (rtx dest, rtx op_true, rtx op_false,
>    if (!mask)
>      return 0;
>  
> +  if (mask_mode != dest_mode)
> +      mask = simplify_gen_subreg (dest_mode, mask, mask_mode, 0);

Indent just two characters please: line continuations (usually) align,
but indents do not.


Can you fold vsel and xxsel together completely?  They have exactly the
same semantics!  This does not have to be in this patch of course.

Thanks,


Segher


* Re: [PATCH] rs6000: Fix wrong code generation for vec_sel [PR94613]
  2021-05-13 10:49 ` Segher Boessenkool
@ 2021-05-14  6:57   ` Xionghu Luo
  2021-06-07  2:15     ` Ping: " Xionghu Luo
                       ` (2 more replies)
  0 siblings, 3 replies; 11+ messages in thread
From: Xionghu Luo @ 2021-05-14  6:57 UTC (permalink / raw)
  To: Segher Boessenkool; +Cc: gcc-patches, dje.gcc, wschmidt, guojiufu, linkw

[-- Attachment #1: Type: text/plain, Size: 6436 bytes --]

Hi,

On 2021/5/13 18:49, Segher Boessenkool wrote:
> Hi!
> 
> On Fri, Apr 30, 2021 at 01:32:58AM -0500, Xionghu Luo wrote:
>> The vsel instruction is a bit-wise select instruction.  Using an
>> IF_THEN_ELSE to express it in RTL is wrong and leads to wrong code
>> being generated in the combine pass.  Per-element selection is a
>> subset of bit-wise selection; with this patch the pattern is
>> written using bit operations.  But there are 8 different patterns
>> to define "op0 := (op1 & ~op3) | (op2 & op3)":
>>
>> (~op3&op1) | (op3&op2),
>> (~op3&op1) | (op2&op3),
>> (op3&op2) | (~op3&op1),
>> (op2&op3) | (~op3&op1),
>> (op1&~op3) | (op3&op2),
>> (op1&~op3) | (op2&op3),
>> (op3&op2) | (op1&~op3),
>> (op2&op3) | (op1&~op3),
>>
>> The combine pass will swap (op1&~op3) to (~op3&op1) due to commutative
>> canonicalization, which reduces the list to the FIRST 4 patterns, but it
>> won't swap (op2&op3) | (~op3&op1) to (~op3&op1) | (op2&op3), so this
>> patch handles that case with two patterns that place the NOT of op3
>> differently and check operand equality inside them.
> 
> Yup, that latter case does not have canonicalisation rules.  Btw, not
> only combine does this canonicalisation: everything should;
> non-canonical RTL is invalid RTL (in the instruction stream, you can do
> everything in temporary code of course, as long as the RTL isn't
> malformed).
> 
>> -(define_insn "*altivec_vsel<mode>"
>> +(define_insn "altivec_vsel<mode>"
>>     [(set (match_operand:VM 0 "altivec_register_operand" "=v")
>> -	(if_then_else:VM
>> -	 (ne:CC (match_operand:VM 1 "altivec_register_operand" "v")
>> -		(match_operand:VM 4 "zero_constant" ""))
>> -	 (match_operand:VM 2 "altivec_register_operand" "v")
>> -	 (match_operand:VM 3 "altivec_register_operand" "v")))]
>> -  "VECTOR_MEM_ALTIVEC_P (<MODE>mode)"
>> -  "vsel %0,%3,%2,%1"
>> +	(ior:VM
>> +	 (and:VM
>> +	  (not:VM (match_operand:VM 3 "altivec_register_operand" "v"))
>> +	  (match_operand:VM 1 "altivec_register_operand" "v"))
>> +	 (and:VM
>> +	  (match_operand:VM 2 "altivec_register_operand" "v")
>> +	  (match_operand:VM 4 "altivec_register_operand" "v"))))]
>> +  "VECTOR_UNIT_ALTIVEC_OR_VSX_P (<MODE>mode)
>> +  && (rtx_equal_p (operands[2], operands[3])
>> +  || rtx_equal_p (operands[4], operands[3]))"
>> +  {
>> +    if (rtx_equal_p (operands[2], operands[3]))
>> +      return "vsel %0,%1,%4,%3";
>> +    else
>> +      return "vsel %0,%1,%2,%3";
>> +  }
>>     [(set_attr "type" "vecmove")])
> 
> That rtx_equal_p stuff is nice and tricky, but it is a bit too tricky I
> think.  So please write this as two patterns (and keep the expand if
> that helps).

I was a bit concerned that there would be a lot of duplicate code if we
write two patterns for each vsel: four similar patterns in altivec.md
and another four in vsx.md are difficult to maintain.  However, I have
updated the patch as you prefer, since, as you pointed out, the xxsel
patterns in vsx.md can be folded in by a later patch.

> 
>> +(define_insn "altivec_vsel<mode>2"
> 
> (same here of course).
> 
>>   ;; Fused multiply add.
>> diff --git a/gcc/config/rs6000/rs6000-call.c b/gcc/config/rs6000/rs6000-call.c
>> index f5676255387..d65bdc01055 100644
>> --- a/gcc/config/rs6000/rs6000-call.c
>> +++ b/gcc/config/rs6000/rs6000-call.c
>> @@ -3362,11 +3362,11 @@ const struct altivec_builtin_types altivec_overloaded_builtins[] = {
>>       RS6000_BTI_V2DI, RS6000_BTI_V2DI, RS6000_BTI_V2DI, RS6000_BTI_unsigned_V2DI },
>>     { ALTIVEC_BUILTIN_VEC_SEL, ALTIVEC_BUILTIN_VSEL_2DI,
>>       RS6000_BTI_V2DI, RS6000_BTI_V2DI, RS6000_BTI_V2DI, RS6000_BTI_V2DI },
>> -  { ALTIVEC_BUILTIN_VEC_SEL, ALTIVEC_BUILTIN_VSEL_2DI,
>> +  { ALTIVEC_BUILTIN_VEC_SEL, ALTIVEC_BUILTIN_VSEL_2DI_UNS,
> 
> Are the _uns things still used for anything?  But, let's not change
> this until Bill's stuff is in :-)
> 
> Why do you want to change this here, btw?  I don't understand.

OK, they are actually "unsigned type" overloaded builtin functions;
whether we change this or not makes no functional difference so far, so
I will revert this change in the updated patch.

> 
>> +  if (target == 0
>> +      || GET_MODE (target) != tmode
>> +      || ! (*insn_data[icode].operand[0].predicate) (target, tmode))
> 
> No space after ! and other unary operators (except for casts and other
> operators you write with alphanumerics, like "sizeof").  I know you
> copied this code, but :-)

OK, thanks.

> 
>> @@ -15608,8 +15606,6 @@ rs6000_emit_vector_cond_expr (rtx dest, rtx op_true, rtx op_false,
>>       case GEU:
>>       case LTU:
>>       case LEU:
>> -      /* Mark unsigned tests with CCUNSmode.  */
>> -      cc_mode = CCUNSmode;
>>   
>>         /* Invert condition to avoid compound test if necessary.  */
>>         if (rcode == GEU || rcode == LEU)
> 
> So this is related to the _uns thing.  Could you split off that change?
> Probably as an earlier patch (but either works for me).

This is not related to the ALTIVEC_BUILTIN_VSEL_2DI_UNS things.
Previously, cc_mode was a parameter used to generate the condition for
the IF_THEN_ELSE instruction; now that we use the IOR (AND... AND...)
style we no longer need it, so it is removed to avoid a build error.


-  cond2 = gen_rtx_fmt_ee (NE, cc_mode, gen_lowpart (dest_mode, mask),
-                         CONST0_RTX (dest_mode));
-  emit_insn (gen_rtx_SET (dest,
-                         gen_rtx_IF_THEN_ELSE (dest_mode,
-                                               cond2,
-                                               op_true,
-                                               op_false)));
+  rtx tmp = gen_rtx_IOR (dest_mode,
+                        gen_rtx_AND (dest_mode, gen_rtx_NOT (dest_mode, mask),
+                                     op_false),
+                        gen_rtx_AND (dest_mode, mask, op_true));
+  emit_insn (gen_rtx_SET (dest, tmp));


> 
>> @@ -15629,6 +15625,9 @@ rs6000_emit_vector_cond_expr (rtx dest, rtx op_true, rtx op_false,
>>     if (!mask)
>>       return 0;
>>   
>> +  if (mask_mode != dest_mode)
>> +      mask = simplify_gen_subreg (dest_mode, mask, mask_mode, 0);
> 
> Indent just two characters please: line continuations (usually) align,
> but indents do not.
> 
> 
> Can you fold vsel and xxsel together completely?  They have exactly the
> same semantics!  This does not have to be in this patch of course.

I noticed that vperm/xxperm are folded together; do you mean folding
vsel/xxsel like them?  That change is attached as:
0002-rs6000-Fold-xxsel-to-vsel-since-they-have-same-seman.patch
  

Thanks,
Xionghu

[-- Attachment #2: 0001-rs6000-Fix-wrong-code-generation-for-vec_sel-PR94613.patch --]
[-- Type: text/plain, Size: 16722 bytes --]

From 68cb56305caea6aaee083d814fd4d11590cbc766 Mon Sep 17 00:00:00 2001
From: Xionghu Luo <luoxhu@linux.ibm.com>
Date: Tue, 27 Apr 2021 01:07:25 -0500
Subject: [PATCH 1/2] rs6000: Fix wrong code generation for vec_sel [PR94613]

The vsel instruction is a bit-wise select instruction.  Using an
IF_THEN_ELSE to express it in RTL is wrong and leads to wrong code
being generated in the combine pass.  Per-element selection is a
subset of bit-wise selection; with this patch the pattern is
written using bit operations.  But there are 8 different patterns
to define "op0 := (op1 & ~op3) | (op2 & op3)":

(~op3&op1) | (op3&op2),
(~op3&op1) | (op2&op3),
(op3&op2) | (~op3&op1),
(op2&op3) | (~op3&op1),
(op1&~op3) | (op3&op2),
(op1&~op3) | (op2&op3),
(op3&op2) | (op1&~op3),
(op2&op3) | (op1&~op3),

The combine pass will swap (op1&~op3) to (~op3&op1) due to commutative
canonicalization, which reduces the list to the FIRST 4 patterns, but it
won't swap (op2&op3) | (~op3&op1) to (~op3&op1) | (op2&op3), so this
patch handles that case with two patterns that place the NOT of op3
differently and check operand equality inside them.

Testing passed on Power8LE.  Any comments?

gcc/ChangeLog:

	* config/rs6000/altivec.md (*altivec_vsel<mode>): Change to ...
	(altivec_vsel<mode>): ... this and update define.
	(*altivec_vsel<mode>_uns): Delete.
	(altivec_vsel<mode>2): New define_insn.
	(altivec_vsel<mode>3): Likewise.
	(altivec_vsel<mode>4): Likewise.
	* config/rs6000/rs6000-call.c (altivec_expand_vec_sel_builtin):
	New.
	(altivec_expand_builtin): Call altivec_expand_vec_sel_builtin to
	expand vec_sel.
	* config/rs6000/rs6000.c (rs6000_emit_vector_cond_expr): Use
	bit-wise selection instead of per element.
	* config/rs6000/vector.md (vector_select_<mode>): Rewrite using
	bit-wise selection.
	(vector_select_<mode>_uns): Likewise.
	* config/rs6000/vsx.md (*vsx_xxsel<mode>): Change to ...
	(vsx_xxsel<mode>): ... this and update define.
	(*vsx_xxsel<mode>_uns): Delete.
	(vsx_xxsel<mode>2): New define_insn.
	(vsx_xxsel<mode>3): Likewise.
	(vsx_xxsel<mode>4): Likewise.

gcc/testsuite/ChangeLog:

	* gcc.target/powerpc/pr94613.c: New test.
---
 gcc/config/rs6000/altivec.md               | 66 ++++++++++++++++-----
 gcc/config/rs6000/rs6000-call.c            | 62 ++++++++++++++++++++
 gcc/config/rs6000/rs6000.c                 | 19 +++---
 gcc/config/rs6000/vector.md                | 26 ++++-----
 gcc/config/rs6000/vsx.md                   | 68 +++++++++++++++++-----
 gcc/testsuite/gcc.target/powerpc/pr94613.c | 38 ++++++++++++
 6 files changed, 222 insertions(+), 57 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/powerpc/pr94613.c

diff --git a/gcc/config/rs6000/altivec.md b/gcc/config/rs6000/altivec.md
index 1351dafbc41..df8f24701d0 100644
--- a/gcc/config/rs6000/altivec.md
+++ b/gcc/config/rs6000/altivec.md
@@ -667,26 +667,60 @@ (define_insn "*altivec_gev4sf"
   "vcmpgefp %0,%1,%2"
   [(set_attr "type" "veccmp")])
 
-(define_insn "*altivec_vsel<mode>"
+(define_insn "altivec_vsel<mode>"
   [(set (match_operand:VM 0 "altivec_register_operand" "=v")
-	(if_then_else:VM
-	 (ne:CC (match_operand:VM 1 "altivec_register_operand" "v")
-		(match_operand:VM 4 "zero_constant" ""))
-	 (match_operand:VM 2 "altivec_register_operand" "v")
-	 (match_operand:VM 3 "altivec_register_operand" "v")))]
-  "VECTOR_MEM_ALTIVEC_P (<MODE>mode)"
-  "vsel %0,%3,%2,%1"
+	(ior:VM
+	 (and:VM
+	  (not:VM (match_operand:VM 3 "altivec_register_operand" "v"))
+	  (match_operand:VM 1 "altivec_register_operand" "v"))
+	 (and:VM
+	  (match_operand:VM 2 "altivec_register_operand" "v")
+	  (match_operand:VM 4 "altivec_register_operand" "v"))))]
+  "VECTOR_UNIT_ALTIVEC_OR_VSX_P (<MODE>mode)
+  && rtx_equal_p (operands[2], operands[3])"
+  "vsel %0,%1,%4,%3"
   [(set_attr "type" "vecmove")])
 
-(define_insn "*altivec_vsel<mode>_uns"
+(define_insn "altivec_vsel<mode>2"
   [(set (match_operand:VM 0 "altivec_register_operand" "=v")
-	(if_then_else:VM
-	 (ne:CCUNS (match_operand:VM 1 "altivec_register_operand" "v")
-		   (match_operand:VM 4 "zero_constant" ""))
-	 (match_operand:VM 2 "altivec_register_operand" "v")
-	 (match_operand:VM 3 "altivec_register_operand" "v")))]
-  "VECTOR_MEM_ALTIVEC_P (<MODE>mode)"
-  "vsel %0,%3,%2,%1"
+	(ior:VM
+	 (and:VM
+	  (not:VM (match_operand:VM 3 "altivec_register_operand" "v"))
+	  (match_operand:VM 1 "altivec_register_operand" "v"))
+	 (and:VM
+	  (match_operand:VM 2 "altivec_register_operand" "v")
+	  (match_operand:VM 4 "altivec_register_operand" "v"))))]
+  "VECTOR_UNIT_ALTIVEC_OR_VSX_P (<MODE>mode)
+  && rtx_equal_p (operands[4], operands[3])"
+  "vsel %0,%1,%2,%3"
+  [(set_attr "type" "vecmove")])
+
+(define_insn "altivec_vsel<mode>3"
+  [(set (match_operand:VM 0 "altivec_register_operand" "=v")
+	(ior:VM
+	 (and:VM
+	  (match_operand:VM 1 "altivec_register_operand" "v")
+	  (match_operand:VM 2 "altivec_register_operand" "v"))
+	 (and:VM
+	  (not:VM (match_operand:VM 3 "altivec_register_operand" "v"))
+	  (match_operand:VM 4 "altivec_register_operand" "v"))))]
+  "VECTOR_UNIT_ALTIVEC_OR_VSX_P (<MODE>mode)
+  && rtx_equal_p (operands[1], operands[3])"
+  "vsel %0,%4,%2,%3"
+  [(set_attr "type" "vecmove")])
+
+(define_insn "altivec_vsel<mode>4"
+  [(set (match_operand:VM 0 "altivec_register_operand" "=v")
+	(ior:VM
+	 (and:VM
+	  (match_operand:VM 1 "altivec_register_operand" "v")
+	  (match_operand:VM 2 "altivec_register_operand" "v"))
+	 (and:VM
+	  (not:VM (match_operand:VM 3 "altivec_register_operand" "v"))
+	  (match_operand:VM 4 "altivec_register_operand" "v"))))]
+  "VECTOR_UNIT_ALTIVEC_OR_VSX_P (<MODE>mode)
+  && rtx_equal_p (operands[2], operands[3])"
+  "vsel %0,%4,%1,%3"
   [(set_attr "type" "vecmove")])
 
 ;; Fused multiply add.
diff --git a/gcc/config/rs6000/rs6000-call.c b/gcc/config/rs6000/rs6000-call.c
index f5676255387..94593be0d2e 100644
--- a/gcc/config/rs6000/rs6000-call.c
+++ b/gcc/config/rs6000/rs6000-call.c
@@ -10996,6 +10996,45 @@ altivec_expand_vec_ext_builtin (tree exp, rtx target)
   return target;
 }
 
+/* Expand vec_sel builtin.  */
+static rtx
+altivec_expand_vec_sel_builtin (enum insn_code icode, tree exp, rtx target)
+{
+  rtx op0, op1, op2, pat;
+  tree arg0, arg1, arg2;
+
+  arg0 = CALL_EXPR_ARG (exp, 0);
+  op0 = expand_normal (arg0);
+  arg1 = CALL_EXPR_ARG (exp, 1);
+  op1 = expand_normal (arg1);
+  arg2 = CALL_EXPR_ARG (exp, 2);
+  op2 = expand_normal (arg2);
+
+  machine_mode tmode = insn_data[icode].operand[0].mode;
+  machine_mode mode0 = insn_data[icode].operand[1].mode;
+  machine_mode mode1 = insn_data[icode].operand[2].mode;
+  machine_mode mode2 = insn_data[icode].operand[3].mode;
+
+  if (target == 0 || GET_MODE (target) != tmode
+      || !(*insn_data[icode].operand[0].predicate) (target, tmode))
+    target = gen_reg_rtx (tmode);
+
+  if (!(*insn_data[icode].operand[1].predicate) (op0, mode0))
+    op0 = copy_to_mode_reg (mode0, op0);
+  if (!(*insn_data[icode].operand[2].predicate) (op1, mode1))
+    op1 = copy_to_mode_reg (mode1, op1);
+  if (!(*insn_data[icode].operand[3].predicate) (op2, mode2))
+    op2 = copy_to_mode_reg (mode2, op2);
+
+  pat = GEN_FCN (icode) (target, op0, op1, op2, op2);
+  if (pat)
+    emit_insn (pat);
+  else
+    return NULL_RTX;
+
+  return target;
+}
+
 /* Expand the builtin in EXP and store the result in TARGET.  Store
    true in *EXPANDEDP if we found a builtin to expand.  */
 static rtx
@@ -11181,6 +11220,29 @@ altivec_expand_builtin (tree exp, rtx target, bool *expandedp)
 	emit_insn (pat);
       return NULL_RTX;
 
+    case ALTIVEC_BUILTIN_VSEL_2DF:
+      return altivec_expand_vec_sel_builtin (CODE_FOR_altivec_vselv2df, exp,
+					     target);
+    case ALTIVEC_BUILTIN_VSEL_2DI:
+    case ALTIVEC_BUILTIN_VSEL_2DI_UNS:
+      return altivec_expand_vec_sel_builtin (CODE_FOR_altivec_vselv2di, exp,
+					     target);
+    case ALTIVEC_BUILTIN_VSEL_4SF:
+      return altivec_expand_vec_sel_builtin (CODE_FOR_altivec_vselv4sf, exp,
+					     target);
+    case ALTIVEC_BUILTIN_VSEL_4SI:
+    case ALTIVEC_BUILTIN_VSEL_4SI_UNS:
+      return altivec_expand_vec_sel_builtin (CODE_FOR_altivec_vselv4si, exp,
+					     target);
+    case ALTIVEC_BUILTIN_VSEL_8HI:
+    case ALTIVEC_BUILTIN_VSEL_8HI_UNS:
+      return altivec_expand_vec_sel_builtin (CODE_FOR_altivec_vselv8hi, exp,
+					     target);
+    case ALTIVEC_BUILTIN_VSEL_16QI:
+    case ALTIVEC_BUILTIN_VSEL_16QI_UNS:
+      return altivec_expand_vec_sel_builtin (CODE_FOR_altivec_vselv16qi, exp,
+					     target);
+
     case ALTIVEC_BUILTIN_DSSALL:
       emit_insn (gen_altivec_dssall ());
       return NULL_RTX;
diff --git a/gcc/config/rs6000/rs6000.c b/gcc/config/rs6000/rs6000.c
index befab53031c..5a5202c455b 100644
--- a/gcc/config/rs6000/rs6000.c
+++ b/gcc/config/rs6000/rs6000.c
@@ -15566,9 +15566,7 @@ rs6000_emit_vector_cond_expr (rtx dest, rtx op_true, rtx op_false,
   machine_mode dest_mode = GET_MODE (dest);
   machine_mode mask_mode = GET_MODE (cc_op0);
   enum rtx_code rcode = GET_CODE (cond);
-  machine_mode cc_mode = CCmode;
   rtx mask;
-  rtx cond2;
   bool invert_move = false;
 
   if (VECTOR_UNIT_NONE_P (dest_mode))
@@ -15608,8 +15606,6 @@ rs6000_emit_vector_cond_expr (rtx dest, rtx op_true, rtx op_false,
     case GEU:
     case LTU:
     case LEU:
-      /* Mark unsigned tests with CCUNSmode.  */
-      cc_mode = CCUNSmode;
 
       /* Invert condition to avoid compound test if necessary.  */
       if (rcode == GEU || rcode == LEU)
@@ -15629,6 +15625,9 @@ rs6000_emit_vector_cond_expr (rtx dest, rtx op_true, rtx op_false,
   if (!mask)
     return 0;
 
+  if (mask_mode != dest_mode)
+    mask = simplify_gen_subreg (dest_mode, mask, mask_mode, 0);
+
   if (invert_move)
     std::swap (op_true, op_false);
 
@@ -15668,13 +15667,11 @@ rs6000_emit_vector_cond_expr (rtx dest, rtx op_true, rtx op_false,
   if (!REG_P (op_false) && !SUBREG_P (op_false))
     op_false = force_reg (dest_mode, op_false);
 
-  cond2 = gen_rtx_fmt_ee (NE, cc_mode, gen_lowpart (dest_mode, mask),
-			  CONST0_RTX (dest_mode));
-  emit_insn (gen_rtx_SET (dest,
-			  gen_rtx_IF_THEN_ELSE (dest_mode,
-						cond2,
-						op_true,
-						op_false)));
+  rtx tmp = gen_rtx_IOR (dest_mode,
+			 gen_rtx_AND (dest_mode, gen_rtx_NOT (dest_mode, mask),
+				      op_false),
+			 gen_rtx_AND (dest_mode, mask, op_true));
+  emit_insn (gen_rtx_SET (dest, tmp));
   return 1;
 }
 
diff --git a/gcc/config/rs6000/vector.md b/gcc/config/rs6000/vector.md
index 3446b03d40d..85ffcf6a763 100644
--- a/gcc/config/rs6000/vector.md
+++ b/gcc/config/rs6000/vector.md
@@ -862,23 +862,21 @@ (define_insn_and_split "vector_<code><mode>"
 ;; which is in the reverse order that we want
 (define_expand "vector_select_<mode>"
   [(set (match_operand:VEC_L 0 "vlogical_operand")
-	(if_then_else:VEC_L
-	 (ne:CC (match_operand:VEC_L 3 "vlogical_operand")
-		(match_dup 4))
-	 (match_operand:VEC_L 2 "vlogical_operand")
-	 (match_operand:VEC_L 1 "vlogical_operand")))]
-  "VECTOR_UNIT_ALTIVEC_OR_VSX_P (<MODE>mode)"
-  "operands[4] = CONST0_RTX (<MODE>mode);")
+    (ior:VEC_L
+     (and:VEC_L (not:VEC_L (match_operand:VEC_L 3 "vlogical_operand"))
+      (match_operand:VEC_L 1 "vlogical_operand"))
+     (and:VEC_L (match_dup 3)
+     (match_operand:VEC_L 2 "vlogical_operand"))))]
+  "VECTOR_UNIT_ALTIVEC_OR_VSX_P (<MODE>mode)")
 
 (define_expand "vector_select_<mode>_uns"
   [(set (match_operand:VEC_L 0 "vlogical_operand")
-	(if_then_else:VEC_L
-	 (ne:CCUNS (match_operand:VEC_L 3 "vlogical_operand")
-		   (match_dup 4))
-	 (match_operand:VEC_L 2 "vlogical_operand")
-	 (match_operand:VEC_L 1 "vlogical_operand")))]
-  "VECTOR_UNIT_ALTIVEC_OR_VSX_P (<MODE>mode)"
-  "operands[4] = CONST0_RTX (<MODE>mode);")
+    (ior:VEC_L
+     (and:VEC_L (not:VEC_L (match_operand:VEC_L 3 "vlogical_operand"))
+      (match_operand:VEC_L 1 "vlogical_operand"))
+     (and:VEC_L (match_dup 3)
+     (match_operand:VEC_L 2 "vlogical_operand"))))]
+  "VECTOR_UNIT_ALTIVEC_OR_VSX_P (<MODE>mode)")
 
 ;; Expansions that compare vectors producing a vector result and a predicate,
 ;; setting CR6 to indicate a combined status
diff --git a/gcc/config/rs6000/vsx.md b/gcc/config/rs6000/vsx.md
index bcb92be2f5c..f53de7bf34c 100644
--- a/gcc/config/rs6000/vsx.md
+++ b/gcc/config/rs6000/vsx.md
@@ -2105,30 +2105,66 @@ (define_insn "*vsx_ge_<mode>_p"
   [(set_attr "type" "<VStype_simple>")])
 
 ;; Vector select
-(define_insn "*vsx_xxsel<mode>"
+(define_insn "vsx_xxsel<mode>"
   [(set (match_operand:VSX_L 0 "vsx_register_operand" "=<VSr>,?wa")
-	(if_then_else:VSX_L
-	 (ne:CC (match_operand:VSX_L 1 "vsx_register_operand" "<VSr>,wa")
-		(match_operand:VSX_L 4 "zero_constant" ""))
-	 (match_operand:VSX_L 2 "vsx_register_operand" "<VSr>,wa")
-	 (match_operand:VSX_L 3 "vsx_register_operand" "<VSr>,wa")))]
-  "VECTOR_MEM_VSX_P (<MODE>mode)"
-  "xxsel %x0,%x3,%x2,%x1"
+    (ior:VSX_L
+     (and:VSX_L
+      (not:VSX_L (match_operand:VSX_L 3 "vsx_register_operand" "<VSr>,wa"))
+      (match_operand:VSX_L 1 "vsx_register_operand" "<VSr>,wa"))
+     (and:VSX_L
+      (match_operand:VSX_L 2 "vsx_register_operand" "<VSr>,wa")
+      (match_operand:VSX_L 4 "vsx_register_operand" "<VSr>,wa"))))]
+  "VECTOR_MEM_VSX_P (<MODE>mode)
+  && rtx_equal_p (operands[2], operands[3])"
+  "xxsel %x0,%x1,%x4,%x3"
   [(set_attr "type" "vecmove")
    (set_attr "isa" "<VSisa>")])
 
-(define_insn "*vsx_xxsel<mode>_uns"
+(define_insn "vsx_xxsel<mode>2"
   [(set (match_operand:VSX_L 0 "vsx_register_operand" "=<VSr>,?wa")
-	(if_then_else:VSX_L
-	 (ne:CCUNS (match_operand:VSX_L 1 "vsx_register_operand" "<VSr>,wa")
-		   (match_operand:VSX_L 4 "zero_constant" ""))
-	 (match_operand:VSX_L 2 "vsx_register_operand" "<VSr>,wa")
-	 (match_operand:VSX_L 3 "vsx_register_operand" "<VSr>,wa")))]
-  "VECTOR_MEM_VSX_P (<MODE>mode)"
-  "xxsel %x0,%x3,%x2,%x1"
+    (ior:VSX_L
+     (and:VSX_L
+      (not:VSX_L (match_operand:VSX_L 3 "vsx_register_operand" "<VSr>,wa"))
+      (match_operand:VSX_L 1 "vsx_register_operand" "<VSr>,wa"))
+     (and:VSX_L
+      (match_operand:VSX_L 2 "vsx_register_operand" "<VSr>,wa")
+      (match_operand:VSX_L 4 "vsx_register_operand" "<VSr>,wa"))))]
+  "VECTOR_MEM_VSX_P (<MODE>mode)
+  && rtx_equal_p (operands[4], operands[3])"
+  "xxsel %x0,%x1,%x2,%x3"
   [(set_attr "type" "vecmove")
    (set_attr "isa" "<VSisa>")])
 
+(define_insn "vsx_xxsel<mode>3"
+ [(set (match_operand:VSX_L 0 "vsx_register_operand" "=<VSr>,?wa")
+   (ior:VSX_L
+    (and:VSX_L
+     (match_operand:VSX_L 1 "vsx_register_operand" "<VSr>,wa")
+     (match_operand:VSX_L 2 "vsx_register_operand" "<VSr>,wa"))
+    (and:VSX_L
+     (not:VSX_L (match_operand:VSX_L 3 "vsx_register_operand" "<VSr>,wa"))
+     (match_operand:VSX_L 4 "vsx_register_operand" "<VSr>,wa"))))]
+ "VECTOR_MEM_VSX_P (<MODE>mode)
+  && rtx_equal_p (operands[1], operands[3])"
+  "xxsel %x0,%x4,%x2,%x3"
+ [(set_attr "type" "vecmove")
+ (set_attr "isa" "<VSisa>")])
+
+(define_insn "vsx_xxsel<mode>4"
+ [(set (match_operand:VSX_L 0 "vsx_register_operand" "=<VSr>,?wa")
+   (ior:VSX_L
+    (and:VSX_L
+     (match_operand:VSX_L 1 "vsx_register_operand" "<VSr>,wa")
+     (match_operand:VSX_L 2 "vsx_register_operand" "<VSr>,wa"))
+    (and:VSX_L
+     (not:VSX_L (match_operand:VSX_L 3 "vsx_register_operand" "<VSr>,wa"))
+     (match_operand:VSX_L 4 "vsx_register_operand" "<VSr>,wa"))))]
+ "VECTOR_MEM_VSX_P (<MODE>mode)
+ && rtx_equal_p (operands[2], operands[3])"
+ "xxsel %x0,%x4,%x1,%x3"
+ [(set_attr "type" "vecmove")
+ (set_attr "isa" "<VSisa>")])
+
 ;; Copy sign
 (define_insn "vsx_copysign<mode>3"
   [(set (match_operand:VSX_F 0 "vsx_register_operand" "=wa")
diff --git a/gcc/testsuite/gcc.target/powerpc/pr94613.c b/gcc/testsuite/gcc.target/powerpc/pr94613.c
new file mode 100644
index 00000000000..06699088f9a
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/pr94613.c
@@ -0,0 +1,38 @@
+/* { dg-do run } */
+/* { dg-require-effective-target vmx_hw } */
+/* { dg-options "-O3 -maltivec" } */
+
+#include <altivec.h>
+
+/* The initial implementation of vec_sel used an IF_THEN_ELSE rtx.
+   This did NOT match what the vsel instruction does: vsel is a
+   bit-wise operation.  Using IF_THEN_ELSE allowed the + operation to
+   be simplified away in combine, but a plus operation affects other
+   bits in the same element, so per-element simplifications are wrong
+   for vsel.  */
+vector unsigned char __attribute__((noinline))
+foo (vector unsigned char a, vector unsigned char b, vector unsigned char c)
+{
+  return vec_sel (a + b, c, a);
+}
+
+vector unsigned char __attribute__((noinline))
+bar (vector unsigned char a, vector unsigned char b, vector unsigned char c)
+{
+  return vec_sel (a | b, c, a);
+}
+
+int
+main ()
+{
+  vector unsigned char v = (vector unsigned char){ 1 };
+
+  if (foo (v, v, v)[0] != 3)
+    __builtin_abort ();
+
+  if (bar (v, v, v)[0] != 1)
+    __builtin_abort ();
+
+  return 0;
+}
+
-- 
2.25.1


[-- Attachment #3: 0002-rs6000-Fold-xxsel-to-vsel-since-they-have-same-seman.patch --]
[-- Type: text/plain, Size: 7301 bytes --]

From 74cf1fd298e4923c106deaba3192423d48049559 Mon Sep 17 00:00:00 2001
From: Xionghu Luo <luoxhu@linux.ibm.com>
Date: Fri, 14 May 2021 01:21:06 -0500
Subject: [PATCH 2/2] rs6000: Fold xxsel to vsel since they have same semantics

gcc/ChangeLog:

	* config/rs6000/altivec.md: Add vsx register constraints.
	* config/rs6000/vsx.md (vsx_xxsel<mode>): Delete.
	(vsx_xxsel<mode>2): Likewise.
	(vsx_xxsel<mode>3): Likewise.
	(vsx_xxsel<mode>4): Likewise.
---
 gcc/config/rs6000/altivec.md | 56 +++++++++++++++++++--------------
 gcc/config/rs6000/vsx.md     | 61 ------------------------------------
 2 files changed, 32 insertions(+), 85 deletions(-)

diff --git a/gcc/config/rs6000/altivec.md b/gcc/config/rs6000/altivec.md
index df8f24701d0..6ce3eb9dbc8 100644
--- a/gcc/config/rs6000/altivec.md
+++ b/gcc/config/rs6000/altivec.md
@@ -668,59 +668,67 @@ (define_insn "*altivec_gev4sf"
   [(set_attr "type" "veccmp")])
 
 (define_insn "altivec_vsel<mode>"
-  [(set (match_operand:VM 0 "altivec_register_operand" "=v")
+  [(set (match_operand:VM 0 "altivec_register_operand" "=wa,v")
 	(ior:VM
 	 (and:VM
-	  (not:VM (match_operand:VM 3 "altivec_register_operand" "v"))
-	  (match_operand:VM 1 "altivec_register_operand" "v"))
+	  (not:VM (match_operand:VM 3 "altivec_register_operand" "wa,v"))
+	  (match_operand:VM 1 "altivec_register_operand" "wa,v"))
 	 (and:VM
-	  (match_operand:VM 2 "altivec_register_operand" "v")
-	  (match_operand:VM 4 "altivec_register_operand" "v"))))]
+	  (match_operand:VM 2 "altivec_register_operand" "wa,v")
+	  (match_operand:VM 4 "altivec_register_operand" "wa,v"))))]
   "VECTOR_UNIT_ALTIVEC_OR_VSX_P (<MODE>mode)
   && rtx_equal_p (operands[2], operands[3])"
-  "vsel %0,%1,%4,%3"
+  "@
+  xxsel %x0,%x1,%x4,%x3
+  vsel %0,%1,%4,%3"
   [(set_attr "type" "vecmove")])
 
 (define_insn "altivec_vsel<mode>2"
-  [(set (match_operand:VM 0 "altivec_register_operand" "=v")
+  [(set (match_operand:VM 0 "altivec_register_operand" "=wa,v")
 	(ior:VM
 	 (and:VM
-	  (not:VM (match_operand:VM 3 "altivec_register_operand" "v"))
-	  (match_operand:VM 1 "altivec_register_operand" "v"))
+	  (not:VM (match_operand:VM 3 "altivec_register_operand" "wa,v"))
+	  (match_operand:VM 1 "altivec_register_operand" "wa,v"))
 	 (and:VM
-	  (match_operand:VM 2 "altivec_register_operand" "v")
-	  (match_operand:VM 4 "altivec_register_operand" "v"))))]
+	  (match_operand:VM 2 "altivec_register_operand" "wa,v")
+	  (match_operand:VM 4 "altivec_register_operand" "wa,v"))))]
   "VECTOR_UNIT_ALTIVEC_OR_VSX_P (<MODE>mode)
   && rtx_equal_p (operands[4], operands[3])"
-  "vsel %0,%1,%2,%3"
+  "@
+  xxsel %x0,%x1,%x2,%x3
+  vsel %0,%1,%2,%3"
   [(set_attr "type" "vecmove")])
 
 (define_insn "altivec_vsel<mode>3"
-  [(set (match_operand:VM 0 "altivec_register_operand" "=v")
+  [(set (match_operand:VM 0 "altivec_register_operand" "=wa,v")
 	(ior:VM
 	 (and:VM
-	  (match_operand:VM 1 "altivec_register_operand" "v")
-	  (match_operand:VM 2 "altivec_register_operand" "v"))
+	  (match_operand:VM 1 "altivec_register_operand" "wa,v")
+	  (match_operand:VM 2 "altivec_register_operand" "wa,v"))
 	 (and:VM
-	  (not:VM (match_operand:VM 3 "altivec_register_operand" "v"))
-	  (match_operand:VM 4 "altivec_register_operand" "v"))))]
+	  (not:VM (match_operand:VM 3 "altivec_register_operand" "wa,v"))
+	  (match_operand:VM 4 "altivec_register_operand" "wa,v"))))]
   "VECTOR_UNIT_ALTIVEC_OR_VSX_P (<MODE>mode)
   && rtx_equal_p (operands[1], operands[3])"
-  "vsel %0,%4,%2,%3"
+  "@
+  xxsel %x0,%x4,%x2,%x3
+  vsel %0,%4,%2,%3"
   [(set_attr "type" "vecmove")])
 
 (define_insn "altivec_vsel<mode>4"
-  [(set (match_operand:VM 0 "altivec_register_operand" "=v")
+  [(set (match_operand:VM 0 "altivec_register_operand" "=wa,v")
 	(ior:VM
 	 (and:VM
-	  (match_operand:VM 1 "altivec_register_operand" "v")
-	  (match_operand:VM 2 "altivec_register_operand" "v"))
+	  (match_operand:VM 1 "altivec_register_operand" "wa,v")
+	  (match_operand:VM 2 "altivec_register_operand" "wa,v"))
 	 (and:VM
-	  (not:VM (match_operand:VM 3 "altivec_register_operand" "v"))
-	  (match_operand:VM 4 "altivec_register_operand" "v"))))]
+	  (not:VM (match_operand:VM 3 "altivec_register_operand" "wa,v"))
+	  (match_operand:VM 4 "altivec_register_operand" "wa,v"))))]
   "VECTOR_UNIT_ALTIVEC_OR_VSX_P (<MODE>mode)
   && rtx_equal_p (operands[2], operands[3])"
-  "vsel %0,%4,%1,%3"
+  "@
+  xxsel %x0,%x4,%x1,%x3
+  vsel %0,%4,%1,%3"
   [(set_attr "type" "vecmove")])
 
 ;; Fused multiply add.
diff --git a/gcc/config/rs6000/vsx.md b/gcc/config/rs6000/vsx.md
index f53de7bf34c..55d830d0f20 100644
--- a/gcc/config/rs6000/vsx.md
+++ b/gcc/config/rs6000/vsx.md
@@ -2104,67 +2104,6 @@ (define_insn "*vsx_ge_<mode>_p"
   "xvcmpge<sd>p. %x0,%x1,%x2"
   [(set_attr "type" "<VStype_simple>")])
 
-;; Vector select
-(define_insn "vsx_xxsel<mode>"
-  [(set (match_operand:VSX_L 0 "vsx_register_operand" "=<VSr>,?wa")
-    (ior:VSX_L
-     (and:VSX_L
-      (not:VSX_L (match_operand:VSX_L 3 "vsx_register_operand" "<VSr>,wa"))
-      (match_operand:VSX_L 1 "vsx_register_operand" "<VSr>,wa"))
-     (and:VSX_L
-      (match_operand:VSX_L 2 "vsx_register_operand" "<VSr>,wa")
-      (match_operand:VSX_L 4 "vsx_register_operand" "<VSr>,wa"))))]
-  "VECTOR_MEM_VSX_P (<MODE>mode)
-  && rtx_equal_p (operands[2], operands[3])"
-  "xxsel %x0,%x1,%x4,%x3"
-  [(set_attr "type" "vecmove")
-   (set_attr "isa" "<VSisa>")])
-
-(define_insn "vsx_xxsel<mode>2"
-  [(set (match_operand:VSX_L 0 "vsx_register_operand" "=<VSr>,?wa")
-    (ior:VSX_L
-     (and:VSX_L
-      (not:VSX_L (match_operand:VSX_L 3 "vsx_register_operand" "<VSr>,wa"))
-      (match_operand:VSX_L 1 "vsx_register_operand" "<VSr>,wa"))
-     (and:VSX_L
-      (match_operand:VSX_L 2 "vsx_register_operand" "<VSr>,wa")
-      (match_operand:VSX_L 4 "vsx_register_operand" "<VSr>,wa"))))]
-  "VECTOR_MEM_VSX_P (<MODE>mode)
-  && rtx_equal_p (operands[4], operands[3])"
-  "xxsel %x0,%x1,%x2,%x3"
-  [(set_attr "type" "vecmove")
-   (set_attr "isa" "<VSisa>")])
-
-(define_insn "vsx_xxsel<mode>3"
- [(set (match_operand:VSX_L 0 "vsx_register_operand" "=<VSr>,?wa")
-   (ior:VSX_L
-    (and:VSX_L
-     (match_operand:VSX_L 1 "vsx_register_operand" "<VSr>,wa")
-     (match_operand:VSX_L 2 "vsx_register_operand" "<VSr>,wa"))
-    (and:VSX_L
-     (not:VSX_L (match_operand:VSX_L 3 "vsx_register_operand" "<VSr>,wa"))
-     (match_operand:VSX_L 4 "vsx_register_operand" "<VSr>,wa"))))]
- "VECTOR_MEM_VSX_P (<MODE>mode)
-  && rtx_equal_p (operands[1], operands[3])"
-  "xxsel %x0,%x4,%x2,%x3"
- [(set_attr "type" "vecmove")
- (set_attr "isa" "<VSisa>")])
-
-(define_insn "vsx_xxsel<mode>4"
- [(set (match_operand:VSX_L 0 "vsx_register_operand" "=<VSr>,?wa")
-   (ior:VSX_L
-    (and:VSX_L
-     (match_operand:VSX_L 1 "vsx_register_operand" "<VSr>,wa")
-     (match_operand:VSX_L 2 "vsx_register_operand" "<VSr>,wa"))
-    (and:VSX_L
-     (not:VSX_L (match_operand:VSX_L 3 "vsx_register_operand" "<VSr>,wa"))
-     (match_operand:VSX_L 4 "vsx_register_operand" "<VSr>,wa"))))]
- "VECTOR_MEM_VSX_P (<MODE>mode)
- && rtx_equal_p (operands[2], operands[3])"
- "xxsel %x0,%x4,%x1,%x3"
- [(set_attr "type" "vecmove")
- (set_attr "isa" "<VSisa>")])
-
 ;; Copy sign
 (define_insn "vsx_copysign<mode>3"
   [(set (match_operand:VSX_F 0 "vsx_register_operand" "=wa")
-- 
2.25.1


^ permalink raw reply	[flat|nested] 11+ messages in thread

* Ping: [PATCH] rs6000: Fix wrong code generation for vec_sel [PR94613]
  2021-05-14  6:57   ` Xionghu Luo
@ 2021-06-07  2:15     ` Xionghu Luo
  2021-06-30  1:42     ` Xionghu Luo
  2021-09-15 14:14     ` Segher Boessenkool
  2 siblings, 0 replies; 11+ messages in thread
From: Xionghu Luo @ 2021-06-07  2:15 UTC (permalink / raw)
  To: Segher Boessenkool; +Cc: wschmidt, gcc-patches, linkw, dje.gcc

Gentle ping, thanks.

https://gcc.gnu.org/pipermail/gcc-patches/2021-May/570333.html


On 2021/5/14 14:57, Xionghu Luo via Gcc-patches wrote:
> Hi,
> 
> On 2021/5/13 18:49, Segher Boessenkool wrote:
>> Hi!
>>
>> On Fri, Apr 30, 2021 at 01:32:58AM -0500, Xionghu Luo wrote:
>>> The vsel instruction is a bit-wise select instruction.  Using an
>>> IF_THEN_ELSE to express it in RTL is wrong and leads to wrong code
>>> being generated in the combine pass.  Per element selection is a
>>> subset of per bit-wise selection, with the patch the pattern is
>>> written using bit operations.  But there are 8 different patterns
>>> to define "op0 := (op1 & ~op3) | (op2 & op3)":
>>>
>>> (~op3&op1) | (op3&op2),
>>> (~op3&op1) | (op2&op3),
>>> (op3&op2) | (~op3&op1),
>>> (op2&op3) | (~op3&op1),
>>> (op1&~op3) | (op3&op2),
>>> (op1&~op3) | (op2&op3),
>>> (op3&op2) | (op1&~op3),
>>> (op2&op3) | (op1&~op3),
>>>
>>> Combine pass will swap (op1&~op3) to (~op3&op1) due to commutative
>>> canonicalisation, which could reduce it to the FIRST 4 patterns, but it won't
>>> swap (op2&op3) | (~op3&op1) to (~op3&op1) | (op2&op3), so this patch
>>> handles it with two patterns with different NOT op3 position and check
>>> equality inside it.
>>
>> Yup, that latter case does not have canonicalisation rules.  Btw, not
>> only combine does this canonicalisation: everything should,
>> non-canonical RTL is invalid RTL (in the instruction stream, you can do
>> everything in temporary code of course, as long as the RTL isn't
>> malformed).
>>
>>> -(define_insn "*altivec_vsel<mode>"
>>> +(define_insn "altivec_vsel<mode>"
>>>     [(set (match_operand:VM 0 "altivec_register_operand" "=v")
>>> -    (if_then_else:VM
>>> -     (ne:CC (match_operand:VM 1 "altivec_register_operand" "v")
>>> -        (match_operand:VM 4 "zero_constant" ""))
>>> -     (match_operand:VM 2 "altivec_register_operand" "v")
>>> -     (match_operand:VM 3 "altivec_register_operand" "v")))]
>>> -  "VECTOR_MEM_ALTIVEC_P (<MODE>mode)"
>>> -  "vsel %0,%3,%2,%1"
>>> +    (ior:VM
>>> +     (and:VM
>>> +      (not:VM (match_operand:VM 3 "altivec_register_operand" "v"))
>>> +      (match_operand:VM 1 "altivec_register_operand" "v"))
>>> +     (and:VM
>>> +      (match_operand:VM 2 "altivec_register_operand" "v")
>>> +      (match_operand:VM 4 "altivec_register_operand" "v"))))]
>>> +  "VECTOR_UNIT_ALTIVEC_OR_VSX_P (<MODE>mode)
>>> +  && (rtx_equal_p (operands[2], operands[3])
>>> +  || rtx_equal_p (operands[4], operands[3]))"
>>> +  {
>>> +    if (rtx_equal_p (operands[2], operands[3]))
>>> +      return "vsel %0,%1,%4,%3";
>>> +    else
>>> +      return "vsel %0,%1,%2,%3";
>>> +  }
>>>     [(set_attr "type" "vecmove")])
>>
>> That rtx_equal_p stuff is nice and tricky, but it is a bit too tricky I
>> think.  So please write this as two patterns (and keep the expand if
>> that helps).
> 
> I was a bit concerned that there would be a lot of duplicate code if we
> write two patterns for each vsel: in total 4 similar patterns in
> altivec.md and another 4 in vsx.md, which makes it difficult to maintain.
> However, I updated it since you prefer this way, and as you pointed out
> the xxsel patterns in vsx.md could be folded by a later patch.
> 
>>
>>> +(define_insn "altivec_vsel<mode>2"
>>
>> (same here of course).
>>
>>>   ;; Fused multiply add.
>>> diff --git a/gcc/config/rs6000/rs6000-call.c 
>>> b/gcc/config/rs6000/rs6000-call.c
>>> index f5676255387..d65bdc01055 100644
>>> --- a/gcc/config/rs6000/rs6000-call.c
>>> +++ b/gcc/config/rs6000/rs6000-call.c
>>> @@ -3362,11 +3362,11 @@ const struct altivec_builtin_types 
>>> altivec_overloaded_builtins[] = {
>>>       RS6000_BTI_V2DI, RS6000_BTI_V2DI, RS6000_BTI_V2DI, 
>>> RS6000_BTI_unsigned_V2DI },
>>>     { ALTIVEC_BUILTIN_VEC_SEL, ALTIVEC_BUILTIN_VSEL_2DI,
>>>       RS6000_BTI_V2DI, RS6000_BTI_V2DI, RS6000_BTI_V2DI, 
>>> RS6000_BTI_V2DI },
>>> -  { ALTIVEC_BUILTIN_VEC_SEL, ALTIVEC_BUILTIN_VSEL_2DI,
>>> +  { ALTIVEC_BUILTIN_VEC_SEL, ALTIVEC_BUILTIN_VSEL_2DI_UNS,
>>
>> Are the _uns things still used for anything?  But, let's not change
>> this until Bill's stuff is in :-)
>>
>> Why do you want to change this here, btw?  I don't understand.
> 
> OK, they are actually "unsigned type" overloaded builtin functions;
> changing them or not does not cause a functionality issue so far, so I
> will revert this change in the updated patch.
> 
>>
>>> +  if (target == 0
>>> +      || GET_MODE (target) != tmode
>>> +      || ! (*insn_data[icode].operand[0].predicate) (target, tmode))
>>
>> No space after ! and other unary operators (except for casts and other
>> operators you write with alphanumerics, like "sizeof").  I know you
>> copied this code, but :-)
> 
> OK, thanks.
> 
>>
>>> @@ -15608,8 +15606,6 @@ rs6000_emit_vector_cond_expr (rtx dest, rtx 
>>> op_true, rtx op_false,
>>>       case GEU:
>>>       case LTU:
>>>       case LEU:
>>> -      /* Mark unsigned tests with CCUNSmode.  */
>>> -      cc_mode = CCUNSmode;
>>>         /* Invert condition to avoid compound test if necessary.  */
>>>         if (rcode == GEU || rcode == LEU)
>>
>> So this is related to the _uns thing.  Could you split off that change?
>> Probably as an earlier patch (but either works for me).
> 
> Not related to the ALTIVEC_BUILTIN_VSEL_2DI_UNS things; previously
> cc_mode was a parameter used to generate the condition for the
> IF_THEN_ELSE instruction.  Now we don't need it anymore since we use
> the IOR (AND... AND...) style, so remove it to avoid a build error.
> 
> 
> -  cond2 = gen_rtx_fmt_ee (NE, cc_mode, gen_lowpart (dest_mode, mask),
> -                         CONST0_RTX (dest_mode));
> -  emit_insn (gen_rtx_SET (dest,
> -                         gen_rtx_IF_THEN_ELSE (dest_mode,
> -                                               cond2,
> -                                               op_true,
> -                                               op_false)));
> +  rtx tmp = gen_rtx_IOR (dest_mode,
> +                        gen_rtx_AND (dest_mode, gen_rtx_NOT (dest_mode, 
> mask),
> +                                     op_false),
> +                        gen_rtx_AND (dest_mode, mask, op_true));
> +  emit_insn (gen_rtx_SET (dest, tmp));
> 
> 
>>
>>> @@ -15629,6 +15625,9 @@ rs6000_emit_vector_cond_expr (rtx dest, rtx 
>>> op_true, rtx op_false,
>>>     if (!mask)
>>>       return 0;
>>> +  if (mask_mode != dest_mode)
>>> +      mask = simplify_gen_subreg (dest_mode, mask, mask_mode, 0);
>>
>> Indent just two characters please: line continuations (usually) align,
>> but indents do not.>
>> Can you fold vsel and xxsel together completely?  They have exactly the
>> same semantics!  This does not have to be in this patch of course.
> 
> I noticed that vperm/xxperm are folded together; do you mean fold
> vsel/xxsel like them?  It's attached as:
> 0002-rs6000-Fold-xxsel-to-vsel-since-they-have-same-seman.patch
> 
> 
> Thanks,
> Xionghu

-- 
Thanks,
Xionghu


* Ping: [PATCH] rs6000: Fix wrong code generation for vec_sel [PR94613]
  2021-05-14  6:57   ` Xionghu Luo
  2021-06-07  2:15     ` Ping: " Xionghu Luo
@ 2021-06-30  1:42     ` Xionghu Luo
  2021-09-06  0:52       ` Ping ^ 2: " Xionghu Luo
  2021-09-15 14:14     ` Segher Boessenkool
  2 siblings, 1 reply; 11+ messages in thread
From: Xionghu Luo @ 2021-06-30  1:42 UTC (permalink / raw)
  To: Segher Boessenkool; +Cc: wschmidt, gcc-patches, linkw, dje.gcc

Gentle ping, thanks.

https://gcc.gnu.org/pipermail/gcc-patches/2021-May/570333.html



-- 
Thanks,
Xionghu


* Ping ^ 2: [PATCH] rs6000: Fix wrong code generation for vec_sel [PR94613]
  2021-06-30  1:42     ` Xionghu Luo
@ 2021-09-06  0:52       ` Xionghu Luo
  2021-09-15  7:50         ` Ping ^ 3: " Xionghu Luo
  0 siblings, 1 reply; 11+ messages in thread
From: Xionghu Luo @ 2021-09-06  0:52 UTC (permalink / raw)
  To: Segher Boessenkool; +Cc: gcc-patches, wschmidt, linkw, dje.gcc

Ping^2, thanks.

https://gcc.gnu.org/pipermail/gcc-patches/2021-May/570333.html

On 2021/6/30 09:42, Xionghu Luo via Gcc-patches wrote:
> Gentle ping, thanks.
> 
> https://gcc.gnu.org/pipermail/gcc-patches/2021-May/570333.html
> 
> 
> On 2021/5/14 14:57, Xionghu Luo via Gcc-patches wrote:
>> Hi,
>>
>> On 2021/5/13 18:49, Segher Boessenkool wrote:
>>> Hi!
>>>
>>> On Fri, Apr 30, 2021 at 01:32:58AM -0500, Xionghu Luo wrote:
>>>> The vsel instruction is a bit-wise select instruction.  Using an
>>>> IF_THEN_ELSE to express it in RTL is wrong and leads to wrong code
>>>> being generated in the combine pass.  Per element selection is a
>>>> subset of per bit-wise selection,with the patch the pattern is
>>>> written using bit operations.  But there are 8 different patterns
>>>> to define "op0 := (op1 & ~op3) | (op2 & op3)":
>>>>
>>>> (~op3&op1) | (op3&op2),
>>>> (~op3&op1) | (op2&op3),
>>>> (op3&op2) | (~op3&op1),
>>>> (op2&op3) | (~op3&op1),
>>>> (op1&~op3) | (op3&op2),
>>>> (op1&~op3) | (op2&op3),
>>>> (op3&op2) | (op1&~op3),
>>>> (op2&op3) | (op1&~op3),
>>>>
>>>> Combine pass will swap (op1&~op3) to (~op3&op1) due to commutative
>>>> canonical, which could reduce it to the FIRST 4 patterns, but it won't
>>>> swap (op2&op3) | (~op3&op1) to (~op3&op1) | (op2&op3), so this patch
>>>> handles it with two patterns with different NOT op3 position and check
>>>> equality inside it.
>>>
>>> Yup, that latter case does not have canonicalisation rules.  Btw, not
>>> only combine does this canonicalisation: everything should,
>>> non-canonical RTL is invalid RTL (in the instruction stream, you can do
>>> everything in temporary code of course, as long as the RTL isn't
>>> malformed).
>>>
>>>> -(define_insn "*altivec_vsel<mode>"
>>>> +(define_insn "altivec_vsel<mode>"
>>>>     [(set (match_operand:VM 0 "altivec_register_operand" "=v")
>>>> -    (if_then_else:VM
>>>> -     (ne:CC (match_operand:VM 1 "altivec_register_operand" "v")
>>>> -        (match_operand:VM 4 "zero_constant" ""))
>>>> -     (match_operand:VM 2 "altivec_register_operand" "v")
>>>> -     (match_operand:VM 3 "altivec_register_operand" "v")))]
>>>> -  "VECTOR_MEM_ALTIVEC_P (<MODE>mode)"
>>>> -  "vsel %0,%3,%2,%1"
>>>> +    (ior:VM
>>>> +     (and:VM
>>>> +      (not:VM (match_operand:VM 3 "altivec_register_operand" "v"))
>>>> +      (match_operand:VM 1 "altivec_register_operand" "v"))
>>>> +     (and:VM
>>>> +      (match_operand:VM 2 "altivec_register_operand" "v")
>>>> +      (match_operand:VM 4 "altivec_register_operand" "v"))))]
>>>> +  "VECTOR_UNIT_ALTIVEC_OR_VSX_P (<MODE>mode)
>>>> +  && (rtx_equal_p (operands[2], operands[3])
>>>> +  || rtx_equal_p (operands[4], operands[3]))"
>>>> +  {
>>>> +    if (rtx_equal_p (operands[2], operands[3]))
>>>> +      return "vsel %0,%1,%4,%3";
>>>> +    else
>>>> +      return "vsel %0,%1,%2,%3";
>>>> +  }
>>>>     [(set_attr "type" "vecmove")])
>>>
>>> That rtx_equal_p stuff is nice and tricky, but it is a bit too tricky I
>>> think.  So please write this as two patterns (and keep the expand if
>>> that helps).
>>
>> I was a bit concerned that there would be a lot of duplicate code if we
>> write two patterns for each vsel, totally 4 similar patterns in
>> altivec.md and another 4 in vsx.md make it difficult to maintain, however
>> I updated it since you prefer this way, as you pointed out the xxsel in
>> vsx.md could be folded by later patch.
>>
>>>
>>>> +(define_insn "altivec_vsel<mode>2"
>>>
>>> (same here of course).
>>>
>>>>   ;; Fused multiply add.
>>>> diff --git a/gcc/config/rs6000/rs6000-call.c 
>>>> b/gcc/config/rs6000/rs6000-call.c
>>>> index f5676255387..d65bdc01055 100644
>>>> --- a/gcc/config/rs6000/rs6000-call.c
>>>> +++ b/gcc/config/rs6000/rs6000-call.c
>>>> @@ -3362,11 +3362,11 @@ const struct altivec_builtin_types 
>>>> altivec_overloaded_builtins[] = {
>>>>       RS6000_BTI_V2DI, RS6000_BTI_V2DI, RS6000_BTI_V2DI, 
>>>> RS6000_BTI_unsigned_V2DI },
>>>>     { ALTIVEC_BUILTIN_VEC_SEL, ALTIVEC_BUILTIN_VSEL_2DI,
>>>>       RS6000_BTI_V2DI, RS6000_BTI_V2DI, RS6000_BTI_V2DI, 
>>>> RS6000_BTI_V2DI },
>>>> -  { ALTIVEC_BUILTIN_VEC_SEL, ALTIVEC_BUILTIN_VSEL_2DI,
>>>> +  { ALTIVEC_BUILTIN_VEC_SEL, ALTIVEC_BUILTIN_VSEL_2DI_UNS,
>>>
>>> Are the _uns things still used for anything?  But, let's not change
>>> this until Bill's stuff is in :-)
>>>
>>> Why do you want to change this here, btw?  I don't understand.
>>
>> OK, they are actually "unsigned type" overload builtin functions, change
>> it or not so far won't cause functionality issue, I will revert this 
>> change
>> in the updated patch.
>>
>>>
>>>> +  if (target == 0
>>>> +      || GET_MODE (target) != tmode
>>>> +      || ! (*insn_data[icode].operand[0].predicate) (target, tmode))
>>>
>>> No space after ! and other unary operators (except for casts and other
>>> operators you write with alphanumerics, like "sizeof").  I know you
>>> copied this code, but :-)
>>
>> OK, thanks.
>>
>>>
>>>> @@ -15608,8 +15606,6 @@ rs6000_emit_vector_cond_expr (rtx dest, rtx 
>>>> op_true, rtx op_false,
>>>>       case GEU:
>>>>       case LTU:
>>>>       case LEU:
>>>> -      /* Mark unsigned tests with CCUNSmode.  */
>>>> -      cc_mode = CCUNSmode;
>>>>         /* Invert condition to avoid compound test if necessary.  */
>>>>         if (rcode == GEU || rcode == LEU)
>>>
>>> So this is related to the _uns thing.  Could you split off that change?
>>> Probably as an earlier patch (but either works for me).
>>
>> It is not related to the ALTIVEC_BUILTIN_VSEL_2DI_UNS things: previously
>> cc_mode was a parameter used to generate the condition for the
>> IF_THEN_ELSE instruction, but we no longer need it now that we use the
>> IOR (AND... AND...) style, so it is removed to avoid a build error.
>>
>>
>> -  cond2 = gen_rtx_fmt_ee (NE, cc_mode, gen_lowpart (dest_mode, mask),
>> -                         CONST0_RTX (dest_mode));
>> -  emit_insn (gen_rtx_SET (dest,
>> -                         gen_rtx_IF_THEN_ELSE (dest_mode,
>> -                                               cond2,
>> -                                               op_true,
>> -                                               op_false)));
>> +  rtx tmp = gen_rtx_IOR (dest_mode,
>> +                        gen_rtx_AND (dest_mode, gen_rtx_NOT 
>> (dest_mode, mask),
>> +                                     op_false),
>> +                        gen_rtx_AND (dest_mode, mask, op_true));
>> +  emit_insn (gen_rtx_SET (dest, tmp));
>>
>>
>>>
>>>> @@ -15629,6 +15625,9 @@ rs6000_emit_vector_cond_expr (rtx dest, rtx 
>>>> op_true, rtx op_false,
>>>>     if (!mask)
>>>>       return 0;
>>>> +  if (mask_mode != dest_mode)
>>>> +      mask = simplify_gen_subreg (dest_mode, mask, mask_mode, 0);
>>>
>>> Indent just two characters please: line continuations (usually) align,
>>> but indents do not.>
>>> Can you fold vsel and xxsel together completely?  They have exactly the
>>> same semantics!  This does not have to be in this patch of course.
>>
>> I noticed that vperm/xxperm are folded together; do you mean folding
>> vsel/xxsel like them?  It's attached as:
>> 0002-rs6000-Fold-xxsel-to-vsel-since-they-have-same-seman.patch
>>
>>
>> Thanks,
>> Xionghu
> 

-- 
Thanks,
Xionghu

^ permalink raw reply	[flat|nested] 11+ messages in thread
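The distinction the patch description draws, that per-element selection is only a special case of per-bit selection, can be sketched in a short model. The following is illustration only, written in Python for clarity: the function names and the 2 x 32-bit element layout are hypothetical, not taken from the GCC sources.

```python
# Illustration only: a Python model of per-bit selection (what the vsel
# hardware instruction does) versus per-element selection (what the old
# IF_THEN_ELSE RTL pattern expressed).

MASK64 = (1 << 64) - 1

def vsel_bitwise(op1, op2, op3):
    """Each result BIT comes from op2 where the mask bit is 1, else from op1."""
    return ((op1 & ~op3) | (op2 & op3)) & MASK64

def sel_per_element(op1, op2, op3, width=32):
    """Each result ELEMENT is chosen whole, by whether its mask element is
    nonzero: the semantics the buggy IF_THEN_ELSE pattern described."""
    out = 0
    for i in range(64 // width):
        emask = ((1 << width) - 1) << (i * width)
        out |= (op2 if op3 & emask else op1) & emask
    return out
```

With an element-uniform mask (each lane all-ones or all-zeros, as produced by a vector compare) the two functions agree, which is why the wrong pattern often went unnoticed; with a mask that mixes bits inside a lane they diverge, which is the PR94613 wrong-code scenario the combine pass could create.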

* Re: Ping ^ 3: [PATCH] rs6000: Fix wrong code generation for vec_sel [PR94613]
  2021-09-06  0:52       ` Ping ^ 2: " Xionghu Luo
@ 2021-09-15  7:50         ` Xionghu Luo
  2021-09-15 13:11           ` David Edelsohn
  0 siblings, 1 reply; 11+ messages in thread
From: Xionghu Luo @ 2021-09-15  7:50 UTC (permalink / raw)
  To: Segher Boessenkool; +Cc: wschmidt, gcc-patches, linkw, dje.gcc

Ping^3, thanks.
  
https://gcc.gnu.org/pipermail/gcc-patches/2021-May/570333.html


On 2021/9/6 08:52, Xionghu Luo via Gcc-patches wrote:
> Ping^2, thanks.
> 
> https://gcc.gnu.org/pipermail/gcc-patches/2021-May/570333.html
> 
> On 2021/6/30 09:42, Xionghu Luo via Gcc-patches wrote:
>> Gentle ping, thanks.
>>
>> https://gcc.gnu.org/pipermail/gcc-patches/2021-May/570333.html
>>
>>
>> On 2021/5/14 14:57, Xionghu Luo via Gcc-patches wrote:
>>> Hi,
>>>
>>> On 2021/5/13 18:49, Segher Boessenkool wrote:
>>>> Hi!
>>>>
>>>> On Fri, Apr 30, 2021 at 01:32:58AM -0500, Xionghu Luo wrote:
>>>>> The vsel instruction is a bit-wise select instruction.  Using an
>>>>> IF_THEN_ELSE to express it in RTL is wrong and leads to wrong code
>>>>> being generated in the combine pass.  Per element selection is a
>>>>> subset of per bit-wise selection, with the patch the pattern is
>>>>> written using bit operations.  But there are 8 different patterns
>>>>> to define "op0 := (op1 & ~op3) | (op2 & op3)":
>>>>>
>>>>> (~op3&op1) | (op3&op2),
>>>>> (~op3&op1) | (op2&op3),
>>>>> (op3&op2) | (~op3&op1),
>>>>> (op2&op3) | (~op3&op1),
>>>>> (op1&~op3) | (op3&op2),
>>>>> (op1&~op3) | (op2&op3),
>>>>> (op3&op2) | (op1&~op3),
>>>>> (op2&op3) | (op1&~op3),
>>>>>
>>>>> The combine pass will swap (op1&~op3) to (~op3&op1) due to commutative
>>>>> canonicalization, which reduces these to the FIRST 4 patterns, but it
>>>>> won't swap (op2&op3) | (~op3&op1) to (~op3&op1) | (op2&op3), so this
>>>>> patch handles that with two patterns that differ in the position of
>>>>> the NOT of op3 and check operand equality inside them.
>>>>
>>>> Yup, that latter case does not have canonicalisation rules.  Btw, not
>>>> only combine does this canonicalisation: everything should,
>>>> non-canonical RTL is invalid RTL (in the instruction stream, you can do
>>>> everything in temporary code of course, as long as the RTL isn't
>>>> malformed).
>>>>
>>>>> -(define_insn "*altivec_vsel<mode>"
>>>>> +(define_insn "altivec_vsel<mode>"
>>>>>     [(set (match_operand:VM 0 "altivec_register_operand" "=v")
>>>>> -    (if_then_else:VM
>>>>> -     (ne:CC (match_operand:VM 1 "altivec_register_operand" "v")
>>>>> -        (match_operand:VM 4 "zero_constant" ""))
>>>>> -     (match_operand:VM 2 "altivec_register_operand" "v")
>>>>> -     (match_operand:VM 3 "altivec_register_operand" "v")))]
>>>>> -  "VECTOR_MEM_ALTIVEC_P (<MODE>mode)"
>>>>> -  "vsel %0,%3,%2,%1"
>>>>> +    (ior:VM
>>>>> +     (and:VM
>>>>> +      (not:VM (match_operand:VM 3 "altivec_register_operand" "v"))
>>>>> +      (match_operand:VM 1 "altivec_register_operand" "v"))
>>>>> +     (and:VM
>>>>> +      (match_operand:VM 2 "altivec_register_operand" "v")
>>>>> +      (match_operand:VM 4 "altivec_register_operand" "v"))))]
>>>>> +  "VECTOR_UNIT_ALTIVEC_OR_VSX_P (<MODE>mode)
>>>>> +  && (rtx_equal_p (operands[2], operands[3])
>>>>> +  || rtx_equal_p (operands[4], operands[3]))"
>>>>> +  {
>>>>> +    if (rtx_equal_p (operands[2], operands[3]))
>>>>> +      return "vsel %0,%1,%4,%3";
>>>>> +    else
>>>>> +      return "vsel %0,%1,%2,%3";
>>>>> +  }
>>>>>     [(set_attr "type" "vecmove")])
>>>>
>>>> That rtx_equal_p stuff is nice and tricky, but it is a bit too tricky I
>>>> think.  So please write this as two patterns (and keep the expand if
>>>> that helps).
>>>
>>> I was a bit concerned that there would be a lot of duplicate code if we
>>> write two patterns for each vsel: four similar patterns in altivec.md
>>> and another four in vsx.md would be difficult to maintain.  However, I
>>> updated it since you prefer this way; as you pointed out, the xxsel
>>> patterns in vsx.md could be folded by a later patch.
>>>
>>>>
>>>>> +(define_insn "altivec_vsel<mode>2"
>>>>
>>>> (same here of course).
>>>>
>>>>>   ;; Fused multiply add.
>>>>> diff --git a/gcc/config/rs6000/rs6000-call.c 
>>>>> b/gcc/config/rs6000/rs6000-call.c
>>>>> index f5676255387..d65bdc01055 100644
>>>>> --- a/gcc/config/rs6000/rs6000-call.c
>>>>> +++ b/gcc/config/rs6000/rs6000-call.c
>>>>> @@ -3362,11 +3362,11 @@ const struct altivec_builtin_types 
>>>>> altivec_overloaded_builtins[] = {
>>>>>       RS6000_BTI_V2DI, RS6000_BTI_V2DI, RS6000_BTI_V2DI, 
>>>>> RS6000_BTI_unsigned_V2DI },
>>>>>     { ALTIVEC_BUILTIN_VEC_SEL, ALTIVEC_BUILTIN_VSEL_2DI,
>>>>>       RS6000_BTI_V2DI, RS6000_BTI_V2DI, RS6000_BTI_V2DI, 
>>>>> RS6000_BTI_V2DI },
>>>>> -  { ALTIVEC_BUILTIN_VEC_SEL, ALTIVEC_BUILTIN_VSEL_2DI,
>>>>> +  { ALTIVEC_BUILTIN_VEC_SEL, ALTIVEC_BUILTIN_VSEL_2DI_UNS,
>>>>
>>>> Are the _uns things still used for anything?  But, let's not change
>>>> this until Bill's stuff is in :-)
>>>>
>>>> Why do you want to change this here, btw?  I don't understand.
>>>
>>> OK, they are actually the "unsigned type" overloaded builtin functions;
>>> changing them or not causes no functionality issue so far.  I will
>>> revert this change in the updated patch.
>>>
>>>>
>>>>> +  if (target == 0
>>>>> +      || GET_MODE (target) != tmode
>>>>> +      || ! (*insn_data[icode].operand[0].predicate) (target, tmode))
>>>>
>>>> No space after ! and other unary operators (except for casts and other
>>>> operators you write with alphanumerics, like "sizeof").  I know you
>>>> copied this code, but :-)
>>>
>>> OK, thanks.
>>>
>>>>
>>>>> @@ -15608,8 +15606,6 @@ rs6000_emit_vector_cond_expr (rtx dest, rtx 
>>>>> op_true, rtx op_false,
>>>>>       case GEU:
>>>>>       case LTU:
>>>>>       case LEU:
>>>>> -      /* Mark unsigned tests with CCUNSmode.  */
>>>>> -      cc_mode = CCUNSmode;
>>>>>         /* Invert condition to avoid compound test if necessary.  */
>>>>>         if (rcode == GEU || rcode == LEU)
>>>>
>>>> So this is related to the _uns thing.  Could you split off that change?
>>>> Probably as an earlier patch (but either works for me).
>>>
>>> It is not related to the ALTIVEC_BUILTIN_VSEL_2DI_UNS things: previously
>>> cc_mode was a parameter used to generate the condition for the
>>> IF_THEN_ELSE instruction, but we no longer need it now that we use the
>>> IOR (AND... AND...) style, so it is removed to avoid a build error.
>>>
>>>
>>> -  cond2 = gen_rtx_fmt_ee (NE, cc_mode, gen_lowpart (dest_mode, mask),
>>> -                         CONST0_RTX (dest_mode));
>>> -  emit_insn (gen_rtx_SET (dest,
>>> -                         gen_rtx_IF_THEN_ELSE (dest_mode,
>>> -                                               cond2,
>>> -                                               op_true,
>>> -                                               op_false)));
>>> +  rtx tmp = gen_rtx_IOR (dest_mode,
>>> +                        gen_rtx_AND (dest_mode, gen_rtx_NOT 
>>> (dest_mode, mask),
>>> +                                     op_false),
>>> +                        gen_rtx_AND (dest_mode, mask, op_true));
>>> +  emit_insn (gen_rtx_SET (dest, tmp));
>>>
>>>
>>>>
>>>>> @@ -15629,6 +15625,9 @@ rs6000_emit_vector_cond_expr (rtx dest, rtx 
>>>>> op_true, rtx op_false,
>>>>>     if (!mask)
>>>>>       return 0;
>>>>> +  if (mask_mode != dest_mode)
>>>>> +      mask = simplify_gen_subreg (dest_mode, mask, mask_mode, 0);
>>>>
>>>> Indent just two characters please: line continuations (usually) align,
>>>> but indents do not.>
>>>> Can you fold vsel and xxsel together completely?  They have exactly the
>>>> same semantics!  This does not have to be in this patch of course.
>>>
>>> I noticed that vperm/xxperm are folded together; do you mean folding
>>> vsel/xxsel like them?  It's attached as:
>>> 0002-rs6000-Fold-xxsel-to-vsel-since-they-have-same-seman.patch
>>>
>>>
>>> Thanks,
>>> Xionghu
>>
> 

-- 
Thanks,
Xionghu

^ permalink raw reply	[flat|nested] 11+ messages in thread
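The eight equivalent orderings enumerated in the quoted patch description are one Boolean function written in eight RTL shapes; only canonicalization decides which shapes can actually reach a define_insn. A quick check of that equivalence, illustration only and modeled in Python rather than RTL:

```python
# Illustration only: all eight AND/IOR orderings from the patch description
# compute the same value op0 = (op1 & ~op3) | (op2 & op3); they differ only
# in operand order, i.e. in RTL shape.

MASK = (1 << 64) - 1

def eight_forms(op1, op2, op3):
    n3 = ~op3 & MASK  # the NOT of the mask, truncated to 64 bits
    return [
        (n3 & op1) | (op3 & op2),
        (n3 & op1) | (op2 & op3),
        (op3 & op2) | (n3 & op1),
        (op2 & op3) | (n3 & op1),
        (op1 & n3) | (op3 & op2),
        (op1 & n3) | (op2 & op3),
        (op3 & op2) | (op1 & n3),
        (op2 & op3) | (op1 & n3),
    ]
```

As the discussion above notes, canonicalization rewrites (op1 & ~op3) as (~op3 & op1), collapsing the list to the first four shapes, but nothing reorders the two arms of the IOR, which is why two define_insn variants are still needed.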

* Re: Ping ^ 3: [PATCH] rs6000: Fix wrong code generation for vec_sel [PR94613]
  2021-09-15  7:50         ` Ping ^ 3: " Xionghu Luo
@ 2021-09-15 13:11           ` David Edelsohn
  2021-09-17  5:43             ` Xionghu Luo
  0 siblings, 1 reply; 11+ messages in thread
From: David Edelsohn @ 2021-09-15 13:11 UTC (permalink / raw)
  To: Xionghu Luo; +Cc: Segher Boessenkool, Bill Schmidt, GCC Patches, linkw

Hi, Xionghu

Should "altivec_vsel<mode>2" .. 3 .. 4 be "*altivec_vsel<mode>2", etc.
because they are combiner patterns and never referenced by name?  Only
the first, named pattern is referenced by the builtin code.

Other than that question / suggestion, this patch is okay.  Please
coordinate with Bill and his builtin patches.

Thanks, David


^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH] rs6000: Fix wrong code generation for vec_sel [PR94613]
  2021-05-14  6:57   ` Xionghu Luo
  2021-06-07  2:15     ` Ping: " Xionghu Luo
  2021-06-30  1:42     ` Xionghu Luo
@ 2021-09-15 14:14     ` Segher Boessenkool
  2 siblings, 0 replies; 11+ messages in thread
From: Segher Boessenkool @ 2021-09-15 14:14 UTC (permalink / raw)
  To: Xionghu Luo; +Cc: gcc-patches, dje.gcc, wschmidt, guojiufu, linkw

Hi!

Please do not send patches as attachments to replies.  Each patch (or
patch series) starts its own thread.  New versions of patches (or patch
series) are new threads.

> From: Xionghu Luo <luoxhu@linux.ibm.com>
> Date: Tue, 27 Apr 2021 01:07:25 -0500
> Subject: [PATCH 1/2] rs6000: Fix wrong code generation for vec_sel [PR94613]

> 	* config/rs6000/rs6000-call.c (altivec_expand_vec_sel_builtin):
> 	New.

That fits on one line.  Changelogs are 80 chars wide.

> 	* config/rs6000/rs6000.c (rs6000_emit_vector_cond_expr): Use
> 	bit-wise selection instead of per element.

So "bit-wise" fits on the previous line, too.

> +  "VECTOR_UNIT_ALTIVEC_OR_VSX_P (<MODE>mode)
> +  && rtx_equal_p (operands[2], operands[3])"

The "&&" should align with "VECTOR.."  (Many times in this).

> +(define_insn "altivec_vsel<mode>4"
> +  [(set (match_operand:VM 0 "altivec_register_operand" "=v")
> +	(ior:VM
> +	 (and:VM
> +	  (match_operand:VM 1 "altivec_register_operand" "v")
> +	  (match_operand:VM 2 "altivec_register_operand" "v"))
> +	 (and:VM
> +	  (not:VM (match_operand:VM 3 "altivec_register_operand" "v"))
> +	  (match_operand:VM 4 "altivec_register_operand" "v"))))]
> +  "VECTOR_UNIT_ALTIVEC_OR_VSX_P (<MODE>mode)
> +  && rtx_equal_p (operands[2], operands[3])"
> +  "vsel %0,%4,%1,%3"
>    [(set_attr "type" "vecmove")])

I still don't see how rtx_equal_p is correct here.  Either it should be
a match_dup or the constraints can do that.

> +    (ior:VEC_L
> +     (and:VEC_L (not:VEC_L (match_operand:VEC_L 3 "vlogical_operand"))
> +      (match_operand:VEC_L 1 "vlogical_operand"))

We indent RTL by two chars, not one.  An advantage of that is that wrong
indent like in this last line is more obvious (the "(match.." should
align with the "(not").

> +     (and:VEC_L (match_dup 3)
> +     (match_operand:VEC_L 2 "vlogical_operand"))))]

The two "(match.." should align here.

> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/powerpc/pr94613.c
> @@ -0,0 +1,38 @@
> +/* { dg-do run } */
> +/* { dg-require-effective-target vmx_hw } */
> +/* { dg-options "-O3 -maltivec" } */

Why -O3?  Please just -O2 (if that works).

> From 74cf1fd298e4923c106deaba3192423d48049559 Mon Sep 17 00:00:00 2001
> From: Xionghu Luo <luoxhu@linux.ibm.com>
> Date: Fri, 14 May 2021 01:21:06 -0500
> Subject: [PATCH 2/2] rs6000: Fold xxsel to vsel since they have same semantics

Never send two patches in one mail.  Make a series, please.

> --- a/gcc/config/rs6000/altivec.md
> +++ b/gcc/config/rs6000/altivec.md
> @@ -668,59 +668,67 @@ (define_insn "*altivec_gev4sf"
>    [(set_attr "type" "veccmp")])
>  
>  (define_insn "altivec_vsel<mode>"
> -  [(set (match_operand:VM 0 "altivec_register_operand" "=v")
> +  [(set (match_operand:VM 0 "altivec_register_operand" "=wa,v")
>  	(ior:VM
>  	 (and:VM
> -	  (not:VM (match_operand:VM 3 "altivec_register_operand" "v"))
> -	  (match_operand:VM 1 "altivec_register_operand" "v"))
> +	  (not:VM (match_operand:VM 3 "altivec_register_operand" "wa,v"))
> +	  (match_operand:VM 1 "altivec_register_operand" "wa,v"))
>  	 (and:VM
> -	  (match_operand:VM 2 "altivec_register_operand" "v")
> -	  (match_operand:VM 4 "altivec_register_operand" "v"))))]
> +	  (match_operand:VM 2 "altivec_register_operand" "wa,v")
> +	  (match_operand:VM 4 "altivec_register_operand" "wa,v"))))]
>    "VECTOR_UNIT_ALTIVEC_OR_VSX_P (<MODE>mode)
>    && rtx_equal_p (operands[2], operands[3])"
> -  "vsel %0,%1,%4,%3"
> +  "@
> +  xxsel %x0,%x1,%x4,%x3
> +  vsel %0,%1,%4,%3"

The mnemonics should align with the @.

This ordering makes us prefer xxsel over vsel.  Do we want that?  We
probably do, but it is a change I think?

Do we want to add an "isa" attribute?  Most patterns still don't, but we
probably should wherever we can.

"altivec_register_operand" is wrong.  Just "gpc_reg_operand" I think?


Segher

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: Ping ^ 3: [PATCH] rs6000: Fix wrong code generation for vec_sel [PR94613]
  2021-09-15 13:11           ` David Edelsohn
@ 2021-09-17  5:43             ` Xionghu Luo
  0 siblings, 0 replies; 11+ messages in thread
From: Xionghu Luo @ 2021-09-17  5:43 UTC (permalink / raw)
  To: David Edelsohn; +Cc: Segher Boessenkool, Bill Schmidt, GCC Patches, linkw



On 2021/9/15 21:11, David Edelsohn wrote:
> Hi, Xionghu
> 
> Should "altivec_vsel<mode>2" .. 3 .. 4 be "*altivec_vsel<mode>2", etc.
> because they are combiner patterns and never referenced by name?  Only
> the first, named pattern is referenced by the builtin code.

Thanks.  I updated the patchset with Segher's review comments; he didn't
mention this part, and sorry for forgetting to change it.  I am also not
sure whether "altivec_vsel<mode>2" .. 3 .. 4 will be used/generated or
optimized by the expander in the future; is there any benefit to adding
"*" to the define_insn patterns?

> 
> Other than that question / suggestion, this patch is okay.  Please
> coordinate with Bill and his builtin patches.

OK.

> 
> Thanks, David

-- 
Thanks,
Xionghu

^ permalink raw reply	[flat|nested] 11+ messages in thread

end of thread, other threads:[~2021-09-17  5:44 UTC | newest]

Thread overview: 11+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-04-30  6:32 [PATCH] rs6000: Fix wrong code generation for vec_sel [PR94613] Xionghu Luo
2021-05-13  1:18 ` *Ping*: " Xionghu Luo
2021-05-13 10:49 ` Segher Boessenkool
2021-05-14  6:57   ` Xionghu Luo
2021-06-07  2:15     ` Ping: " Xionghu Luo
2021-06-30  1:42     ` Xionghu Luo
2021-09-06  0:52       ` Ping ^ 2: " Xionghu Luo
2021-09-15  7:50         ` Ping ^ 3: " Xionghu Luo
2021-09-15 13:11           ` David Edelsohn
2021-09-17  5:43             ` Xionghu Luo
2021-09-15 14:14     ` Segher Boessenkool

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).