public inbox for gcc-patches@gcc.gnu.org
* [PATCH v2 0/2] Fix vec_sel code generation and merge xxsel to vsel
@ 2021-09-17  5:25 Xionghu Luo
  2021-09-17  5:25 ` [PATCH v2 1/2] rs6000: Fix wrong code generation for vec_sel [PR94613] Xionghu Luo
                   ` (2 more replies)
  0 siblings, 3 replies; 7+ messages in thread
From: Xionghu Luo @ 2021-09-17  5:25 UTC (permalink / raw)
  To: gcc-patches; +Cc: segher, dje.gcc, wschmidt, guojiufu, linkw, Xionghu Luo

These two patches are an updated version of:
https://gcc.gnu.org/pipermail/gcc-patches/2021-September/579490.html

Changes:
1. Fix alignment error in md files.
2. Replace rtx_equal_p with match_dup.
3. Use register_operand instead of gpc_reg_operand, to be consistent
   with vperm/xxperm.
4. Regression tests pass on P8LE.

Xionghu Luo (2):
  rs6000: Fix wrong code generation for vec_sel [PR94613]
  rs6000: Fold xxsel to vsel since they have same semantics

 gcc/config/rs6000/altivec.md                  | 84 ++++++++++++++-----
 gcc/config/rs6000/rs6000-call.c               | 62 ++++++++++++++
 gcc/config/rs6000/rs6000.c                    | 19 ++---
 gcc/config/rs6000/vector.md                   | 26 +++---
 gcc/config/rs6000/vsx.md                      | 25 ------
 gcc/testsuite/gcc.target/powerpc/builtins-1.c |  2 +-
 gcc/testsuite/gcc.target/powerpc/pr94613.c    | 47 +++++++++++
 7 files changed, 193 insertions(+), 72 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/powerpc/pr94613.c

-- 
2.25.1



* [PATCH v2 1/2] rs6000: Fix wrong code generation for vec_sel [PR94613]
  2021-09-17  5:25 [PATCH v2 0/2] Fix vec_sel code generation and merge xxsel to vsel Xionghu Luo
@ 2021-09-17  5:25 ` Xionghu Luo
  2021-09-17  5:25 ` [PATCH v2 2/2] rs6000: Fold xxsel to vsel since they have same semantics Xionghu Luo
  2021-10-08  1:17 ` Ping: [PATCH v2 0/2] Fix vec_sel code generation and merge xxsel to vsel Xionghu Luo
  2 siblings, 0 replies; 7+ messages in thread
From: Xionghu Luo @ 2021-09-17  5:25 UTC (permalink / raw)
  To: gcc-patches; +Cc: segher, dje.gcc, wschmidt, guojiufu, linkw, Xionghu Luo

The vsel instruction is a bit-wise select instruction.  Using an
IF_THEN_ELSE to express it in RTL is wrong and leads to wrong code
being generated in the combine pass.  Per-element selection is a
special case of bit-wise selection, so this patch rewrites the
pattern using bit operations.  There are, however, 8 different ways
to write "op0 := (op1 & ~op3) | (op2 & op3)":

(~op3&op1) | (op3&op2),
(~op3&op1) | (op2&op3),
(op3&op2) | (~op3&op1),
(op2&op3) | (~op3&op1),
(op1&~op3) | (op3&op2),
(op1&~op3) | (op2&op3),
(op3&op2) | (op1&~op3),
(op2&op3) | (op1&~op3),
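
As a rough illustration (not part of the patch), the eight spellings
listed above can be checked against each other on plain integers.
The names sel_ref, sel_form and all_forms_agree are made up for this
sketch.  It only shows that the forms compute the same value; it says
nothing about which of them RTL considers canonical.

```c
#include <assert.h>
#include <stdint.h>

/* Reference form from the text: op0 := (op1 & ~op3) | (op2 & op3).  */
static uint8_t
sel_ref (uint8_t op1, uint8_t op2, uint8_t op3)
{
  return (uint8_t) ((op1 & ~op3) | (op2 & op3));
}

/* Evaluate spelling N (0..7), in the order the forms are listed.  */
static uint8_t
sel_form (int n, uint8_t op1, uint8_t op2, uint8_t op3)
{
  uint8_t m = op3;
  uint8_t nm = (uint8_t) ~op3;

  assert (n >= 0 && n < 8);
  switch (n)
    {
    case 0: return (uint8_t) ((nm & op1) | (m & op2));
    case 1: return (uint8_t) ((nm & op1) | (op2 & m));
    case 2: return (uint8_t) ((m & op2) | (nm & op1));
    case 3: return (uint8_t) ((op2 & m) | (nm & op1));
    case 4: return (uint8_t) ((op1 & nm) | (m & op2));
    case 5: return (uint8_t) ((op1 & nm) | (op2 & m));
    case 6: return (uint8_t) ((m & op2) | (op1 & nm));
    default: return (uint8_t) ((op2 & m) | (op1 & nm));
    }
}

/* Return 1 if all eight spellings match sel_ref for every 4-bit
   input.  The operation is bit-wise, so each bit position behaves
   the same way and wider inputs add no new cases.  */
static int
all_forms_agree (void)
{
  for (unsigned a = 0; a < 16; a++)
    for (unsigned b = 0; b < 16; b++)
      for (unsigned c = 0; c < 16; c++)
        for (int n = 0; n < 8; n++)
          if (sel_form (n, (uint8_t) a, (uint8_t) b, (uint8_t) c)
              != sel_ref ((uint8_t) a, (uint8_t) b, (uint8_t) c))
            return 0;
  return 1;
}
```

Calling all_forms_agree () from a small driver is enough to confirm
that the forms differ only in operand order.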

The last 4 forms do not follow the canonicalisation rules, and
non-canonical RTL is invalid RTL in the vregs pass.  The combine pass
will also swap (op1&~op3) to (~op3&op1) when it canonicalises
commutative operands, which reduces them to the first 4 forms.  But it
won't swap (op2&op3) | (~op3&op1) to (~op3&op1) | (op2&op3), so this
patch handles the remaining cases with 4 patterns that differ in the
position of NOT op3, each using match_dup to check operand equality.

Tests pass on Power8LE.  Any comments?
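
To make the difference between the two readings concrete, here is a
one-lane scalar model of foo () from the new test case.  It is only a
sketch: the function names are invented, and foo_element_folded is an
assumption about what a per-element IF_THEN_ELSE permits (in the arm
taken when a == 0, a + b may be folded to b), not a transcript of what
combine actually emitted.

```c
#include <assert.h>
#include <stdint.h>

/* Bit-wise select, as vsel does it: take B where the mask bit is 1,
   otherwise A.  */
static uint8_t
bitwise_sel (uint8_t a, uint8_t b, uint8_t mask)
{
  return (uint8_t) ((a & ~mask) | (b & mask));
}

/* Per-element select: the whole element is B if the mask element is
   nonzero, otherwise A.  This models the old IF_THEN_ELSE reading.  */
static uint8_t
element_sel (uint8_t a, uint8_t b, uint8_t mask)
{
  return mask != 0 ? b : a;
}

/* One lane of foo: vec_sel (a + b, c, a), bit-wise.  */
static uint8_t
foo_bitwise (uint8_t a, uint8_t b, uint8_t c)
{
  return bitwise_sel ((uint8_t) (a + b), c, a);
}

/* Assumed result of the per-element reading: the "else" arm is only
   reached when a == 0, so a + b can be simplified to b there.  */
static uint8_t
foo_element_folded (uint8_t a, uint8_t b, uint8_t c)
{
  return a != 0 ? c : b;
}

/* The inputs used by main () in the test: a = b = c = 1.  */
static void
check_lane (void)
{
  assert (foo_bitwise (1, 1, 1) == 3);
  assert (foo_element_folded (1, 1, 1) == 1);
  /* The two selects agree when the mask is all ones or all zeros.  */
  assert (element_sel (2, 1, 0xFF) == bitwise_sel (2, 1, 0xFF));
  assert (element_sel (2, 1, 0x00) == bitwise_sel (2, 1, 0x00));
}
```

With a = b = c = 1 the bit-wise form yields 3, which is what the test
expects, while the folded per-element form yields 1.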

gcc/ChangeLog:

2021-09-17  Xionghu Luo  <luoxhu@linux.ibm.com>

	* config/rs6000/altivec.md (*altivec_vsel<mode>): Change to ...
	(altivec_vsel<mode>): ... this and update define.
	(*altivec_vsel<mode>_uns): Delete.
	(altivec_vsel<mode>2): New define_insn.
	(altivec_vsel<mode>3): Likewise.
	(altivec_vsel<mode>4): Likewise.
	* config/rs6000/rs6000-call.c (altivec_expand_vec_sel_builtin): New.
	(altivec_expand_builtin): Call altivec_expand_vec_sel_builtin to expand
	vec_sel.
	* config/rs6000/rs6000.c (rs6000_emit_vector_cond_expr): Use bit-wise
	selection instead of per element.
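	* config/rs6000/vector.md (vector_select_<mode>): Use bit-wise
	selection instead of per element.
	(vector_select_<mode>_uns): Likewise.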
	* config/rs6000/vsx.md (*vsx_xxsel<mode>): Change to ...
	(vsx_xxsel<mode>): ... this and update define.
	(*vsx_xxsel<mode>_uns): Delete.
	(vsx_xxsel<mode>2): New define_insn.
	(vsx_xxsel<mode>3): Likewise.
	(vsx_xxsel<mode>4): Likewise.

gcc/testsuite/ChangeLog:

2021-09-17  Xionghu Luo  <luoxhu@linux.ibm.com>

	* gcc.target/powerpc/pr94613.c: New test.
---
 gcc/config/rs6000/altivec.md               | 62 ++++++++++++++++------
 gcc/config/rs6000/rs6000-call.c            | 62 ++++++++++++++++++++++
 gcc/config/rs6000/rs6000.c                 | 19 +++----
 gcc/config/rs6000/vector.md                | 26 +++++----
 gcc/config/rs6000/vsx.md                   | 60 ++++++++++++++++-----
 gcc/testsuite/gcc.target/powerpc/pr94613.c | 47 ++++++++++++++++
 6 files changed, 221 insertions(+), 55 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/powerpc/pr94613.c

diff --git a/gcc/config/rs6000/altivec.md b/gcc/config/rs6000/altivec.md
index 93d237156d5..a3424e1a458 100644
--- a/gcc/config/rs6000/altivec.md
+++ b/gcc/config/rs6000/altivec.md
@@ -683,26 +683,56 @@ (define_insn "*altivec_gev4sf"
   "vcmpgefp %0,%1,%2"
   [(set_attr "type" "veccmp")])
 
-(define_insn "*altivec_vsel<mode>"
+(define_insn "altivec_vsel<mode>"
   [(set (match_operand:VM 0 "altivec_register_operand" "=v")
-	(if_then_else:VM
-	 (ne:CC (match_operand:VM 1 "altivec_register_operand" "v")
-		(match_operand:VM 4 "zero_constant" ""))
-	 (match_operand:VM 2 "altivec_register_operand" "v")
-	 (match_operand:VM 3 "altivec_register_operand" "v")))]
-  "VECTOR_MEM_ALTIVEC_P (<MODE>mode)"
-  "vsel %0,%3,%2,%1"
+	(ior:VM
+	  (and:VM
+	    (not:VM (match_operand:VM 3 "altivec_register_operand" "v"))
+	    (match_operand:VM 1 "altivec_register_operand" "v"))
+	  (and:VM
+	    (match_dup 3)
+	    (match_operand:VM 2 "altivec_register_operand" "v"))))]
+  "VECTOR_MEM_ALTIVEC_OR_VSX_P (<MODE>mode)"
+  "vsel %0,%1,%2,%3"
   [(set_attr "type" "vecmove")])
 
-(define_insn "*altivec_vsel<mode>_uns"
+(define_insn "altivec_vsel<mode>2"
   [(set (match_operand:VM 0 "altivec_register_operand" "=v")
-	(if_then_else:VM
-	 (ne:CCUNS (match_operand:VM 1 "altivec_register_operand" "v")
-		   (match_operand:VM 4 "zero_constant" ""))
-	 (match_operand:VM 2 "altivec_register_operand" "v")
-	 (match_operand:VM 3 "altivec_register_operand" "v")))]
-  "VECTOR_MEM_ALTIVEC_P (<MODE>mode)"
-  "vsel %0,%3,%2,%1"
+	(ior:VM
+	  (and:VM
+	    (not:VM (match_operand:VM 3 "altivec_register_operand" "v"))
+	    (match_operand:VM 1 "altivec_register_operand" "v"))
+	  (and:VM
+	    (match_operand:VM 2 "altivec_register_operand" "v")
+	    (match_dup 3))))]
+  "VECTOR_MEM_ALTIVEC_OR_VSX_P (<MODE>mode)"
+  "vsel %0,%1,%2,%3"
+  [(set_attr "type" "vecmove")])
+
+(define_insn "altivec_vsel<mode>3"
+  [(set (match_operand:VM 0 "altivec_register_operand" "=v")
+	(ior:VM
+	  (and:VM
+	    (match_operand:VM 3 "altivec_register_operand" "v")
+	    (match_operand:VM 1 "altivec_register_operand" "v"))
+	  (and:VM
+	    (not:VM (match_dup 3))
+	    (match_operand:VM 2 "altivec_register_operand" "v"))))]
+  "VECTOR_MEM_ALTIVEC_OR_VSX_P (<MODE>mode)"
+  "vsel %0,%2,%1,%3"
+  [(set_attr "type" "vecmove")])
+
+(define_insn "altivec_vsel<mode>4"
+  [(set (match_operand:VM 0 "altivec_register_operand" "=v")
+	(ior:VM
+	  (and:VM
+	    (match_operand:VM 1 "altivec_register_operand" "v")
+	    (match_operand:VM 3 "altivec_register_operand" "v"))
+	  (and:VM
+	    (not:VM (match_dup 3))
+	    (match_operand:VM 2 "altivec_register_operand" "v"))))]
+  "VECTOR_MEM_ALTIVEC_OR_VSX_P (<MODE>mode)"
+  "vsel %0,%2,%1,%3"
   [(set_attr "type" "vecmove")])
 
 ;; Fused multiply add.
diff --git a/gcc/config/rs6000/rs6000-call.c b/gcc/config/rs6000/rs6000-call.c
index e8625d17d18..cddfa76e7cb 100644
--- a/gcc/config/rs6000/rs6000-call.c
+++ b/gcc/config/rs6000/rs6000-call.c
@@ -11109,6 +11109,45 @@ altivec_expand_vec_ext_builtin (tree exp, rtx target)
   return target;
 }
 
+/* Expand vec_sel builtin.  */
+static rtx
+altivec_expand_vec_sel_builtin (enum insn_code icode, tree exp, rtx target)
+{
+  rtx op0, op1, op2, pat;
+  tree arg0, arg1, arg2;
+
+  arg0 = CALL_EXPR_ARG (exp, 0);
+  op0 = expand_normal (arg0);
+  arg1 = CALL_EXPR_ARG (exp, 1);
+  op1 = expand_normal (arg1);
+  arg2 = CALL_EXPR_ARG (exp, 2);
+  op2 = expand_normal (arg2);
+
+  machine_mode tmode = insn_data[icode].operand[0].mode;
+  machine_mode mode0 = insn_data[icode].operand[1].mode;
+  machine_mode mode1 = insn_data[icode].operand[2].mode;
+  machine_mode mode2 = insn_data[icode].operand[3].mode;
+
+  if (target == 0 || GET_MODE (target) != tmode
+      || !(*insn_data[icode].operand[0].predicate) (target, tmode))
+    target = gen_reg_rtx (tmode);
+
+  if (!(*insn_data[icode].operand[1].predicate) (op0, mode0))
+    op0 = copy_to_mode_reg (mode0, op0);
+  if (!(*insn_data[icode].operand[2].predicate) (op1, mode1))
+    op1 = copy_to_mode_reg (mode1, op1);
+  if (!(*insn_data[icode].operand[3].predicate) (op2, mode2))
+    op2 = copy_to_mode_reg (mode2, op2);
+
+  pat = GEN_FCN (icode) (target, op0, op1, op2, op2);
+  if (pat)
+    emit_insn (pat);
+  else
+    return NULL_RTX;
+
+  return target;
+}
+
 /* Expand the builtin in EXP and store the result in TARGET.  Store
    true in *EXPANDEDP if we found a builtin to expand.  */
 static rtx
@@ -11294,6 +11333,29 @@ altivec_expand_builtin (tree exp, rtx target, bool *expandedp)
 	emit_insn (pat);
       return NULL_RTX;
 
+    case ALTIVEC_BUILTIN_VSEL_2DF:
+      return altivec_expand_vec_sel_builtin (CODE_FOR_altivec_vselv2df, exp,
+					     target);
+    case ALTIVEC_BUILTIN_VSEL_2DI:
+    case ALTIVEC_BUILTIN_VSEL_2DI_UNS:
+      return altivec_expand_vec_sel_builtin (CODE_FOR_altivec_vselv2di, exp,
+					     target);
+    case ALTIVEC_BUILTIN_VSEL_4SF:
+      return altivec_expand_vec_sel_builtin (CODE_FOR_altivec_vselv4sf, exp,
+					     target);
+    case ALTIVEC_BUILTIN_VSEL_4SI:
+    case ALTIVEC_BUILTIN_VSEL_4SI_UNS:
+      return altivec_expand_vec_sel_builtin (CODE_FOR_altivec_vselv4si, exp,
+					     target);
+    case ALTIVEC_BUILTIN_VSEL_8HI:
+    case ALTIVEC_BUILTIN_VSEL_8HI_UNS:
+      return altivec_expand_vec_sel_builtin (CODE_FOR_altivec_vselv8hi, exp,
+					     target);
+    case ALTIVEC_BUILTIN_VSEL_16QI:
+    case ALTIVEC_BUILTIN_VSEL_16QI_UNS:
+      return altivec_expand_vec_sel_builtin (CODE_FOR_altivec_vselv16qi, exp,
+					     target);
+
     case ALTIVEC_BUILTIN_DSSALL:
       emit_insn (gen_altivec_dssall ());
       return NULL_RTX;
diff --git a/gcc/config/rs6000/rs6000.c b/gcc/config/rs6000/rs6000.c
index ad81dfb316d..c9ce0550df1 100644
--- a/gcc/config/rs6000/rs6000.c
+++ b/gcc/config/rs6000/rs6000.c
@@ -15785,9 +15785,7 @@ rs6000_emit_vector_cond_expr (rtx dest, rtx op_true, rtx op_false,
   machine_mode dest_mode = GET_MODE (dest);
   machine_mode mask_mode = GET_MODE (cc_op0);
   enum rtx_code rcode = GET_CODE (cond);
-  machine_mode cc_mode = CCmode;
   rtx mask;
-  rtx cond2;
   bool invert_move = false;
 
   if (VECTOR_UNIT_NONE_P (dest_mode))
@@ -15827,8 +15825,6 @@ rs6000_emit_vector_cond_expr (rtx dest, rtx op_true, rtx op_false,
     case GEU:
     case LTU:
     case LEU:
-      /* Mark unsigned tests with CCUNSmode.  */
-      cc_mode = CCUNSmode;
 
       /* Invert condition to avoid compound test if necessary.  */
       if (rcode == GEU || rcode == LEU)
@@ -15848,6 +15844,9 @@ rs6000_emit_vector_cond_expr (rtx dest, rtx op_true, rtx op_false,
   if (!mask)
     return 0;
 
+  if (mask_mode != dest_mode)
+    mask = simplify_gen_subreg (dest_mode, mask, mask_mode, 0);
+
   if (invert_move)
     std::swap (op_true, op_false);
 
@@ -15887,13 +15886,11 @@ rs6000_emit_vector_cond_expr (rtx dest, rtx op_true, rtx op_false,
   if (!REG_P (op_false) && !SUBREG_P (op_false))
     op_false = force_reg (dest_mode, op_false);
 
-  cond2 = gen_rtx_fmt_ee (NE, cc_mode, gen_lowpart (dest_mode, mask),
-			  CONST0_RTX (dest_mode));
-  emit_insn (gen_rtx_SET (dest,
-			  gen_rtx_IF_THEN_ELSE (dest_mode,
-						cond2,
-						op_true,
-						op_false)));
+  rtx tmp = gen_rtx_IOR (dest_mode,
+			 gen_rtx_AND (dest_mode, gen_rtx_NOT (dest_mode, mask),
+				      op_false),
+			 gen_rtx_AND (dest_mode, mask, op_true));
+  emit_insn (gen_rtx_SET (dest, tmp));
   return 1;
 }
 
diff --git a/gcc/config/rs6000/vector.md b/gcc/config/rs6000/vector.md
index 7e36c788b97..062aef70f2b 100644
--- a/gcc/config/rs6000/vector.md
+++ b/gcc/config/rs6000/vector.md
@@ -916,23 +916,21 @@ (define_insn_and_split "vector_<code><mode>"
 ;; which is in the reverse order that we want
 (define_expand "vector_select_<mode>"
   [(set (match_operand:VEC_L 0 "vlogical_operand")
-	(if_then_else:VEC_L
-	 (ne:CC (match_operand:VEC_L 3 "vlogical_operand")
-		(match_dup 4))
-	 (match_operand:VEC_L 2 "vlogical_operand")
-	 (match_operand:VEC_L 1 "vlogical_operand")))]
-  "VECTOR_UNIT_ALTIVEC_OR_VSX_P (<MODE>mode)"
-  "operands[4] = CONST0_RTX (<MODE>mode);")
+	(ior:VEC_L
+	  (and:VEC_L (not:VEC_L (match_operand:VEC_L 3 "vlogical_operand"))
+		     (match_operand:VEC_L 1 "vlogical_operand"))
+	  (and:VEC_L (match_dup 3)
+		     (match_operand:VEC_L 2 "vlogical_operand"))))]
+  "VECTOR_UNIT_ALTIVEC_OR_VSX_P (<MODE>mode)")
 
 (define_expand "vector_select_<mode>_uns"
   [(set (match_operand:VEC_L 0 "vlogical_operand")
-	(if_then_else:VEC_L
-	 (ne:CCUNS (match_operand:VEC_L 3 "vlogical_operand")
-		   (match_dup 4))
-	 (match_operand:VEC_L 2 "vlogical_operand")
-	 (match_operand:VEC_L 1 "vlogical_operand")))]
-  "VECTOR_UNIT_ALTIVEC_OR_VSX_P (<MODE>mode)"
-  "operands[4] = CONST0_RTX (<MODE>mode);")
+	(ior:VEC_L
+	  (and:VEC_L (not:VEC_L (match_operand:VEC_L 3 "vlogical_operand"))
+		     (match_operand:VEC_L 1 "vlogical_operand"))
+	  (and:VEC_L (match_dup 3)
+		     (match_operand:VEC_L 2 "vlogical_operand"))))]
+  "VECTOR_UNIT_ALTIVEC_OR_VSX_P (<MODE>mode)")
 
 ;; Expansions that compare vectors producing a vector result and a predicate,
 ;; setting CR6 to indicate a combined status
diff --git a/gcc/config/rs6000/vsx.md b/gcc/config/rs6000/vsx.md
index bf033e31c1c..601eb81e316 100644
--- a/gcc/config/rs6000/vsx.md
+++ b/gcc/config/rs6000/vsx.md
@@ -2185,30 +2185,62 @@ (define_insn "*vsx_ge_<mode>_p"
   [(set_attr "type" "<VStype_simple>")])
 
 ;; Vector select
-(define_insn "*vsx_xxsel<mode>"
+(define_insn "vsx_xxsel<mode>"
   [(set (match_operand:VSX_L 0 "vsx_register_operand" "=<VSr>,?wa")
-	(if_then_else:VSX_L
-	 (ne:CC (match_operand:VSX_L 1 "vsx_register_operand" "<VSr>,wa")
-		(match_operand:VSX_L 4 "zero_constant" ""))
-	 (match_operand:VSX_L 2 "vsx_register_operand" "<VSr>,wa")
-	 (match_operand:VSX_L 3 "vsx_register_operand" "<VSr>,wa")))]
+	(ior:VSX_L
+	  (and:VSX_L
+	    (not:VSX_L (match_operand:VSX_L 3 "vsx_register_operand" "<VSr>,wa"))
+	    (match_operand:VSX_L 1 "vsx_register_operand" "<VSr>,wa"))
+	  (and:VSX_L
+	    (match_dup 3)
+	    (match_operand:VSX_L 2 "vsx_register_operand" "<VSr>,wa"))))]
   "VECTOR_MEM_VSX_P (<MODE>mode)"
-  "xxsel %x0,%x3,%x2,%x1"
+  "xxsel %x0,%x1,%x2,%x3"
   [(set_attr "type" "vecmove")
    (set_attr "isa" "<VSisa>")])
 
-(define_insn "*vsx_xxsel<mode>_uns"
+(define_insn "vsx_xxsel<mode>2"
   [(set (match_operand:VSX_L 0 "vsx_register_operand" "=<VSr>,?wa")
-	(if_then_else:VSX_L
-	 (ne:CCUNS (match_operand:VSX_L 1 "vsx_register_operand" "<VSr>,wa")
-		   (match_operand:VSX_L 4 "zero_constant" ""))
-	 (match_operand:VSX_L 2 "vsx_register_operand" "<VSr>,wa")
-	 (match_operand:VSX_L 3 "vsx_register_operand" "<VSr>,wa")))]
+	(ior:VSX_L
+	  (and:VSX_L
+	    (not:VSX_L (match_operand:VSX_L 3 "vsx_register_operand" "<VSr>,wa"))
+	    (match_operand:VSX_L 1 "vsx_register_operand" "<VSr>,wa"))
+	  (and:VSX_L
+	    (match_operand:VSX_L 2 "vsx_register_operand" "<VSr>,wa")
+	    (match_dup 3))))]
   "VECTOR_MEM_VSX_P (<MODE>mode)"
-  "xxsel %x0,%x3,%x2,%x1"
+  "xxsel %x0,%x1,%x2,%x3"
   [(set_attr "type" "vecmove")
    (set_attr "isa" "<VSisa>")])
 
+(define_insn "vsx_xxsel<mode>3"
+  [(set (match_operand:VSX_L 0 "vsx_register_operand" "=<VSr>,?wa")
+	(ior:VSX_L
+	  (and:VSX_L
+	    (match_operand:VSX_L 3 "vsx_register_operand" "<VSr>,wa")
+	    (match_operand:VSX_L 1 "vsx_register_operand" "<VSr>,wa"))
+	  (and:VSX_L
+	    (not:VSX_L (match_dup 3))
+	    (match_operand:VSX_L 2 "vsx_register_operand" "<VSr>,wa"))))]
+ "VECTOR_MEM_VSX_P (<MODE>mode)"
+  "xxsel %x0,%x2,%x1,%x3"
+ [(set_attr "type" "vecmove")
+ (set_attr "isa" "<VSisa>")])
+
+(define_insn "vsx_xxsel<mode>4"
+  [(set (match_operand:VSX_L 0 "vsx_register_operand" "=<VSr>,?wa")
+	(ior:VSX_L
+	  (and:VSX_L
+	    (match_operand:VSX_L 1 "vsx_register_operand" "<VSr>,wa")
+	    (match_operand:VSX_L 3 "vsx_register_operand" "<VSr>,wa"))
+	  (and:VSX_L
+	    (not:VSX_L (match_dup 3))
+	    (match_operand:VSX_L 2 "vsx_register_operand" "<VSr>,wa"))))]
+ "VECTOR_MEM_VSX_P (<MODE>mode)"
+ "xxsel %x0,%x2,%x1,%x3"
+ [(set_attr "type" "vecmove")
+ (set_attr "isa" "<VSisa>")])
+
 ;; Copy sign
 (define_insn "vsx_copysign<mode>3"
   [(set (match_operand:VSX_F 0 "vsx_register_operand" "=wa")
diff --git a/gcc/testsuite/gcc.target/powerpc/pr94613.c b/gcc/testsuite/gcc.target/powerpc/pr94613.c
new file mode 100644
index 00000000000..13cab13cb83
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/pr94613.c
@@ -0,0 +1,47 @@
+/* { dg-do run } */
+/* { dg-require-effective-target vmx_hw } */
+/* { dg-options "-O2 -maltivec" } */
+
+#include <altivec.h>
+
+/* The initial implementation of vec_sel used an IF_THEN_ELSE rtx.
+   This did NOT match what the vsel instruction does.  vsel is a
+   bit-wise operation.  Using IF_THEN_ELSE caused the + operation to
+   be simplified away in combine.  A plus operation affects other
+   bits in the same element.  Hence per-element simplifications are
+   wrong for vsel.  */
+vector unsigned char __attribute__((noinline))
+foo (vector unsigned char a, vector unsigned char b, vector unsigned char c)
+{
+  return vec_sel (a + b, c, a);
+}
+
+vector unsigned char __attribute__((noinline))
+foor (vector unsigned char a, vector unsigned char b, vector unsigned char c)
+{
+  return vec_sel (c, a + b, ~a);
+}
+
+vector unsigned char __attribute__((noinline))
+bar (vector unsigned char a, vector unsigned char b, vector unsigned char c)
+{
+  return vec_sel (a | b, c, a);
+}
+
+int
+main ()
+{
+  vector unsigned char v = (vector unsigned char){ 1 };
+
+  if (foo (v, v, v)[0] != 3)
+    __builtin_abort ();
+
+  if (bar (v, v, v)[0] != 1)
+    __builtin_abort ();
+
+  if (foor (v, v, v)[0] != 3)
+    __builtin_abort ();
+
+  return 0;
+}
+
-- 
2.25.1



* [PATCH v2 2/2] rs6000: Fold xxsel to vsel since they have same semantics
  2021-09-17  5:25 [PATCH v2 0/2] Fix vec_sel code generation and merge xxsel to vsel Xionghu Luo
  2021-09-17  5:25 ` [PATCH v2 1/2] rs6000: Fix wrong code generation for vec_sel [PR94613] Xionghu Luo
@ 2021-09-17  5:25 ` Xionghu Luo
  2021-10-08  1:17 ` Ping: [PATCH v2 0/2] Fix vec_sel code generation and merge xxsel to vsel Xionghu Luo
  2 siblings, 0 replies; 7+ messages in thread
From: Xionghu Luo @ 2021-09-17  5:25 UTC (permalink / raw)
  To: gcc-patches; +Cc: segher, dje.gcc, wschmidt, guojiufu, linkw, Xionghu Luo

Fold xxsel into vsel, as is done for xxperm/vperm, to avoid duplicated
code.

gcc/ChangeLog:

2021-09-17  Xionghu Luo  <luoxhu@linux.ibm.com>

	* config/rs6000/altivec.md: Add vsx register constraints.
	* config/rs6000/vsx.md (vsx_xxsel<mode>): Delete.
	(vsx_xxsel<mode>2): Likewise.
	(vsx_xxsel<mode>3): Likewise.
	(vsx_xxsel<mode>4): Likewise.
---
 gcc/config/rs6000/altivec.md                  | 60 +++++++++++--------
 gcc/config/rs6000/vsx.md                      | 57 ------------------
 gcc/testsuite/gcc.target/powerpc/builtins-1.c |  2 +-
 3 files changed, 37 insertions(+), 82 deletions(-)

diff --git a/gcc/config/rs6000/altivec.md b/gcc/config/rs6000/altivec.md
index a3424e1a458..4b4ca2c5d17 100644
--- a/gcc/config/rs6000/altivec.md
+++ b/gcc/config/rs6000/altivec.md
@@ -684,56 +684,68 @@ (define_insn "*altivec_gev4sf"
   [(set_attr "type" "veccmp")])
 
 (define_insn "altivec_vsel<mode>"
-  [(set (match_operand:VM 0 "altivec_register_operand" "=v")
+  [(set (match_operand:VM 0 "register_operand" "=wa,v")
 	(ior:VM
 	  (and:VM
-	    (not:VM (match_operand:VM 3 "altivec_register_operand" "v"))
-	    (match_operand:VM 1 "altivec_register_operand" "v"))
+	    (not:VM (match_operand:VM 3 "register_operand" "wa,v"))
+	    (match_operand:VM 1 "register_operand" "wa,v"))
 	  (and:VM
 	    (match_dup 3)
-	    (match_operand:VM 2 "altivec_register_operand" "v"))))]
+	    (match_operand:VM 2 "register_operand" "wa,v"))))]
   "VECTOR_MEM_ALTIVEC_OR_VSX_P (<MODE>mode)"
-  "vsel %0,%1,%2,%3"
-  [(set_attr "type" "vecmove")])
+  "@
+   xxsel %x0,%x1,%x2,%x3
+   vsel %0,%1,%2,%3"
+  [(set_attr "type" "vecmove")
+   (set_attr "isa" "<VSisa>")])
 
 (define_insn "altivec_vsel<mode>2"
-  [(set (match_operand:VM 0 "altivec_register_operand" "=v")
+  [(set (match_operand:VM 0 "register_operand" "=wa,v")
 	(ior:VM
 	  (and:VM
-	    (not:VM (match_operand:VM 3 "altivec_register_operand" "v"))
-	    (match_operand:VM 1 "altivec_register_operand" "v"))
+	    (not:VM (match_operand:VM 3 "register_operand" "wa,v"))
+	    (match_operand:VM 1 "register_operand" "wa,v"))
 	  (and:VM
-	    (match_operand:VM 2 "altivec_register_operand" "v")
+	    (match_operand:VM 2 "register_operand" "wa,v")
 	    (match_dup 3))))]
   "VECTOR_MEM_ALTIVEC_OR_VSX_P (<MODE>mode)"
-  "vsel %0,%1,%2,%3"
-  [(set_attr "type" "vecmove")])
+  "@
+   xxsel %x0,%x1,%x2,%x3
+   vsel %0,%1,%2,%3"
+  [(set_attr "type" "vecmove")
+   (set_attr "isa" "<VSisa>")])
 
 (define_insn "altivec_vsel<mode>3"
-  [(set (match_operand:VM 0 "altivec_register_operand" "=v")
+  [(set (match_operand:VM 0 "register_operand" "=wa,v")
 	(ior:VM
 	  (and:VM
-	    (match_operand:VM 3 "altivec_register_operand" "v")
-	    (match_operand:VM 1 "altivec_register_operand" "v"))
+	    (match_operand:VM 3 "register_operand" "wa,v")
+	    (match_operand:VM 1 "register_operand" "wa,v"))
 	  (and:VM
 	    (not:VM (match_dup 3))
-	    (match_operand:VM 2 "altivec_register_operand" "v"))))]
+	    (match_operand:VM 2 "register_operand" "wa,v"))))]
   "VECTOR_MEM_ALTIVEC_OR_VSX_P (<MODE>mode)"
-  "vsel %0,%2,%1,%3"
-  [(set_attr "type" "vecmove")])
+  "@
+   xxsel %x0,%x2,%x1,%x3
+   vsel %0,%2,%1,%3"
+  [(set_attr "type" "vecmove")
+   (set_attr "isa" "<VSisa>")])
 
 (define_insn "altivec_vsel<mode>4"
-  [(set (match_operand:VM 0 "altivec_register_operand" "=v")
+  [(set (match_operand:VM 0 "register_operand" "=wa,v")
 	(ior:VM
 	  (and:VM
-	    (match_operand:VM 1 "altivec_register_operand" "v")
-	    (match_operand:VM 3 "altivec_register_operand" "v"))
+	    (match_operand:VM 1 "register_operand" "wa,v")
+	    (match_operand:VM 3 "register_operand" "wa,v"))
 	  (and:VM
 	    (not:VM (match_dup 3))
-	    (match_operand:VM 2 "altivec_register_operand" "v"))))]
+	    (match_operand:VM 2 "register_operand" "wa,v"))))]
   "VECTOR_MEM_ALTIVEC_OR_VSX_P (<MODE>mode)"
-  "vsel %0,%2,%1,%3"
-  [(set_attr "type" "vecmove")])
+  "@
+   xxsel %x0,%x2,%x1,%x3
+   vsel %0,%2,%1,%3"
+  [(set_attr "type" "vecmove")
+   (set_attr "isa" "<VSisa>")])
 
 ;; Fused multiply add.
 
diff --git a/gcc/config/rs6000/vsx.md b/gcc/config/rs6000/vsx.md
index 601eb81e316..1d9a1eaaa54 100644
--- a/gcc/config/rs6000/vsx.md
+++ b/gcc/config/rs6000/vsx.md
@@ -2184,63 +2184,6 @@ (define_insn "*vsx_ge_<mode>_p"
   "xvcmpge<sd>p. %x0,%x1,%x2"
   [(set_attr "type" "<VStype_simple>")])
 
-;; Vector select
-(define_insn "vsx_xxsel<mode>"
-  [(set (match_operand:VSX_L 0 "vsx_register_operand" "=<VSr>,?wa")
-	(ior:VSX_L
-	  (and:VSX_L
-	    (not:VSX_L (match_operand:VSX_L 3 "vsx_register_operand" "<VSr>,wa"))
-	    (match_operand:VSX_L 1 "vsx_register_operand" "<VSr>,wa"))
-	  (and:VSX_L
-	    (match_dup 3)
-	    (match_operand:VSX_L 2 "vsx_register_operand" "<VSr>,wa"))))]
-  "VECTOR_MEM_VSX_P (<MODE>mode)"
-  "xxsel %x0,%x1,%x2,%x3"
-  [(set_attr "type" "vecmove")
-   (set_attr "isa" "<VSisa>")])
-
-(define_insn "vsx_xxsel<mode>2"
-  [(set (match_operand:VSX_L 0 "vsx_register_operand" "=<VSr>,?wa")
-	(ior:VSX_L
-	  (and:VSX_L
-	    (not:VSX_L (match_operand:VSX_L 3 "vsx_register_operand" "<VSr>,wa"))
-	    (match_operand:VSX_L 1 "vsx_register_operand" "<VSr>,wa"))
-	  (and:VSX_L
-	    (match_operand:VSX_L 2 "vsx_register_operand" "<VSr>,wa")
-	    (match_dup 3))))]
-  "VECTOR_MEM_VSX_P (<MODE>mode)"
-  "xxsel %x0,%x1,%x2,%x3"
-  [(set_attr "type" "vecmove")
-   (set_attr "isa" "<VSisa>")])
-
-(define_insn "vsx_xxsel<mode>3"
-  [(set (match_operand:VSX_L 0 "vsx_register_operand" "=<VSr>,?wa")
-	(ior:VSX_L
-	  (and:VSX_L
-	    (match_operand:VSX_L 3 "vsx_register_operand" "<VSr>,wa")
-	    (match_operand:VSX_L 1 "vsx_register_operand" "<VSr>,wa"))
-	  (and:VSX_L
-	    (not:VSX_L (match_dup 3))
-	    (match_operand:VSX_L 2 "vsx_register_operand" "<VSr>,wa"))))]
- "VECTOR_MEM_VSX_P (<MODE>mode)"
-  "xxsel %x0,%x2,%x1,%x3"
- [(set_attr "type" "vecmove")
- (set_attr "isa" "<VSisa>")])
-
-(define_insn "vsx_xxsel<mode>4"
-  [(set (match_operand:VSX_L 0 "vsx_register_operand" "=<VSr>,?wa")
-	(ior:VSX_L
-	  (and:VSX_L
-	    (match_operand:VSX_L 1 "vsx_register_operand" "<VSr>,wa")
-	    (match_operand:VSX_L 3 "vsx_register_operand" "<VSr>,wa"))
-	  (and:VSX_L
-	    (not:VSX_L (match_dup 3))
-	    (match_operand:VSX_L 2 "vsx_register_operand" "<VSr>,wa"))))]
- "VECTOR_MEM_VSX_P (<MODE>mode)"
- "xxsel %x0,%x2,%x1,%x3"
- [(set_attr "type" "vecmove")
- (set_attr "isa" "<VSisa>")])
-
 ;; Copy sign
 (define_insn "vsx_copysign<mode>3"
   [(set (match_operand:VSX_F 0 "vsx_register_operand" "=wa")
diff --git a/gcc/testsuite/gcc.target/powerpc/builtins-1.c b/gcc/testsuite/gcc.target/powerpc/builtins-1.c
index 83aed5a5141..3ec1024a955 100644
--- a/gcc/testsuite/gcc.target/powerpc/builtins-1.c
+++ b/gcc/testsuite/gcc.target/powerpc/builtins-1.c
@@ -326,7 +326,7 @@ int main ()
 /* { dg-final { scan-assembler-times {\mvpkudus\M} 1 } } */
 /* { dg-final { scan-assembler-times "vperm" 4 } } */
 /* { dg-final { scan-assembler-times "xvrdpi" 2 } } */
-/* { dg-final { scan-assembler-times "xxsel" 10 } } */
+/* { dg-final { scan-assembler-times "xxsel" 5 } } */
 /* { dg-final { scan-assembler-times "xxlxor" 6 } } */
 /* { dg-final { scan-assembler-times "divd" 8  { target lp64 } } } */
 /* { dg-final { scan-assembler-times "divdu" 2  { target lp64 } } } */
-- 
2.25.1



* Ping: [PATCH v2 0/2] Fix vec_sel code generation and merge xxsel to vsel
  2021-09-17  5:25 [PATCH v2 0/2] Fix vec_sel code generation and merge xxsel to vsel Xionghu Luo
  2021-09-17  5:25 ` [PATCH v2 1/2] rs6000: Fix wrong code generation for vec_sel [PR94613] Xionghu Luo
  2021-09-17  5:25 ` [PATCH v2 2/2] rs6000: Fold xxsel to vsel since they have same semantics Xionghu Luo
@ 2021-10-08  1:17 ` Xionghu Luo
  2021-10-15  6:28   ` Ping^2: " Xionghu Luo
  2 siblings, 1 reply; 7+ messages in thread
From: Xionghu Luo @ 2021-10-08  1:17 UTC (permalink / raw)
  To: gcc-patches; +Cc: segher, dje.gcc, wschmidt, guojiufu, linkw

Ping, thanks.



-- 
Thanks,
Xionghu


* Ping^2: [PATCH v2 0/2] Fix vec_sel code generation and merge xxsel to vsel
  2021-10-08  1:17 ` Ping: [PATCH v2 0/2] Fix vec_sel code generation and merge xxsel to vsel Xionghu Luo
@ 2021-10-15  6:28   ` Xionghu Luo
  2021-10-22  3:25     ` Ping^3: " Xionghu Luo
  0 siblings, 1 reply; 7+ messages in thread
From: Xionghu Luo @ 2021-10-15  6:28 UTC (permalink / raw)
  To: gcc-patches; +Cc: wschmidt, dje.gcc, segher, linkw

Ping^2, thanks.

https://gcc.gnu.org/pipermail/gcc-patches/2021-September/579637.html



-- 
Thanks,
Xionghu


* Re: Ping^3: [PATCH v2 0/2] Fix vec_sel code generation and merge xxsel to vsel
  2021-10-15  6:28   ` Ping^2: " Xionghu Luo
@ 2021-10-22  3:25     ` Xionghu Luo
  2021-10-27 13:17       ` David Edelsohn
  0 siblings, 1 reply; 7+ messages in thread
From: Xionghu Luo @ 2021-10-22  3:25 UTC (permalink / raw)
  To: gcc-patches; +Cc: wschmidt, segher, dje.gcc, linkw

Ping^3, thanks.

https://gcc.gnu.org/pipermail/gcc-patches/2021-September/579637.html



-- 
Thanks,
Xionghu


* Re: Ping^3: [PATCH v2 0/2] Fix vec_sel code generation and merge xxsel to vsel
  2021-10-22  3:25     ` Ping^3: " Xionghu Luo
@ 2021-10-27 13:17       ` David Edelsohn
  0 siblings, 0 replies; 7+ messages in thread
From: David Edelsohn @ 2021-10-27 13:17 UTC (permalink / raw)
  To: Xionghu Luo; +Cc: GCC Patches, Bill Schmidt, Segher Boessenkool, linkw

This patch series is okay.

Thanks, David

