public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed
* [PATCH-1, rs6000] Add a new type of CC mode - CCBCD for bcd insns [PR100736, PR114732]
@ 2024-04-30  7:18 HAO CHEN GUI
  2024-05-13  1:56 ` Ping " HAO CHEN GUI
  2024-05-29  5:26 ` Kewen.Lin
  0 siblings, 2 replies; 6+ messages in thread
From: HAO CHEN GUI @ 2024-04-30  7:18 UTC (permalink / raw)
  To: gcc-patches; +Cc: Segher Boessenkool, David, Kewen.Lin, Peter Bergner

Hi,
  It's the first patch of a series of patches optimizing CC modes on
rs6000.

  bcd insns set all four bits of a CR field. But it has different single
bit reverse behavior than CCFP's. The forth bit of bcd cr fields is used
to indict overflow or invalid number. It's not a bit for unordered test.
So the "le" test should be reversed to "gt" not "ungt". The "ge" test
should be reversed to "lt" not "unlt". That's the root cause of PR100736
and PR114732.

  This patch fixes the issue by adding a new type of CC mode - CCBCD for
all bcd insns. Here a new setcc_rev pattern is added for ccbcd. It will
be merged to a uniform pattern which is for all CC modes in sequential
patch.

  The rtl code "unordered" is still used for testing overflow or
invalid number. IMHO, the "unordered" on a CC mode can be considered as
testing the forth bit of a CR field setting or not. The "eq" on a CC mode
can be considered as testing the third bit setting or not. Thus we avoid
creating lots of unspecs for the CR bit testing.

  Bootstrapped and tested on powerpc64-linux BE and LE with no
regressions. Is it OK for the trunk?

Thanks
Gui Haochen


ChangeLog
rs6000: Add a new type of CC mode - CCBCD for bcd insns

gcc/
	PR target/100736
	PR target/114732
	* config/rs6000/altivec.md (bcd<bcd_add_sub>_<mode>): Replace CCFP
	with CCBCD.
	(*bcd<bcd_add_sub>_test_<mode>): Likewise.
	(*bcd<bcd_add_sub>_test2_<mode>): Likewise.
	(bcd<bcd_add_sub>_<code>_<mode>): Likewise.
	(*bcdinvalid_<mode>): Likewise.
	(bcdinvalid_<mode>): Likewise.
	(bcdshift_v16qi): Likewise.
	(bcdmul10_v16qi): Likewise.
	(bcddiv10_v16qi): Likewise.
	(peephole for bcd_add/sub): Likewise.
	* config/rs6000/predicates.md (branch_comparison_operator): Add CCBCD
	and its supported comparison codes.
	* config/rs6000/rs6000-modes.def (CC_MODE): Add CCBCD.
	* config/rs6000/rs6000.cc (validate_condition_mode): Add CCBCD
	assertion.
	* config/rs6000/rs6000.md (CC_any): Add CCBCD.
	(ccbcd_rev): New code iterator.
	(*<code><mode>_cc): New insn and split pattern for CCBCD reverse
	compare.

gcc/testsuite/
	PR target/100736
	PR target/114732
	* gcc.target/powerpc/pr100736.c: New.
	* gcc.target/powerpc/pr114732.c: New.

patch.diff
diff --git a/gcc/config/rs6000/altivec.md b/gcc/config/rs6000/altivec.md
index bb20441c096..9fa8cf89f61 100644
--- a/gcc/config/rs6000/altivec.md
+++ b/gcc/config/rs6000/altivec.md
@@ -4443,7 +4443,7 @@ (define_insn "bcd<bcd_add_sub>_<mode>"
 		      (match_operand:VBCD 2 "register_operand" "v")
 		      (match_operand:QI 3 "const_0_to_1_operand" "n")]
 		     UNSPEC_BCD_ADD_SUB))
-   (clobber (reg:CCFP CR6_REGNO))]
+   (clobber (reg:CCBCD CR6_REGNO))]
   "TARGET_P8_VECTOR"
   "bcd<bcd_add_sub>. %0,%1,%2,%3"
   [(set_attr "type" "vecsimple")])
@@ -4454,8 +4454,8 @@ (define_insn "bcd<bcd_add_sub>_<mode>"
 ;; probably should be one that can go in the VMX (Altivec) registers, so we
 ;; can't use DDmode or DFmode.
 (define_insn "*bcd<bcd_add_sub>_test_<mode>"
-  [(set (reg:CCFP CR6_REGNO)
-	(compare:CCFP
+  [(set (reg:CCBCD CR6_REGNO)
+	(compare:CCBCD
 	 (unspec:V2DF [(match_operand:VBCD 1 "register_operand" "v")
 		       (match_operand:VBCD 2 "register_operand" "v")
 		       (match_operand:QI 3 "const_0_to_1_operand" "i")]
@@ -4472,8 +4472,8 @@ (define_insn "*bcd<bcd_add_sub>_test2_<mode>"
 		      (match_operand:VBCD 2 "register_operand" "v")
 		      (match_operand:QI 3 "const_0_to_1_operand" "i")]
 		     UNSPEC_BCD_ADD_SUB))
-   (set (reg:CCFP CR6_REGNO)
-	(compare:CCFP
+   (set (reg:CCBCD CR6_REGNO)
+	(compare:CCBCD
 	 (unspec:V2DF [(match_dup 1)
 		       (match_dup 2)
 		       (match_dup 3)]
@@ -4566,8 +4566,8 @@ (define_insn "vclrrb"
    [(set_attr "type" "vecsimple")])

 (define_expand "bcd<bcd_add_sub>_<code>_<mode>"
-  [(parallel [(set (reg:CCFP CR6_REGNO)
-		   (compare:CCFP
+  [(parallel [(set (reg:CCBCD CR6_REGNO)
+		   (compare:CCBCD
 		    (unspec:V2DF [(match_operand:VBCD 1 "register_operand")
 				  (match_operand:VBCD 2 "register_operand")
 				  (match_operand:QI 3 "const_0_to_1_operand")]
@@ -4575,7 +4575,7 @@ (define_expand "bcd<bcd_add_sub>_<code>_<mode>"
 		    (match_dup 4)))
 	      (clobber (match_scratch:VBCD 5))])
    (set (match_operand:SI 0 "register_operand")
-	(BCD_TEST:SI (reg:CCFP CR6_REGNO)
+	(BCD_TEST:SI (reg:CCBCD CR6_REGNO)
 		     (const_int 0)))]
   "TARGET_P8_VECTOR"
 {
@@ -4583,8 +4583,8 @@ (define_expand "bcd<bcd_add_sub>_<code>_<mode>"
 })

 (define_insn "*bcdinvalid_<mode>"
-  [(set (reg:CCFP CR6_REGNO)
-	(compare:CCFP
+  [(set (reg:CCBCD CR6_REGNO)
+	(compare:CCBCD
 	 (unspec:V2DF [(match_operand:VBCD 1 "register_operand" "v")]
 		      UNSPEC_BCDSUB)
 	 (match_operand:V2DF 2 "zero_constant" "j")))
@@ -4594,14 +4594,14 @@ (define_insn "*bcdinvalid_<mode>"
   [(set_attr "type" "vecsimple")])

 (define_expand "bcdinvalid_<mode>"
-  [(parallel [(set (reg:CCFP CR6_REGNO)
-		   (compare:CCFP
+  [(parallel [(set (reg:CCBCD CR6_REGNO)
+		   (compare:CCBCD
 		    (unspec:V2DF [(match_operand:VBCD 1 "register_operand")]
 				 UNSPEC_BCDSUB)
 		    (match_dup 2)))
 	      (clobber (match_scratch:VBCD 3))])
    (set (match_operand:SI 0 "register_operand")
-	(unordered:SI (reg:CCFP CR6_REGNO)
+	(unordered:SI (reg:CCBCD CR6_REGNO)
 		      (const_int 0)))]
   "TARGET_P8_VECTOR"
 {
@@ -4614,7 +4614,7 @@ (define_insn "bcdshift_v16qi"
 		       (match_operand:V16QI 2 "register_operand" "v")
 		       (match_operand:QI 3 "const_0_to_1_operand" "n")]
 		     UNSPEC_BCDSHIFT))
-   (clobber (reg:CCFP CR6_REGNO))]
+   (clobber (reg:CCBCD CR6_REGNO))]
   "TARGET_P8_VECTOR"
   "bcds. %0,%1,%2,%3"
   [(set_attr "type" "vecsimple")])
@@ -4623,7 +4623,7 @@ (define_expand "bcdmul10_v16qi"
   [(set (match_operand:V16QI 0 "register_operand")
 	(unspec:V16QI [(match_operand:V16QI 1 "register_operand")]
 		      UNSPEC_BCDSHIFT))
-   (clobber (reg:CCFP CR6_REGNO))]
+   (clobber (reg:CCBCD CR6_REGNO))]
   "TARGET_P9_VECTOR"
 {
   rtx one = gen_reg_rtx (V16QImode);
@@ -4638,7 +4638,7 @@ (define_expand "bcddiv10_v16qi"
   [(set (match_operand:V16QI 0 "register_operand")
 	(unspec:V16QI [(match_operand:V16QI 1 "register_operand")]
 		      UNSPEC_BCDSHIFT))
-   (clobber (reg:CCFP CR6_REGNO))]
+   (clobber (reg:CCBCD CR6_REGNO))]
   "TARGET_P9_VECTOR"
 {
   rtx one = gen_reg_rtx (V16QImode);
@@ -4662,9 +4662,9 @@ (define_peephole2
 				 (match_operand:V1TI 2 "register_operand")
 				 (match_operand:QI 3 "const_0_to_1_operand")]
 				UNSPEC_BCD_ADD_SUB))
-	      (clobber (reg:CCFP CR6_REGNO))])
-   (parallel [(set (reg:CCFP CR6_REGNO)
-		   (compare:CCFP
+	      (clobber (reg:CCBCD CR6_REGNO))])
+   (parallel [(set (reg:CCBCD CR6_REGNO)
+		   (compare:CCBCD
 		    (unspec:V2DF [(match_dup 1)
 				  (match_dup 2)
 				  (match_dup 3)]
@@ -4677,8 +4677,8 @@ (define_peephole2
 				 (match_dup 2)
 				 (match_dup 3)]
 				UNSPEC_BCD_ADD_SUB))
-	      (set (reg:CCFP CR6_REGNO)
-		   (compare:CCFP
+	      (set (reg:CCBCD CR6_REGNO)
+		   (compare:CCBCD
 		    (unspec:V2DF [(match_dup 1)
 				  (match_dup 2)
 				  (match_dup 3)]
diff --git a/gcc/config/rs6000/predicates.md b/gcc/config/rs6000/predicates.md
index d23ce9a77a3..18198add744 100644
--- a/gcc/config/rs6000/predicates.md
+++ b/gcc/config/rs6000/predicates.md
@@ -1350,7 +1350,9 @@ (define_predicate "branch_comparison_operator"
 	  (if_then_else (match_test "flag_finite_math_only")
 	    (match_code "lt,le,gt,ge,eq,ne,unordered,ordered")
 	    (match_code "lt,gt,eq,unordered,unge,unle,ne,ordered"))
-	  (match_code "lt,ltu,le,leu,gt,gtu,ge,geu,eq,ne"))
+	  (if_then_else (match_test "GET_MODE (XEXP (op, 0)) == CCBCDmode")
+	    (match_code "lt,le,gt,ge,eq,ne,unordered,ordered")
+	    (match_code "lt,ltu,le,leu,gt,gtu,ge,geu,eq,ne")))
 	(match_test "validate_condition_mode (GET_CODE (op),
 					      GET_MODE (XEXP (op, 0))),
 		     1")))
diff --git a/gcc/config/rs6000/rs6000-modes.def b/gcc/config/rs6000/rs6000-modes.def
index 094b246c834..3e2e6dfb4ff 100644
--- a/gcc/config/rs6000/rs6000-modes.def
+++ b/gcc/config/rs6000/rs6000-modes.def
@@ -61,6 +61,7 @@ FRACTIONAL_FLOAT_MODE (TF, FLOAT_PRECISION_TFmode, 16, ieee_quad_format);

 CC_MODE (CCUNS);
 CC_MODE (CCFP);
+CC_MODE (CCBCD);	/* Used for bcd insns */
 CC_MODE (CCEQ);

 /* Vector modes.  */
diff --git a/gcc/config/rs6000/rs6000.cc b/gcc/config/rs6000/rs6000.cc
index 6ba9df4f02e..4068cd8b929 100644
--- a/gcc/config/rs6000/rs6000.cc
+++ b/gcc/config/rs6000/rs6000.cc
@@ -11597,9 +11597,11 @@ validate_condition_mode (enum rtx_code code, machine_mode mode)
   gcc_assert ((code != GTU && code != LTU && code != GEU && code != LEU)
 	      || mode == CCUNSmode);

+  gcc_assert (mode == CCFPmode || mode == CCBCDmode
+	      || (code != ORDERED && code != UNORDERED));
+
   gcc_assert (mode == CCFPmode
-	      || (code != ORDERED && code != UNORDERED
-		  && code != UNEQ && code != LTGT
+	      || (code != UNEQ && code != LTGT
 		  && code != UNGT && code != UNLT
 		  && code != UNGE && code != UNLE));

diff --git a/gcc/config/rs6000/rs6000.md b/gcc/config/rs6000/rs6000.md
index bc8bc6ab060..9b5fcdc8db0 100644
--- a/gcc/config/rs6000/rs6000.md
+++ b/gcc/config/rs6000/rs6000.md
@@ -8115,7 +8115,7 @@ (define_expand "movcc"
   ""
   "")

-(define_mode_iterator CC_any [CC CCUNS CCEQ CCFP])
+(define_mode_iterator CC_any [CC CCUNS CCEQ CCFP CCBCD])

 (define_insn "*movcc_<mode>"
   [(set (match_operand:CC_any 0 "nonimmediate_operand"
@@ -13245,6 +13245,7 @@ (define_insn_and_split "*nesi3_ext<mode>"

 (define_code_iterator fp_rev [ordered ne unle unge])
 (define_code_iterator fp_two [ltgt le ge unlt ungt uneq])
+(define_code_iterator ccbcd_rev [ordered ne le ge])

 (define_insn_and_split "*<code><mode>_cc"
   [(set (match_operand:GPR 0 "gpc_reg_operand" "=r")
@@ -13264,6 +13265,24 @@ (define_insn_and_split "*<code><mode>_cc"
 }
   [(set_attr "length" "12")])

+(define_insn_and_split "*<code><mode>_cc"
+  [(set (match_operand:GPR 0 "gpc_reg_operand" "=r")
+	(ccbcd_rev:GPR (match_operand:CCBCD 1 "cc_reg_operand" "y")
+		    (const_int 0)))]
+  ""
+  "#"
+  "&& 1"
+  [(pc)]
+{
+  rtx_code revcode = reverse_condition (<CODE>);
+  rtx eq = gen_rtx_fmt_ee (revcode, <MODE>mode, operands[1], const0_rtx);
+  rtx tmp = gen_reg_rtx (<MODE>mode);
+  emit_move_insn (tmp, eq);
+  emit_insn (gen_xor<mode>3 (operands[0], tmp, const1_rtx));
+  DONE;
+}
+  [(set_attr "length" "12")])
+
 (define_insn_and_split "*<code><mode>_cc"
   [(set (match_operand:GPR 0 "gpc_reg_operand" "=r")
 	(fp_two:GPR (match_operand:CCFP 1 "cc_reg_operand" "y")
diff --git a/gcc/testsuite/gcc.target/powerpc/pr100736.c b/gcc/testsuite/gcc.target/powerpc/pr100736.c
new file mode 100644
index 00000000000..85e3ae56d55
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/pr100736.c
@@ -0,0 +1,12 @@
+/* { dg-do compile } */
+/* { dg-options "-mdejagnu-cpu=power8 -O2 -ffinite-math-only" } */
+/* { dg-require-effective-target powerpc_vsx_ok } */
+
+/* Verify there is no ICE with finite-math-only */
+
+int foo (vector unsigned char a, vector unsigned char b)
+{
+  return __builtin_vec_bcdsub_ge (a, b, 0);
+}
+
+
diff --git a/gcc/testsuite/gcc.target/powerpc/pr114732.c b/gcc/testsuite/gcc.target/powerpc/pr114732.c
new file mode 100644
index 00000000000..d0b878780c6
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/pr114732.c
@@ -0,0 +1,12 @@
+/* { dg-do compile } */
+/* { dg-options "-mdejagnu-cpu=power8 -O2" } */
+/* { dg-require-effective-target powerpc_vsx_ok } */
+
+/* Verify only one cr bit need to be tested. */
+
+int foo (vector unsigned char a, vector unsigned char b)
+{
+  return __builtin_vec_bcdsub_ge (a, b, 0) != 1;
+}
+
+/* { dg-final { scan-assembler-not "cror" } } */

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Ping [PATCH-1, rs6000] Add a new type of CC mode - CCBCD for bcd insns [PR100736, PR114732]
  2024-04-30  7:18 [PATCH-1, rs6000] Add a new type of CC mode - CCBCD for bcd insns [PR100736, PR114732] HAO CHEN GUI
@ 2024-05-13  1:56 ` HAO CHEN GUI
  2024-05-27  1:55   ` Ping^2 " HAO CHEN GUI
  2024-05-29  5:26 ` Kewen.Lin
  1 sibling, 1 reply; 6+ messages in thread
From: HAO CHEN GUI @ 2024-05-13  1:56 UTC (permalink / raw)
  To: gcc-patches; +Cc: Segher Boessenkool, David, Kewen.Lin, Peter Bergner

Hi,
  Gently ping the series of patches.
[PATCH-1, rs6000]Add a new type of CC mode - CCBCD for bcd insns [PR100736, PR114732]
https://gcc.gnu.org/pipermail/gcc-patches/2024-April/650217.html
[PATCH-2, rs6000] Add a new type of CC mode - CCLTEQ
https://gcc.gnu.org/pipermail/gcc-patches/2024-April/650218.html
[PATCH-3, rs6000] Set CC mode of vector string isolate insns to CCEQ
https://gcc.gnu.org/pipermail/gcc-patches/2024-April/650219.html
[PATCH-4, rs6000] Optimize single cc bit reverse implementation
https://gcc.gnu.org/pipermail/gcc-patches/2024-April/650220.html
[PATCH-5, rs6000] Replace explicit CC bit reverse with common format
https://gcc.gnu.org/pipermail/gcc-patches/2024-May/650766.html
[PATCH-6, rs6000] Split setcc to two insns after reload
https://gcc.gnu.org/pipermail/gcc-patches/2024-May/650856.html

Thanks
Gui Haochen

在 2024/4/30 15:18, HAO CHEN GUI 写道:
> Hi,
>   It's the first patch of a series of patches optimizing CC modes on
> rs6000.
> 
>   bcd insns set all four bits of a CR field. But it has different single
> bit reverse behavior than CCFP's. The forth bit of bcd cr fields is used
> to indict overflow or invalid number. It's not a bit for unordered test.
> So the "le" test should be reversed to "gt" not "ungt". The "ge" test
> should be reversed to "lt" not "unlt". That's the root cause of PR100736
> and PR114732.
> 
>   This patch fixes the issue by adding a new type of CC mode - CCBCD for
> all bcd insns. Here a new setcc_rev pattern is added for ccbcd. It will
> be merged to a uniform pattern which is for all CC modes in sequential
> patch.
> 
>   The rtl code "unordered" is still used for testing overflow or
> invalid number. IMHO, the "unordered" on a CC mode can be considered as
> testing the forth bit of a CR field setting or not. The "eq" on a CC mode
> can be considered as testing the third bit setting or not. Thus we avoid
> creating lots of unspecs for the CR bit testing.
> 
>   Bootstrapped and tested on powerpc64-linux BE and LE with no
> regressions. Is it OK for the trunk?
> 
> Thanks
> Gui Haochen
> 
> 
> ChangeLog
> rs6000: Add a new type of CC mode - CCBCD for bcd insns
> 
> gcc/
> 	PR target/100736
> 	PR target/114732
> 	* config/rs6000/altivec.md (bcd<bcd_add_sub>_<mode>): Replace CCFP
> 	with CCBCD.
> 	(*bcd<bcd_add_sub>_test_<mode>): Likewise.
> 	(*bcd<bcd_add_sub>_test2_<mode>): Likewise.
> 	(bcd<bcd_add_sub>_<code>_<mode>): Likewise.
> 	(*bcdinvalid_<mode>): Likewise.
> 	(bcdinvalid_<mode>): Likewise.
> 	(bcdshift_v16qi): Likewise.
> 	(bcdmul10_v16qi): Likewise.
> 	(bcddiv10_v16qi): Likewise.
> 	(peephole for bcd_add/sub): Likewise.
> 	* config/rs6000/predicates.md (branch_comparison_operator): Add CCBCD
> 	and its supported comparison codes.
> 	* config/rs6000/rs6000-modes.def (CC_MODE): Add CCBCD.
> 	* config/rs6000/rs6000.cc (validate_condition_mode): Add CCBCD
> 	assertion.
> 	* config/rs6000/rs6000.md (CC_any): Add CCBCD.
> 	(ccbcd_rev): New code iterator.
> 	(*<code><mode>_cc): New insn and split pattern for CCBCD reverse
> 	compare.
> 
> gcc/testsuite/
> 	PR target/100736
> 	PR target/114732
> 	* gcc.target/powerpc/pr100736.c: New.
> 	* gcc.target/powerpc/pr114732.c: New.
> 
> patch.diff
> diff --git a/gcc/config/rs6000/altivec.md b/gcc/config/rs6000/altivec.md
> index bb20441c096..9fa8cf89f61 100644
> --- a/gcc/config/rs6000/altivec.md
> +++ b/gcc/config/rs6000/altivec.md
> @@ -4443,7 +4443,7 @@ (define_insn "bcd<bcd_add_sub>_<mode>"
>  		      (match_operand:VBCD 2 "register_operand" "v")
>  		      (match_operand:QI 3 "const_0_to_1_operand" "n")]
>  		     UNSPEC_BCD_ADD_SUB))
> -   (clobber (reg:CCFP CR6_REGNO))]
> +   (clobber (reg:CCBCD CR6_REGNO))]
>    "TARGET_P8_VECTOR"
>    "bcd<bcd_add_sub>. %0,%1,%2,%3"
>    [(set_attr "type" "vecsimple")])
> @@ -4454,8 +4454,8 @@ (define_insn "bcd<bcd_add_sub>_<mode>"
>  ;; probably should be one that can go in the VMX (Altivec) registers, so we
>  ;; can't use DDmode or DFmode.
>  (define_insn "*bcd<bcd_add_sub>_test_<mode>"
> -  [(set (reg:CCFP CR6_REGNO)
> -	(compare:CCFP
> +  [(set (reg:CCBCD CR6_REGNO)
> +	(compare:CCBCD
>  	 (unspec:V2DF [(match_operand:VBCD 1 "register_operand" "v")
>  		       (match_operand:VBCD 2 "register_operand" "v")
>  		       (match_operand:QI 3 "const_0_to_1_operand" "i")]
> @@ -4472,8 +4472,8 @@ (define_insn "*bcd<bcd_add_sub>_test2_<mode>"
>  		      (match_operand:VBCD 2 "register_operand" "v")
>  		      (match_operand:QI 3 "const_0_to_1_operand" "i")]
>  		     UNSPEC_BCD_ADD_SUB))
> -   (set (reg:CCFP CR6_REGNO)
> -	(compare:CCFP
> +   (set (reg:CCBCD CR6_REGNO)
> +	(compare:CCBCD
>  	 (unspec:V2DF [(match_dup 1)
>  		       (match_dup 2)
>  		       (match_dup 3)]
> @@ -4566,8 +4566,8 @@ (define_insn "vclrrb"
>     [(set_attr "type" "vecsimple")])
> 
>  (define_expand "bcd<bcd_add_sub>_<code>_<mode>"
> -  [(parallel [(set (reg:CCFP CR6_REGNO)
> -		   (compare:CCFP
> +  [(parallel [(set (reg:CCBCD CR6_REGNO)
> +		   (compare:CCBCD
>  		    (unspec:V2DF [(match_operand:VBCD 1 "register_operand")
>  				  (match_operand:VBCD 2 "register_operand")
>  				  (match_operand:QI 3 "const_0_to_1_operand")]
> @@ -4575,7 +4575,7 @@ (define_expand "bcd<bcd_add_sub>_<code>_<mode>"
>  		    (match_dup 4)))
>  	      (clobber (match_scratch:VBCD 5))])
>     (set (match_operand:SI 0 "register_operand")
> -	(BCD_TEST:SI (reg:CCFP CR6_REGNO)
> +	(BCD_TEST:SI (reg:CCBCD CR6_REGNO)
>  		     (const_int 0)))]
>    "TARGET_P8_VECTOR"
>  {
> @@ -4583,8 +4583,8 @@ (define_expand "bcd<bcd_add_sub>_<code>_<mode>"
>  })
> 
>  (define_insn "*bcdinvalid_<mode>"
> -  [(set (reg:CCFP CR6_REGNO)
> -	(compare:CCFP
> +  [(set (reg:CCBCD CR6_REGNO)
> +	(compare:CCBCD
>  	 (unspec:V2DF [(match_operand:VBCD 1 "register_operand" "v")]
>  		      UNSPEC_BCDSUB)
>  	 (match_operand:V2DF 2 "zero_constant" "j")))
> @@ -4594,14 +4594,14 @@ (define_insn "*bcdinvalid_<mode>"
>    [(set_attr "type" "vecsimple")])
> 
>  (define_expand "bcdinvalid_<mode>"
> -  [(parallel [(set (reg:CCFP CR6_REGNO)
> -		   (compare:CCFP
> +  [(parallel [(set (reg:CCBCD CR6_REGNO)
> +		   (compare:CCBCD
>  		    (unspec:V2DF [(match_operand:VBCD 1 "register_operand")]
>  				 UNSPEC_BCDSUB)
>  		    (match_dup 2)))
>  	      (clobber (match_scratch:VBCD 3))])
>     (set (match_operand:SI 0 "register_operand")
> -	(unordered:SI (reg:CCFP CR6_REGNO)
> +	(unordered:SI (reg:CCBCD CR6_REGNO)
>  		      (const_int 0)))]
>    "TARGET_P8_VECTOR"
>  {
> @@ -4614,7 +4614,7 @@ (define_insn "bcdshift_v16qi"
>  		       (match_operand:V16QI 2 "register_operand" "v")
>  		       (match_operand:QI 3 "const_0_to_1_operand" "n")]
>  		     UNSPEC_BCDSHIFT))
> -   (clobber (reg:CCFP CR6_REGNO))]
> +   (clobber (reg:CCBCD CR6_REGNO))]
>    "TARGET_P8_VECTOR"
>    "bcds. %0,%1,%2,%3"
>    [(set_attr "type" "vecsimple")])
> @@ -4623,7 +4623,7 @@ (define_expand "bcdmul10_v16qi"
>    [(set (match_operand:V16QI 0 "register_operand")
>  	(unspec:V16QI [(match_operand:V16QI 1 "register_operand")]
>  		      UNSPEC_BCDSHIFT))
> -   (clobber (reg:CCFP CR6_REGNO))]
> +   (clobber (reg:CCBCD CR6_REGNO))]
>    "TARGET_P9_VECTOR"
>  {
>    rtx one = gen_reg_rtx (V16QImode);
> @@ -4638,7 +4638,7 @@ (define_expand "bcddiv10_v16qi"
>    [(set (match_operand:V16QI 0 "register_operand")
>  	(unspec:V16QI [(match_operand:V16QI 1 "register_operand")]
>  		      UNSPEC_BCDSHIFT))
> -   (clobber (reg:CCFP CR6_REGNO))]
> +   (clobber (reg:CCBCD CR6_REGNO))]
>    "TARGET_P9_VECTOR"
>  {
>    rtx one = gen_reg_rtx (V16QImode);
> @@ -4662,9 +4662,9 @@ (define_peephole2
>  				 (match_operand:V1TI 2 "register_operand")
>  				 (match_operand:QI 3 "const_0_to_1_operand")]
>  				UNSPEC_BCD_ADD_SUB))
> -	      (clobber (reg:CCFP CR6_REGNO))])
> -   (parallel [(set (reg:CCFP CR6_REGNO)
> -		   (compare:CCFP
> +	      (clobber (reg:CCBCD CR6_REGNO))])
> +   (parallel [(set (reg:CCBCD CR6_REGNO)
> +		   (compare:CCBCD
>  		    (unspec:V2DF [(match_dup 1)
>  				  (match_dup 2)
>  				  (match_dup 3)]
> @@ -4677,8 +4677,8 @@ (define_peephole2
>  				 (match_dup 2)
>  				 (match_dup 3)]
>  				UNSPEC_BCD_ADD_SUB))
> -	      (set (reg:CCFP CR6_REGNO)
> -		   (compare:CCFP
> +	      (set (reg:CCBCD CR6_REGNO)
> +		   (compare:CCBCD
>  		    (unspec:V2DF [(match_dup 1)
>  				  (match_dup 2)
>  				  (match_dup 3)]
> diff --git a/gcc/config/rs6000/predicates.md b/gcc/config/rs6000/predicates.md
> index d23ce9a77a3..18198add744 100644
> --- a/gcc/config/rs6000/predicates.md
> +++ b/gcc/config/rs6000/predicates.md
> @@ -1350,7 +1350,9 @@ (define_predicate "branch_comparison_operator"
>  	  (if_then_else (match_test "flag_finite_math_only")
>  	    (match_code "lt,le,gt,ge,eq,ne,unordered,ordered")
>  	    (match_code "lt,gt,eq,unordered,unge,unle,ne,ordered"))
> -	  (match_code "lt,ltu,le,leu,gt,gtu,ge,geu,eq,ne"))
> +	  (if_then_else (match_test "GET_MODE (XEXP (op, 0)) == CCBCDmode")
> +	    (match_code "lt,le,gt,ge,eq,ne,unordered,ordered")
> +	    (match_code "lt,ltu,le,leu,gt,gtu,ge,geu,eq,ne")))
>  	(match_test "validate_condition_mode (GET_CODE (op),
>  					      GET_MODE (XEXP (op, 0))),
>  		     1")))
> diff --git a/gcc/config/rs6000/rs6000-modes.def b/gcc/config/rs6000/rs6000-modes.def
> index 094b246c834..3e2e6dfb4ff 100644
> --- a/gcc/config/rs6000/rs6000-modes.def
> +++ b/gcc/config/rs6000/rs6000-modes.def
> @@ -61,6 +61,7 @@ FRACTIONAL_FLOAT_MODE (TF, FLOAT_PRECISION_TFmode, 16, ieee_quad_format);
> 
>  CC_MODE (CCUNS);
>  CC_MODE (CCFP);
> +CC_MODE (CCBCD);	/* Used for bcd insns */
>  CC_MODE (CCEQ);
> 
>  /* Vector modes.  */
> diff --git a/gcc/config/rs6000/rs6000.cc b/gcc/config/rs6000/rs6000.cc
> index 6ba9df4f02e..4068cd8b929 100644
> --- a/gcc/config/rs6000/rs6000.cc
> +++ b/gcc/config/rs6000/rs6000.cc
> @@ -11597,9 +11597,11 @@ validate_condition_mode (enum rtx_code code, machine_mode mode)
>    gcc_assert ((code != GTU && code != LTU && code != GEU && code != LEU)
>  	      || mode == CCUNSmode);
> 
> +  gcc_assert (mode == CCFPmode || mode == CCBCDmode
> +	      || (code != ORDERED && code != UNORDERED));
> +
>    gcc_assert (mode == CCFPmode
> -	      || (code != ORDERED && code != UNORDERED
> -		  && code != UNEQ && code != LTGT
> +	      || (code != UNEQ && code != LTGT
>  		  && code != UNGT && code != UNLT
>  		  && code != UNGE && code != UNLE));
> 
> diff --git a/gcc/config/rs6000/rs6000.md b/gcc/config/rs6000/rs6000.md
> index bc8bc6ab060..9b5fcdc8db0 100644
> --- a/gcc/config/rs6000/rs6000.md
> +++ b/gcc/config/rs6000/rs6000.md
> @@ -8115,7 +8115,7 @@ (define_expand "movcc"
>    ""
>    "")
> 
> -(define_mode_iterator CC_any [CC CCUNS CCEQ CCFP])
> +(define_mode_iterator CC_any [CC CCUNS CCEQ CCFP CCBCD])
> 
>  (define_insn "*movcc_<mode>"
>    [(set (match_operand:CC_any 0 "nonimmediate_operand"
> @@ -13245,6 +13245,7 @@ (define_insn_and_split "*nesi3_ext<mode>"
> 
>  (define_code_iterator fp_rev [ordered ne unle unge])
>  (define_code_iterator fp_two [ltgt le ge unlt ungt uneq])
> +(define_code_iterator ccbcd_rev [ordered ne le ge])
> 
>  (define_insn_and_split "*<code><mode>_cc"
>    [(set (match_operand:GPR 0 "gpc_reg_operand" "=r")
> @@ -13264,6 +13265,24 @@ (define_insn_and_split "*<code><mode>_cc"
>  }
>    [(set_attr "length" "12")])
> 
> +(define_insn_and_split "*<code><mode>_cc"
> +  [(set (match_operand:GPR 0 "gpc_reg_operand" "=r")
> +	(ccbcd_rev:GPR (match_operand:CCBCD 1 "cc_reg_operand" "y")
> +		    (const_int 0)))]
> +  ""
> +  "#"
> +  "&& 1"
> +  [(pc)]
> +{
> +  rtx_code revcode = reverse_condition (<CODE>);
> +  rtx eq = gen_rtx_fmt_ee (revcode, <MODE>mode, operands[1], const0_rtx);
> +  rtx tmp = gen_reg_rtx (<MODE>mode);
> +  emit_move_insn (tmp, eq);
> +  emit_insn (gen_xor<mode>3 (operands[0], tmp, const1_rtx));
> +  DONE;
> +}
> +  [(set_attr "length" "12")])
> +
>  (define_insn_and_split "*<code><mode>_cc"
>    [(set (match_operand:GPR 0 "gpc_reg_operand" "=r")
>  	(fp_two:GPR (match_operand:CCFP 1 "cc_reg_operand" "y")
> diff --git a/gcc/testsuite/gcc.target/powerpc/pr100736.c b/gcc/testsuite/gcc.target/powerpc/pr100736.c
> new file mode 100644
> index 00000000000..85e3ae56d55
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/powerpc/pr100736.c
> @@ -0,0 +1,12 @@
> +/* { dg-do compile } */
> +/* { dg-options "-mdejagnu-cpu=power8 -O2 -ffinite-math-only" } */
> +/* { dg-require-effective-target powerpc_vsx_ok } */
> +
> +/* Verify there is no ICE with finite-math-only */
> +
> +int foo (vector unsigned char a, vector unsigned char b)
> +{
> +  return __builtin_vec_bcdsub_ge (a, b, 0);
> +}
> +
> +
> diff --git a/gcc/testsuite/gcc.target/powerpc/pr114732.c b/gcc/testsuite/gcc.target/powerpc/pr114732.c
> new file mode 100644
> index 00000000000..d0b878780c6
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/powerpc/pr114732.c
> @@ -0,0 +1,12 @@
> +/* { dg-do compile } */
> +/* { dg-options "-mdejagnu-cpu=power8 -O2" } */
> +/* { dg-require-effective-target powerpc_vsx_ok } */
> +
> +/* Verify only one cr bit need to be tested. */
> +
> +int foo (vector unsigned char a, vector unsigned char b)
> +{
> +  return __builtin_vec_bcdsub_ge (a, b, 0) != 1;
> +}
> +
> +/* { dg-final { scan-assembler-not "cror" } } */

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Ping^2 [PATCH-1, rs6000] Add a new type of CC mode - CCBCD for bcd insns [PR100736, PR114732]
  2024-05-13  1:56 ` Ping " HAO CHEN GUI
@ 2024-05-27  1:55   ` HAO CHEN GUI
  0 siblings, 0 replies; 6+ messages in thread
From: HAO CHEN GUI @ 2024-05-27  1:55 UTC (permalink / raw)
  To: gcc-patches; +Cc: Segher Boessenkool, David, Kewen.Lin, Peter Bergner

Hi,
  Gently ping them.

Thanks
Gui Haochen

在 2024/5/13 9:56, HAO CHEN GUI 写道:
> Hi,
>   Gently ping the series of patches.
> [PATCH-1, rs6000]Add a new type of CC mode - CCBCD for bcd insns [PR100736, PR114732]
> https://gcc.gnu.org/pipermail/gcc-patches/2024-April/650217.html
> [PATCH-2, rs6000] Add a new type of CC mode - CCLTEQ
> https://gcc.gnu.org/pipermail/gcc-patches/2024-April/650218.html
> [PATCH-3, rs6000] Set CC mode of vector string isolate insns to CCEQ
> https://gcc.gnu.org/pipermail/gcc-patches/2024-April/650219.html
> [PATCH-4, rs6000] Optimize single cc bit reverse implementation
> https://gcc.gnu.org/pipermail/gcc-patches/2024-April/650220.html
> [PATCH-5, rs6000] Replace explicit CC bit reverse with common format
> https://gcc.gnu.org/pipermail/gcc-patches/2024-May/650766.html
> [PATCH-6, rs6000] Split setcc to two insns after reload
> https://gcc.gnu.org/pipermail/gcc-patches/2024-May/650856.html
> 
> Thanks
> Gui Haochen
> 
> 在 2024/4/30 15:18, HAO CHEN GUI 写道:
>> Hi,
>>   It's the first patch of a series of patches optimizing CC modes on
>> rs6000.
>>
>>   bcd insns set all four bits of a CR field. But it has different single
>> bit reverse behavior than CCFP's. The forth bit of bcd cr fields is used
>> to indict overflow or invalid number. It's not a bit for unordered test.
>> So the "le" test should be reversed to "gt" not "ungt". The "ge" test
>> should be reversed to "lt" not "unlt". That's the root cause of PR100736
>> and PR114732.
>>
>>   This patch fixes the issue by adding a new type of CC mode - CCBCD for
>> all bcd insns. Here a new setcc_rev pattern is added for ccbcd. It will
>> be merged to a uniform pattern which is for all CC modes in sequential
>> patch.
>>
>>   The rtl code "unordered" is still used for testing overflow or
>> invalid number. IMHO, the "unordered" on a CC mode can be considered as
>> testing the forth bit of a CR field setting or not. The "eq" on a CC mode
>> can be considered as testing the third bit setting or not. Thus we avoid
>> creating lots of unspecs for the CR bit testing.
>>
>>   Bootstrapped and tested on powerpc64-linux BE and LE with no
>> regressions. Is it OK for the trunk?
>>
>> Thanks
>> Gui Haochen
>>
>>
>> ChangeLog
>> rs6000: Add a new type of CC mode - CCBCD for bcd insns
>>
>> gcc/
>> 	PR target/100736
>> 	PR target/114732
>> 	* config/rs6000/altivec.md (bcd<bcd_add_sub>_<mode>): Replace CCFP
>> 	with CCBCD.
>> 	(*bcd<bcd_add_sub>_test_<mode>): Likewise.
>> 	(*bcd<bcd_add_sub>_test2_<mode>): Likewise.
>> 	(bcd<bcd_add_sub>_<code>_<mode>): Likewise.
>> 	(*bcdinvalid_<mode>): Likewise.
>> 	(bcdinvalid_<mode>): Likewise.
>> 	(bcdshift_v16qi): Likewise.
>> 	(bcdmul10_v16qi): Likewise.
>> 	(bcddiv10_v16qi): Likewise.
>> 	(peephole for bcd_add/sub): Likewise.
>> 	* config/rs6000/predicates.md (branch_comparison_operator): Add CCBCD
>> 	and its supported comparison codes.
>> 	* config/rs6000/rs6000-modes.def (CC_MODE): Add CCBCD.
>> 	* config/rs6000/rs6000.cc (validate_condition_mode): Add CCBCD
>> 	assertion.
>> 	* config/rs6000/rs6000.md (CC_any): Add CCBCD.
>> 	(ccbcd_rev): New code iterator.
>> 	(*<code><mode>_cc): New insn and split pattern for CCBCD reverse
>> 	compare.
>>
>> gcc/testsuite/
>> 	PR target/100736
>> 	PR target/114732
>> 	* gcc.target/powerpc/pr100736.c: New.
>> 	* gcc.target/powerpc/pr114732.c: New.
>>
>> patch.diff
>> diff --git a/gcc/config/rs6000/altivec.md b/gcc/config/rs6000/altivec.md
>> index bb20441c096..9fa8cf89f61 100644
>> --- a/gcc/config/rs6000/altivec.md
>> +++ b/gcc/config/rs6000/altivec.md
>> @@ -4443,7 +4443,7 @@ (define_insn "bcd<bcd_add_sub>_<mode>"
>>  		      (match_operand:VBCD 2 "register_operand" "v")
>>  		      (match_operand:QI 3 "const_0_to_1_operand" "n")]
>>  		     UNSPEC_BCD_ADD_SUB))
>> -   (clobber (reg:CCFP CR6_REGNO))]
>> +   (clobber (reg:CCBCD CR6_REGNO))]
>>    "TARGET_P8_VECTOR"
>>    "bcd<bcd_add_sub>. %0,%1,%2,%3"
>>    [(set_attr "type" "vecsimple")])
>> @@ -4454,8 +4454,8 @@ (define_insn "bcd<bcd_add_sub>_<mode>"
>>  ;; probably should be one that can go in the VMX (Altivec) registers, so we
>>  ;; can't use DDmode or DFmode.
>>  (define_insn "*bcd<bcd_add_sub>_test_<mode>"
>> -  [(set (reg:CCFP CR6_REGNO)
>> -	(compare:CCFP
>> +  [(set (reg:CCBCD CR6_REGNO)
>> +	(compare:CCBCD
>>  	 (unspec:V2DF [(match_operand:VBCD 1 "register_operand" "v")
>>  		       (match_operand:VBCD 2 "register_operand" "v")
>>  		       (match_operand:QI 3 "const_0_to_1_operand" "i")]
>> @@ -4472,8 +4472,8 @@ (define_insn "*bcd<bcd_add_sub>_test2_<mode>"
>>  		      (match_operand:VBCD 2 "register_operand" "v")
>>  		      (match_operand:QI 3 "const_0_to_1_operand" "i")]
>>  		     UNSPEC_BCD_ADD_SUB))
>> -   (set (reg:CCFP CR6_REGNO)
>> -	(compare:CCFP
>> +   (set (reg:CCBCD CR6_REGNO)
>> +	(compare:CCBCD
>>  	 (unspec:V2DF [(match_dup 1)
>>  		       (match_dup 2)
>>  		       (match_dup 3)]
>> @@ -4566,8 +4566,8 @@ (define_insn "vclrrb"
>>     [(set_attr "type" "vecsimple")])
>>
>>  (define_expand "bcd<bcd_add_sub>_<code>_<mode>"
>> -  [(parallel [(set (reg:CCFP CR6_REGNO)
>> -		   (compare:CCFP
>> +  [(parallel [(set (reg:CCBCD CR6_REGNO)
>> +		   (compare:CCBCD
>>  		    (unspec:V2DF [(match_operand:VBCD 1 "register_operand")
>>  				  (match_operand:VBCD 2 "register_operand")
>>  				  (match_operand:QI 3 "const_0_to_1_operand")]
>> @@ -4575,7 +4575,7 @@ (define_expand "bcd<bcd_add_sub>_<code>_<mode>"
>>  		    (match_dup 4)))
>>  	      (clobber (match_scratch:VBCD 5))])
>>     (set (match_operand:SI 0 "register_operand")
>> -	(BCD_TEST:SI (reg:CCFP CR6_REGNO)
>> +	(BCD_TEST:SI (reg:CCBCD CR6_REGNO)
>>  		     (const_int 0)))]
>>    "TARGET_P8_VECTOR"
>>  {
>> @@ -4583,8 +4583,8 @@ (define_expand "bcd<bcd_add_sub>_<code>_<mode>"
>>  })
>>
>>  (define_insn "*bcdinvalid_<mode>"
>> -  [(set (reg:CCFP CR6_REGNO)
>> -	(compare:CCFP
>> +  [(set (reg:CCBCD CR6_REGNO)
>> +	(compare:CCBCD
>>  	 (unspec:V2DF [(match_operand:VBCD 1 "register_operand" "v")]
>>  		      UNSPEC_BCDSUB)
>>  	 (match_operand:V2DF 2 "zero_constant" "j")))
>> @@ -4594,14 +4594,14 @@ (define_insn "*bcdinvalid_<mode>"
>>    [(set_attr "type" "vecsimple")])
>>
>>  (define_expand "bcdinvalid_<mode>"
>> -  [(parallel [(set (reg:CCFP CR6_REGNO)
>> -		   (compare:CCFP
>> +  [(parallel [(set (reg:CCBCD CR6_REGNO)
>> +		   (compare:CCBCD
>>  		    (unspec:V2DF [(match_operand:VBCD 1 "register_operand")]
>>  				 UNSPEC_BCDSUB)
>>  		    (match_dup 2)))
>>  	      (clobber (match_scratch:VBCD 3))])
>>     (set (match_operand:SI 0 "register_operand")
>> -	(unordered:SI (reg:CCFP CR6_REGNO)
>> +	(unordered:SI (reg:CCBCD CR6_REGNO)
>>  		      (const_int 0)))]
>>    "TARGET_P8_VECTOR"
>>  {
>> @@ -4614,7 +4614,7 @@ (define_insn "bcdshift_v16qi"
>>  		       (match_operand:V16QI 2 "register_operand" "v")
>>  		       (match_operand:QI 3 "const_0_to_1_operand" "n")]
>>  		     UNSPEC_BCDSHIFT))
>> -   (clobber (reg:CCFP CR6_REGNO))]
>> +   (clobber (reg:CCBCD CR6_REGNO))]
>>    "TARGET_P8_VECTOR"
>>    "bcds. %0,%1,%2,%3"
>>    [(set_attr "type" "vecsimple")])
>> @@ -4623,7 +4623,7 @@ (define_expand "bcdmul10_v16qi"
>>    [(set (match_operand:V16QI 0 "register_operand")
>>  	(unspec:V16QI [(match_operand:V16QI 1 "register_operand")]
>>  		      UNSPEC_BCDSHIFT))
>> -   (clobber (reg:CCFP CR6_REGNO))]
>> +   (clobber (reg:CCBCD CR6_REGNO))]
>>    "TARGET_P9_VECTOR"
>>  {
>>    rtx one = gen_reg_rtx (V16QImode);
>> @@ -4638,7 +4638,7 @@ (define_expand "bcddiv10_v16qi"
>>    [(set (match_operand:V16QI 0 "register_operand")
>>  	(unspec:V16QI [(match_operand:V16QI 1 "register_operand")]
>>  		      UNSPEC_BCDSHIFT))
>> -   (clobber (reg:CCFP CR6_REGNO))]
>> +   (clobber (reg:CCBCD CR6_REGNO))]
>>    "TARGET_P9_VECTOR"
>>  {
>>    rtx one = gen_reg_rtx (V16QImode);
>> @@ -4662,9 +4662,9 @@ (define_peephole2
>>  				 (match_operand:V1TI 2 "register_operand")
>>  				 (match_operand:QI 3 "const_0_to_1_operand")]
>>  				UNSPEC_BCD_ADD_SUB))
>> -	      (clobber (reg:CCFP CR6_REGNO))])
>> -   (parallel [(set (reg:CCFP CR6_REGNO)
>> -		   (compare:CCFP
>> +	      (clobber (reg:CCBCD CR6_REGNO))])
>> +   (parallel [(set (reg:CCBCD CR6_REGNO)
>> +		   (compare:CCBCD
>>  		    (unspec:V2DF [(match_dup 1)
>>  				  (match_dup 2)
>>  				  (match_dup 3)]
>> @@ -4677,8 +4677,8 @@ (define_peephole2
>>  				 (match_dup 2)
>>  				 (match_dup 3)]
>>  				UNSPEC_BCD_ADD_SUB))
>> -	      (set (reg:CCFP CR6_REGNO)
>> -		   (compare:CCFP
>> +	      (set (reg:CCBCD CR6_REGNO)
>> +		   (compare:CCBCD
>>  		    (unspec:V2DF [(match_dup 1)
>>  				  (match_dup 2)
>>  				  (match_dup 3)]
>> diff --git a/gcc/config/rs6000/predicates.md b/gcc/config/rs6000/predicates.md
>> index d23ce9a77a3..18198add744 100644
>> --- a/gcc/config/rs6000/predicates.md
>> +++ b/gcc/config/rs6000/predicates.md
>> @@ -1350,7 +1350,9 @@ (define_predicate "branch_comparison_operator"
>>  	  (if_then_else (match_test "flag_finite_math_only")
>>  	    (match_code "lt,le,gt,ge,eq,ne,unordered,ordered")
>>  	    (match_code "lt,gt,eq,unordered,unge,unle,ne,ordered"))
>> -	  (match_code "lt,ltu,le,leu,gt,gtu,ge,geu,eq,ne"))
>> +	  (if_then_else (match_test "GET_MODE (XEXP (op, 0)) == CCBCDmode")
>> +	    (match_code "lt,le,gt,ge,eq,ne,unordered,ordered")
>> +	    (match_code "lt,ltu,le,leu,gt,gtu,ge,geu,eq,ne")))
>>  	(match_test "validate_condition_mode (GET_CODE (op),
>>  					      GET_MODE (XEXP (op, 0))),
>>  		     1")))
>> diff --git a/gcc/config/rs6000/rs6000-modes.def b/gcc/config/rs6000/rs6000-modes.def
>> index 094b246c834..3e2e6dfb4ff 100644
>> --- a/gcc/config/rs6000/rs6000-modes.def
>> +++ b/gcc/config/rs6000/rs6000-modes.def
>> @@ -61,6 +61,7 @@ FRACTIONAL_FLOAT_MODE (TF, FLOAT_PRECISION_TFmode, 16, ieee_quad_format);
>>
>>  CC_MODE (CCUNS);
>>  CC_MODE (CCFP);
>> +CC_MODE (CCBCD);	/* Used for bcd insns */
>>  CC_MODE (CCEQ);
>>
>>  /* Vector modes.  */
>> diff --git a/gcc/config/rs6000/rs6000.cc b/gcc/config/rs6000/rs6000.cc
>> index 6ba9df4f02e..4068cd8b929 100644
>> --- a/gcc/config/rs6000/rs6000.cc
>> +++ b/gcc/config/rs6000/rs6000.cc
>> @@ -11597,9 +11597,11 @@ validate_condition_mode (enum rtx_code code, machine_mode mode)
>>    gcc_assert ((code != GTU && code != LTU && code != GEU && code != LEU)
>>  	      || mode == CCUNSmode);
>>
>> +  gcc_assert (mode == CCFPmode || mode == CCBCDmode
>> +	      || (code != ORDERED && code != UNORDERED));
>> +
>>    gcc_assert (mode == CCFPmode
>> -	      || (code != ORDERED && code != UNORDERED
>> -		  && code != UNEQ && code != LTGT
>> +	      || (code != UNEQ && code != LTGT
>>  		  && code != UNGT && code != UNLT
>>  		  && code != UNGE && code != UNLE));
>>
>> diff --git a/gcc/config/rs6000/rs6000.md b/gcc/config/rs6000/rs6000.md
>> index bc8bc6ab060..9b5fcdc8db0 100644
>> --- a/gcc/config/rs6000/rs6000.md
>> +++ b/gcc/config/rs6000/rs6000.md
>> @@ -8115,7 +8115,7 @@ (define_expand "movcc"
>>    ""
>>    "")
>>
>> -(define_mode_iterator CC_any [CC CCUNS CCEQ CCFP])
>> +(define_mode_iterator CC_any [CC CCUNS CCEQ CCFP CCBCD])
>>
>>  (define_insn "*movcc_<mode>"
>>    [(set (match_operand:CC_any 0 "nonimmediate_operand"
>> @@ -13245,6 +13245,7 @@ (define_insn_and_split "*nesi3_ext<mode>"
>>
>>  (define_code_iterator fp_rev [ordered ne unle unge])
>>  (define_code_iterator fp_two [ltgt le ge unlt ungt uneq])
>> +(define_code_iterator ccbcd_rev [ordered ne le ge])
>>
>>  (define_insn_and_split "*<code><mode>_cc"
>>    [(set (match_operand:GPR 0 "gpc_reg_operand" "=r")
>> @@ -13264,6 +13265,24 @@ (define_insn_and_split "*<code><mode>_cc"
>>  }
>>    [(set_attr "length" "12")])
>>
>> +(define_insn_and_split "*<code><mode>_cc"
>> +  [(set (match_operand:GPR 0 "gpc_reg_operand" "=r")
>> +	(ccbcd_rev:GPR (match_operand:CCBCD 1 "cc_reg_operand" "y")
>> +		    (const_int 0)))]
>> +  ""
>> +  "#"
>> +  "&& 1"
>> +  [(pc)]
>> +{
>> +  rtx_code revcode = reverse_condition (<CODE>);
>> +  rtx eq = gen_rtx_fmt_ee (revcode, <MODE>mode, operands[1], const0_rtx);
>> +  rtx tmp = gen_reg_rtx (<MODE>mode);
>> +  emit_move_insn (tmp, eq);
>> +  emit_insn (gen_xor<mode>3 (operands[0], tmp, const1_rtx));
>> +  DONE;
>> +}
>> +  [(set_attr "length" "12")])
>> +
>>  (define_insn_and_split "*<code><mode>_cc"
>>    [(set (match_operand:GPR 0 "gpc_reg_operand" "=r")
>>  	(fp_two:GPR (match_operand:CCFP 1 "cc_reg_operand" "y")
>> diff --git a/gcc/testsuite/gcc.target/powerpc/pr100736.c b/gcc/testsuite/gcc.target/powerpc/pr100736.c
>> new file mode 100644
>> index 00000000000..85e3ae56d55
>> --- /dev/null
>> +++ b/gcc/testsuite/gcc.target/powerpc/pr100736.c
>> @@ -0,0 +1,12 @@
>> +/* { dg-do compile } */
>> +/* { dg-options "-mdejagnu-cpu=power8 -O2 -ffinite-math-only" } */
>> +/* { dg-require-effective-target powerpc_vsx_ok } */
>> +
>> +/* Verify there is no ICE with finite-math-only */
>> +
>> +int foo (vector unsigned char a, vector unsigned char b)
>> +{
>> +  return __builtin_vec_bcdsub_ge (a, b, 0);
>> +}
>> +
>> +
>> diff --git a/gcc/testsuite/gcc.target/powerpc/pr114732.c b/gcc/testsuite/gcc.target/powerpc/pr114732.c
>> new file mode 100644
>> index 00000000000..d0b878780c6
>> --- /dev/null
>> +++ b/gcc/testsuite/gcc.target/powerpc/pr114732.c
>> @@ -0,0 +1,12 @@
>> +/* { dg-do compile } */
>> +/* { dg-options "-mdejagnu-cpu=power8 -O2" } */
>> +/* { dg-require-effective-target powerpc_vsx_ok } */
>> +
>> +/* Verify only one cr bit need to be tested. */
>> +
>> +int foo (vector unsigned char a, vector unsigned char b)
>> +{
>> +  return __builtin_vec_bcdsub_ge (a, b, 0) != 1;
>> +}
>> +
>> +/* { dg-final { scan-assembler-not "cror" } } */

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH-1, rs6000] Add a new type of CC mode - CCBCD for bcd insns [PR100736, PR114732]
  2024-04-30  7:18 [PATCH-1, rs6000] Add a new type of CC mode - CCBCD for bcd insns [PR100736, PR114732] HAO CHEN GUI
  2024-05-13  1:56 ` Ping " HAO CHEN GUI
@ 2024-05-29  5:26 ` Kewen.Lin
  2024-05-30  3:14   ` HAO CHEN GUI
  1 sibling, 1 reply; 6+ messages in thread
From: Kewen.Lin @ 2024-05-29  5:26 UTC (permalink / raw)
  To: HAO CHEN GUI; +Cc: Segher Boessenkool, David, Peter Bergner, gcc-patches

Hi,

on 2024/4/30 15:18, HAO CHEN GUI wrote:
> Hi,
>   It's the first patch of a series of patches optimizing CC modes on
> rs6000.
> 
>   bcd insns set all four bits of a CR field. But it has different single
> bit reverse behavior than CCFP's. The forth bit of bcd cr fields is used
> to indict overflow or invalid number. It's not a bit for unordered test.
> So the "le" test should be reversed to "gt" not "ungt". The "ge" test
> should be reversed to "lt" not "unlt". That's the root cause of PR100736
> and PR114732.
> 
>   This patch fixes the issue by adding a new type of CC mode - CCBCD for
> all bcd insns. Here a new setcc_rev pattern is added for ccbcd. It will
> be merged to a uniform pattern which is for all CC modes in sequential
> patch.

Thanks for doing this, adding one more CCmode for BCD specific looks
reasonable and make code more clear.

> 
>   The rtl code "unordered" is still used for testing overflow or
> invalid number. IMHO, the "unordered" on a CC mode can be considered as
> testing the forth bit of a CR field setting or not. The "eq" on a CC mode
> can be considered as testing the third bit setting or not. Thus we avoid
> creating lots of unspecs for the CR bit testing.

I can understand re-using "unordered" and "eq" will save some efforts than
doing with unspecs, but they are actually RTL codes instead of bits on the
specific hardware CR, a downside is that people who isn't aware of this
design point can have some misunderstanding when reading/checking the code
or dumping, from this perspective unspecs (with reasonable name) can be
more meaningful.  Normally adopting RTL code is better since they have the
chance to be considered (optimized) in generic pass/code, but it isn't the
case here as we just use the code itself but not be with the same semantic
(meaning).  Looking forward to others' opinions on this, if we want to adopt
"unordered" and "eq" like what this patch does, I think we should at least
emphasize such points in rs6000-modes.def.

> 
>   Bootstrapped and tested on powerpc64-linux BE and LE with no
> regressions. Is it OK for the trunk?

Some minor comments are inlined, Segher did a lot of work on CC, I'm looking
forward to his review on this patch series. :)

> 
> Thanks
> Gui Haochen
> 
> 
> ChangeLog
> rs6000: Add a new type of CC mode - CCBCD for bcd insns
> 
> gcc/
> 	PR target/100736
> 	PR target/114732
> 	* config/rs6000/altivec.md (bcd<bcd_add_sub>_<mode>): Replace CCFP
> 	with CCBCD.
> 	(*bcd<bcd_add_sub>_test_<mode>): Likewise.
> 	(*bcd<bcd_add_sub>_test2_<mode>): Likewise.
> 	(bcd<bcd_add_sub>_<code>_<mode>): Likewise.
> 	(*bcdinvalid_<mode>): Likewise.
> 	(bcdinvalid_<mode>): Likewise.
> 	(bcdshift_v16qi): Likewise.
> 	(bcdmul10_v16qi): Likewise.
> 	(bcddiv10_v16qi): Likewise.
> 	(peephole for bcd_add/sub): Likewise.
> 	* config/rs6000/predicates.md (branch_comparison_operator): Add CCBCD
> 	and its supported comparison codes.
> 	* config/rs6000/rs6000-modes.def (CC_MODE): Add CCBCD.
> 	* config/rs6000/rs6000.cc (validate_condition_mode): Add CCBCD
> 	assertion.
> 	* config/rs6000/rs6000.md (CC_any): Add CCBCD.
> 	(ccbcd_rev): New code iterator.
> 	(*<code><mode>_cc): New insn and split pattern for CCBCD reverse
> 	compare.
> 
> gcc/testsuite/
> 	PR target/100736
> 	PR target/114732
> 	* gcc.target/powerpc/pr100736.c: New.
> 	* gcc.target/powerpc/pr114732.c: New.
> 
> patch.diff
> diff --git a/gcc/config/rs6000/altivec.md b/gcc/config/rs6000/altivec.md
> index bb20441c096..9fa8cf89f61 100644
> --- a/gcc/config/rs6000/altivec.md
> +++ b/gcc/config/rs6000/altivec.md
> @@ -4443,7 +4443,7 @@ (define_insn "bcd<bcd_add_sub>_<mode>"
>  		      (match_operand:VBCD 2 "register_operand" "v")
>  		      (match_operand:QI 3 "const_0_to_1_operand" "n")]
>  		     UNSPEC_BCD_ADD_SUB))
> -   (clobber (reg:CCFP CR6_REGNO))]
> +   (clobber (reg:CCBCD CR6_REGNO))]
>    "TARGET_P8_VECTOR"
>    "bcd<bcd_add_sub>. %0,%1,%2,%3"
>    [(set_attr "type" "vecsimple")])
> @@ -4454,8 +4454,8 @@ (define_insn "bcd<bcd_add_sub>_<mode>"
>  ;; probably should be one that can go in the VMX (Altivec) registers, so we
>  ;; can't use DDmode or DFmode.

Here is a paragraph of comments above:

;; Use a floating point type (V2DFmode) for the compare to set CR6 so that we
;; can use the unordered test for BCD nans and add/subtracts that overflow.  An
;; UNORDERED test on an integer type (like V1TImode) is not defined.  The type
;; probably should be one that can go in the VMX (Altivec) registers, so we
;; can't use DDmode or DFmode.

Is it still hold?  It's not obvious where is the code checking unordered test
should be on fp type (modes), if it still takes effect, "unspec" would help
to get rid of this restriction.  Otherwise, this comment should be updated
and we can drop this workaround with V2DF here.

>  (define_insn "*bcd<bcd_add_sub>_test_<mode>"
> -  [(set (reg:CCFP CR6_REGNO)
> -	(compare:CCFP
> +  [(set (reg:CCBCD CR6_REGNO)
> +	(compare:CCBCD
>  	 (unspec:V2DF [(match_operand:VBCD 1 "register_operand" "v")
>  		       (match_operand:VBCD 2 "register_operand" "v")
>  		       (match_operand:QI 3 "const_0_to_1_operand" "i")]
> @@ -4472,8 +4472,8 @@ (define_insn "*bcd<bcd_add_sub>_test2_<mode>"
>  		      (match_operand:VBCD 2 "register_operand" "v")
>  		      (match_operand:QI 3 "const_0_to_1_operand" "i")]
>  		     UNSPEC_BCD_ADD_SUB))
> -   (set (reg:CCFP CR6_REGNO)
> -	(compare:CCFP
> +   (set (reg:CCBCD CR6_REGNO)
> +	(compare:CCBCD
>  	 (unspec:V2DF [(match_dup 1)
>  		       (match_dup 2)
>  		       (match_dup 3)]
> @@ -4566,8 +4566,8 @@ (define_insn "vclrrb"
>     [(set_attr "type" "vecsimple")])
> 
>  (define_expand "bcd<bcd_add_sub>_<code>_<mode>"
> -  [(parallel [(set (reg:CCFP CR6_REGNO)
> -		   (compare:CCFP
> +  [(parallel [(set (reg:CCBCD CR6_REGNO)
> +		   (compare:CCBCD
>  		    (unspec:V2DF [(match_operand:VBCD 1 "register_operand")
>  				  (match_operand:VBCD 2 "register_operand")
>  				  (match_operand:QI 3 "const_0_to_1_operand")]
> @@ -4575,7 +4575,7 @@ (define_expand "bcd<bcd_add_sub>_<code>_<mode>"
>  		    (match_dup 4)))
>  	      (clobber (match_scratch:VBCD 5))])
>     (set (match_operand:SI 0 "register_operand")
> -	(BCD_TEST:SI (reg:CCFP CR6_REGNO)
> +	(BCD_TEST:SI (reg:CCBCD CR6_REGNO)
>  		     (const_int 0)))]
>    "TARGET_P8_VECTOR"
>  {
> @@ -4583,8 +4583,8 @@ (define_expand "bcd<bcd_add_sub>_<code>_<mode>"
>  })
> 
>  (define_insn "*bcdinvalid_<mode>"
> -  [(set (reg:CCFP CR6_REGNO)
> -	(compare:CCFP
> +  [(set (reg:CCBCD CR6_REGNO)
> +	(compare:CCBCD
>  	 (unspec:V2DF [(match_operand:VBCD 1 "register_operand" "v")]
>  		      UNSPEC_BCDSUB)
>  	 (match_operand:V2DF 2 "zero_constant" "j")))
> @@ -4594,14 +4594,14 @@ (define_insn "*bcdinvalid_<mode>"
>    [(set_attr "type" "vecsimple")])
> 
>  (define_expand "bcdinvalid_<mode>"
> -  [(parallel [(set (reg:CCFP CR6_REGNO)
> -		   (compare:CCFP
> +  [(parallel [(set (reg:CCBCD CR6_REGNO)
> +		   (compare:CCBCD
>  		    (unspec:V2DF [(match_operand:VBCD 1 "register_operand")]
>  				 UNSPEC_BCDSUB)
>  		    (match_dup 2)))
>  	      (clobber (match_scratch:VBCD 3))])
>     (set (match_operand:SI 0 "register_operand")
> -	(unordered:SI (reg:CCFP CR6_REGNO)
> +	(unordered:SI (reg:CCBCD CR6_REGNO)
>  		      (const_int 0)))]
>    "TARGET_P8_VECTOR"
>  {
> @@ -4614,7 +4614,7 @@ (define_insn "bcdshift_v16qi"
>  		       (match_operand:V16QI 2 "register_operand" "v")
>  		       (match_operand:QI 3 "const_0_to_1_operand" "n")]
>  		     UNSPEC_BCDSHIFT))
> -   (clobber (reg:CCFP CR6_REGNO))]
> +   (clobber (reg:CCBCD CR6_REGNO))]
>    "TARGET_P8_VECTOR"
>    "bcds. %0,%1,%2,%3"
>    [(set_attr "type" "vecsimple")])
> @@ -4623,7 +4623,7 @@ (define_expand "bcdmul10_v16qi"
>    [(set (match_operand:V16QI 0 "register_operand")
>  	(unspec:V16QI [(match_operand:V16QI 1 "register_operand")]
>  		      UNSPEC_BCDSHIFT))
> -   (clobber (reg:CCFP CR6_REGNO))]
> +   (clobber (reg:CCBCD CR6_REGNO))]
>    "TARGET_P9_VECTOR"
>  {
>    rtx one = gen_reg_rtx (V16QImode);
> @@ -4638,7 +4638,7 @@ (define_expand "bcddiv10_v16qi"
>    [(set (match_operand:V16QI 0 "register_operand")
>  	(unspec:V16QI [(match_operand:V16QI 1 "register_operand")]
>  		      UNSPEC_BCDSHIFT))
> -   (clobber (reg:CCFP CR6_REGNO))]
> +   (clobber (reg:CCBCD CR6_REGNO))]
>    "TARGET_P9_VECTOR"
>  {
>    rtx one = gen_reg_rtx (V16QImode);
> @@ -4662,9 +4662,9 @@ (define_peephole2
>  				 (match_operand:V1TI 2 "register_operand")
>  				 (match_operand:QI 3 "const_0_to_1_operand")]
>  				UNSPEC_BCD_ADD_SUB))
> -	      (clobber (reg:CCFP CR6_REGNO))])
> -   (parallel [(set (reg:CCFP CR6_REGNO)
> -		   (compare:CCFP
> +	      (clobber (reg:CCBCD CR6_REGNO))])
> +   (parallel [(set (reg:CCBCD CR6_REGNO)
> +		   (compare:CCBCD
>  		    (unspec:V2DF [(match_dup 1)
>  				  (match_dup 2)
>  				  (match_dup 3)]
> @@ -4677,8 +4677,8 @@ (define_peephole2
>  				 (match_dup 2)
>  				 (match_dup 3)]
>  				UNSPEC_BCD_ADD_SUB))
> -	      (set (reg:CCFP CR6_REGNO)
> -		   (compare:CCFP
> +	      (set (reg:CCBCD CR6_REGNO)
> +		   (compare:CCBCD
>  		    (unspec:V2DF [(match_dup 1)
>  				  (match_dup 2)
>  				  (match_dup 3)]
> diff --git a/gcc/config/rs6000/predicates.md b/gcc/config/rs6000/predicates.md
> index d23ce9a77a3..18198add744 100644
> --- a/gcc/config/rs6000/predicates.md
> +++ b/gcc/config/rs6000/predicates.md
> @@ -1350,7 +1350,9 @@ (define_predicate "branch_comparison_operator"
>  	  (if_then_else (match_test "flag_finite_math_only")
>  	    (match_code "lt,le,gt,ge,eq,ne,unordered,ordered")
>  	    (match_code "lt,gt,eq,unordered,unge,unle,ne,ordered"))
> -	  (match_code "lt,ltu,le,leu,gt,gtu,ge,geu,eq,ne"))
> +	  (if_then_else (match_test "GET_MODE (XEXP (op, 0)) == CCBCDmode")
> +	    (match_code "lt,le,gt,ge,eq,ne,unordered,ordered")
> +	    (match_code "lt,ltu,le,leu,gt,gtu,ge,geu,eq,ne")))
>  	(match_test "validate_condition_mode (GET_CODE (op),
>  					      GET_MODE (XEXP (op, 0))),
>  		     1")))
> diff --git a/gcc/config/rs6000/rs6000-modes.def b/gcc/config/rs6000/rs6000-modes.def
> index 094b246c834..3e2e6dfb4ff 100644
> --- a/gcc/config/rs6000/rs6000-modes.def
> +++ b/gcc/config/rs6000/rs6000-modes.def
> @@ -61,6 +61,7 @@ FRACTIONAL_FLOAT_MODE (TF, FLOAT_PRECISION_TFmode, 16, ieee_quad_format);
> 
>  CC_MODE (CCUNS);
>  CC_MODE (CCFP);
> +CC_MODE (CCBCD);	/* Used for bcd insns */
>  CC_MODE (CCEQ);
> 
>  /* Vector modes.  */
> diff --git a/gcc/config/rs6000/rs6000.cc b/gcc/config/rs6000/rs6000.cc
> index 6ba9df4f02e..4068cd8b929 100644
> --- a/gcc/config/rs6000/rs6000.cc
> +++ b/gcc/config/rs6000/rs6000.cc
> @@ -11597,9 +11597,11 @@ validate_condition_mode (enum rtx_code code, machine_mode mode)
>    gcc_assert ((code != GTU && code != LTU && code != GEU && code != LEU)
>  	      || mode == CCUNSmode);
> 
> +  gcc_assert (mode == CCFPmode || mode == CCBCDmode
> +	      || (code != ORDERED && code != UNORDERED));
> +

An example for the above concern, if people read this code without knowing
the design point above, one question would come up like how can CCBCDmode
have {UN,}ORDERED available, shouldn't it only have overflow/invalid.

>    gcc_assert (mode == CCFPmode
> -	      || (code != ORDERED && code != UNORDERED
> -		  && code != UNEQ && code != LTGT
> +	      || (code != UNEQ && code != LTGT
>  		  && code != UNGT && code != UNLT
>  		  && code != UNGE && code != UNLE));
> 
> diff --git a/gcc/config/rs6000/rs6000.md b/gcc/config/rs6000/rs6000.md
> index bc8bc6ab060..9b5fcdc8db0 100644
> --- a/gcc/config/rs6000/rs6000.md
> +++ b/gcc/config/rs6000/rs6000.md
> @@ -8115,7 +8115,7 @@ (define_expand "movcc"
>    ""
>    "")
> 
> -(define_mode_iterator CC_any [CC CCUNS CCEQ CCFP])
> +(define_mode_iterator CC_any [CC CCUNS CCEQ CCFP CCBCD])
> 
>  (define_insn "*movcc_<mode>"
>    [(set (match_operand:CC_any 0 "nonimmediate_operand"
> @@ -13245,6 +13245,7 @@ (define_insn_and_split "*nesi3_ext<mode>"
> 
>  (define_code_iterator fp_rev [ordered ne unle unge])
>  (define_code_iterator fp_two [ltgt le ge unlt ungt uneq])
> +(define_code_iterator ccbcd_rev [ordered ne le ge])

Nit: As "fp_rev" uses "fp", s/ccbcd/bcd/

> 
>  (define_insn_and_split "*<code><mode>_cc"
>    [(set (match_operand:GPR 0 "gpc_reg_operand" "=r")
> @@ -13264,6 +13265,24 @@ (define_insn_and_split "*<code><mode>_cc"
>  }
>    [(set_attr "length" "12")])
> 
> +(define_insn_and_split "*<code><mode>_cc"
> +  [(set (match_operand:GPR 0 "gpc_reg_operand" "=r")
> +	(ccbcd_rev:GPR (match_operand:CCBCD 1 "cc_reg_operand" "y")
> +		    (const_int 0)))]
> +  ""
> +  "#"
> +  "&& 1"
> +  [(pc)]
> +{
> +  rtx_code revcode = reverse_condition (<CODE>);
> +  rtx eq = gen_rtx_fmt_ee (revcode, <MODE>mode, operands[1], const0_rtx);
> +  rtx tmp = gen_reg_rtx (<MODE>mode);
> +  emit_move_insn (tmp, eq);
> +  emit_insn (gen_xor<mode>3 (operands[0], tmp, const1_rtx));
> +  DONE;
> +}
> +  [(set_attr "length" "12")])
> +
>  (define_insn_and_split "*<code><mode>_cc"
>    [(set (match_operand:GPR 0 "gpc_reg_operand" "=r")
>  	(fp_two:GPR (match_operand:CCFP 1 "cc_reg_operand" "y")
> diff --git a/gcc/testsuite/gcc.target/powerpc/pr100736.c b/gcc/testsuite/gcc.target/powerpc/pr100736.c
> new file mode 100644
> index 00000000000..85e3ae56d55
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/powerpc/pr100736.c
> @@ -0,0 +1,12 @@
> +/* { dg-do compile } */
> +/* { dg-options "-mdejagnu-cpu=power8 -O2 -ffinite-math-only" } */
> +/* { dg-require-effective-target powerpc_vsx_ok } */

Nit: s/powerpc_vsx_ok/powerpc_vsx/

> +
> +/* Verify there is no ICE with finite-math-only */
> +
> +int foo (vector unsigned char a, vector unsigned char b)
> +{
> +  return __builtin_vec_bcdsub_ge (a, b, 0);
> +}
> +
> +
> diff --git a/gcc/testsuite/gcc.target/powerpc/pr114732.c b/gcc/testsuite/gcc.target/powerpc/pr114732.c
> new file mode 100644
> index 00000000000..d0b878780c6
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/powerpc/pr114732.c
> @@ -0,0 +1,12 @@
> +/* { dg-do compile } */
> +/* { dg-options "-mdejagnu-cpu=power8 -O2" } */
> +/* { dg-require-effective-target powerpc_vsx_ok } */

Ditto.

BR,
Kewen

> +
> +/* Verify only one cr bit need to be tested. */
> +
> +int foo (vector unsigned char a, vector unsigned char b)
> +{
> +  return __builtin_vec_bcdsub_ge (a, b, 0) != 1;
> +}
> +
> +/* { dg-final { scan-assembler-not "cror" } } */




^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH-1, rs6000] Add a new type of CC mode - CCBCD for bcd insns [PR100736, PR114732]
  2024-05-29  5:26 ` Kewen.Lin
@ 2024-05-30  3:14   ` HAO CHEN GUI
  2024-05-31  7:30     ` Kewen.Lin
  0 siblings, 1 reply; 6+ messages in thread
From: HAO CHEN GUI @ 2024-05-30  3:14 UTC (permalink / raw)
  To: Kewen.Lin; +Cc: Segher Boessenkool, David, Peter Bergner, gcc-patches

Hi Kewen,

在 2024/5/29 13:26, Kewen.Lin 写道:
> I can understand re-using "unordered" and "eq" will save some efforts than
> doing with unspecs, but they are actually RTL codes instead of bits on the
> specific hardware CR, a downside is that people who isn't aware of this
> design point can have some misunderstanding when reading/checking the code
> or dumping, from this perspective unspecs (with reasonable name) can be
> more meaningful.  Normally adopting RTL code is better since they have the
> chance to be considered (optimized) in generic pass/code, but it isn't the
> case here as we just use the code itself but not be with the same semantic
> (meaning).  Looking forward to others' opinions on this, if we want to adopt
> "unordered" and "eq" like what this patch does, I think we should at least
> emphasize such points in rs6000-modes.def.

Thanks so much for your comments. IMHO, the core is if we can re-define
"unordered" or "eq" for certain CC mode on a specific target. If we can't or
it's unsafe, we have to use the unspecs. In this case, I just want to define
the code "unordered" on CCBCD as testing if the bit 3 is set on this CR field.
Actually rs6000 already use "lt" code to test if bit 0 is set for vector
compare instructions. The following expand is an example.

(define_expand "vector_ae_<mode>_p"
  [(parallel
    [(set (reg:CC CR6_REGNO)
          (unspec:CC [(ne:CC (match_operand:VI 1 "vlogical_operand")
                             (match_operand:VI 2 "vlogical_operand"))]
           UNSPEC_PREDICATE))
     (set (match_dup 3)
          (ne:VI (match_dup 1)
                 (match_dup 2)))])
   (set (match_operand:SI 0 "register_operand" "=r")
        (lt:SI (reg:CC CR6_REGNO)
               (const_int 0)))
   (set (match_dup 0)
        (xor:SI (match_dup 0)
                (const_int 1)))]

I think the "lt" on CC just doesn't mean it compares if CC value is less than an
integer. It just tests the "lt" bit (bit 0) is set or not on this CC.

  Looking forward to your and Segher's further invaluable comments.

Thanks
Gui Haochen

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH-1, rs6000] Add a new type of CC mode - CCBCD for bcd insns [PR100736, PR114732]
  2024-05-30  3:14   ` HAO CHEN GUI
@ 2024-05-31  7:30     ` Kewen.Lin
  0 siblings, 0 replies; 6+ messages in thread
From: Kewen.Lin @ 2024-05-31  7:30 UTC (permalink / raw)
  To: HAO CHEN GUI; +Cc: Segher Boessenkool, David, Peter Bergner, gcc-patches

Hi Haochen,

on 2024/5/30 11:14, HAO CHEN GUI wrote:
> Hi Kewen,
> 
> 在 2024/5/29 13:26, Kewen.Lin 写道:
>> I can understand re-using "unordered" and "eq" will save some efforts than
>> doing with unspecs, but they are actually RTL codes instead of bits on the
>> specific hardware CR, a downside is that people who isn't aware of this
>> design point can have some misunderstanding when reading/checking the code
>> or dumping, from this perspective unspecs (with reasonable name) can be
>> more meaningful.  Normally adopting RTL code is better since they have the
>> chance to be considered (optimized) in generic pass/code, but it isn't the
>> case here as we just use the code itself but not be with the same semantic
>> (meaning).  Looking forward to others' opinions on this, if we want to adopt
>> "unordered" and "eq" like what this patch does, I think we should at least
>> emphasize such points in rs6000-modes.def.
> 
> Thanks so much for your comments. IMHO, the core is if we can re-define
> "unordered" or "eq" for certain CC mode on a specific target. If we can't or
> it's unsafe, we have to use the unspecs. In this case, I just want to define
> the code "unordered" on CCBCD as testing if the bit 3 is set on this CR field.

But my point is that "unordered" has its semantic, it looks a bit tricky to
adopt it on the result from BCD comparison when which only has "invalid" and
"overflow" other than normal ones, though I can understand that this patch
wants to use it to test bit 3 since for float comparison bit 3 is for
"unordered".  However, IMHO it would be more clear to have one unspec to test
bit 3 when bit 3 doesn't actually mean unordered result.

> Actually rs6000 already use "lt" code to test if bit 0 is set for vector
> compare instructions. The following expand is an example.

Yeah, but it doesn't mean it's the most sensible way to do this, IMHO it
suffers from the similar issue and can be improved as well.

> 
> (define_expand "vector_ae_<mode>_p"
>   [(parallel
>     [(set (reg:CC CR6_REGNO)
>           (unspec:CC [(ne:CC (match_operand:VI 1 "vlogical_operand")
>                              (match_operand:VI 2 "vlogical_operand"))]
>            UNSPEC_PREDICATE))
>      (set (match_dup 3)
>           (ne:VI (match_dup 1)
>                  (match_dup 2)))])
>    (set (match_operand:SI 0 "register_operand" "=r")
>         (lt:SI (reg:CC CR6_REGNO)
>                (const_int 0)))
>    (set (match_dup 0)
>         (xor:SI (match_dup 0)
>                 (const_int 1)))]
> 
> I think the "lt" on CC just doesn't mean it compares if CC value is less than an
> integer. It just tests the "lt" bit (bit 0) is set or not on this CC.

But bit 0 doesn't mean for "lt" comparison result in this context any more.

BR,
Kewen

> 
>   Looking forward to your and Segher's further invaluable comments.
> 
> Thanks
> Gui Haochen


^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2024-05-31  7:30 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2024-04-30  7:18 [PATCH-1, rs6000] Add a new type of CC mode - CCBCD for bcd insns [PR100736, PR114732] HAO CHEN GUI
2024-05-13  1:56 ` Ping " HAO CHEN GUI
2024-05-27  1:55   ` Ping^2 " HAO CHEN GUI
2024-05-29  5:26 ` Kewen.Lin
2024-05-30  3:14   ` HAO CHEN GUI
2024-05-31  7:30     ` Kewen.Lin

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).