public inbox for gcc-patches@gcc.gnu.org
From: "Kewen.Lin" <linkw@linux.ibm.com>
To: HAO CHEN GUI <guihaoc@linux.ibm.com>
Cc: Segher Boessenkool <segher@kernel.crashing.org>,
	David <dje.gcc@gmail.com>, Peter Bergner <bergner@linux.ibm.com>,
	gcc-patches <gcc-patches@gcc.gnu.org>
Subject: Re: [PATCH-1, rs6000] Add a new type of CC mode - CCBCD for bcd insns [PR100736, PR114732]
Date: Wed, 29 May 2024 13:26:42 +0800
Message-ID: <53003cc6-a67c-0faf-5b4d-824de5bddb88@linux.ibm.com>
In-Reply-To: <65719e53-d438-45db-b0cb-2829ba6ac7e2@linux.ibm.com>

Hi,

on 2024/4/30 15:18, HAO CHEN GUI wrote:
> Hi,
>   This is the first patch in a series that optimizes CC modes on
> rs6000.
> 
>   bcd insns set all four bits of a CR field, but their single-bit
> reversal behavior differs from CCFP's. The fourth bit of a bcd CR field
> indicates overflow or an invalid number; it is not a bit for the
> unordered test. So the "le" test should be reversed to "gt", not "ungt",
> and the "ge" test should be reversed to "lt", not "unlt". That's the root
> cause of PR100736 and PR114732.
> 
>   This patch fixes the issue by adding a new type of CC mode - CCBCD for
> all bcd insns. A new setcc_rev pattern is added here for CCBCD. It will
> be merged into a uniform pattern covering all CC modes in a subsequent
> patch.

Thanks for doing this.  Adding one more CC mode specifically for BCD looks
reasonable and makes the code clearer.

> 
>   The rtl code "unordered" is still used for testing overflow or
> invalid number. IMHO, "unordered" on a CC mode can be considered as
> testing whether the fourth bit of a CR field is set, and "eq" on a CC
> mode can be considered as testing whether the third bit is set. Thus we
> avoid creating lots of unspecs for the CR bit testing.

I can see that reusing "unordered" and "eq" saves some effort compared with
unspecs, but they are RTL codes rather than bits of the hardware CR.  The
downside is that people who aren't aware of this design point can
misunderstand the code or the dumps when reading or checking them; from
that perspective, unspecs (with reasonable names) would be more meaningful.
Normally adopting RTL codes is better since generic passes/code have the
chance to consider (optimize) them, but that isn't the case here, as we use
only the code itself and not its usual semantics.  I'm looking forward to
others' opinions on this.  If we do want to adopt "unordered" and "eq" as
this patch does, I think we should at least emphasize these points in
rs6000-modes.def.

> 
>   Bootstrapped and tested on powerpc64-linux BE and LE with no
> regressions. Is it OK for the trunk?

Some minor comments are inline below.  Segher has done a lot of work on CC
modes, so I'm looking forward to his review of this patch series. :)

> 
> Thanks
> Gui Haochen
> 
> 
> ChangeLog
> rs6000: Add a new type of CC mode - CCBCD for bcd insns
> 
> gcc/
> 	PR target/100736
> 	PR target/114732
> 	* config/rs6000/altivec.md (bcd<bcd_add_sub>_<mode>): Replace CCFP
> 	with CCBCD.
> 	(*bcd<bcd_add_sub>_test_<mode>): Likewise.
> 	(*bcd<bcd_add_sub>_test2_<mode>): Likewise.
> 	(bcd<bcd_add_sub>_<code>_<mode>): Likewise.
> 	(*bcdinvalid_<mode>): Likewise.
> 	(bcdinvalid_<mode>): Likewise.
> 	(bcdshift_v16qi): Likewise.
> 	(bcdmul10_v16qi): Likewise.
> 	(bcddiv10_v16qi): Likewise.
> 	(peephole for bcd_add/sub): Likewise.
> 	* config/rs6000/predicates.md (branch_comparison_operator): Add CCBCD
> 	and its supported comparison codes.
> 	* config/rs6000/rs6000-modes.def (CC_MODE): Add CCBCD.
> 	* config/rs6000/rs6000.cc (validate_condition_mode): Add CCBCD
> 	assertion.
> 	* config/rs6000/rs6000.md (CC_any): Add CCBCD.
> 	(ccbcd_rev): New code iterator.
> 	(*<code><mode>_cc): New insn and split pattern for CCBCD reverse
> 	compare.
> 
> gcc/testsuite/
> 	PR target/100736
> 	PR target/114732
> 	* gcc.target/powerpc/pr100736.c: New.
> 	* gcc.target/powerpc/pr114732.c: New.
> 
> patch.diff
> diff --git a/gcc/config/rs6000/altivec.md b/gcc/config/rs6000/altivec.md
> index bb20441c096..9fa8cf89f61 100644
> --- a/gcc/config/rs6000/altivec.md
> +++ b/gcc/config/rs6000/altivec.md
> @@ -4443,7 +4443,7 @@ (define_insn "bcd<bcd_add_sub>_<mode>"
>  		      (match_operand:VBCD 2 "register_operand" "v")
>  		      (match_operand:QI 3 "const_0_to_1_operand" "n")]
>  		     UNSPEC_BCD_ADD_SUB))
> -   (clobber (reg:CCFP CR6_REGNO))]
> +   (clobber (reg:CCBCD CR6_REGNO))]
>    "TARGET_P8_VECTOR"
>    "bcd<bcd_add_sub>. %0,%1,%2,%3"
>    [(set_attr "type" "vecsimple")])
> @@ -4454,8 +4454,8 @@ (define_insn "bcd<bcd_add_sub>_<mode>"
>  ;; probably should be one that can go in the VMX (Altivec) registers, so we
>  ;; can't use DDmode or DFmode.

There is a comment paragraph just above this pattern:

;; Use a floating point type (V2DFmode) for the compare to set CR6 so that we
;; can use the unordered test for BCD nans and add/subtracts that overflow.  An
;; UNORDERED test on an integer type (like V1TImode) is not defined.  The type
;; probably should be one that can go in the VMX (Altivec) registers, so we
;; can't use DDmode or DFmode.

Does this still hold?  It's not obvious where the code checks that an
unordered test must be on an FP type (mode).  If the restriction still
applies, using an "unspec" would help to get rid of it.  Otherwise, this
comment should be updated and we can drop the V2DF workaround here.

>  (define_insn "*bcd<bcd_add_sub>_test_<mode>"
> -  [(set (reg:CCFP CR6_REGNO)
> -	(compare:CCFP
> +  [(set (reg:CCBCD CR6_REGNO)
> +	(compare:CCBCD
>  	 (unspec:V2DF [(match_operand:VBCD 1 "register_operand" "v")
>  		       (match_operand:VBCD 2 "register_operand" "v")
>  		       (match_operand:QI 3 "const_0_to_1_operand" "i")]
> @@ -4472,8 +4472,8 @@ (define_insn "*bcd<bcd_add_sub>_test2_<mode>"
>  		      (match_operand:VBCD 2 "register_operand" "v")
>  		      (match_operand:QI 3 "const_0_to_1_operand" "i")]
>  		     UNSPEC_BCD_ADD_SUB))
> -   (set (reg:CCFP CR6_REGNO)
> -	(compare:CCFP
> +   (set (reg:CCBCD CR6_REGNO)
> +	(compare:CCBCD
>  	 (unspec:V2DF [(match_dup 1)
>  		       (match_dup 2)
>  		       (match_dup 3)]
> @@ -4566,8 +4566,8 @@ (define_insn "vclrrb"
>     [(set_attr "type" "vecsimple")])
> 
>  (define_expand "bcd<bcd_add_sub>_<code>_<mode>"
> -  [(parallel [(set (reg:CCFP CR6_REGNO)
> -		   (compare:CCFP
> +  [(parallel [(set (reg:CCBCD CR6_REGNO)
> +		   (compare:CCBCD
>  		    (unspec:V2DF [(match_operand:VBCD 1 "register_operand")
>  				  (match_operand:VBCD 2 "register_operand")
>  				  (match_operand:QI 3 "const_0_to_1_operand")]
> @@ -4575,7 +4575,7 @@ (define_expand "bcd<bcd_add_sub>_<code>_<mode>"
>  		    (match_dup 4)))
>  	      (clobber (match_scratch:VBCD 5))])
>     (set (match_operand:SI 0 "register_operand")
> -	(BCD_TEST:SI (reg:CCFP CR6_REGNO)
> +	(BCD_TEST:SI (reg:CCBCD CR6_REGNO)
>  		     (const_int 0)))]
>    "TARGET_P8_VECTOR"
>  {
> @@ -4583,8 +4583,8 @@ (define_expand "bcd<bcd_add_sub>_<code>_<mode>"
>  })
> 
>  (define_insn "*bcdinvalid_<mode>"
> -  [(set (reg:CCFP CR6_REGNO)
> -	(compare:CCFP
> +  [(set (reg:CCBCD CR6_REGNO)
> +	(compare:CCBCD
>  	 (unspec:V2DF [(match_operand:VBCD 1 "register_operand" "v")]
>  		      UNSPEC_BCDSUB)
>  	 (match_operand:V2DF 2 "zero_constant" "j")))
> @@ -4594,14 +4594,14 @@ (define_insn "*bcdinvalid_<mode>"
>    [(set_attr "type" "vecsimple")])
> 
>  (define_expand "bcdinvalid_<mode>"
> -  [(parallel [(set (reg:CCFP CR6_REGNO)
> -		   (compare:CCFP
> +  [(parallel [(set (reg:CCBCD CR6_REGNO)
> +		   (compare:CCBCD
>  		    (unspec:V2DF [(match_operand:VBCD 1 "register_operand")]
>  				 UNSPEC_BCDSUB)
>  		    (match_dup 2)))
>  	      (clobber (match_scratch:VBCD 3))])
>     (set (match_operand:SI 0 "register_operand")
> -	(unordered:SI (reg:CCFP CR6_REGNO)
> +	(unordered:SI (reg:CCBCD CR6_REGNO)
>  		      (const_int 0)))]
>    "TARGET_P8_VECTOR"
>  {
> @@ -4614,7 +4614,7 @@ (define_insn "bcdshift_v16qi"
>  		       (match_operand:V16QI 2 "register_operand" "v")
>  		       (match_operand:QI 3 "const_0_to_1_operand" "n")]
>  		     UNSPEC_BCDSHIFT))
> -   (clobber (reg:CCFP CR6_REGNO))]
> +   (clobber (reg:CCBCD CR6_REGNO))]
>    "TARGET_P8_VECTOR"
>    "bcds. %0,%1,%2,%3"
>    [(set_attr "type" "vecsimple")])
> @@ -4623,7 +4623,7 @@ (define_expand "bcdmul10_v16qi"
>    [(set (match_operand:V16QI 0 "register_operand")
>  	(unspec:V16QI [(match_operand:V16QI 1 "register_operand")]
>  		      UNSPEC_BCDSHIFT))
> -   (clobber (reg:CCFP CR6_REGNO))]
> +   (clobber (reg:CCBCD CR6_REGNO))]
>    "TARGET_P9_VECTOR"
>  {
>    rtx one = gen_reg_rtx (V16QImode);
> @@ -4638,7 +4638,7 @@ (define_expand "bcddiv10_v16qi"
>    [(set (match_operand:V16QI 0 "register_operand")
>  	(unspec:V16QI [(match_operand:V16QI 1 "register_operand")]
>  		      UNSPEC_BCDSHIFT))
> -   (clobber (reg:CCFP CR6_REGNO))]
> +   (clobber (reg:CCBCD CR6_REGNO))]
>    "TARGET_P9_VECTOR"
>  {
>    rtx one = gen_reg_rtx (V16QImode);
> @@ -4662,9 +4662,9 @@ (define_peephole2
>  				 (match_operand:V1TI 2 "register_operand")
>  				 (match_operand:QI 3 "const_0_to_1_operand")]
>  				UNSPEC_BCD_ADD_SUB))
> -	      (clobber (reg:CCFP CR6_REGNO))])
> -   (parallel [(set (reg:CCFP CR6_REGNO)
> -		   (compare:CCFP
> +	      (clobber (reg:CCBCD CR6_REGNO))])
> +   (parallel [(set (reg:CCBCD CR6_REGNO)
> +		   (compare:CCBCD
>  		    (unspec:V2DF [(match_dup 1)
>  				  (match_dup 2)
>  				  (match_dup 3)]
> @@ -4677,8 +4677,8 @@ (define_peephole2
>  				 (match_dup 2)
>  				 (match_dup 3)]
>  				UNSPEC_BCD_ADD_SUB))
> -	      (set (reg:CCFP CR6_REGNO)
> -		   (compare:CCFP
> +	      (set (reg:CCBCD CR6_REGNO)
> +		   (compare:CCBCD
>  		    (unspec:V2DF [(match_dup 1)
>  				  (match_dup 2)
>  				  (match_dup 3)]
> diff --git a/gcc/config/rs6000/predicates.md b/gcc/config/rs6000/predicates.md
> index d23ce9a77a3..18198add744 100644
> --- a/gcc/config/rs6000/predicates.md
> +++ b/gcc/config/rs6000/predicates.md
> @@ -1350,7 +1350,9 @@ (define_predicate "branch_comparison_operator"
>  	  (if_then_else (match_test "flag_finite_math_only")
>  	    (match_code "lt,le,gt,ge,eq,ne,unordered,ordered")
>  	    (match_code "lt,gt,eq,unordered,unge,unle,ne,ordered"))
> -	  (match_code "lt,ltu,le,leu,gt,gtu,ge,geu,eq,ne"))
> +	  (if_then_else (match_test "GET_MODE (XEXP (op, 0)) == CCBCDmode")
> +	    (match_code "lt,le,gt,ge,eq,ne,unordered,ordered")
> +	    (match_code "lt,ltu,le,leu,gt,gtu,ge,geu,eq,ne")))
>  	(match_test "validate_condition_mode (GET_CODE (op),
>  					      GET_MODE (XEXP (op, 0))),
>  		     1")))
> diff --git a/gcc/config/rs6000/rs6000-modes.def b/gcc/config/rs6000/rs6000-modes.def
> index 094b246c834..3e2e6dfb4ff 100644
> --- a/gcc/config/rs6000/rs6000-modes.def
> +++ b/gcc/config/rs6000/rs6000-modes.def
> @@ -61,6 +61,7 @@ FRACTIONAL_FLOAT_MODE (TF, FLOAT_PRECISION_TFmode, 16, ieee_quad_format);
> 
>  CC_MODE (CCUNS);
>  CC_MODE (CCFP);
> +CC_MODE (CCBCD);	/* Used for bcd insns */
>  CC_MODE (CCEQ);
> 
>  /* Vector modes.  */
> diff --git a/gcc/config/rs6000/rs6000.cc b/gcc/config/rs6000/rs6000.cc
> index 6ba9df4f02e..4068cd8b929 100644
> --- a/gcc/config/rs6000/rs6000.cc
> +++ b/gcc/config/rs6000/rs6000.cc
> @@ -11597,9 +11597,11 @@ validate_condition_mode (enum rtx_code code, machine_mode mode)
>    gcc_assert ((code != GTU && code != LTU && code != GEU && code != LEU)
>  	      || mode == CCUNSmode);
> 
> +  gcc_assert (mode == CCFPmode || mode == CCBCDmode
> +	      || (code != ORDERED && code != UNORDERED));
> +

As an example of the concern above: someone reading this code without
knowing the design point would ask how CCBCDmode can have {UN,}ORDERED
available, since it should only have overflow/invalid.

>    gcc_assert (mode == CCFPmode
> -	      || (code != ORDERED && code != UNORDERED
> -		  && code != UNEQ && code != LTGT
> +	      || (code != UNEQ && code != LTGT
>  		  && code != UNGT && code != UNLT
>  		  && code != UNGE && code != UNLE));
> 
> diff --git a/gcc/config/rs6000/rs6000.md b/gcc/config/rs6000/rs6000.md
> index bc8bc6ab060..9b5fcdc8db0 100644
> --- a/gcc/config/rs6000/rs6000.md
> +++ b/gcc/config/rs6000/rs6000.md
> @@ -8115,7 +8115,7 @@ (define_expand "movcc"
>    ""
>    "")
> 
> -(define_mode_iterator CC_any [CC CCUNS CCEQ CCFP])
> +(define_mode_iterator CC_any [CC CCUNS CCEQ CCFP CCBCD])
> 
>  (define_insn "*movcc_<mode>"
>    [(set (match_operand:CC_any 0 "nonimmediate_operand"
> @@ -13245,6 +13245,7 @@ (define_insn_and_split "*nesi3_ext<mode>"
> 
>  (define_code_iterator fp_rev [ordered ne unle unge])
>  (define_code_iterator fp_two [ltgt le ge unlt ungt uneq])
> +(define_code_iterator ccbcd_rev [ordered ne le ge])

Nit: As "fp_rev" uses "fp", s/ccbcd/bcd/

> 
>  (define_insn_and_split "*<code><mode>_cc"
>    [(set (match_operand:GPR 0 "gpc_reg_operand" "=r")
> @@ -13264,6 +13265,24 @@ (define_insn_and_split "*<code><mode>_cc"
>  }
>    [(set_attr "length" "12")])
> 
> +(define_insn_and_split "*<code><mode>_cc"
> +  [(set (match_operand:GPR 0 "gpc_reg_operand" "=r")
> +	(ccbcd_rev:GPR (match_operand:CCBCD 1 "cc_reg_operand" "y")
> +		    (const_int 0)))]
> +  ""
> +  "#"
> +  "&& 1"
> +  [(pc)]
> +{
> +  rtx_code revcode = reverse_condition (<CODE>);
> +  rtx eq = gen_rtx_fmt_ee (revcode, <MODE>mode, operands[1], const0_rtx);
> +  rtx tmp = gen_reg_rtx (<MODE>mode);
> +  emit_move_insn (tmp, eq);
> +  emit_insn (gen_xor<mode>3 (operands[0], tmp, const1_rtx));
> +  DONE;
> +}
> +  [(set_attr "length" "12")])
> +
>  (define_insn_and_split "*<code><mode>_cc"
>    [(set (match_operand:GPR 0 "gpc_reg_operand" "=r")
>  	(fp_two:GPR (match_operand:CCFP 1 "cc_reg_operand" "y")
> diff --git a/gcc/testsuite/gcc.target/powerpc/pr100736.c b/gcc/testsuite/gcc.target/powerpc/pr100736.c
> new file mode 100644
> index 00000000000..85e3ae56d55
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/powerpc/pr100736.c
> @@ -0,0 +1,12 @@
> +/* { dg-do compile } */
> +/* { dg-options "-mdejagnu-cpu=power8 -O2 -ffinite-math-only" } */
> +/* { dg-require-effective-target powerpc_vsx_ok } */

Nit: s/powerpc_vsx_ok/powerpc_vsx/

> +
> +/* Verify there is no ICE with finite-math-only */
> +
> +int foo (vector unsigned char a, vector unsigned char b)
> +{
> +  return __builtin_vec_bcdsub_ge (a, b, 0);
> +}
> +
> +
> diff --git a/gcc/testsuite/gcc.target/powerpc/pr114732.c b/gcc/testsuite/gcc.target/powerpc/pr114732.c
> new file mode 100644
> index 00000000000..d0b878780c6
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/powerpc/pr114732.c
> @@ -0,0 +1,12 @@
> +/* { dg-do compile } */
> +/* { dg-options "-mdejagnu-cpu=power8 -O2" } */
> +/* { dg-require-effective-target powerpc_vsx_ok } */

Ditto.

BR,
Kewen

> +
> +/* Verify only one cr bit need to be tested. */
> +
> +int foo (vector unsigned char a, vector unsigned char b)
> +{
> +  return __builtin_vec_bcdsub_ge (a, b, 0) != 1;
> +}
> +
> +/* { dg-final { scan-assembler-not "cror" } } */




Thread overview: 6+ messages
2024-04-30  7:18 HAO CHEN GUI
2024-05-13  1:56 ` Ping " HAO CHEN GUI
2024-05-27  1:55   ` Ping^2 " HAO CHEN GUI
2024-05-29  5:26 ` Kewen.Lin [this message]
2024-05-30  3:14   ` HAO CHEN GUI
2024-05-31  7:30     ` Kewen.Lin
