[PATCH v2, rs6000] Fix ICE on expand bcd<bcd_add_sub>_<code>

public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed

* [PATCH v2, rs6000] Fix ICE on expand bcd<bcd_add_sub>_<code>_<mode> [PR100736]
@ 2022-05-26  7:35 HAO CHEN GUI
  2022-05-30  1:26 ` Ping " HAO CHEN GUI
  2022-05-30 10:12 ` Kewen.Lin
  0 siblings, 2 replies; 8+ messages in thread
From: HAO CHEN GUI @ 2022-05-26  7:35 UTC (permalink / raw)
  To: gcc-patches; +Cc: Segher Boessenkool, David, Kewen.Lin, Peter Bergner

Hi,
  This patch fixes the ICE reported in PR100736. It removes the condition
check of finite math only flag not setting in "*<code><mode>_cc" pattern.
With or without this flag, we still can use "cror" to check if either
two bits of CC is set or not for "fp_two" codes. We don't need a reverse
comparison (implemented by crnot) here when the finite math flag is set,
as the latency of "cror" and "crnor" are the same.

  Bootstrapped and tested on powerpc64-linux BE and LE with no regressions.
Is this okay for trunk? Any recommendations? Thanks a lot.

ChangeLog
2022-05-26 Haochen Gui <guihaoc@linux.ibm.com>

gcc/
	* config/rs6000/rs6000.md (*<code><mode>_cc): Remove condition of
	finite math only flag not setting.

gcc/testsuite/
	* gcc.target/powerpc/pr100736.c: New.


patch.diff
diff --git a/gcc/config/rs6000/rs6000.md b/gcc/config/rs6000/rs6000.md
index fdfbc6566a5..a6f9cbc9b8b 100644
--- a/gcc/config/rs6000/rs6000.md
+++ b/gcc/config/rs6000/rs6000.md
@@ -12995,9 +12995,9 @@ (define_insn_and_split "*<code><mode>_cc"
   [(set (match_operand:GPR 0 "gpc_reg_operand" "=r")
 	(fp_two:GPR (match_operand:CCFP 1 "cc_reg_operand" "y")
 		  (const_int 0)))]
-  "!flag_finite_math_only"
+  ""
   "#"
-  "&& 1"
+  ""
   [(pc)]
 {
   rtx cc = rs6000_emit_fp_cror (<CODE>, <MODE>mode, operands[1]);
diff --git a/gcc/testsuite/gcc.target/powerpc/pr100736.c b/gcc/testsuite/gcc.target/powerpc/pr100736.c
new file mode 100644
index 00000000000..32cb6df6cd9
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/pr100736.c
@@ -0,0 +1,12 @@
+/* { dg-do compile } */
+/* { dg-require-effective-target powerpc_p8vector_ok } */
+/* { dg-options "-mdejagnu-cpu=power8 -O2 -ffinite-math-only" } */
+
+typedef __attribute__ ((altivec (vector__))) unsigned char v;
+
+int foo (v a, v b)
+{
+  return __builtin_vec_bcdsub_ge (a, b, 0);
+}
+
+/* { dg-final { scan-assembler {\mcror\M} } } */


^ permalink raw reply	[flat|nested] 8+ messages in thread

* Ping [PATCH v2, rs6000] Fix ICE on expand bcd<bcd_add_sub>_<code>_<mode> [PR100736]
  2022-05-26  7:35 [PATCH v2, rs6000] Fix ICE on expand bcd<bcd_add_sub>_<code>_<mode> [PR100736] HAO CHEN GUI
@ 2022-05-30  1:26 ` HAO CHEN GUI
  2022-05-30 10:12 ` Kewen.Lin
  1 sibling, 0 replies; 8+ messages in thread
From: HAO CHEN GUI @ 2022-05-30  1:26 UTC (permalink / raw)
  To: gcc-patches; +Cc: Segher Boessenkool, David, Kewen.Lin, Peter Bergner

Hi,
   Gentle ping this:
https://gcc.gnu.org/pipermail/gcc-patches/2022-May/595661.html
Thanks.

On 26/5/2022 下午 3:35, HAO CHEN GUI wrote:
> Hi,
>   This patch fixes the ICE reported in PR100736. It removes the condition
> check of finite math only flag not setting in "*<code><mode>_cc" pattern.
> With or without this flag, we still can use "cror" to check if either
> two bits of CC is set or not for "fp_two" codes. We don't need a reverse
> comparison (implemented by crnot) here when the finite math flag is set,
> as the latency of "cror" and "crnor" are the same.
> 
>   Bootstrapped and tested on powerpc64-linux BE and LE with no regressions.
> Is this okay for trunk? Any recommendations? Thanks a lot.
> 
> ChangeLog
> 2022-05-26 Haochen Gui <guihaoc@linux.ibm.com>
> 
> gcc/
> 	* config/rs6000/rs6000.md (*<code><mode>_cc): Remove condition of
> 	finite math only flag not setting.
> 
> gcc/testsuite/
> 	* gcc.target/powerpc/pr100736.c: New.
> 
> 
> patch.diff
> diff --git a/gcc/config/rs6000/rs6000.md b/gcc/config/rs6000/rs6000.md
> index fdfbc6566a5..a6f9cbc9b8b 100644
> --- a/gcc/config/rs6000/rs6000.md
> +++ b/gcc/config/rs6000/rs6000.md
> @@ -12995,9 +12995,9 @@ (define_insn_and_split "*<code><mode>_cc"
>    [(set (match_operand:GPR 0 "gpc_reg_operand" "=r")
>  	(fp_two:GPR (match_operand:CCFP 1 "cc_reg_operand" "y")
>  		  (const_int 0)))]
> -  "!flag_finite_math_only"
> +  ""
>    "#"
> -  "&& 1"
> +  ""
>    [(pc)]
>  {
>    rtx cc = rs6000_emit_fp_cror (<CODE>, <MODE>mode, operands[1]);
> diff --git a/gcc/testsuite/gcc.target/powerpc/pr100736.c b/gcc/testsuite/gcc.target/powerpc/pr100736.c
> new file mode 100644
> index 00000000000..32cb6df6cd9
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/powerpc/pr100736.c
> @@ -0,0 +1,12 @@
> +/* { dg-do compile } */
> +/* { dg-require-effective-target powerpc_p8vector_ok } */
> +/* { dg-options "-mdejagnu-cpu=power8 -O2 -ffinite-math-only" } */
> +
> +typedef __attribute__ ((altivec (vector__))) unsigned char v;
> +
> +int foo (v a, v b)
> +{
> +  return __builtin_vec_bcdsub_ge (a, b, 0);
> +}
> +
> +/* { dg-final { scan-assembler {\mcror\M} } } */
> 

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH v2, rs6000] Fix ICE on expand bcd<bcd_add_sub>_<code>_<mode> [PR100736]
  2022-05-26  7:35 [PATCH v2, rs6000] Fix ICE on expand bcd<bcd_add_sub>_<code>_<mode> [PR100736] HAO CHEN GUI
  2022-05-30  1:26 ` Ping " HAO CHEN GUI
@ 2022-05-30 10:12 ` Kewen.Lin
  2022-05-31 23:56   ` Segher Boessenkool
  1 sibling, 1 reply; 8+ messages in thread
From: Kewen.Lin @ 2022-05-30 10:12 UTC (permalink / raw)
  To: HAO CHEN GUI; +Cc: Segher Boessenkool, David, Peter Bergner, gcc-patches

Hi Haochen,

on 2022/5/26 15:35, HAO CHEN GUI wrote:
> Hi,
>   This patch fixes the ICE reported in PR100736. It removes the condition
> check of finite math only flag not setting in "*<code><mode>_cc" pattern.
> With or without this flag, we still can use "cror" to check if either
> two bits of CC is set or not for "fp_two" codes. We don't need a reverse
> comparison (implemented by crnot) here when the finite math flag is set,
> as the latency of "cror" and "crnor" are the same.
> 
>   Bootstrapped and tested on powerpc64-linux BE and LE with no regressions.
> Is this okay for trunk? Any recommendations? Thanks a lot.
> 
> ChangeLog
> 2022-05-26 Haochen Gui <guihaoc@linux.ibm.com>
> 
> gcc/
> 	* config/rs6000/rs6000.md (*<code><mode>_cc): Remove condition of
> 	finite math only flag not setting.
> 
> gcc/testsuite/
> 	* gcc.target/powerpc/pr100736.c: New.
> 
> 
> patch.diff
> diff --git a/gcc/config/rs6000/rs6000.md b/gcc/config/rs6000/rs6000.md
> index fdfbc6566a5..a6f9cbc9b8b 100644
> --- a/gcc/config/rs6000/rs6000.md
> +++ b/gcc/config/rs6000/rs6000.md
> @@ -12995,9 +12995,9 @@ (define_insn_and_split "*<code><mode>_cc"
>    [(set (match_operand:GPR 0 "gpc_reg_operand" "=r")
>  	(fp_two:GPR (match_operand:CCFP 1 "cc_reg_operand" "y")
>  		  (const_int 0)))]
> -  "!flag_finite_math_only"
> +  ""
>    "#"
> -  "&& 1"
> +  ""

Segher added this hunk, not sure if he prefer to keep the condition unchanged
and update the expansion side, looking forward to his comments.  :)

>    [(pc)]
>  {
>    rtx cc = rs6000_emit_fp_cror (<CODE>, <MODE>mode, operands[1]);
> diff --git a/gcc/testsuite/gcc.target/powerpc/pr100736.c b/gcc/testsuite/gcc.target/powerpc/pr100736.c
> new file mode 100644
> index 00000000000..32cb6df6cd9
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/powerpc/pr100736.c
> @@ -0,0 +1,12 @@
> +/* { dg-do compile } */
> +/* { dg-require-effective-target powerpc_p8vector_ok } */
> +/* { dg-options "-mdejagnu-cpu=power8 -O2 -ffinite-math-only" } */
> +
> +typedef __attribute__ ((altivec (vector__))) unsigned char v;
> +
> +int foo (v a, v b)
> +{
> +  return __builtin_vec_bcdsub_ge (a, b, 0);
> +}
> +
> +/* { dg-final { scan-assembler {\mcror\M} } } */
> 

The case of PR100736 fails with ICE as reported, maybe we can remove this dg-final check,
since as you noted in the description above either "cror" or "crnor" are acceptable,
this extra check could probably make this case fragile.

BR,
Kewen

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH v2, rs6000] Fix ICE on expand bcd<bcd_add_sub>_<code>_<mode> [PR100736]
  2022-05-30 10:12 ` Kewen.Lin
@ 2022-05-31 23:56   ` Segher Boessenkool
  2022-06-01 22:05     ` Segher Boessenkool
  0 siblings, 1 reply; 8+ messages in thread
From: Segher Boessenkool @ 2022-05-31 23:56 UTC (permalink / raw)
  To: Kewen.Lin; +Cc: HAO CHEN GUI, David, Peter Bergner, gcc-patches

Hi!

On Mon, May 30, 2022 at 06:12:26PM +0800, Kewen.Lin wrote:
> on 2022/5/26 15:35, HAO CHEN GUI wrote:
> >   This patch fixes the ICE reported in PR100736. It removes the condition
> > check of finite math only flag not setting in "*<code><mode>_cc" pattern.
> > With or without this flag, we still can use "cror" to check if either
> > two bits of CC is set or not for "fp_two" codes. We don't need a reverse
> > comparison (implemented by crnot) here when the finite math flag is set,
> > as the latency of "cror" and "crnor" are the same.

> > --- a/gcc/config/rs6000/rs6000.md
> > +++ b/gcc/config/rs6000/rs6000.md
> > @@ -12995,9 +12995,9 @@ (define_insn_and_split "*<code><mode>_cc"
> >    [(set (match_operand:GPR 0 "gpc_reg_operand" "=r")
> >  	(fp_two:GPR (match_operand:CCFP 1 "cc_reg_operand" "y")
> >  		  (const_int 0)))]
> > -  "!flag_finite_math_only"
> > +  ""
> >    "#"
> > -  "&& 1"
> > +  ""
> 
> Segher added this hunk, not sure if he prefer to keep the condition unchanged
> and update the expansion side, looking forward to his comments.  :)

It's not clear to me how this can ever happen without finite_math_only?
The patch is safe, sure, but it may the real problem is elsewhere.

> > --- /dev/null
> > +++ b/gcc/testsuite/gcc.target/powerpc/pr100736.c
> > @@ -0,0 +1,12 @@
> > +/* { dg-do compile } */
> > +/* { dg-require-effective-target powerpc_p8vector_ok } */
> > +/* { dg-options "-mdejagnu-cpu=power8 -O2 -ffinite-math-only" } */

The usual flag to use would be -ffast-math :-)

> > +/* { dg-final { scan-assembler {\mcror\M} } } */
> 
> The case of PR100736 fails with ICE as reported, maybe we can remove this dg-final check,
> since as you noted in the description above either "cror" or "crnor" are acceptable,
> this extra check could probably make this case fragile.

Check for \mcrn?or\M then?  But, is crnor something we want here ever?

The reason we do not have cror for finte-math-only is that comparisons
can only (validly :-) ) return LT, GT, or EQ then, and we can branch on
that without twiddling CRF bits first.  Is this not true for BCD
compares, is that what the problem is?  Or, is our builtin expansion
returning something invalid?  Or something else :-)


Segher

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH v2, rs6000] Fix ICE on expand bcd<bcd_add_sub>_<code>_<mode> [PR100736]
  2022-05-31 23:56   ` Segher Boessenkool
@ 2022-06-01 22:05     ` Segher Boessenkool
  2022-06-02  5:30       ` HAO CHEN GUI
  0 siblings, 1 reply; 8+ messages in thread
From: Segher Boessenkool @ 2022-06-01 22:05 UTC (permalink / raw)
  To: Kewen.Lin; +Cc: gcc-patches, Peter Bergner, David

Hi!

On Tue, May 31, 2022 at 06:56:00PM -0500, Segher Boessenkool wrote:
> It's not clear to me how this can ever happen without finite_math_only?
> The patch is safe, sure, but it may the real problem is elsewhere.

So, it is incorrect the RTL for our bcd{add,sub} insns uses CCFP at all.

CCFP stands for the result of a 4-way comparison, regular float
comparison: lt gt eq un.  But bcdadd does not have an unordered at all.
Instead, it has the result of a 3-way comparison (lt gt eq), and bit 3
is set if an overflow happened -- but still exactly one of bits 0..2 is
set then!  (If one of the inputs is an invalid number it sets bits 0..3
to 0001 though.)

So it would be much more correct and sensible to use regular integer
comparison results here, so, CC.

Does that fix the problem?

Segher

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH v2, rs6000] Fix ICE on expand bcd<bcd_add_sub>_<code>_<mode> [PR100736]
  2022-06-01 22:05     ` Segher Boessenkool
@ 2022-06-02  5:30       ` HAO CHEN GUI
  2022-06-02  9:04         ` Segher Boessenkool
  0 siblings, 1 reply; 8+ messages in thread
From: HAO CHEN GUI @ 2022-06-02  5:30 UTC (permalink / raw)
  To: Segher Boessenkool, Kewen.Lin; +Cc: Peter Bergner, gcc-patches, David

Segher,
  Does BCD comparison return false when either operand is invalid coding?
If yes, the result could be 3-way. We can check gt and eq bits for ge.
We still can't use crnot to only check lt bit as there could be invalid
coding.
  Also, do you think finite-math-only excludes invalid coding? Seems GCC
doesn't clear define it.

Thanks.
Gui Haochen


On 2/6/2022 上午 6:05, Segher Boessenkool wrote:
> Hi!
> 
> On Tue, May 31, 2022 at 06:56:00PM -0500, Segher Boessenkool wrote:
>> It's not clear to me how this can ever happen without finite_math_only?
>> The patch is safe, sure, but it may the real problem is elsewhere.
> 
> So, it is incorrect the RTL for our bcd{add,sub} insns uses CCFP at all.
> 
> CCFP stands for the result of a 4-way comparison, regular float
> comparison: lt gt eq un.  But bcdadd does not have an unordered at all.
> Instead, it has the result of a 3-way comparison (lt gt eq), and bit 3
> is set if an overflow happened -- but still exactly one of bits 0..2 is
> set then!  (If one of the inputs is an invalid number it sets bits 0..3
> to 0001 though.)
> 
> So it would be much more correct and sensible to use regular integer
> comparison results here, so, CC.
> 
> Does that fix the problem?
> 
> 
> Segher

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH v2, rs6000] Fix ICE on expand bcd<bcd_add_sub>_<code>_<mode> [PR100736]
  2022-06-02  5:30       ` HAO CHEN GUI
@ 2022-06-02  9:04         ` Segher Boessenkool
  2022-06-06  8:14           ` HAO CHEN GUI
  0 siblings, 1 reply; 8+ messages in thread
From: Segher Boessenkool @ 2022-06-02  9:04 UTC (permalink / raw)
  To: HAO CHEN GUI; +Cc: Kewen.Lin, Peter Bergner, gcc-patches, David

Hi!

On Thu, Jun 02, 2022 at 01:30:04PM +0800, HAO CHEN GUI wrote:
> Segher,
>   Does BCD comparison return false when either operand is invalid coding?

It sets all of LT, GT, and EQ to 0 (it normally sets exactly one of them
to 1).  It sets bit 3 (the "SO" bit usually) to 1.

That is what the machine insns do.  What the builtins do is undefined as
far as I know?  If So we can do whatever is most convenient, so, not
handle it specifically at all, just go with what falls out.

> If yes, the result could be 3-way. We can check gt and eq bits for ge.

You can check the LT bit, instead: it is only one branch insn, and also
only one setbc[r] insn (it can be slightly more expensive if you can use
only older insns).

> We still can't use crnot to only check lt bit as there could be invalid
> coding.
>   Also, do you think finite-math-only excludes invalid coding? Seems GCC
> doesn't clear define it.

This is not floating-point code at all, it should not be influenced at
all by finite-math-only!

Segher

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH v2, rs6000] Fix ICE on expand bcd<bcd_add_sub>_<code>_<mode> [PR100736]
  2022-06-02  9:04         ` Segher Boessenkool
@ 2022-06-06  8:14           ` HAO CHEN GUI
  0 siblings, 0 replies; 8+ messages in thread
From: HAO CHEN GUI @ 2022-06-06  8:14 UTC (permalink / raw)
  To: Segher Boessenkool; +Cc: Kewen.Lin, Peter Bergner, gcc-patches, David



On 2/6/2022 下午 5:04, Segher Boessenkool wrote:
> Hi!
> 
> On Thu, Jun 02, 2022 at 01:30:04PM +0800, HAO CHEN GUI wrote:
>> Segher,
>>   Does BCD comparison return false when either operand is invalid coding?
> 
> It sets all of LT, GT, and EQ to 0 (it normally sets exactly one of them
> to 1).  It sets bit 3 (the "SO" bit usually) to 1.
> 
> That is what the machine insns do.  What the builtins do is undefined as
> far as I know?  If So we can do whatever is most convenient, so, not
> handle it specifically at all, just go with what falls out.

We defined the following unordered BCD builtins in rs6000-builtin.def.
They check the bit 3 for overflow.

  const signed int __builtin_bcdadd_ov_v1ti (vsq, vsq, const int<1>);
    BCDADD_OV_V1TI bcdadd_unordered_v1ti {}

  const signed int __builtin_bcdadd_ov_v16qi (vsc, vsc, const int<1>);
    BCDADD_OV_V16QI bcdadd_unordered_v16qi {}

Also Xlc defines three BCD builtins for overflow and invalid coding.
https://www.ibm.com/docs/en/xl-c-and-cpp-linux/16.1.1?topic=functions-bcd-test-add-subtract-overflow
Shall GCC keep up with Xlc? Please advise.

Thanks
Gui Haochen

> 
>> If yes, the result could be 3-way. We can check gt and eq bits for ge.
> 
> You can check the LT bit, instead: it is only one branch insn, and also
> only one setbc[r] insn (it can be slightly more expensive if you can use
> only older insns).
> 
>> We still can't use crnot to only check lt bit as there could be invalid
>> coding.
>>   Also, do you think finite-math-only excludes invalid coding? Seems GCC
>> doesn't clear define it.
> 
> This is not floating-point code at all, it should not be influenced at
> all by finite-math-only!
> 
> 
> Segher

^ permalink raw reply	[flat|nested] 8+ messages in thread

end of thread, other threads:[~2022-06-06  8:14 UTC | newest]

Thread overview: 8+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-05-26  7:35 [PATCH v2, rs6000] Fix ICE on expand bcd<bcd_add_sub>_<code>_<mode> [PR100736] HAO CHEN GUI
2022-05-30  1:26 ` Ping " HAO CHEN GUI
2022-05-30 10:12 ` Kewen.Lin
2022-05-31 23:56   ` Segher Boessenkool
2022-06-01 22:05     ` Segher Boessenkool
2022-06-02  5:30       ` HAO CHEN GUI
2022-06-02  9:04         ` Segher Boessenkool
2022-06-06  8:14           ` HAO CHEN GUI

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).