public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed
* [PATCH] RISC-V: optim const DF +0.0 store to mem [PR/110748]
@ 2023-07-21 17:55 Vineet Gupta
  2023-07-21 18:15 ` Philipp Tomsich
  2023-07-21 18:31 ` Palmer Dabbelt
  0 siblings, 2 replies; 10+ messages in thread
From: Vineet Gupta @ 2023-07-21 17:55 UTC (permalink / raw)
  To: gcc-patches, Manolis Tsamis
  Cc: kito.cheng, Jeff Law, Palmer Dabbelt, gnu-toolchain, Vineet Gupta

DF +0.0 is bitwise all zeros so int x0 store to mem can be used to optimize it.

void zd(double *) { *d = 0.0; }

currently:

| fmv.d.x fa5,zero
| fsd     fa5,0(a0)
| ret

With patch

| sd      zero,0(a0)
| ret

This came to light when testing the in-flight f-m-o patch where an ICE
was gettinh triggered due to lack of this pattern but turns out this
is an independent optimization of its own [1]

[1] https://gcc.gnu.org/pipermail/gcc-patches/2023-July/624857.html

Apparently this is a regression in gcc-13, introduced by commit
ef85d150b5963 ("RISC-V: Enable TARGET_SUPPORTS_WIDE_INT") and the fix
thus is a partial revert of that change.

Ran thru full multilib testsuite, there was 1 false failure due to
random string "lw" appearing in lto build assembler output,
which is also fixed in the patch.

gcc/Changelog:

	* config/riscv/predicates.md (const_0_operand): Add back
	  const_double.

gcc/testsuite/ChangeLog:

	* gcc.target/riscv/pr110748-1.c: New Test.
	* gcc.target/riscv/xtheadfmv-fmv.c: Add '\t' around test
	  patterns to avoid random string matches.

Signed-off-by: Vineet Gupta <vineetg@rivosinc.com>
---
 gcc/config/riscv/predicates.md                 |  2 +-
 gcc/testsuite/gcc.target/riscv/pr110748-1.c    | 10 ++++++++++
 gcc/testsuite/gcc.target/riscv/xtheadfmv-fmv.c |  8 ++++----
 3 files changed, 15 insertions(+), 5 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/riscv/pr110748-1.c

diff --git a/gcc/config/riscv/predicates.md b/gcc/config/riscv/predicates.md
index 5a22c77f0cd0..9db28c2def7e 100644
--- a/gcc/config/riscv/predicates.md
+++ b/gcc/config/riscv/predicates.md
@@ -58,7 +58,7 @@
        (match_test "INTVAL (op) + 1 != 0")))
 
 (define_predicate "const_0_operand"
-  (and (match_code "const_int,const_wide_int,const_vector")
+  (and (match_code "const_int,const_wide_int,const_double,const_vector")
        (match_test "op == CONST0_RTX (GET_MODE (op))")))
 
 (define_predicate "const_1_operand"
diff --git a/gcc/testsuite/gcc.target/riscv/pr110748-1.c b/gcc/testsuite/gcc.target/riscv/pr110748-1.c
new file mode 100644
index 000000000000..2f5bc08aae72
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/pr110748-1.c
@@ -0,0 +1,10 @@
+/* { dg-do compile } */
+/* { dg-require-effective-target hard_float } */
+/* { dg-options "-march=rv64g -mabi=lp64d -O2" } */
+
+
+void zd(double *d) { *d = 0.0;  }
+void zf(float *f)  { *f = 0.0;  }
+
+/* { dg-final { scan-assembler-not "\tfmv\\.d\\.x\t" } } */
+/* { dg-final { scan-assembler-not "\tfmv\\.s\\.x\t" } } */
diff --git a/gcc/testsuite/gcc.target/riscv/xtheadfmv-fmv.c b/gcc/testsuite/gcc.target/riscv/xtheadfmv-fmv.c
index 1036044291e7..89eb48bed1b9 100644
--- a/gcc/testsuite/gcc.target/riscv/xtheadfmv-fmv.c
+++ b/gcc/testsuite/gcc.target/riscv/xtheadfmv-fmv.c
@@ -18,7 +18,7 @@ d2ll (double d)
 /* { dg-final { scan-assembler "th.fmv.hw.x" } } */
 /* { dg-final { scan-assembler "fmv.x.w" } } */
 /* { dg-final { scan-assembler "th.fmv.x.hw" } } */
-/* { dg-final { scan-assembler-not "sw" } } */
-/* { dg-final { scan-assembler-not "fld" } } */
-/* { dg-final { scan-assembler-not "fsd" } } */
-/* { dg-final { scan-assembler-not "lw" } } */
+/* { dg-final { scan-assembler-not "\tsw\t" } } */
+/* { dg-final { scan-assembler-not "\tfld\t" } } */
+/* { dg-final { scan-assembler-not "\tfsd\t" } } */
+/* { dg-final { scan-assembler-not "\tlw\t" } } */
-- 
2.34.1


^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH] RISC-V: optim const DF +0.0 store to mem [PR/110748]
  2023-07-21 17:55 [PATCH] RISC-V: optim const DF +0.0 store to mem [PR/110748] Vineet Gupta
@ 2023-07-21 18:15 ` Philipp Tomsich
  2023-07-21 18:23   ` Vineet Gupta
  2023-07-21 18:31 ` Palmer Dabbelt
  1 sibling, 1 reply; 10+ messages in thread
From: Philipp Tomsich @ 2023-07-21 18:15 UTC (permalink / raw)
  To: Vineet Gupta
  Cc: gcc-patches, Manolis Tsamis, kito.cheng, Jeff Law,
	Palmer Dabbelt, gnu-toolchain

On Fri, 21 Jul 2023 at 19:56, Vineet Gupta <vineetg@rivosinc.com> wrote:
>
> DF +0.0 is bitwise all zeros so int x0 store to mem can be used to optimize it.
>
> void zd(double *) { *d = 0.0; }
>
> currently:
>
> | fmv.d.x fa5,zero
> | fsd     fa5,0(a0)
> | ret
>
> With patch
>
> | sd      zero,0(a0)
> | ret
> This came to light when testing the in-flight f-m-o patch where an ICE
> was gettinh triggered due to lack of this pattern but turns out this

typo: "gettinh" -> "getting"

> is an independent optimization of its own [1]
>
> [1] https://gcc.gnu.org/pipermail/gcc-patches/2023-July/624857.html
>
> Apparently this is a regression in gcc-13, introduced by commit
> ef85d150b5963 ("RISC-V: Enable TARGET_SUPPORTS_WIDE_INT") and the fix
> thus is a partial revert of that change.

Should we add a "Fixes: "?

> Ran thru full multilib testsuite, there was 1 false failure due to
> random string "lw" appearing in lto build assembler output,
> which is also fixed in the patch.
>
> gcc/Changelog:

PR target/110748

>
>         * config/riscv/predicates.md (const_0_operand): Add back
>           const_double.
>
> gcc/testsuite/ChangeLog:
>
>         * gcc.target/riscv/pr110748-1.c: New Test.
>         * gcc.target/riscv/xtheadfmv-fmv.c: Add '\t' around test
>           patterns to avoid random string matches.
>
> Signed-off-by: Vineet Gupta <vineetg@rivosinc.com>
> ---
>  gcc/config/riscv/predicates.md                 |  2 +-
>  gcc/testsuite/gcc.target/riscv/pr110748-1.c    | 10 ++++++++++
>  gcc/testsuite/gcc.target/riscv/xtheadfmv-fmv.c |  8 ++++----
>  3 files changed, 15 insertions(+), 5 deletions(-)
>  create mode 100644 gcc/testsuite/gcc.target/riscv/pr110748-1.c
>
> diff --git a/gcc/config/riscv/predicates.md b/gcc/config/riscv/predicates.md
> index 5a22c77f0cd0..9db28c2def7e 100644
> --- a/gcc/config/riscv/predicates.md
> +++ b/gcc/config/riscv/predicates.md
> @@ -58,7 +58,7 @@
>         (match_test "INTVAL (op) + 1 != 0")))
>
>  (define_predicate "const_0_operand"
> -  (and (match_code "const_int,const_wide_int,const_vector")
> +  (and (match_code "const_int,const_wide_int,const_double,const_vector")
>         (match_test "op == CONST0_RTX (GET_MODE (op))")))
>
>  (define_predicate "const_1_operand"
> diff --git a/gcc/testsuite/gcc.target/riscv/pr110748-1.c b/gcc/testsuite/gcc.target/riscv/pr110748-1.c
> new file mode 100644
> index 000000000000..2f5bc08aae72
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/riscv/pr110748-1.c
> @@ -0,0 +1,10 @@
> +/* { dg-do compile } */
> +/* { dg-require-effective-target hard_float } */
> +/* { dg-options "-march=rv64g -mabi=lp64d -O2" } */
> +
> +
> +void zd(double *d) { *d = 0.0;  }
> +void zf(float *f)  { *f = 0.0;  }
> +
> +/* { dg-final { scan-assembler-not "\tfmv\\.d\\.x\t" } } */
> +/* { dg-final { scan-assembler-not "\tfmv\\.s\\.x\t" } } */
> diff --git a/gcc/testsuite/gcc.target/riscv/xtheadfmv-fmv.c b/gcc/testsuite/gcc.target/riscv/xtheadfmv-fmv.c
> index 1036044291e7..89eb48bed1b9 100644
> --- a/gcc/testsuite/gcc.target/riscv/xtheadfmv-fmv.c
> +++ b/gcc/testsuite/gcc.target/riscv/xtheadfmv-fmv.c
> @@ -18,7 +18,7 @@ d2ll (double d)
>  /* { dg-final { scan-assembler "th.fmv.hw.x" } } */
>  /* { dg-final { scan-assembler "fmv.x.w" } } */
>  /* { dg-final { scan-assembler "th.fmv.x.hw" } } */
> -/* { dg-final { scan-assembler-not "sw" } } */
> -/* { dg-final { scan-assembler-not "fld" } } */
> -/* { dg-final { scan-assembler-not "fsd" } } */
> -/* { dg-final { scan-assembler-not "lw" } } */
> +/* { dg-final { scan-assembler-not "\tsw\t" } } */
> +/* { dg-final { scan-assembler-not "\tfld\t" } } */
> +/* { dg-final { scan-assembler-not "\tfsd\t" } } */
> +/* { dg-final { scan-assembler-not "\tlw\t" } } */
> --
> 2.34.1
>

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH] RISC-V: optim const DF +0.0 store to mem [PR/110748]
  2023-07-21 18:15 ` Philipp Tomsich
@ 2023-07-21 18:23   ` Vineet Gupta
  0 siblings, 0 replies; 10+ messages in thread
From: Vineet Gupta @ 2023-07-21 18:23 UTC (permalink / raw)
  To: Philipp Tomsich
  Cc: gcc-patches, Manolis Tsamis, kito.cheng, Jeff Law,
	Palmer Dabbelt, gnu-toolchain



On 7/21/23 11:15, Philipp Tomsich wrote:
> On Fri, 21 Jul 2023 at 19:56, Vineet Gupta <vineetg@rivosinc.com> wrote:
>> DF +0.0 is bitwise all zeros so int x0 store to mem can be used to optimize it.
>>
>> void zd(double *) { *d = 0.0; }
>>
>> currently:
>>
>> | fmv.d.x fa5,zero
>> | fsd     fa5,0(a0)
>> | ret
>>
>> With patch
>>
>> | sd      zero,0(a0)
>> | ret
>> This came to light when testing the in-flight f-m-o patch where an ICE
>> was gettinh triggered due to lack of this pattern but turns out this
> typo: "gettinh" -> "getting"

Fixed.

>> is an independent optimization of its own [1]
>>
>> [1] https://gcc.gnu.org/pipermail/gcc-patches/2023-July/624857.html
>>
>> Apparently this is a regression in gcc-13, introduced by commit
>> ef85d150b5963 ("RISC-V: Enable TARGET_SUPPORTS_WIDE_INT") and the fix
>> thus is a partial revert of that change.
> Should we add a "Fixes: "?

Sure. Although gcc usage of Fixes tag seems slightly different than say 
linux kernel's.


>
>> Ran thru full multilib testsuite, there was 1 false failure due to
>> random string "lw" appearing in lto build assembler output,
>> which is also fixed in the patch.
>>
>> gcc/Changelog:
> PR target/110748

Added.

Thx,
-Vineet


>
>>          * config/riscv/predicates.md (const_0_operand): Add back
>>            const_double.
>>
>> gcc/testsuite/ChangeLog:
>>
>>          * gcc.target/riscv/pr110748-1.c: New Test.
>>          * gcc.target/riscv/xtheadfmv-fmv.c: Add '\t' around test
>>            patterns to avoid random string matches.
>>
>> Signed-off-by: Vineet Gupta <vineetg@rivosinc.com>
>> ---
>>   gcc/config/riscv/predicates.md                 |  2 +-
>>   gcc/testsuite/gcc.target/riscv/pr110748-1.c    | 10 ++++++++++
>>   gcc/testsuite/gcc.target/riscv/xtheadfmv-fmv.c |  8 ++++----
>>   3 files changed, 15 insertions(+), 5 deletions(-)
>>   create mode 100644 gcc/testsuite/gcc.target/riscv/pr110748-1.c
>>
>> diff --git a/gcc/config/riscv/predicates.md b/gcc/config/riscv/predicates.md
>> index 5a22c77f0cd0..9db28c2def7e 100644
>> --- a/gcc/config/riscv/predicates.md
>> +++ b/gcc/config/riscv/predicates.md
>> @@ -58,7 +58,7 @@
>>          (match_test "INTVAL (op) + 1 != 0")))
>>
>>   (define_predicate "const_0_operand"
>> -  (and (match_code "const_int,const_wide_int,const_vector")
>> +  (and (match_code "const_int,const_wide_int,const_double,const_vector")
>>          (match_test "op == CONST0_RTX (GET_MODE (op))")))
>>
>>   (define_predicate "const_1_operand"
>> diff --git a/gcc/testsuite/gcc.target/riscv/pr110748-1.c b/gcc/testsuite/gcc.target/riscv/pr110748-1.c
>> new file mode 100644
>> index 000000000000..2f5bc08aae72
>> --- /dev/null
>> +++ b/gcc/testsuite/gcc.target/riscv/pr110748-1.c
>> @@ -0,0 +1,10 @@
>> +/* { dg-do compile } */
>> +/* { dg-require-effective-target hard_float } */
>> +/* { dg-options "-march=rv64g -mabi=lp64d -O2" } */
>> +
>> +
>> +void zd(double *d) { *d = 0.0;  }
>> +void zf(float *f)  { *f = 0.0;  }
>> +
>> +/* { dg-final { scan-assembler-not "\tfmv\\.d\\.x\t" } } */
>> +/* { dg-final { scan-assembler-not "\tfmv\\.s\\.x\t" } } */
>> diff --git a/gcc/testsuite/gcc.target/riscv/xtheadfmv-fmv.c b/gcc/testsuite/gcc.target/riscv/xtheadfmv-fmv.c
>> index 1036044291e7..89eb48bed1b9 100644
>> --- a/gcc/testsuite/gcc.target/riscv/xtheadfmv-fmv.c
>> +++ b/gcc/testsuite/gcc.target/riscv/xtheadfmv-fmv.c
>> @@ -18,7 +18,7 @@ d2ll (double d)
>>   /* { dg-final { scan-assembler "th.fmv.hw.x" } } */
>>   /* { dg-final { scan-assembler "fmv.x.w" } } */
>>   /* { dg-final { scan-assembler "th.fmv.x.hw" } } */
>> -/* { dg-final { scan-assembler-not "sw" } } */
>> -/* { dg-final { scan-assembler-not "fld" } } */
>> -/* { dg-final { scan-assembler-not "fsd" } } */
>> -/* { dg-final { scan-assembler-not "lw" } } */
>> +/* { dg-final { scan-assembler-not "\tsw\t" } } */
>> +/* { dg-final { scan-assembler-not "\tfld\t" } } */
>> +/* { dg-final { scan-assembler-not "\tfsd\t" } } */
>> +/* { dg-final { scan-assembler-not "\tlw\t" } } */
>> --
>> 2.34.1
>>


^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH] RISC-V: optim const DF +0.0 store to mem [PR/110748]
  2023-07-21 17:55 [PATCH] RISC-V: optim const DF +0.0 store to mem [PR/110748] Vineet Gupta
  2023-07-21 18:15 ` Philipp Tomsich
@ 2023-07-21 18:31 ` Palmer Dabbelt
  2023-07-21 18:47   ` Jeff Law
                     ` (2 more replies)
  1 sibling, 3 replies; 10+ messages in thread
From: Palmer Dabbelt @ 2023-07-21 18:31 UTC (permalink / raw)
  To: Vineet Gupta
  Cc: gcc-patches, manolis.tsamis, Kito Cheng, jeffreyalaw,
	gnu-toolchain, Vineet Gupta

On Fri, 21 Jul 2023 10:55:52 PDT (-0700), Vineet Gupta wrote:
> DF +0.0 is bitwise all zeros so int x0 store to mem can be used to optimize it.
>
> void zd(double *) { *d = 0.0; }
>
> currently:
>
> | fmv.d.x fa5,zero
> | fsd     fa5,0(a0)
> | ret
>
> With patch
>
> | sd      zero,0(a0)
> | ret
>
> This came to light when testing the in-flight f-m-o patch where an ICE
> was gettinh triggered due to lack of this pattern but turns out this
> is an independent optimization of its own [1]
>
> [1] https://gcc.gnu.org/pipermail/gcc-patches/2023-July/624857.html
>
> Apparently this is a regression in gcc-13, introduced by commit
> ef85d150b5963 ("RISC-V: Enable TARGET_SUPPORTS_WIDE_INT") and the fix
> thus is a partial revert of that change.

Given that it can ICE, we should probably backport it to 13.

> Ran thru full multilib testsuite, there was 1 false failure due to

Did you run the test with autovec?  There's also a 
pmode_reg_or_0_operand, some of those don't appear protected from FP 
values.  So we might need something like

diff --git a/gcc/config/riscv/autovec.md b/gcc/config/riscv/autovec.md
index cd5b19457f8..d8ce9223343 100644
--- a/gcc/config/riscv/autovec.md
+++ b/gcc/config/riscv/autovec.md
@@ -63,7 +63,7 @@ (define_expand "movmisalign<mode>"
 
 (define_expand "len_mask_gather_load<VNX1_QHSD:mode><VNX1_QHSDI:mode>"
   [(match_operand:VNX1_QHSD 0 "register_operand")
-   (match_operand 1 "pmode_reg_or_0_operand")
+   (match_operand:P 1 "pmode_reg_or_0_operand")
    (match_operand:VNX1_QHSDI 2 "register_operand")
    (match_operand 3 "<VNX1_QHSD:gs_extension>")
    (match_operand 4 "<VNX1_QHSD:gs_scale>")

a bunch of times, as there's a ton of them?  I'm not entirely sure if that
could manifest as an actual bug, though...

> random string "lw" appearing in lto build assembler output,
> which is also fixed in the patch.
>
> gcc/Changelog:
>
> 	* config/riscv/predicates.md (const_0_operand): Add back
> 	  const_double.
>
> gcc/testsuite/ChangeLog:
>
> 	* gcc.target/riscv/pr110748-1.c: New Test.
> 	* gcc.target/riscv/xtheadfmv-fmv.c: Add '\t' around test
> 	  patterns to avoid random string matches.
>
> Signed-off-by: Vineet Gupta <vineetg@rivosinc.com>
> ---
>  gcc/config/riscv/predicates.md                 |  2 +-
>  gcc/testsuite/gcc.target/riscv/pr110748-1.c    | 10 ++++++++++
>  gcc/testsuite/gcc.target/riscv/xtheadfmv-fmv.c |  8 ++++----
>  3 files changed, 15 insertions(+), 5 deletions(-)
>  create mode 100644 gcc/testsuite/gcc.target/riscv/pr110748-1.c
>
> diff --git a/gcc/config/riscv/predicates.md b/gcc/config/riscv/predicates.md
> index 5a22c77f0cd0..9db28c2def7e 100644
> --- a/gcc/config/riscv/predicates.md
> +++ b/gcc/config/riscv/predicates.md
> @@ -58,7 +58,7 @@
>         (match_test "INTVAL (op) + 1 != 0")))
>
>  (define_predicate "const_0_operand"
> -  (and (match_code "const_int,const_wide_int,const_vector")
> +  (and (match_code "const_int,const_wide_int,const_double,const_vector")
>         (match_test "op == CONST0_RTX (GET_MODE (op))")))
>
>  (define_predicate "const_1_operand"
> diff --git a/gcc/testsuite/gcc.target/riscv/pr110748-1.c b/gcc/testsuite/gcc.target/riscv/pr110748-1.c
> new file mode 100644
> index 000000000000..2f5bc08aae72
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/riscv/pr110748-1.c
> @@ -0,0 +1,10 @@
> +/* { dg-do compile } */
> +/* { dg-require-effective-target hard_float } */
> +/* { dg-options "-march=rv64g -mabi=lp64d -O2" } */
> +
> +
> +void zd(double *d) { *d = 0.0;  }
> +void zf(float *f)  { *f = 0.0;  }
> +
> +/* { dg-final { scan-assembler-not "\tfmv\\.d\\.x\t" } } */
> +/* { dg-final { scan-assembler-not "\tfmv\\.s\\.x\t" } } */

IIUC the pattern to emit fmv suffers from the same bug -- it's fixed in the same
way, but I think we might be able to come up with a test for it: `fmv.d.x FREG,
x0` would be the fastest way to generate 0.0, so maybe something like

    double sum(double *d) {
      double sum = 0;
      for (int i = 0; i < 8; ++i)
        sum += d[i];
      return sum;
    }

would do it?  That's generating the fmv on 13 for me, though, so maybe I'm
missing something?`

> diff --git a/gcc/testsuite/gcc.target/riscv/xtheadfmv-fmv.c b/gcc/testsuite/gcc.target/riscv/xtheadfmv-fmv.c
> index 1036044291e7..89eb48bed1b9 100644
> --- a/gcc/testsuite/gcc.target/riscv/xtheadfmv-fmv.c
> +++ b/gcc/testsuite/gcc.target/riscv/xtheadfmv-fmv.c
> @@ -18,7 +18,7 @@ d2ll (double d)
>  /* { dg-final { scan-assembler "th.fmv.hw.x" } } */
>  /* { dg-final { scan-assembler "fmv.x.w" } } */
>  /* { dg-final { scan-assembler "th.fmv.x.hw" } } */
> -/* { dg-final { scan-assembler-not "sw" } } */
> -/* { dg-final { scan-assembler-not "fld" } } */
> -/* { dg-final { scan-assembler-not "fsd" } } */
> -/* { dg-final { scan-assembler-not "lw" } } */
> +/* { dg-final { scan-assembler-not "\tsw\t" } } */
> +/* { dg-final { scan-assembler-not "\tfld\t" } } */
> +/* { dg-final { scan-assembler-not "\tfsd\t" } } */
> +/* { dg-final { scan-assembler-not "\tlw\t" } } */

I think that autovec one is the only possible dependency that might have snuck
in, so we should be safe otherwise.  Thanks!

Reviewed-by: Palmer Dabbelt <palmer@rivosinc.com>

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH] RISC-V: optim const DF +0.0 store to mem [PR/110748]
  2023-07-21 18:31 ` Palmer Dabbelt
@ 2023-07-21 18:47   ` Jeff Law
  2023-07-25 23:05     ` Palmer Dabbelt
  2023-07-21 18:55   ` Vineet Gupta
  2023-07-21 19:37   ` Vineet Gupta
  2 siblings, 1 reply; 10+ messages in thread
From: Jeff Law @ 2023-07-21 18:47 UTC (permalink / raw)
  To: Palmer Dabbelt, Vineet Gupta
  Cc: gcc-patches, manolis.tsamis, Kito Cheng, gnu-toolchain



On 7/21/23 12:31, Palmer Dabbelt wrote:

> 
> (define_expand "len_mask_gather_load<VNX1_QHSD:mode><VNX1_QHSDI:mode>"
>    [(match_operand:VNX1_QHSD 0 "register_operand")
> -   (match_operand 1 "pmode_reg_or_0_operand")
> +   (match_operand:P 1 "pmode_reg_or_0_operand")
>     (match_operand:VNX1_QHSDI 2 "register_operand")
>     (match_operand 3 "<VNX1_QHSD:gs_extension>")
>     (match_operand 4 "<VNX1_QHSD:gs_scale>")
> 
> a bunch of times, as there's a ton of them?  I'm not entirely sure if that
> could manifest as an actual bug, though...
But won't this cause (const_int 0) to no longer match because CONST_INT 
nodes are modeless (VOIDmode)?

Jeff

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH] RISC-V: optim const DF +0.0 store to mem [PR/110748]
  2023-07-21 18:31 ` Palmer Dabbelt
  2023-07-21 18:47   ` Jeff Law
@ 2023-07-21 18:55   ` Vineet Gupta
  2023-07-22  6:03     ` Jeff Law
  2023-07-21 19:37   ` Vineet Gupta
  2 siblings, 1 reply; 10+ messages in thread
From: Vineet Gupta @ 2023-07-21 18:55 UTC (permalink / raw)
  To: Palmer Dabbelt
  Cc: gcc-patches, manolis.tsamis, Kito Cheng, jeffreyalaw, gnu-toolchain



On 7/21/23 11:31, Palmer Dabbelt wrote:
> On Fri, 21 Jul 2023 10:55:52 PDT (-0700), Vineet Gupta wrote:
>> DF +0.0 is bitwise all zeros so int x0 store to mem can be used to 
>> optimize it.
>>
>> void zd(double *) { *d = 0.0; }
>>
>> currently:
>>
>> | fmv.d.x fa5,zero
>> | fsd     fa5,0(a0)
>> | ret
>>
>> With patch
>>
>> | sd      zero,0(a0)
>> | ret
>>
>> This came to light when testing the in-flight f-m-o patch where an ICE
>> was gettinh triggered due to lack of this pattern but turns out this
>> is an independent optimization of its own [1]
>>
>> [1] https://gcc.gnu.org/pipermail/gcc-patches/2023-July/624857.html
>>
>> Apparently this is a regression in gcc-13, introduced by commit
>> ef85d150b5963 ("RISC-V: Enable TARGET_SUPPORTS_WIDE_INT") and the fix
>> thus is a partial revert of that change.
>
> Given that it can ICE, we should probably backport it to 13.

FWIW ICE is on an in-flight for-gcc-14 patch, not something in tree 
already. And this will merge ahead of that.
I'm fine with backport though.

>
>> Ran thru full multilib testsuite, there was 1 false failure due to
>
> Did you run the test with autovec?

I have standard 32/64 mutlilibs, but no 'v' in arch so autovec despite 
being enabled at -O2 and above will not kick in.
I think we should add a 'v' multilib.


> There's also a pmode_reg_or_0_operand, some of those don't appear 
> protected from FP values.  So we might need something like
>
> diff --git a/gcc/config/riscv/autovec.md b/gcc/config/riscv/autovec.md
> index cd5b19457f8..d8ce9223343 100644
> --- a/gcc/config/riscv/autovec.md
> +++ b/gcc/config/riscv/autovec.md
> @@ -63,7 +63,7 @@ (define_expand "movmisalign<mode>"
>
> (define_expand "len_mask_gather_load<VNX1_QHSD:mode><VNX1_QHSDI:mode>"
>   [(match_operand:VNX1_QHSD 0 "register_operand")
> -   (match_operand 1 "pmode_reg_or_0_operand")
> +   (match_operand:P 1 "pmode_reg_or_0_operand")
>    (match_operand:VNX1_QHSDI 2 "register_operand")
>    (match_operand 3 "<VNX1_QHSD:gs_extension>")
>    (match_operand 4 "<VNX1_QHSD:gs_scale>")
>
> a bunch of times, as there's a ton of them?  I'm not entirely sure if 
> that
> could manifest as an actual bug, though...

What does 'P' do here ?

>> +
>> +void zd(double *d) { *d = 0.0;  }
>> +void zf(float *f)  { *f = 0.0;  }
>> +
>> +/* { dg-final { scan-assembler-not "\tfmv\\.d\\.x\t" } } */
>> +/* { dg-final { scan-assembler-not "\tfmv\\.s\\.x\t" } } */
>
> IIUC the pattern to emit fmv suffers from the same bug -- it's fixed 
> in the same
> way, but I think we might be able to come up with a test for it: 
> `fmv.d.x FREG,
> x0` would be the fastest way to generate 0.0, so maybe something like
>
>    double sum(double *d) {
>      double sum = 0;
>      for (int i = 0; i < 8; ++i)
>        sum += d[i];
>      return sum;
>    }
>
> would do it?  That's generating the fmv on 13 for me, though, so maybe 
> I'm
> missing something?`

I need to unpack this first :-)

>
>> diff --git a/gcc/testsuite/gcc.target/riscv/xtheadfmv-fmv.c 
>> b/gcc/testsuite/gcc.target/riscv/xtheadfmv-fmv.c
>> index 1036044291e7..89eb48bed1b9 100644
>> --- a/gcc/testsuite/gcc.target/riscv/xtheadfmv-fmv.c
>> +++ b/gcc/testsuite/gcc.target/riscv/xtheadfmv-fmv.c
>> @@ -18,7 +18,7 @@ d2ll (double d)
>>  /* { dg-final { scan-assembler "th.fmv.hw.x" } } */
>>  /* { dg-final { scan-assembler "fmv.x.w" } } */
>>  /* { dg-final { scan-assembler "th.fmv.x.hw" } } */
>> -/* { dg-final { scan-assembler-not "sw" } } */
>> -/* { dg-final { scan-assembler-not "fld" } } */
>> -/* { dg-final { scan-assembler-not "fsd" } } */
>> -/* { dg-final { scan-assembler-not "lw" } } */
>> +/* { dg-final { scan-assembler-not "\tsw\t" } } */
>> +/* { dg-final { scan-assembler-not "\tfld\t" } } */
>> +/* { dg-final { scan-assembler-not "\tfsd\t" } } */
>> +/* { dg-final { scan-assembler-not "\tlw\t" } } */
>
> I think that autovec one is the only possible dependency that might 
> have snuck
> in, so we should be safe otherwise.  Thanks!

I'm not sure if this specific comment is related to the xthead test or 
continuation of above.
For xthead it is real issue since I saw a random "lw" in lto assembler 
output.


^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH] RISC-V: optim const DF +0.0 store to mem [PR/110748]
  2023-07-21 18:31 ` Palmer Dabbelt
  2023-07-21 18:47   ` Jeff Law
  2023-07-21 18:55   ` Vineet Gupta
@ 2023-07-21 19:37   ` Vineet Gupta
  2 siblings, 0 replies; 10+ messages in thread
From: Vineet Gupta @ 2023-07-21 19:37 UTC (permalink / raw)
  To: Palmer Dabbelt
  Cc: gcc-patches, manolis.tsamis, Kito Cheng, jeffreyalaw, gnu-toolchain



On 7/21/23 11:31, Palmer Dabbelt wrote:
>
> IIUC the pattern to emit fmv suffers from the same bug -- it's fixed 
> in the same
> way, but I think we might be able to come up with a test for it: 
> `fmv.d.x FREG,
> x0` would be the fastest way to generate 0.0, so maybe something like
>
>    double sum(double *d) {
>      double sum = 0;
>      for (int i = 0; i < 8; ++i)
>        sum += d[i];
>      return sum;
>    }
>
> would do it?  That's generating the fmv on 13 for me, though, so maybe 
> I'm
> missing something?` 

I don't think we can avoid FMV in this case

     fmv.d.x    fa0,zero     #1
     addi    a5,a0,64
.L2:
     fld    fa5,0(a0)
     addi    a0,a0,8
     fadd.d    fa0,fa0,fa5   #2
     bne    a0,a5,.L2
     ret

In #1, the zero needs to be setup in FP reg (possible using FMV), since 
in #2 it will be used for FP math.

If we change ur test slightly,

double zadd(double *d) {
      double sum = 0.0;
      for (int i = 0; i < 8; ++i)
        d[i] = sum;
      return sum;
}

We still get the optimal code for writing to FP 0. The last FMV is 
unavoidable as we need an FP return reg.


     addi    a5,a0,64
.L2:
     sd    zero,0(a0)
     addi    a0,a0,8
     bne    a0,a5,.L2
     fmv.d.x    fa0,zero
     ret

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH] RISC-V: optim const DF +0.0 store to mem [PR/110748]
  2023-07-21 18:55   ` Vineet Gupta
@ 2023-07-22  6:03     ` Jeff Law
  0 siblings, 0 replies; 10+ messages in thread
From: Jeff Law @ 2023-07-22  6:03 UTC (permalink / raw)
  To: Vineet Gupta, Palmer Dabbelt
  Cc: gcc-patches, manolis.tsamis, Kito Cheng, gnu-toolchain



On 7/21/23 12:55, Vineet Gupta wrote:

>>>
>>> Apparently this is a regression in gcc-13, introduced by commit
>>> ef85d150b5963 ("RISC-V: Enable TARGET_SUPPORTS_WIDE_INT") and the fix
>>> thus is a partial revert of that change.
>>
>> Given that it can ICE, we should probably backport it to 13.
> 
> FWIW ICE is on an in-flight for-gcc-14 patch, not something in tree 
> already. And this will merge ahead of that.
> I'm fine with backport though.
It's latent on the gcc-13 branch from an ICE standpoint and would 
probably stay that way -- triggering would require something to 
re-recognize the pattern after register allocation.

I won't object to it going into gcc-13.  No strong opinions either way, 
happy to go with consensus opinion.



>> There's also a pmode_reg_or_0_operand, some of those don't appear 
>> protected from FP values.  So we might need something like
>>
>> diff --git a/gcc/config/riscv/autovec.md b/gcc/config/riscv/autovec.md
>> index cd5b19457f8..d8ce9223343 100644
>> --- a/gcc/config/riscv/autovec.md
>> +++ b/gcc/config/riscv/autovec.md
>> @@ -63,7 +63,7 @@ (define_expand "movmisalign<mode>"
>>
>> (define_expand "len_mask_gather_load<VNX1_QHSD:mode><VNX1_QHSDI:mode>"
>>   [(match_operand:VNX1_QHSD 0 "register_operand")
>> -   (match_operand 1 "pmode_reg_or_0_operand")
>> +   (match_operand:P 1 "pmode_reg_or_0_operand")
>>    (match_operand:VNX1_QHSDI 2 "register_operand")
>>    (match_operand 3 "<VNX1_QHSD:gs_extension>")
>>    (match_operand 4 "<VNX1_QHSD:gs_scale>")
>>
>> a bunch of times, as there's a ton of them?  I'm not entirely sure if 
>> that
>> could manifest as an actual bug, though...
> 
> What does 'P' do here ?
It will force the operand to match the P mode iterator, which should be 
DImode for rv64 and SImode for rv32.  I'm not sure this is wise as I 
think it'll end up rejecting (const_int 0) because CONST_INT nodes do 
not have a mode.

Jeff

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH] RISC-V: optim const DF +0.0 store to mem [PR/110748]
  2023-07-21 18:47   ` Jeff Law
@ 2023-07-25 23:05     ` Palmer Dabbelt
  2023-07-26  3:09       ` Jeff Law
  0 siblings, 1 reply; 10+ messages in thread
From: Palmer Dabbelt @ 2023-07-25 23:05 UTC (permalink / raw)
  To: gcc-patches
  Cc: Vineet Gupta, gcc-patches, manolis.tsamis, Kito Cheng, gnu-toolchain

On Fri, 21 Jul 2023 11:47:58 PDT (-0700), gcc-patches@gcc.gnu.org wrote:
> On 7/21/23 12:31, Palmer Dabbelt wrote:
>> (define_expand "len_mask_gather_load<VNX1_QHSD:mode><VNX1_QHSDI:mode>"
>>    [(match_operand:VNX1_QHSD 0 "register_operand")
>> -   (match_operand 1 "pmode_reg_or_0_operand")
>> +   (match_operand:P 1 "pmode_reg_or_0_operand")
>>     (match_operand:VNX1_QHSDI 2 "register_operand")
>>     (match_operand 3 "<VNX1_QHSD:gs_extension>")
>>     (match_operand 4 "<VNX1_QHSD:gs_scale>")
>>
>> a bunch of times, as there's a ton of them?  I'm not entirely sure if that
>> could manifest as an actual bug, though...
> But won't this cause (const_int 0) to no longer match because CONST_INT
> nodes are modeless (VOIDmode)?

I poked around a bit and I'm not actually sure, I'm kind of lost on the docs
here.  IIUC we're eliding the VOIDmode in the predicate correctly

    (define_predicate "const_0_operand"
      (and (match_code "const_int,const_wide_int,const_vector")
           (match_test "op == CONST0_RTX (GET_MODE (op))")))

so we're OK there, otherwise we'd presumably have similar problems with
expanders like

    (define_expand "subsi3"
      [(set (match_operand:SI           0 "register_operand" "= r")
           (minus:SI (match_operand:SI 1 "reg_or_0_operand" " rJ")
                     (match_operand:SI 2 "register_operand" "  r")))]
      ""

which we have a few of -- though it'd be kind of a silent failure, as
presumably we'd just end up with some more move-x0s emitted?

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH] RISC-V: optim const DF +0.0 store to mem [PR/110748]
  2023-07-25 23:05     ` Palmer Dabbelt
@ 2023-07-26  3:09       ` Jeff Law
  0 siblings, 0 replies; 10+ messages in thread
From: Jeff Law @ 2023-07-26  3:09 UTC (permalink / raw)
  To: Palmer Dabbelt, gcc-patches
  Cc: Vineet Gupta, manolis.tsamis, Kito Cheng, gnu-toolchain



On 7/25/23 17:05, Palmer Dabbelt wrote:
> On Fri, 21 Jul 2023 11:47:58 PDT (-0700), gcc-patches@gcc.gnu.org wrote:
>> On 7/21/23 12:31, Palmer Dabbelt wrote:
>>> (define_expand "len_mask_gather_load<VNX1_QHSD:mode><VNX1_QHSDI:mode>"
>>>    [(match_operand:VNX1_QHSD 0 "register_operand")
>>> -   (match_operand 1 "pmode_reg_or_0_operand")
>>> +   (match_operand:P 1 "pmode_reg_or_0_operand")
>>>     (match_operand:VNX1_QHSDI 2 "register_operand")
>>>     (match_operand 3 "<VNX1_QHSD:gs_extension>")
>>>     (match_operand 4 "<VNX1_QHSD:gs_scale>")
>>>
>>> a bunch of times, as there's a ton of them?  I'm not entirely sure if 
>>> that
>>> could manifest as an actual bug, though...
>> But won't this cause (const_int 0) to no longer match because CONST_INT
>> nodes are modeless (VOIDmode)?
> 
> I poked around a bit and I'm not actually sure, I'm kind of lost on the 
> docs
> here.  IIUC we're eliding the VOIDmode in the predicate correctly
> 
>     (define_predicate "const_0_operand"
>       (and (match_code "const_int,const_wide_int,const_vector")
>            (match_test "op == CONST0_RTX (GET_MODE (op))")))
> 
> so we're OK there, otherwise we'd presumably have similar problems with
> expanders like
> 
>     (define_expand "subsi3"
>       [(set (match_operand:SI           0 "register_operand" "= r")
>            (minus:SI (match_operand:SI 1 "reg_or_0_operand" " rJ")
>                      (match_operand:SI 2 "register_operand" "  r")))]
>       ""
> 
> which we have a few of -- though it'd be kind of a silent failure, as
> presumably we'd just end up with some more move-x0s emitted?
It's a bit messy to say the least.  However, we can look at other ports 
and after doing so I'm less sure my concern is valid.

Take the typical movXX pattern or expander.  Both operands have a mode, 
so things like CONST_INT must be passing through, even though they're 
VOIDmode.

So it's probably a non-issue.
jeff



^ permalink raw reply	[flat|nested] 10+ messages in thread

end of thread, other threads:[~2023-07-26  3:09 UTC | newest]

Thread overview: 10+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2023-07-21 17:55 [PATCH] RISC-V: optim const DF +0.0 store to mem [PR/110748] Vineet Gupta
2023-07-21 18:15 ` Philipp Tomsich
2023-07-21 18:23   ` Vineet Gupta
2023-07-21 18:31 ` Palmer Dabbelt
2023-07-21 18:47   ` Jeff Law
2023-07-25 23:05     ` Palmer Dabbelt
2023-07-26  3:09       ` Jeff Law
2023-07-21 18:55   ` Vineet Gupta
2023-07-22  6:03     ` Jeff Law
2023-07-21 19:37   ` Vineet Gupta

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).