* [PATCH] RISC-V: Don't add fractional LMUL types to V_VLS for XTheadVector
@ 2024-03-21 23:45 Christoph Müllner
2024-03-22 1:17 ` juzhe.zhong
2024-03-22 3:43 ` Bruce Hoult
0 siblings, 2 replies; 5+ messages in thread
From: Christoph Müllner @ 2024-03-21 23:45 UTC (permalink / raw)
To: gcc-patches, Kito Cheng, Palmer Dabbelt, Andrew Waterman,
Philipp Tomsich, Camel Coder, Bruce Hoult, Juzhe-Zhong, Jun Sha,
Xianmiao Qu, Jin Ma
Cc: Christoph Müllner
The expansion of `memset` (via expand_builtin_memset_args())
uses clear_by_pieces() and store_by_pieces() to avoid calls
to the C runtime. To check if a type can be used for that purpose
the function by_pieces_mode_supported_p() tests if a `mov` and
a `vec_duplicate` INSN can be expaned by the backend.
The `vec_duplicate` expansion takes arguments of type `V_VLS`.
The `mov` expansions take arguments of type `V`, `VB`, `VT`,
`VLS_AVL_IMM`, and `VLS_AVL_REG`. Some of these types (in fact
not types but type iterators) include fractional LMUL types.
E.g. `V_VLS` includes `V`, which includes `VI`, which includes
`RVVMF2QI`.
This results in an attempt to use fractional LMUL-types for
the `memset` expansion resulting in an ICE for XTheadVector,
because that extension cannot handle fractional LMULs.
This patch addresses this issue by splitting the definition
of the `VI` mode itereator into `VI_NOFRAC` (without fractional
LMUL types) and `VI_FRAC` (only fractional LMUL types).
Further, it defines `V_VLS` such, that `VI_FRAC` types are only
included if XTheadVector is not enabled.
The effect is demonstrated by a new test case that shows
that the by-pieces framework now emits `sb` instructions
instead of triggering an ICE.
Signed-off-by: Christoph Müllner <christoph.muellner@vrull.eu>
PR 114194
gcc/ChangeLog:
* config/riscv/vector-iterators.md: Split VI into VI_FRAC and VI_NOFRAC.
Only include VI_NOFRAC in V_VLS without TARGET_XTHEADVECTOR.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/rvv/xtheadvector/pr114194.c: New test.
Signed-off-by: Christoph Müllner <christoph.muellner@vrull.eu>
---
gcc/config/riscv/vector-iterators.md | 19 +++++--
.../riscv/rvv/xtheadvector/pr114194.c | 56 +++++++++++++++++++
2 files changed, 69 insertions(+), 6 deletions(-)
create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/xtheadvector/pr114194.c
diff --git a/gcc/config/riscv/vector-iterators.md b/gcc/config/riscv/vector-iterators.md
index c2ea7e8b10a..a24e1bf078f 100644
--- a/gcc/config/riscv/vector-iterators.md
+++ b/gcc/config/riscv/vector-iterators.md
@@ -108,17 +108,24 @@ (define_c_enum "unspecv" [
UNSPECV_FRM_RESTORE_EXIT
])
-(define_mode_iterator VI [
- RVVM8QI RVVM4QI RVVM2QI RVVM1QI RVVMF2QI RVVMF4QI (RVVMF8QI "TARGET_MIN_VLEN > 32")
-
- RVVM8HI RVVM4HI RVVM2HI RVVM1HI RVVMF2HI (RVVMF4HI "TARGET_MIN_VLEN > 32")
-
- RVVM8SI RVVM4SI RVVM2SI RVVM1SI (RVVMF2SI "TARGET_MIN_VLEN > 32")
+;; Subset of VI with fractional LMUL types
+(define_mode_iterator VI_FRAC [
+ RVVMF2QI RVVMF4QI (RVVMF8QI "TARGET_MIN_VLEN > 32")
+ RVVMF2HI (RVVMF4HI "TARGET_MIN_VLEN > 32")
+ (RVVMF2SI "TARGET_MIN_VLEN > 32")
+])
+;; Subset of VI with non-fractional LMUL types
+(define_mode_iterator VI_NOFRAC [
+ RVVM8QI RVVM4QI RVVM2QI RVVM1QI
+ RVVM8HI RVVM4HI RVVM2HI RVVM1HI
+ RVVM8SI RVVM4SI RVVM2SI RVVM1SI
(RVVM8DI "TARGET_VECTOR_ELEN_64") (RVVM4DI "TARGET_VECTOR_ELEN_64")
(RVVM2DI "TARGET_VECTOR_ELEN_64") (RVVM1DI "TARGET_VECTOR_ELEN_64")
])
+(define_mode_iterator VI [ VI_NOFRAC (VI_FRAC "!TARGET_XTHEADVECTOR") ])
+
;; This iterator is the same as above but with TARGET_VECTOR_ELEN_FP_16
;; changed to TARGET_ZVFH. TARGET_VECTOR_ELEN_FP_16 is also true for
;; TARGET_ZVFHMIN while we actually want to disable all instructions apart
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/xtheadvector/pr114194.c b/gcc/testsuite/gcc.target/riscv/rvv/xtheadvector/pr114194.c
new file mode 100644
index 00000000000..fc2d1349425
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/xtheadvector/pr114194.c
@@ -0,0 +1,56 @@
+/* { dg-do compile } */
+/* { dg-options "-march=rv32gc_xtheadvector" { target { rv32 } } } */
+/* { dg-options "-march=rv64gc_xtheadvector" { target { rv64 } } } */
+/* { dg-final { check-function-bodies "**" "" } } */
+
+/*
+** foo0_1:
+** sb\tzero,0([a-x0-9]+)
+** ret
+*/
+void foo0_1 (void *p)
+{
+ __builtin_memset (p, 0, 1);
+}
+
+/*
+** foo0_7:
+** sb\tzero,0([a-x0-9]+)
+** sb\tzero,1([a-x0-9]+)
+** sb\tzero,2([a-x0-9]+)
+** sb\tzero,3([a-x0-9]+)
+** sb\tzero,4([a-x0-9]+)
+** sb\tzero,5([a-x0-9]+)
+** sb\tzero,6([a-x0-9]+)
+** ret
+*/
+void foo0_7 (void *p)
+{
+ __builtin_memset (p, 0, 7);
+}
+
+/*
+** foo1_1:
+** li\t[a-x0-9]+,1
+** sb\t[a-x0-9]+,0([a-x0-9]+)
+** ret
+*/
+void foo1_1 (void *p)
+{
+ __builtin_memset (p, 1, 1);
+}
+
+/*
+** foo1_5:
+** li\t[a-x0-9]+,1
+** sb\t[a-x0-9]+,0([a-x0-9]+)
+** sb\t[a-x0-9]+,1([a-x0-9]+)
+** sb\t[a-x0-9]+,2([a-x0-9]+)
+** sb\t[a-x0-9]+,3([a-x0-9]+)
+** sb\t[a-x0-9]+,4([a-x0-9]+)
+** ret
+*/
+void foo1_5 (void *p)
+{
+ __builtin_memset (p, 1, 5);
+}
--
2.44.0
^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: [PATCH] RISC-V: Don't add fractional LMUL types to V_VLS for XTheadVector
2024-03-21 23:45 [PATCH] RISC-V: Don't add fractional LMUL types to V_VLS for XTheadVector Christoph Müllner
@ 2024-03-22 1:17 ` juzhe.zhong
2024-03-22 6:54 ` Christoph Müllner
2024-03-22 3:43 ` Bruce Hoult
1 sibling, 1 reply; 5+ messages in thread
From: juzhe.zhong @ 2024-03-22 1:17 UTC (permalink / raw)
To: christoph.muellner, gcc-patches, Kito.cheng, palmer, andrew,
philipp.tomsich, Camel Coder, Bruce Hoult, cooper.joshua,
cooper.qu, jinma
Cc: christoph.muellner
[-- Attachment #1: Type: text/plain, Size: 5092 bytes --]
LGTM.
juzhe.zhong@rivai.ai
From: Christoph Müllner
Date: 2024-03-22 07:45
To: gcc-patches; Kito Cheng; Palmer Dabbelt; Andrew Waterman; Philipp Tomsich; Camel Coder; Bruce Hoult; Juzhe-Zhong; Jun Sha; Xianmiao Qu; Jin Ma
CC: Christoph Müllner
Subject: [PATCH] RISC-V: Don't add fractional LMUL types to V_VLS for XTheadVector
The expansion of `memset` (via expand_builtin_memset_args())
uses clear_by_pieces() and store_by_pieces() to avoid calls
to the C runtime. To check if a type can be used for that purpose
the function by_pieces_mode_supported_p() tests if a `mov` and
a `vec_duplicate` INSN can be expaned by the backend.
The `vec_duplicate` expansion takes arguments of type `V_VLS`.
The `mov` expansions take arguments of type `V`, `VB`, `VT`,
`VLS_AVL_IMM`, and `VLS_AVL_REG`. Some of these types (in fact
not types but type iterators) include fractional LMUL types.
E.g. `V_VLS` includes `V`, which includes `VI`, which includes
`RVVMF2QI`.
This results in an attempt to use fractional LMUL-types for
the `memset` expansion resulting in an ICE for XTheadVector,
because that extension cannot handle fractional LMULs.
This patch addresses this issue by splitting the definition
of the `VI` mode itereator into `VI_NOFRAC` (without fractional
LMUL types) and `VI_FRAC` (only fractional LMUL types).
Further, it defines `V_VLS` such, that `VI_FRAC` types are only
included if XTheadVector is not enabled.
The effect is demonstrated by a new test case that shows
that the by-pieces framework now emits `sb` instructions
instead of triggering an ICE.
Signed-off-by: Christoph Müllner <christoph.muellner@vrull.eu>
PR 114194
gcc/ChangeLog:
* config/riscv/vector-iterators.md: Split VI into VI_FRAC and VI_NOFRAC.
Only include VI_NOFRAC in V_VLS without TARGET_XTHEADVECTOR.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/rvv/xtheadvector/pr114194.c: New test.
Signed-off-by: Christoph Müllner <christoph.muellner@vrull.eu>
---
gcc/config/riscv/vector-iterators.md | 19 +++++--
.../riscv/rvv/xtheadvector/pr114194.c | 56 +++++++++++++++++++
2 files changed, 69 insertions(+), 6 deletions(-)
create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/xtheadvector/pr114194.c
diff --git a/gcc/config/riscv/vector-iterators.md b/gcc/config/riscv/vector-iterators.md
index c2ea7e8b10a..a24e1bf078f 100644
--- a/gcc/config/riscv/vector-iterators.md
+++ b/gcc/config/riscv/vector-iterators.md
@@ -108,17 +108,24 @@ (define_c_enum "unspecv" [
UNSPECV_FRM_RESTORE_EXIT
])
-(define_mode_iterator VI [
- RVVM8QI RVVM4QI RVVM2QI RVVM1QI RVVMF2QI RVVMF4QI (RVVMF8QI "TARGET_MIN_VLEN > 32")
-
- RVVM8HI RVVM4HI RVVM2HI RVVM1HI RVVMF2HI (RVVMF4HI "TARGET_MIN_VLEN > 32")
-
- RVVM8SI RVVM4SI RVVM2SI RVVM1SI (RVVMF2SI "TARGET_MIN_VLEN > 32")
+;; Subset of VI with fractional LMUL types
+(define_mode_iterator VI_FRAC [
+ RVVMF2QI RVVMF4QI (RVVMF8QI "TARGET_MIN_VLEN > 32")
+ RVVMF2HI (RVVMF4HI "TARGET_MIN_VLEN > 32")
+ (RVVMF2SI "TARGET_MIN_VLEN > 32")
+])
+;; Subset of VI with non-fractional LMUL types
+(define_mode_iterator VI_NOFRAC [
+ RVVM8QI RVVM4QI RVVM2QI RVVM1QI
+ RVVM8HI RVVM4HI RVVM2HI RVVM1HI
+ RVVM8SI RVVM4SI RVVM2SI RVVM1SI
(RVVM8DI "TARGET_VECTOR_ELEN_64") (RVVM4DI "TARGET_VECTOR_ELEN_64")
(RVVM2DI "TARGET_VECTOR_ELEN_64") (RVVM1DI "TARGET_VECTOR_ELEN_64")
])
+(define_mode_iterator VI [ VI_NOFRAC (VI_FRAC "!TARGET_XTHEADVECTOR") ])
+
;; This iterator is the same as above but with TARGET_VECTOR_ELEN_FP_16
;; changed to TARGET_ZVFH. TARGET_VECTOR_ELEN_FP_16 is also true for
;; TARGET_ZVFHMIN while we actually want to disable all instructions apart
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/xtheadvector/pr114194.c b/gcc/testsuite/gcc.target/riscv/rvv/xtheadvector/pr114194.c
new file mode 100644
index 00000000000..fc2d1349425
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/xtheadvector/pr114194.c
@@ -0,0 +1,56 @@
+/* { dg-do compile } */
+/* { dg-options "-march=rv32gc_xtheadvector" { target { rv32 } } } */
+/* { dg-options "-march=rv64gc_xtheadvector" { target { rv64 } } } */
+/* { dg-final { check-function-bodies "**" "" } } */
+
+/*
+** foo0_1:
+** sb\tzero,0([a-x0-9]+)
+** ret
+*/
+void foo0_1 (void *p)
+{
+ __builtin_memset (p, 0, 1);
+}
+
+/*
+** foo0_7:
+** sb\tzero,0([a-x0-9]+)
+** sb\tzero,1([a-x0-9]+)
+** sb\tzero,2([a-x0-9]+)
+** sb\tzero,3([a-x0-9]+)
+** sb\tzero,4([a-x0-9]+)
+** sb\tzero,5([a-x0-9]+)
+** sb\tzero,6([a-x0-9]+)
+** ret
+*/
+void foo0_7 (void *p)
+{
+ __builtin_memset (p, 0, 7);
+}
+
+/*
+** foo1_1:
+** li\t[a-x0-9]+,1
+** sb\t[a-x0-9]+,0([a-x0-9]+)
+** ret
+*/
+void foo1_1 (void *p)
+{
+ __builtin_memset (p, 1, 1);
+}
+
+/*
+** foo1_5:
+** li\t[a-x0-9]+,1
+** sb\t[a-x0-9]+,0([a-x0-9]+)
+** sb\t[a-x0-9]+,1([a-x0-9]+)
+** sb\t[a-x0-9]+,2([a-x0-9]+)
+** sb\t[a-x0-9]+,3([a-x0-9]+)
+** sb\t[a-x0-9]+,4([a-x0-9]+)
+** ret
+*/
+void foo1_5 (void *p)
+{
+ __builtin_memset (p, 1, 5);
+}
--
2.44.0
^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: [PATCH] RISC-V: Don't add fractional LMUL types to V_VLS for XTheadVector
2024-03-21 23:45 [PATCH] RISC-V: Don't add fractional LMUL types to V_VLS for XTheadVector Christoph Müllner
2024-03-22 1:17 ` juzhe.zhong
@ 2024-03-22 3:43 ` Bruce Hoult
2024-03-22 7:41 ` Christoph Müllner
1 sibling, 1 reply; 5+ messages in thread
From: Bruce Hoult @ 2024-03-22 3:43 UTC (permalink / raw)
To: Christoph Müllner
Cc: gcc-patches, Kito Cheng, Palmer Dabbelt, Andrew Waterman,
Philipp Tomsich, Camel Coder, Juzhe-Zhong, Jun Sha, Xianmiao Qu,
Jin Ma
> The effect is demonstrated by a new test case that shows
that the by-pieces framework now emits `sb` instructions
instead of triggering an ICE
So these small memset() now don't use RVV at all if xtheadvector is enabled?
I don't have evidence whether the use of RVV (whether V or
xtheadvector) for these memsets is a win or not, but the treatment
should probably be consistent.
I don't know why RVV 1.0 uses a fractional LMUL at all here. It would
work perfectly well with LMUL=1 and just setting vl to the appropriate
length (which is always less than 16 bytes). Use of fractional LMUL
doesn't save any resources.
On Fri, Mar 22, 2024 at 12:46 PM Christoph Müllner
<christoph.muellner@vrull.eu> wrote:
>
> The expansion of `memset` (via expand_builtin_memset_args())
> uses clear_by_pieces() and store_by_pieces() to avoid calls
> to the C runtime. To check if a type can be used for that purpose
> the function by_pieces_mode_supported_p() tests if a `mov` and
> a `vec_duplicate` INSN can be expaned by the backend.
>
> The `vec_duplicate` expansion takes arguments of type `V_VLS`.
> The `mov` expansions take arguments of type `V`, `VB`, `VT`,
> `VLS_AVL_IMM`, and `VLS_AVL_REG`. Some of these types (in fact
> not types but type iterators) include fractional LMUL types.
> E.g. `V_VLS` includes `V`, which includes `VI`, which includes
> `RVVMF2QI`.
>
> This results in an attempt to use fractional LMUL-types for
> the `memset` expansion resulting in an ICE for XTheadVector,
> because that extension cannot handle fractional LMULs.
>
> This patch addresses this issue by splitting the definition
> of the `VI` mode itereator into `VI_NOFRAC` (without fractional
> LMUL types) and `VI_FRAC` (only fractional LMUL types).
> Further, it defines `V_VLS` such, that `VI_FRAC` types are only
> included if XTheadVector is not enabled.
>
> The effect is demonstrated by a new test case that shows
> that the by-pieces framework now emits `sb` instructions
> instead of triggering an ICE.
>
> Signed-off-by: Christoph Müllner <christoph.muellner@vrull.eu>
>
> PR 114194
>
> gcc/ChangeLog:
>
> * config/riscv/vector-iterators.md: Split VI into VI_FRAC and VI_NOFRAC.
> Only include VI_NOFRAC in V_VLS without TARGET_XTHEADVECTOR.
>
> gcc/testsuite/ChangeLog:
>
> * gcc.target/riscv/rvv/xtheadvector/pr114194.c: New test.
>
> Signed-off-by: Christoph Müllner <christoph.muellner@vrull.eu>
> ---
> gcc/config/riscv/vector-iterators.md | 19 +++++--
> .../riscv/rvv/xtheadvector/pr114194.c | 56 +++++++++++++++++++
> 2 files changed, 69 insertions(+), 6 deletions(-)
> create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/xtheadvector/pr114194.c
>
> diff --git a/gcc/config/riscv/vector-iterators.md b/gcc/config/riscv/vector-iterators.md
> index c2ea7e8b10a..a24e1bf078f 100644
> --- a/gcc/config/riscv/vector-iterators.md
> +++ b/gcc/config/riscv/vector-iterators.md
> @@ -108,17 +108,24 @@ (define_c_enum "unspecv" [
> UNSPECV_FRM_RESTORE_EXIT
> ])
>
> -(define_mode_iterator VI [
> - RVVM8QI RVVM4QI RVVM2QI RVVM1QI RVVMF2QI RVVMF4QI (RVVMF8QI "TARGET_MIN_VLEN > 32")
> -
> - RVVM8HI RVVM4HI RVVM2HI RVVM1HI RVVMF2HI (RVVMF4HI "TARGET_MIN_VLEN > 32")
> -
> - RVVM8SI RVVM4SI RVVM2SI RVVM1SI (RVVMF2SI "TARGET_MIN_VLEN > 32")
> +;; Subset of VI with fractional LMUL types
> +(define_mode_iterator VI_FRAC [
> + RVVMF2QI RVVMF4QI (RVVMF8QI "TARGET_MIN_VLEN > 32")
> + RVVMF2HI (RVVMF4HI "TARGET_MIN_VLEN > 32")
> + (RVVMF2SI "TARGET_MIN_VLEN > 32")
> +])
>
> +;; Subset of VI with non-fractional LMUL types
> +(define_mode_iterator VI_NOFRAC [
> + RVVM8QI RVVM4QI RVVM2QI RVVM1QI
> + RVVM8HI RVVM4HI RVVM2HI RVVM1HI
> + RVVM8SI RVVM4SI RVVM2SI RVVM1SI
> (RVVM8DI "TARGET_VECTOR_ELEN_64") (RVVM4DI "TARGET_VECTOR_ELEN_64")
> (RVVM2DI "TARGET_VECTOR_ELEN_64") (RVVM1DI "TARGET_VECTOR_ELEN_64")
> ])
>
> +(define_mode_iterator VI [ VI_NOFRAC (VI_FRAC "!TARGET_XTHEADVECTOR") ])
> +
> ;; This iterator is the same as above but with TARGET_VECTOR_ELEN_FP_16
> ;; changed to TARGET_ZVFH. TARGET_VECTOR_ELEN_FP_16 is also true for
> ;; TARGET_ZVFHMIN while we actually want to disable all instructions apart
> diff --git a/gcc/testsuite/gcc.target/riscv/rvv/xtheadvector/pr114194.c b/gcc/testsuite/gcc.target/riscv/rvv/xtheadvector/pr114194.c
> new file mode 100644
> index 00000000000..fc2d1349425
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/riscv/rvv/xtheadvector/pr114194.c
> @@ -0,0 +1,56 @@
> +/* { dg-do compile } */
> +/* { dg-options "-march=rv32gc_xtheadvector" { target { rv32 } } } */
> +/* { dg-options "-march=rv64gc_xtheadvector" { target { rv64 } } } */
> +/* { dg-final { check-function-bodies "**" "" } } */
> +
> +/*
> +** foo0_1:
> +** sb\tzero,0([a-x0-9]+)
> +** ret
> +*/
> +void foo0_1 (void *p)
> +{
> + __builtin_memset (p, 0, 1);
> +}
> +
> +/*
> +** foo0_7:
> +** sb\tzero,0([a-x0-9]+)
> +** sb\tzero,1([a-x0-9]+)
> +** sb\tzero,2([a-x0-9]+)
> +** sb\tzero,3([a-x0-9]+)
> +** sb\tzero,4([a-x0-9]+)
> +** sb\tzero,5([a-x0-9]+)
> +** sb\tzero,6([a-x0-9]+)
> +** ret
> +*/
> +void foo0_7 (void *p)
> +{
> + __builtin_memset (p, 0, 7);
> +}
> +
> +/*
> +** foo1_1:
> +** li\t[a-x0-9]+,1
> +** sb\t[a-x0-9]+,0([a-x0-9]+)
> +** ret
> +*/
> +void foo1_1 (void *p)
> +{
> + __builtin_memset (p, 1, 1);
> +}
> +
> +/*
> +** foo1_5:
> +** li\t[a-x0-9]+,1
> +** sb\t[a-x0-9]+,0([a-x0-9]+)
> +** sb\t[a-x0-9]+,1([a-x0-9]+)
> +** sb\t[a-x0-9]+,2([a-x0-9]+)
> +** sb\t[a-x0-9]+,3([a-x0-9]+)
> +** sb\t[a-x0-9]+,4([a-x0-9]+)
> +** ret
> +*/
> +void foo1_5 (void *p)
> +{
> + __builtin_memset (p, 1, 5);
> +}
> --
> 2.44.0
>
>
^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: [PATCH] RISC-V: Don't add fractional LMUL types to V_VLS for XTheadVector
2024-03-22 1:17 ` juzhe.zhong
@ 2024-03-22 6:54 ` Christoph Müllner
0 siblings, 0 replies; 5+ messages in thread
From: Christoph Müllner @ 2024-03-22 6:54 UTC (permalink / raw)
To: juzhe.zhong
Cc: gcc-patches, Kito.cheng, palmer, andrew, philipp.tomsich,
Camel Coder, Bruce Hoult, cooper.joshua, cooper.qu, jinma
On Fri, Mar 22, 2024 at 2:18 AM juzhe.zhong@rivai.ai
<juzhe.zhong@rivai.ai> wrote:
>
> LGTM.
Pushed.
Thanks!
>
> ________________________________
> juzhe.zhong@rivai.ai
>
>
> From: Christoph Müllner
> Date: 2024-03-22 07:45
> To: gcc-patches; Kito Cheng; Palmer Dabbelt; Andrew Waterman; Philipp Tomsich; Camel Coder; Bruce Hoult; Juzhe-Zhong; Jun Sha; Xianmiao Qu; Jin Ma
> CC: Christoph Müllner
> Subject: [PATCH] RISC-V: Don't add fractional LMUL types to V_VLS for XTheadVector
> The expansion of `memset` (via expand_builtin_memset_args())
> uses clear_by_pieces() and store_by_pieces() to avoid calls
> to the C runtime. To check if a type can be used for that purpose
> the function by_pieces_mode_supported_p() tests if a `mov` and
> a `vec_duplicate` INSN can be expaned by the backend.
>
> The `vec_duplicate` expansion takes arguments of type `V_VLS`.
> The `mov` expansions take arguments of type `V`, `VB`, `VT`,
> `VLS_AVL_IMM`, and `VLS_AVL_REG`. Some of these types (in fact
> not types but type iterators) include fractional LMUL types.
> E.g. `V_VLS` includes `V`, which includes `VI`, which includes
> `RVVMF2QI`.
>
> This results in an attempt to use fractional LMUL-types for
> the `memset` expansion resulting in an ICE for XTheadVector,
> because that extension cannot handle fractional LMULs.
>
> This patch addresses this issue by splitting the definition
> of the `VI` mode itereator into `VI_NOFRAC` (without fractional
> LMUL types) and `VI_FRAC` (only fractional LMUL types).
> Further, it defines `V_VLS` such, that `VI_FRAC` types are only
> included if XTheadVector is not enabled.
>
> The effect is demonstrated by a new test case that shows
> that the by-pieces framework now emits `sb` instructions
> instead of triggering an ICE.
>
> Signed-off-by: Christoph Müllner <christoph.muellner@vrull.eu>
>
> PR 114194
>
> gcc/ChangeLog:
>
> * config/riscv/vector-iterators.md: Split VI into VI_FRAC and VI_NOFRAC.
> Only include VI_NOFRAC in V_VLS without TARGET_XTHEADVECTOR.
>
> gcc/testsuite/ChangeLog:
>
> * gcc.target/riscv/rvv/xtheadvector/pr114194.c: New test.
>
> Signed-off-by: Christoph Müllner <christoph.muellner@vrull.eu>
> ---
> gcc/config/riscv/vector-iterators.md | 19 +++++--
> .../riscv/rvv/xtheadvector/pr114194.c | 56 +++++++++++++++++++
> 2 files changed, 69 insertions(+), 6 deletions(-)
> create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/xtheadvector/pr114194.c
>
> diff --git a/gcc/config/riscv/vector-iterators.md b/gcc/config/riscv/vector-iterators.md
> index c2ea7e8b10a..a24e1bf078f 100644
> --- a/gcc/config/riscv/vector-iterators.md
> +++ b/gcc/config/riscv/vector-iterators.md
> @@ -108,17 +108,24 @@ (define_c_enum "unspecv" [
> UNSPECV_FRM_RESTORE_EXIT
> ])
> -(define_mode_iterator VI [
> - RVVM8QI RVVM4QI RVVM2QI RVVM1QI RVVMF2QI RVVMF4QI (RVVMF8QI "TARGET_MIN_VLEN > 32")
> -
> - RVVM8HI RVVM4HI RVVM2HI RVVM1HI RVVMF2HI (RVVMF4HI "TARGET_MIN_VLEN > 32")
> -
> - RVVM8SI RVVM4SI RVVM2SI RVVM1SI (RVVMF2SI "TARGET_MIN_VLEN > 32")
> +;; Subset of VI with fractional LMUL types
> +(define_mode_iterator VI_FRAC [
> + RVVMF2QI RVVMF4QI (RVVMF8QI "TARGET_MIN_VLEN > 32")
> + RVVMF2HI (RVVMF4HI "TARGET_MIN_VLEN > 32")
> + (RVVMF2SI "TARGET_MIN_VLEN > 32")
> +])
> +;; Subset of VI with non-fractional LMUL types
> +(define_mode_iterator VI_NOFRAC [
> + RVVM8QI RVVM4QI RVVM2QI RVVM1QI
> + RVVM8HI RVVM4HI RVVM2HI RVVM1HI
> + RVVM8SI RVVM4SI RVVM2SI RVVM1SI
> (RVVM8DI "TARGET_VECTOR_ELEN_64") (RVVM4DI "TARGET_VECTOR_ELEN_64")
> (RVVM2DI "TARGET_VECTOR_ELEN_64") (RVVM1DI "TARGET_VECTOR_ELEN_64")
> ])
> +(define_mode_iterator VI [ VI_NOFRAC (VI_FRAC "!TARGET_XTHEADVECTOR") ])
> +
> ;; This iterator is the same as above but with TARGET_VECTOR_ELEN_FP_16
> ;; changed to TARGET_ZVFH. TARGET_VECTOR_ELEN_FP_16 is also true for
> ;; TARGET_ZVFHMIN while we actually want to disable all instructions apart
> diff --git a/gcc/testsuite/gcc.target/riscv/rvv/xtheadvector/pr114194.c b/gcc/testsuite/gcc.target/riscv/rvv/xtheadvector/pr114194.c
> new file mode 100644
> index 00000000000..fc2d1349425
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/riscv/rvv/xtheadvector/pr114194.c
> @@ -0,0 +1,56 @@
> +/* { dg-do compile } */
> +/* { dg-options "-march=rv32gc_xtheadvector" { target { rv32 } } } */
> +/* { dg-options "-march=rv64gc_xtheadvector" { target { rv64 } } } */
> +/* { dg-final { check-function-bodies "**" "" } } */
> +
> +/*
> +** foo0_1:
> +** sb\tzero,0([a-x0-9]+)
> +** ret
> +*/
> +void foo0_1 (void *p)
> +{
> + __builtin_memset (p, 0, 1);
> +}
> +
> +/*
> +** foo0_7:
> +** sb\tzero,0([a-x0-9]+)
> +** sb\tzero,1([a-x0-9]+)
> +** sb\tzero,2([a-x0-9]+)
> +** sb\tzero,3([a-x0-9]+)
> +** sb\tzero,4([a-x0-9]+)
> +** sb\tzero,5([a-x0-9]+)
> +** sb\tzero,6([a-x0-9]+)
> +** ret
> +*/
> +void foo0_7 (void *p)
> +{
> + __builtin_memset (p, 0, 7);
> +}
> +
> +/*
> +** foo1_1:
> +** li\t[a-x0-9]+,1
> +** sb\t[a-x0-9]+,0([a-x0-9]+)
> +** ret
> +*/
> +void foo1_1 (void *p)
> +{
> + __builtin_memset (p, 1, 1);
> +}
> +
> +/*
> +** foo1_5:
> +** li\t[a-x0-9]+,1
> +** sb\t[a-x0-9]+,0([a-x0-9]+)
> +** sb\t[a-x0-9]+,1([a-x0-9]+)
> +** sb\t[a-x0-9]+,2([a-x0-9]+)
> +** sb\t[a-x0-9]+,3([a-x0-9]+)
> +** sb\t[a-x0-9]+,4([a-x0-9]+)
> +** ret
> +*/
> +void foo1_5 (void *p)
> +{
> + __builtin_memset (p, 1, 5);
> +}
> --
> 2.44.0
>
>
^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: [PATCH] RISC-V: Don't add fractional LMUL types to V_VLS for XTheadVector
2024-03-22 3:43 ` Bruce Hoult
@ 2024-03-22 7:41 ` Christoph Müllner
0 siblings, 0 replies; 5+ messages in thread
From: Christoph Müllner @ 2024-03-22 7:41 UTC (permalink / raw)
To: Bruce Hoult
Cc: gcc-patches, Kito Cheng, Palmer Dabbelt, Andrew Waterman,
Philipp Tomsich, Camel Coder, Juzhe-Zhong, Jun Sha, Xianmiao Qu,
Jin Ma
On Fri, Mar 22, 2024 at 4:43 AM Bruce Hoult <bruce@hoult.org> wrote:
>
> > The effect is demonstrated by a new test case that shows
> that the by-pieces framework now emits `sb` instructions
> instead of triggering an ICE
>
> So these small memset() now don't use RVV at all if xtheadvector is enabled?
Yes, but not directly.
The patch just prevents fractional LMUL modes from being considered
for XTheadVector.
That's necessary because further lowering memory moves with a
fractional LMUL mode
cannot be done for XTheadVector (that's the reason for the ICE).
> I don't have evidence whether the use of RVV (whether V or
> xtheadvector) for these memsets is a win or not, but the treatment
> should probably be consistent.
>
> I don't know why RVV 1.0 uses a fractional LMUL at all here. It would
> work perfectly well with LMUL=1 and just setting vl to the appropriate
> length (which is always less than 16 bytes). Use of fractional LMUL
> doesn't save any resources.
The compiler can consider fractional LMUL values for expansion for RVV,
but that does not mean it will be used in the emitted instruction sequence.
Details like cost model and data alignment also matter.
During testing, I observed that RVV and XTheadVector will both emit sequences
of 'sd' for short memsets with known length, known data to set,
and unknown alignment of the data to be written.
However, I have not excessively tested using all possible tuning parameters,
as my primary goal was to eliminate the reason for the ICE with XTheadVector.
>
> On Fri, Mar 22, 2024 at 12:46 PM Christoph Müllner
> <christoph.muellner@vrull.eu> wrote:
> >
> > The expansion of `memset` (via expand_builtin_memset_args())
> > uses clear_by_pieces() and store_by_pieces() to avoid calls
> > to the C runtime. To check if a type can be used for that purpose
> > the function by_pieces_mode_supported_p() tests if a `mov` and
> > a `vec_duplicate` INSN can be expaned by the backend.
> >
> > The `vec_duplicate` expansion takes arguments of type `V_VLS`.
> > The `mov` expansions take arguments of type `V`, `VB`, `VT`,
> > `VLS_AVL_IMM`, and `VLS_AVL_REG`. Some of these types (in fact
> > not types but type iterators) include fractional LMUL types.
> > E.g. `V_VLS` includes `V`, which includes `VI`, which includes
> > `RVVMF2QI`.
> >
> > This results in an attempt to use fractional LMUL-types for
> > the `memset` expansion resulting in an ICE for XTheadVector,
> > because that extension cannot handle fractional LMULs.
> >
> > This patch addresses this issue by splitting the definition
> > of the `VI` mode itereator into `VI_NOFRAC` (without fractional
> > LMUL types) and `VI_FRAC` (only fractional LMUL types).
> > Further, it defines `V_VLS` such, that `VI_FRAC` types are only
> > included if XTheadVector is not enabled.
> >
> > The effect is demonstrated by a new test case that shows
> > that the by-pieces framework now emits `sb` instructions
> > instead of triggering an ICE.
> >
> > Signed-off-by: Christoph Müllner <christoph.muellner@vrull.eu>
> >
> > PR 114194
> >
> > gcc/ChangeLog:
> >
> > * config/riscv/vector-iterators.md: Split VI into VI_FRAC and VI_NOFRAC.
> > Only include VI_NOFRAC in V_VLS without TARGET_XTHEADVECTOR.
> >
> > gcc/testsuite/ChangeLog:
> >
> > * gcc.target/riscv/rvv/xtheadvector/pr114194.c: New test.
> >
> > Signed-off-by: Christoph Müllner <christoph.muellner@vrull.eu>
> > ---
> > gcc/config/riscv/vector-iterators.md | 19 +++++--
> > .../riscv/rvv/xtheadvector/pr114194.c | 56 +++++++++++++++++++
> > 2 files changed, 69 insertions(+), 6 deletions(-)
> > create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/xtheadvector/pr114194.c
> >
> > diff --git a/gcc/config/riscv/vector-iterators.md b/gcc/config/riscv/vector-iterators.md
> > index c2ea7e8b10a..a24e1bf078f 100644
> > --- a/gcc/config/riscv/vector-iterators.md
> > +++ b/gcc/config/riscv/vector-iterators.md
> > @@ -108,17 +108,24 @@ (define_c_enum "unspecv" [
> > UNSPECV_FRM_RESTORE_EXIT
> > ])
> >
> > -(define_mode_iterator VI [
> > - RVVM8QI RVVM4QI RVVM2QI RVVM1QI RVVMF2QI RVVMF4QI (RVVMF8QI "TARGET_MIN_VLEN > 32")
> > -
> > - RVVM8HI RVVM4HI RVVM2HI RVVM1HI RVVMF2HI (RVVMF4HI "TARGET_MIN_VLEN > 32")
> > -
> > - RVVM8SI RVVM4SI RVVM2SI RVVM1SI (RVVMF2SI "TARGET_MIN_VLEN > 32")
> > +;; Subset of VI with fractional LMUL types
> > +(define_mode_iterator VI_FRAC [
> > + RVVMF2QI RVVMF4QI (RVVMF8QI "TARGET_MIN_VLEN > 32")
> > + RVVMF2HI (RVVMF4HI "TARGET_MIN_VLEN > 32")
> > + (RVVMF2SI "TARGET_MIN_VLEN > 32")
> > +])
> >
> > +;; Subset of VI with non-fractional LMUL types
> > +(define_mode_iterator VI_NOFRAC [
> > + RVVM8QI RVVM4QI RVVM2QI RVVM1QI
> > + RVVM8HI RVVM4HI RVVM2HI RVVM1HI
> > + RVVM8SI RVVM4SI RVVM2SI RVVM1SI
> > (RVVM8DI "TARGET_VECTOR_ELEN_64") (RVVM4DI "TARGET_VECTOR_ELEN_64")
> > (RVVM2DI "TARGET_VECTOR_ELEN_64") (RVVM1DI "TARGET_VECTOR_ELEN_64")
> > ])
> >
> > +(define_mode_iterator VI [ VI_NOFRAC (VI_FRAC "!TARGET_XTHEADVECTOR") ])
> > +
> > ;; This iterator is the same as above but with TARGET_VECTOR_ELEN_FP_16
> > ;; changed to TARGET_ZVFH. TARGET_VECTOR_ELEN_FP_16 is also true for
> > ;; TARGET_ZVFHMIN while we actually want to disable all instructions apart
> > diff --git a/gcc/testsuite/gcc.target/riscv/rvv/xtheadvector/pr114194.c b/gcc/testsuite/gcc.target/riscv/rvv/xtheadvector/pr114194.c
> > new file mode 100644
> > index 00000000000..fc2d1349425
> > --- /dev/null
> > +++ b/gcc/testsuite/gcc.target/riscv/rvv/xtheadvector/pr114194.c
> > @@ -0,0 +1,56 @@
> > +/* { dg-do compile } */
> > +/* { dg-options "-march=rv32gc_xtheadvector" { target { rv32 } } } */
> > +/* { dg-options "-march=rv64gc_xtheadvector" { target { rv64 } } } */
> > +/* { dg-final { check-function-bodies "**" "" } } */
> > +
> > +/*
> > +** foo0_1:
> > +** sb\tzero,0([a-x0-9]+)
> > +** ret
> > +*/
> > +void foo0_1 (void *p)
> > +{
> > + __builtin_memset (p, 0, 1);
> > +}
> > +
> > +/*
> > +** foo0_7:
> > +** sb\tzero,0([a-x0-9]+)
> > +** sb\tzero,1([a-x0-9]+)
> > +** sb\tzero,2([a-x0-9]+)
> > +** sb\tzero,3([a-x0-9]+)
> > +** sb\tzero,4([a-x0-9]+)
> > +** sb\tzero,5([a-x0-9]+)
> > +** sb\tzero,6([a-x0-9]+)
> > +** ret
> > +*/
> > +void foo0_7 (void *p)
> > +{
> > + __builtin_memset (p, 0, 7);
> > +}
> > +
> > +/*
> > +** foo1_1:
> > +** li\t[a-x0-9]+,1
> > +** sb\t[a-x0-9]+,0([a-x0-9]+)
> > +** ret
> > +*/
> > +void foo1_1 (void *p)
> > +{
> > + __builtin_memset (p, 1, 1);
> > +}
> > +
> > +/*
> > +** foo1_5:
> > +** li\t[a-x0-9]+,1
> > +** sb\t[a-x0-9]+,0([a-x0-9]+)
> > +** sb\t[a-x0-9]+,1([a-x0-9]+)
> > +** sb\t[a-x0-9]+,2([a-x0-9]+)
> > +** sb\t[a-x0-9]+,3([a-x0-9]+)
> > +** sb\t[a-x0-9]+,4([a-x0-9]+)
> > +** ret
> > +*/
> > +void foo1_5 (void *p)
> > +{
> > + __builtin_memset (p, 1, 5);
> > +}
> > --
> > 2.44.0
> >
> >
^ permalink raw reply [flat|nested] 5+ messages in thread
end of thread, other threads:[~2024-03-22 8:00 UTC | newest]
Thread overview: 5+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2024-03-21 23:45 [PATCH] RISC-V: Don't add fractional LMUL types to V_VLS for XTheadVector Christoph Müllner
2024-03-22 1:17 ` juzhe.zhong
2024-03-22 6:54 ` Christoph Müllner
2024-03-22 3:43 ` Bruce Hoult
2024-03-22 7:41 ` Christoph Müllner
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).