Do you mean for ARM SVE, these tests need to be specified as only ARM SVE ?

Actually, for RVV, is same situation as ARM. We are using VLS modes (fixed-length vectors) to vectorize these cases so that they are XPASS.

The difference between RVV and ARM is that: variable-length and fixed-length vectors are both valid on RVV, using same RVV ISA.
Wheras, for ARM, variable-length vectors use SVE ISA but fixed-length vectors use NEON ISA.




juzhe.zhong@rivai.ai
 
From: Tamar Christina
Date: 2023-12-19 20:29
To: Juzhe-Zhong; gcc-patches@gcc.gnu.org
CC: rguenther@suse.de
Subject: RE: [PATCH] Regression FIX: Remove vect_variable_length XFAIL from some tests
Hi Juzhe,
 
> -----Original Message-----
> From: Juzhe-Zhong <juzhe.zhong@rivai.ai>
> Sent: Tuesday, December 19, 2023 11:19 AM
> To: gcc-patches@gcc.gnu.org
> Cc: rguenther@suse.de; Tamar Christina <Tamar.Christina@arm.com>; Juzhe-
> Zhong <juzhe.zhong@rivai.ai>
> Subject: [PATCH] Regression FIX: Remove vect_variable_length XFAIL from some
> tests
> 
> Hi, this patch fixes these following regression FAILs on RVV:
> 
> XPASS: gcc.dg/tree-ssa/pr84512.c scan-tree-dump optimized "return 285;"
> XPASS: gcc.dg/vect/bb-slp-43.c -flto -ffat-lto-objects  scan-tree-dump-not slp2
> "vector operands from scalars"
> XPASS: gcc.dg/vect/bb-slp-43.c scan-tree-dump-not slp2 "vector operands from
> scalars"
> XPASS: gcc.dg/vect/bb-slp-subgroups-3.c -flto -ffat-lto-objects  scan-tree-dump-
> times slp2 "optimized: basic block" 2
> XPASS: gcc.dg/vect/bb-slp-subgroups-3.c scan-tree-dump-times slp2 "optimized:
> basic block" 2
> 
> Since vect_variable_length are available for ARM SVE and RVV, I just use compiler
> explorer to confirm ARM SVE same as
> RVV.
> 
> Hi, @Tamar. Could you double check whether this patch fix is reasonable to you ?
> 
 
Hmm I would be surprised if this is working correctly for RVV since as far as I know we don't have
variable length support in SLP i.e. SLP can't predicate operation during build so the
current vectorizer only supports fixed length vector SLP, unless Richi did some magic?
 
For SVE the reason this XPASS is because the compiler will fallback to NEON unless it's
told it can't.  But that's not actually testing VLA SLP.
 
i.e. https://godbolt.org/z/5n5fWahxh  just using `+sve` isn't enough and it has to be told
it can only use SVE.  Is it perhaps something similar for RVV?
 
If RVV has a similar param, perhaps the correct fix is to append it to the tests so they
XFAIL correctly?
 
Regards,
Tamar
 
> And.
> 
> Hi, @Richard. Is this patch Ok for trunk if this patch fixes regression for both RVV
> and ARM SVE.
> 
> gcc/testsuite/ChangeLog:
> 
> * gcc.dg/tree-ssa/pr84512.c: Remove vect_variable_length XFAIL.
> * gcc.dg/vect/bb-slp-43.c: Ditto.
> * gcc.dg/vect/bb-slp-subgroups-3.c: Ditto.
> 
> ---
>  gcc/testsuite/gcc.dg/tree-ssa/pr84512.c        | 2 +-
>  gcc/testsuite/gcc.dg/vect/bb-slp-43.c          | 2 +-
>  gcc/testsuite/gcc.dg/vect/bb-slp-subgroups-3.c | 2 +-
>  3 files changed, 3 insertions(+), 3 deletions(-)
> 
> diff --git a/gcc/testsuite/gcc.dg/tree-ssa/pr84512.c b/gcc/testsuite/gcc.dg/tree-
> ssa/pr84512.c
> index 496c78b28dc..3c027012670 100644
> --- a/gcc/testsuite/gcc.dg/tree-ssa/pr84512.c
> +++ b/gcc/testsuite/gcc.dg/tree-ssa/pr84512.c
> @@ -13,4 +13,4 @@ int foo()
>  }
> 
>  /* Listed targets xfailed due to PR84958.  */
> -/* { dg-final { scan-tree-dump "return 285;" "optimized" { xfail { amdgcn*-*-* ||
> vect_variable_length } } } } */
> +/* { dg-final { scan-tree-dump "return 285;" "optimized" { xfail { amdgcn*-*-* } } }
> } */
> diff --git a/gcc/testsuite/gcc.dg/vect/bb-slp-43.c b/gcc/testsuite/gcc.dg/vect/bb-
> slp-43.c
> index dad2d24262d..40bd2e0dfbf 100644
> --- a/gcc/testsuite/gcc.dg/vect/bb-slp-43.c
> +++ b/gcc/testsuite/gcc.dg/vect/bb-slp-43.c
> @@ -14,4 +14,4 @@ f (int *restrict x, short *restrict y)
>  }
> 
>  /* { dg-final { scan-tree-dump-not "mixed mask and nonmask" "slp2" } } */
> -/* { dg-final { scan-tree-dump-not "vector operands from scalars" "slp2" { target {
> { vect_int && vect_bool_cmp } && { vect_unpack && vect_hw_misalign } } xfail {
> vect_variable_length && { ! vect256 } } } } } */
> +/* { dg-final { scan-tree-dump-not "vector operands from scalars" "slp2" { target {
> { vect_int && vect_bool_cmp } && { vect_unpack && vect_hw_misalign } } } } } */
> diff --git a/gcc/testsuite/gcc.dg/vect/bb-slp-subgroups-3.c
> b/gcc/testsuite/gcc.dg/vect/bb-slp-subgroups-3.c
> index fb719915db7..3f0d45ce4a1 100644
> --- a/gcc/testsuite/gcc.dg/vect/bb-slp-subgroups-3.c
> +++ b/gcc/testsuite/gcc.dg/vect/bb-slp-subgroups-3.c
> @@ -42,7 +42,7 @@ main (int argc, char **argv)
>  /* Because we disable the cost model, targets with variable-length
>     vectors can end up vectorizing the store to a[0..7] on its own.
>     With the cost model we do something sensible.  */
> -/* { dg-final { scan-tree-dump-times "optimized: basic block" 2 "slp2" { target { !
> amdgcn-*-* } xfail vect_variable_length } } } */
> +/* { dg-final { scan-tree-dump-times "optimized: basic block" 2 "slp2" { target { !
> amdgcn-*-* } } } } */
> 
>  /* amdgcn can do this in one vector.  */
>  /* { dg-final { scan-tree-dump-times "optimized: basic block" 1 "slp2" { target
> amdgcn-*-* } } } */
> --
> 2.36.3