Cool. Thank you all for this, have a nice day!

Pan

From: juzhe.zhong@rivai.ai <juzhe.zhong@rivai.ai>
Sent: Wednesday, March 29, 2023 5:35 PM
To: rguenther <rguenther@suse.de>; Li, Pan2 <pan2.li@intel.com>
Cc: gcc-patches <gcc-patches@gcc.gnu.org>; Kito.cheng <kito.cheng@sifive.com>; Wang, Yanzhang <yanzhang.wang@intel.com>
Subject: Re: Re: [PATCH v2] RISC-V: Bugfix for RVV vbool*_t vn_reference_equal.

Thanks Richard && Pan.

Pan has passed the bootstrap and I will merge this patch when GCC 14 is open (I have write access now).

________________________________
juzhe.zhong@rivai.ai<mailto:juzhe.zhong@rivai.ai>

From: Richard Biener<mailto:rguenther@suse.de>
Date: 2023-03-29 17:24
To: pan2.li<mailto:pan2.li@intel.com>
CC: gcc-patches<mailto:gcc-patches@gcc.gnu.org>; juzhe.zhong<mailto:juzhe.zhong@rivai.ai>; kito.cheng<mailto:kito.cheng@sifive.com>; yanzhang.wang<mailto:yanzhang.wang@intel.com>
Subject: Re: [PATCH v2] RISC-V: Bugfix for RVV vbool*_t vn_reference_equal.
On Wed, 29 Mar 2023, pan2.li@intel.com<mailto:pan2.li@intel.com> wrote:

> From: Pan Li <pan2.li@intel.com<mailto:pan2.li@intel.com>>
>
> In most architecture the precision_size of vbool*_t types are caculated
> like as the multiple of the type size.  For example:
> precision_size = type_size * 8 (aka, bit count per bytes).
>
> Unfortunately, some architecture like RISC-V will adjust the precision_size
> for the vbool*_t in order to align the ISA. For example as below.
> type_size      = [1, 1, 1, 1,  2,  4,  8]
> precision_size = [1, 2, 4, 8, 16, 32, 64]
>
> Then the precision_size of RISC-V vbool*_t will not be the multiple of the
> type_size. This PATCH try to enrich this case when comparing the vn_reference.
>
> Given we have the below code:
> void test_vbool8_then_vbool16(int8_t * restrict in, int8_t * restrict out) {
>     vbool8_t v1 = *(vbool8_t*)in;
>     vbool16_t v2 = *(vbool16_t*)in;
>
>     *(vbool8_t*)(out + 100) = v1;
>     *(vbool16_t*)(out + 200) = v2;
> }
>
> Before this PATCH:
> csrr    t0,vlenb
> slli    t1,t0,1
> csrr    a3,vlenb
> sub     sp,sp,t1
> slli    a4,a3,1
> add     a4,a4,sp
> addi    a2,a1,100
> vsetvli a5,zero,e8,m1,ta,ma
> sub     a3,a4,a3
> vlm.v   v24,0(a0)
> vsm.v   v24,0(a2)
> vsm.v   v24,0(a3)
> addi    a1,a1,200
> csrr    t0,vlenb
> vsetvli a4,zero,e8,mf2,ta,ma
> slli    t1,t0,1
> vlm.v   v24,0(a3)
> vsm.v   v24,0(a1)
> add     sp,sp,t1
> jr      ra
>
> After this PATCH:
> addi    a3,a1,100
> vsetvli a4,zero,e8,m1,ta,ma
> addi    a1,a1,200
> vlm.v   v24,0(a0)
> vsm.v   v24,0(a3)
> vsetvli a5,zero,e8,mf2,ta,ma
> vlm.v   v24,0(a0)
> vsm.v   v24,0(a1)
> ret

OK if this passes bootstrap / regtest.

Thanks,
Richard.

> PR 109272
>
> gcc/ChangeLog:
>
> * tree-ssa-sccvn.cc (vn_reference_eq):
>
> gcc/testsuite/ChangeLog:
>
> * gcc.target/riscv/rvv/base/pr108185-4.c:
> * gcc.target/riscv/rvv/base/pr108185-5.c:
> * gcc.target/riscv/rvv/base/pr108185-6.c:
>
> Signed-off-by: Pan Li <pan2.li@intel.com<mailto:pan2.li@intel.com>>
> ---
>  .../gcc.target/riscv/rvv/base/pr108185-4.c    |  2 +-
>  .../gcc.target/riscv/rvv/base/pr108185-5.c    |  2 +-
>  .../gcc.target/riscv/rvv/base/pr108185-6.c    |  2 +-
>  gcc/tree-ssa-sccvn.cc                         | 20 +++++++++++++++++++
>  4 files changed, 23 insertions(+), 3 deletions(-)
>
> diff --git a/gcc/testsuite/gcc.target/riscv/rvv/base/pr108185-4.c b/gcc/testsuite/gcc.target/riscv/rvv/base/pr108185-4.c
> index ea3c360d756..e70284fada8 100644
> --- a/gcc/testsuite/gcc.target/riscv/rvv/base/pr108185-4.c
> +++ b/gcc/testsuite/gcc.target/riscv/rvv/base/pr108185-4.c
> @@ -65,4 +65,4 @@ test_vbool8_then_vbool64(int8_t * restrict in, int8_t * restrict out) {
>  /* { dg-final { scan-assembler-times {vsetvli\s+[a-x][0-9]+,\s*zero,\s*e8,\s*mf4,\s*ta,\s*ma} 1 } } */
>  /* { dg-final { scan-assembler-times {vsetvli\s+[a-x][0-9]+,\s*zero,\s*e8,\s*mf8,\s*ta,\s*ma} 1 } } */
>  /* { dg-final { scan-assembler-times {vlm\.v\s+v[0-9]+,\s*0\([a-x][0-9]+\)} 12 } } */
> -/* { dg-final { scan-assembler-times {vsm\.v\s+v[0-9]+,\s*0\([a-x][0-9]+\)} 15 } } */
> +/* { dg-final { scan-assembler-times {vsm\.v\s+v[0-9]+,\s*0\([a-x][0-9]+\)} 12 } } */
> diff --git a/gcc/testsuite/gcc.target/riscv/rvv/base/pr108185-5.c b/gcc/testsuite/gcc.target/riscv/rvv/base/pr108185-5.c
> index 9fc659d2402..575a7842cdf 100644
> --- a/gcc/testsuite/gcc.target/riscv/rvv/base/pr108185-5.c
> +++ b/gcc/testsuite/gcc.target/riscv/rvv/base/pr108185-5.c
> @@ -65,4 +65,4 @@ test_vbool16_then_vbool64(int8_t * restrict in, int8_t * restrict out) {
>  /* { dg-final { scan-assembler-times {vsetvli\s+[a-x][0-9]+,\s*zero,\s*e8,\s*mf4,\s*ta,\s*ma} 1 } } */
>  /* { dg-final { scan-assembler-times {vsetvli\s+[a-x][0-9]+,\s*zero,\s*e8,\s*mf8,\s*ta,\s*ma} 1 } } */
>  /* { dg-final { scan-assembler-times {vlm\.v\s+v[0-9]+,\s*0\([a-x][0-9]+\)} 12 } } */
> -/* { dg-final { scan-assembler-times {vsm\.v\s+v[0-9]+,\s*0\([a-x][0-9]+\)} 14 } } */
> +/* { dg-final { scan-assembler-times {vsm\.v\s+v[0-9]+,\s*0\([a-x][0-9]+\)} 12 } } */
> diff --git a/gcc/testsuite/gcc.target/riscv/rvv/base/pr108185-6.c b/gcc/testsuite/gcc.target/riscv/rvv/base/pr108185-6.c
> index 98275e5267d..95a11d37016 100644
> --- a/gcc/testsuite/gcc.target/riscv/rvv/base/pr108185-6.c
> +++ b/gcc/testsuite/gcc.target/riscv/rvv/base/pr108185-6.c
> @@ -65,4 +65,4 @@ test_vbool32_then_vbool64(int8_t * restrict in, int8_t * restrict out) {
>  /* { dg-final { scan-assembler-times {vsetvli\s+[a-x][0-9]+,\s*zero,\s*e8,\s*mf2,\s*ta,\s*ma} 1 } } */
>  /* { dg-final { scan-assembler-times {vsetvli\s+[a-x][0-9]+,\s*zero,\s*e8,\s*mf8,\s*ta,\s*ma} 1 } } */
>  /* { dg-final { scan-assembler-times {vlm\.v\s+v[0-9]+,\s*0\([a-x][0-9]+\)} 12 } } */
> -/* { dg-final { scan-assembler-times {vsm\.v\s+v[0-9]+,\s*0\([a-x][0-9]+\)} 13 } } */
> +/* { dg-final { scan-assembler-times {vsm\.v\s+v[0-9]+,\s*0\([a-x][0-9]+\)} 12 } } */
> diff --git a/gcc/tree-ssa-sccvn.cc b/gcc/tree-ssa-sccvn.cc
> index 6b8d38b270c..567df3cb2c6 100644
> --- a/gcc/tree-ssa-sccvn.cc
> +++ b/gcc/tree-ssa-sccvn.cc
> @@ -799,6 +799,26 @@ vn_reference_eq (const_vn_reference_t const vr1, const_vn_reference_t const vr2)
>     && (TYPE_PRECISION (vr2->type)
>         != TREE_INT_CST_LOW (TYPE_SIZE (vr2->type))))
>      return false;
> +  else if (VECTOR_BOOLEAN_TYPE_P (vr1->type)
> +    && VECTOR_BOOLEAN_TYPE_P (vr2->type))
> +    {
> +      /* Vector boolean types can have padding, verify we are dealing with
> + the same number of elements, aka the precision of the types.
> + For example, In most architecture the precision_size of vbool*_t
> + types are caculated like below:
> + precision_size = type_size * 8
> +
> + Unfortunately, the RISC-V will adjust the precision_size for the
> + vbool*_t in order to align the ISA as below:
> + type_size      = [1, 1, 1, 1,  2,  4,  8]
> + precision_size = [1, 2, 4, 8, 16, 32, 64]
> +
> + Then the precision_size of RISC-V vbool*_t will not be the multiple
> + of the type_size.  We take care of this case consolidated here.  */
> +      if (maybe_ne (TYPE_VECTOR_SUBPARTS (vr1->type),
> +     TYPE_VECTOR_SUBPARTS (vr2->type)))
> + return false;
> +    }
>
>    i = 0;
>    j = 0;
>

--
Richard Biener <rguenther@suse.de<mailto:rguenther@suse.de>>
SUSE Software Solutions Germany GmbH, Frankenstrasse 146, 90461 Nuernberg,
Germany; GF: Ivo Totev, Andrew Myers, Andrew McDonald, Boudien Moerman;
HRB 36809 (AG Nuernberg)