From: "juzhe.zhong@rivai.ai" <juzhe.zhong@rivai.ai>
To: jeffreyalaw <jeffreyalaw@gmail.com>,
gcc-patches <gcc-patches@gcc.gnu.org>
Cc: kito.cheng <kito.cheng@gmail.com>,
Kito.cheng <kito.cheng@sifive.com>, palmer <palmer@dabbelt.com>,
palmer <palmer@rivosinc.com>,
"Robin Dapp" <rdapp.gcc@gmail.com>, pan2.li <pan2.li@intel.com>
Subject: Re: Re: [PATCH V2] RISC-V: Support RVV VLA SLP auto-vectorization
Date: Tue, 13 Jun 2023 10:27:10 +0800 [thread overview]
Message-ID: <B94725CABFEB1972+2023061310271005805912@rivai.ai> (raw)
In-Reply-To: <03469f7f-0812-ab09-e2e9-607689bd1dca@gmail.com>
Ok.
https://gcc.gnu.org/pipermail/gcc-patches/2023-June/thread.html
I have added comments as you suggested.
juzhe.zhong@rivai.ai
From: Jeff Law
Date: 2023-06-13 07:21
To: juzhe.zhong; gcc-patches
CC: kito.cheng; kito.cheng; palmer; palmer; rdapp.gcc; pan2.li
Subject: Re: [PATCH V2] RISC-V: Support RVV VLA SLP auto-vectorization
On 6/6/23 21:19, juzhe.zhong@rivai.ai wrote:
> From: Juzhe-Zhong <juzhe.zhong@rivai.ai>
>
> This patch enables basic VLA SLP auto-vectorization.
> Consider this following case:
> void
> f (uint8_t *restrict a, uint8_t *restrict b)
> {
>   for (int i = 0; i < 100; ++i)
>     {
>       a[i * 8 + 0] = b[i * 8 + 7] + 1;
>       a[i * 8 + 1] = b[i * 8 + 7] + 2;
>       a[i * 8 + 2] = b[i * 8 + 7] + 8;
>       a[i * 8 + 3] = b[i * 8 + 7] + 4;
>       a[i * 8 + 4] = b[i * 8 + 7] + 5;
>       a[i * 8 + 5] = b[i * 8 + 7] + 6;
>       a[i * 8 + 6] = b[i * 8 + 7] + 7;
>       a[i * 8 + 7] = b[i * 8 + 7] + 3;
>     }
> }
>
> To enable VLA SLP auto-vectorization, we should be able to handle the following const vectors:
>
> 1. NPATTERNS = 8, NELTS_PER_PATTERN = 3.
> { 0, 0, 0, 0, 0, 0, 0, 0, 8, 8, 8, 8, 8, 8, 8, 8, 16, 16, 16, 16, 16, 16, 16, 16, ... }
>
> 2. NPATTERNS = 8, NELTS_PER_PATTERN = 1.
> { 1, 2, 8, 4, 5, 6, 7, 3, ... }
>
> And these vectors can be generated in the prologue.
>
> After this patch, we end up with this following codegen:
>
> Prologue:
> ...
> vsetvli a7,zero,e16,m2,ta,ma
> vid.v v4
> vsrl.vi v4,v4,3
> li a3,8
> vmul.vx v4,v4,a3 ===> v4 = { 0, 0, 0, 0, 0, 0, 0, 0, 8, 8, 8, 8, 8, 8, 8, 8, 16, 16, 16, 16, 16, 16, 16, 16, ... }
> ...
> li t1,67633152
> addi t1,t1,513
> li a3,50790400
> addi a3,a3,1541
> slli a3,a3,32
> add a3,a3,t1
> vsetvli t1,zero,e64,m1,ta,ma
> vmv.v.x v3,a3 ===> v3 = { 1, 2, 8, 4, 5, 6, 7, 3, ... }
> ...
> LoopBody:
> ...
> min a3,...
> vsetvli zero,a3,e8,m1,ta,ma
> vle8.v v2,0(a6)
> vsetvli a7,zero,e8,m1,ta,ma
> vrgatherei16.vv v1,v2,v4
> vadd.vv v1,v1,v3
> vsetvli zero,a3,e8,m1,ta,ma
> vse8.v v1,0(a2)
> add a6,a6,a4
> add a2,a2,a4
> mv a3,a5
> add a5,a5,t1
> bgtu a3,a4,.L3
> ...
>
> Note: we need to use "vrgatherei16.vv" instead of "vrgather.vv" for SEW = 8, since "vrgatherei16.vv" can cover a larger
> range than "vrgather.vv" (whose maximum element index is 255).
> Epilogue:
> lbu a5,799(a1)
> addiw a4,a5,1
> sb a4,792(a0)
> addiw a4,a5,2
> sb a4,793(a0)
> addiw a4,a5,8
> sb a4,794(a0)
> addiw a4,a5,4
> sb a4,795(a0)
> addiw a4,a5,5
> sb a4,796(a0)
> addiw a4,a5,6
> sb a4,797(a0)
> addiw a4,a5,7
> sb a4,798(a0)
> addiw a5,a5,3
> sb a5,799(a0)
> ret
>
> One last thing we still need is "Epilogue auto-vectorization", which requires VLS modes support.
> I will support VLS modes for "Epilogue auto-vectorization" in the future.
>
> gcc/ChangeLog:
>
> * config/riscv/riscv-protos.h (expand_vec_perm_const): New function.
> * config/riscv/riscv-v.cc (rvv_builder::can_duplicate_repeating_sequence_p): Support POLY handling.
> (rvv_builder::single_step_npatterns_p): New function.
> (rvv_builder::npatterns_all_equal_p): Ditto.
> (const_vec_all_in_range_p): Support POLY handling.
> (gen_const_vector_dup): Ditto.
> (emit_vlmax_gather_insn): Add vrgatherei16.
> (emit_vlmax_masked_gather_mu_insn): Ditto.
> (expand_const_vector): Add VLA SLP const vector support.
> (expand_vec_perm): Support POLY.
> (struct expand_vec_perm_d): New struct.
> (shuffle_generic_patterns): New function.
> (expand_vec_perm_const_1): Ditto.
> (expand_vec_perm_const): Ditto.
> * config/riscv/riscv.cc (riscv_vectorize_vec_perm_const): Ditto.
> (TARGET_VECTORIZE_VEC_PERM_CONST): New targethook.
>
> gcc/testsuite/ChangeLog:
>
> * gcc.target/riscv/rvv/autovec/scalable-1.c: Adapt testcase for VLA vectorizer.
> * gcc.target/riscv/rvv/autovec/v-1.c: Ditto.
> * gcc.target/riscv/rvv/autovec/zve32f_zvl128b-1.c: Ditto.
> * gcc.target/riscv/rvv/autovec/zve32x_zvl128b-1.c: Ditto.
> * gcc.target/riscv/rvv/autovec/zve64d-1.c: Ditto.
> * gcc.target/riscv/rvv/autovec/zve64d_zvl128b-1.c: Ditto.
> * gcc.target/riscv/rvv/autovec/zve64f-1.c: Ditto.
> * gcc.target/riscv/rvv/autovec/zve64f_zvl128b-1.c: Ditto.
> * gcc.target/riscv/rvv/autovec/zve64x_zvl128b-1.c: Ditto.
> * gcc.target/riscv/rvv/autovec/partial/slp-1.c: New test.
> * gcc.target/riscv/rvv/autovec/partial/slp-2.c: New test.
> * gcc.target/riscv/rvv/autovec/partial/slp-3.c: New test.
> * gcc.target/riscv/rvv/autovec/partial/slp-4.c: New test.
> * gcc.target/riscv/rvv/autovec/partial/slp-5.c: New test.
> * gcc.target/riscv/rvv/autovec/partial/slp-6.c: New test.
> * gcc.target/riscv/rvv/autovec/partial/slp-7.c: New test.
> * gcc.target/riscv/rvv/autovec/partial/slp_run-1.c: New test.
> * gcc.target/riscv/rvv/autovec/partial/slp_run-2.c: New test.
> * gcc.target/riscv/rvv/autovec/partial/slp_run-3.c: New test.
> * gcc.target/riscv/rvv/autovec/partial/slp_run-4.c: New test.
> * gcc.target/riscv/rvv/autovec/partial/slp_run-5.c: New test.
> * gcc.target/riscv/rvv/autovec/partial/slp_run-6.c: New test.
> * gcc.target/riscv/rvv/autovec/partial/slp_run-7.c: New test.
>
> +}
> +
> +/* Return true if all elements of NPATTERNS are equal.
> +
> + E.g. NPATTERNS = 4:
> + { 2, 2, 2, 2, 4, 4, 4, 4, 8, 8, 8, 8, 16, 16, 16, 16, ... }
> + E.g. NPATTERNS = 8:
> + { 2, 2, 2, 2, 2, 2, 2, 2, 8, 8, 8, 8, 8, 8, 8, 8, ... }
> +*/
> +bool
> +rvv_builder::npatterns_all_equal_p () const
> +{
> +  poly_int64 ele0 = rtx_to_poly_int64 (elt (0));
> +  for (unsigned int i = 1; i < npatterns (); i++)
> +    {
> +      poly_int64 ele = rtx_to_poly_int64 (elt (i));
> +      if (!known_eq (ele, ele0))
> +	return false;
> +    }
> +  return true;
> +}
There seems to be a disconnect here. You only seem to check the first
NPATTERN elements. Don't you need to check the rest? Or am I just
getting confused by the function comment?
> +
> +static bool
> +expand_vec_perm_const_1 (struct expand_vec_perm_d *d)
Needs a function comment.
>
> +
> +bool
> +expand_vec_perm_const (machine_mode vmode, machine_mode op_mode, rtx target,
> +		       rtx op0, rtx op1, const vec_perm_indices &sel)
Similarly.
Overall it looks really good. Just a couple comments to fix and sort
out whether or not I'm misinterpreting rvv_builder::npatterns_all_equal_p.
Jeff
next prev parent reply other threads:[~2023-06-13 2:27 UTC|newest]
Thread overview: 4+ messages
2023-06-07 3:19 juzhe.zhong
2023-06-12 23:21 ` Jeff Law
2023-06-13 2:27 ` juzhe.zhong [this message]
2023-06-13 14:06 ` Jeff Law