From: "juzhe.zhong@rivai.ai" <juzhe.zhong@rivai.ai>
To: jeffreyalaw <jeffreyalaw@gmail.com>,
gcc-patches <gcc-patches@gcc.gnu.org>
Cc: kito.cheng <kito.cheng@gmail.com>,
Kito.cheng <kito.cheng@sifive.com>, palmer <palmer@dabbelt.com>,
palmer <palmer@rivosinc.com>,
"Robin Dapp" <rdapp.gcc@gmail.com>, pan2.li <pan2.li@intel.com>
Subject: Re: Re: [PATCH V2] RISC-V: Support RVV VLA SLP auto-vectorization
Date: Tue, 13 Jun 2023 10:27:10 +0800 [thread overview]
Message-ID: <B94725CABFEB1972+2023061310271005805912@rivai.ai> (raw)
In-Reply-To: <03469f7f-0812-ab09-e2e9-607689bd1dca@gmail.com>
Ok.
https://gcc.gnu.org/pipermail/gcc-patches/2023-June/thread.html
I have added comments as you suggested.
juzhe.zhong@rivai.ai
From: Jeff Law
Date: 2023-06-13 07:21
To: juzhe.zhong; gcc-patches
CC: kito.cheng; kito.cheng; palmer; palmer; rdapp.gcc; pan2.li
Subject: Re: [PATCH V2] RISC-V: Support RVV VLA SLP auto-vectorization
On 6/6/23 21:19, juzhe.zhong@rivai.ai wrote:
> From: Juzhe-Zhong <juzhe.zhong@rivai.ai>
>
> This patch enables basic VLA SLP auto-vectorization.
> Consider this following case:
> void
> f (uint8_t *restrict a, uint8_t *restrict b)
> {
>   for (int i = 0; i < 100; ++i)
>     {
>       a[i * 8 + 0] = b[i * 8 + 7] + 1;
>       a[i * 8 + 1] = b[i * 8 + 7] + 2;
>       a[i * 8 + 2] = b[i * 8 + 7] + 8;
>       a[i * 8 + 3] = b[i * 8 + 7] + 4;
>       a[i * 8 + 4] = b[i * 8 + 7] + 5;
>       a[i * 8 + 5] = b[i * 8 + 7] + 6;
>       a[i * 8 + 6] = b[i * 8 + 7] + 7;
>       a[i * 8 + 7] = b[i * 8 + 7] + 3;
>     }
> }
>
> To enable VLA SLP auto-vectorization, we should be able to handle the following const vectors:
>
> 1. NPATTERNS = 8, NELTS_PER_PATTERN = 3.
> { 0, 0, 0, 0, 0, 0, 0, 0, 8, 8, 8, 8, 8, 8, 8, 8, 16, 16, 16, 16, 16, 16, 16, 16, ... }
>
> 2. NPATTERNS = 8, NELTS_PER_PATTERN = 1.
> { 1, 2, 8, 4, 5, 6, 7, 3, ... }
>
> And these vectors can be generated in the prologue.
>
> After this patch, we end up with this following codegen:
>
> Prologue:
> ...
> vsetvli a7,zero,e16,m2,ta,ma
> vid.v v4
> vsrl.vi v4,v4,3
> li a3,8
> vmul.vx v4,v4,a3 ===> v4 = { 0, 0, 0, 0, 0, 0, 0, 0, 8, 8, 8, 8, 8, 8, 8, 8, 16, 16, 16, 16, 16, 16, 16, 16, ... }
> ...
> li t1,67633152
> addi t1,t1,513
> li a3,50790400
> addi a3,a3,1541
> slli a3,a3,32
> add a3,a3,t1
> vsetvli t1,zero,e64,m1,ta,ma
> vmv.v.x v3,a3 ===> v3 = { 1, 2, 8, 4, 5, 6, 7, 3, ... }
> ...
> LoopBody:
> ...
> min a3,...
> vsetvli zero,a3,e8,m1,ta,ma
> vle8.v v2,0(a6)
> vsetvli a7,zero,e8,m1,ta,ma
> vrgatherei16.vv v1,v2,v4
> vadd.vv v1,v1,v3
> vsetvli zero,a3,e8,m1,ta,ma
> vse8.v v1,0(a2)
> add a6,a6,a4
> add a2,a2,a4
> mv a3,a5
> add a5,a5,t1
> bgtu a3,a4,.L3
> ...
>
> Note: we need to use "vrgatherei16.vv" instead of "vrgather.vv" for SEW = 8, since "vrgatherei16.vv" can cover a larger
> range than "vrgather.vv" (whose maximum element index is 255).
> Epilogue:
> lbu a5,799(a1)
> addiw a4,a5,1
> sb a4,792(a0)
> addiw a4,a5,2
> sb a4,793(a0)
> addiw a4,a5,8
> sb a4,794(a0)
> addiw a4,a5,4
> sb a4,795(a0)
> addiw a4,a5,5
> sb a4,796(a0)
> addiw a4,a5,6
> sb a4,797(a0)
> addiw a4,a5,7
> sb a4,798(a0)
> addiw a5,a5,3
> sb a5,799(a0)
> ret
>
> One last thing we still need is "Epilogue auto-vectorization", which requires VLS modes support.
> I will support VLS modes for "Epilogue auto-vectorization" in the future.
>
> gcc/ChangeLog:
>
> * config/riscv/riscv-protos.h (expand_vec_perm_const): New function.
> * config/riscv/riscv-v.cc (rvv_builder::can_duplicate_repeating_sequence_p): Support POLY handling.
> (rvv_builder::single_step_npatterns_p): New function.
> (rvv_builder::npatterns_all_equal_p): Ditto.
> (const_vec_all_in_range_p): Support POLY handling.
> (gen_const_vector_dup): Ditto.
> (emit_vlmax_gather_insn): Add vrgatherei16.
> (emit_vlmax_masked_gather_mu_insn): Ditto.
> (expand_const_vector): Add VLA SLP const vector support.
> (expand_vec_perm): Support POLY.
> (struct expand_vec_perm_d): New struct.
> (shuffle_generic_patterns): New function.
> (expand_vec_perm_const_1): Ditto.
> (expand_vec_perm_const): Ditto.
> * config/riscv/riscv.cc (riscv_vectorize_vec_perm_const): Ditto.
> (TARGET_VECTORIZE_VEC_PERM_CONST): New targethook.
>
> gcc/testsuite/ChangeLog:
>
> * gcc.target/riscv/rvv/autovec/scalable-1.c: Adapt testcase for VLA vectorizer.
> * gcc.target/riscv/rvv/autovec/v-1.c: Ditto.
> * gcc.target/riscv/rvv/autovec/zve32f_zvl128b-1.c: Ditto.
> * gcc.target/riscv/rvv/autovec/zve32x_zvl128b-1.c: Ditto.
> * gcc.target/riscv/rvv/autovec/zve64d-1.c: Ditto.
> * gcc.target/riscv/rvv/autovec/zve64d_zvl128b-1.c: Ditto.
> * gcc.target/riscv/rvv/autovec/zve64f-1.c: Ditto.
> * gcc.target/riscv/rvv/autovec/zve64f_zvl128b-1.c: Ditto.
> * gcc.target/riscv/rvv/autovec/zve64x_zvl128b-1.c: Ditto.
> * gcc.target/riscv/rvv/autovec/partial/slp-1.c: New test.
> * gcc.target/riscv/rvv/autovec/partial/slp-2.c: New test.
> * gcc.target/riscv/rvv/autovec/partial/slp-3.c: New test.
> * gcc.target/riscv/rvv/autovec/partial/slp-4.c: New test.
> * gcc.target/riscv/rvv/autovec/partial/slp-5.c: New test.
> * gcc.target/riscv/rvv/autovec/partial/slp-6.c: New test.
> * gcc.target/riscv/rvv/autovec/partial/slp-7.c: New test.
> * gcc.target/riscv/rvv/autovec/partial/slp_run-1.c: New test.
> * gcc.target/riscv/rvv/autovec/partial/slp_run-2.c: New test.
> * gcc.target/riscv/rvv/autovec/partial/slp_run-3.c: New test.
> * gcc.target/riscv/rvv/autovec/partial/slp_run-4.c: New test.
> * gcc.target/riscv/rvv/autovec/partial/slp_run-5.c: New test.
> * gcc.target/riscv/rvv/autovec/partial/slp_run-6.c: New test.
> * gcc.target/riscv/rvv/autovec/partial/slp_run-7.c: New test.
>
> +}
> +
> +/* Return true if all elements of NPATTERNS are equal.
> +
> + E.g. NPATTERNS = 4:
> + { 2, 2, 2, 2, 4, 4, 4, 4, 8, 8, 8, 8, 16, 16, 16, 16, ... }
> + E.g. NPATTERNS = 8:
> + { 2, 2, 2, 2, 2, 2, 2, 2, 8, 8, 8, 8, 8, 8, 8, 8, ... }
> +*/
> +bool
> +rvv_builder::npatterns_all_equal_p () const
> +{
> +  poly_int64 ele0 = rtx_to_poly_int64 (elt (0));
> +  for (unsigned int i = 1; i < npatterns (); i++)
> +    {
> +      poly_int64 ele = rtx_to_poly_int64 (elt (i));
> +      if (!known_eq (ele, ele0))
> +	return false;
> +    }
> +  return true;
> +}
There seems to be a disconnect here. You only seem to check the first
NPATTERN elements. Don't you need to check the rest? Or am I just
getting confused by the function comment?
> +
> +static bool
> +expand_vec_perm_const_1 (struct expand_vec_perm_d *d)
Needs a function comment.
>
> +
> +bool
> +expand_vec_perm_const (machine_mode vmode, machine_mode op_mode, rtx target,
> +		       rtx op0, rtx op1, const vec_perm_indices &sel)
Similarly.
Overall it looks really good. Just a couple comments to fix and sort
out whether or not I'm misinterpreting rvv_builder::npatterns_all_equal_p.
Jeff
next prev parent reply other threads:[~2023-06-13 2:27 UTC|newest]
Thread overview: 4+ messages
2023-06-07 3:19 juzhe.zhong
2023-06-12 23:21 ` Jeff Law
2023-06-13 2:27 ` juzhe.zhong [this message]
2023-06-13 14:06 ` Jeff Law