From: Richard Biener <rguenther@suse.de>
To: "juzhe.zhong@rivai.ai" <juzhe.zhong@rivai.ai>
Cc: linkw <linkw@linux.ibm.com>,
gcc-patches <gcc-patches@gcc.gnu.org>,
jeffreyalaw <jeffreyalaw@gmail.com>, rdapp <rdapp@linux.ibm.com>,
"richard.sandiford" <richard.sandiford@arm.com>
Subject: Re: Re: [PATCH] VECT: Add WHILE_LEN pattern for decrement IV support for auto-vectorization
Date: Fri, 14 Apr 2023 06:52:06 +0000 (UTC)
Message-ID: <nycvar.YFH.7.77.849.2304140647430.4466@jbgna.fhfr.qr>
In-Reply-To: <2E7B1DB75F2F78AE+2023041411394350100020@rivai.ai>

On Fri, 14 Apr 2023, juzhe.zhong@rivai.ai wrote:

> Also, I have already decided to remove the WHILE_LEN pattern since it seems to be unnecessary.
> As Richard said, it's just simple arithmetic and it's not worthwhile to add a pattern for that.
>
> So I plan to replace WHILE_LEN with MIN_EXPR and do everything RVV-specific in the RISC-V port.
> I think that is more reasonable for IBM's use and for other targets in the future.
>
> So this patch would need to change to "introduce a new flow for vectorized loop control", i.e. a new loop control flow
> that saturating-subtracts n down to zero, plus a target hook so that we can switch to this flow?
>
> Is that more reasonable?

I think we want to change the various IVs the vectorizer uses to
control the exit condition of prologue/vect/epilogue loops to a single
one counting the remaining _scalar_ iterations to zero.  Currently
it's somewhat of a mess, which also leads to expressions based on
derived values of such an IV that are difficult to CSE.

But yes, whether for example the vector loop control stmt should
be a test for zero mask (while-ult) or zero scalar iterations
(or (signed) <= zero) could be subject to a new target hook if it
isn't an obvious choice based on HW capability checks we can already
do.
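
As a minimal sketch of the single-IV scheme described above (not taken
from the patch or from GCC; VF, add_arrays and the scalar inner loop
standing in for len_load/len_store are all hypothetical), the loop
below is controlled by one counter of remaining scalar iterations that
runs down to zero, with the per-iteration length computed as a MIN:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical vectorization factor: elements handled per vector
   iteration.  */
enum { VF = 16 };

/* Add two int32 arrays using ONE induction variable, `remaining`,
   which counts the scalar iterations still to do down to zero.
   Returns the number of "vector" iterations executed.  */
size_t
add_arrays (int32_t *dst, const int32_t *a, const int32_t *b, size_t n)
{
  size_t remaining = n;
  size_t vector_iters = 0;

  /* Exit test: zero remaining scalar iterations.  */
  while (remaining != 0)
    {
      /* MIN_EXPR <remaining, VF>: active elements this iteration.  */
      size_t len = remaining < (size_t) VF ? remaining : (size_t) VF;
      assert (len != 0 && len <= remaining);

      /* Stands in for len_load / vector add / len_store.  */
      for (size_t i = 0; i < len; i++)
        dst[i] = a[i] + b[i];

      dst += len;
      a += len;
      b += len;
      remaining -= len;  /* Cannot wrap because len <= remaining.  */
      vector_iters++;
    }
  return vector_iters;
}
```

Because len never exceeds remaining, the subtraction behaves like a
saturating decrement, so the unsigned counter reaches exactly zero and
the exit test stays a plain comparison against zero.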

Richard.

> Thanks.
>
>
> juzhe.zhong@rivai.ai
>
> From: Kewen.Lin
> Date: 2023-04-14 10:54
> To: 钟居哲
> CC: gcc-patches; Jeff Law; rdapp; richard.sandiford; rguenther
> Subject: Re: [PATCH] VECT: Add WHILE_LEN pattern for decrement IV support for auto-vectorization
> Hi Juzhe,
>
> on 2023/4/13 21:44, 钟居哲 wrote:
> > Thanks Kewen.
> >
> > The current flow in this patch is, as you said:
> > ....
> > len = WHILE_LEN (n,vf);
> > ...
> > v = len_load (addr,len);
> > ..
> > addr = addr + vf (in bytes);
> > ....
> >
> > This patch just keeps advancing the address by the vectorization factor (scaled to bytes).
> > For example, if your vector length is 512 bits, this patch just updates the address as
> > addr = addr + 64;
> >
> > However, after reading the RVV ISA more closely today, it seems more appropriate to
> > update the address as: addr = addr + (len * 4), if len is the number of INT32 elements.
> > Here len is the result computed by WHILE_LEN.
>
> I just read your detailed explanation of how the vsetvli insn is used (I really appreciate that).
> It looks like this WHILE_LEN wants some more semantics than MIN, so I assume you still want
> to introduce this WHILE_LEN.
>
> >
> > I assume that for the IBM target it's better to just advance the address IV directly by the whole register size in bytes.
> > I think the second way (addr = addr + (len * 4)) is too RVV-specific and won't be suitable for IBM. Is that right?
>
> Yes, we just want to add the whole vector register length in bytes.
>
> > If so, I will keep the flow of this patch (and won't change it to addr = addr + (len * 4)) and see what else I need to do for IBM.
> > I would rather do that in the RISC-V backend port.
>
> IMHO, you don't need to push this down to the RV backend; just query the ports that have len_{load,store}
> support with a target hook or a special operand in the while_len optab (see internal_len_load_store_bias)
> for this need, and generate different code accordingly. IIUC, for WHILE_LEN, you want it to have
> the semantics of what vsetvli performs, but for IBM ports it would be just like MIN_EXPR, so maybe we
> can also generate MIN or WHILE_LEN based on this kind of target information.
>
> If the above assumption holds, I wonder if you also want WHILE_LEN to have the implicit effect
> of updating the vector length register? If yes, the code with multiple rgroups looks unexpected:
>
> + _76 = .WHILE_LEN (ivtmp_74, vf * nitems_per_ctrl);
> + _79 = .WHILE_LEN (ivtmp_77, vf * nitems_per_ctrl);
>
> as the latter one seems to override the former. Besides, if the given operands are known constants,
> it can't directly be folded into a constant and propagated further. From this perspective, Richi's
> suggestion on "tying the scalar result with the uses" looks better IMHO.
>
> >
> >>> I tried
> >>>to compile the above source files on Power, the former can adopt doloop
> >>>optimization but the latter fails to.
> > You mean GCC cannot do the hardware loop optimization when the loop-control IV is variable?
>
> No, in both cases the IV is variable. The dump at loop2_doloop for the proposed sequence says
> "Doloop: Possible infinite iteration case.", which seems to show that for the proposed sequence the compiler
> isn't able to figure out that the loop is finite. It may be missing range information on n, or it may not be
> able to analyze how the invariant is involved, but I didn't look into it; these are all guesses.
>
> BR,
> Kewen
>
>
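
The two address-update strategies discussed in the quoted text (advance
by the whole vector, i.e. addr + 64, versus advance by the processed
length, i.e. addr + len * 4) can be compared with a small simulation.
This is only a sketch under stated assumptions: VF, len_min,
len_balanced and fixed_step_mismatch are hypothetical names, addresses
are counted in INT32 elements rather than bytes, and len_balanced is
merely one illustrative choice of a length that may be smaller than VF
before the final iteration (which, as far as I understand, vsetvli is
permitted to return); it is not a model of any particular hardware.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical: 512-bit vectors of 32-bit elements.  */
enum { VF = 16 };

typedef size_t (*len_fn) (size_t remaining);

/* MIN_EXPR-like length: full vectors except in the last iteration.  */
size_t
len_min (size_t remaining)
{
  return remaining < (size_t) VF ? remaining : (size_t) VF;
}

/* Illustrative length that splits the last two iterations evenly,
   so it can be smaller than VF before the final iteration.  */
size_t
len_balanced (size_t remaining)
{
  if (remaining >= 2 * (size_t) VF)
    return VF;
  if (remaining > (size_t) VF)
    return (remaining + 1) / 2;
  return remaining;
}

/* Simulate a loop whose address IV advances by a whole vector (VF
   elements) per iteration.  Returns the summed distance, at the start
   of each iteration, between that address and the number of elements
   actually processed so far.  Zero means the fixed step is safe.  */
size_t
fixed_step_mismatch (size_t n, len_fn pick_len)
{
  size_t consumed = 0;  /* What addr + len * 4 would track.  */
  size_t addr = 0;      /* What addr + 64 tracks, in elements.  */
  size_t mismatch = 0;

  for (size_t remaining = n; remaining != 0; )
    {
      size_t len = pick_len (remaining);
      assert (len != 0 && len <= remaining);
      mismatch += addr - consumed;
      addr += VF;
      consumed += len;
      remaining -= len;
    }
  return mismatch;
}
```

With len_min the fixed step and the length-based step agree at the
start of every iteration, which is why adding the whole vector size is
fine for a MIN_EXPR-style length.  With a length such as len_balanced
the fixed step runs ahead of the elements actually processed, which is
the case where advancing by len * 4 is needed.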
--
Richard Biener <rguenther@suse.de>
SUSE Software Solutions Germany GmbH, Frankenstrasse 146, 90461 Nuernberg,
Germany; GF: Ivo Totev, Andrew Myers, Andrew McDonald, Boudien Moerman;
HRB 36809 (AG Nuernberg)