From: "Kewen.Lin" <linkw@linux.ibm.com>
To: Tamar Christina <Tamar.Christina@arm.com>
Cc: Richard Biener <richard.guenther@gmail.com>,
GCC Development <gcc@gcc.gnu.org>,
Segher Boessenkool <segher@kernel.crashing.org>
Subject: Re: How to extend SLP to support this case
Date: Wed, 11 Mar 2020 11:58:58 +0800 [thread overview]
Message-ID: <0507b1ac-bc08-20be-435f-3bfd11d03f42@linux.ibm.com> (raw)
In-Reply-To: <PR2PR08MB47470B2C69981173EE216D52FFFF0@PR2PR08MB4747.eurprd08.prod.outlook.com>
Hi Tamar,
on 2020/3/10 7:31 PM, Tamar Christina wrote:
>
>> -----Original Message-----
>> From: Gcc <gcc-bounces@gcc.gnu.org> On Behalf Of Richard Biener
>> Sent: Tuesday, March 10, 2020 11:12 AM
>> To: Kewen.Lin <linkw@linux.ibm.com>
>> Cc: GCC Development <gcc@gcc.gnu.org>; Segher Boessenkool
>> <segher@kernel.crashing.org>
>> Subject: Re: How to extend SLP to support this case
>>
>> On Tue, Mar 10, 2020 at 7:52 AM Kewen.Lin <linkw@linux.ibm.com> wrote:
>>>
>>> Hi all,
>>>
>>> But how to teach it to be aware of this? Currently the processing
>>> starts from the bottom up (from stores); can we do some analysis on
>>> the SLP instance, detect some patterns and update the whole instance?
>>
>> In theory yes (Tamar had something like that for AARCH64 complex rotations
>> IIRC). And yes, the issue boils down to how we handle SLP discovery. I'd like
>> to improve SLP discovery but it's on my list only after I managed to get rid of
>> the non-SLP code paths. I have played with some ideas (even produced
>> hackish patches) to find "seeds" to form SLP groups from using multi-level
>> hashing of stmts.
>
> I still have this but missed the stage-1 deadline after doing the rewriting to C++ 😊
>
> We've also been looking at this and the approach I'm investigating now is trying to get
> the SLP codepath to handle this after it's been fully unrolled. I'm looking into whether
> the build-slp can be improved to work for the group size == 16 case that it tries but fails
> on.
>
Thanks! Glad to know you have been working on this!
Yes, I saw that the standalone SLP pass eventually splits the group (16 store stmts).
> My intention is to see if doing so would make it simpler to recognize this as just 4 linear
> loads and two permutes. I think the loop-aware SLP will have a much harder time with this,
> given the load permutations it thinks it needs because of the permutes caused by the +/-
> pattern.
I may be missing something, so just to double-check: do you mean making 4 linear loads for
either of p1/p2? In the optimal vectorized version, p1 and p2 each have 4 separate loads
and a construction, followed by further permutations.
>
> One idea I had before, from your comment on the complex number patch, was to try to
> move up TWO_OPERATORS and always undo the permute when doing +/-. This would simplify
> the load permute handling, and if a target doesn't have an instruction to support this it would just
> fall back to doing an explicit permute after the loads. But I wasn't sure this approach would get me the
> results I wanted.
IIUC, we have to seek either <a0, a1, a2, a3> or <a0_iter0, a0_iter1, a0_iter2, a0_iter3> ...,
since either can leverage the isomorphic byte loads, subtraction, shift and addition.
I was thinking that the SLP pattern matcher could detect the pattern with two levels of
TWO_OPERATORS, one level with t0/t1/t2/t3, the other with a0/a1/a2/a3, as well as the
dependent isomorphic computations for a0/a1/a2/a3, and transform it into isomorphic
subtraction, int promotion, shift and addition.
> In the end you don't want a loop here at all. And in order to do the above with TWO_OPERATORS I would
> have to let the SLP pattern matcher reduce the group size and increase the number of iterations during
> the matching, otherwise the matching itself becomes quite difficult in certain cases.
>
OK, it sounds like that approach cannot reach the optimal version, which requires all 16 bytes (bytes 0-3 or 4-7 across the 4 iterations).
BR,
Kewen
Thread overview: 9+ messages
2020-03-10 6:52 Kewen.Lin
2020-03-10 11:12 ` Richard Biener
2020-03-10 11:14 ` Richard Biener
2020-03-11 5:34 ` Kewen.Lin
2020-03-10 11:31 ` Tamar Christina
2020-03-11 3:58 ` Kewen.Lin [this message]
2020-03-13 11:57 ` Richard Biener
2020-03-18 11:37 ` Tamar Christina
2020-03-11 1:56 ` Kewen.Lin