From: Michael Collison <Michael.Collison@arm.com>
To: Richard Sandiford <richard.sandiford@linaro.org>
Cc: Richard Biener <richard.guenther@gmail.com>,
Richard Kenner <kenner@vlsi1.ultra.nyu.edu>,
GCC Patches <gcc-patches@gcc.gnu.org>, nd <nd@arm.com>,
Andrew Pinski <pinskia@gmail.com>
Subject: RE: [PATCH] [Aarch64] Optimize subtract in shift counts
Date: Wed, 06 Sep 2017 15:41:00 -0000 [thread overview]
Message-ID: <HE1PR0802MB237763E2263323F17908C03295970@HE1PR0802MB2377.eurprd08.prod.outlook.com> (raw)
In-Reply-To: <87bmmot8wu.fsf@linaro.org>
Richard,
The problem with this approach for Aarch64 is that TARGET_SHIFT_TRUNCATION_MASK is based on SHIFT_COUNT_TRUNCATED which is normally 0 as it based on the TARGET_SIMD flag.
-----Original Message-----
From: Richard Sandiford [mailto:richard.sandiford@linaro.org]
Sent: Wednesday, September 6, 2017 11:32 AM
To: Michael Collison <Michael.Collison@arm.com>
Cc: Richard Biener <richard.guenther@gmail.com>; Richard Kenner <kenner@vlsi1.ultra.nyu.edu>; GCC Patches <gcc-patches@gcc.gnu.org>; nd <nd@arm.com>; Andrew Pinski <pinskia@gmail.com>
Subject: Re: [PATCH] [Aarch64] Optimize subtract in shift counts
Michael Collison <Michael.Collison@arm.com> writes:
> Richard Sandiford do you have any objections to the patch as it stands?
> It doesn't appear as if anything is going to change in the mid-end
> anytime soon.
I think one of the suggestions was to do it in expand, taking advantage of range info and TARGET_SHIFT_TRUNCATION_MASK. This would be like the current FMA_EXPR handling in expand_expr_real_2.
I know there was talk about cleaner approaches, but at least doing the above seems cleaner than doing in the backend. It should also be a nicely-contained piece of work.
Thanks,
Richard
> -----Original Message-----
> From: Richard Sandiford [mailto:richard.sandiford@linaro.org]
> Sent: Tuesday, August 22, 2017 9:11 AM
> To: Richard Biener <richard.guenther@gmail.com>
> Cc: Richard Kenner <kenner@vlsi1.ultra.nyu.edu>; Michael Collison
> <Michael.Collison@arm.com>; GCC Patches <gcc-patches@gcc.gnu.org>; nd
> <nd@arm.com>; Andrew Pinski <pinskia@gmail.com>
> Subject: Re: [PATCH] [Aarch64] Optimize subtract in shift counts
>
> Richard Biener <richard.guenther@gmail.com> writes:
>> On Tue, Aug 22, 2017 at 9:29 AM, Richard Sandiford
>> <richard.sandiford@linaro.org> wrote:
>>> Richard Biener <richard.guenther@gmail.com> writes:
>>>> On August 21, 2017 7:46:09 PM GMT+02:00, Richard Sandiford
>>>> <richard.sandiford@linaro.org> wrote:
>>>>>Richard Biener <richard.guenther@gmail.com> writes:
>>>>>> On Tue, Aug 8, 2017 at 10:20 PM, Richard Kenner
>>>>>> <kenner@vlsi1.ultra.nyu.edu> wrote:
>>>>>>>> Correct. It is truncated for integer shift, but not simd shift
>>>>>>>> instructions. We generate a pattern in the split that only
>>>>>generates
>>>>>>>> the integer shift instructions.
>>>>>>>
>>>>>>> That's unfortunate, because it would be nice to do this in
>>>>>simplify_rtx,
>>>>>>> since it's machine-independent, but that has to be conditioned
>>>>>>> on SHIFT_COUNT_TRUNCATED, so you wouldn't get the benefit of it.
>>>>>>
>>>>>> SHIFT_COUNT_TRUNCATED should go ... you should express this in
>>>>>> the patterns, like for example with
>>>>>>
>>>>>> (define_insn ashlSI3
>>>>>> [(set (match_operand 0 "")
>>>>>> (ashl:SI (match_operand ... )
>>>>>> (subreg:QI (match_operand:SI ...)))]
>>>>>>
>>>>>> or an explicit and:SI and combine / simplify_rtx should apply the
>>>>>magic
>>>>>> optimization we expect.
>>>>>
>>>>>The problem with the explicit AND is that you'd end up with either
>>>>>an AND of two constants for constant shifts, or with two separate
>>>>>patterns, one for constant shifts and one for variable shifts.
>>>>>(And the problem in theory with two patterns is that it reduces the
>>>>>RA's freedom, although in practice I guess we'd always want a
>>>>>constant shift where possible for cost reasons, and so the RA would
>>>>>never need to replace pseudos with constants itself.)
>>>>>
>>>>>I think all useful instances of this optimisation will be exposed
>>>>>by the gimple optimisers, so maybe expand could to do it based on
>>>>>TARGET_SHIFT_TRUNCATION_MASK? That describes the optab rather than
>>>>>the rtx code and it does take the mode into account.
>>>>
>>>> Sure, that could work as well and also take into account range info.
>>>> But we'd then need named expanders and the result would still have
>>>> the explicit and or need to be an unspec or a different RTL operation.
>>>
>>> Without SHIFT_COUNT_TRUNCATED, out-of-range rtl shifts have
>>> target-dependent rather than undefined behaviour, so it's OK for a
>>> target to use shift codes with out-of-range values.
>>
>> Hmm, but that means simplify-rtx can't do anything with them because
>> we need to preserve target dependent behavior.
>
> Yeah, it needs to punt. In practice that shouldn't matter much.
>
>> I think the RTL IL should be always well-defined and its semantics
>> shouldn't have any target dependences (ideally, and if, then they
>> should be well specified via extra target hooks/macros).
>
> That would be nice :-) I think the problem has traditionally been that
>> shifts can be used in quite a few define_insn patterns besides those
>> for shift instructions. So if your target defines shifts to have
>> 256-bit precision (say) then you need to make sure that every
>> define_insn with a shift rtx will honour that.
>
> It's more natural for target guarantees to apply to instructions than
> to
>> rtx codes.
>
>>> And
>>> TARGET_SHIFT_TRUNCATION_MASK is a guarantee from the target about
>>> how the normal shift optabs behave, so I don't think we'd need new
>>> optabs or new unspecs.
>>>
>>> E.g. it already works this way when expanding double-word shifts,
>>> which IIRC is why TARGET_SHIFT_TRUNCATION_MASK was added. There
>>> it's possible to use a shorter sequence if you know that the shift
>>> optab truncates the count, so we can do that even if
>>> SHIFT_COUNT_TRUNCATED isn't defined.
>>
>> I'm somewhat confused by docs saying TARGET_SHIFT_TRUNCATION_MASK
>> applies to the instructions generated by the named shift patterns but
>> _not_ general shift RTXen. But the generated pattern contains shift
>> RTXen and how can we figure whether they were generated by the named
>> expanders or by other means? Don't define_expand also serve as
>> define_insn for things like combine?
>
> Yeah, you can't (and aren't supposed to try to) reverse-engineer the
>> expander from the generated instructions.
>> TARGET_SHIFT_TRUNCATION_MASK should only be used if you're expanding
>> a shift optab.
>
>> That said, from looking at the docs and your description above it
>> seems that semantics are not fully reflected in the RTL instruction stream?
>
> Yeah, the semantics go from being well-defined in the optab interface to being target-dependent in the rtl stream.
>
> Thanks,
> Richard
>
>>
>> Richard.
>>
>>> Thanks,
>>> Richard
>>>
>>>>
>>>> Richard.
>>>>
>>>>>Thanks,
>>>>>Richard
next prev parent reply other threads:[~2017-09-06 15:41 UTC|newest]
Thread overview: 32+ messages / expand[flat|nested] mbox.gz Atom feed top
2017-08-07 20:36 Michael Collison
2017-08-07 20:44 ` Andrew Pinski
2017-08-07 21:02 ` Richard Kenner
2017-08-07 23:44 ` Michael Collison
2017-08-08 1:56 ` Richard Kenner
2017-08-08 4:06 ` Michael Collison
2017-08-08 12:13 ` Richard Kenner
2017-08-08 19:46 ` Michael Collison
2017-08-08 19:52 ` Richard Kenner
2017-08-08 19:59 ` Michael Collison
2017-08-08 20:04 ` Richard Kenner
2017-08-08 20:18 ` Michael Collison
2017-08-08 20:20 ` Richard Kenner
2017-08-08 20:33 ` Michael Collison
2017-08-14 8:58 ` Richard Biener
2017-08-15 7:28 ` Michael Collison
2017-08-21 19:11 ` Richard Sandiford
2017-08-21 19:14 ` Richard Biener
2017-08-22 8:17 ` Richard Sandiford
2017-08-22 8:46 ` Richard Biener
2017-08-22 8:50 ` Richard Sandiford
2017-09-06 9:11 ` Michael Collison
2017-09-06 10:32 ` Richard Sandiford
2017-09-06 15:41 ` Michael Collison [this message]
2017-09-06 16:53 ` Richard Sandiford
2017-09-06 16:56 ` Richard Sandiford
2017-09-06 17:21 ` Richard Sandiford
2017-09-15 8:15 ` Michael Collison
2017-09-29 23:01 ` James Greenhalgh
2017-08-07 20:58 ` Richard Kenner
2017-08-08 10:04 Wilco Dijkstra
2017-08-22 11:01 Wilco Dijkstra
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=HE1PR0802MB237763E2263323F17908C03295970@HE1PR0802MB2377.eurprd08.prod.outlook.com \
--to=michael.collison@arm.com \
--cc=gcc-patches@gcc.gnu.org \
--cc=kenner@vlsi1.ultra.nyu.edu \
--cc=nd@arm.com \
--cc=pinskia@gmail.com \
--cc=richard.guenther@gmail.com \
--cc=richard.sandiford@linaro.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).