From: Richard Biener <rguenther@suse.de>
To: Richard Sandiford via Gcc-patches <gcc-patches@gcc.gnu.org>
Cc: Tamar Christina <tamar.christina@arm.com>, nd <nd@arm.com>
Subject: Re: [PATCH 1/2]middle-end: RFC: On expansion of conditional branches, give hint if argument is a truth type to backend
Date: Fri, 30 Sep 2022 16:33:14 +0200 [thread overview]
Message-ID: <A1577554-DE27-4AB8-BE0D-99F26A7C3481@suse.de> (raw)
In-Reply-To: <mptill5ndh5.fsf@arm.com>
> Am 30.09.2022 um 16:29 schrieb Richard Sandiford via Gcc-patches <gcc-patches@gcc.gnu.org>:
>
> Tamar Christina <Tamar.Christina@arm.com> writes:
>>> -----Original Message-----
>>> From: Richard Biener <rguenther@suse.de>
>>> Sent: Friday, September 30, 2022 12:53 PM
>>> To: Tamar Christina <Tamar.Christina@arm.com>
>>> Cc: Richard Sandiford <Richard.Sandiford@arm.com>; Tamar Christina via
>>> Gcc-patches <gcc-patches@gcc.gnu.org>; nd <nd@arm.com>; Jeff Law
>>> <jeffreyalaw@gmail.com>
>>> Subject: RE: [PATCH 1/2]middle-end: RFC: On expansion of conditional
>>> branches, give hint if argument is a truth type to backend
>>>
>>>> On Fri, 30 Sep 2022, Tamar Christina wrote:
>>>
>>>>> -----Original Message-----
>>>>> From: Richard Biener <rguenther@suse.de>
>>>>> Sent: Friday, September 30, 2022 11:17 AM
>>>>> To: Tamar Christina <Tamar.Christina@arm.com>
>>>>> Cc: Richard Sandiford <Richard.Sandiford@arm.com>; Tamar Christina
>>>>> via Gcc-patches <gcc-patches@gcc.gnu.org>; nd <nd@arm.com>; Jeff
>>> Law
>>>>> <jeffreyalaw@gmail.com>
>>>>> Subject: RE: [PATCH 1/2]middle-end: RFC: On expansion of conditional
>>>>> branches, give hint if argument is a truth type to backend
>>>>>
>>>>> On Fri, 30 Sep 2022, Tamar Christina wrote:
>>>>>
>>>>>>
>>>>>>
>>>>>>> -----Original Message-----
>>>>>>> From: Richard Sandiford <richard.sandiford@arm.com>
>>>>>>> Sent: Friday, September 30, 2022 9:49 AM
>>>>>>> To: Tamar Christina <Tamar.Christina@arm.com>
>>>>>>> Cc: Richard Biener <rguenther@suse.de>; Tamar Christina via
>>>>>>> Gcc-patches <gcc-patches@gcc.gnu.org>; nd <nd@arm.com>; Jeff
>>> Law
>>>>>>> <jeffreyalaw@gmail.com>
>>>>>>> Subject: Re: [PATCH 1/2]middle-end: RFC: On expansion of
>>>>>>> conditional branches, give hint if argument is a truth type to
>>>>>>> backend
>>>>>>>
>>>>>>> Tamar Christina <Tamar.Christina@arm.com> writes:
>>>>>>>>> -----Original Message-----
>>>>>>>>> From: Richard Sandiford <richard.sandiford@arm.com>
>>>>>>>>> Sent: Friday, September 30, 2022 9:29 AM
>>>>>>>>> To: Tamar Christina <Tamar.Christina@arm.com>
>>>>>>>>> Cc: Richard Biener <rguenther@suse.de>; Tamar Christina via
>>>>>>>>> Gcc-patches <gcc-patches@gcc.gnu.org>; nd <nd@arm.com>; Jeff
>>>>> Law
>>>>>>>>> <jeffreyalaw@gmail.com>
>>>>>>>>> Subject: Re: [PATCH 1/2]middle-end: RFC: On expansion of
>>>>>>>>> conditional branches, give hint if argument is a truth type
>>>>>>>>> to backend
>>>>>>>>>
>>>>>>>>> Tamar Christina <Tamar.Christina@arm.com> writes:
>>>>>>>>>>> -----Original Message-----
>>>>>>>>>>> From: Gcc-patches <gcc-patches-
>>>>>>>>>>> bounces+tamar.christina=arm.com@gcc.gnu.org> On Behalf Of
>>>>>>> Richard
>>>>>>>>>>> Biener via Gcc-patches
>>>>>>>>>>> Sent: Thursday, September 29, 2022 12:09 PM
>>>>>>>>>>> To: Tamar Christina via Gcc-patches
>>>>>>>>>>> <gcc-patches@gcc.gnu.org>
>>>>>>>>>>> Cc: Richard Sandiford <Richard.Sandiford@arm.com>; nd
>>>>>>> <nd@arm.com>
>>>>>>>>>>> Subject: Re: [PATCH 1/2]middle-end: RFC: On expansion of
>>>>>>>>>>> conditional branches, give hint if argument is a truth
>>>>>>>>>>> type to backend
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>> Am 29.09.2022 um 12:23 schrieb Tamar Christina via
>>>>>>>>>>>> Gcc-patches
>>>>>>>>>>>> <gcc-
>>>>>>>>>>> patches@gcc.gnu.org>:
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> -----Original Message-----
>>>>>>>>>>>>> From: Richard Biener <rguenther@suse.de>
>>>>>>>>>>>>> Sent: Thursday, September 29, 2022 10:41 AM
>>>>>>>>>>>>> To: Richard Sandiford <Richard.Sandiford@arm.com>
>>>>>>>>>>>>> Cc: Jeff Law <jeffreyalaw@gmail.com>; Tamar Christina
>>>>>>>>>>>>> <Tamar.Christina@arm.com>; gcc-patches@gcc.gnu.org; nd
>>>>>>>>>>> <nd@arm.com>
>>>>>>>>>>>>> Subject: Re: [PATCH 1/2]middle-end: RFC: On expansion
>>>>>>>>>>>>> of conditional branches, give hint if argument is a
>>>>>>>>>>>>> truth type to backend
>>>>>>>>>>>>>
>>>>>>>>>>>>>> On Thu, 29 Sep 2022, Richard Sandiford wrote:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Jeff Law <jeffreyalaw@gmail.com> writes:
>>>>>>>>>>>>>>> On 9/28/22 09:04, Richard Sandiford wrote:
>>>>>>>>>>>>>>>> Tamar Christina <Tamar.Christina@arm.com> writes:
>>>>>>>>>>>>>>>>>> Maybe the target could use (subreg:SI (reg:BI
>>>>>>>>>>>>>>>>>> ...)) as
>>>>>>> argument.
>>>>>>>>>>> Heh.
>>>>>>>>>>>>>>>>> But then I'd still need to change the expansion
>>>>>>>>>>>>>>>>> code. I suppose this could prevent the issue with
>>>>>>>>>>>>>>>>> changes to code on
>>>>>>>>> other targets.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> We have undocumented addcc, negcc, etc.
>>>>>>>>>>>>>>>>>>>> patterns,
>>>>>>> should
>>>>>>>>> we
>>>>>>>>>>>>>>>>>>>> have aandcc
>>>>>>>>>>>>>>>>> pattern for this indicating support for andcc +
>>>>>>>>>>>>>>>>> jump as opposedto
>>>>>>>>>>>>> cmpcc + jump?
>>>>>>>>>>>>>>>>>>> This could work yeah. I didn't know these existed.
>>>>>>>>>>>>>>>>>> Ah, so they are conditional add, not add setting
>>>>>>>>>>>>>>>>>> CC, so andcc wouldn't be appropriate.
>>>>>>>>>>>>>>>>>> So I'm not sure how we'd handle such situation -
>>>>>>>>>>>>>>>>>> maybe looking at REG_DECL and recognizing a _Bool
>>>>>>>>>>>>>>>>>> PARM_DECL is
>>>>>>> OK?
>>>>>>>>>>>>>>>>> I have a slight suspicion that Richard Sandiford
>>>>>>>>>>>>>>>>> would likely reject this though..
>>>>>>>>>>>>>>>> Good guess :-P We shouldn't rely on something like
>>>>>>>>>>>>>>>> that for
>>>>>>>>>>>>> correctness.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Would it help if we promoted the test-and-branch
>>>>>>>>>>>>>>>> instructions to optabs, alongside cbranch? The jump
>>>>>>>>>>>>>>>> expanders could then target it
>>>>>>>>>>>>> directly.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> IMO that'd be a reasonable thing to do if it does help.
>>>>>>>>>>>>>>>> It's a relatively common operation, especially on
>>>>>>>>>>>>>>>> CISCy
>>>>> targets.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> But don't we represent these single bit tests using
>>>>>>>>>>>>>>> zero_extract as the condition of the branch? I guess
>>>>>>>>>>>>>>> if we can generate them directly rather than waiting
>>>>>>>>>>>>>>> for combine to deduce that we're dealing with a
>>>>>>>>>>>>>>> single bit test and constructing the zero_extract
>>>>>>>>>>>>>>> form would be an improvement and might help aarch at
>>>>>>>>>>>>>>> the same
>>>>>>>>> time.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Do you mean that the promote_mode stuff should use
>>>>>>>>>>>>>> ext(z)v rather than zero_extend to promote a bool, where
>>> available?
>>>>>>>>>>>>>> If so, I agree that might help. But it sounds like it
>>>>>>>>>>>>>> would have downsides
>>>>>>>>> too.
>>>>>>>>>>>>>> Currently a bool memory can be zero-extended on the
>>>>>>>>>>>>>> fly using a load, but if we used the zero_extract form
>>>>>>>>>>>>>> instead, we'd have to extract the bit after the load.
>>>>>>>>>>>>>> And (as an
>>>>>>>>>>>>>> alternative) choosing different behaviour based on
>>>>>>>>>>>>>> whether expand sees a REG or a MEM sounds like it
>>>>>>>>>>>>>> could still cause problems, since REGs could be
>>>>>>>>>>>>>> replaced by MEMs (or vice versa)
>>>>>>> later in the RTL passes.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> ISTM that the original patch was inserting an extra
>>>>>>>>>>>>>> operation in the branch expansion in order to target a
>>>>>>>>>>>>>> specific
>>>>> instruction.
>>>>>>>>>>>>>> Targeting the instruction in expand seems good, but
>>>>>>>>>>>>>> IMO we should do it directly, based on knowledge of
>>>>>>>>>>>>>> whether the instruction actually
>>>>>>>>>>> exists.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Yes, I think a compare-and-branch pattern is the best fit
>>> here.
>>>>>>>>>>>>> Note on GIMPLE we'd rely on the fact this is a
>>>>>>>>>>>>> BOOLEAN_TYPE (so even 8 bit precision bools only have 1
>>>>>>>>>>>>> and 0 as meaningful
>>>>> values).
>>>>>>>>>>>>> So the 'compare-' bit in compare-and-branch would be
>>>>>>>>>>>>> interpreting a BOOLEAN_TYPE, not so much a general
>>> compare.
>>>>>>>>>>>>
>>>>>>>>>>>> Oh, I was thinking of adding a constant argument
>>>>>>>>>>>> representing the precision that is relevant for the
>>>>>>>>>>>> compare in order to make this a bit more
>>>>>>>>>>> general/future proof.
>>>>>>>>>>>>
>>>>>>>>>>>> Are you thinking I should instead just make the optab
>>>>>>>>>>>> implicitly only work for 1-bit precision comparisons?
>>>>>>>>>>>
>>>>>>>>>>> What?s the optab you propose (cite also the documentation
>>> part)?
>>>>>>>>>>
>>>>>>>>>> tbranchmode5
>>>>>>>>>> Conditional branch instruction combined with a bit test
>>> instruction.
>>>>>>>>> Operand 0 is a comparison operator.
>>>>>>>>>> Operand 1 and Operand 2 are the first and second operands
>>>>>>>>>> of the
>>>>>>>>> comparison, respectively.
>>>>>>>>>> Operand 3 is the number of low-order bits that are
>>>>>>>>>> relevant for the
>>>>>>>>> comparison.
>>>>>>>>>> Operand 4 is the code_label to jump to.
>>>>>>>>>
>>>>>>>>> For the TB instructions (and for other similar instructions
>>>>>>>>> that I've seen on other architectures) it would be more
>>>>>>>>> useful to have a single-bit test, with operand 4 specifying the bit
>>> position.
>>>>>>>>> Arguably it might then be better to have separate eq and ne
>>>>>>>>> optabs, to avoid the awkward doubling of the operands
>>>>>>>>> (operand 1 contains
>>>>>>> operands 2 and 3).
>>>>>>>>>
>>>>>>>>> I guess a more general way of achieving the same thing would
>>>>>>>>> be to make operand 4 in the optab above a mask rather than a bit
>>> count.
>>>>>>>>> But that might be overly general, if there are no known
>>>>>>>>> architectures that have such an instruction.
>>>>>>>>
>>>>>>>> One of the reasons I wanted a range rather than a single bit
>>>>>>>> is that I can the use this to generate cbz/cbnz early on as well.
>>>>>>>
>>>>>>> We already have the opportunity to do that via cbranch<mode>4.
>>>>>>> But at the moment aarch64.md always forces the separate
>>>>>>> comparison instead. (Not sure why TBH. Does it enable more
>>>>>>> ifcvt
>>>>>>> opportunities?)
>>>>>>>
>>>>>>> If we change the body of cbranch<mode>4 to:
>>>>>>>
>>>>>>> if ((GET_CODE (operands[0]) != EQ && GET_CODE (operands[0]) !=
>>> NE)
>>>>>>> || operands[2] != const0_rtx)
>>>>>>> {
>>>>>>> operands[1] = aarch64_gen_compare_reg (GET_CODE
>>> (operands[0]),
>>>>>>> operands[1], operands[2]);
>>>>>>> operands[2] = const0_rtx;
>>>>>>> }
>>>>>>>
>>>>>>> then we generate the cbz/cbnz directly.
>>>>>>>
>>>>>>
>>>>>> Ah ok, then if Richi agrees, bitpos it is then instead of bit count.
>>>>>
>>>>> Somehow I understood that cbranch<>4 is already fully capable of the
>>>>> optimization?
>>>>>
>>>>> On your earlier proposal I'd have commented that if it wouldn't make
>>>>> sense to instead have a CCmode setter instead of an expander with a
>>> branch label?
>>>>> That would be a bit test, like {sign,zero}_extract compared against
>>>>> zero which can then be combined with a branch?
>>>>>
>>>>
>>>> I missed that part, that could work too.
>>>>
>>>>> Of course if the ISA is really bit-test-and-branch then cbranch<>4
>>>>> itself or a variant of it might be more useful. Maybe
>>>>> cbranchbi4 would be "abused" for this?
>>>>
>>>> The instruction is an actual bit-test-and-branch with any arbitrary bitpos.
>>>> Yes we can abuse cbranchbi4 for this, but then it also means we can't e.g.
>>>> use this to optimize a < 0 where a is a signed value. With the new
>>>> optab this would just be a bit-test-and-branch of the sign bit.
>>>>
>>>> But also I'm not entirely convinced that using `BImode` and assuming a
>>>> single bit is safe here. What happens if I compile my source with -std=c89?
>>>>
>>>> So I personally think the new optab makes more sense here. The CC setter
>>> would work too.
>>>>
>>>> I guess my question is, between you folks, which approach would you
>>>> like. It seems that Richi You'd like a CC setter. Richard do you have a
>>> preference of one over the other?
>>>
>>> My order of preference is
>>>
>>> a) an existing pattern, if possible
>>> b) something usable by N > 1 targets we know of, even if it requires some
>>> combine magic
>>> c) something that fits the actual ISA
>>>
>>> For b), x86 has BEXTR which performs a zero_extract from reg/mem and sets
>>> ZF when the result is zero. For a branch on sign bit there's easier ways, so it's
>>> probably not very useful for general compare and branch optimization and if
>>> (a & 0x30) is probably handled by combine(?).
>>
>> Agreed, I was more giving an example of other uses. But ok.
>>
>>> It also seems that if (a & 1) is handled for aarch64 already and it's just a lack of
>>> an optab that allows RTL expansion to expand if (bool) as if (bool & 1)?
>>
>> Yes this was what I was originally after. Your original suggestion was to abuse BImode
>> and cbranch for this. That could work, but I worry about introducing BImode in the RTL,
>> as I don't think we currently use it at all and wonder about new missed optimizations.
>>
>> Richard Sandiford, would this be OK with you? I also don't want to emit an extract here as I think
>> that's gonna have a bigger effect on other targets than & 1 did.
>>
>> I would rather have something I can explicitly check for target support for. I also wonder if just a
>> target hook asking the target how to expand a given tree Boolean argument would work.
>
> Personally I'd prefer the test-and-branch optab. We used to have
> separate compare optabs and branch optabs in the cc0 days, but having
> the combined pattern turned out to be much easier. We don't currently
> expose CC registers at the optabs level and I think that's a good thing.
>
> We could have a test-bit-and-set-pseudo optab. But using that optab
> here would reintroduce the problem that I had with the original patch,
> which is:
>
> - Currently, the branch on the boolean value expands to a single optab
> (cbranch<mode>4). That optab could easily expand to a single cb(n)z
> instruction on aarch64 (even though the port chooses not to do that
> currently). So taken on its own, the branch is already maximally
> efficient.
>
> The patch instead adds an extra instruction to the branch expansion,
> so that it uses two optabs rather than one. On the face of it,
> this extra instruction is entirely redundant. But emitting it can
> sometimes allow profitable combines. That is, rather than expand to:
>
> and reg, reg, 0xFF (from promote_mode)
> cbranch on reg (from the branch expansion)
>
> the patch expanded to:
>
> and reg, reg, 0xFF (from promote_mode)
> and reg, reg, 0x1 (from the branch expansion)
> cbranch on reg (from the branch expansion)
>
> The first two ANDs come from separate sources. When both ANDs exist,
> combine can get rid of the redundancy and can then see that the
> retained AND 0x1 + cbranch optimise to a test-and-branch instruction.
>
> But if the target can't in fact use a test-and-branch in a particular
> situation, we'd be left with the AND 0x1 and the cbranch. That's not
> too bad if the AND from promote_mode and the AND 0x1 can be combined.
> But if the AND with 0xff happens "too far" from the AND with 0x1
> (e.g. because it's in a loop preheader), or if there's intermediate
> logic, we might end up keeping all three instructions.
>
> Emitting an extra bit test instruction as part of the branch expansion
> IMO has the same problem. We're relying on combine combining the
> (redundant) bit test with the cbranch. If combine doesn't do that
> (e.g. because the target doesn't support the combination in certain
> cases) then we could be left with three instructions rather than the
> original two.
>
> That's why I think the fix is either for promote_mode to use a different
> expansion for booleans (which could also have knock-on effects) or for
> the branch expanders to target the test-and-branch instruction directly.
>
> The optab wouldn't be an aarch64 special. arc, avr, and h8300sx have
> similar instructions. There are problem others too: h8300sx was from
> memory and I stopped looking after avr. :-)
Fair enough, let’s go with that then.
Richard
> Thanks,
> Richard
next prev parent reply other threads:[~2022-09-30 14:33 UTC|newest]
Thread overview: 30+ messages / expand[flat|nested] mbox.gz Atom feed top
2022-09-23 9:24 Tamar Christina
2022-09-23 9:25 ` [PATCH 2/2]AArch64 Extend tbz pattern to allow SI to SI extensions Tamar Christina
2022-09-23 9:42 ` Richard Sandiford
2022-09-23 9:48 ` Tamar Christina
2022-09-26 10:35 ` [PATCH 1/2]middle-end: RFC: On expansion of conditional branches, give hint if argument is a truth type to backend Richard Biener
2022-09-26 11:05 ` Tamar Christina
2022-09-26 11:32 ` Richard Biener
2022-09-26 11:46 ` Tamar Christina
2022-09-26 12:34 ` Richard Biener
2022-09-26 12:43 ` Richard Biener
2022-09-26 14:02 ` Tamar Christina
2022-09-28 15:04 ` Richard Sandiford
2022-09-28 17:23 ` Jeff Law
2022-09-29 9:37 ` Richard Sandiford
2022-09-29 9:40 ` Richard Biener
2022-09-29 10:21 ` Tamar Christina
2022-09-29 11:09 ` Richard Biener
2022-09-30 8:00 ` Tamar Christina
2022-09-30 8:28 ` Richard Sandiford
2022-09-30 8:38 ` Tamar Christina
2022-09-30 8:48 ` Richard Sandiford
2022-09-30 9:15 ` Tamar Christina
2022-09-30 10:16 ` Richard Biener
2022-09-30 11:11 ` Tamar Christina
2022-09-30 11:52 ` Richard Biener
2022-09-30 12:48 ` Tamar Christina
2022-09-30 14:28 ` Richard Sandiford
2022-09-30 14:33 ` Richard Biener [this message]
2022-09-29 20:49 ` Jeff Law
2022-10-27 3:22 ` Andrew Pinski
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=A1577554-DE27-4AB8-BE0D-99F26A7C3481@suse.de \
--to=rguenther@suse.de \
--cc=gcc-patches@gcc.gnu.org \
--cc=nd@arm.com \
--cc=tamar.christina@arm.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).