From: Jeff Law <law@redhat.com>
To: Kyrill Tkachov <kyrylo.tkachov@arm.com>,
GCC Patches <gcc-patches@gcc.gnu.org>
Subject: Re: [PATCH][RTL-ifcvt] Improve conditional select ops on immediates
Date: Fri, 31 Jul 2015 16:16:00 -0000 [thread overview]
Message-ID: <55BB9D41.7000702@redhat.com> (raw)
In-Reply-To: <55BA29B3.30102@arm.com>
On 07/30/2015 07:42 AM, Kyrill Tkachov wrote:
> Hi Jeff,
>
> On 29/07/15 23:38, Jeff Law wrote:
>> On 07/29/2015 07:49 AM, Kyrill Tkachov wrote:
>>> Hi all,
>>>
>>> This patch improves RTL if-conversion on sequences that perform a
>>> conditional select on integer constants.
>>> Most of the smart logic to do that already exists in the
>>> noce_try_store_flag_constants function.
>>> However, currently that function is tried after noce_try_cmove.
>>> noce_try_cmove is a simple catch-all function that just loads the two
>>> immediates and performs a conditional
>>> select between them. It returns true and then the caller
>>> noce_process_if_block doesn't try any other transformations,
>>> completely skipping the more aggressive transformations that
>>> noce_try_store_flag_constants allows!
>>>
>>> Calling noce_try_store_flag_constants before noce_try_cmove allows for
>>> the smarter if-conversion transformations
>>> to be used. An example that occurs a lot in the gcc code itself is for
>>> the C code:
>>> int
>>> foo (int a, int b)
>>> {
>>> return ((a & (1 << 25)) ? 5 : 4);
>>> }
>>>
>>> i.e. test a bit in a and return 5 or 4. Currently on aarch64 this
>>> generates the naive:
>>> and w2, w0, 33554432 // mask away all bits except bit 25
>>> mov w1, 4
>>> cmp w2, wzr
>>> mov w0, 5
>>> csel w0, w0, w1, ne
>>>
>>>
>>> whereas with this patch this can be transformed into the much better:
>>> ubfx x0, x0, 25, 1 // extract bit 25
>>> add w0, w0, 4
>> I suspect the PA would benefit from this as well, probably several
>> other architectures with reasonable bitfield extraction capabilities.
>
> This helps on arm as well but isn't enabled there at the moment
> because this function is gated on !have_conditional_execution ().
> We can look at enabling this for conditional execution
> as well as a separate piece of work.
>
>>
>>> Another issue I encountered is that the code that claims to perform the
>>> transformation:
>>> /* if (test) x = 3; else x = 4;
>>> => x = 3 + (test == 0); */
>>>
>>> doesn't seem to do exactly that in all cases. In fact for that case it
>>> will try something like:
>>> x = 4 - (test == 0)
>>> which is suboptimal for targets like aarch64 which have a conditional
>>> increment operation.
>> I vaguely recall targets that don't have const - X insns, but do have X
>> + const style insns. And more generally I think we're better off
>> generating the PLUS version.
>
> Now that I think of it, aarch64 does have the CSETM instruction that
> conditionally sets a register to 0 or -1, but I didn't see it being
> utilised
> as frequently as I'd like to. Anyway, I do prefer generating the PLUS
> version
> when possible too, which is what this patch tries to do.
>
>>
>>
>>> This patch tweaks that code to always try to generate an addition of the
>>> condition rather than
>>> a subtraction.
>>>
>>> Anyway, for code:
>>> int
>>> fooinc (int x)
>>> {
>>> return x ? 1025 : 1026;
>>> }
>>>
>>> we currently generate:
>>> mov w2, 1025
>>> mov w1, 1026
>>> cmp w0, wzr
>>> csel w0, w2, w1, ne
>>>
>>> whereas with this patch we will generate:
>>> cmp w0, wzr
>>> cset w0, eq
>>> add w0, w0, 1025
>>>
>>> Bootstrapped and tested on arm, aarch64, x86_64.
>>> Ok for trunk?
>>>
>>> Thanks,
>>> Kyrill
>>>
>>> P.S. noce_try_store_flag_constants is currently gated on
>>> !targetm.have_conditional_execution () but I don't see
>>> any reason to restrict it on targets with conditional execution. For
>>> example, I think the first example above
>>> would show a benefit on arm if it was enabled there. But that can be a
>>> separate investigation.
>>>
>>> 2015-07-29 Kyrylo Tkachov <kyrylo.tkachov@arm.com>
>>>
>>> * ifcvt.c (noce_try_store_flag_constants): Reverse when diff is
>>> -STORE_FLAG and condition is reversable. Prefer to add to the
>>> flag value.
>>> (noce_process_if_block): Try noce_try_store_flag_constants before
>>> noce_try_cmove.
>>>
>>> 2015-07-29 Kyrylo Tkachov <kyrylo.tkachov@arm.com>
>>>
>>> * gcc.target/aarch64/csel_bfx_1.c: New test.
>>> * gcc.target/aarch64/csel_imms_inc_1.c: Likewise.
>>>
>>> ifcvt-csel-imms.patch
>>>
>>>
>>> commit 0164ef164483bdf0b2f73e267e2ff1df7800dd6d
>>> Author: Kyrylo Tkachov<kyrylo.tkachov@arm.com>
>>> Date: Tue Jul 28 14:59:46 2015 +0100
>>>
>>> [RTL-ifcvt] Improve conditional increment ops on immediates
>>>
>>> diff --git a/gcc/ifcvt.c b/gcc/ifcvt.c
>>> index a57d78c..80d0285 100644
>>> --- a/gcc/ifcvt.c
>>> +++ b/gcc/ifcvt.c
>>> @@ -1222,7 +1222,7 @@ noce_try_store_flag_constants (struct
>>> noce_if_info *if_info)
>>>
>>> reversep = 0;
>>> if (diff == STORE_FLAG_VALUE || diff == -STORE_FLAG_VALUE)
>>> - normalize = 0;
>>> + normalize = 0, reversep = (diff == -STORE_FLAG_VALUE) &&
>>> can_reverse;
>> We generally avoid using a ',' operator like this. However, I can see
>> that you're just following existing convention in that code. So I won't
>> object.
>
> Ok, I've respun the patch to convert this part to use a more conventional
> style and also took the liberty of converting reversep and can_reverse into
> bool variables.
>
>>
>>
>>
>>> else if (ifalse == 0 && exact_log2 (itrue) >= 0
>>> && (STORE_FLAG_VALUE == 1
>>> || if_info->branch_cost >= 2))
>>> @@ -1261,10 +1261,13 @@ noce_try_store_flag_constants (struct
>>> noce_if_info *if_info)
>>> => x = 3 + (test == 0); */
>>> if (diff == STORE_FLAG_VALUE || diff == -STORE_FLAG_VALUE)
>>> {
>>> - target = expand_simple_binop (mode,
>>> - (diff == STORE_FLAG_VALUE
>>> - ? PLUS : MINUS),
>>> - gen_int_mode (ifalse, mode), target,
>>> + rtx_code code = reversep ? PLUS :
>>> + (diff == STORE_FLAG_VALUE ? PLUS
>>> + : MINUS);
>>> + HOST_WIDE_INT to_add = reversep ? MIN (ifalse, itrue) : ifalse;
>>> +
>>> + target = expand_simple_binop (mode, code,
>>> + gen_int_mode (to_add, mode), target,
>>> if_info->x, 0, OPTAB_WIDEN);
>>> }
>> Note that STORE_FLAG_VALUE can potentially be any value. Most targets
>> use "1", but others use -1 (m68k for example, there are others). I'm
>> concerned that the reversep ? MIN (ifalse, itrue) won't do what we want
>> on targets such as the m68k.
>>
>> The logic here is also somewhat convoluted. When reversep is true, we
>> always use PLUS, but is that actually correct? Normally we'd compute
>> the code, then invert the code. ie, if we normally want PLUS, then
>> we've flip to MINUS and vice-versa.
>
> I agree that it's hard to follow. In this respin I have
> explicitly laid out the different combinations of diff
> and STORE_FLAG_VALUE. It is more verbose now, but hopefully
> it's easier to follow.
>
>>
>>
>>> @@ -3120,13 +3123,14 @@ noce_process_if_block (struct noce_if_info
>>> *if_info)
>>> goto success;
>>> if (noce_try_abs (if_info))
>>> goto success;
>>> + if (!targetm.have_conditional_execution ()
>>> + && noce_try_store_flag_constants (if_info))
>>> + goto success;
>>> if (HAVE_conditional_move
>>> && noce_try_cmove (if_info))
>>> goto success;
>>> if (! targetm.have_conditional_execution ())
>>> {
>>> - if (noce_try_store_flag_constants (if_info))
>>> - goto success;
>>> if (noce_try_addcc (if_info))
>>> goto success;
>>> if (noce_try_store_flag_mask (if_info))
>> This part seems fine and ought to be able to go forward independently.
>> Consider this hunk and its associated testcase approved if you want to
>> test it independently and get it installed while we iterate on the other
>> hunks,
>
> This respin includes that hunk, however if this patch goes through any more
> respins then I'll separate it out into a separate patch.
>
> This version has been bootstrapped and tested on x86_64 and aarch64.
>
> Thanks for the feedback!
> Kyrill
>
> 2015-07-30 Kyrylo Tkachov <kyrylo.tkachov@arm.com>
>
> * ifcvt.c (noce_try_store_flag_constants): Make logic of the case
> when diff == STORE_FLAG_VALUE or diff == -STORE_FLAG_VALUE more
> explicit. Prefer to add the flag whenever possible.
> (noce_process_if_block): Try noce_try_store_flag_constants before
> noce_try_cmove.
>
> 2015-07-30 Kyrylo Tkachov <kyrylo.tkachov@arm.com>
>
> * gcc.target/aarch64/csel_bfx_1.c: New test.
> * gcc.target/aarch64/csel_imms_inc_1.c: Likewise.
OK. Thanks for taking care of this.
jeff
next prev parent reply other threads:[~2015-07-31 16:07 UTC|newest]
Thread overview: 24+ messages / expand[flat|nested] mbox.gz Atom feed top
2015-07-29 14:16 Kyrill Tkachov
2015-07-29 22:46 ` Jeff Law
2015-07-30 14:30 ` Kyrill Tkachov
2015-07-31 16:16 ` Jeff Law [this message]
2015-08-03 12:33 Uros Bizjak
2015-08-03 13:02 ` Kyrill Tkachov
2015-08-03 13:37 ` Uros Bizjak
2015-08-03 13:43 ` Kyrill Tkachov
2015-08-03 14:12 ` Kyrill Tkachov
2015-08-03 14:15 ` Uros Bizjak
2015-08-03 15:37 ` Kyrill Tkachov
2015-08-03 15:46 ` Uros Bizjak
2015-08-03 15:53 ` Kyrill Tkachov
2015-08-03 17:20 ` Kyrill Tkachov
2015-08-03 17:37 ` Uros Bizjak
2015-08-04 8:44 ` Kyrill Tkachov
2015-08-10 9:36 ` Kyrill Tkachov
2015-08-10 9:43 ` Uros Bizjak
2015-08-10 9:44 ` Kyrill Tkachov
2015-08-12 17:52 ` Jeff Law
2015-08-03 14:04 ` Uros Bizjak
2015-08-03 14:37 ` H.J. Lu
2015-08-03 15:50 ` Ilya Enkovich
2015-08-03 16:29 ` Jeff Law
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=55BB9D41.7000702@redhat.com \
--to=law@redhat.com \
--cc=gcc-patches@gcc.gnu.org \
--cc=kyrylo.tkachov@arm.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).