From: Jeff Law <law@redhat.com>
To: Vlad Lazar <vlad.lazar@arm.com>,
"gcc-patches@gcc.gnu.org" <gcc-patches@gcc.gnu.org>,
ian@airs.com, rguenther@suse.de, richard.sandiford@arm.com
Cc: nd <nd@arm.com>
Subject: Re: [RFC][PATCH][mid-end] Optimize immediate choice in comparisons.
Date: Thu, 16 Aug 2018 16:35:00 -0000 [thread overview]
Message-ID: <3e36417b-f23d-54dd-a5c9-dbea3a10d3ed@redhat.com> (raw)
In-Reply-To: <9766b8ff-9bf6-4437-ec20-75423152eb68@arm.com>
On 08/14/2018 11:01 AM, Vlad Lazar wrote:
> On 13/08/18 15:00, Richard Sandiford wrote:
>> Vlad Lazar <vlad.lazar@arm.com> writes:
>>> diff --git a/gcc/expmed.h b/gcc/expmed.h
>>> index
>>> 2890d9c9bbd034f01030dd551d544bf73e73b784..86a32a643fdd0fc9f396bd2c7904244bd484df16
>>> 100644
>>> --- a/gcc/expmed.h
>>> +++ b/gcc/expmed.h
>>> @@ -702,6 +702,10 @@ extern rtx emit_store_flag (rtx, enum rtx_code,
>>> rtx, rtx, machine_mode,
>>> Â Â extern rtx emit_store_flag_force (rtx, enum rtx_code, rtx, rtx,
>>> Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â machine_mode, int, int);
>>> Â Â +extern void canonicalize_comparison (machine_mode, enum rtx_code
>>> *, rtx *);
>>> +
>>> +extern enum rtx_code canonicalized_cmp_code (enum rtx_code);
>>
>> It would probably be better to make the second function static (local
>> to expmed.c), since it's only used internally by canonicalize_comparison.
>> Switching the order of the two functions in expmed.c would avoid the need
>> for a forward declaration.
>>
>>> @@ -6153,6 +6156,97 @@ emit_store_flag_force (rtx target, enum
>>> rtx_code code, rtx op0, rtx op1,
>>> Â Â Â Â Â Â return target;
>>> Â Â }
>>> +
>>> +/* Choose the more appropiate immediate in comparisons. The purpose
>>> of this
>>
>> Maybe worth saying "scalar integer comparisons", since the function
>> doesn't handle vector or floating-point comparisons.
>>
>>> +Â Â is to end up with an immediate which can be loaded into a
>>> register in fewer
>>> +Â Â moves, if possible.
>>> +
>>> +Â Â For each integer comparison there exists an equivalent choice:
>>> +Â Â Â Â i)Â Â a >Â b or a >= b + 1
>>> +Â Â Â Â ii)Â a <= b or a <Â b + 1
>>> +Â Â Â Â iii) a >= b or a >Â b - 1
>>> +Â Â Â Â iv)Â a <Â b or a <= b - 1
>>> +
>>> +Â Â The function is called in the gimple expanders which handle
>>> GIMPLE_ASSIGN
>>> +  and GIMPLE_COND. It assumes that the first operand of the
>>> comparison is a
>>> +Â Â register and the second is a constant.
>>
>> This last paragraph doesn't seem accurate any more. Probably best to
>> drop it, since there's no real need to say who the callers are if we
>> describe the interface.
>>
>>> +Â Â mode is the mode of the first operand.
>>> +Â Â code points to the comparison code.
>>> +  imm points to the rtx containing the immediate. */
>>
>> GCC's convention is to put parameter names in caps in comments,
>> so "MODE is...", etc.
>>
>> On the IMM line, it might be worth adding "*IMM must satisfy
>> CONST_SCALAR_INT_P on entry and continues to satisfy CONST_SCALAR_INT_P
>> on exit."
>>
>>> +void canonicalize_comparison (machine_mode mode, enum rtx_code
>>> *code, rtx *imm)
>>> +{
>>> +Â int to_add = 0;
>>> +
>>> +Â if (!SCALAR_INT_MODE_P (mode))
>>> +Â Â Â return;
>>> +
>>> + /* Extract the immediate value from the rtx. */
>>> +Â wide_int imm_val = wi::shwi (INTVAL (*imm), mode);
>>
>> This should be:
>>
>> Â Â rtx_mode_t imm_val (*imm, mode);
>>
>> so that it copes with all CONST_SCALAR_INT_P, not just CONST_INT.
>>
>>> +Â if (*code == GT || *code == GTU || *code == LE || *code == LEU)
>>> +Â Â Â to_add = 1;
>>> +
>>> +Â if (*code == GE || *code == GEU || *code == LT || *code == LTU)
>>> +Â Â Â to_add = -1;
>>
>> Might be better to have:
>>
>> Â Â if (*code == GT || *code == GTU || *code == LE || *code == LEU)
>> Â Â Â Â to_add = 1;
>> Â Â else if (*code == GE || *code == GEU || *code == LT || *code == LTU)
>> Â Â Â Â to_add = -1;
>> Â Â else
>> Â Â Â Â return;
>>
>> so that we exit early for other comparisons, rather than doing wasted
>> work.
>>
>>> +Â /* Check for overflow/underflow in the case of signed values and
>>> +    wrapping around in the case of unsigned values. If any occur
>>> +    cancel the optimization. */
>>> +Â wide_int max_val = wi::max_value (mode, SIGNED);
>>> +Â wide_int min_val = wi::min_value (mode, SIGNED);
>>> +Â if ((wi::cmps (imm_val, max_val) == 0 && to_add == 1)
>>> +Â Â Â Â Â || (wi::cmps (imm_val, min_val) == 0 && to_add == -1))
>>> +Â Â Â return;
>>
>> This needs to use the SIGNED/UNSIGNED choice appropriate for the
>> comparison (SIGNED for GT, UNSIGNED for GTU, etc.). There doesn't
>> seem to be an existing function that says whether an rtx comparison
>> operation is signed or not (bit of a surprise), but there is
>> unsigned_condition, which returns the unsigned form of an rtx
>> comparison operator. It might be worth adding something like:
>>
>> Â Â /* Return true if integer comparison operator CODE interprets its
>> operands
>>      as unsigned. */
>>
>> Â Â inline bool
>> Â Â unsigned_condition_p (rtx_code code)
>> Â Â {
>> Â Â Â Â return unsigned_condition (code) == code;
>> Â Â }
>>
>> and using that to choose between SIGNED and UNSIGNED.
>>
>> Rather than using wi::cmp, a more direct way of checking for overflow
>> is to change this:
>>
>>> +Â /* Generate a mov instruction for both cases and see whether the
>>> change
>>> +    was profitable. */
>>> +Â wide_int imm_modif = wi::add (imm_val, to_add);
>>
>> to use the overflow form of wi::add, i.e.:
>>
>> Â Â wide_int imm_modif = wi::add (imm_val, to_add, sign, &overflow);
>>
>> and return if "overflow" is set.
>>
>>> +
>>> +Â rtx reg = gen_reg_rtx (mode);
>>
>> gen_reg_rtx creates a new pseudo register that essentially stays
>> around until we've finished compiling the function. It's better to use:
>>
>> Â Â gen_rtx_REG (mode, LAST_VIRTUAL_REGISTER + 1);
>>
>> for cost calculations, so that we don't waste pseudo register numbers.
>>
>>> +Â rtx new_imm = GEN_INT (imm_modif.to_shwi ());
>>
>> This should be:
>>
>> Â Â rtx new_imm = immed_wide_int_const (imm_modif, mode);
>>
>> to cope with non-CONST_INT immediates, and to ensure that CONST_INT
>> immediates are properly sign-extended.
>>
>> (The rtx interfaces are a bit clunky, sorry.)
>>
>> Patch looks good to me with those changes, but someone else will
>> need to approve.
>>
>> Thanks,
>> Richard
>>
>
> Thanks for taking the time to point everything out.
>
> I've updated the patch according to Richard's comments.
> See the updated patch below. OK for trunk?
>
> Thanks,
> Vlad
>
> ---
>
> This patch optimises the choice of immediates in integer comparisons.
> Integer
> comparisons allow for different choices (e.g. a > b is equivalent to a
>>= b+1)
> and there are cases where an incremented/decremented immediate can be
> loaded into a
> register in fewer instructions. The cases are as follows:
>
> i)Â Â a >Â b or a >= b + 1
> ii)Â a <= b or a <Â b + 1
> iii) a >= b or a >Â b - 1
> iv)Â a <Â b or a <= b - 1
>
> For each comparison we check whether the equivalent can be performed in
> less instructions.
> This is done in the gimple expanders. Therefore, it requires a correct
> implementation of the
> TARGET_INSN_COST hook. The change is general and it applies to any
> integer comparison, regardless
> of types or location.
>
> For example, on AArch64 for the following code:
>
> int foo (int x)
> {
> Â return x > 0xfefffffe;
> }
>
> it generates:
>
> mov   w1, -16777217
> cmp   w0, w1
> cset   w0, cs
>
> instead of:
>
> mov   w1, 65534
> movk   w1, 0xfeff, lsl 16
> cmp   w0, w1
> cset   w0, hi
>
> Bootstrapped and regtested on aarch64-none-linux-gnu,
> x86_64-pc-linux-gnu and there are no regressions.
>
> Thanks,
> Vlad
>
> gcc/testsuite/
> Changelog for gcc/testsuite/Changelog
> 2018-08-14 Vlad Lazar <vlad.lazar@arm.com>
>
> Â Â Â Â * gcc.target/aarch64/imm_choice_comparison.c: New.
>
> gcc/
> Changelog for gcc/Changelog
> 2018-08-14 Vlad Lazar <vlad.lazar@arm.com>
> Â Â Â Â * expmed.h (canonicalize_comparison): New declaration.
> Â Â Â Â * expmed.c (canonicalize_comparison, equivalent_cmp_code): New.
> Â Â Â Â * expmed.c (emit_store_flag_1): Add call to canonicalize_comparison.
> Â Â Â Â * optabs.c (prepare_cmp_insn): Likewise.
> Â Â Â Â * rtl.h (unsigned_condition_p): New function which checks if a
> comparison operator is unsigned.
Thanks. I fixed up the ChangeLog entry and installed this on the trunk.
Richard S -- thanks for working with Vlad to push this forward.
jeff
next prev parent reply other threads:[~2018-08-16 16:35 UTC|newest]
Thread overview: 12+ messages / expand[flat|nested] mbox.gz Atom feed top
2018-08-07 13:34 Vlad Lazar
2018-08-07 20:11 ` Richard Sandiford
2018-08-09 5:48 ` Jeff Law
2018-08-10 15:54 ` Vlad Lazar
2018-08-13 14:00 ` Richard Sandiford
2018-08-14 17:01 ` Vlad Lazar
2018-08-16 10:28 ` Richard Sandiford
2018-08-16 16:35 ` Jeff Law [this message]
2018-08-16 16:46 ` Vlad Lazar
2018-08-16 16:48 ` Jeff Law
2018-08-18 8:44 Uros Bizjak
2018-08-18 10:12 ` Richard Sandiford
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=3e36417b-f23d-54dd-a5c9-dbea3a10d3ed@redhat.com \
--to=law@redhat.com \
--cc=gcc-patches@gcc.gnu.org \
--cc=ian@airs.com \
--cc=nd@arm.com \
--cc=rguenther@suse.de \
--cc=richard.sandiford@arm.com \
--cc=vlad.lazar@arm.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).