Re: [GCC][PATCH][Aarch64] Exploiting BFXIL when OR-ing two AND-operations with appropriate bitmasks

From: Sam Tebbs <sam.tebbs@arm.com>
To: Richard Henderson <rth@twiddle.net>,
	Sudakshina Das <sudi.das@arm.com>,
	"gcc-patches@gcc.gnu.org" <gcc-patches@gcc.gnu.org>
Cc: nd <nd@arm.com>,
	richard.earnshaw@arm.com, marcus.shawcroft@arm.com,
	james.greenhalgh@arm.com
Subject: Re: [GCC][PATCH][Aarch64] Exploiting BFXIL when OR-ing two AND-operations with appropriate bitmasks
Date: Fri, 20 Jul 2018 09:31:00 -0000
Message-ID: <2382d230-7c81-15df-e880-aaf7f25efb00@arm.com>
In-Reply-To: <4fec5bc7-aa21-6470-c9e1-6cfb0dd17efe@arm.com>

Hi all,

Here is an updated patch that does the following:

* Adds a new constraint in config/aarch64/constraints.md to check for a 
constant integer that is left consecutive. This addresses Richard 
Henderson's suggestion about combining the aarch64_is_left_consecutive 
call and the const_int match in the pattern.

* Merges the two patterns defined into one.

* Changes the pattern's type attribute to bfm.

* Improves the comment above the aarch64_is_left_consecutive implementation.

* Makes the pattern use the GPI iterator to accept smaller integer sizes 
(an extra test is added to check for this).

* Improves the tests in combine_bfxil.c to ensure they aren't optimised 
away and that they check for the pattern's correctness.

Below is a new changelog and the example given before.

Is this OK for trunk?

This patch adds an optimisation that exploits the AArch64 BFXIL instruction
when or-ing the result of two bitwise and operations with non-overlapping
bitmasks (e.g. (a & 0xFFFF0000) | (b & 0x0000FFFF)).
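
To make the mask condition concrete, here is a minimal standalone sketch in
plain C. It is not the code from the patch: it uses uint64_t in place of
HOST_WIDE_INT, and bfxil_width is a hypothetical helper invented for this
illustration. It accepts a pair of masks only when they are exact complements
and the "keep" mask is a run of ones starting at the MSB, and it rejects the
degenerate all-zeros and all-ones cases.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Same expression as aarch64_is_left_consecutive, but on a fixed-width
   unsigned type: true when the set bits of I form a single run that
   starts at the most significant bit.  */
static bool is_left_consecutive(uint64_t i)
{
    return (i | (i - 1)) == UINT64_MAX;
}

/* Hypothetical helper (not part of the patch): return the number of low
   bits a BFXIL would insert for this pair of masks, or 0 when the pair
   does not qualify in this sketch.  */
static unsigned bfxil_width(uint64_t keep_mask, uint64_t insert_mask)
{
    if (keep_mask != ~insert_mask)
        return 0;               /* masks must be exact complements */
    if (keep_mask == 0 || insert_mask == 0)
        return 0;               /* degenerate cases, rejected here */
    if (!is_left_consecutive(keep_mask))
        return 0;               /* kept bits must run from the MSB */

    unsigned width = 0;
    while (insert_mask != 0) {  /* insert_mask is now 2^width - 1 */
        width++;
        insert_mask >>= 1;
    }
    assert(width >= 1 && width <= 63);
    return width;
}
```

For the 64-bit masks 0xffffffff00000000 and 0x00000000ffffffff the helper
returns 32, which is the width operand of the bfxil shown further down.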

Example:

unsigned long long combine(unsigned long long a, unsigned long long b) {
    return (a & 0xffffffff00000000ll) | (b & 0x00000000ffffffffll);
}

void read(unsigned long long a, unsigned long long b, unsigned long long *c) {
    *c = combine(a, b);
}

When compiled with -O2 without this patch, read results in:

read:
    and     x5, x1, #0xffffffff
    and     x4, x0, #0xffffffff00000000
    orr     x4, x4, x5
    str     x4, [x2]
    ret

With this patch applied, it results in:

read:
    mov     x4, x0
    bfxil   x4, x1, 0, 32
    str     x4, [x2]
    ret
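
As a sanity check on the equivalence of the two sequences, the following is a
small C model of what bfxil does to its destination register. The function
names (bfxil64, combine_and_or, combine_bfxil) are made up for this sketch and
do not appear in the patch or its tests; the model only covers 64-bit
registers.

```c
#include <assert.h>
#include <stdint.h>

/* Model of "bfxil xd, xn, #lsb, #width": copy WIDTH bits of SRC starting
   at bit LSB into the low WIDTH bits of DST, leaving the upper bits of
   DST unchanged.  */
static uint64_t bfxil64(uint64_t dst, uint64_t src,
                        unsigned lsb, unsigned width)
{
    assert(width >= 1 && lsb + width <= 64);
    uint64_t low = (width == 64) ? UINT64_MAX
                                 : (UINT64_C(1) << width) - 1;
    return (dst & ~low) | ((src >> lsb) & low);
}

/* The and/and/orr sequence from the unpatched output.  */
static uint64_t combine_and_or(uint64_t a, uint64_t b)
{
    return (a & 0xffffffff00000000ull) | (b & 0x00000000ffffffffull);
}

/* The mov/bfxil sequence from the patched output.  */
static uint64_t combine_bfxil(uint64_t a, uint64_t b)
{
    uint64_t x4 = a;                /* mov   x4, x0        */
    return bfxil64(x4, b, 0, 32);   /* bfxil x4, x1, 0, 32 */
}
```

Both functions keep the upper 32 bits of a and take the lower 32 bits from b,
so they agree for every input.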

Bootstrapped and regtested on aarch64-none-linux-gnu and 
aarch64-none-elf with no regressions.

gcc/
2018-07-11  Sam Tebbs  <sam.tebbs@arm.com>

        PR target/85628
        * config/aarch64/aarch64.md (*aarch64_bfxil):
        Define.
        * config/aarch64/constraints.md (Ulc): Define.
        * config/aarch64/aarch64-protos.h (aarch64_is_left_consecutive):
        Define.
        * config/aarch64/aarch64.c (aarch64_is_left_consecutive): New function.

gcc/testsuite
2018-07-11  Sam Tebbs  <sam.tebbs@arm.com>

        PR target/85628
        * gcc.target/aarch64/combine_bfxil.c: New file.
        * gcc.target/aarch64/combine_bfxil_2.c: New file.

On 07/19/2018 02:02 PM, Sam Tebbs wrote:
> Hi Richard,
>
> Thanks for the feedback. I find that using "is_left_consecutive" is 
> more descriptive than checking for it being a power of 2 - 1, since it 
> describes the requirement (having consecutive ones from the MSB) more 
> explicitly. I would be happy to change it though if that is the 
> consensus.
>
> I have addressed your point about just returning the string instead of 
> using output_asm_insn and have changed it locally. I'll send an 
> updated patch soon.
>
>
> On 07/17/2018 02:33 AM, Richard Henderson wrote:
>> On 07/16/2018 10:10 AM, Sam Tebbs wrote:
>>> +++ b/gcc/config/aarch64/aarch64.c
>>> @@ -1439,6 +1439,14 @@ aarch64_hard_regno_caller_save_mode (unsigned regno, unsigned,
>>>      return SImode;
>>>  }
>>>
>>> +/* Implement IS_LEFT_CONSECUTIVE.  Check if an integer's bits are consecutive
>>> +   ones from the MSB.  */
>>> +bool
>>> +aarch64_is_left_consecutive (HOST_WIDE_INT i)
>>> +{
>>> +  return (i | (i - 1)) == HOST_WIDE_INT_M1;
>>> +}
>>> +
>> ...
>>> +(define_insn "*aarch64_bfxil"
>>> +  [(set (match_operand:DI 0 "register_operand" "=r")
>>> +    (ior:DI (and:DI (match_operand:DI 1 "register_operand" "r")
>>> +            (match_operand 3 "const_int_operand"))
>>> +        (and:DI (match_operand:DI 2 "register_operand" "0")
>>> +            (match_operand 4 "const_int_operand"))))]
>>> +  "INTVAL (operands[3]) == ~INTVAL (operands[4])
>>> +    && aarch64_is_left_consecutive (INTVAL (operands[4]))"
>> Better is to use a define_predicate to merge both that second test 
>> and the
>> const_int_operand.
>>
>> (I'm not sure about the "left_consecutive" language either.
>> Isn't it more descriptive to say that op3 is a power of 2 minus 1?)
>>
>> (define_predicate "pow2m1_operand"
>>    (and (match_code "const_int")
>>         (match_test "exact_pow2 (INTVAL(op) + 1) > 0")))
>>
>> and use
>>
>>    (match_operand:DI 3 "pow2m1_operand")
>>
>> and then just the
>>
>>    INTVAL (operands[3]) == ~INTVAL (operands[4])
>>
>> test.
>>
>> Also, don't omit the modes for the constants.
>> Also, there's no reason this applies only to DI mode;
>> use the GPI iterator and %<w> in the output template.
>>
>>> +    HOST_WIDE_INT op3 = INTVAL (operands[3]);
>>> +    operands[3] = GEN_INT (ceil_log2 (op3));
>>> +    output_asm_insn ("bfxil\\t%0, %1, 0, %3", operands);
>>> +    return "";
>> You can just return the string that you passed to output_asm_insn.
>>
>>> +  }
>>> +  [(set_attr "type" "bfx")]
>> The other aliases of the BFM insn use type "bfm";
>> "bfx" appears to be aliases of UBFM and SBFM.
>> Not that it appears to matter to the scheduling
>> descriptions, but it is inconsistent.
>>
>>
>> r~
>
