Re: [RFA] [target/87369] Prefer "bit" over "bfxil"

From: "Richard Earnshaw (lists)" <Richard.Earnshaw@arm.com>
To: Jeff Law <law@redhat.com>, gcc-patches <gcc-patches@gcc.gnu.org>,
	James Greenhalgh <james.greenhalgh@arm.com>
Subject: Re: [RFA] [target/87369] Prefer "bit" over "bfxil"
Date: Fri, 07 Dec 2018 17:31:00 -0000
Message-ID: <b58a962b-674d-15e3-2436-a46a1b2a3f9c@arm.com>
In-Reply-To: <e0aa8d50-71ae-7a58-0a7e-2b74a8fcac7d@redhat.com>

On 07/12/2018 15:52, Jeff Law wrote:
> As I suggested in the BZ, this patch rejects constants with just the
> high bit set for the recently added "bfxil" pattern.  As a result we'll
> return to using "bit" for the test in the BZ.
> 
> I'm not versed enough in aarch64 performance tuning to know if "bit" is
> actually a better choice than "bfxil".  "bit" results in better code for
> the testcase, but that seems more a function of register allocation than
> "bit" being inherently better than "bfxil".   Obviously someone with
> more aarch64 knowledge needs to make a decision here.
> 
> My first iteration of the patch changed "aarch64_high_bits_all_ones_p".
> We could still go that way too, though the name probably needs to change.
> 
> I've bootstrapped and regression tested on aarch64-linux-gnu and it
> fixes the regression.  I've also bootstrapped aarch64_be-linux-gnu, but
> haven't done any kind of regression testing on that platform.
> 
> 
> OK for the trunk?

The problem here is that the optimum solution depends on the register
classes involved, and we don't know those during combine.  If we have
general registers, then we want bfi/bfxil to be used; if we have vector
registers, then bit is preferable, as it changes 3 inter-bank register
copies to a single inter-bank copy, and that copy might be hoisted out
of a loop.

For example, this case:

unsigned long
f (unsigned long a, unsigned long b)
{
  return (b & 0x7fffffffffffffff) | (a & 0x8000000000000000);
}

Before your patch this expands to just a single bfxil instruction, and
that's exactly what we'd want here.  With it, however, I'm now seeing

f:
        and     x1, x1, 9223372036854775807
        and     x0, x0, -9223372036854775808
        orr     x0, x1, x0
        ret

which seems to be even worse than gcc-8, where we got a bfi instruction.
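
To spell out why f() now misses the pattern, here is a minimal sketch of
the patched insn condition as standalone C.  The helper names are made up
for illustration, and the power-of-two test is an assumption about what
aarch64_high_bits_all_ones_p checks; it is not copied from the backend.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for aarch64_high_bits_all_ones_p (assumed
   semantics): true when the mask is a run of ones at the top with
   zeros below, i.e. when its two's-complement negation is a power
   of two.  */
static bool
high_bits_all_ones (uint64_t mask)
{
  uint64_t neg = 0 - mask;
  return neg != 0 && (neg & (neg - 1)) == 0;
}

/* Portable population count: clear the lowest set bit until none remain.  */
static int
popcount64 (uint64_t v)
{
  int n = 0;
  for (; v != 0; v &= v - 1)
    n++;
  return n;
}

/* Sketch of the patched condition: the two masks must be complements,
   and the one with its high bits set must have more than one bit set.  */
static bool
bfxil_accepts (uint64_t op3, uint64_t op4)
{
  return op3 == ~op4
         && ((high_bits_all_ones (op3) && popcount64 (op3) != 1)
             || (high_bits_all_ones (op4) && popcount64 (op4) != 1));
}
```

With the two masks from f(), 0x7fffffffffffffff and 0x8000000000000000,
only the second has its high bits all ones, and its population count is 1,
so the condition is false and we fall back to the and/and/orr sequence
above.  A pair such as 0xffff and 0xffffffffffff0000 is still accepted.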

Ultimately, the best solution here will probably depend on which we
think is more likely, copysign or the example I give above.
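
For comparison, copysign on doubles uses the same pair of masks as f(),
just applied to floating-point bit patterns.  A rough C sketch, assuming
IEEE 754 binary64 doubles; copysign_bits is a hypothetical name, not
anything in GCC:

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: copysign written as a mask-and-merge.  Takes the
   magnitude bits of MAG and the sign bit of SGN.  Assumes double is
   IEEE 754 binary64.  */
static double
copysign_bits (double mag, double sgn)
{
  uint64_t m, s, r;
  double out;

  /* memcpy is the portable way to reinterpret the bits.  */
  memcpy (&m, &mag, sizeof m);
  memcpy (&s, &sgn, sizeof s);

  r = (m & UINT64_C (0x7fffffffffffffff))
      | (s & UINT64_C (0x8000000000000000));

  memcpy (&out, &r, sizeof out);
  return out;
}
```

As far as the masks go, the two cases look alike; what differs is where
the operands live, which is the part we can't see during combine.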

It might be that for copysign we'll need to expand initially to some
unspec that uses a register initialized with a suitable immediate, but
otherwise hides the operation from combine until after that has run,
thus preventing the compiler from doing the otherwise right thing.  We'd
lose in the (hopefully) rare case where the operands really were in
general registers, but otherwise win for the more common case where they
aren't.

R.

> 
> Jeff
> 
> 
> P
> 
> 	PR target/87369
> 	* config/aarch64/aarch64.md (aarch64_bfxil<mode>): Do not accept
> 	constant with just the high bit set.  That's better handled by
> 	the "bit" pattern.
> 
> diff --git a/gcc/config/aarch64/aarch64.md b/gcc/config/aarch64/aarch64.md
> index 88f66104db3..ad6822410c2 100644
> --- a/gcc/config/aarch64/aarch64.md
> +++ b/gcc/config/aarch64/aarch64.md
> @@ -5342,9 +5342,11 @@
>  		    (match_operand:GPI 3 "const_int_operand" "n, Ulc"))
>  	    (and:GPI (match_operand:GPI 2 "register_operand" "0,r")
>  		    (match_operand:GPI 4 "const_int_operand" "Ulc, n"))))]
> -  "(INTVAL (operands[3]) == ~INTVAL (operands[4]))
> -  && (aarch64_high_bits_all_ones_p (INTVAL (operands[3]))
> -    || aarch64_high_bits_all_ones_p (INTVAL (operands[4])))"
> +  "(INTVAL (operands[3]) == ~INTVAL (operands[4])
> +    && ((aarch64_high_bits_all_ones_p (INTVAL (operands[3]))
> +	 && popcount_hwi (INTVAL (operands[3])) != 1)
> +        || (aarch64_high_bits_all_ones_p (INTVAL (operands[4]))
> +	    && popcount_hwi (INTVAL (operands[4])) != 1)))"
>    {
>      switch (which_alternative)
>      {
>
