Re: [PATCH AArch64]Handle REG+REG+CONST and REG+NON_REG+CONST in legitimize address

From: "Bin.Cheng" <amker.cheng@gmail.com>
To: James Greenhalgh <james.greenhalgh@arm.com>
Cc: Bin Cheng <bin.cheng@arm.com>,
	gcc-patches List <gcc-patches@gcc.gnu.org>
Subject: Re: [PATCH AArch64]Handle REG+REG+CONST and REG+NON_REG+CONST in legitimize address
Date: Thu, 19 Nov 2015 02:32:00 -0000
Message-ID: <CAHFci2_fK2LFS8cjaePZr66tCgL8YufmrswyYUGUFb00MbTMRQ@mail.gmail.com>
In-Reply-To: <20151117100800.GA6727@arm.com>

[-- Attachment #1: Type: text/plain, Size: 5849 bytes --]

On Tue, Nov 17, 2015 at 6:08 PM, James Greenhalgh
<james.greenhalgh@arm.com> wrote:
> On Tue, Nov 17, 2015 at 05:21:01PM +0800, Bin Cheng wrote:
>> Hi,
>> GIMPLE IVO needs to call backend interface to calculate costs for addr
>> expressions like below:
>>    FORM1: "r73 + r74 + 16380"
>>    FORM2: "r73 << 2 + r74 + 16380"
>>
>> They are invalid address expressions on AArch64, so they will be
>> legitimized by aarch64_legitimize_address.  Below is what we get from
>> that function:
>>
>> For FORM1, the address expression is legitimized into below insn sequence
>> and rtx:
>>    r84:DI=r73:DI+r74:DI
>>    r85:DI=r84:DI+0x3000
>>    r83:DI=r85:DI
>>    "r83 + 4092"
>>
>> For FORM2, the address expression is legitimized into below insn sequence
>> and rtx:
>>    r108:DI=r73:DI<<0x2
>>    r109:DI=r108:DI+r74:DI
>>    r110:DI=r109:DI+0x3000
>>    r107:DI=r110:DI
>>    "r107 + 4092"
>>
>> So the costs computed are 12/16 respectively.  The high cost prevents IVO
>> from choosing the right candidates.  Besides cost computation, I also think
>> the legitimization is bad in terms of code generation.
>> The root cause in aarch64_legitimize_address can be described by its
>> comment:
>>    /* Try to split X+CONST into Y=X+(CONST & ~mask), Y+(CONST&mask),
>>       where mask is selected by alignment and size of the offset.
>>       We try to pick as large a range for the offset as possible to
>>       maximize the chance of a CSE.  However, for aligned addresses
>>       we limit the range to 4k so that structures with different sized
>>       elements are likely to use the same base.  */
>> I think the split of CONST is intended for REG+CONST where the const offset
>> is not in the range of AArch64's addressing modes.  Unfortunately, it
>> doesn't explicitly handle/reject "REG+REG+CONST" and "REG+REG<<SCALE+CONST"
>> when the CONST is in the range of the addressing modes.  As a result, these
>> two cases fall through this logic, producing sub-optimal results.
>>
>> It's obvious we can do the legitimization below instead:
>> FORM1:
>>    r83:DI=r73:DI+r74:DI
>>    "r83 + 16380"
>> FORM2:
>>    r107:DI=0x3ffc
>>    r106:DI=r74:DI+r107:DI
>>       REG_EQUAL r74:DI+0x3ffc
>>    "r106 + r73 << 2"
>>
>> This patch handles these two cases as described.
>
> Thanks for the description, it made the patch very easy to review. I only
> have a style comment.
>
>> Bootstrap & test on AArch64 along with other patch.  Is it OK?
>>
>> 2015-11-04  Bin Cheng  <bin.cheng@arm.com>
>>           Jiong Wang  <jiong.wang@arm.com>
>>
>>       * config/aarch64/aarch64.c (aarch64_legitimize_address): Handle
>>       address expressions like REG+REG+CONST and REG+NON_REG+CONST.
>
>> diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
>> index 5c8604f..47875ac 100644
>> --- a/gcc/config/aarch64/aarch64.c
>> +++ b/gcc/config/aarch64/aarch64.c
>> @@ -4710,6 +4710,51 @@ aarch64_legitimize_address (rtx x, rtx /* orig_x  */, machine_mode mode)
>>      {
>>        HOST_WIDE_INT offset = INTVAL (XEXP (x, 1));
>>        HOST_WIDE_INT base_offset;
>> +      rtx op0 = XEXP (x,0);
>> +
>> +      if (GET_CODE (op0) == PLUS)
>> +     {
>> +       rtx op0_ = XEXP (op0, 0);
>> +       rtx op1_ = XEXP (op0, 1);
>
> I don't see this trailing _ on a variable name in many places in the source
> tree (mostly in the Go frontend), and certainly not in the aarch64 backend.
> Can we pick a different name for op0_ and op1_?
>
>> +
>> +       /* RTX pattern in the form of (PLUS (PLUS REG, REG), CONST) will
>> +          reach here, the 'CONST' may be valid in which case we should
>> +          not split.  */
>> +       if (REG_P (op0_) && REG_P (op1_))
>> +         {
>> +           machine_mode addr_mode = GET_MODE (op0);
>> +           rtx addr = gen_reg_rtx (addr_mode);
>> +
>> +           rtx ret = plus_constant (addr_mode, addr, offset);
>> +           if (aarch64_legitimate_address_hook_p (mode, ret, false))
>> +             {
>> +               emit_insn (gen_adddi3 (addr, op0_, op1_));
>> +               return ret;
>> +             }
>> +         }
>> +       /* RTX pattern in the form of (PLUS (PLUS REG, NON_REG), CONST)
>> +          will reach here.  If (PLUS REG, NON_REG) is valid addr expr,
>> +          we split it into Y=REG+CONST, Y+NON_REG.  */
>> +       else if (REG_P (op0_) || REG_P (op1_))
>> +         {
>> +           machine_mode addr_mode = GET_MODE (op0);
>> +           rtx addr = gen_reg_rtx (addr_mode);
>> +
>> +           /* Switch to make sure that register is in op0_.  */
>> +           if (REG_P (op1_))
>> +             std::swap (op0_, op1_);
>> +
>> +           rtx ret = gen_rtx_fmt_ee (PLUS, addr_mode, addr, op1_);
>> +           if (aarch64_legitimate_address_hook_p (mode, ret, false))
>> +             {
>> +               addr = force_operand (plus_constant (addr_mode,
>> +                                                    op0_, offset),
>> +                                     NULL_RTX);
>> +               ret = gen_rtx_fmt_ee (PLUS, addr_mode, addr, op1_);
>> +               return ret;
>> +             }
>
> The logic here is a bit hairy to follow, you construct a PLUS RTX to check
> aarch64_legitimate_address_hook_p, then construct a different PLUS RTX
> to use as the return value. This can probably be clarified by choosing a
> name other than ret for the temporary address expression you construct.
>
> It would also be good to take some of your detailed description and write
> that here. Certainly I found the explicit examples in the cover letter
> easier to follow than:
>
>> +       /* RTX pattern in the form of (PLUS (PLUS REG, NON_REG), CONST)
>> +          will reach here.  If (PLUS REG, NON_REG) is valid addr expr,
>> +          we split it into Y=REG+CONST, Y+NON_REG.  */
>
> Otherwise this patch is OK.
Thanks for reviewing, here is the updated patch.

Thanks,
bin

[-- Attachment #2: aarch64_legitimize_addr-20151105.txt --]
[-- Type: text/plain, Size: 2564 bytes --]

diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
index 5c8604f..64bc6a4 100644
--- a/gcc/config/aarch64/aarch64.c
+++ b/gcc/config/aarch64/aarch64.c
@@ -4704,13 +4704,65 @@ aarch64_legitimize_address (rtx x, rtx /* orig_x  */, machine_mode mode)
      We try to pick as large a range for the offset as possible to
      maximize the chance of a CSE.  However, for aligned addresses
      we limit the range to 4k so that structures with different sized
-     elements are likely to use the same base.  */
+     elements are likely to use the same base.  We need to be careful
+     not to split CONST for some forms of address expressions, otherwise it
+     will generate sub-optimal code.  */
 
   if (GET_CODE (x) == PLUS && CONST_INT_P (XEXP (x, 1)))
     {
       HOST_WIDE_INT offset = INTVAL (XEXP (x, 1));
       HOST_WIDE_INT base_offset;
 
+      if (GET_CODE (XEXP (x, 0)) == PLUS)
+	{
+	  rtx op0 = XEXP (XEXP (x, 0), 0);
+	  rtx op1 = XEXP (XEXP (x, 0), 1);
+
+	  /* For addr expression in the form like "r1 + r2 + 0x3ffc".
+	     Since the offset is within range supported by addressing
+	     mode "reg+offset", we don't split the const and legalize
+	     it into below insn and expr sequence:
+	       r3 = r1 + r2;
+	       "r3 + 0x3ffc".  */
+	  if (REG_P (op0) && REG_P (op1))
+	    {
+	      machine_mode addr_mode = GET_MODE (x);
+	      rtx base = gen_reg_rtx (addr_mode);
+	      rtx addr = plus_constant (addr_mode, base, offset);
+
+	      if (aarch64_legitimate_address_hook_p (mode, addr, false))
+		{
+		  emit_insn (gen_adddi3 (base, op0, op1));
+		  return addr;
+		}
+	    }
+	  /* For addr expression in the form like "r1 + r2<<2 + 0x3ffc".
+	     Like above, we don't split the const and legalize it into
+	     below insn and expr sequence:
+	       r3 = 0x3ffc;
+	       r4 = r1 + r3;
+	       "r4 + r2<<2".  */
+	  else if (REG_P (op0) || REG_P (op1))
+	    {
+	      machine_mode addr_mode = GET_MODE (x);
+	      rtx base = gen_reg_rtx (addr_mode);
+
+	      /* Switch to make sure that register is in op0.  */
+	      if (REG_P (op1))
+		std::swap (op0, op1);
+
+	      rtx addr = gen_rtx_fmt_ee (PLUS, addr_mode, base, op1);
+
+	      if (aarch64_legitimate_address_hook_p (mode, addr, false))
+		{
+		  base = force_operand (plus_constant (addr_mode,
+						       op0, offset),
+					NULL_RTX);
+		  return gen_rtx_fmt_ee (PLUS, addr_mode, base, op1);
+		}
+	    }
+	}
+
       /* Does it look like we'll need a load/store-pair operation?  */
       if (GET_MODE_SIZE (mode) > 16
 	  || mode == TImode)
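
For readers following the numbers in the description above, here is a minimal
sketch of the arithmetic, not part of the patch and not GCC code.  It assumes
the simplest case of the split described in the comment (a fixed 4k mask; the
real function selects the mask by alignment and size), and the names
SplitOffset, split_offset_4k and fits_scaled_uimm12 are hypothetical.  It shows
why 16380 used to be split into 0x3000 + 4092, and why the split is unnecessary
for a 4-byte access: 16380 = 4095 * 4 is exactly the largest offset the scaled
unsigned 12-bit immediate form of LDR/STR can encode.

```cpp
#include <cstdint>

// Hypothetical illustration only; none of these names exist in GCC.
constexpr std::int64_t kMask4k = 0xfff;

struct SplitOffset
{
  std::int64_t base_offset; // part added to the base register up front
  std::int64_t remainder;   // part left in the address as the immediate
};

// Split OFFSET as X+CONST -> Y = X + (CONST & ~mask), Y + (CONST & mask),
// assuming a fixed 4k mask.
constexpr SplitOffset
split_offset_4k (std::int64_t offset)
{
  return SplitOffset{offset & ~kMask4k, offset & kMask4k};
}

// True if OFFSET fits the scaled unsigned 12-bit immediate form of
// LDR/STR for an access of SIZE bytes: a non-negative multiple of SIZE
// no larger than 4095 * SIZE.
constexpr bool
fits_scaled_uimm12 (std::int64_t offset, std::int64_t size)
{
  return offset >= 0 && offset % size == 0 && offset / size <= 4095;
}

// FORM1 from the description: 16380 == 0x3ffc.
static_assert (split_offset_4k (16380).base_offset == 0x3000,
	       "high part matches r85 = r84 + 0x3000");
static_assert (split_offset_4k (16380).remainder == 4092,
	       "low part matches \"r83 + 4092\"");
static_assert (fits_scaled_uimm12 (16380, 4),
	       "a 4-byte access can use r83 + 16380 directly");
```

The static_asserts only restate the figures quoted in the description; whether
an offset is accepted in the patch itself is decided by
aarch64_legitimate_address_hook_p, not by a check like the one sketched here.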
