On Tue, Nov 17, 2015 at 6:08 PM, James Greenhalgh wrote: > On Tue, Nov 17, 2015 at 05:21:01PM +0800, Bin Cheng wrote: >> Hi, >> GIMPLE IVO needs to call backend interface to calculate costs for addr >> expressions like below: >> FORM1: "r73 + r74 + 16380" >> FORM2: "r73 << 2 + r74 + 16380" >> >> They are invalid address expression on AArch64, so will be legitimized by >> aarch64_legitimize_address. Below are what we got from that function: >> >> For FORM1, the address expression is legitimized into below insn sequence >> and rtx: >> r84:DI=r73:DI+r74:DI >> r85:DI=r84:DI+0x3000 >> r83:DI=r85:DI >> "r83 + 4092" >> >> For FORM2, the address expression is legitimized into below insn sequence >> and rtx: >> r108:DI=r73:DI<<0x2 >> r109:DI=r108:DI+r74:DI >> r110:DI=r109:DI+0x3000 >> r107:DI=r110:DI >> "r107 + 4092" >> >> So the costs computed are 12/16 respectively. The high cost prevents IVO >> from choosing right candidates. Besides cost computation, I also think the >> legitmization is bad in terms of code generation. >> The root cause in aarch64_legitimize_address can be described by it's >> comment: >> /* Try to split X+CONST into Y=X+(CONST & ~mask), Y+(CONST&mask), >> where mask is selected by alignment and size of the offset. >> We try to pick as large a range for the offset as possible to >> maximize the chance of a CSE. However, for aligned addresses >> we limit the range to 4k so that structures with different sized >> elements are likely to use the same base. */ >> I think the split of CONST is intended for REG+CONST where the const offset >> is not in the range of AArch64's addressing modes. Unfortunately, it >> doesn't explicitly handle/reject "REG+REG+CONST" and "REG+REG<> when the CONST are in the range of addressing modes. As a result, these two >> cases fallthrough this logic, resulting in sub-optimal results. >> >> It's obvious we can do below legitimization: >> FORM1: >> r83:DI=r73:DI+r74:DI >> "r83 + 16380" >> FORM2: >> r107:DI=0x3ffc >> r106:DI=r74:DI+r107:DI >> REG_EQUAL r74:DI+0x3ffc >> "r106 + r73 << 2" >> >> This patch handles these two cases as described. > > Thanks for the description, it made the patch very easy to review. I only > have a style comment. > >> Bootstrap & test on AArch64 along with other patch. Is it OK? >> >> 2015-11-04 Bin Cheng >> Jiong Wang >> >> * config/aarch64/aarch64.c (aarch64_legitimize_address): Handle >> address expressions like REG+REG+CONST and REG+NON_REG+CONST. > >> diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c >> index 5c8604f..47875ac 100644 >> --- a/gcc/config/aarch64/aarch64.c >> +++ b/gcc/config/aarch64/aarch64.c >> @@ -4710,6 +4710,51 @@ aarch64_legitimize_address (rtx x, rtx /* orig_x */, machine_mode mode) >> { >> HOST_WIDE_INT offset = INTVAL (XEXP (x, 1)); >> HOST_WIDE_INT base_offset; >> + rtx op0 = XEXP (x,0); >> + >> + if (GET_CODE (op0) == PLUS) >> + { >> + rtx op0_ = XEXP (op0, 0); >> + rtx op1_ = XEXP (op0, 1); > > I don't see this trailing _ on a variable name in many places in the source > tree (mostly in the Go frontend), and certainly not in the aarch64 backend. > Can we pick a different name for op0_ and op1_? > >> + >> + /* RTX pattern in the form of (PLUS (PLUS REG, REG), CONST) will >> + reach here, the 'CONST' may be valid in which case we should >> + not split. */ >> + if (REG_P (op0_) && REG_P (op1_)) >> + { >> + machine_mode addr_mode = GET_MODE (op0); >> + rtx addr = gen_reg_rtx (addr_mode); >> + >> + rtx ret = plus_constant (addr_mode, addr, offset); >> + if (aarch64_legitimate_address_hook_p (mode, ret, false)) >> + { >> + emit_insn (gen_adddi3 (addr, op0_, op1_)); >> + return ret; >> + } >> + } >> + /* RTX pattern in the form of (PLUS (PLUS REG, NON_REG), CONST) >> + will reach here. If (PLUS REG, NON_REG) is valid addr expr, >> + we split it into Y=REG+CONST, Y+NON_REG. */ >> + else if (REG_P (op0_) || REG_P (op1_)) >> + { >> + machine_mode addr_mode = GET_MODE (op0); >> + rtx addr = gen_reg_rtx (addr_mode); >> + >> + /* Switch to make sure that register is in op0_. */ >> + if (REG_P (op1_)) >> + std::swap (op0_, op1_); >> + >> + rtx ret = gen_rtx_fmt_ee (PLUS, addr_mode, addr, op1_); >> + if (aarch64_legitimate_address_hook_p (mode, ret, false)) >> + { >> + addr = force_operand (plus_constant (addr_mode, >> + op0_, offset), >> + NULL_RTX); >> + ret = gen_rtx_fmt_ee (PLUS, addr_mode, addr, op1_); >> + return ret; >> + } > > The logic here is a bit hairy to follow, you construct a PLUS RTX to check > aarch64_legitimate_address_hook_p, then construct a different PLUS RTX > to use as the return value. This can probably be clarified by choosing a > name other than ret for the temporary address expression you construct. > > It would also be good to take some of your detailed description and write > that here. Certainly I found the explicit examples in the cover letter > easier to follow than: > >> + /* RTX pattern in the form of (PLUS (PLUS REG, NON_REG), CONST) >> + will reach here. If (PLUS REG, NON_REG) is valid addr expr, >> + we split it into Y=REG+CONST, Y+NON_REG. */ > > Otherwise this patch is OK. Thanks for reviewing, here is the updated patch. Thanks, bin