On 9 June 2017 at 15:13, Richard Earnshaw (lists) wrote: > On 21/02/17 16:54, charles.baylis@linaro.org wrote: >> From: Charles Baylis >> >> This patch adds support for modelling the varying costs of >> different addressing modes. The generic cost table treats >> all addressing modes as having equal cost. The cost table >> for Cortex-A57 is derived from http://infocenter.arm.com/help/topic/com.arm.doc.uan0015b/Cortex_A57_Software_Optimization_Guide_external.pdf >> and treats addressing modes with write-back as having a >> cost equal to one additional instruction. >> >> gcc/ChangeLog: >> >> Charles Baylis >> >> * config/arm/aarch-common-protos.h (enum arm_addr_mode_op): New. >> (struct addr_mode_cost_table): New. >> (struct cpu_cost_table): Add pointer to an addr_mode_cost_table. >> * config/arm/aarch-cost-tables.h: (generic_addr_mode_costs): New. >> (generic_extra_costs) Initialise aarch32_addr_mode. >> (cortexa53_extra_costs) Likewise. >> (addr_mode_costs_cortexa57) New. >> (cortexa57_extra_costs) Initialise aarch32_addr_mode. >> (exynosm1_extra_costs) Likewise. >> (xgene1_extra_costs) Likewise. >> (qdf24xx_extra_costs) Likewise. >> * config/arm/arm.c (cortexa9_extra_costs) Initialise aarch32_addr_mode. >> (cortexa9_extra_costs) Likewise. >> (cortexa8_extra_costs) Likewise. >> (cortexa5_extra_costs) Likewise. >> (cortexa7_extra_costs) Likewise. >> (cortexa12_extra_costs) Likewise. >> (cortexv7m_extra_costs) Likewise. >> (arm_mem_costs): Use table lookup to calculate cost of addressing >> mode. >> >> Change-Id: If71bd7c4f4bb876c5ed82dc28791130efb8bf89e >> --- >> gcc/config/arm/aarch-common-protos.h | 16 +++++++++++ >> gcc/config/arm/aarch-cost-tables.h | 54 ++++++++++++++++++++++++++++++---- >> gcc/config/arm/arm.c | 56 ++++++++++++++++++++++++++++++------ >> 3 files changed, 113 insertions(+), 13 deletions(-) >> >> diff --git a/gcc/config/arm/aarch-common-protos.h b/gcc/config/arm/aarch-common-protos.h >> index 7c2bb4c..f6fcc94 100644 >> --- a/gcc/config/arm/aarch-common-protos.h >> +++ b/gcc/config/arm/aarch-common-protos.h >> @@ -130,6 +130,21 @@ struct vector_cost_table >> const int alu; >> }; >> >> +enum arm_addr_mode_op >> +{ >> + AMO_DEFAULT, >> + AMO_NO_WB, >> + AMO_WB, >> + AMO_MAX /* for array size */ > > Comment style: Capital letter at start, full stop and two spaces at the end. Done. > The enum and structure below should have some comments explaining what > they are for. Done. >> +const struct addr_mode_cost_table generic_addr_mode_costs = >> +{ >> + /* int */ >> + { 0, 0, 0 }, > > Elements need to be commented, otherwise minor changes to the contents > of the arrays is hard to update and maintain. Done. >> + /* Addressing mode */ > > Comment style. Done. >> - /* TODO: Add table-driven costs for addressing modes. */ >> + arm_addr_mode_op op_type; >> + switch (GET_CODE(XEXP (x, 0))) >> + { >> + case REG: >> + default: >> + op_type = AMO_DEFAULT; > > Why would REG and default share the same cost? Presumably default is > too complex to recognize, but then it's unlikely to be cheap. Default covers literals in various forms of RTL, for which the cost is the same as regular, and PIC, which is handled in the original code above this section. >> + break; >> + case PLUS: >> + case MINUS: >> + op_type = AMO_NO_WB; > > GCC doesn't support MINUS in addresses, even though the architecture > could in theory handle this. I've noted that in a comment, but kept the "case MINUS:" in place. > Also, I think you need a different cost for scaled offset addressing, > and possibly even for different scaling factors and types of scaling. ... see below: >> + break; >> + case PRE_INC: >> + case PRE_DEC: >> + case POST_INC: >> + case POST_DEC: >> + case PRE_MODIFY: >> + case POST_MODIFY: >> + op_type = AMO_WB; > > Pre and post might also need separate entries (plus further entries for > scaling). A post operation might happen in parallel with the data > fetch, while a pre operation must happen before the address can be sent > to the load/store pipe. The {DEFAULT, NO_WB, WB} range is also Ramana's requested design. I think this is OK because it is sufficient to describe the currently publicly documented microarchitectures, and the structure is readily extensible if future cores require more detailed modelling. Expanding the range of AMO_* entries in the array makes the tables more unwieldy for no immediate benefit. >> + break; >> + } > > blank line between the switch block and the subsequent statements. > Also, the following needs to laid out in GNU style. Done. This version of the patch also moves the Cortex-A57 tuning (which was present in the earlier version) into a separate patch. Discussion of that in a follow-up post. Updated patch attached