public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed
* [PATCH 1/2] [ARM] Refactor costs calculation for MEM.
  2017-02-21 16:54 [PATCH 0/2] [ARM] PR61551 addressing mode costs charles.baylis
@ 2017-02-21 16:54 ` charles.baylis
  2017-02-23  7:46   ` Bernhard Reutner-Fischer
  2017-06-09 13:59   ` Richard Earnshaw (lists)
  2017-02-21 16:58 ` [PATCH 2/2] [ARM] Add table of costs for AAarch32 addressing modes charles.baylis
  2017-05-12 15:53 ` [PATCH 0/2] [ARM] PR61551 addressing mode costs Charles Baylis
  2 siblings, 2 replies; 10+ messages in thread
From: charles.baylis @ 2017-02-21 16:54 UTC (permalink / raw)
  To: Ramana.Radhakrishnan, kyrylo.tkachov; +Cc: rearnsha, gcc-patches

From: Charles Baylis <charles.baylis@linaro.org>

This patch moves the calculation of costs for MEM into a
separate function, and reforms the calculation into two
parts. Firstly any additional cost of the addressing mode
is calculated, and then the cost of the memory access itself
is added.

In this patch, the calculation of the cost of the addressing
mode is left as a placeholder, to be added in a subsequent
patch.

gcc/ChangeLog:

<date>  Charles Baylis  <charles.baylis@linaro.org>

        * config/arm/arm.c (arm_mem_costs): New function.
        (arm_rtx_costs_internal): Use arm_mem_costs.

Change-Id: I99e93406ea39ee31f71c7bf428ad3e127b7a618e
---
 gcc/config/arm/arm.c | 66 +++++++++++++++++++++++++++++++++-------------------
 1 file changed, 42 insertions(+), 24 deletions(-)

diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c
index 6cae178..7f002f1 100644
--- a/gcc/config/arm/arm.c
+++ b/gcc/config/arm/arm.c
@@ -9072,6 +9072,47 @@ arm_unspec_cost (rtx x, enum rtx_code /* outer_code */, bool speed_p, int *cost)
 	  }								\
 	while (0);
 
+/* Helper function for arm_rtx_costs_internal. Calculates the cost of a MEM,
+   considering the costs of the addressing mode and memory access
+   separately.  */
+static bool
+arm_mem_costs (rtx x, const struct cpu_cost_table *extra_cost,
+	       int *cost, bool speed_p)
+{
+  machine_mode mode = GET_MODE (x);
+  if (flag_pic
+      && GET_CODE (XEXP (x, 0)) == PLUS
+      && will_be_in_index_register (XEXP (XEXP (x, 0), 1)))
+    /* This will be split into two instructions.  Add the cost of the
+       additional instruction here.  The cost of the memory access is computed
+       below.  See arm.md:calculate_pic_address.  */
+    *cost = COSTS_N_INSNS (1);
+  else
+    *cost = 0;
+
+  /* Calculate cost of the addressing mode.  */
+  if (speed_p)
+  {
+    /* TODO: Add table-driven costs for addressing modes.  */
+  }
+
+  /* cost of memory access */
+  if (speed_p)
+  {
+    /* data transfer is transfer size divided by bus width.  */
+    int bus_width = arm_arch7 ? 8 : 4;
+    *cost += COSTS_N_INSNS((GET_MODE_SIZE (mode) + bus_width - 1) / bus_width);
+    *cost += extra_cost->ldst.load;
+  } else {
+    *cost += COSTS_N_INSNS (1);
+  }
+
+  return true;
+}
+/* Convert fron bytes to ints.  */
+#define ARM_NUM_INTS(X) (((X) + UNITS_PER_WORD - 1) / UNITS_PER_WORD)
+
+
 /* RTX costs.  Make an estimate of the cost of executing the operation
    X, which is contained with an operation with code OUTER_CODE.
    SPEED_P indicates whether the cost desired is the performance cost,
@@ -9152,30 +9193,7 @@ arm_rtx_costs_internal (rtx x, enum rtx_code code, enum rtx_code outer_code,
       return false;
 
     case MEM:
-      /* A memory access costs 1 insn if the mode is small, or the address is
-	 a single register, otherwise it costs one insn per word.  */
-      if (REG_P (XEXP (x, 0)))
-	*cost = COSTS_N_INSNS (1);
-      else if (flag_pic
-	       && GET_CODE (XEXP (x, 0)) == PLUS
-	       && will_be_in_index_register (XEXP (XEXP (x, 0), 1)))
-	/* This will be split into two instructions.
-	   See arm.md:calculate_pic_address.  */
-	*cost = COSTS_N_INSNS (2);
-      else
-	*cost = COSTS_N_INSNS (ARM_NUM_REGS (mode));
-
-      /* For speed optimizations, add the costs of the address and
-	 accessing memory.  */
-      if (speed_p)
-#ifdef NOT_YET
-	*cost += (extra_cost->ldst.load
-		  + arm_address_cost (XEXP (x, 0), mode,
-				      ADDR_SPACE_GENERIC, speed_p));
-#else
-        *cost += extra_cost->ldst.load;
-#endif
-      return true;
+      return arm_mem_costs (x, extra_cost, cost, speed_p);
 
     case PARALLEL:
     {
-- 
2.7.4

^ permalink raw reply	[flat|nested] 10+ messages in thread

* [PATCH 0/2] [ARM] PR61551 addressing mode costs
@ 2017-02-21 16:54 charles.baylis
  2017-02-21 16:54 ` [PATCH 1/2] [ARM] Refactor costs calculation for MEM charles.baylis
                   ` (2 more replies)
  0 siblings, 3 replies; 10+ messages in thread
From: charles.baylis @ 2017-02-21 16:54 UTC (permalink / raw)
  To: Ramana.Radhakrishnan, kyrylo.tkachov; +Cc: rearnsha, gcc-patches

From: Charles Baylis <charles.baylis@linaro.org>

Hi Ramana,

This patch set continues previous work on fixing the cost calculations for MEMs
which use different addressing modes. It implements the approach we discussed
at Linaro Connect BKK16.

I have included some notes on the patch set as follows:


Background:

The motivating problem is that this function:
  char *f(char *p, int8x8x4_t v, int r) { vst4_s8(p, v); p+=32; return p; }
compiles to:
        mov     r3, r0
        adds    r0, r0, #32
        vst4.8  {d0-d3}, [r3]
        bx      lr
but we would like to get:
        vst4.8  {d0-d3}, [r0]!
        bx      lr

Although the ARM back end contains patterns for the write-back forms of these
instructions, they are not currently generated. The reason for this is that the
auto-inc-dec phase does not perform this optimisation because arm_rtx_costs
incorrectly calculates the cost of "vst4.8  {d0-d3}, [r0]!" as much higher than
"vst4.8  {d0-d3}, [r3]". For that reason, it considers the POST_INC form to be
worse than the initial sequence of vst4/add and does not perform the
transformation.

In fact, GCC6 has regressions compared to GCC5 in this area, and no longer
does post-indexed addressing for int64_t or 64 bit vector types.


Solution:

Change cost calculation for MEMs so that the cost of the memory access
is computed separately from the cost of the addressing mode. A new
table-driven mechanism is introduced for the costs of the addressing modes.

The first patch in the series implements the calculation of the cost of
the memory access.

The second patch adds the table-driven model of the extra cost of the
selected addressing mode. I don't have access to a lot of CPU pipeline
information, so most CPUs use the generic cost table, with the exception of
Cortex-A57.


Testing:

I did "make check" on arm-linux-gnueabihf with qemu. This patch fixes one test
failure in lp1243022.c.


Benchmarking:

On Cortex-A15, SPEC2006 and a popular suite of embedded benchmarks perform the
same as before this patch is applied.  This is expected, the expected gain is
in code quality for hand-written NEON intrinsics code.



Charles Baylis (2):
  [ARM] Refactor costs calculation for MEM.
  [ARM] Add table of costs for AAarch32 addressing modes.

 gcc/config/arm/aarch-common-protos.h |  16 +++++
 gcc/config/arm/aarch-cost-tables.h   |  54 ++++++++++++++--
 gcc/config/arm/arm.c                 | 120 ++++++++++++++++++++++++++---------
 3 files changed, 154 insertions(+), 36 deletions(-)

-- 
2.7.4

^ permalink raw reply	[flat|nested] 10+ messages in thread

* [PATCH 2/2] [ARM] Add table of costs for AAarch32 addressing modes.
  2017-02-21 16:54 [PATCH 0/2] [ARM] PR61551 addressing mode costs charles.baylis
  2017-02-21 16:54 ` [PATCH 1/2] [ARM] Refactor costs calculation for MEM charles.baylis
@ 2017-02-21 16:58 ` charles.baylis
  2017-06-09 14:13   ` Richard Earnshaw (lists)
  2017-05-12 15:53 ` [PATCH 0/2] [ARM] PR61551 addressing mode costs Charles Baylis
  2 siblings, 1 reply; 10+ messages in thread
From: charles.baylis @ 2017-02-21 16:58 UTC (permalink / raw)
  To: Ramana.Radhakrishnan, kyrylo.tkachov; +Cc: rearnsha, gcc-patches

From: Charles Baylis <charles.baylis@linaro.org>

This patch adds support for modelling the varying costs of
different addressing modes. The generic cost table treats
all addressing modes as having equal cost. The cost table
for Cortex-A57 is derived from http://infocenter.arm.com/help/topic/com.arm.doc.uan0015b/Cortex_A57_Software_Optimization_Guide_external.pdf
and treats addressing modes with write-back as having a
cost equal to one additional instruction.

gcc/ChangeLog:

<date>  Charles Baylis  <charles.baylis@linaro.org>

	* config/arm/aarch-common-protos.h (enum arm_addr_mode_op): New.
	(struct addr_mode_cost_table): New.
	(struct cpu_cost_table): Add pointer to an addr_mode_cost_table.
	* config/arm/aarch-cost-tables.h: (generic_addr_mode_costs): New.
	(generic_extra_costs) Initialise aarch32_addr_mode.
	(cortexa53_extra_costs) Likewise.
	(addr_mode_costs_cortexa57) New.
	(cortexa57_extra_costs) Initialise aarch32_addr_mode.
	(exynosm1_extra_costs) Likewise.
	(xgene1_extra_costs) Likewise.
	(qdf24xx_extra_costs) Likewise.
	* config/arm/arm.c (cortexa9_extra_costs) Initialise aarch32_addr_mode.
	(cortexa9_extra_costs) Likewise.
	(cortexa8_extra_costs) Likewise.
	(cortexa5_extra_costs) Likewise.
	(cortexa7_extra_costs) Likewise.
	(cortexa12_extra_costs) Likewise.
	(cortexv7m_extra_costs) Likewise.
	(arm_mem_costs): Use table lookup to calculate cost of addressing
	mode.

Change-Id: If71bd7c4f4bb876c5ed82dc28791130efb8bf89e
---
 gcc/config/arm/aarch-common-protos.h | 16 +++++++++++
 gcc/config/arm/aarch-cost-tables.h   | 54 ++++++++++++++++++++++++++++++----
 gcc/config/arm/arm.c                 | 56 ++++++++++++++++++++++++++++++------
 3 files changed, 113 insertions(+), 13 deletions(-)

diff --git a/gcc/config/arm/aarch-common-protos.h b/gcc/config/arm/aarch-common-protos.h
index 7c2bb4c..f6fcc94 100644
--- a/gcc/config/arm/aarch-common-protos.h
+++ b/gcc/config/arm/aarch-common-protos.h
@@ -130,6 +130,21 @@ struct vector_cost_table
   const int alu;
 };
 
+enum arm_addr_mode_op
+{
+   AMO_DEFAULT,
+   AMO_NO_WB,
+   AMO_WB,
+   AMO_MAX /* for array size */
+};
+
+struct addr_mode_cost_table
+{
+   const int integer[AMO_MAX];
+   const int fp[AMO_MAX];
+   const int vector[AMO_MAX];
+};
+
 struct cpu_cost_table
 {
   const struct alu_cost_table alu;
@@ -137,6 +152,7 @@ struct cpu_cost_table
   const struct mem_cost_table ldst;
   const struct fp_cost_table fp[2]; /* SFmode and DFmode.  */
   const struct vector_cost_table vect;
+  const struct addr_mode_cost_table *aarch32_addr_mode;
 };
 
 
diff --git a/gcc/config/arm/aarch-cost-tables.h b/gcc/config/arm/aarch-cost-tables.h
index 68f84b0..e3d257f 100644
--- a/gcc/config/arm/aarch-cost-tables.h
+++ b/gcc/config/arm/aarch-cost-tables.h
@@ -22,6 +22,16 @@
 #ifndef GCC_AARCH_COST_TABLES_H
 #define GCC_AARCH_COST_TABLES_H
 
+const struct addr_mode_cost_table generic_addr_mode_costs =
+{
+  /* int */
+  { 0, 0, 0 },
+  /* float */
+  { 0, 0, 0 },
+  /* vector */
+  { 0, 0, 0 }
+};
+
 const struct cpu_cost_table generic_extra_costs =
 {
   /* ALU */
@@ -122,7 +132,9 @@ const struct cpu_cost_table generic_extra_costs =
   /* Vector */
   {
     COSTS_N_INSNS (1)	/* alu.  */
-  }
+  },
+  /* Addressing mode */
+  &generic_addr_mode_costs
 };
 
 const struct cpu_cost_table cortexa53_extra_costs =
@@ -225,6 +237,30 @@ const struct cpu_cost_table cortexa53_extra_costs =
   /* Vector */
   {
     COSTS_N_INSNS (1)	/* alu.  */
+  },
+  /* Addressing mode */
+  &generic_addr_mode_costs
+};
+
+const struct addr_mode_cost_table addr_mode_costs_cortexa57 =
+{
+  /* int */
+  {
+    0,
+    0,
+    COSTS_N_INSNS (1),
+  },
+  /* float */
+  {
+    0,
+    0,
+    COSTS_N_INSNS (1),
+  },
+  /* vector */
+  {
+    0,
+    0,
+    COSTS_N_INSNS (1),
   }
 };
 
@@ -328,7 +364,9 @@ const struct cpu_cost_table cortexa57_extra_costs =
   /* Vector */
   {
     COSTS_N_INSNS (1)  /* alu.  */
-  }
+  },
+  /* Addressing mode */
+  &addr_mode_costs_cortexa57
 };
 
 const struct cpu_cost_table exynosm1_extra_costs =
@@ -431,7 +469,9 @@ const struct cpu_cost_table exynosm1_extra_costs =
   /* Vector */
   {
     COSTS_N_INSNS (0)  /* alu.  */
-  }
+  },
+  /* Addressing mode */
+  &generic_addr_mode_costs
 };
 
 const struct cpu_cost_table xgene1_extra_costs =
@@ -534,7 +574,9 @@ const struct cpu_cost_table xgene1_extra_costs =
   /* Vector */
   {
     COSTS_N_INSNS (2)  /* alu.  */
-  }
+  },
+  /* Addressing mode */
+  &generic_addr_mode_costs
 };
 
 const struct cpu_cost_table qdf24xx_extra_costs =
@@ -637,7 +679,9 @@ const struct cpu_cost_table qdf24xx_extra_costs =
   /* Vector */
   {
     COSTS_N_INSNS (1)  /* alu.  */
-  }
+  },
+  /* Addressing mode */
+  &generic_addr_mode_costs
 };
 
 #endif /* GCC_AARCH_COST_TABLES_H */
diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c
index 7f002f1..fe4cd73 100644
--- a/gcc/config/arm/arm.c
+++ b/gcc/config/arm/arm.c
@@ -1099,7 +1099,9 @@ const struct cpu_cost_table cortexa9_extra_costs =
   /* Vector */
   {
     COSTS_N_INSNS (1)	/* alu.  */
-  }
+  },
+  /* Addressing mode */
+  &generic_addr_mode_costs
 };
 
 const struct cpu_cost_table cortexa8_extra_costs =
@@ -1202,7 +1204,9 @@ const struct cpu_cost_table cortexa8_extra_costs =
   /* Vector */
   {
     COSTS_N_INSNS (1)	/* alu.  */
-  }
+  },
+  /* Addressing mode */
+  &generic_addr_mode_costs
 };
 
 const struct cpu_cost_table cortexa5_extra_costs =
@@ -1306,7 +1310,9 @@ const struct cpu_cost_table cortexa5_extra_costs =
   /* Vector */
   {
     COSTS_N_INSNS (1)	/* alu.  */
-  }
+  },
+  /* Addressing mode */
+  &generic_addr_mode_costs
 };
 
 
@@ -1411,7 +1417,9 @@ const struct cpu_cost_table cortexa7_extra_costs =
   /* Vector */
   {
     COSTS_N_INSNS (1)	/* alu.  */
-  }
+  },
+  /* Addressing mode */
+  &generic_addr_mode_costs
 };
 
 const struct cpu_cost_table cortexa12_extra_costs =
@@ -1514,7 +1522,9 @@ const struct cpu_cost_table cortexa12_extra_costs =
   /* Vector */
   {
     COSTS_N_INSNS (1)	/* alu.  */
-  }
+  },
+  /* Addressing mode */
+  &generic_addr_mode_costs
 };
 
 const struct cpu_cost_table cortexa15_extra_costs =
@@ -1617,7 +1627,9 @@ const struct cpu_cost_table cortexa15_extra_costs =
   /* Vector */
   {
     COSTS_N_INSNS (1)	/* alu.  */
-  }
+  },
+  /* Addressing mode */
+  &generic_addr_mode_costs
 };
 
 const struct cpu_cost_table v7m_extra_costs =
@@ -1720,7 +1732,9 @@ const struct cpu_cost_table v7m_extra_costs =
   /* Vector */
   {
     COSTS_N_INSNS (1)	/* alu.  */
-  }
+  },
+  /* Addressing mode */
+  &generic_addr_mode_costs
 };
 
 const struct tune_params arm_slowmul_tune =
@@ -9093,7 +9107,33 @@ arm_mem_costs (rtx x, const struct cpu_cost_table *extra_cost,
   /* Calculate cost of the addressing mode.  */
   if (speed_p)
   {
-    /* TODO: Add table-driven costs for addressing modes.  */
+    arm_addr_mode_op op_type;
+    switch (GET_CODE(XEXP (x, 0)))
+    {
+    case REG:
+    default:
+      op_type = AMO_DEFAULT;
+      break;
+    case PLUS:
+    case MINUS:
+      op_type = AMO_NO_WB;
+      break;
+    case PRE_INC:
+    case PRE_DEC:
+    case POST_INC:
+    case POST_DEC:
+    case PRE_MODIFY:
+    case POST_MODIFY:
+      op_type = AMO_WB;
+      break;
+    }
+    if (VECTOR_MODE_P (mode)) {
+        *cost += extra_cost->aarch32_addr_mode->vector[op_type];
+    } else if (FLOAT_MODE_P (mode)) {
+        *cost += extra_cost->aarch32_addr_mode->fp[op_type];
+    } else {
+        *cost += extra_cost->aarch32_addr_mode->integer[op_type];
+    }
   }
 
   /* cost of memory access */
-- 
2.7.4

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH 1/2] [ARM] Refactor costs calculation for MEM.
  2017-02-21 16:54 ` [PATCH 1/2] [ARM] Refactor costs calculation for MEM charles.baylis
@ 2017-02-23  7:46   ` Bernhard Reutner-Fischer
  2017-06-09 13:59   ` Richard Earnshaw (lists)
  1 sibling, 0 replies; 10+ messages in thread
From: Bernhard Reutner-Fischer @ 2017-02-23  7:46 UTC (permalink / raw)
  To: gcc-patches, charles.baylis, Ramana.Radhakrishnan, kyrylo.tkachov
  Cc: rearnsha

On 21 February 2017 17:54:23 CET, charles.baylis@linaro.org wrote:
>From: Charles Baylis <charles.baylis@linaro.org>
>

>+/* Convert fron bytes to ints.  */

s/fron/from/

>+#define ARM_NUM_INTS(X) (((X) + UNITS_PER_WORD - 1) / UNITS_PER_WORD)
>+
>+
> /* RTX costs.  Make an estimate of the cost of executing the operation
>    X, which is contained with an operation with code OUTER_CODE.
>    SPEED_P indicates whether the cost desired is the performance cost,

s/contained with /contained within /
while you are at it.

thanks,

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH 0/2] [ARM] PR61551 addressing mode costs
  2017-02-21 16:54 [PATCH 0/2] [ARM] PR61551 addressing mode costs charles.baylis
  2017-02-21 16:54 ` [PATCH 1/2] [ARM] Refactor costs calculation for MEM charles.baylis
  2017-02-21 16:58 ` [PATCH 2/2] [ARM] Add table of costs for AAarch32 addressing modes charles.baylis
@ 2017-05-12 15:53 ` Charles Baylis
  2 siblings, 0 replies; 10+ messages in thread
From: Charles Baylis @ 2017-05-12 15:53 UTC (permalink / raw)
  To: Ramana Radhakrishnan, Kyrylo Tkachov; +Cc: Richard Earnshaw, GCC Patches

Ping.

Original thread: https://gcc.gnu.org/ml/gcc-patches/2017-02/msg01314.html

(I will fix the typos which Bernhard found before submitting)

On 21 February 2017 at 16:54,  <charles.baylis@linaro.org> wrote:
> From: Charles Baylis <charles.baylis@linaro.org>
>
> Hi Ramana,
>
> This patch set continues previous work on fixing the cost calculations for MEMs
> which use different addressing modes. It implements the approach we discussed
> at Linaro Connect BKK16.
>
> I have included some notes on the patch set as follows:
>
>
> Background:
>
> The motivating problem is that this function:
>   char *f(char *p, int8x8x4_t v, int r) { vst4_s8(p, v); p+=32; return p; }
> compiles to:
>         mov     r3, r0
>         adds    r0, r0, #32
>         vst4.8  {d0-d3}, [r3]
>         bx      lr
> but we would like to get:
>         vst4.8  {d0-d3}, [r0]!
>         bx      lr
>
> Although the ARM back end contains patterns for the write-back forms of these
> instructions, they are not currently generated. The reason for this is that the
> auto-inc-dec phase does not perform this optimisation because arm_rtx_costs
> incorrectly calculates the cost of "vst4.8  {d0-d3}, [r0]!" as much higher than
> "vst4.8  {d0-d3}, [r3]". For that reason, it considers the POST_INC form to be
> worse than the initial sequence of vst4/add and does not perform the
> transformation.
>
> In fact, GCC6 has regressions compared to GCC5 in this area, and no longer
> does post-indexed addressing for int64_t or 64 bit vector types.
>
>
> Solution:
>
> Change cost calculation for MEMs so that the cost of the memory access
> is computed separately from the cost of the addressing mode. A new
> table-driven mechanism is introduced for the costs of the addressing modes.
>
> The first patch in the series implements the calculation of the cost of
> the memory access.
>
> The second patch adds the table-driven model of the extra cost of the
> selected addressing mode. I don't have access to a lot of CPU pipeline
> information, so most CPUs use the generic cost table, with the exception of
> Cortex-A57.
>
>
> Testing:
>
> I did "make check" on arm-linux-gnueabihf with qemu. This patch fixes one test
> failure in lp1243022.c.
>
>
> Benchmarking:
>
> On Cortex-A15, SPEC2006 and a popular suite of embedded benchmarks perform the
> same as before this patch is applied.  This is expected, the expected gain is
> in code quality for hand-written NEON intrinsics code.
>
>
>
> Charles Baylis (2):
>   [ARM] Refactor costs calculation for MEM.
>   [ARM] Add table of costs for AAarch32 addressing modes.
>
>  gcc/config/arm/aarch-common-protos.h |  16 +++++
>  gcc/config/arm/aarch-cost-tables.h   |  54 ++++++++++++++--
>  gcc/config/arm/arm.c                 | 120 ++++++++++++++++++++++++++---------
>  3 files changed, 154 insertions(+), 36 deletions(-)
>
> --
> 2.7.4
>

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH 1/2] [ARM] Refactor costs calculation for MEM.
  2017-02-21 16:54 ` [PATCH 1/2] [ARM] Refactor costs calculation for MEM charles.baylis
  2017-02-23  7:46   ` Bernhard Reutner-Fischer
@ 2017-06-09 13:59   ` Richard Earnshaw (lists)
  2017-08-25 18:16     ` Charles Baylis
  1 sibling, 1 reply; 10+ messages in thread
From: Richard Earnshaw (lists) @ 2017-06-09 13:59 UTC (permalink / raw)
  To: charles.baylis, Ramana.Radhakrishnan, kyrylo.tkachov
  Cc: rearnsha, gcc-patches

On 21/02/17 16:54, charles.baylis@linaro.org wrote:
> From: Charles Baylis <charles.baylis@linaro.org>
> 
> This patch moves the calculation of costs for MEM into a
> separate function, and reforms the calculation into two
> parts. Firstly any additional cost of the addressing mode
> is calculated, and then the cost of the memory access itself
> is added.
> 
> In this patch, the calculation of the cost of the addressing
> mode is left as a placeholder, to be added in a subsequent
> patch.
> 
> gcc/ChangeLog:
> 
> <date>  Charles Baylis  <charles.baylis@linaro.org>
> 
>         * config/arm/arm.c (arm_mem_costs): New function.
>         (arm_rtx_costs_internal): Use arm_mem_costs.

I like the idea of this patch, but it needs further work...

Comments inline.

R.

> 
> Change-Id: I99e93406ea39ee31f71c7bf428ad3e127b7a618e
> ---
>  gcc/config/arm/arm.c | 66 +++++++++++++++++++++++++++++++++-------------------
>  1 file changed, 42 insertions(+), 24 deletions(-)
> 
> diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c
> index 6cae178..7f002f1 100644
> --- a/gcc/config/arm/arm.c
> +++ b/gcc/config/arm/arm.c
> @@ -9072,6 +9072,47 @@ arm_unspec_cost (rtx x, enum rtx_code /* outer_code */, bool speed_p, int *cost)
>  	  }								\
>  	while (0);
>  
> +/* Helper function for arm_rtx_costs_internal. Calculates the cost of a MEM,
> +   considering the costs of the addressing mode and memory access
> +   separately.  */
> +static bool
> +arm_mem_costs (rtx x, const struct cpu_cost_table *extra_cost,
> +	       int *cost, bool speed_p)
> +{
> +  machine_mode mode = GET_MODE (x);
> +  if (flag_pic
> +      && GET_CODE (XEXP (x, 0)) == PLUS
> +      && will_be_in_index_register (XEXP (XEXP (x, 0), 1)))
> +    /* This will be split into two instructions.  Add the cost of the
> +       additional instruction here.  The cost of the memory access is computed
> +       below.  See arm.md:calculate_pic_address.  */
> +    *cost = COSTS_N_INSNS (1);
> +  else
> +    *cost = 0;
> +
> +  /* Calculate cost of the addressing mode.  */
> +  if (speed_p)
> +  {

This patch needs to be reformatted in the GNU style (indentation of
braces, braces and else clauses on separate lines etc).

> +    /* TODO: Add table-driven costs for addressing modes.  */

You need to sort out the comment.  What's missing here?

> +  }
> +
> +  /* cost of memory access */
> +  if (speed_p)
> +  {
> +    /* data transfer is transfer size divided by bus width.  */
> +    int bus_width = arm_arch7 ? 8 : 4;

Basing bus width on the architecture is a bit too simplistic.  Instead
this should be a parameter that comes from the CPU cost tables, based on
the current tune target.

> +    *cost += COSTS_N_INSNS((GET_MODE_SIZE (mode) + bus_width - 1) / bus_width);

Use CEIL (from system.h)

> +    *cost += extra_cost->ldst.load;
> +  } else {
> +    *cost += COSTS_N_INSNS (1);
> +  }
> +
> +  return true;
> +}
> +/* Convert fron bytes to ints.  */
> +#define ARM_NUM_INTS(X) (((X) + UNITS_PER_WORD - 1) / UNITS_PER_WORD)
> +
> +
>  /* RTX costs.  Make an estimate of the cost of executing the operation
>     X, which is contained with an operation with code OUTER_CODE.
>     SPEED_P indicates whether the cost desired is the performance cost,
> @@ -9152,30 +9193,7 @@ arm_rtx_costs_internal (rtx x, enum rtx_code code, enum rtx_code outer_code,
>        return false;
>  
>      case MEM:
> -      /* A memory access costs 1 insn if the mode is small, or the address is
> -	 a single register, otherwise it costs one insn per word.  */
> -      if (REG_P (XEXP (x, 0)))
> -	*cost = COSTS_N_INSNS (1);
> -      else if (flag_pic
> -	       && GET_CODE (XEXP (x, 0)) == PLUS
> -	       && will_be_in_index_register (XEXP (XEXP (x, 0), 1)))
> -	/* This will be split into two instructions.
> -	   See arm.md:calculate_pic_address.  */
> -	*cost = COSTS_N_INSNS (2);
> -      else
> -	*cost = COSTS_N_INSNS (ARM_NUM_REGS (mode));
> -
> -      /* For speed optimizations, add the costs of the address and
> -	 accessing memory.  */
> -      if (speed_p)
> -#ifdef NOT_YET
> -	*cost += (extra_cost->ldst.load
> -		  + arm_address_cost (XEXP (x, 0), mode,
> -				      ADDR_SPACE_GENERIC, speed_p));
> -#else
> -        *cost += extra_cost->ldst.load;
> -#endif
> -      return true;
> +      return arm_mem_costs (x, extra_cost, cost, speed_p);
>  
>      case PARALLEL:
>      {
> 

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH 2/2] [ARM] Add table of costs for AAarch32 addressing modes.
  2017-02-21 16:58 ` [PATCH 2/2] [ARM] Add table of costs for AAarch32 addressing modes charles.baylis
@ 2017-06-09 14:13   ` Richard Earnshaw (lists)
  2017-08-25 19:05     ` Charles Baylis
  0 siblings, 1 reply; 10+ messages in thread
From: Richard Earnshaw (lists) @ 2017-06-09 14:13 UTC (permalink / raw)
  To: charles.baylis, Ramana.Radhakrishnan, kyrylo.tkachov
  Cc: rearnsha, gcc-patches

On 21/02/17 16:54, charles.baylis@linaro.org wrote:
> From: Charles Baylis <charles.baylis@linaro.org>
> 
> This patch adds support for modelling the varying costs of
> different addressing modes. The generic cost table treats
> all addressing modes as having equal cost. The cost table
> for Cortex-A57 is derived from http://infocenter.arm.com/help/topic/com.arm.doc.uan0015b/Cortex_A57_Software_Optimization_Guide_external.pdf
> and treats addressing modes with write-back as having a
> cost equal to one additional instruction.
> 
> gcc/ChangeLog:
> 
> <date>  Charles Baylis  <charles.baylis@linaro.org>
> 
> 	* config/arm/aarch-common-protos.h (enum arm_addr_mode_op): New.
> 	(struct addr_mode_cost_table): New.
> 	(struct cpu_cost_table): Add pointer to an addr_mode_cost_table.
> 	* config/arm/aarch-cost-tables.h: (generic_addr_mode_costs): New.
> 	(generic_extra_costs) Initialise aarch32_addr_mode.
> 	(cortexa53_extra_costs) Likewise.
> 	(addr_mode_costs_cortexa57) New.
> 	(cortexa57_extra_costs) Initialise aarch32_addr_mode.
> 	(exynosm1_extra_costs) Likewise.
> 	(xgene1_extra_costs) Likewise.
> 	(qdf24xx_extra_costs) Likewise.
> 	* config/arm/arm.c (cortexa9_extra_costs) Initialise aarch32_addr_mode.
> 	(cortexa9_extra_costs) Likewise.
> 	(cortexa8_extra_costs) Likewise.
> 	(cortexa5_extra_costs) Likewise.
> 	(cortexa7_extra_costs) Likewise.
> 	(cortexa12_extra_costs) Likewise.
> 	(cortexv7m_extra_costs) Likewise.
> 	(arm_mem_costs): Use table lookup to calculate cost of addressing
> 	mode.
> 
> Change-Id: If71bd7c4f4bb876c5ed82dc28791130efb8bf89e
> ---
>  gcc/config/arm/aarch-common-protos.h | 16 +++++++++++
>  gcc/config/arm/aarch-cost-tables.h   | 54 ++++++++++++++++++++++++++++++----
>  gcc/config/arm/arm.c                 | 56 ++++++++++++++++++++++++++++++------
>  3 files changed, 113 insertions(+), 13 deletions(-)
> 
> diff --git a/gcc/config/arm/aarch-common-protos.h b/gcc/config/arm/aarch-common-protos.h
> index 7c2bb4c..f6fcc94 100644
> --- a/gcc/config/arm/aarch-common-protos.h
> +++ b/gcc/config/arm/aarch-common-protos.h
> @@ -130,6 +130,21 @@ struct vector_cost_table
>    const int alu;
>  };
>  
> +enum arm_addr_mode_op
> +{
> +   AMO_DEFAULT,
> +   AMO_NO_WB,
> +   AMO_WB,
> +   AMO_MAX /* for array size */

Comment style: Capital letter at start, full stop and two spaces at the end.

The enum and structure below should have some comments explaining what
they are for.

> +};
> +
> +struct addr_mode_cost_table
> +{
> +   const int integer[AMO_MAX];
> +   const int fp[AMO_MAX];
> +   const int vector[AMO_MAX];
> +};
> +
>  struct cpu_cost_table
>  {
>    const struct alu_cost_table alu;
> @@ -137,6 +152,7 @@ struct cpu_cost_table
>    const struct mem_cost_table ldst;
>    const struct fp_cost_table fp[2]; /* SFmode and DFmode.  */
>    const struct vector_cost_table vect;
> +  const struct addr_mode_cost_table *aarch32_addr_mode;
>  };
>  
>  
> diff --git a/gcc/config/arm/aarch-cost-tables.h b/gcc/config/arm/aarch-cost-tables.h
> index 68f84b0..e3d257f 100644
> --- a/gcc/config/arm/aarch-cost-tables.h
> +++ b/gcc/config/arm/aarch-cost-tables.h
> @@ -22,6 +22,16 @@
>  #ifndef GCC_AARCH_COST_TABLES_H
>  #define GCC_AARCH_COST_TABLES_H
>  
> +const struct addr_mode_cost_table generic_addr_mode_costs =
> +{
> +  /* int */
> +  { 0, 0, 0 },

Elements need to be commented, otherwise minor changes to the contents
of the arrays is hard to update and maintain.

> +  /* float */
> +  { 0, 0, 0 },
> +  /* vector */
> +  { 0, 0, 0 }
> +};
> +
>  const struct cpu_cost_table generic_extra_costs =
>  {
>    /* ALU */
> @@ -122,7 +132,9 @@ const struct cpu_cost_table generic_extra_costs =
>    /* Vector */
>    {
>      COSTS_N_INSNS (1)	/* alu.  */
> -  }
> +  },
> +  /* Addressing mode */

Comment style.
> +  &generic_addr_mode_costs
>  };
>  
>  const struct cpu_cost_table cortexa53_extra_costs =
> @@ -225,6 +237,30 @@ const struct cpu_cost_table cortexa53_extra_costs =
>    /* Vector */
>    {
>      COSTS_N_INSNS (1)	/* alu.  */
> +  },
> +  /* Addressing mode */
> +  &generic_addr_mode_costs
> +};
> +
> +const struct addr_mode_cost_table addr_mode_costs_cortexa57 =
> +{
> +  /* int */
> +  {
> +    0,
> +    0,
> +    COSTS_N_INSNS (1),
> +  },
> +  /* float */
> +  {
> +    0,
> +    0,
> +    COSTS_N_INSNS (1),
> +  },
> +  /* vector */
> +  {
> +    0,
> +    0,
> +    COSTS_N_INSNS (1),
>    }
>  };
>  
> @@ -328,7 +364,9 @@ const struct cpu_cost_table cortexa57_extra_costs =
>    /* Vector */
>    {
>      COSTS_N_INSNS (1)  /* alu.  */
> -  }
> +  },
> +  /* Addressing mode */
> +  &addr_mode_costs_cortexa57
>  };
>  
>  const struct cpu_cost_table exynosm1_extra_costs =
> @@ -431,7 +469,9 @@ const struct cpu_cost_table exynosm1_extra_costs =
>    /* Vector */
>    {
>      COSTS_N_INSNS (0)  /* alu.  */
> -  }
> +  },
> +  /* Addressing mode */
> +  &generic_addr_mode_costs
>  };
>  
>  const struct cpu_cost_table xgene1_extra_costs =
> @@ -534,7 +574,9 @@ const struct cpu_cost_table xgene1_extra_costs =
>    /* Vector */
>    {
>      COSTS_N_INSNS (2)  /* alu.  */
> -  }
> +  },
> +  /* Addressing mode */
> +  &generic_addr_mode_costs
>  };
>  
>  const struct cpu_cost_table qdf24xx_extra_costs =
> @@ -637,7 +679,9 @@ const struct cpu_cost_table qdf24xx_extra_costs =
>    /* Vector */
>    {
>      COSTS_N_INSNS (1)  /* alu.  */
> -  }
> +  },
> +  /* Addressing mode */
> +  &generic_addr_mode_costs
>  };
>  
>  #endif /* GCC_AARCH_COST_TABLES_H */
> diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c
> index 7f002f1..fe4cd73 100644
> --- a/gcc/config/arm/arm.c
> +++ b/gcc/config/arm/arm.c
> @@ -1099,7 +1099,9 @@ const struct cpu_cost_table cortexa9_extra_costs =
>    /* Vector */
>    {
>      COSTS_N_INSNS (1)	/* alu.  */
> -  }
> +  },
> +  /* Addressing mode */
> +  &generic_addr_mode_costs
>  };
>  
>  const struct cpu_cost_table cortexa8_extra_costs =
> @@ -1202,7 +1204,9 @@ const struct cpu_cost_table cortexa8_extra_costs =
>    /* Vector */
>    {
>      COSTS_N_INSNS (1)	/* alu.  */
> -  }
> +  },
> +  /* Addressing mode */
> +  &generic_addr_mode_costs
>  };
>  
>  const struct cpu_cost_table cortexa5_extra_costs =
> @@ -1306,7 +1310,9 @@ const struct cpu_cost_table cortexa5_extra_costs =
>    /* Vector */
>    {
>      COSTS_N_INSNS (1)	/* alu.  */
> -  }
> +  },
> +  /* Addressing mode */
> +  &generic_addr_mode_costs
>  };
>  
>  
> @@ -1411,7 +1417,9 @@ const struct cpu_cost_table cortexa7_extra_costs =
>    /* Vector */
>    {
>      COSTS_N_INSNS (1)	/* alu.  */
> -  }
> +  },
> +  /* Addressing mode */
> +  &generic_addr_mode_costs
>  };
>  
>  const struct cpu_cost_table cortexa12_extra_costs =
> @@ -1514,7 +1522,9 @@ const struct cpu_cost_table cortexa12_extra_costs =
>    /* Vector */
>    {
>      COSTS_N_INSNS (1)	/* alu.  */
> -  }
> +  },
> +  /* Addressing mode */
> +  &generic_addr_mode_costs
>  };
>  
>  const struct cpu_cost_table cortexa15_extra_costs =
> @@ -1617,7 +1627,9 @@ const struct cpu_cost_table cortexa15_extra_costs =
>    /* Vector */
>    {
>      COSTS_N_INSNS (1)	/* alu.  */
> -  }
> +  },
> +  /* Addressing mode */
> +  &generic_addr_mode_costs
>  };
>  
>  const struct cpu_cost_table v7m_extra_costs =
> @@ -1720,7 +1732,9 @@ const struct cpu_cost_table v7m_extra_costs =
>    /* Vector */
>    {
>      COSTS_N_INSNS (1)	/* alu.  */
> -  }
> +  },
> +  /* Addressing mode */
> +  &generic_addr_mode_costs
>  };
>  
>  const struct tune_params arm_slowmul_tune =
> @@ -9093,7 +9107,33 @@ arm_mem_costs (rtx x, const struct cpu_cost_table *extra_cost,
>    /* Calculate cost of the addressing mode.  */
>    if (speed_p)
>    {
> -    /* TODO: Add table-driven costs for addressing modes.  */
> +    arm_addr_mode_op op_type;
> +    switch (GET_CODE(XEXP (x, 0)))
> +    {
> +    case REG:
> +    default:
> +      op_type = AMO_DEFAULT;

Why would REG and default share the same cost?  Presumably default is
too complex to recognize, but then it's unlikely to be cheap.

> +      break;
> +    case PLUS:
> +    case MINUS:
> +      op_type = AMO_NO_WB;

GCC doesn't support MINUS in addresses, even though the architecture
could in theory handle this.

Also, I think you need a different cost for scaled offset addressing,
and possibly even for different scaling factors and types of scaling.

> +      break;
> +    case PRE_INC:
> +    case PRE_DEC:
> +    case POST_INC:
> +    case POST_DEC:
> +    case PRE_MODIFY:
> +    case POST_MODIFY:
> +      op_type = AMO_WB;

Pre and post might also need separate entries (plus further entries for
scaling).  A post operation might happen in parallel with the data
fetch, while a pre operation must happen before the address can be sent
to the load/store pipe.

> +      break;
> +    }

blank line between the switch block and the subsequent statements.
Also, the following needs to laid out in GNU style.


> +    if (VECTOR_MODE_P (mode)) {
> +        *cost += extra_cost->aarch32_addr_mode->vector[op_type];
> +    } else if (FLOAT_MODE_P (mode)) {
> +        *cost += extra_cost->aarch32_addr_mode->fp[op_type];
> +    } else {
> +        *cost += extra_cost->aarch32_addr_mode->integer[op_type];
> +    }
>    }
>  
>    /* cost of memory access */
> 

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH 1/2] [ARM] Refactor costs calculation for MEM.
  2017-06-09 13:59   ` Richard Earnshaw (lists)
@ 2017-08-25 18:16     ` Charles Baylis
  0 siblings, 0 replies; 10+ messages in thread
From: Charles Baylis @ 2017-08-25 18:16 UTC (permalink / raw)
  To: Richard Earnshaw (lists)
  Cc: Ramana Radhakrishnan, Kyrylo Tkachov, Richard Earnshaw, GCC Patches

[-- Attachment #1: Type: text/plain, Size: 3376 bytes --]

On 9 June 2017 at 14:59, Richard Earnshaw (lists)
<Richard.Earnshaw@arm.com> wrote:
> On 21/02/17 16:54, charles.baylis@linaro.org wrote:
>> From: Charles Baylis <charles.baylis@linaro.org>
>>
>> This patch moves the calculation of costs for MEM into a
>> separate function, and reforms the calculation into two
>> parts. Firstly any additional cost of the addressing mode
>> is calculated, and then the cost of the memory access itself
>> is added.
>>
>> In this patch, the calculation of the cost of the addressing
>> mode is left as a placeholder, to be added in a subsequent
>> patch.
>>
>> gcc/ChangeLog:
>>
>> <date>  Charles Baylis  <charles.baylis@linaro.org>
>>
>>         * config/arm/arm.c (arm_mem_costs): New function.
>>         (arm_rtx_costs_internal): Use arm_mem_costs.
>
> I like the idea of this patch, but it needs further work...
>
> Comments inline.
>
> R.
>
>>
>> Change-Id: I99e93406ea39ee31f71c7bf428ad3e127b7a618e
>> ---
>>  gcc/config/arm/arm.c | 66 +++++++++++++++++++++++++++++++++-------------------
>>  1 file changed, 42 insertions(+), 24 deletions(-)
>>
>> diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c
>> index 6cae178..7f002f1 100644
>> --- a/gcc/config/arm/arm.c
>> +++ b/gcc/config/arm/arm.c
>> @@ -9072,6 +9072,47 @@ arm_unspec_cost (rtx x, enum rtx_code /* outer_code */, bool speed_p, int *cost)
>>         }                                                             \
>>       while (0);
>>
>> +/* Helper function for arm_rtx_costs_internal. Calculates the cost of a MEM,
>> +   considering the costs of the addressing mode and memory access
>> +   separately.  */
>> +static bool
>> +arm_mem_costs (rtx x, const struct cpu_cost_table *extra_cost,
>> +            int *cost, bool speed_p)
>> +{
>> +  machine_mode mode = GET_MODE (x);
>> +  if (flag_pic
>> +      && GET_CODE (XEXP (x, 0)) == PLUS
>> +      && will_be_in_index_register (XEXP (XEXP (x, 0), 1)))
>> +    /* This will be split into two instructions.  Add the cost of the
>> +       additional instruction here.  The cost of the memory access is computed
>> +       below.  See arm.md:calculate_pic_address.  */
>> +    *cost = COSTS_N_INSNS (1);
>> +  else
>> +    *cost = 0;
>> +
>> +  /* Calculate cost of the addressing mode.  */
>> +  if (speed_p)
>> +  {
>
> This patch needs to be reformatted in the GNU style (indentation of
> braces, braces and else clauses on separate lines etc).

Done.

>> +    /* TODO: Add table-driven costs for addressing modes.  */
>
> You need to sort out the comment.  What's missing here?

What's missing is patch 2... I've updated the comment for clarity.

>> +  }
>> +
>> +  /* cost of memory access */
>> +  if (speed_p)
>> +  {
>> +    /* data transfer is transfer size divided by bus width.  */
>> +    int bus_width = arm_arch7 ? 8 : 4;
>
> Basing bus width on the architecture is a bit too simplistic.  Instead
> this should be a parameter that comes from the CPU cost tables, based on
> the current tune target.

This was actually Ramana's suggestion, so I've left it as-is in this
patch. If necessary, I think it's better to move this to a table in a
separate patch, as I'll need to guess the correct bus width for a
number of CPUs and will probably get some wrong.

>> +    *cost += COSTS_N_INSNS((GET_MODE_SIZE (mode) + bus_width - 1) / bus_width);
>
> Use CEIL (from system.h)

Done.

Updated patch attached.

[-- Attachment #2: 0001-ARM-Refactor-costs-calculation-for-MEM.patch --]
[-- Type: text/x-patch, Size: 3945 bytes --]

From 18629835ba12fdfa693e2f9492a5fc23d95ef165 Mon Sep 17 00:00:00 2001
From: Charles Baylis <charles.baylis@linaro.org>
Date: Wed, 8 Feb 2017 16:52:10 +0000
Subject: [PATCH 1/3] [ARM] Refactor costs calculation for MEM.

This patch moves the calculation of costs for MEM into a
separate function, and reforms the calculation into two
parts. Firstly any additional cost of the addressing mode
is calculated, and then the cost of the memory access itself
is added.

In this patch, the calculation of the cost of the addressing
mode is left as a placeholder, to be added in a subsequent
patch.

gcc/ChangeLog:

<date>  Charles Baylis  <charles.baylis@linaro.org>

        * config/arm/arm.c (arm_mem_costs): New function.
        (arm_rtx_costs_internal): Use arm_mem_costs.

Change-Id: I99e93406ea39ee31f71c7bf428ad3e127b7a618e
---
 gcc/config/arm/arm.c | 67 ++++++++++++++++++++++++++++++++--------------------
 1 file changed, 42 insertions(+), 25 deletions(-)

diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c
index fa3e2fa..13cd421 100644
--- a/gcc/config/arm/arm.c
+++ b/gcc/config/arm/arm.c
@@ -9198,8 +9198,48 @@ arm_unspec_cost (rtx x, enum rtx_code /* outer_code */, bool speed_p, int *cost)
 	  }								\
 	while (0);
 
+/* Helper function for arm_rtx_costs_internal.  Calculates the cost of a MEM,
+   considering the costs of the addressing mode and memory access
+   separately.  */
+static bool
+arm_mem_costs (rtx x, const struct cpu_cost_table *extra_cost,
+	       int *cost, bool speed_p)
+{
+  machine_mode mode = GET_MODE (x);
+  if (flag_pic
+      && GET_CODE (XEXP (x, 0)) == PLUS
+      && will_be_in_index_register (XEXP (XEXP (x, 0), 1)))
+    /* This will be split into two instructions.  Add the cost of the
+       additional instruction here.  The cost of the memory access is computed
+       below.  See arm.md:calculate_pic_address.  */
+    *cost = COSTS_N_INSNS (1);
+  else
+    *cost = 0;
+
+  /* Calculate cost of the addressing mode.  */
+  if (speed_p)
+    {
+      /* TODO: Add table-driven costs for addressing modes.  (See patch 2) */
+    }
+
+  /* Calculate cost of memory access.  */
+  if (speed_p)
+    {
+      /* data transfer is transfer size divided by bus width.  */
+      int bus_width = arm_arch7 ? 8 : 4;
+      *cost += CEIL (GET_MODE_SIZE (mode), bus_width);
+      *cost += extra_cost->ldst.load;
+    }
+  else
+    {
+      *cost += COSTS_N_INSNS (1);
+    }
+
+  return true;
+}
+
 /* RTX costs.  Make an estimate of the cost of executing the operation
-   X, which is contained with an operation with code OUTER_CODE.
+   X, which is contained within an operation with code OUTER_CODE.
    SPEED_P indicates whether the cost desired is the performance cost,
    or the size cost.  The estimate is stored in COST and the return
    value is TRUE if the cost calculation is final, or FALSE if the
@@ -9278,30 +9318,7 @@ arm_rtx_costs_internal (rtx x, enum rtx_code code, enum rtx_code outer_code,
       return false;
 
     case MEM:
-      /* A memory access costs 1 insn if the mode is small, or the address is
-	 a single register, otherwise it costs one insn per word.  */
-      if (REG_P (XEXP (x, 0)))
-	*cost = COSTS_N_INSNS (1);
-      else if (flag_pic
-	       && GET_CODE (XEXP (x, 0)) == PLUS
-	       && will_be_in_index_register (XEXP (XEXP (x, 0), 1)))
-	/* This will be split into two instructions.
-	   See arm.md:calculate_pic_address.  */
-	*cost = COSTS_N_INSNS (2);
-      else
-	*cost = COSTS_N_INSNS (ARM_NUM_REGS (mode));
-
-      /* For speed optimizations, add the costs of the address and
-	 accessing memory.  */
-      if (speed_p)
-#ifdef NOT_YET
-	*cost += (extra_cost->ldst.load
-		  + arm_address_cost (XEXP (x, 0), mode,
-				      ADDR_SPACE_GENERIC, speed_p));
-#else
-        *cost += extra_cost->ldst.load;
-#endif
-      return true;
+      return arm_mem_costs (x, extra_cost, cost, speed_p);
 
     case PARALLEL:
     {
-- 
2.7.4


^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH 2/2] [ARM] Add table of costs for AAarch32 addressing modes.
  2017-06-09 14:13   ` Richard Earnshaw (lists)
@ 2017-08-25 19:05     ` Charles Baylis
  2017-08-25 19:10       ` Andrew Pinski
  0 siblings, 1 reply; 10+ messages in thread
From: Charles Baylis @ 2017-08-25 19:05 UTC (permalink / raw)
  To: Richard Earnshaw (lists)
  Cc: Ramana Radhakrishnan, Kyrylo Tkachov, Richard Earnshaw, GCC Patches

[-- Attachment #1: Type: text/plain, Size: 5067 bytes --]

On 9 June 2017 at 15:13, Richard Earnshaw (lists)
<Richard.Earnshaw@arm.com> wrote:
> On 21/02/17 16:54, charles.baylis@linaro.org wrote:
>> From: Charles Baylis <charles.baylis@linaro.org>
>>
>> This patch adds support for modelling the varying costs of
>> different addressing modes. The generic cost table treats
>> all addressing modes as having equal cost. The cost table
>> for Cortex-A57 is derived from http://infocenter.arm.com/help/topic/com.arm.doc.uan0015b/Cortex_A57_Software_Optimization_Guide_external.pdf
>> and treats addressing modes with write-back as having a
>> cost equal to one additional instruction.
>>
>> gcc/ChangeLog:
>>
>> <date>  Charles Baylis  <charles.baylis@linaro.org>
>>
>>       * config/arm/aarch-common-protos.h (enum arm_addr_mode_op): New.
>>       (struct addr_mode_cost_table): New.
>>       (struct cpu_cost_table): Add pointer to an addr_mode_cost_table.
>>       * config/arm/aarch-cost-tables.h: (generic_addr_mode_costs): New.
>>       (generic_extra_costs) Initialise aarch32_addr_mode.
>>       (cortexa53_extra_costs) Likewise.
>>       (addr_mode_costs_cortexa57) New.
>>       (cortexa57_extra_costs) Initialise aarch32_addr_mode.
>>       (exynosm1_extra_costs) Likewise.
>>       (xgene1_extra_costs) Likewise.
>>       (qdf24xx_extra_costs) Likewise.
>>       * config/arm/arm.c (cortexa9_extra_costs) Initialise aarch32_addr_mode.
>>       (cortexa9_extra_costs) Likewise.
>>       (cortexa8_extra_costs) Likewise.
>>       (cortexa5_extra_costs) Likewise.
>>       (cortexa7_extra_costs) Likewise.
>>       (cortexa12_extra_costs) Likewise.
>>       (cortexv7m_extra_costs) Likewise.
>>       (arm_mem_costs): Use table lookup to calculate cost of addressing
>>       mode.
>>
>> Change-Id: If71bd7c4f4bb876c5ed82dc28791130efb8bf89e
>> ---
>>  gcc/config/arm/aarch-common-protos.h | 16 +++++++++++
>>  gcc/config/arm/aarch-cost-tables.h   | 54 ++++++++++++++++++++++++++++++----
>>  gcc/config/arm/arm.c                 | 56 ++++++++++++++++++++++++++++++------
>>  3 files changed, 113 insertions(+), 13 deletions(-)
>>
>> diff --git a/gcc/config/arm/aarch-common-protos.h b/gcc/config/arm/aarch-common-protos.h
>> index 7c2bb4c..f6fcc94 100644
>> --- a/gcc/config/arm/aarch-common-protos.h
>> +++ b/gcc/config/arm/aarch-common-protos.h
>> @@ -130,6 +130,21 @@ struct vector_cost_table
>>    const int alu;
>>  };
>>
>> +enum arm_addr_mode_op
>> +{
>> +   AMO_DEFAULT,
>> +   AMO_NO_WB,
>> +   AMO_WB,
>> +   AMO_MAX /* for array size */
>
> Comment style: Capital letter at start, full stop and two spaces at the end.

Done.

> The enum and structure below should have some comments explaining what
> they are for.

Done.

>> +const struct addr_mode_cost_table generic_addr_mode_costs =
>> +{
>> +  /* int */
>> +  { 0, 0, 0 },
>
> Elements need to be commented, otherwise minor changes to the contents
> of the arrays is hard to update and maintain.

Done.


>> +  /* Addressing mode */
>
> Comment style.

Done.

>> -    /* TODO: Add table-driven costs for addressing modes.  */
>> +    arm_addr_mode_op op_type;
>> +    switch (GET_CODE(XEXP (x, 0)))
>> +    {
>> +    case REG:
>> +    default:
>> +      op_type = AMO_DEFAULT;
>
> Why would REG and default share the same cost?  Presumably default is
> too complex to recognize, but then it's unlikely to be cheap.

Default covers literals in various forms of RTL, for which the cost is
the same as regular, and PIC, which is handled in the original code
above this section.

>> +      break;
>> +    case PLUS:
>> +    case MINUS:
>> +      op_type = AMO_NO_WB;
>
> GCC doesn't support MINUS in addresses, even though the architecture
> could in theory handle this.

I've noted that in a comment, but kept the "case MINUS:" in place.

> Also, I think you need a different cost for scaled offset addressing,
> and possibly even for different scaling factors and types of scaling.

... see below:

>> +      break;
>> +    case PRE_INC:
>> +    case PRE_DEC:
>> +    case POST_INC:
>> +    case POST_DEC:
>> +    case PRE_MODIFY:
>> +    case POST_MODIFY:
>> +      op_type = AMO_WB;
>
> Pre and post might also need separate entries (plus further entries for
> scaling).  A post operation might happen in parallel with the data
> fetch, while a pre operation must happen before the address can be sent
> to the load/store pipe.

The {DEFAULT, NO_WB, WB} range is also Ramana's requested design. I
think this is OK because it is sufficient to describe the currently
publicly documented microarchitectures, and the structure is readily
extensible if future cores require more detailed modelling.

Expanding the range of AMO_* entries in the array makes the tables
more unwieldy for no immediate benefit.

>> +      break;
>> +    }
>
> blank line between the switch block and the subsequent statements.
> Also, the following needs to laid out in GNU style.

Done.

This version of the patch also moves the Cortex-A57 tuning (which was
present in the earlier version) into a separate patch. Discussion of
that in a follow-up post.

Updated patch attached

[-- Attachment #2: 0002-ARM-Add-table-of-costs-for-AAarch32-addressing-modes.patch --]
[-- Type: text/x-patch, Size: 7541 bytes --]

From b6968804f37fa999c090d7316ab59f40b14b5f49 Mon Sep 17 00:00:00 2001
From: Charles Baylis <charles.baylis@linaro.org>
Date: Wed, 8 Feb 2017 16:52:40 +0000
Subject: [PATCH 2/3] [ARM] Add table of costs for AAarch32 addressing modes.

This patch adds support for modelling the varying costs of
different addressing modes. The generic cost table treats
all addressing modes as having equal cost.

gcc/ChangeLog:

<date>  Charles Baylis  <charles.baylis@linaro.org>

	* config/arm/aarch-common-protos.h (enum arm_addr_mode_op): New.
	(struct addr_mode_cost_table): New.
	(struct cpu_cost_table): Add pointer to an addr_mode_cost_table.
	* config/arm/aarch-cost-tables.h: (generic_addr_mode_costs): New.
	(generic_extra_costs) Initialise aarch32_addr_mode.
	(cortexa53_extra_costs) Likewise.
	(cortexa57_extra_costs) Likewise.
	(exynosm1_extra_costs) Likewise.
	(xgene1_extra_costs) Likewise.
	(qdf24xx_extra_costs) Likewise.
	* config/arm/arm.c (cortexa9_extra_costs) Initialise aarch32_addr_mode.
	(cortexa9_extra_costs) Likewise.
	(cortexa8_extra_costs) Likewise.
	(cortexa5_extra_costs) Likewise.
	(cortexa7_extra_costs) Likewise.
	(cortexa12_extra_costs) Likewise.
	(cortexv7m_extra_costs) Likewise.
	(arm_mem_costs): Use table lookup to calculate cost of addressing
	mode.

Change-Id: If71bd7c4f4bb876c5ed82dc28791130efb8bf89e
---
 gcc/config/arm/aarch-common-protos.h | 20 +++++++++++++
 gcc/config/arm/aarch-cost-tables.h   | 38 +++++++++++++++++++----
 gcc/config/arm/arm.c                 | 58 +++++++++++++++++++++++++++++++-----
 3 files changed, 103 insertions(+), 13 deletions(-)

diff --git a/gcc/config/arm/aarch-common-protos.h b/gcc/config/arm/aarch-common-protos.h
index a511211..7887c92 100644
--- a/gcc/config/arm/aarch-common-protos.h
+++ b/gcc/config/arm/aarch-common-protos.h
@@ -131,6 +131,25 @@ struct vector_cost_table
   const int alu;
 };
 
+/* Addressing mode operations.  Used to index tables in struct
+   addr_mode_cost_table.  */
+enum arm_addr_mode_op
+{
+   AMO_DEFAULT,
+   AMO_NO_WB,	/* Offset with no writeback.  */
+   AMO_WB,	/* Offset with writeback.  */
+   AMO_MAX	/* For array size.  */
+};
+
+/* Table of additional costs when using addressing modes for each
+   access type.  */
+struct addr_mode_cost_table
+{
+   const int integer[AMO_MAX];
+   const int fp[AMO_MAX];
+   const int vector[AMO_MAX];
+};
+
 struct cpu_cost_table
 {
   const struct alu_cost_table alu;
@@ -138,6 +157,7 @@ struct cpu_cost_table
   const struct mem_cost_table ldst;
   const struct fp_cost_table fp[2]; /* SFmode and DFmode.  */
   const struct vector_cost_table vect;
+  const struct addr_mode_cost_table *aarch32_addr_mode;
 };
 
 
diff --git a/gcc/config/arm/aarch-cost-tables.h b/gcc/config/arm/aarch-cost-tables.h
index dc67faf..b16a95c 100644
--- a/gcc/config/arm/aarch-cost-tables.h
+++ b/gcc/config/arm/aarch-cost-tables.h
@@ -22,6 +22,29 @@
 #ifndef GCC_AARCH_COST_TABLES_H
 #define GCC_AARCH_COST_TABLES_H
 
+const struct addr_mode_cost_table generic_addr_mode_costs =
+{
+  /* int.  */
+  {
+    0,	/* AMO_DEFAULT.  */
+    0,	/* AMO_NO_WB.  */
+    0	/* AMO_WB.  */
+  },
+  /* float.  */
+  {
+    0,	/* AMO_DEFAULT.  */
+    0,	/* AMO_NO_WB.  */
+    0	/* AMO_WB.  */
+  },
+  /* vector.  */
+  {
+    0,	/* AMO_DEFAULT.  */
+    0,	/* AMO_NO_WB.  */
+    0	/* AMO_WB.  */
+  }
+};
+
+
 const struct cpu_cost_table generic_extra_costs =
 {
   /* ALU */
@@ -122,7 +145,8 @@ const struct cpu_cost_table generic_extra_costs =
   /* Vector */
   {
     COSTS_N_INSNS (1)	/* alu.  */
-  }
+  },
+  &generic_addr_mode_costs	/* aarch32_addr_mode.  */
 };
 
 const struct cpu_cost_table cortexa53_extra_costs =
@@ -225,7 +249,8 @@ const struct cpu_cost_table cortexa53_extra_costs =
   /* Vector */
   {
     COSTS_N_INSNS (1)	/* alu.  */
-  }
+  },
+  &generic_addr_mode_costs	/* aarch32_addr_mode.  */
 };
 
 const struct cpu_cost_table cortexa57_extra_costs =
@@ -328,7 +353,8 @@ const struct cpu_cost_table cortexa57_extra_costs =
   /* Vector */
   {
     COSTS_N_INSNS (1)  /* alu.  */
-  }
+  },
+  &generic_addr_mode_costs	/* aarch32_addr_mode.  */
 };
 
 const struct cpu_cost_table exynosm1_extra_costs =
@@ -431,7 +457,8 @@ const struct cpu_cost_table exynosm1_extra_costs =
   /* Vector */
   {
     COSTS_N_INSNS (0)  /* alu.  */
-  }
+  },
+  &generic_addr_mode_costs	/* aarch32_addr_mode.  */
 };
 
 const struct cpu_cost_table xgene1_extra_costs =
@@ -534,7 +561,8 @@ const struct cpu_cost_table xgene1_extra_costs =
   /* Vector */
   {
     COSTS_N_INSNS (2)  /* alu.  */
-  }
+  },
+  &generic_addr_mode_costs	/* aarch32_addr_mode.  */
 };
 
 #endif /* GCC_AARCH_COST_TABLES_H */
diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c
index 13cd421..ae5140c 100644
--- a/gcc/config/arm/arm.c
+++ b/gcc/config/arm/arm.c
@@ -1117,7 +1117,8 @@ const struct cpu_cost_table cortexa9_extra_costs =
   /* Vector */
   {
     COSTS_N_INSNS (1)	/* alu.  */
-  }
+  },
+  &generic_addr_mode_costs	/* aarch32_addr_mode.  */
 };
 
 const struct cpu_cost_table cortexa8_extra_costs =
@@ -1220,7 +1221,8 @@ const struct cpu_cost_table cortexa8_extra_costs =
   /* Vector */
   {
     COSTS_N_INSNS (1)	/* alu.  */
-  }
+  },
+  &generic_addr_mode_costs	/* aarch32_addr_mode.  */
 };
 
 const struct cpu_cost_table cortexa5_extra_costs =
@@ -1324,7 +1326,8 @@ const struct cpu_cost_table cortexa5_extra_costs =
   /* Vector */
   {
     COSTS_N_INSNS (1)	/* alu.  */
-  }
+  },
+  &generic_addr_mode_costs	/* aarch32_addr_mode.  */
 };
 
 
@@ -1429,7 +1432,8 @@ const struct cpu_cost_table cortexa7_extra_costs =
   /* Vector */
   {
     COSTS_N_INSNS (1)	/* alu.  */
-  }
+  },
+  &generic_addr_mode_costs	/* aarch32_addr_mode.  */
 };
 
 const struct cpu_cost_table cortexa12_extra_costs =
@@ -1532,7 +1536,8 @@ const struct cpu_cost_table cortexa12_extra_costs =
   /* Vector */
   {
     COSTS_N_INSNS (1)	/* alu.  */
-  }
+  },
+  &generic_addr_mode_costs	/* aarch32_addr_mode.  */
 };
 
 const struct cpu_cost_table cortexa15_extra_costs =
@@ -1635,7 +1640,8 @@ const struct cpu_cost_table cortexa15_extra_costs =
   /* Vector */
   {
     COSTS_N_INSNS (1)	/* alu.  */
-  }
+  },
+  &generic_addr_mode_costs	/* aarch32_addr_mode.  */
 };
 
 const struct cpu_cost_table v7m_extra_costs =
@@ -1738,7 +1744,8 @@ const struct cpu_cost_table v7m_extra_costs =
   /* Vector */
   {
     COSTS_N_INSNS (1)	/* alu.  */
-  }
+  },
+  &generic_addr_mode_costs	/* aarch32_addr_mode.  */
 };
 
 const struct tune_params arm_slowmul_tune =
@@ -9219,7 +9226,42 @@ arm_mem_costs (rtx x, const struct cpu_cost_table *extra_cost,
   /* Calculate cost of the addressing mode.  */
   if (speed_p)
     {
-      /* TODO: Add table-driven costs for addressing modes.  (See patch 2) */
+      arm_addr_mode_op op_type;
+      switch (GET_CODE (XEXP (x, 0)))
+	{
+	default:
+	case REG:
+	  op_type = AMO_DEFAULT;
+	  break;
+	case MINUS:
+	  /* MINUS does not appear in RTL, but the architecture supports it,
+	     so handle this case defensively.  */
+	  /* fall through */
+	case PLUS:
+	  op_type = AMO_NO_WB;
+	  break;
+	case PRE_INC:
+	case PRE_DEC:
+	case POST_INC:
+	case POST_DEC:
+	case PRE_MODIFY:
+	case POST_MODIFY:
+	  op_type = AMO_WB;
+	  break;
+	}
+
+      if (VECTOR_MODE_P (mode))
+	{
+	  *cost += extra_cost->aarch32_addr_mode->vector[op_type];
+	}
+      else if (FLOAT_MODE_P (mode))
+	{
+	  *cost += extra_cost->aarch32_addr_mode->fp[op_type];
+	}
+      else
+	{
+	  *cost += extra_cost->aarch32_addr_mode->integer[op_type];
+	}
     }
 
   /* Calculate cost of memory access.  */
-- 
2.7.4


^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH 2/2] [ARM] Add table of costs for AAarch32 addressing modes.
  2017-08-25 19:05     ` Charles Baylis
@ 2017-08-25 19:10       ` Andrew Pinski
  0 siblings, 0 replies; 10+ messages in thread
From: Andrew Pinski @ 2017-08-25 19:10 UTC (permalink / raw)
  To: Charles Baylis
  Cc: Richard Earnshaw (lists),
	Ramana Radhakrishnan, Kyrylo Tkachov, Richard Earnshaw,
	GCC Patches

On Fri, Aug 25, 2017 at 10:43 AM, Charles Baylis
<charles.baylis@linaro.org> wrote:
> On 9 June 2017 at 15:13, Richard Earnshaw (lists)
> <Richard.Earnshaw@arm.com> wrote:
>> On 21/02/17 16:54, charles.baylis@linaro.org wrote:
>>> From: Charles Baylis <charles.baylis@linaro.org>
>>>
>>> This patch adds support for modelling the varying costs of
>>> different addressing modes. The generic cost table treats
>>> all addressing modes as having equal cost. The cost table
>>> for Cortex-A57 is derived from http://infocenter.arm.com/help/topic/com.arm.doc.uan0015b/Cortex_A57_Software_Optimization_Guide_external.pdf
>>> and treats addressing modes with write-back as having a
>>> cost equal to one additional instruction.
>>>
>>> gcc/ChangeLog:
>>>
>>> <date>  Charles Baylis  <charles.baylis@linaro.org>
>>>
>>>       * config/arm/aarch-common-protos.h (enum arm_addr_mode_op): New.
>>>       (struct addr_mode_cost_table): New.
>>>       (struct cpu_cost_table): Add pointer to an addr_mode_cost_table.
>>>       * config/arm/aarch-cost-tables.h: (generic_addr_mode_costs): New.
>>>       (generic_extra_costs) Initialise aarch32_addr_mode.
>>>       (cortexa53_extra_costs) Likewise.
>>>       (addr_mode_costs_cortexa57) New.
>>>       (cortexa57_extra_costs) Initialise aarch32_addr_mode.
>>>       (exynosm1_extra_costs) Likewise.
>>>       (xgene1_extra_costs) Likewise.
>>>       (qdf24xx_extra_costs) Likewise.
>>>       * config/arm/arm.c (cortexa9_extra_costs) Initialise aarch32_addr_mode.
>>>       (cortexa9_extra_costs) Likewise.
>>>       (cortexa8_extra_costs) Likewise.
>>>       (cortexa5_extra_costs) Likewise.
>>>       (cortexa7_extra_costs) Likewise.
>>>       (cortexa12_extra_costs) Likewise.
>>>       (cortexv7m_extra_costs) Likewise.
>>>       (arm_mem_costs): Use table lookup to calculate cost of addressing
>>>       mode.
>>>
>>> Change-Id: If71bd7c4f4bb876c5ed82dc28791130efb8bf89e
>>> ---
>>>  gcc/config/arm/aarch-common-protos.h | 16 +++++++++++
>>>  gcc/config/arm/aarch-cost-tables.h   | 54 ++++++++++++++++++++++++++++++----
>>>  gcc/config/arm/arm.c                 | 56 ++++++++++++++++++++++++++++++------
>>>  3 files changed, 113 insertions(+), 13 deletions(-)
>>>
>>> diff --git a/gcc/config/arm/aarch-common-protos.h b/gcc/config/arm/aarch-common-protos.h
>>> index 7c2bb4c..f6fcc94 100644
>>> --- a/gcc/config/arm/aarch-common-protos.h
>>> +++ b/gcc/config/arm/aarch-common-protos.h
>>> @@ -130,6 +130,21 @@ struct vector_cost_table
>>>    const int alu;
>>>  };
>>>
>>> +enum arm_addr_mode_op
>>> +{
>>> +   AMO_DEFAULT,
>>> +   AMO_NO_WB,
>>> +   AMO_WB,
>>> +   AMO_MAX /* for array size */
>>
>> Comment style: Capital letter at start, full stop and two spaces at the end.
>
> Done.
>
>> The enum and structure below should have some comments explaining what
>> they are for.
>
> Done.
>
>>> +const struct addr_mode_cost_table generic_addr_mode_costs =
>>> +{
>>> +  /* int */
>>> +  { 0, 0, 0 },
>>
>> Elements need to be commented, otherwise minor changes to the contents
>> of the arrays is hard to update and maintain.
>
> Done.
>
>
>>> +  /* Addressing mode */
>>
>> Comment style.
>
> Done.
>
>>> -    /* TODO: Add table-driven costs for addressing modes.  */
>>> +    arm_addr_mode_op op_type;
>>> +    switch (GET_CODE(XEXP (x, 0)))
>>> +    {
>>> +    case REG:
>>> +    default:
>>> +      op_type = AMO_DEFAULT;
>>
>> Why would REG and default share the same cost?  Presumably default is
>> too complex to recognize, but then it's unlikely to be cheap.
>
> Default covers literals in various forms of RTL, for which the cost is
> the same as regular, and PIC, which is handled in the original code
> above this section.
>
>>> +      break;
>>> +    case PLUS:
>>> +    case MINUS:
>>> +      op_type = AMO_NO_WB;
>>
>> GCC doesn't support MINUS in addresses, even though the architecture
>> could in theory handle this.
>
> I've noted that in a comment, but kept the "case MINUS:" in place.
>
>> Also, I think you need a different cost for scaled offset addressing,
>> and possibly even for different scaling factors and types of scaling.
>
> ... see below:
>
>>> +      break;
>>> +    case PRE_INC:
>>> +    case PRE_DEC:
>>> +    case POST_INC:
>>> +    case POST_DEC:
>>> +    case PRE_MODIFY:
>>> +    case POST_MODIFY:
>>> +      op_type = AMO_WB;
>>
>> Pre and post might also need separate entries (plus further entries for
>> scaling).  A post operation might happen in parallel with the data
>> fetch, while a pre operation must happen before the address can be sent
>> to the load/store pipe.
>
> The {DEFAULT, NO_WB, WB} range is also Ramana's requested design. I
> think this is OK because it is sufficient to describe the currently
> publicly documented microarchitectures, and the structure is readily
> extensible if future cores require more detailed modelling.
>
> Expanding the range of AMO_* entries in the array makes the tables
> more unwieldy for no immediate benefit.
>
>>> +      break;
>>> +    }
>>
>> blank line between the switch block and the subsequent statements.
>> Also, the following needs to laid out in GNU style.
>
> Done.
>
> This version of the patch also moves the Cortex-A57 tuning (which was
> present in the earlier version) into a separate patch. Discussion of
> that in a follow-up post.

This will break aarch64 and the cores which are aarch64 only.  Also
your changelog is incorrect here:
(qdf24xx_extra_costs) Likewise.

Because qdf24xx_extra_costs has been moved to
aarch64/aarch64-cost-tables.h with the other 2 aarch64 only processors
(thunderx and thunderxt99).

Are you planing on moving cpu_addrcost_table in aarch64 to the same
model as you are using arm?  I would say you are losing information if
you do that as cpu_addrcost_table in aarch64 describes more than what
is described in your new table on the arm side.

Thanks,
Andrew

>
> Updated patch attached

^ permalink raw reply	[flat|nested] 10+ messages in thread

end of thread, other threads:[~2017-08-25 17:52 UTC | newest]

Thread overview: 10+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-02-21 16:54 [PATCH 0/2] [ARM] PR61551 addressing mode costs charles.baylis
2017-02-21 16:54 ` [PATCH 1/2] [ARM] Refactor costs calculation for MEM charles.baylis
2017-02-23  7:46   ` Bernhard Reutner-Fischer
2017-06-09 13:59   ` Richard Earnshaw (lists)
2017-08-25 18:16     ` Charles Baylis
2017-02-21 16:58 ` [PATCH 2/2] [ARM] Add table of costs for AAarch32 addressing modes charles.baylis
2017-06-09 14:13   ` Richard Earnshaw (lists)
2017-08-25 19:05     ` Charles Baylis
2017-08-25 19:10       ` Andrew Pinski
2017-05-12 15:53 ` [PATCH 0/2] [ARM] PR61551 addressing mode costs Charles Baylis

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).