public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed
* [PATCH] [AArch64, Falkor] Switch to using Falkor-specific vector costs
@ 2018-07-25 18:10 Luis Machado
  2018-07-26 14:34 ` Kyrill Tkachov
  2018-07-31 22:53 ` James Greenhalgh
  0 siblings, 2 replies; 7+ messages in thread
From: Luis Machado @ 2018-07-25 18:10 UTC (permalink / raw)
  To: gcc-patches; +Cc: james.greenhalgh, Richard.Earnshaw

The adjusted vector costs give Falkor a reasonable boost in performance for FP
benchmarks (both CPU2017 and CPU2006) and doesn't change INT benchmarks that
much. About 0.7% for CPU2017 FP and 1.54% for CPU2006 FP.

OK for trunk?

gcc/ChangeLog:

2018-07-25  Luis Machado  <luis.machado@linaro.org>

	* config/aarch64/aarch64.c (qdf24xx_vector_cost): New.
	(qdf24xx_tunings) <vec_costs>: Set to qdf24xx_vector_cost.
---
 gcc/config/aarch64/aarch64.c | 22 +++++++++++++++++++++-
 1 file changed, 21 insertions(+), 1 deletion(-)

diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
index fa01475..d443aee 100644
--- a/gcc/config/aarch64/aarch64.c
+++ b/gcc/config/aarch64/aarch64.c
@@ -430,6 +430,26 @@ static const struct cpu_vector_cost generic_vector_cost =
   1 /* cond_not_taken_branch_cost  */
 };
 
+/* Qualcomm QDF24xx costs for vector insn classes.  */
+static const struct cpu_vector_cost qdf24xx_vector_cost =
+{
+  1, /* scalar_int_stmt_cost  */
+  1, /* scalar_fp_stmt_cost  */
+  1, /* scalar_load_cost  */
+  1, /* scalar_store_cost  */
+  1, /* vec_int_stmt_cost  */
+  3, /* vec_fp_stmt_cost  */
+  2, /* vec_permute_cost  */
+  1, /* vec_to_scalar_cost  */
+  1, /* scalar_to_vec_cost  */
+  1, /* vec_align_load_cost  */
+  1, /* vec_unalign_load_cost  */
+  1, /* vec_unalign_store_cost  */
+  1, /* vec_store_cost  */
+  3, /* cond_taken_branch_cost  */
+  1  /* cond_not_taken_branch_cost  */
+};
+
 /* ThunderX costs for vector insn classes.  */
 static const struct cpu_vector_cost thunderx_vector_cost =
 {
@@ -890,7 +910,7 @@ static const struct tune_params qdf24xx_tunings =
   &qdf24xx_extra_costs,
   &qdf24xx_addrcost_table,
   &qdf24xx_regmove_cost,
-  &generic_vector_cost,
+  &qdf24xx_vector_cost,
   &generic_branch_cost,
   &generic_approx_modes,
   4, /* memmov_cost  */
-- 
2.7.4

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH] [AArch64, Falkor] Switch to using Falkor-specific vector costs
  2018-07-25 18:10 [PATCH] [AArch64, Falkor] Switch to using Falkor-specific vector costs Luis Machado
@ 2018-07-26 14:34 ` Kyrill Tkachov
  2018-07-26 14:47   ` Luis Machado
  2018-07-31 22:53 ` James Greenhalgh
  1 sibling, 1 reply; 7+ messages in thread
From: Kyrill Tkachov @ 2018-07-26 14:34 UTC (permalink / raw)
  To: Luis Machado, gcc-patches; +Cc: James Greenhalgh, Richard Earnshaw

Hi Luis,

On 25/07/18 19:10, Luis Machado wrote:
> The adjusted vector costs give Falkor a reasonable boost in performance for FP
> benchmarks (both CPU2017 and CPU2006) and doesn't change INT benchmarks that
> much. About 0.7% for CPU2017 FP and 1.54% for CPU2006 FP.
>
> OK for trunk?
>

The patch looks ok and safe to me (though you'll need approval from the maintainers).

I'd be interested to see what workloads in CPU2017 were affected by this.
Any chance you could post the breakdown in numbers from CPU2017?

Thanks,
Kyrill

> gcc/ChangeLog:
>
> 2018-07-25  Luis Machado  <luis.machado@linaro.org>
>
>         * config/aarch64/aarch64.c (qdf24xx_vector_cost): New.
>         (qdf24xx_tunings) <vec_costs>: Set to qdf24xx_vector_cost.
> ---
>  gcc/config/aarch64/aarch64.c | 22 +++++++++++++++++++++-
>  1 file changed, 21 insertions(+), 1 deletion(-)
>
> diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
> index fa01475..d443aee 100644
> --- a/gcc/config/aarch64/aarch64.c
> +++ b/gcc/config/aarch64/aarch64.c
> @@ -430,6 +430,26 @@ static const struct cpu_vector_cost generic_vector_cost =
>    1 /* cond_not_taken_branch_cost  */
>  };
>
> +/* Qualcomm QDF24xx costs for vector insn classes.  */
> +static const struct cpu_vector_cost qdf24xx_vector_cost =
> +{
> +  1, /* scalar_int_stmt_cost  */
> +  1, /* scalar_fp_stmt_cost  */
> +  1, /* scalar_load_cost  */
> +  1, /* scalar_store_cost  */
> +  1, /* vec_int_stmt_cost  */
> +  3, /* vec_fp_stmt_cost  */
> +  2, /* vec_permute_cost  */
> +  1, /* vec_to_scalar_cost  */
> +  1, /* scalar_to_vec_cost  */
> +  1, /* vec_align_load_cost  */
> +  1, /* vec_unalign_load_cost  */
> +  1, /* vec_unalign_store_cost  */
> +  1, /* vec_store_cost  */
> +  3, /* cond_taken_branch_cost  */
> +  1  /* cond_not_taken_branch_cost  */
> +};
> +
>  /* ThunderX costs for vector insn classes.  */
>  static const struct cpu_vector_cost thunderx_vector_cost =
>  {
> @@ -890,7 +910,7 @@ static const struct tune_params qdf24xx_tunings =
>    &qdf24xx_extra_costs,
>    &qdf24xx_addrcost_table,
>    &qdf24xx_regmove_cost,
> -  &generic_vector_cost,
> +  &qdf24xx_vector_cost,
>    &generic_branch_cost,
>    &generic_approx_modes,
>    4, /* memmov_cost  */
> -- 
> 2.7.4
>

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH] [AArch64, Falkor] Switch to using Falkor-specific vector costs
  2018-07-26 14:34 ` Kyrill Tkachov
@ 2018-07-26 14:47   ` Luis Machado
  0 siblings, 0 replies; 7+ messages in thread
From: Luis Machado @ 2018-07-26 14:47 UTC (permalink / raw)
  To: Kyrill Tkachov, gcc-patches; +Cc: James Greenhalgh, Richard Earnshaw

Hi Kyrill,

On 07/26/2018 11:34 AM, Kyrill Tkachov wrote:
> Hi Luis,
> 
> On 25/07/18 19:10, Luis Machado wrote:
>> The adjusted vector costs give Falkor a reasonable boost in 
>> performance for FP
>> benchmarks (both CPU2017 and CPU2006) and doesn't change INT 
>> benchmarks that
>> much. About 0.7% for CPU2017 FP and 1.54% for CPU2006 FP.
>>
>> OK for trunk?
>>
> 
> The patch looks ok and safe to me (though you'll need approval from the 
> maintainers).
> 
> I'd be interested to see what workloads in CPU2017 were affected by this.
> Any chance you could post the breakdown in numbers from CPU2017?
> 

Sure. Here it is (speed):

605.mcf_s: -1.8%
620.omnetpp_s: -2% (tends to be noisy)
623.xalancbmk_s: 2%
654.roms_s: 7%

INT mean: -0.09%
FP mean: 0.70%

It is worth mentioning i noticed bigger improvements in CPU2017 rate, 
but i did not record those numbers for the final run. The speed 
benchmarks seem to have a slightly different performance profile.

Here's a breakdown of the biggest changes from CPU2006 in case you're 
interested:

410.bwaves: 5.4%
434.zeusmp: 9.7%
436.cactusADM: -12.3%
437.leslie3d: 5.2%
459.GemsFDTD: 16.9%

cactusADM seems to have a pretty big loop that is a win if vectorized, 
but experimentation showed me it is tricky to get GCC to vectorize that 
specific loop without also vectorizing particular loops from the other 
benchmarks.

It would be nice to get cactusADM back up though.

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH] [AArch64, Falkor] Switch to using Falkor-specific vector costs
  2018-07-25 18:10 [PATCH] [AArch64, Falkor] Switch to using Falkor-specific vector costs Luis Machado
  2018-07-26 14:34 ` Kyrill Tkachov
@ 2018-07-31 22:53 ` James Greenhalgh
  2018-08-08  7:54   ` Siddhesh Poyarekar
  1 sibling, 1 reply; 7+ messages in thread
From: James Greenhalgh @ 2018-07-31 22:53 UTC (permalink / raw)
  To: Luis Machado; +Cc: gcc-patches, Richard Earnshaw, nd

On Wed, Jul 25, 2018 at 01:10:34PM -0500, Luis Machado wrote:
> The adjusted vector costs give Falkor a reasonable boost in performance for FP
> benchmarks (both CPU2017 and CPU2006) and doesn't change INT benchmarks that
> much. About 0.7% for CPU2017 FP and 1.54% for CPU2006 FP.
> 
> OK for trunk?

OK if this is what works best for your subtarget.

Thanks,
James

> 
> gcc/ChangeLog:
> 
> 2018-07-25  Luis Machado  <luis.machado@linaro.org>
> 
> 	* config/aarch64/aarch64.c (qdf24xx_vector_cost): New.
> 	(qdf24xx_tunings) <vec_costs>: Set to qdf24xx_vector_cost.
> ---
>  gcc/config/aarch64/aarch64.c | 22 +++++++++++++++++++++-
>  1 file changed, 21 insertions(+), 1 deletion(-)
> 
> diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
> index fa01475..d443aee 100644
> --- a/gcc/config/aarch64/aarch64.c
> +++ b/gcc/config/aarch64/aarch64.c
> @@ -430,6 +430,26 @@ static const struct cpu_vector_cost generic_vector_cost =
>    1 /* cond_not_taken_branch_cost  */
>  };
>  
> +/* Qualcomm QDF24xx costs for vector insn classes.  */
> +static const struct cpu_vector_cost qdf24xx_vector_cost =
> +{
> +  1, /* scalar_int_stmt_cost  */
> +  1, /* scalar_fp_stmt_cost  */
> +  1, /* scalar_load_cost  */
> +  1, /* scalar_store_cost  */
> +  1, /* vec_int_stmt_cost  */
> +  3, /* vec_fp_stmt_cost  */
> +  2, /* vec_permute_cost  */
> +  1, /* vec_to_scalar_cost  */
> +  1, /* scalar_to_vec_cost  */
> +  1, /* vec_align_load_cost  */
> +  1, /* vec_unalign_load_cost  */
> +  1, /* vec_unalign_store_cost  */
> +  1, /* vec_store_cost  */
> +  3, /* cond_taken_branch_cost  */
> +  1  /* cond_not_taken_branch_cost  */
> +};
> +
>  /* ThunderX costs for vector insn classes.  */
>  static const struct cpu_vector_cost thunderx_vector_cost =
>  {
> @@ -890,7 +910,7 @@ static const struct tune_params qdf24xx_tunings =
>    &qdf24xx_extra_costs,
>    &qdf24xx_addrcost_table,
>    &qdf24xx_regmove_cost,
> -  &generic_vector_cost,
> +  &qdf24xx_vector_cost,
>    &generic_branch_cost,
>    &generic_approx_modes,
>    4, /* memmov_cost  */
> -- 
> 2.7.4
> 

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH] [AArch64, Falkor] Switch to using Falkor-specific vector costs
  2018-07-31 22:53 ` James Greenhalgh
@ 2018-08-08  7:54   ` Siddhesh Poyarekar
  2018-08-27 15:05     ` Luis Machado
  0 siblings, 1 reply; 7+ messages in thread
From: Siddhesh Poyarekar @ 2018-08-08  7:54 UTC (permalink / raw)
  To: James Greenhalgh, Luis Machado; +Cc: gcc-patches, Richard Earnshaw, nd

On 08/01/2018 04:23 AM, James Greenhalgh wrote:
> On Wed, Jul 25, 2018 at 01:10:34PM -0500, Luis Machado wrote:
>> The adjusted vector costs give Falkor a reasonable boost in performance for FP
>> benchmarks (both CPU2017 and CPU2006) and doesn't change INT benchmarks that
>> much. About 0.7% for CPU2017 FP and 1.54% for CPU2006 FP.
>>
>> OK for trunk?
> 
> OK if this is what works best for your subtarget.
>

I have pushed this for Luis since he is on holiday.

Thanks,
Siddhesh

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH] [AArch64, Falkor] Switch to using Falkor-specific vector costs
  2018-08-08  7:54   ` Siddhesh Poyarekar
@ 2018-08-27 15:05     ` Luis Machado
  2018-08-28 22:00       ` James Greenhalgh
  0 siblings, 1 reply; 7+ messages in thread
From: Luis Machado @ 2018-08-27 15:05 UTC (permalink / raw)
  To: Siddhesh Poyarekar, James Greenhalgh; +Cc: gcc-patches, Richard Earnshaw, nd

Hi,

On 08/08/2018 04:54 AM, Siddhesh Poyarekar wrote:
> On 08/01/2018 04:23 AM, James Greenhalgh wrote:
>> On Wed, Jul 25, 2018 at 01:10:34PM -0500, Luis Machado wrote:
>>> The adjusted vector costs give Falkor a reasonable boost in 
>>> performance for FP
>>> benchmarks (both CPU2017 and CPU2006) and doesn't change INT 
>>> benchmarks that
>>> much. About 0.7% for CPU2017 FP and 1.54% for CPU2006 FP.
>>>
>>> OK for trunk?
>>
>> OK if this is what works best for your subtarget.
>>
> 
> I have pushed this for Luis since he is on holiday.
> 
> Thanks,
> Siddhesh

Given this has a significant performance impact on Falkor and the 
changes are very localized and specific to Falkor, it would be great to 
have them on GCC 8. Would it be ok to push it?

Thanks,
Luis

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH] [AArch64, Falkor] Switch to using Falkor-specific vector costs
  2018-08-27 15:05     ` Luis Machado
@ 2018-08-28 22:00       ` James Greenhalgh
  0 siblings, 0 replies; 7+ messages in thread
From: James Greenhalgh @ 2018-08-28 22:00 UTC (permalink / raw)
  To: Luis Machado; +Cc: Siddhesh Poyarekar, gcc-patches, Richard Earnshaw, nd

On Mon, Aug 27, 2018 at 10:05:17AM -0500, Luis Machado wrote:
> Hi,
> 
> On 08/08/2018 04:54 AM, Siddhesh Poyarekar wrote:
> > On 08/01/2018 04:23 AM, James Greenhalgh wrote:
> >> On Wed, Jul 25, 2018 at 01:10:34PM -0500, Luis Machado wrote:
> >>> The adjusted vector costs give Falkor a reasonable boost in 
> >>> performance for FP
> >>> benchmarks (both CPU2017 and CPU2006) and doesn't change INT 
> >>> benchmarks that
> >>> much. About 0.7% for CPU2017 FP and 1.54% for CPU2006 FP.
> >>>
> >>> OK for trunk?
> >>
> >> OK if this is what works best for your subtarget.
> >>
> > 
> > I have pushed this for Luis since he is on holiday.
> > 
> > Thanks,
> > Siddhesh
> 
> Given this has a significant performance impact on Falkor and the 
> changes are very localized and specific to Falkor, it would be great to 
> have them on GCC 8. Would it be ok to push it?

I've not got a problem with this if your opinion is that the risk is
worthwhile for Falkor.

Thanks,
James

^ permalink raw reply	[flat|nested] 7+ messages in thread

end of thread, other threads:[~2018-08-28 22:00 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-07-25 18:10 [PATCH] [AArch64, Falkor] Switch to using Falkor-specific vector costs Luis Machado
2018-07-26 14:34 ` Kyrill Tkachov
2018-07-26 14:47   ` Luis Machado
2018-07-31 22:53 ` James Greenhalgh
2018-08-08  7:54   ` Siddhesh Poyarekar
2018-08-27 15:05     ` Luis Machado
2018-08-28 22:00       ` James Greenhalgh

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).