[PATCH][AARCH64] Set jump-align=4 for neoversen1

public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed

* [PATCH][AARCH64] Set jump-align=4 for neoversen1
@ 2019-12-24 16:25 Wilco Dijkstra
  2020-01-16 17:53 ` Wilco Dijkstra
  2020-01-17  9:25 ` Richard Sandiford
  0 siblings, 2 replies; 6+ messages in thread
From: Wilco Dijkstra @ 2019-12-24 16:25 UTC (permalink / raw)
  To: GCC Patches; +Cc: Richard Earnshaw, Kyrylo Tkachov, Richard Sandiford

Testing shows the setting of 32:16 for jump alignment has a significant codesize
cost, however it doesn't make a difference in performance. So set jump-align 
to 4 to get 1.6% codesize improvement.

OK for commit?

ChangeLog
2019-12-24  Wilco Dijkstra  <wdijkstr@arm.com>

	* config/aarch64/aarch64.c (neoversen1_tunings): Set jump_align to 4.

--
diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
index 1646ed1d9a3de8ee2f0abff385a1ea145e234475..209ed8ebbe81104d9d8cff0df31946ab7704fb33 100644
--- a/gcc/config/aarch64/aarch64.c
+++ b/gcc/config/aarch64/aarch64.c
@@ -1132,7 +1132,7 @@ static const struct tune_params neoversen1_tunings =
   3, /* issue_rate  */
   (AARCH64_FUSE_AES_AESMC | AARCH64_FUSE_CMP_BRANCH), /* fusible_ops  */
   "32:16",	/* function_align.  */
-  "32:16",	/* jump_align.  */
+  "4",	/* jump_align.  */
   "32:16",	/* loop_align.  */
   2,	/* int_reassoc_width.  */
   4,	/* fp_reassoc_width.  */

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH][AARCH64] Set jump-align=4 for neoversen1
  2019-12-24 16:25 [PATCH][AARCH64] Set jump-align=4 for neoversen1 Wilco Dijkstra
@ 2020-01-16 17:53 ` Wilco Dijkstra
  2020-01-17  9:25 ` Richard Sandiford
  1 sibling, 0 replies; 6+ messages in thread
From: Wilco Dijkstra @ 2020-01-16 17:53 UTC (permalink / raw)
  To: GCC Patches; +Cc: Richard Earnshaw, Kyrylo Tkachov, Richard Sandiford

ping


Testing shows the setting of 32:16 for jump alignment has a significant codesize
cost, however it doesn't make a difference in performance. So set jump-align
to 4 to get 1.6% codesize improvement.

OK for commit?

ChangeLog
2019-12-24  Wilco Dijkstra  <wdijkstr@arm.com>

        * config/aarch64/aarch64.c (neoversen1_tunings): Set jump_align to 4.

--
diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
index 1646ed1d9a3de8ee2f0abff385a1ea145e234475..209ed8ebbe81104d9d8cff0df31946ab7704fb33 100644
--- a/gcc/config/aarch64/aarch64.c
+++ b/gcc/config/aarch64/aarch64.c
@@ -1132,7 +1132,7 @@ static const struct tune_params neoversen1_tunings =
   3, /* issue_rate  */
   (AARCH64_FUSE_AES_AESMC | AARCH64_FUSE_CMP_BRANCH), /* fusible_ops  */
   "32:16",     /* function_align.  */
-  "32:16",     /* jump_align.  */
+  "4", /* jump_align.  */
   "32:16",     /* loop_align.  */
   2,   /* int_reassoc_width.  */
   4,   /* fp_reassoc_width.  */

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH][AARCH64] Set jump-align=4 for neoversen1
  2019-12-24 16:25 [PATCH][AARCH64] Set jump-align=4 for neoversen1 Wilco Dijkstra
  2020-01-16 17:53 ` Wilco Dijkstra
@ 2020-01-17  9:25 ` Richard Sandiford
  2020-01-17 11:12   ` Kyrill Tkachov
  1 sibling, 1 reply; 6+ messages in thread
From: Richard Sandiford @ 2020-01-17  9:25 UTC (permalink / raw)
  To: Wilco Dijkstra; +Cc: GCC Patches, Richard Earnshaw, Kyrylo Tkachov

Wilco Dijkstra <Wilco.Dijkstra@arm.com> writes:
> Testing shows the setting of 32:16 for jump alignment has a significant codesize
> cost, however it doesn't make a difference in performance. So set jump-align
> to 4 to get 1.6% codesize improvement.

I was leaving this to others in case it was obvious to them.  On the
basis that silence suggests it wasn't, :-) could you go into more details?
Is it expected on first principles that jump alignment doesn't matter
for Neoverse N1, or is this purely based on experimentation?  If it's
expected, are we sure that the other "32:16" entries are still worthwhile?
When you say it doesn't make a difference in performance, does that mean
that no individual test's performance changed significantly, or just that
the aggregate score didn't?  Did you experiment with anything inbetween
the current 32:16 and 4, such as 32:8 or even 32:4?

The problem with applying the patch only with the explanation above is
that if someone in future has evidence that jump alignment can make a
difference for their testcase, it would be very hard for them to
reproduce the reasoning that led to this change.

Thanks,
Richard

> OK for commit?
>
> ChangeLog
> 2019-12-24  Wilco Dijkstra  <wdijkstr@arm.com>
>
> * config/aarch64/aarch64.c (neoversen1_tunings): Set jump_align to 4.
>
> --
> diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
> index 1646ed1d9a3de8ee2f0abff385a1ea145e234475..209ed8ebbe81104d9d8cff0df31946ab7704fb33 100644
> --- a/gcc/config/aarch64/aarch64.c
> +++ b/gcc/config/aarch64/aarch64.c
> @@ -1132,7 +1132,7 @@ static const struct tune_params neoversen1_tunings =
>    3, /* issue_rate  */
>    (AARCH64_FUSE_AES_AESMC | AARCH64_FUSE_CMP_BRANCH), /* fusible_ops  */
>    "32:16",/* function_align.  */
> -  "32:16",/* jump_align.  */
> +  "4",/* jump_align.  */
>    "32:16",/* loop_align.  */
>    2,/* int_reassoc_width.  */
>    4,/* fp_reassoc_width.  */

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH][AARCH64] Set jump-align=4 for neoversen1
  2020-01-17  9:25 ` Richard Sandiford
@ 2020-01-17 11:12   ` Kyrill Tkachov
  2020-01-17 17:53     ` Wilco Dijkstra
  0 siblings, 1 reply; 6+ messages in thread
From: Kyrill Tkachov @ 2020-01-17 11:12 UTC (permalink / raw)
  To: Wilco Dijkstra, GCC Patches, Richard Earnshaw, richard.sandiford

Hi Richard, Wilco,

On 1/17/20 8:43 AM, Richard Sandiford wrote:
> Wilco Dijkstra <Wilco.Dijkstra@arm.com> writes:
> > Testing shows the setting of 32:16 for jump alignment has a 
> significant codesize
> > cost, however it doesn't make a difference in performance. So set 
> jump-align
> > to 4 to get 1.6% codesize improvement.
>
> I was leaving this to others in case it was obvious to them.Â  On the
> basis that silence suggests it wasn't, :-) could you go into more details?
> Is it expected on first principles that jump alignment doesn't matter
> for Neoverse N1, or is this purely based on experimentation?Â  If it's
> expected, are we sure that the other "32:16" entries are still worthwhile?
> When you say it doesn't make a difference in performance, does that mean
> that no individual test's performance changed significantly, or just that
> the aggregate score didn't?Â  Did you experiment with anything inbetween
> the current 32:16 and 4, such as 32:8 or even 32:4?


Sorry for dragging my feet on this one, as I put in those numbers last 
year and I've been trying to recall my experiments from then.

The Neoverse N1 Software Optimization guide recommends aligning branch 
targets to 32 bytes withing the bounds of code density requirements.

 From my benchmarking last year I do seem to remember function and loop 
alignment to matter.

I probably added the jump alignment for completeness as it's a good idea 
from first principles. But if the code size hit is too large we could 
look to decrease it.

I'd also be interested in seeing the impact of 32:8 and 32:4.

Thanks,

Kyrill


>
> The problem with applying the patch only with the explanation above is
> that if someone in future has evidence that jump alignment can make a
> difference for their testcase, it would be very hard for them to
> reproduce the reasoning that led to this change.
>
> Thanks,
> Richard
>
> > OK for commit?
> >
> > ChangeLog
> > 2019-12-24Â  Wilco DijkstraÂ  <wdijkstr@arm.com>
> >
> > * config/aarch64/aarch64.c (neoversen1_tunings): Set jump_align to 4.
> >
> > --
> > diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
> > index 
> 1646ed1d9a3de8ee2f0abff385a1ea145e234475..209ed8ebbe81104d9d8cff0df31946ab7704fb33 
> 100644
> > --- a/gcc/config/aarch64/aarch64.c
> > +++ b/gcc/config/aarch64/aarch64.c
> > @@ -1132,7 +1132,7 @@ static const struct tune_params 
> neoversen1_tunings =
> >Â Â Â  3, /* issue_rateÂ  */
> >Â Â Â  (AARCH64_FUSE_AES_AESMC | AARCH64_FUSE_CMP_BRANCH), /* 
> fusible_opsÂ  */
> >Â Â Â  "32:16",/* function_align.Â  */
> > -Â  "32:16",/* jump_align.Â  */
> > +Â  "4",/* jump_align.Â  */
> >Â Â Â  "32:16",/* loop_align.Â  */
> >Â Â Â  2,/* int_reassoc_width.Â  */
> >Â Â Â  4,/* fp_reassoc_width.Â  */

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH][AARCH64] Set jump-align=4 for neoversen1
  2020-01-17 11:12   ` Kyrill Tkachov
@ 2020-01-17 17:53     ` Wilco Dijkstra
  2020-01-20 14:33       ` Richard Sandiford
  0 siblings, 1 reply; 6+ messages in thread
From: Wilco Dijkstra @ 2020-01-17 17:53 UTC (permalink / raw)
  To: Kyrill Tkachov, GCC Patches, Richard Earnshaw, Richard Sandiford

Hi Kyrill & Richard,

> I was leaving this to others in case it was obvious to them.  On the
> basis that silence suggests it wasn't, :-) could you go into more details?
> Is it expected on first principles that jump alignment doesn't matter
> for Neoverse N1, or is this purely based on experimentation?  If it's

Jump alignment is set to 4 on almost all cores because higher values have
a major codesize cost and yet give no performance gains.

I suspect any core that set it higher has done so by accident rather than
having benchmarked the cost/benefit.

> expected, are we sure that the other "32:16" entries are still worthwhile?
> When you say it doesn't make a difference in performance, does that mean
> that no individual test's performance changed significantly, or just that
> the aggregate score didn't?  Did you experiment with anything inbetween
> the current 32:16 and 4, such as 32:8 or even 32:4?

I mean there is no difference above the noise floor for any test you throw at it.
I tried other alignments including 32:16, 32:12, 32:8 but all have a significant
cost and zero benefit.

Cheers,
Wilco

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH][AARCH64] Set jump-align=4 for neoversen1
  2020-01-17 17:53     ` Wilco Dijkstra
@ 2020-01-20 14:33       ` Richard Sandiford
  0 siblings, 0 replies; 6+ messages in thread
From: Richard Sandiford @ 2020-01-20 14:33 UTC (permalink / raw)
  To: Wilco Dijkstra; +Cc: Kyrill Tkachov, GCC Patches, Richard Earnshaw

Wilco Dijkstra <Wilco.Dijkstra@arm.com> writes:
> Hi Kyrill & Richard,
>
>> I was leaving this to others in case it was obvious to them.  On the
>> basis that silence suggests it wasn't, :-) could you go into more details?
>> Is it expected on first principles that jump alignment doesn't matter
>> for Neoverse N1, or is this purely based on experimentation?  If it's
>
> Jump alignment is set to 4 on almost all cores because higher values have
> a major codesize cost and yet give no performance gains.
>
> I suspect any core that set it higher has done so by accident rather than
> having benchmarked the cost/benefit.
>
>> expected, are we sure that the other "32:16" entries are still worthwhile?
>> When you say it doesn't make a difference in performance, does that mean
>> that no individual test's performance changed significantly, or just that
>> the aggregate score didn't?  Did you experiment with anything inbetween
>> the current 32:16 and 4, such as 32:8 or even 32:4?
>
> I mean there is no difference above the noise floor for any test you throw at it.

OK, great.  In that case, let's go with the patch as posted.

Thanks,
Richard

> I tried other alignments including 32:16, 32:12, 32:8 but all have a significant
> cost and zero benefit.
>
> Cheers,
> Wilco

^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2020-01-20 14:23 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-12-24 16:25 [PATCH][AARCH64] Set jump-align=4 for neoversen1 Wilco Dijkstra
2020-01-16 17:53 ` Wilco Dijkstra
2020-01-17  9:25 ` Richard Sandiford
2020-01-17 11:12   ` Kyrill Tkachov
2020-01-17 17:53     ` Wilco Dijkstra
2020-01-20 14:33       ` Richard Sandiford

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).