public inbox for gcc-patches@gcc.gnu.org
* [PATCH 5/8] Enable max_issue for AArch32 and AArch64
From: Maxim Kuvyrkov @ 2014-10-21  3:35 UTC
  To: ramrad01, Marcus Shawcroft; +Cc: GCC Patches, Sebastian Pop

[-- Attachment #1: Type: text/plain, Size: 801 bytes --]

Hi Ramana,
Hi Marcus,

This patch enables max_issue multipass lookahead scheduling for the second scheduler pass (or, more pedantically, whenever register-pressure scheduling is not in use).

Multipass lookahead scheduling is enabled for cores that can issue 2 or more instructions per cycle, which allows the scheduler to better exploit multi-issue pipelines.  This patch also provides the foundation for the [upcoming] auto-prefetcher model in the scheduler, which is handled via max_issue.
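
As a rough standalone illustration of the rule described above (a sketch only, not the code in the attached patch): the lookahead depth equals the issue rate on multi-issue cores and is 0, meaning no multipass lookahead, on single-issue cores.  The names toy_issue_rate and toy_multipass_dfa_lookahead are hypothetical; the real hooks read the issue rate from per-core tuning data.

```c
#include <assert.h>

/* Hypothetical stand-in for a target's issue-rate hook.  The real hooks
   (aarch64_sched_issue_rate, arm_issue_rate) consult per-core tuning
   data; here a flag simply selects a dual-issue or single-issue core.  */
int
toy_issue_rate (int core_is_dual_issue)
{
  return core_is_dual_issue ? 2 : 1;
}

/* Same rule as the hooks in the patch: look ahead by the issue rate on
   multi-issue cores, and return 0 (lookahead disabled) otherwise.  */
int
toy_multipass_dfa_lookahead (int issue_rate)
{
  /* Assumption of this sketch: an issue rate is always at least 1.  */
  assert (issue_rate >= 1);
  return issue_rate > 1 ? issue_rate : 0;
}
```

With these definitions, toy_multipass_dfa_lookahead (toy_issue_rate (1)) gives a lookahead of 2 for the dual-issue case, while a single-issue core gets 0 and keeps the existing scheduling behaviour.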

This change requires benchmarking, which I can't easily do at the moment.  I would appreciate any benchmarking results that you can share.

Bootstrap on aarch64-linux-gnu is in progress.  OK to apply, provided no performance or correctness regressions?

Thank you,

--
Maxim Kuvyrkov
www.linaro.org



[-- Attachment #2: 0005-Enable-max_issue-for-AArch32-and-AArch64.patch --]
[-- Type: application/octet-stream, Size: 3267 bytes --]

From bf51463edee1d161ff8e03cf0af0c3ff8b258305 Mon Sep 17 00:00:00 2001
From: Maxim Kuvyrkov <maxim.kuvyrkov@linaro.org>
Date: Sat, 29 Mar 2014 07:12:52 +1300
Subject: [PATCH 5/8] Enable max_issue for AArch32 and AArch64

	* config/aarch64/aarch64.c
	(aarch64_sched_first_cycle_multipass_dfa_lookahead): Implement hook.
	(TARGET_SCHED_FIRST_CYCLE_MULTIPASS_DFA_LOOKAHEAD): Define.
	* config/arm/arm.c
	(arm_first_cycle_multipass_dfa_lookahead): Implement hook.
	(TARGET_SCHED_FIRST_CYCLE_MULTIPASS_DFA_LOOKAHEAD): Define.

---
 gcc/config/aarch64/aarch64.c |   12 ++++++++++++
 gcc/config/arm/arm.c         |   15 +++++++++++++++
 2 files changed, 27 insertions(+)

diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
index 2ad5c28..1512418 100644
--- a/gcc/config/aarch64/aarch64.c
+++ b/gcc/config/aarch64/aarch64.c
@@ -6077,6 +6077,14 @@ aarch64_sched_issue_rate (void)
   return aarch64_tune_params->issue_rate;
 }
 
+static int
+aarch64_sched_first_cycle_multipass_dfa_lookahead (void)
+{
+  int issue_rate = aarch64_sched_issue_rate ();
+
+  return issue_rate > 1 ? issue_rate : 0;
+}
+
 /* Vectorizer cost model target hooks.  */
 
 /* Implement targetm.vectorize.builtin_vectorization_cost.  */
@@ -10136,6 +10144,10 @@ aarch64_asan_shadow_offset (void)
 #undef TARGET_SCHED_ISSUE_RATE
 #define TARGET_SCHED_ISSUE_RATE aarch64_sched_issue_rate
 
+#undef TARGET_SCHED_FIRST_CYCLE_MULTIPASS_DFA_LOOKAHEAD
+#define TARGET_SCHED_FIRST_CYCLE_MULTIPASS_DFA_LOOKAHEAD \
+  aarch64_sched_first_cycle_multipass_dfa_lookahead
+
 #undef TARGET_TRAMPOLINE_INIT
 #define TARGET_TRAMPOLINE_INIT aarch64_trampoline_init
 
diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c
index 1ee0eb3..0f15c99 100644
--- a/gcc/config/arm/arm.c
+++ b/gcc/config/arm/arm.c
@@ -246,6 +246,7 @@ static void arm_option_override (void);
 static unsigned HOST_WIDE_INT arm_shift_truncation_mask (enum machine_mode);
 static bool arm_cannot_copy_insn_p (rtx_insn *);
 static int arm_issue_rate (void);
+static int arm_first_cycle_multipass_dfa_lookahead (void);
 static void arm_output_dwarf_dtprel (FILE *, int, rtx) ATTRIBUTE_UNUSED;
 static bool arm_output_addr_const_extra (FILE *, rtx);
 static bool arm_allocate_stack_slots_for_args (void);
@@ -591,6 +592,10 @@ static const struct attribute_spec arm_attribute_table[] =
 #undef TARGET_SCHED_ISSUE_RATE
 #define TARGET_SCHED_ISSUE_RATE arm_issue_rate
 
+#undef TARGET_SCHED_FIRST_CYCLE_MULTIPASS_DFA_LOOKAHEAD
+#define TARGET_SCHED_FIRST_CYCLE_MULTIPASS_DFA_LOOKAHEAD \
+  arm_first_cycle_multipass_dfa_lookahead
+
 #undef TARGET_MANGLE_TYPE
 #define TARGET_MANGLE_TYPE arm_mangle_type
 
@@ -29888,6 +29893,16 @@ arm_issue_rate (void)
     }
 }
 
+/* Return how many instructions the scheduler should look ahead to choose the
+   best one.  */
+static int
+arm_first_cycle_multipass_dfa_lookahead (void)
+{
+  int issue_rate = arm_issue_rate ();
+
+  return issue_rate > 1 ? issue_rate : 0;
+}
+
 /* A table and a function to perform ARM-specific name mangling for
    NEON vector types in order to conform to the AAPCS (see "Procedure
    Call Standard for the ARM Architecture", Appendix A).  To qualify
-- 
1.7.9.5



* Re: [PATCH 5/8] Enable max_issue for AArch32 and AArch64
From: Sebastian Pop @ 2014-10-22 18:35 UTC
  To: Maxim Kuvyrkov; +Cc: ramrad01, Marcus Shawcroft, GCC Patches

Maxim Kuvyrkov wrote:
> This change requires benchmarking, which I can't easily do at the moment.  I
> would appreciate any benchmarking results that you can share.

I will run my testsuite on aarch64.
Do you need the perf for all the previous patches 1 to 5 together, or just for patch 5?

Thanks,
Sebastian


* Re: [PATCH 5/8] Enable max_issue for AArch32 and AArch64
From: Sebastian Pop @ 2014-10-22 22:53 UTC
  To: Maxim Kuvyrkov; +Cc: ramrad01, Marcus Shawcroft, GCC Patches

Sebastian Pop wrote:
> Maxim Kuvyrkov wrote:
> > This change requires benchmarking, which I can't easily do at the moment.  I
> > would appreciate any benchmarking results that you can share.
> 
> I will run my testsuite on aarch64.
> Do you need the perf for all the previous patches 1 to 5 together, or just for patch 5?

I ran your patch 5 on top of r216539 at -O3.  There were few differences in
performance, all of very small amplitude: some small speedups and some small
slowdowns.  Let me know if you want a reduced testcase for the most
significant slowdown.

Sebastian


* Re: [PATCH 5/8] Enable max_issue for AArch32 and AArch64
From: Maxim Kuvyrkov @ 2014-10-22 23:01 UTC
  To: Sebastian Pop; +Cc: ramrad01, Marcus Shawcroft, GCC Patches

On Oct 23, 2014, at 11:42 AM, Sebastian Pop <sebpop@gmail.com> wrote:

> Sebastian Pop wrote:
>> Maxim Kuvyrkov wrote:
>>> This change requires benchmarking, which I can't easily do at the moment.  I
>>> would appreciate any benchmarking results that you can share.
>> 
>> I will run my testsuite on aarch64.
>> Do you need the perf for all the previous patches 1 to 5 together, or just for patch 5?
> 
> I ran your patch 5 on top of r216539 at -O3.  There were few differences in
> performance, all of very small amplitude: some small speedups and some small
> slowdowns.  Let me know if you want a reduced testcase for the most
> significant slowdown.

Hi Sebastian,

This is pretty much what I expected.  This patch enables the infrastructure for patch 7 and, by itself, was not expected to give measurable speedups.  Patch 7 should improve performance for Cortex-A15 (and any other cores that have similar auto-prefetcher hardware).

Thank you for the benchmarking!

--
Maxim Kuvyrkov
www.linaro.org


* Re: [PATCH 5/8] Enable max_issue for AArch32 and AArch64
From: Richard Earnshaw @ 2014-11-07 15:11 UTC
  To: Maxim Kuvyrkov, Ramana Radhakrishnan, Marcus Shawcroft
  Cc: GCC Patches, Sebastian Pop

On 21/10/14 04:31, Maxim Kuvyrkov wrote:
> Hi Ramana,
> Hi Marcus,
> 
> This patch enables max_issue multipass lookahead scheduling for the second scheduler pass (or, more pedantically, whenever register-pressure scheduling is not in use).
> 
> Multipass lookahead scheduling is enabled for cores that can issue 2 or more instructions per cycle, which allows the scheduler to better exploit multi-issue pipelines.  This patch also provides the foundation for the [upcoming] auto-prefetcher model in the scheduler, which is handled via max_issue.
> 
> This change requires benchmarking, which I can't easily do at the moment.  I would appreciate any benchmarking results that you can share.
> 
> Bootstrap on aarch64-linux-gnu is in progress.  OK to apply, provided no performance or correctness regressions?
> 
> Thank you,
> 

OK.

R.

> --
> Maxim Kuvyrkov
> www.linaro.org
> 
> 
> 0005-Enable-max_issue-for-AArch32-and-AArch64.patch
> 
> 
> From bf51463edee1d161ff8e03cf0af0c3ff8b258305 Mon Sep 17 00:00:00 2001
> From: Maxim Kuvyrkov <maxim.kuvyrkov@linaro.org>
> Date: Sat, 29 Mar 2014 07:12:52 +1300
> Subject: [PATCH 5/8] Enable max_issue for AArch32 and AArch64
> 
> 	* config/aarch64/aarch64.c
> 	(aarch64_sched_first_cycle_multipass_dfa_lookahead): Implement hook.
> 	(TARGET_SCHED_FIRST_CYCLE_MULTIPASS_DFA_LOOKAHEAD): Define.
> 	* config/arm/arm.c
> 	(arm_first_cycle_multipass_dfa_lookahead): Implement hook.
> 	(TARGET_SCHED_FIRST_CYCLE_MULTIPASS_DFA_LOOKAHEAD): Define.
> 
> ---
>  gcc/config/aarch64/aarch64.c |   12 ++++++++++++
>  gcc/config/arm/arm.c         |   15 +++++++++++++++
>  2 files changed, 27 insertions(+)
> 
> diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
> index 2ad5c28..1512418 100644
> --- a/gcc/config/aarch64/aarch64.c
> +++ b/gcc/config/aarch64/aarch64.c
> @@ -6077,6 +6077,14 @@ aarch64_sched_issue_rate (void)
>    return aarch64_tune_params->issue_rate;
>  }
>  
> +static int
> +aarch64_sched_first_cycle_multipass_dfa_lookahead (void)
> +{
> +  int issue_rate = aarch64_sched_issue_rate ();
> +
> +  return issue_rate > 1 ? issue_rate : 0;
> +}
> +
>  /* Vectorizer cost model target hooks.  */
>  
>  /* Implement targetm.vectorize.builtin_vectorization_cost.  */
> @@ -10136,6 +10144,10 @@ aarch64_asan_shadow_offset (void)
>  #undef TARGET_SCHED_ISSUE_RATE
>  #define TARGET_SCHED_ISSUE_RATE aarch64_sched_issue_rate
>  
> +#undef TARGET_SCHED_FIRST_CYCLE_MULTIPASS_DFA_LOOKAHEAD
> +#define TARGET_SCHED_FIRST_CYCLE_MULTIPASS_DFA_LOOKAHEAD \
> +  aarch64_sched_first_cycle_multipass_dfa_lookahead
> +
>  #undef TARGET_TRAMPOLINE_INIT
>  #define TARGET_TRAMPOLINE_INIT aarch64_trampoline_init
>  
> diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c
> index 1ee0eb3..0f15c99 100644
> --- a/gcc/config/arm/arm.c
> +++ b/gcc/config/arm/arm.c
> @@ -246,6 +246,7 @@ static void arm_option_override (void);
>  static unsigned HOST_WIDE_INT arm_shift_truncation_mask (enum machine_mode);
>  static bool arm_cannot_copy_insn_p (rtx_insn *);
>  static int arm_issue_rate (void);
> +static int arm_first_cycle_multipass_dfa_lookahead (void);
>  static void arm_output_dwarf_dtprel (FILE *, int, rtx) ATTRIBUTE_UNUSED;
>  static bool arm_output_addr_const_extra (FILE *, rtx);
>  static bool arm_allocate_stack_slots_for_args (void);
> @@ -591,6 +592,10 @@ static const struct attribute_spec arm_attribute_table[] =
>  #undef TARGET_SCHED_ISSUE_RATE
>  #define TARGET_SCHED_ISSUE_RATE arm_issue_rate
>  
> +#undef TARGET_SCHED_FIRST_CYCLE_MULTIPASS_DFA_LOOKAHEAD
> +#define TARGET_SCHED_FIRST_CYCLE_MULTIPASS_DFA_LOOKAHEAD \
> +  arm_first_cycle_multipass_dfa_lookahead
> +
>  #undef TARGET_MANGLE_TYPE
>  #define TARGET_MANGLE_TYPE arm_mangle_type
>  
> @@ -29888,6 +29893,16 @@ arm_issue_rate (void)
>      }
>  }
>  
> +/* Return how many instructions the scheduler should look ahead to choose the
> +   best one.  */
> +static int
> +arm_first_cycle_multipass_dfa_lookahead (void)
> +{
> +  int issue_rate = arm_issue_rate ();
> +
> +  return issue_rate > 1 ? issue_rate : 0;
> +}
> +
>  /* A table and a function to perform ARM-specific name mangling for
>     NEON vector types in order to conform to the AAPCS (see "Procedure
>     Call Standard for the ARM Architecture", Appendix A).  To qualify
> 

