* [PATCH 5/8] Enable max_issue for AArch32 and AArch64
@ 2014-10-21 3:35 Maxim Kuvyrkov
2014-10-22 18:35 ` Sebastian Pop
2014-11-07 15:11 ` Richard Earnshaw
0 siblings, 2 replies; 5+ messages in thread
From: Maxim Kuvyrkov @ 2014-10-21 3:35 UTC (permalink / raw)
To: ramrad01, Marcus Shawcroft; +Cc: GCC Patches, Sebastian Pop
Hi Ramana,
Hi Marcus,
This patch enables max_issue multipass lookahead scheduling for the 2nd scheduler pass (or, more pedantically, whenever register-pressure scheduling is not in use).
Multipass lookahead scheduling is enabled for cores that can issue 2 or more instructions per cycle, which allows the scheduler to better exploit multi-issue pipelines. This patch also provides a foundation for the [upcoming] auto-prefetcher model in the scheduler, which is handled via max_issue.
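To illustrate the rule described above, here is a minimal standalone sketch (not part of the patch). The struct core_tuning and the functions lookahead_for and print_lookahead are hypothetical names used only for this illustration; in the patch itself the issue rate comes from aarch64_sched_issue_rate () and arm_issue_rate (). The sketch only shows the decision: single-issue cores get a lookahead of 0 (multipass lookahead off), and multi-issue cores get a lookahead equal to their issue rate.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical stand-in for a per-core tuning record.  The real backends
   keep the issue rate in their own tuning structures.  */
struct core_tuning
{
  const char *name;
  int issue_rate;	/* Instructions the core can issue per cycle.  */
};

/* Mirror the rule used by the hooks in the patch: return 0 to leave
   multipass lookahead disabled on single-issue cores, otherwise look
   ahead as many instructions as the core can issue per cycle.  */
int
lookahead_for (const struct core_tuning *core)
{
  assert (core != NULL);
  return core->issue_rate > 1 ? core->issue_rate : 0;
}

/* Print the lookahead chosen for CORE.  */
void
print_lookahead (const struct core_tuning *core)
{
  printf ("%s: issue rate %d, lookahead %d\n",
	  core->name, core->issue_rate, lookahead_for (core));
}
```

Calling print_lookahead on a record with issue_rate 1 and on one with issue_rate 2 shows the two cases side by side. Tying the lookahead depth directly to the issue rate is the choice made in this patch; other depths are possible and would need their own benchmarking.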
This change requires benchmarking, which I can't easily do at the moment. I would appreciate any benchmarking results that you can share.
Bootstrap on aarch64-linux-gnu is in progress. OK to apply, provided no performance or correctness regressions?
Thank you,
--
Maxim Kuvyrkov
www.linaro.org
[-- Attachment #2: 0005-Enable-max_issue-for-AArch32-and-AArch64.patch --]
[-- Type: application/octet-stream, Size: 3267 bytes --]
From bf51463edee1d161ff8e03cf0af0c3ff8b258305 Mon Sep 17 00:00:00 2001
From: Maxim Kuvyrkov <maxim.kuvyrkov@linaro.org>
Date: Sat, 29 Mar 2014 07:12:52 +1300
Subject: [PATCH 5/8] Enable max_issue for AArch32 and AArch64

	* config/aarch64/aarch64.c
	(aarch64_sched_first_cycle_multipass_dfa_lookahead): Implement hook.
	(TARGET_SCHED_FIRST_CYCLE_MULTIPASS_DFA_LOOKAHEAD): Define.
	* config/arm/arm.c
	(arm_first_cycle_multipass_dfa_lookahead): Implement hook.
	(TARGET_SCHED_FIRST_CYCLE_MULTIPASS_DFA_LOOKAHEAD): Define.
---
gcc/config/aarch64/aarch64.c | 12 ++++++++++++
gcc/config/arm/arm.c | 15 +++++++++++++++
2 files changed, 27 insertions(+)
diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
index 2ad5c28..1512418 100644
--- a/gcc/config/aarch64/aarch64.c
+++ b/gcc/config/aarch64/aarch64.c
@@ -6077,6 +6077,14 @@ aarch64_sched_issue_rate (void)
return aarch64_tune_params->issue_rate;
}
+static int
+aarch64_sched_first_cycle_multipass_dfa_lookahead (void)
+{
+ int issue_rate = aarch64_sched_issue_rate ();
+
+ return issue_rate > 1 ? issue_rate : 0;
+}
+
/* Vectorizer cost model target hooks. */
/* Implement targetm.vectorize.builtin_vectorization_cost. */
@@ -10136,6 +10144,10 @@ aarch64_asan_shadow_offset (void)
#undef TARGET_SCHED_ISSUE_RATE
#define TARGET_SCHED_ISSUE_RATE aarch64_sched_issue_rate
+#undef TARGET_SCHED_FIRST_CYCLE_MULTIPASS_DFA_LOOKAHEAD
+#define TARGET_SCHED_FIRST_CYCLE_MULTIPASS_DFA_LOOKAHEAD \
+ aarch64_sched_first_cycle_multipass_dfa_lookahead
+
#undef TARGET_TRAMPOLINE_INIT
#define TARGET_TRAMPOLINE_INIT aarch64_trampoline_init
diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c
index 1ee0eb3..0f15c99 100644
--- a/gcc/config/arm/arm.c
+++ b/gcc/config/arm/arm.c
@@ -246,6 +246,7 @@ static void arm_option_override (void);
static unsigned HOST_WIDE_INT arm_shift_truncation_mask (enum machine_mode);
static bool arm_cannot_copy_insn_p (rtx_insn *);
static int arm_issue_rate (void);
+static int arm_first_cycle_multipass_dfa_lookahead (void);
static void arm_output_dwarf_dtprel (FILE *, int, rtx) ATTRIBUTE_UNUSED;
static bool arm_output_addr_const_extra (FILE *, rtx);
static bool arm_allocate_stack_slots_for_args (void);
@@ -591,6 +592,10 @@ static const struct attribute_spec arm_attribute_table[] =
#undef TARGET_SCHED_ISSUE_RATE
#define TARGET_SCHED_ISSUE_RATE arm_issue_rate
+#undef TARGET_SCHED_FIRST_CYCLE_MULTIPASS_DFA_LOOKAHEAD
+#define TARGET_SCHED_FIRST_CYCLE_MULTIPASS_DFA_LOOKAHEAD \
+ arm_first_cycle_multipass_dfa_lookahead
+
#undef TARGET_MANGLE_TYPE
#define TARGET_MANGLE_TYPE arm_mangle_type
@@ -29888,6 +29893,16 @@ arm_issue_rate (void)
}
}
+/* Return how many instructions the scheduler should look ahead to choose the
+ best one. */
+static int
+arm_first_cycle_multipass_dfa_lookahead (void)
+{
+ int issue_rate = arm_issue_rate ();
+
+ return issue_rate > 1 ? issue_rate : 0;
+}
+
/* A table and a function to perform ARM-specific name mangling for
NEON vector types in order to conform to the AAPCS (see "Procedure
Call Standard for the ARM Architecture", Appendix A). To qualify
--
1.7.9.5
* Re: [PATCH 5/8] Enable max_issue for AArch32 and AArch64
2014-10-21 3:35 [PATCH 5/8] Enable max_issue for AArch32 and AArch64 Maxim Kuvyrkov
@ 2014-10-22 18:35 ` Sebastian Pop
2014-10-22 22:53 ` Sebastian Pop
2014-11-07 15:11 ` Richard Earnshaw
1 sibling, 1 reply; 5+ messages in thread
From: Sebastian Pop @ 2014-10-22 18:35 UTC (permalink / raw)
To: Maxim Kuvyrkov; +Cc: ramrad01, Marcus Shawcroft, GCC Patches
Maxim Kuvyrkov wrote:
> This change requires benchmarking, which I can't easily do at the moment. I
> would appreciate any benchmarking results that you can share.
I will run my testsuite on aarch64.
Do you need the perf for all the previous patches 1 to 5 together, or just for patch 5?
Thanks,
Sebastian
* Re: [PATCH 5/8] Enable max_issue for AArch32 and AArch64
2014-10-22 18:35 ` Sebastian Pop
@ 2014-10-22 22:53 ` Sebastian Pop
2014-10-22 23:01 ` Maxim Kuvyrkov
0 siblings, 1 reply; 5+ messages in thread
From: Sebastian Pop @ 2014-10-22 22:53 UTC (permalink / raw)
To: Maxim Kuvyrkov; +Cc: ramrad01, Marcus Shawcroft, GCC Patches
Sebastian Pop wrote:
> Maxim Kuvyrkov wrote:
> > This change requires benchmarking, which I can't easily do at the moment. I
> > would appreciate any benchmarking results that you can share.
>
> I will run my testsuite on aarch64.
> Do you need the perf for all the previous patches 1 to 5 together, or just for patch 5?
I ran your patch 5 on top of r216539 at -O3. There were few performance
differences, all of small magnitude: some small speedups and some small
slowdowns. Let me know if you want a reduced testcase for the most
significant slowdown.
Sebastian
* Re: [PATCH 5/8] Enable max_issue for AArch32 and AArch64
2014-10-22 22:53 ` Sebastian Pop
@ 2014-10-22 23:01 ` Maxim Kuvyrkov
0 siblings, 0 replies; 5+ messages in thread
From: Maxim Kuvyrkov @ 2014-10-22 23:01 UTC (permalink / raw)
To: Sebastian Pop; +Cc: ramrad01, Marcus Shawcroft, GCC Patches
On Oct 23, 2014, at 11:42 AM, Sebastian Pop <sebpop@gmail.com> wrote:
> Sebastian Pop wrote:
>> Maxim Kuvyrkov wrote:
>>> This change requires benchmarking, which I can't easily do at the moment. I
>>> would appreciate any benchmarking results that you can share.
>>
>> I will run my testsuite on aarch64.
>> Do you need the perf for all the previous patches 1 to 5 together, or just for patch 5?
>
> I ran your patch 5 on top of r216539 at -O3. There were few performance
> differences, all of small magnitude: some small speedups and some small
> slowdowns. Let me know if you want a reduced testcase for the most
> significant slowdown.
Hi Sebastian,
This is pretty much what I expected. This patch enables infrastructure for patch 7 and, by itself, was not supposed to give measurable speedups. Patch 7 should improve performance for Cortex-A15 (and any other cores that have similar auto-prefetcher hardware).
Thank you for the benchmarking!
--
Maxim Kuvyrkov
www.linaro.org
* Re: [PATCH 5/8] Enable max_issue for AArch32 and AArch64
2014-10-21 3:35 [PATCH 5/8] Enable max_issue for AArch32 and AArch64 Maxim Kuvyrkov
2014-10-22 18:35 ` Sebastian Pop
@ 2014-11-07 15:11 ` Richard Earnshaw
1 sibling, 0 replies; 5+ messages in thread
From: Richard Earnshaw @ 2014-11-07 15:11 UTC (permalink / raw)
To: Maxim Kuvyrkov, Ramana Radhakrishnan, Marcus Shawcroft
Cc: GCC Patches, Sebastian Pop
On 21/10/14 04:31, Maxim Kuvyrkov wrote:
> Hi Ramana,
> Hi Marcus,
>
> This patch enables max_issue multipass lookahead scheduling for the 2nd scheduler pass (or, more pedantically, whenever register-pressure scheduling is not in use).
>
> Multipass lookahead scheduling is enabled for cores that can issue 2 or more instructions per cycle, which allows the scheduler to better exploit multi-issue pipelines. This patch also provides a foundation for the [upcoming] auto-prefetcher model in the scheduler, which is handled via max_issue.
>
> This change requires benchmarking, which I can't easily do at the moment. I would appreciate any benchmarking results that you can share.
>
> Bootstrap on aarch64-linux-gnu is in progress. OK to apply, provided no performance or correctness regressions?
>
> Thank you,
>
OK.
R.