* [RFC][PATCH][AArch64] Improve generic branch cost
@ 2017-03-09 14:42 Wilco Dijkstra
2017-03-09 22:06 ` Andrew Pinski
` (2 more replies)
0 siblings, 3 replies; 11+ messages in thread
From: Wilco Dijkstra @ 2017-03-09 14:42 UTC (permalink / raw)
To: GCC Patches, Evandro Menezes, Andrew.pinski, jim.wilson; +Cc: nd
Hi,
Recently we've put a lot of effort into improving ifcvt to use CSEL on AArch64.
In https://gcc.gnu.org/ml/gcc-patches/2015-11/msg01639.html James determined
the best branch cost values for AArch64 code generation. Although these settings
are used when explicitly targeting Cortex cores, they are not used otherwise.
This means that by default GCC will not use (F)CSEL in many common cases. Most
code is built without -mcpu= and thus doesn't use CSEL, as in this example from
glibc:
strtok:
stp x29, x30, [sp, -48]!
add x29, sp, 0
stp x21, x22, [sp, 32]
mov x21, x1
stp x19, x20, [sp, 16]
adrp x22, .LANCHOR0
mov x19, x0
cbz x0, .L12
.L2: ldrb w0, [x19]
.L12:
ldr x19, [x22, #:lo12:.LANCHOR0]
b .L2
With -mcpu=cortex-a57 GCC generates:
stp x29, x30, [sp, -48]!
cmp x0, 0
add x29, sp, 0
stp x21, x22, [sp, 32]
adrp x21, .LANCHOR0
stp x19, x20, [sp, 16]
mov x19, x0
ldr x0, [x21, #:lo12:.LANCHOR0]
csel x19, x0, x19, eq
ldrb w0, [x19]
This is generally faster and smaller. On one benchmark the new setting fixes a
regression since GCC6 and improves performance by 49%. So I propose to change
generic_branch_cost to be the same as cortexa57_branch_cost so that all supported
cores benefit equally from CSEL. Are there any objections to this?
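For readers without the glibc source at hand, the source-level pattern behind the two listings is roughly the following. This is a minimal sketch under assumptions, not the actual glibc code: `saved_ptr` and `first_byte` are hypothetical names standing in for the static state addressed through `.LANCHOR0` and for the first few instructions of strtok.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the static save pointer that the
   listings address through .LANCHOR0.  */
static const char *saved_ptr = "";

/* If S is NULL, continue from the saved position, as strtok does.
   Compiled as a branch this gives the cbz / ldr / b sequence; after
   if-conversion it gives cmp + ldr + csel, with the load of the
   saved pointer executed unconditionally.  */
unsigned char
first_byte (const char *s)
{
  assert (saved_ptr != NULL);
  const char *p = (s == NULL) ? saved_ptr : s;
  return (unsigned char) *p;
}
```

Whether the compiler emits the select or the branch for this function depends on the branch cost in the active tuning, which is what the patch below changes for the generic case.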
Wilco
ChangeLog:
2017-03-09 Wilco Dijkstra <wdijkstr@arm.com>
* config/aarch64/aarch64.c (generic_branch_cost): Copy cortexa57_branch_cost.
--
diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
index 5870b5e5d7e8e48cf925b3a62030346f041a7fd6..ea16074af86087a6200d9895583e05acf43d90e2 100644
--- a/gcc/config/aarch64/aarch64.c
+++ b/gcc/config/aarch64/aarch64.c
@@ -377,8 +377,8 @@ static const struct cpu_vector_cost xgene1_vector_cost =
/* Generic costs for branch instructions. */
static const struct cpu_branch_cost generic_branch_cost =
{
- 2, /* Predictable. */
- 2 /* Unpredictable. */
+ 1, /* Predictable. */
+ 3 /* Unpredictable. */
};
/* Branch costs for Cortex-A57. */
* Re: [RFC][PATCH][AArch64] Improve generic branch cost
2017-03-09 14:42 [RFC][PATCH][AArch64] Improve generic branch cost Wilco Dijkstra
@ 2017-03-09 22:06 ` Andrew Pinski
2017-03-14 9:37 ` James Greenhalgh
2017-03-16 17:22 ` [PATCH][AArch64] Enable AES fusion with -mcpu=generic Wilco Dijkstra
2017-03-17 10:15 ` [RFC][PATCH][AArch64] Improve generic branch cost Richard Earnshaw (lists)
2 siblings, 1 reply; 11+ messages in thread
From: Andrew Pinski @ 2017-03-09 22:06 UTC (permalink / raw)
To: Wilco Dijkstra
Cc: GCC Patches, Evandro Menezes, Andrew.pinski, jim.wilson, nd
On Thu, Mar 9, 2017 at 6:42 AM, Wilco Dijkstra <Wilco.Dijkstra@arm.com> wrote:
> Hi,
>
> Recently we've put a lot of effort into improving ifcvt to use CSEL on AArch64.
> In https://gcc.gnu.org/ml/gcc-patches/2015-11/msg01639.html James determined
> the best value for AArch64 code generation. Although this setting is used when
> explicitly targeting Cortex cores, it is not otherwise used. This means by
> default GCC will not use (F)CSEL in many common cases. Most code is built
> without -mcpu= and thus doesn't use CSEL like this example from GLIBC:
>
> strtok:
> stp x29, x30, [sp, -48]!
> add x29, sp, 0
> stp x21, x22, [sp, 32]
> mov x21, x1
> stp x19, x20, [sp, 16]
> adrp x22, .LANCHOR0
> mov x19, x0
> cbz x0, .L12
> .L2: ldrb w0, [x19]
>
> .L12:
> ldr x19, [x22, #:lo12:.LANCHOR0]
> b .L2
>
> With -mcpu=cortex-a57 GCC generates:
>
> stp x29, x30, [sp, -48]!
> cmp x0, 0
> add x29, sp, 0
> stp x21, x22, [sp, 32]
> adrp x21, .LANCHOR0
> stp x19, x20, [sp, 16]
> mov x19, x0
> ldr x0, [x21, #:lo12:.LANCHOR0]
> csel x19, x0, x19, eq
> ldrb w0, [x19]
>
> This is generally faster and smaller. On one benchmark the new setting fixes a
> regression since GCC6 and improves performance by 49%. So I propose to change
> generic_branch_cost to be the same as cortexa57_branch_cost so that all supported
> cores benefit equally from CSEL. Are there any objections to this?
I have no objections. In fact, thunderx2t99's branch cost is already 1,3. I
have not yet looked into improving the ThunderX branch cost, but that may
be because I have local patches that improve phiopt to do if-conversion
earlier. My phiopt change has no cost model either, so using CSEL more is
good for both ThunderX 1 and ThunderX 2.
Thanks,
Andrew
>
> Wilco
>
>
> ChangeLog:
> 2017-03-09 Wilco Dijkstra <wdijkstr@arm.com>
>
> * config/aarch64/aarch64.c (generic_branch_cost): Copy cortexa57_branch_cost.
> --
>
> diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
> index 5870b5e5d7e8e48cf925b3a62030346f041a7fd6..ea16074af86087a6200d9895583e05acf43d90e2 100644
> --- a/gcc/config/aarch64/aarch64.c
> +++ b/gcc/config/aarch64/aarch64.c
> @@ -377,8 +377,8 @@ static const struct cpu_vector_cost xgene1_vector_cost =
> /* Generic costs for branch instructions. */
> static const struct cpu_branch_cost generic_branch_cost =
> {
> - 2, /* Predictable. */
> - 2 /* Unpredictable. */
> + 1, /* Predictable. */
> + 3 /* Unpredictable. */
> };
>
> /* Branch costs for Cortex-A57. */
* Re: [RFC][PATCH][AArch64] Improve generic branch cost
2017-03-09 22:06 ` Andrew Pinski
@ 2017-03-14 9:37 ` James Greenhalgh
2017-03-17 3:19 ` Jim Wilson
0 siblings, 1 reply; 11+ messages in thread
From: James Greenhalgh @ 2017-03-14 9:37 UTC (permalink / raw)
To: Andrew Pinski
Cc: Wilco Dijkstra, GCC Patches, Evandro Menezes, Andrew.pinski,
jim.wilson, nd, philipp.tomsich, benedikt.huber
On Thu, Mar 09, 2017 at 02:06:16PM -0800, Andrew Pinski wrote:
> On Thu, Mar 9, 2017 at 6:42 AM, Wilco Dijkstra <Wilco.Dijkstra@arm.com> wrote:
> > Hi,
> >
> > Recently we've put a lot of effort into improving ifcvt to use CSEL on
> > AArch64. In https://gcc.gnu.org/ml/gcc-patches/2015-11/msg01639.html
> > James determined the best value for AArch64 code generation.
This was before the rewrite to the ifcvt costs which I made earlier in the
GCC 7 release cycle. But I think 1,3 is about right, and I'd be happy
to see us take that direction for "generic".
I'd like to hear comments from the Exynos-M1, Falkor and
xgene-1 subtarget contributors, particularly as these targets use
generic_branch_cost for their subtarget-specific tuning. It may be that
your patch needs to preserve the 2,2 setting for such cores even if the
generic target does move to 1,3.
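One way to preserve 2,2 for those cores would be to point their tuning at a separate table instead of sharing the generic one. The following is only a rough sketch under assumptions: `legacy_branch_cost`, `tune_sketch`, `falkor_tune_sketch` and `branch_cost_for` are hypothetical names, the tune structure is reduced to a single field, and the selection rule (predictable cost when optimizing for size or when the branch is predictable) is assumed rather than copied from the backend.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct cpu_branch_cost
{
  int predictable;
  int unpredictable;
};

/* Proposed generic values, the same as cortexa57_branch_cost.  */
static const struct cpu_branch_cost generic_branch_cost = { 1, 3 };

/* Hypothetical table keeping the old 2,2 setting for cores that
   have not been re-tuned.  */
static const struct cpu_branch_cost legacy_branch_cost = { 2, 2 };

/* Simplified stand-in for struct tune_params.  */
struct tune_sketch
{
  const struct cpu_branch_cost *branch_costs;
};

static const struct tune_sketch generic_tune_sketch = { &generic_branch_cost };
static const struct tune_sketch falkor_tune_sketch = { &legacy_branch_cost };

/* Assumed selection rule: use the predictable cost unless we are
   optimizing for speed and the branch is not known to be predictable.  */
int
branch_cost_for (const struct tune_sketch *tune, bool speed_p,
                 bool predictable_p)
{
  assert (tune != NULL && tune->branch_costs != NULL);
  if (!speed_p || predictable_p)
    return tune->branch_costs->predictable;
  return tune->branch_costs->unpredictable;
}
```

With this split, changing generic_branch_cost leaves any tuning that points at the legacy table unaffected.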
At this stage in the release, this patch will have to wait for GCC 8
regardless of any comments received. When we do think about this for
GCC 8, I'd suggest we take a wider look at the "generic" tunings. Any
opinions from other subtarget contributors or the other AArch64
maintainers on further changes they would advocate would be welcome.
> I have no objections. In fact, thunderx2t99's branch cost is already 1,3. I
> have not yet looked into improving the ThunderX branch cost, but that may
> be because I have local patches that improve phiopt to do if-conversion
> earlier. My phiopt change has no cost model either, so using CSEL more is
> good for both ThunderX 1 and ThunderX 2.
Thanks for the comments,
James
* [PATCH][AArch64] Enable AES fusion with -mcpu=generic
2017-03-09 14:42 [RFC][PATCH][AArch64] Improve generic branch cost Wilco Dijkstra
2017-03-09 22:06 ` Andrew Pinski
@ 2017-03-16 17:22 ` Wilco Dijkstra
2017-03-16 18:01 ` Andrew Pinski
2017-04-20 15:59 ` Wilco Dijkstra
2017-03-17 10:15 ` [RFC][PATCH][AArch64] Improve generic branch cost Richard Earnshaw (lists)
2 siblings, 2 replies; 11+ messages in thread
From: Wilco Dijkstra @ 2017-03-16 17:22 UTC (permalink / raw)
To: GCC Patches, Evandro Menezes, Andrew.pinski, jim.wilson; +Cc: nd
Many supported cores implement fusion of AES instructions. When fusion
happens it can give a significant performance gain. If not, scheduling
fusion candidates next to each other has almost no effect on performance.
Due to the high benefit/low cost it makes sense to enable AES fusion with
-mcpu=generic so that cores that support it always benefit. Any objections?
Bootstrapped on AArch64, no regressions.
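As an illustration of what the one-line change does, the fusible_ops field can be thought of as a bitmask that the scheduler consults when deciding whether to keep two instructions adjacent. The sketch below is a simplified model, not the backend code: the enum values, `enable_fusion` and `fusion_enabled_p` are hypothetical, and the real AARCH64_FUSE_* constants may be defined differently.

```c
#include <assert.h>
#include <stdbool.h>

/* Illustrative bit values only; the real AARCH64_FUSE_* constants
   are defined by the backend and may differ.  */
enum fuse_sketch
{
  FUSE_NOTHING = 0,
  FUSE_MOV_MOVK = 1 << 0,
  FUSE_AES_AESMC = 1 << 1
};

/* Return FUSIBLE_OPS with OP added, as the patch does for the
   generic tuning.  */
unsigned int
enable_fusion (unsigned int fusible_ops, unsigned int op)
{
  assert (op != FUSE_NOTHING);
  return fusible_ops | op;
}

/* True if the tuning asks the scheduler to keep the OP pair
   (for example AESE followed by AESMC) back to back.  */
bool
fusion_enabled_p (unsigned int fusible_ops, unsigned int op)
{
  return (fusible_ops & op) != 0;
}
```

On a core that does not fuse the pair, keeping the two instructions adjacent is expected to be close to free, which is the argument for turning the bit on in the generic tuning.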
ChangeLog:
2017-03-16 Wilco Dijkstra <wdijkstr@arm.com>
* gcc/config/aarch64/aarch64.c (generic_tunings): Add AES fusion.
--
diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
index 728ce7029f1e2b5161d9f317d10e564dd5a5f472..c8cf7169a5d387de336920b50c83761dc0c96f3a 100644
--- a/gcc/config/aarch64/aarch64.c
+++ b/gcc/config/aarch64/aarch64.c
@@ -536,7 +536,7 @@ static const struct tune_params generic_tunings =
&generic_approx_modes,
4, /* memmov_cost */
2, /* issue_rate */
- AARCH64_FUSE_NOTHING, /* fusible_ops */
+ (AARCH64_FUSE_AES_AESMC), /* fusible_ops */
8, /* function_align. */
8, /* jump_align. */
4, /* loop_align. */
* Re: [PATCH][AArch64] Enable AES fusion with -mcpu=generic
2017-03-16 17:22 ` [PATCH][AArch64] Enable AES fusion with -mcpu=generic Wilco Dijkstra
@ 2017-03-16 18:01 ` Andrew Pinski
2017-03-17 3:26 ` Jim Wilson
2017-04-20 15:59 ` Wilco Dijkstra
1 sibling, 1 reply; 11+ messages in thread
From: Andrew Pinski @ 2017-03-16 18:01 UTC (permalink / raw)
To: Wilco Dijkstra
Cc: GCC Patches, Evandro Menezes, Andrew.pinski, jim.wilson, nd
On Thu, Mar 16, 2017 at 10:22 AM, Wilco Dijkstra <Wilco.Dijkstra@arm.com> wrote:
> Many supported cores implement fusion of AES instructions. When fusion
> happens it can give a significant performance gain. If not, scheduling
> fusion candidates next to each other has almost no effect on performance.
> Due to the high benefit/low cost it makes sense to enable AES fusion with
> -mcpu=generic so that cores that support it always benefit. Any objections?
I am OK with this, since our new cores support this fusion and there was
no performance loss on ThunderX1.
Thanks,
Andrew
>
> Bootstrapped on AArch64, no regressions.
>
> ChangeLog:
> 2017-03-16 Wilco Dijkstra <wdijkstr@arm.com>
>
> * gcc/config/aarch64/aarch64.c (generic_tunings): Add AES fusion.
>
> --
> diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
> index 728ce7029f1e2b5161d9f317d10e564dd5a5f472..c8cf7169a5d387de336920b50c83761dc0c96f3a 100644
> --- a/gcc/config/aarch64/aarch64.c
> +++ b/gcc/config/aarch64/aarch64.c
> @@ -536,7 +536,7 @@ static const struct tune_params generic_tunings =
> &generic_approx_modes,
> 4, /* memmov_cost */
> 2, /* issue_rate */
> - AARCH64_FUSE_NOTHING, /* fusible_ops */
> + (AARCH64_FUSE_AES_AESMC), /* fusible_ops */
> 8, /* function_align. */
> 8, /* jump_align. */
> 4, /* loop_align. */
* Re: [RFC][PATCH][AArch64] Improve generic branch cost
2017-03-14 9:37 ` James Greenhalgh
@ 2017-03-17 3:19 ` Jim Wilson
0 siblings, 0 replies; 11+ messages in thread
From: Jim Wilson @ 2017-03-17 3:19 UTC (permalink / raw)
To: James Greenhalgh
Cc: Andrew Pinski, Wilco Dijkstra, GCC Patches, Evandro Menezes,
Andrew.pinski, nd, Philipp Tomsich, benedikt.huber
On Tue, Mar 14, 2017 at 2:37 AM, James Greenhalgh
<james.greenhalgh@arm.com> wrote:
> I'd like to hear comments from the Exynos-M1, Falkor and
> xgene-1 subtarget contributors, particularly as these targets use
> generic_branch_cost for their subtarget-specific tuning. It may be that
> your patch needs to preserve the 2,2 setting for such cores even if the
> generic target does move to 1,3.
I was at Linaro Connect last week of course. I took a look at this
issue this week. I don't see any measurable performance change on
SPEC CPU2006 for falkor, so the change looks OK to me.
In general, I'm not too concerned about changes like this, as I'm
watching the FSF GCC tree, and will make appropriate changes to the
falkor tuning structure as necessary to maintain good performance.
Jim
* Re: [PATCH][AArch64] Enable AES fusion with -mcpu=generic
2017-03-16 18:01 ` Andrew Pinski
@ 2017-03-17 3:26 ` Jim Wilson
2017-03-17 10:56 ` James Greenhalgh
0 siblings, 1 reply; 11+ messages in thread
From: Jim Wilson @ 2017-03-17 3:26 UTC (permalink / raw)
To: Andrew Pinski
Cc: Wilco Dijkstra, GCC Patches, Evandro Menezes, Andrew.pinski, nd
On Thu, Mar 16, 2017 at 11:01 AM, Andrew Pinski <apinski@cavium.com> wrote:
> On Thu, Mar 16, 2017 at 10:22 AM, Wilco Dijkstra <Wilco.Dijkstra@arm.com> wrote:
>> Many supported cores implement fusion of AES instructions. When fusion
>> happens it can give a significant performance gain. If not, scheduling
>> fusion candidates next to each other has almost no effect on performance.
>> Due to the high benefit/low cost it makes sense to enable AES fusion with
>> -mcpu=generic so that cores that support it always benefit. Any objections?
No objection. I'm not currently tracking performance of -mcpu=generic
on falkor, so I'm not very concerned about changes to the generic
tuning structure.
Jim
* Re: [RFC][PATCH][AArch64] Improve generic branch cost
2017-03-09 14:42 [RFC][PATCH][AArch64] Improve generic branch cost Wilco Dijkstra
2017-03-09 22:06 ` Andrew Pinski
2017-03-16 17:22 ` [PATCH][AArch64] Enable AES fusion with -mcpu=generic Wilco Dijkstra
@ 2017-03-17 10:15 ` Richard Earnshaw (lists)
2 siblings, 0 replies; 11+ messages in thread
From: Richard Earnshaw (lists) @ 2017-03-17 10:15 UTC (permalink / raw)
To: Wilco Dijkstra, GCC Patches, Evandro Menezes, Andrew.pinski, jim.wilson
Cc: nd
On 09/03/17 14:42, Wilco Dijkstra wrote:
> Hi,
>
> Recently we've put a lot of effort into improving ifcvt to use CSEL on AArch64.
> In https://gcc.gnu.org/ml/gcc-patches/2015-11/msg01639.html James determined
> the best value for AArch64 code generation. Although this setting is used when
> explicitly targeting Cortex cores, it is not otherwise used. This means by
> default GCC will not use (F)CSEL in many common cases. Most code is built
> without -mcpu= and thus doesn't use CSEL like this example from GLIBC:
>
> strtok:
> stp x29, x30, [sp, -48]!
> add x29, sp, 0
> stp x21, x22, [sp, 32]
> mov x21, x1
> stp x19, x20, [sp, 16]
> adrp x22, .LANCHOR0
> mov x19, x0
> cbz x0, .L12
> .L2: ldrb w0, [x19]
>
> .L12:
> ldr x19, [x22, #:lo12:.LANCHOR0]
> b .L2
>
> With -mcpu=cortex-a57 GCC generates:
>
> stp x29, x30, [sp, -48]!
> cmp x0, 0
> add x29, sp, 0
> stp x21, x22, [sp, 32]
> adrp x21, .LANCHOR0
> stp x19, x20, [sp, 16]
> mov x19, x0
> ldr x0, [x21, #:lo12:.LANCHOR0]
> csel x19, x0, x19, eq
> ldrb w0, [x19]
>
> This is generally faster and smaller. On one benchmark the new setting fixes a
> regression since GCC6 and improves performance by 49%. So I propose to change
> generic_branch_cost to be the same as cortexa57_branch_cost so that all supported
> cores benefit equally from CSEL. Are there any objections to this?
>
> Wilco
>
>
> ChangeLog:
> 2017-03-09 Wilco Dijkstra <wdijkstr@arm.com>
>
> * config/aarch64/aarch64.c (generic_branch_cost): Copy cortexa57_branch_cost.
This is OK. We already have a number of cores using these values so I
don't think this is likely to be a risky change even in stage 4.
R.
> --
>
> diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
> index 5870b5e5d7e8e48cf925b3a62030346f041a7fd6..ea16074af86087a6200d9895583e05acf43d90e2 100644
> --- a/gcc/config/aarch64/aarch64.c
> +++ b/gcc/config/aarch64/aarch64.c
> @@ -377,8 +377,8 @@ static const struct cpu_vector_cost xgene1_vector_cost =
> /* Generic costs for branch instructions. */
> static const struct cpu_branch_cost generic_branch_cost =
> {
> - 2, /* Predictable. */
> - 2 /* Unpredictable. */
> + 1, /* Predictable. */
> + 3 /* Unpredictable. */
> };
>
> /* Branch costs for Cortex-A57. */
>
* Re: [PATCH][AArch64] Enable AES fusion with -mcpu=generic
2017-03-17 3:26 ` Jim Wilson
@ 2017-03-17 10:56 ` James Greenhalgh
0 siblings, 0 replies; 11+ messages in thread
From: James Greenhalgh @ 2017-03-17 10:56 UTC (permalink / raw)
To: Jim Wilson
Cc: Andrew Pinski, Wilco Dijkstra, GCC Patches, Evandro Menezes,
Andrew.pinski, nd
On Thu, Mar 16, 2017 at 08:26:42PM -0700, Jim Wilson wrote:
> On Thu, Mar 16, 2017 at 11:01 AM, Andrew Pinski <apinski@cavium.com> wrote:
> > On Thu, Mar 16, 2017 at 10:22 AM, Wilco Dijkstra <Wilco.Dijkstra@arm.com> wrote:
> >> Many supported cores implement fusion of AES instructions. When fusion
> >> happens it can give a significant performance gain. If not, scheduling
> >> fusion candidates next to each other has almost no effect on performance.
> >> Due to the high benefit/low cost it makes sense to enable AES fusion with
> >> -mcpu=generic so that cores that support it always benefit. Any objections?
>
> No objection. I'm not currently tracking performance of -mcpu=generic
> on falkor, so I'm not very concerned about changes to the generic
> tuning structure.
Thanks for the feedback Jim, Andrew.
This patch is OK for trunk. As Richard pointed out on the branch costs
thread, if we had a bug here we'd likely have seen it by now on those
cores which do enable the fusion.
Thanks,
James
* Re: [PATCH][AArch64] Enable AES fusion with -mcpu=generic
2017-03-16 17:22 ` [PATCH][AArch64] Enable AES fusion with -mcpu=generic Wilco Dijkstra
2017-03-16 18:01 ` Andrew Pinski
@ 2017-04-20 15:59 ` Wilco Dijkstra
2017-05-05 13:38 ` Richard Earnshaw (lists)
1 sibling, 1 reply; 11+ messages in thread
From: Wilco Dijkstra @ 2017-04-20 15:59 UTC (permalink / raw)
To: GCC Patches, James Greenhalgh
Cc: nd, Evandro Menezes, Andrew.pinski, jim.wilson
ping
From: Wilco Dijkstra
Sent: 16 March 2017 17:22
To: GCC Patches; Evandro Menezes; Andrew.pinski@cavium.com; jim.wilson@linaro.org
Cc: nd
Subject: [PATCH][AArch64] Enable AES fusion with -mcpu=generic
Many supported cores implement fusion of AES instructions. When fusion
happens it can give a significant performance gain. If not, scheduling
fusion candidates next to each other has almost no effect on performance.
Due to the high benefit/low cost it makes sense to enable AES fusion with
-mcpu=generic so that cores that support it always benefit. Any objections?
Bootstrapped on AArch64, no regressions.
ChangeLog:
2017-03-16 Wilco Dijkstra <wdijkstr@arm.com>
* gcc/config/aarch64/aarch64.c (generic_tunings): Add AES fusion.
--
diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
index 728ce7029f1e2b5161d9f317d10e564dd5a5f472..c8cf7169a5d387de336920b50c83761dc0c96f3a 100644
--- a/gcc/config/aarch64/aarch64.c
+++ b/gcc/config/aarch64/aarch64.c
@@ -536,7 +536,7 @@ static const struct tune_params generic_tunings =
&generic_approx_modes,
4, /* memmov_cost */
2, /* issue_rate */
- AARCH64_FUSE_NOTHING, /* fusible_ops */
+ (AARCH64_FUSE_AES_AESMC), /* fusible_ops */
8, /* function_align. */
8, /* jump_align. */
4, /* loop_align. */
* Re: [PATCH][AArch64] Enable AES fusion with -mcpu=generic
2017-04-20 15:59 ` Wilco Dijkstra
@ 2017-05-05 13:38 ` Richard Earnshaw (lists)
0 siblings, 0 replies; 11+ messages in thread
From: Richard Earnshaw (lists) @ 2017-05-05 13:38 UTC (permalink / raw)
To: Wilco Dijkstra, GCC Patches, James Greenhalgh
Cc: nd, Evandro Menezes, Andrew.pinski, jim.wilson
On 20/04/17 16:53, Wilco Dijkstra wrote:
>
> ping
James already approved this on 17 March; why are you pinging again?
https://gcc.gnu.org/ml/gcc-patches/2017-03/msg00918.html
>
> From: Wilco Dijkstra
> Sent: 16 March 2017 17:22
> To: GCC Patches; Evandro Menezes; Andrew.pinski@cavium.com; jim.wilson@linaro.org
> Cc: nd
> Subject: [PATCH][AArch64] Enable AES fusion with -mcpu=generic
>
> Many supported cores implement fusion of AES instructions. When fusion
> happens it can give a significant performance gain. If not, scheduling
> fusion candidates next to each other has almost no effect on performance.
> Due to the high benefit/low cost it makes sense to enable AES fusion with
> -mcpu=generic so that cores that support it always benefit. Any objections?
>
> Bootstrapped on AArch64, no regressions.
>
> ChangeLog:
> 2017-03-16 Wilco Dijkstra <wdijkstr@arm.com>
>
> * gcc/config/aarch64/aarch64.c (generic_tunings): Add AES fusion.
>
> --
> diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
> index 728ce7029f1e2b5161d9f317d10e564dd5a5f472..c8cf7169a5d387de336920b50c83761dc0c96f3a 100644
> --- a/gcc/config/aarch64/aarch64.c
> +++ b/gcc/config/aarch64/aarch64.c
> @@ -536,7 +536,7 @@ static const struct tune_params generic_tunings =
> &generic_approx_modes,
> 4, /* memmov_cost */
> 2, /* issue_rate */
> - AARCH64_FUSE_NOTHING, /* fusible_ops */
> + (AARCH64_FUSE_AES_AESMC), /* fusible_ops */
> 8, /* function_align. */
> 8, /* jump_align. */
> 4, /* loop_align. */
>
>
Thread overview: 11+ messages
2017-03-09 14:42 [RFC][PATCH][AArch64] Improve generic branch cost Wilco Dijkstra
2017-03-09 22:06 ` Andrew Pinski
2017-03-14 9:37 ` James Greenhalgh
2017-03-17 3:19 ` Jim Wilson
2017-03-16 17:22 ` [PATCH][AArch64] Enable AES fusion with -mcpu=generic Wilco Dijkstra
2017-03-16 18:01 ` Andrew Pinski
2017-03-17 3:26 ` Jim Wilson
2017-03-17 10:56 ` James Greenhalgh
2017-04-20 15:59 ` Wilco Dijkstra
2017-05-05 13:38 ` Richard Earnshaw (lists)
2017-03-17 10:15 ` [RFC][PATCH][AArch64] Improve generic branch cost Richard Earnshaw (lists)