public inbox for gcc-patches@gcc.gnu.org
* [Patch AArch64] Add initial tuning support for Cortex-A55 and Cortex-A75
@ 2017-06-20 15:42 James Greenhalgh
  2017-06-21  9:43 ` Richard Earnshaw (lists)
  0 siblings, 1 reply; 2+ messages in thread
From: James Greenhalgh @ 2017-06-20 15:42 UTC
  To: gcc-patches; +Cc: nd, richard.earnshaw, marcus.shawcroft

Hi,

This patch adds support for the ARM Cortex-A75 and
Cortex-A55 processors through the -mcpu/-mtune values cortex-a55 and
cortex-a75, and an ARM DynamIQ big.LITTLE configuration of these two
processors through the -mcpu/-mtune value cortex-a75.cortex-a55.

The ARM Cortex-A75 is ARM's latest and highest performance applications
processor. For the initial tuning provided in this patch, I have chosen to
share the tuning structure with its predecessor, the Cortex-A73.

The ARM Cortex-A55 delivers the best combination of power efficiency
and performance in its class. For the initial tuning provided in this patch,
I have chosen to share the tuning structure with its predecessor, the
Cortex-A53.

Both Cortex-A55 and Cortex-A75 support ARMv8-A with the ARMv8.1-A and
ARMv8.2-A extensions, along with the cryptography extension, and
the RCPC extensions from ARMv8.3-A. This is reflected in the patch:
-mcpu=cortex-a75 is treated as equivalent to passing -mtune=cortex-a75
-march=armv8.2-a+rcpc.
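
As a rough illustration of how one -mcpu name can stand for a tuning
choice plus an architecture string, here is a minimal, self-contained C
sketch of the X-macro table pattern that a .def file of AARCH64_CORE
entries suggests. It is not code from GCC: CORE_LIST, struct core,
find_core and BIG_LITTLE are hypothetical names, the row layout is
simplified to four fields, and the way BIG_LITTLE packs two part numbers
is an assumption (the real AARCH64_BIG_LITTLE definition is not shown in
this thread). The tuning names and part numbers are the ones in the
patch.

```c
/* Sketch only: simplified stand-in for an X-macro core table.
   All identifiers here are hypothetical, not GCC's.  */
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Assumed packing of a big and a LITTLE part number into one value.  */
#define BIG_LITTLE(big, little) ((((big) & 0xfffu) << 12) | (little))

/* One row per -mcpu name: name, shared tuning struct, implied -march,
   part number.  Values are taken from the patch in this message.  */
#define CORE_LIST(X) \
    X("cortex-a55", "cortexa53", "armv8.2-a+rcpc", 0xd05) \
    X("cortex-a75", "cortexa73", "armv8.2-a+rcpc", 0xd0a) \
    X("cortex-a75.cortex-a55", "cortexa73", "armv8.2-a+rcpc", \
      BIG_LITTLE(0xd0a, 0xd05))

struct core {
    const char *name;   /* value accepted by -mcpu/-mtune */
    const char *tune;   /* tuning structure shared with a predecessor */
    const char *march;  /* architecture string implied by -mcpu */
    unsigned part;      /* part number, or packed pair for big.LITTLE */
};

/* Expand each row of the list into one struct initializer.  */
#define AS_ROW(name, tune, march, part) { name, tune, march, part },
static const struct core cores[] = { CORE_LIST(AS_ROW) };
#undef AS_ROW

/* Return the table entry for NAME, or NULL if the name is unknown.  */
static const struct core *find_core(const char *name)
{
    assert(name != NULL);
    for (size_t i = 0; i < sizeof cores / sizeof cores[0]; i++)
        if (strcmp(cores[i].name, name) == 0)
            return &cores[i];
    return NULL;
}
```

With this table, find_core("cortex-a75") yields an entry whose tune
field is "cortexa73" and whose march field is "armv8.2-a+rcpc", which
mirrors the equivalence described above; an unknown name yields NULL.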

Tested on aarch64-none-elf with no issues.

OK for trunk?

Thanks,
James

---
2017-06-20  James Greenhalgh  <james.greenhalgh@arm.com>

	* config/aarch64/aarch64-cores.def (cortex-a55): New.
	(cortex-a75): Likewise.
	(cortex-a75.cortex-a55): Likewise.
	* config/aarch64/aarch64-tune.md: Regenerate.
	* doc/invoke.texi (-mtune): Document new values for -mtune.


[-- Attachment #2: 0001-Patch-AArch64-Add-initial-tuning-support-for-Cortex-.patch --]
[-- Type: text/x-patch; name="0001-Patch-AArch64-Add-initial-tuning-support-for-Cortex-.patch", Size: 5206 bytes --]

diff --git a/gcc/config/aarch64/aarch64-cores.def b/gcc/config/aarch64/aarch64-cores.def
index e333d5f..0baa20c 100644
--- a/gcc/config/aarch64/aarch64-cores.def
+++ b/gcc/config/aarch64/aarch64-cores.def
@@ -80,6 +80,12 @@ AARCH64_CORE("vulcan",  vulcan, thunderx2t99, 8_1A,  AARCH64_FL_FOR_ARCH8_1 | AA
 /* Cavium ('C') cores. */
 AARCH64_CORE("thunderx2t99",  thunderx2t99,  thunderx2t99, 8_1A,  AARCH64_FL_FOR_ARCH8_1 | AARCH64_FL_CRYPTO, thunderx2t99, 0x43, 0x0af, -1)
 
+/* ARMv8.2-A Architecture Processors.  */
+
+/* ARM ('A') cores. */
+AARCH64_CORE("cortex-a55",  cortexa55, cortexa53, 8_2A,  AARCH64_FL_FOR_ARCH8_2 | AARCH64_FL_RCPC, cortexa53, 0x41, 0xd05, -1)
+AARCH64_CORE("cortex-a75",  cortexa75, cortexa57, 8_2A,  AARCH64_FL_FOR_ARCH8_2 | AARCH64_FL_RCPC, cortexa73, 0x41, 0xd0a, -1)
+
 /* ARMv8-A big.LITTLE implementations.  */
 
 AARCH64_CORE("cortex-a57.cortex-a53",  cortexa57cortexa53, cortexa53, 8A,  AARCH64_FL_FOR_ARCH8 | AARCH64_FL_CRC, cortexa57, 0x41, AARCH64_BIG_LITTLE (0xd07, 0xd03), -1)
@@ -87,4 +93,8 @@ AARCH64_CORE("cortex-a72.cortex-a53",  cortexa72cortexa53, cortexa53, 8A,  AARCH
 AARCH64_CORE("cortex-a73.cortex-a35",  cortexa73cortexa35, cortexa53, 8A,  AARCH64_FL_FOR_ARCH8 | AARCH64_FL_CRC, cortexa73, 0x41, AARCH64_BIG_LITTLE (0xd09, 0xd04), -1)
 AARCH64_CORE("cortex-a73.cortex-a53",  cortexa73cortexa53, cortexa53, 8A,  AARCH64_FL_FOR_ARCH8 | AARCH64_FL_CRC, cortexa73, 0x41, AARCH64_BIG_LITTLE (0xd09, 0xd03), -1)
 
+/* ARM DynamIQ big.LITTLE configurations.  */
+
+AARCH64_CORE("cortex-a75.cortex-a55",  cortexa75cortexa55, cortexa53, 8_2A,  AARCH64_FL_FOR_ARCH8_2 | AARCH64_FL_RCPC, cortexa73, 0x41, AARCH64_BIG_LITTLE (0xd0a, 0xd05), -1)
+
 #undef AARCH64_CORE
diff --git a/gcc/config/aarch64/aarch64-tune.md b/gcc/config/aarch64/aarch64-tune.md
index 4209f67..7fcd6cb 100644
--- a/gcc/config/aarch64/aarch64-tune.md
+++ b/gcc/config/aarch64/aarch64-tune.md
@@ -1,5 +1,5 @@
 ;; -*- buffer-read-only: t -*-
 ;; Generated automatically by gentune.sh from aarch64-cores.def
 (define_attr "tune"
-	"cortexa35,cortexa53,cortexa57,cortexa72,cortexa73,thunderx,thunderxt88p1,thunderxt88,thunderxt81,thunderxt83,xgene1,falkor,qdf24xx,exynosm1,thunderx2t99p1,vulcan,thunderx2t99,cortexa57cortexa53,cortexa72cortexa53,cortexa73cortexa35,cortexa73cortexa53"
+	"cortexa35,cortexa53,cortexa57,cortexa72,cortexa73,thunderx,thunderxt88p1,thunderxt88,thunderxt81,thunderxt83,xgene1,falkor,qdf24xx,exynosm1,thunderx2t99p1,vulcan,thunderx2t99,cortexa55,cortexa75,cortexa57cortexa53,cortexa72cortexa53,cortexa73cortexa35,cortexa73cortexa53,cortexa75cortexa55"
 	(const (symbol_ref "((enum attr_tune) aarch64_tune)")))
diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index 86c8d62..2746c3e 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -14077,17 +14077,19 @@ processors implementing the target architecture.
 @opindex mtune
 Specify the name of the target processor for which GCC should tune the
 performance of the code.  Permissible values for this option are:
-@samp{generic}, @samp{cortex-a35}, @samp{cortex-a53}, @samp{cortex-a57},
-@samp{cortex-a72}, @samp{cortex-a73}, @samp{exynos-m1},
-@samp{xgene1}, @samp{vulcan}, @samp{thunderx},
+@samp{generic}, @samp{cortex-a35}, @samp{cortex-a53}, @samp{cortex-a55},
+@samp{cortex-a57}, @samp{cortex-a72}, @samp{cortex-a73}, @samp{cortex-a75},
+@samp{exynos-m1}, @samp{xgene1}, @samp{vulcan}, @samp{thunderx},
 @samp{thunderxt88}, @samp{thunderxt88p1}, @samp{thunderxt81},
 @samp{thunderxt83}, @samp{thunderx2t99}, @samp{cortex-a57.cortex-a53},
 @samp{cortex-a72.cortex-a53}, @samp{cortex-a73.cortex-a35},
-@samp{cortex-a73.cortex-a53}, @samp{native}.
+@samp{cortex-a73.cortex-a53}, @samp{cortex-a75.cortex-a55},
+@samp{native}.
 
 The values @samp{cortex-a57.cortex-a53}, @samp{cortex-a72.cortex-a53},
-@samp{cortex-a73.cortex-a35}, @samp{cortex-a73.cortex-a53}
-specify that GCC should tune for a big.LITTLE system.
+@samp{cortex-a73.cortex-a35}, @samp{cortex-a73.cortex-a53},
+@samp{cortex-a75.cortex-a55} specify that GCC should tune for a
+big.LITTLE system.
 
 Additionally on native AArch64 GNU/Linux systems the value
 @samp{native} tunes performance to the host system.  This option has no effect
@@ -25607,12 +25609,13 @@ This option instructs GCC to use 128-bit AVX instructions instead of
 
 @item -mcx16
 @opindex mcx16
-This option enables GCC to generate @code{CMPXCHG16B} instructions in 64-bit
-code to implement compare-and-exchange operations on 16-byte aligned 128-bit
-objects.  This is useful for atomic updates of data structures exceeding one
-machine word in size.  The compiler uses this instruction to implement
-@ref{__sync Builtins}.  However, for @ref{__atomic Builtins} operating on
-128-bit integers, a library call is always used.
+This option enables GCC to generate @code{CMPXCHG16B} instructions.
+@code{CMPXCHG16B} allows for atomic operations on 128-bit double quadword
+(or oword) data types.  
+This is useful for high-resolution counters that can be updated
+by multiple processors (or cores).  This instruction is generated as part of
+atomic built-in functions: see @ref{__sync Builtins} or
+@ref{__atomic Builtins} for details.
 
 @item -msahf
 @opindex msahf

* Re: [Patch AArch64] Add initial tuning support for Cortex-A55 and Cortex-A75
  2017-06-20 15:42 [Patch AArch64] Add initial tuning support for Cortex-A55 and Cortex-A75 James Greenhalgh
@ 2017-06-21  9:43 ` Richard Earnshaw (lists)
  0 siblings, 0 replies; 2+ messages in thread
From: Richard Earnshaw (lists) @ 2017-06-21  9:43 UTC
  To: James Greenhalgh, gcc-patches; +Cc: nd, marcus.shawcroft

On 20/06/17 16:41, James Greenhalgh wrote:
> 
> Hi,
> 
> This patch adds support for the ARM Cortex-A75 and
> Cortex-A55 processors through the -mcpu/-mtune values cortex-a55 and
> cortex-a75, and an ARM DynamIQ big.LITTLE configuration of these two
> processors through the -mcpu/-mtune value cortex-a75.cortex-a55.
> 
> The ARM Cortex-A75 is ARM's latest and highest performance applications
> processor. For the initial tuning provided in this patch, I have chosen to
> share the tuning structure with its predecessor, the Cortex-A73.
> 
> The ARM Cortex-A55 delivers the best combination of power efficiency
> and performance in its class. For the initial tuning provided in this patch,
> I have chosen to share the tuning structure with its predecessor, the
> Cortex-A53.
> 
> Both Cortex-A55 and Cortex-A75 support ARMv8-A with the ARMv8.1-A and
> ARMv8.2-A extensions, along with the cryptography extension, and
> the RCPC extensions from ARMv8.3-A. This is reflected in the patch:
> -mcpu=cortex-a75 is treated as equivalent to passing -mtune=cortex-a75
> -march=armv8.2-a+rcpc.
> 
> Tested on aarch64-none-elf with no issues.
> 
> OK for trunk?
> 
> Thanks,
> James
> 
> ---
> 2017-06-20  James Greenhalgh  <james.greenhalgh@arm.com>
> 
> 	* config/aarch64/aarch64-cores.def (cortex-a55): New.
> 	(cortex-a75): Likewise.
> 	(cortex-a75.cortex-a55): Likewise.
> 	* config/aarch64/aarch64-tune.md: Regenerate.
> 	* doc/invoke.texi (-mtune): Document new values for -mtune.
> 
> 

Mostly ok, but...

> 0001-Patch-AArch64-Add-initial-tuning-support-for-Cortex-.patch
> 
> 
> diff --git a/gcc/config/aarch64/aarch64-cores.def b/gcc/config/aarch64/aarch64-cores.def
> index e333d5f..0baa20c 100644
> --- a/gcc/config/aarch64/aarch64-cores.def
> +++ b/gcc/config/aarch64/aarch64-cores.def
> @@ -80,6 +80,12 @@ AARCH64_CORE("vulcan",  vulcan, thunderx2t99, 8_1A,  AARCH64_FL_FOR_ARCH8_1 | AA
>  /* Cavium ('C') cores. */
>  AARCH64_CORE("thunderx2t99",  thunderx2t99,  thunderx2t99, 8_1A,  AARCH64_FL_FOR_ARCH8_1 | AARCH64_FL_CRYPTO, thunderx2t99, 0x43, 0x0af, -1)
>  
> +/* ARMv8.2-A Architecture Processors.  */
> +
> +/* ARM ('A') cores. */
> +AARCH64_CORE("cortex-a55",  cortexa55, cortexa53, 8_2A,  AARCH64_FL_FOR_ARCH8_2 | AARCH64_FL_RCPC, cortexa53, 0x41, 0xd05, -1)
> +AARCH64_CORE("cortex-a75",  cortexa75, cortexa57, 8_2A,  AARCH64_FL_FOR_ARCH8_2 | AARCH64_FL_RCPC, cortexa73, 0x41, 0xd0a, -1)
> +
>  /* ARMv8-A big.LITTLE implementations.  */
>  
>  AARCH64_CORE("cortex-a57.cortex-a53",  cortexa57cortexa53, cortexa53, 8A,  AARCH64_FL_FOR_ARCH8 | AARCH64_FL_CRC, cortexa57, 0x41, AARCH64_BIG_LITTLE (0xd07, 0xd03), -1)
> @@ -87,4 +93,8 @@ AARCH64_CORE("cortex-a72.cortex-a53",  cortexa72cortexa53, cortexa53, 8A,  AARCH
>  AARCH64_CORE("cortex-a73.cortex-a35",  cortexa73cortexa35, cortexa53, 8A,  AARCH64_FL_FOR_ARCH8 | AARCH64_FL_CRC, cortexa73, 0x41, AARCH64_BIG_LITTLE (0xd09, 0xd04), -1)
>  AARCH64_CORE("cortex-a73.cortex-a53",  cortexa73cortexa53, cortexa53, 8A,  AARCH64_FL_FOR_ARCH8 | AARCH64_FL_CRC, cortexa73, 0x41, AARCH64_BIG_LITTLE (0xd09, 0xd03), -1)
>  
> +/* ARM DynamIQ big.LITTLE configurations.  */
> +
> +AARCH64_CORE("cortex-a75.cortex-a55",  cortexa75cortexa55, cortexa53, 8_2A,  AARCH64_FL_FOR_ARCH8_2 | AARCH64_FL_RCPC, cortexa73, 0x41, AARCH64_BIG_LITTLE (0xd0a, 0xd05), -1)
> +
>  #undef AARCH64_CORE
> diff --git a/gcc/config/aarch64/aarch64-tune.md b/gcc/config/aarch64/aarch64-tune.md
> index 4209f67..7fcd6cb 100644
> --- a/gcc/config/aarch64/aarch64-tune.md
> +++ b/gcc/config/aarch64/aarch64-tune.md
> @@ -1,5 +1,5 @@
>  ;; -*- buffer-read-only: t -*-
>  ;; Generated automatically by gentune.sh from aarch64-cores.def
>  (define_attr "tune"
> -	"cortexa35,cortexa53,cortexa57,cortexa72,cortexa73,thunderx,thunderxt88p1,thunderxt88,thunderxt81,thunderxt83,xgene1,falkor,qdf24xx,exynosm1,thunderx2t99p1,vulcan,thunderx2t99,cortexa57cortexa53,cortexa72cortexa53,cortexa73cortexa35,cortexa73cortexa53"
> +	"cortexa35,cortexa53,cortexa57,cortexa72,cortexa73,thunderx,thunderxt88p1,thunderxt88,thunderxt81,thunderxt83,xgene1,falkor,qdf24xx,exynosm1,thunderx2t99p1,vulcan,thunderx2t99,cortexa55,cortexa75,cortexa57cortexa53,cortexa72cortexa53,cortexa73cortexa35,cortexa73cortexa53,cortexa75cortexa55"
>  	(const (symbol_ref "((enum attr_tune) aarch64_tune)")))
> diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
> index 86c8d62..2746c3e 100644
> --- a/gcc/doc/invoke.texi
> +++ b/gcc/doc/invoke.texi
> @@ -14077,17 +14077,19 @@ processors implementing the target architecture.
>  @opindex mtune
>  Specify the name of the target processor for which GCC should tune the
>  performance of the code.  Permissible values for this option are:
> -@samp{generic}, @samp{cortex-a35}, @samp{cortex-a53}, @samp{cortex-a57},
> -@samp{cortex-a72}, @samp{cortex-a73}, @samp{exynos-m1},
> -@samp{xgene1}, @samp{vulcan}, @samp{thunderx},
> +@samp{generic}, @samp{cortex-a35}, @samp{cortex-a53}, @samp{cortex-a55},
> +@samp{cortex-a57}, @samp{cortex-a72}, @samp{cortex-a73}, @samp{cortex-a75},
> +@samp{exynos-m1}, @samp{xgene1}, @samp{vulcan}, @samp{thunderx},
>  @samp{thunderxt88}, @samp{thunderxt88p1}, @samp{thunderxt81},
>  @samp{thunderxt83}, @samp{thunderx2t99}, @samp{cortex-a57.cortex-a53},
>  @samp{cortex-a72.cortex-a53}, @samp{cortex-a73.cortex-a35},
> -@samp{cortex-a73.cortex-a53}, @samp{native}.
> +@samp{cortex-a73.cortex-a53}, @samp{cortex-a75.cortex-a55},
> +@samp{native}.
>  
>  The values @samp{cortex-a57.cortex-a53}, @samp{cortex-a72.cortex-a53},
> -@samp{cortex-a73.cortex-a35}, @samp{cortex-a73.cortex-a53}
> -specify that GCC should tune for a big.LITTLE system.
> +@samp{cortex-a73.cortex-a35}, @samp{cortex-a73.cortex-a53},
> +@samp{cortex-a75.cortex-a55} specify that GCC should tune for a
> +big.LITTLE system.
>  
>  Additionally on native AArch64 GNU/Linux systems the value
>  @samp{native} tunes performance to the host system.  This option has no effect
> @@ -25607,12 +25609,13 @@ This option instructs GCC to use 128-bit AVX instructions instead of
>  
>  @item -mcx16
>  @opindex mcx16
> -This option enables GCC to generate @code{CMPXCHG16B} instructions in 64-bit
> -code to implement compare-and-exchange operations on 16-byte aligned 128-bit
> -objects.  This is useful for atomic updates of data structures exceeding one
> -machine word in size.  The compiler uses this instruction to implement
> -@ref{__sync Builtins}.  However, for @ref{__atomic Builtins} operating on
> -128-bit integers, a library call is always used.
> +This option enables GCC to generate @code{CMPXCHG16B} instructions.
> +@code{CMPXCHG16B} allows for atomic operations on 128-bit double quadword
> +(or oword) data types.  
> +This is useful for high-resolution counters that can be updated
> +by multiple processors (or cores).  This instruction is generated as part of
> +atomic built-in functions: see @ref{__sync Builtins} or
> +@ref{__atomic Builtins} for details.
>  
>  @item -msahf
>  @opindex msahf
> 

I don't think this last hunk should be part of this patch.

OK without that bit...

R.