From mboxrd@z Thu Jan 1 00:00:00 1970
Mailing-List: contact gcc-patches-help@gcc.gnu.org; run by ezmlm
Sender: gcc-patches-owner@gcc.gnu.org
Message-ID: <5537B95A.9050606@arm.com>
Date: Wed, 22 Apr 2015 15:08:00 -0000
From: Kyrill Tkachov
To: GCC Patches
CC: Marcus Shawcroft , Richard Earnshaw , James Greenhalgh , Evandro Menezes , Andrew Pinski
Subject: Re: [PATCH][AArch64] Implement -m{cpu,tune,arch}=native using only /proc/cpuinfo
References: <55351FA9.4020603@arm.com> <55378A03.5040509@arm.com>
In-Reply-To: <55378A03.5040509@arm.com>
MIME-Version: 1.0
Content-Type: multipart/mixed; boundary="------------010502060205000408090005"
X-SW-Source: 2015-04/txt/msg01322.txt.bz2

This is a multi-part message in MIME format.
--------------010502060205000408090005
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: quoted-printable
Content-length: 30559

On 22/04/15 12:46, Kyrill Tkachov wrote:
> [Sorry for resending twice. My mail client glitched]
>
> On 20/04/15 16:47, Kyrill Tkachov wrote:
>> Hi all,
>>
>> This is an attempt to add native CPU detection to AArch64 GNU/Linux targets.
>> Similar to other ports we use SPEC rewriting to rewrite -m{cpu,tune,arch}=native
>> options into the appropriate CPU/architecture and the architecture extension options
>> when appropriate (i.e. +crypto/+crc etc).
>>
>> For CPU/architecture detection it gets a bit involved, especially when running on a
>> big.LITTLE system. My proposed approach is to look at /proc/cpuinfo and search for the
>> implementer id and part number fields that uniquely identify each core (appropriate identifying
>> information is added to aarch64-cores.def). If we find two types of core we have a big.LITTLE
>> system, so search through the core definitions extracted from aarch64-cores.def to find if we
>> support such a combination (currently only cortex-a57.cortex-a53 and cortex-a72.cortex-a53)
>> and make sure that the implementer id field matches up.
>>
>> I tested this on a 4xCortex-A53 + 2xCortex-A57 big.LITTLE Ubuntu GNU/Linux system.
>> There are two formats for /proc/cpuinfo that I'm aware of. The first (old) one has the format:
>> --------------------------------------
>> processor : 0
>> processor : 1
>> processor : 2
>> processor : 3
>> processor : 4
>> processor : 5
>> Features : fp asimd evtstrm aes pmull sha1 sha2 crc32
>> CPU implementer : 0x41
>> CPU architecture: AArch64
>> CPU variant : 0x0
>> CPU part : 0xd03
>> --------------------------------------
>>
>> In this format it lists the 6 cores but the CPU part it reports is only the one for the core
>> from which /proc/cpuinfo was read (!), in this case one of the Cortex-A53 cores.
>> This means we detect a different CPU depending on which
>> core GCC was invoked on. Not ideal really, but there's no more information that we can extract.
>> Given the /proc/cpuinfo above, this patch will rewrite -mcpu=native into -mcpu=cortex-a53+fp+simd+crypto+crc
>>
>> The newer /proc/cpuinfo format proposed at
>> https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=44b82b7700d05a52cd983799d3ecde1a976b3bed
>> looks like this:
>>
>> --------------------------------------------------------------
>> processor : 0
>> Features : fp asimd evtstrm aes pmull sha1 sha2 crc32
>> CPU implementer : 0x41
>> CPU architecture: 8
>> CPU variant : 0x0
>> CPU part : 0xd03
>> CPU revision : 0
>>
>> processor : 1
>> Features : fp asimd evtstrm aes pmull sha1 sha2 crc32
>> CPU implementer : 0x41
>> CPU architecture: 8
>> CPU variant : 0x0
>> CPU part : 0xd03
>> CPU revision : 0
>>
>> processor : 2
>> Features : fp asimd evtstrm aes pmull sha1 sha2 crc32
>> CPU implementer : 0x41
>> CPU architecture: 8
>> CPU variant : 0x0
>> CPU part : 0xd03
>> CPU revision : 0
>>
>> processor : 3
>> Features : fp asimd evtstrm aes pmull sha1 sha2 crc32
>> CPU implementer : 0x41
>> CPU architecture: 8
>> CPU variant : 0x0
>> CPU part : 0xd03
>> CPU revision : 0
>>
>> processor : 4
>> Features : fp asimd evtstrm aes pmull sha1 sha2 crc32
>> CPU implementer : 0x41
>> CPU architecture: 8
>> CPU variant : 0x0
>> CPU part : 0xd07
>> CPU revision : 0
>>
>> processor : 5
>> Features : fp asimd evtstrm aes pmull sha1 sha2 crc32
>> CPU implementer : 0x41
>> CPU architecture: 8
>> CPU variant : 0x0
>> CPU part : 0xd07
>> CPU revision : 0
>> --------------------------------------------------------------
>>
>> The Features field is used to detect the architectural features that we map to GCC option extensions
>> i.e. +fp,+crypto,+simd,+crc etc.
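
To make the Features mapping concrete, here is a minimal standalone sketch of turning one Features line into an extension suffix. It is an illustration under stated assumptions, not the code in the patch: the names ext_table, has_token and build_ext_string are hypothetical, the table simply mirrors the FEATURE_STRING entries from aarch64-option-extensions.def, and it matches whole whitespace-delimited tokens where the patch uses a substring search.

```c
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical table for illustration: an extension name and the
   'Features' entries that must all be present to enable it.  */
struct ext_map
{
  const char *ext;
  const char *feats;
};

static const struct ext_map ext_table[] = {
  { "fp", "fp" },
  { "simd", "asimd" },
  { "crypto", "aes pmull sha1 sha2" },
  { "crc", "crc32" },
};

/* Return true if the first LEN characters of TOK occur in LINE as a
   whole token delimited by blanks, newlines or ':'.  */
static bool
has_token (const char *line, const char *tok, size_t len)
{
  const char *p = line;
  while (*p != '\0')
    {
      while (*p == ' ' || *p == '\t' || *p == '\n' || *p == ':')
        p++;
      const char *start = p;
      while (*p != '\0' && *p != ' ' && *p != '\t' && *p != '\n' && *p != ':')
        p++;
      if (len > 0 && (size_t) (p - start) == len
          && strncmp (start, tok, len) == 0)
        return true;
    }
  return false;
}

/* Write a suffix such as "+fp+simd+nocrypto+crc" into OUT.
   Return false if OUT is too small.  */
static bool
build_ext_string (const char *features_line, char *out, size_t outlen)
{
  size_t used = 0;

  if (outlen == 0)
    return false;
  out[0] = '\0';

  for (size_t i = 0; i < sizeof ext_table / sizeof ext_table[0]; i++)
    {
      bool enabled = true;
      const char *f = ext_table[i].feats;

      /* Every space-separated entry must be present in the line.  */
      while (*f != '\0')
        {
          size_t len = strcspn (f, " ");
          if (len > 0 && !has_token (features_line, f, len))
            {
              enabled = false;
              break;
            }
          f += len;
          f += strspn (f, " ");
        }

      int n = snprintf (out + used, outlen - used, "+%s%s",
                        enabled ? "" : "no", ext_table[i].ext);
      if (n < 0 || (size_t) n >= outlen - used)
        return false;
      used += (size_t) n;
    }
  return true;
}
```

Whole-token matching is a deliberate difference in this sketch: a plain substring search for "fp" would also succeed on any longer feature name that happens to contain those two letters.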
>>
>> Similarly, -march=native would be rewritten into -march=armv8-a+fp+simd+crypto+crc
>> while -mtune=native into -mtune=cortex-a57.cortex-a53 (the arch extension options are not valid
>> for -mtune).
>>
>> If it detects more than one implementer ID or the implementer IDs not matching up somewhere
>> or some other weirdness in /proc/cpuinfo or fails to recognise the CPU it will bail out and ignore
>> the option entirely (similarly to other ports).
>>
>> The patch works fine with both /proc/cpuinfo formats although, as mentioned above, it will not be
>> able to detect the big.LITTLE combination from the first format.
>>
>> I've filled in the implementer ID and part numbers for the Cortex-A57, Cortex-A53, Cortex-A72, X-Gene 1 cores,
>> but I don't have that info for thunderx or exynosm1. Could someone from Cavium and Samsung help me out
>> here? At present this patch has some false dummy values that I'd like to fill out before committing this.
> Thanks Andrew and Evandro for the info.
> I've added the numbers to the patch, so it should work on those systems.
> I'm attaching the final patch here for review.

And resending here with a minor whitespace change in aarch64-cores.def
to make thunderx line up with the other entries. Thanks Evandro for pointing it out!

Kyrill

>
> Thanks,
> Kyrill
>
>
> 2015-04-22 Kyrylo Tkachov
>
>     * config.host (case ${host}): Add aarch64*-*-linux case.
>     * config/aarch64/aarch64-cores.def: Add IMPLEMENTER_ID and PART_NUMBER
>     fields to all the cores.
>     * config/aarch64/aarch64-elf.h (DRIVER_SELF_SPECS): Add
>     MCPU_MTUNE_NATIVE_SPECS.
>     * config/aarch64/aarch64-option-extensions.def: Add FEAT_STRING field to all
>     extensions.
>     * config/aarch64/aarch64-opts.h: Adjust definition of AARCH64_CORE.
>     * config/aarch64/aarch64.c: Adjust definition of AARCH64_CORE.
>     Adjust definition of AARCH64_OPT_EXTENSION.
>     * config/aarch64/aarch64.h: Adjust definition of AARCH64_CORE.
>     (MCPU_MTUNE_NATIVE_SPECS): Define.
>     * config/aarch64/driver-aarch64.c: New file.
>     * config/aarch64/x-aarch64: New file.
>     * doc/invoke.texi (AArch64 Options): Document native value for -mcpu,
>     -mtune and -march.
>
>> I've bootstrapped this on the system mentioned above with -mcpu=native in the BOOT_CFLAGS and regtested as well.
>> For the bootstrap I've used the 2nd /proc/cpuinfo format.
>>
>> I've also tested it on AArch64 hardware from ARM Ltd. and the ecosystem.
>>
>> If using the first format the bootstrap fails the comparison because, depending on the OS scheduling, some files
>> are compiled with Cortex-A57 tuning and some with Cortex-A53 tuning and this is practically non-deterministic
>> across stage2 and stage3!
>>
>> What do people think of this approach?
>>
>> 2015-04-20 Kyrylo Tkachov
>>
>>     * config.host (case ${host}): Add aarch64*-*-linux case.
>>     * config/aarch64/aarch64-cores.def: Add IMPLEMENTER_ID and PART_NUMBER
>>     fields to all the cores.
>>     * config/aarch64/aarch64-elf.h (DRIVER_SELF_SPECS): Add
>>     MCPU_MTUNE_NATIVE_SPECS.
>>     * config/aarch64/aarch64-option-extensions.def: Add FEAT_STRING field to all
>>     extensions.
>>     * config/aarch64/aarch64-opts.h: Adjust definition of AARCH64_CORE.
>>     * config/aarch64/aarch64.c: Adjust definition of AARCH64_CORE.
>>     Adjust definition of AARCH64_OPT_EXTENSION.
>>     * config/aarch64/aarch64.h: Adjust definition of AARCH64_CORE.
>>     (MCPU_MTUNE_NATIVE_SPECS): Define.
>>     * config/aarch64/driver-aarch64.c: New file.
>>     * config/aarch64/x-aarch64: New file.
>>     * doc/invoke.texi (AArch64 Options): Document native value for -mcpu,
>>     -mtune and -march.
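
For readers who want the detection idea in isolation, the following is a rough sketch of the approach described in the message: collect the distinct implementer and part values from the cpuinfo text, then look for a single core or a supported big.LITTLE pair. It is not the code in the patch. The names core_table, field_value, add_distinct and detect_core_name are hypothetical, the table copies only the ARM Ltd. entries from aarch64-cores.def, and the sketch works on an in-memory string and compares extracted values exactly instead of searching each line for substrings.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define MAX_KINDS 2   /* At most two kinds of core (big.LITTLE).  */
#define VAL_LEN 16

/* Hypothetical table: a subset of the entries in aarch64-cores.def.  */
struct core_entry
{
  const char *name;
  const char *imp;
  const char *part;
};

static const struct core_entry core_table[] = {
  { "cortex-a53", "0x41", "0xd03" },
  { "cortex-a57", "0x41", "0xd07" },
  { "cortex-a72", "0x41", "0xd08" },
  { "cortex-a57.cortex-a53", "0x41", "0xd07.0xd03" },
  { "cortex-a72.cortex-a53", "0x41", "0xd08.0xd03" },
};

/* If the line at LINE starts with KEY, copy the text after its colon
   into VAL (trimmed) and return true.  */
static bool
field_value (const char *line, const char *key, char *val, size_t vallen)
{
  size_t klen = strlen (key);
  size_t linelen = strcspn (line, "\n");

  if (linelen < klen || strncmp (line, key, klen) != 0)
    return false;
  const char *colon = memchr (line + klen, ':', linelen - klen);
  if (colon == NULL)
    return false;

  const char *v = colon + 1;
  const char *end = line + linelen;
  while (v < end && (*v == ' ' || *v == '\t'))
    v++;
  while (end > v && (end[-1] == ' ' || end[-1] == '\t' || end[-1] == '\r'))
    end--;

  size_t n = (size_t) (end - v);
  if (n == 0 || n >= vallen)
    return false;
  memcpy (val, v, n);
  val[n] = '\0';
  return true;
}

/* Record V in VALS unless already present.  Return false when a third
   distinct value shows up.  */
static bool
add_distinct (char vals[][VAL_LEN], unsigned *n, const char *v)
{
  for (unsigned i = 0; i < *n; i++)
    if (strcmp (vals[i], v) == 0)
      return true;
  if (*n == MAX_KINDS)
    return false;
  strcpy (vals[(*n)++], v);
  return true;
}

/* Return the core name matching CPUINFO, or NULL if unrecognised.  */
static const char *
detect_core_name (const char *cpuinfo)
{
  char imps[MAX_KINDS][VAL_LEN], parts[MAX_KINDS][VAL_LEN], val[VAL_LEN];
  unsigned n_imps = 0, n_parts = 0;

  for (const char *line = cpuinfo; *line != '\0';)
    {
      if (field_value (line, "CPU implementer", val, sizeof val)
          && !add_distinct (imps, &n_imps, val))
        return NULL;
      if (field_value (line, "CPU part", val, sizeof val)
          && !add_distinct (parts, &n_parts, val))
        return NULL;
      line += strcspn (line, "\n");
      if (*line == '\n')
        line++;
    }

  /* Exactly one implementer and one or two kinds of core are handled.  */
  if (n_imps != 1 || n_parts == 0)
    return NULL;

  for (size_t i = 0; i < sizeof core_table / sizeof core_table[0]; i++)
    {
      const struct core_entry *c = &core_table[i];
      if (strcmp (c->imp, imps[0]) != 0)
        continue;
      if (n_parts == 1 && strcmp (c->part, parts[0]) == 0)
        return c->name;
      if (n_parts == 2 && strchr (c->part, '.') != NULL
          && strstr (c->part, parts[0]) != NULL
          && strstr (c->part, parts[1]) != NULL)
        return c->name;
    }
  return NULL;
}
```

With the old cpuinfo format only one part number is ever visible, so a sketch like this can only report a single core there, which matches the limitation noted in the message.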
> > aarch64-native.patch > > > commit bfdf31b9d71620afac43b15ebf31502022a9bc63 > Author: Kyrylo Tkachov > Date: Fri Apr 10 16:39:27 2015 +0100 > > [AArch64] Implement -m{tune,cpu,arch}=3Dnative on AArch64 GNU/Linux > > diff --git a/gcc/config.host b/gcc/config.host > index b0f5940..a8896d1 100644 > --- a/gcc/config.host > +++ b/gcc/config.host > @@ -99,6 +99,14 @@ case ${host} in > esac > > case ${host} in > + aarch64*-*-linux*) > + case ${target} in > + aarch64*-*-*) > + host_extra_gcc_objs=3D"driver-aarch64.o" > + host_xmake_file=3D"${host_xmake_file} aarch64/x-aarch64" > + ;; > + esac > + ;; > arm*-*-freebsd* | arm*-*-linux*) > case ${target} in > arm*-*-*) > diff --git a/gcc/config/aarch64/aarch64-cores.def b/gcc/config/aarch64/aa= rch64-cores.def > index e46d91b..7e6cb73 100644 > --- a/gcc/config/aarch64/aarch64-cores.def > +++ b/gcc/config/aarch64/aarch64-cores.def > @@ -21,7 +21,7 @@ > > Before using #include to read this file, define a macro: > > - AARCH64_CORE(CORE_NAME, CORE_IDENT, SCHEDULER_IDENT, ARCH, FLAGS, = COSTS) > + AARCH64_CORE(CORE_NAME, CORE_IDENT, SCHEDULER_IDENT, ARCH, FLAGS, = COSTS, IMP, PART) > > The CORE_NAME is the name of the core, represented as a string cons= tant. > The CORE_IDENT is the name of the core, represented as an identifie= r. > @@ -30,18 +30,23 @@ > ARCH is the architecture revision implemented by the chip. > FLAGS are the bitwise-or of the traits that apply to that core. > This need not include flags implied by the architecture. > - COSTS is the name of the rtx_costs routine to use. */ > + COSTS is the name of the rtx_costs routine to use. > + IMP is the implementer ID of the CPU vendor. On a GNU/Linux system i= t can > + be found in /proc/cpuinfo. > + PART is the part number of the CPU. On a GNU/Linux system it can be = found > + in /proc/cpuinfo. For big.LITTLE systems this should have the form a= t of > + ".". */ > > /* V8 Architecture Processors. 
*/ > > -AARCH64_CORE("cortex-a53", cortexa53, cortexa53, 8, AARCH64_FL_FOR_ARC= H8 | AARCH64_FL_CRC, cortexa53) > -AARCH64_CORE("cortex-a57", cortexa57, cortexa57, 8, AARCH64_FL_FOR_ARC= H8 | AARCH64_FL_CRC, cortexa57) > -AARCH64_CORE("cortex-a72", cortexa72, cortexa57, 8, AARCH64_FL_FOR_ARC= H8 | AARCH64_FL_CRC, cortexa57) > -AARCH64_CORE("exynos-m1", exynosm1, cortexa57, 8, AARCH64_FL_FOR_ARC= H8 | AARCH64_FL_CRC | AARCH64_FL_CRYPTO, cortexa57) > -AARCH64_CORE("thunderx", thunderx, thunderx, 8, AARCH64_FL_FOR_ARCH= 8 | AARCH64_FL_CRC | AARCH64_FL_CRYPTO, thunderx) > -AARCH64_CORE("xgene1", xgene1, xgene1, 8, AARCH64_FL_FOR_ARC= H8, xgene1) > +AARCH64_CORE("cortex-a53", cortexa53, cortexa53, 8, AARCH64_FL_FOR_ARC= H8 | AARCH64_FL_CRC, cortexa53, "0x41", "0xd03") > +AARCH64_CORE("cortex-a57", cortexa57, cortexa57, 8, AARCH64_FL_FOR_ARC= H8 | AARCH64_FL_CRC, cortexa57, "0x41", "0xd07") > +AARCH64_CORE("cortex-a72", cortexa72, cortexa57, 8, AARCH64_FL_FOR_ARC= H8 | AARCH64_FL_CRC, cortexa57, "0x41", "0xd08") > +AARCH64_CORE("exynos-m1", exynosm1, cortexa57, 8, AARCH64_FL_FOR_ARC= H8 | AARCH64_FL_CRC | AARCH64_FL_CRYPTO, cortexa57, "0x53", "0x001") > +AARCH64_CORE("thunderx", thunderx, thunderx, 8, AARCH64_FL_FOR_ARCH= 8 | AARCH64_FL_CRC | AARCH64_FL_CRYPTO, thunderx, "0x43", "0x0a1") > +AARCH64_CORE("xgene1", xgene1, xgene1, 8, AARCH64_FL_FOR_ARC= H8, xgene1, "0x50", "0x000") > > /* V8 big.LITTLE implementations. 
*/ > > -AARCH64_CORE("cortex-a57.cortex-a53", cortexa57cortexa53, cortexa53, 8,= AARCH64_FL_FOR_ARCH8 | AARCH64_FL_CRC, cortexa57) > -AARCH64_CORE("cortex-a72.cortex-a53", cortexa72cortexa53, cortexa53, 8,= AARCH64_FL_FOR_ARCH8 | AARCH64_FL_CRC, cortexa57) > +AARCH64_CORE("cortex-a57.cortex-a53", cortexa57cortexa53, cortexa53, 8,= AARCH64_FL_FOR_ARCH8 | AARCH64_FL_CRC, cortexa57, "0x41", "0xd07.0xd03") > +AARCH64_CORE("cortex-a72.cortex-a53", cortexa72cortexa53, cortexa53, 8,= AARCH64_FL_FOR_ARCH8 | AARCH64_FL_CRC, cortexa57, "0x41", "0xd08.0xd03") > diff --git a/gcc/config/aarch64/aarch64-elf.h b/gcc/config/aarch64/aarch6= 4-elf.h > index a5ec8cb..1ce6343 100644 > --- a/gcc/config/aarch64/aarch64-elf.h > +++ b/gcc/config/aarch64/aarch64-elf.h > @@ -132,7 +132,8 @@ > #undef DRIVER_SELF_SPECS > #define DRIVER_SELF_SPECS \ > " %{!mbig-endian:%{!mlittle-endian:" ENDIAN_SPEC "}}" \ > - " %{!mabi=3D*:" ABI_SPEC "}" > + " %{!mabi=3D*:" ABI_SPEC "}" \ > + MCPU_MTUNE_NATIVE_SPECS > > #ifdef HAVE_AS_MABI_OPTION > #define ASM_MABI_SPEC "%{mabi=3D*:-mabi=3D%*}" > diff --git a/gcc/config/aarch64/aarch64-option-extensions.def b/gcc/confi= g/aarch64/aarch64-option-extensions.def > index 6ec3ed6..f296296 100644 > --- a/gcc/config/aarch64/aarch64-option-extensions.def > +++ b/gcc/config/aarch64/aarch64-option-extensions.def > @@ -21,18 +21,25 @@ > > Before using #include to read this file, define a macro: > > - AARCH64_OPT_EXTENSION(EXT_NAME, FLAGS_ON, FLAGS_OFF) > + AARCH64_OPT_EXTENSION(EXT_NAME, FLAGS_ON, FLAGS_OFF, FEATURE_STRIN= G) > > EXT_NAME is the name of the extension, represented as a string cons= tant. > FLAGS_ON are the bitwise-or of the features that the extension adds. > - FLAGS_OFF are the bitwise-or of the features that the extension remov= es. */ > + FLAGS_OFF are the bitwise-or of the features that the extension remov= es. 
> + FEAT_STRING is a string containing the entries in the 'Features' fiel= d of > + /proc/cpuinfo on a GNU/Linux system that correspond to this architect= ure > + extension being available. Sometimes multiple entries are needed to = enable > + the extension (for example, the 'crypto' extension depends on four > + entries: aes, pmull, sha1, sha2 being present). In that case this fi= eld > + should contain a whitespace-separated list of the strings in 'Feature= s' > + that are required. Their order is not important. */ > > /* V8 Architecture Extensions. > This list currently contains example extensions for CPUs that imple= ment > AArch64, and therefore serves as a template for adding more CPUs in= the > future. */ > > -AARCH64_OPT_EXTENSION("fp", AARCH64_FL_FP, AARCH64_FL_FPSIMD | AARCH= 64_FL_CRYPTO) > -AARCH64_OPT_EXTENSION("simd", AARCH64_FL_FPSIMD, AARCH64_FL_SIMD |= AARCH64_FL_CRYPTO) > -AARCH64_OPT_EXTENSION("crypto", AARCH64_FL_CRYPTO | AARCH64_FL_FP= SIMD, AARCH64_FL_CRYPTO) > -AARCH64_OPT_EXTENSION("crc", AARCH64_FL_CRC, AARCH64_FL_CRC) > +AARCH64_OPT_EXTENSION("fp", AARCH64_FL_FP, A= ARCH64_FL_FPSIMD | AARCH64_FL_CRYPTO, "fp") > +AARCH64_OPT_EXTENSION("simd", AARCH64_FL_FPSIMD, A= ARCH64_FL_SIMD | AARCH64_FL_CRYPTO, "asimd") > +AARCH64_OPT_EXTENSION("crypto", AARCH64_FL_CRYPTO | AARCH64_FL_FP= SIMD, AARCH64_FL_CRYPTO, "aes pmull sha1 sha2") > +AARCH64_OPT_EXTENSION("crc", AARCH64_FL_CRC, A= ARCH64_FL_CRC, "crc32") > diff --git a/gcc/config/aarch64/aarch64-opts.h b/gcc/config/aarch64/aarch= 64-opts.h > index f88ae5b..ea64cf4 100644 > --- a/gcc/config/aarch64/aarch64-opts.h > +++ b/gcc/config/aarch64/aarch64-opts.h > @@ -25,7 +25,7 @@ > /* The various cores that implement AArch64. 
*/ > enum aarch64_processor > { > -#define AARCH64_CORE(NAME, INTERNAL_IDENT, SCHED, ARCH, FLAGS, COSTS) \ > +#define AARCH64_CORE(NAME, INTERNAL_IDENT, SCHED, ARCH, FLAGS, COSTS, IM= P, PART) \ > INTERNAL_IDENT, > #include "aarch64-cores.def" > #undef AARCH64_CORE > diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c > index 954e110..ea6020f 100644 > --- a/gcc/config/aarch64/aarch64.c > +++ b/gcc/config/aarch64/aarch64.c > @@ -441,7 +441,7 @@ struct processor > /* Processor cores implementing AArch64. */ > static const struct processor all_cores[] =3D > { > -#define AARCH64_CORE(NAME, IDENT, SCHED, ARCH, FLAGS, COSTS) \ > +#define AARCH64_CORE(NAME, IDENT, SCHED, ARCH, FLAGS, COSTS, IMP, PART) \ > {NAME, SCHED, #ARCH, ARCH, FLAGS, &COSTS##_tunings}, > #include "aarch64-cores.def" > #undef AARCH64_CORE > @@ -478,7 +478,7 @@ struct aarch64_option_extension > /* ISA extensions in AArch64. */ > static const struct aarch64_option_extension all_extensions[] =3D > { > -#define AARCH64_OPT_EXTENSION(NAME, FLAGS_ON, FLAGS_OFF) \ > +#define AARCH64_OPT_EXTENSION(NAME, FLAGS_ON, FLAGS_OFF, FEATURE_STRING)= \ > {NAME, FLAGS_ON, FLAGS_OFF}, > #include "aarch64-option-extensions.def" > #undef AARCH64_OPT_EXTENSION > diff --git a/gcc/config/aarch64/aarch64.h b/gcc/config/aarch64/aarch64.h > index bf59e40..1f7187b 100644 > --- a/gcc/config/aarch64/aarch64.h > +++ b/gcc/config/aarch64/aarch64.h > @@ -506,7 +506,7 @@ enum reg_class > > enum target_cpus > { > -#define AARCH64_CORE(NAME, INTERNAL_IDENT, SCHED, ARCH, FLAGS, COSTS) \ > +#define AARCH64_CORE(NAME, INTERNAL_IDENT, SCHED, ARCH, FLAGS, COSTS, IM= P, PART) \ > TARGET_CPU_##INTERNAL_IDENT, > #include "aarch64-cores.def" > #undef AARCH64_CORE > @@ -929,11 +929,24 @@ extern const char *aarch64_rewrite_mcpu (int argc, = const char **argv); > #define BIG_LITTLE_CPU_SPEC_FUNCTIONS \ > { "rewrite_mcpu", aarch64_rewrite_mcpu }, > > +#if defined(__aarch64__) > +extern const char *host_detect_local_cpu (int argc, 
const char **argv); > +# define EXTRA_SPEC_FUNCTIONS \ > + { "local_cpu_detect", host_detect_local_cpu }, \ > + BIG_LITTLE_CPU_SPEC_FUNCTIONS > + > +# define MCPU_MTUNE_NATIVE_SPECS \ > + " %{march=3Dnative:% + " %{mcpu=3Dnative:% + " %{mtune=3Dnative:% +#else > +# define MCPU_MTUNE_NATIVE_SPECS "" > +# define EXTRA_SPEC_FUNCTIONS BIG_LITTLE_CPU_SPEC_FUNCTIONS > +#endif > + > #define ASM_CPU_SPEC \ > BIG_LITTLE_SPEC > > -#define EXTRA_SPEC_FUNCTIONS BIG_LITTLE_CPU_SPEC_FUNCTIONS > - > #define EXTRA_SPECS \ > { "asm_cpu_spec", ASM_CPU_SPEC } > > diff --git a/gcc/config/aarch64/driver-aarch64.c b/gcc/config/aarch64/dri= ver-aarch64.c > new file mode 100644 > index 0000000..da10a4c > --- /dev/null > +++ b/gcc/config/aarch64/driver-aarch64.c > @@ -0,0 +1,307 @@ > +/* Native CPU detection for aarch64. > + Copyright (C) 2014 Free Software Foundation, Inc. > + > + This file is part of GCC. > + > + GCC is free software; you can redistribute it and/or modify > + it under the terms of the GNU General Public License as published by > + the Free Software Foundation; either version 3, or (at your option) > + any later version. > + > + GCC is distributed in the hope that it will be useful, > + but WITHOUT ANY WARRANTY; without even the implied warranty of > + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the > + GNU General Public License for more details. > + > + You should have received a copy of the GNU General Public License > + along with GCC; see the file COPYING3. If not see > +. 
*/ > + > +#include "config.h" > +#include "system.h" > + > +struct arch_extension > +{ > + const char *ext; > + const char *feat_string; > +}; > + > +#define AARCH64_OPT_EXTENSION(EXT_NAME, FLAGS_ON, FLAGS_OFF, FEATURE_STR= ING) \ > + { EXT_NAME, FEATURE_STRING }, > +static struct arch_extension ext_to_feat_string[] =3D > +{ > +#include "aarch64-option-extensions.def" > +}; > +#undef AARCH64_OPT_EXTENSION > + > + > +struct aarch64_core_data > +{ > + const char* name; > + const char* arch; > + const char* implementer_id; > + const char* part_no; > +}; > + > +#define AARCH64_CORE(CORE_NAME, CORE_IDENT, SCHED, ARCH, FLAGS, COSTS, I= MP, PART) \ > + { CORE_NAME, #ARCH, IMP, PART }, > + > +static struct aarch64_core_data cpu_data [] =3D > +{ > +#include "aarch64-cores.def" > + { NULL, NULL, NULL, NULL } > +}; > + > +#undef AARCH64_CORE > + > +struct aarch64_arch > +{ > + const char* id; > + const char* name; > +}; > + > +#define AARCH64_ARCH(NAME, CORE, ARCH, FLAGS) \ > + { #ARCH, NAME }, > + > +static struct aarch64_arch aarch64_arches [] =3D > +{ > +#include "aarch64-arches.def" > + {NULL, NULL} > +}; > + > +#undef AARCH64_ARCH > + > +/* Return the full architecture name string corresponding to the > + identifier ID. */ > + > +static const char* > +get_arch_name_from_id (const char* id) > +{ > + unsigned int i =3D 0; > + > + for (i =3D 0; aarch64_arches[i].id !=3D NULL; i++) > + { > + if (strcmp (id, aarch64_arches[i].id) =3D=3D 0) > + return aarch64_arches[i].name; > + } > + > + return NULL; > +} > + > + > +/* Check wether the string CORE contains the same CPU part numbers > + as BL_STRING. For example CORE=3D"{0xd03, 0xd07}" and BL_STRING=3D"0= xd07.0xd03" > + should return true. */ > + > +static bool > +valid_bL_string_p (const char** core, const char* bL_string) > +{ > + return strstr (bL_string, core[0]) !=3D NULL > + && strstr (bL_string, core[1]) !=3D NULL; > +} > + > +/* Return true iff ARR contains STR in one of its two elements. 
*/ > + > +static bool > +contains_string_p (const char** arr, const char* str) > +{ > + bool res =3D false; > + > + if (arr[0] !=3D NULL) > + { > + res =3D strstr (arr[0], str) !=3D NULL; > + if (res) > + return res; > + > + if (arr[1] !=3D NULL) > + return strstr (arr[1], str) !=3D NULL; > + } > + > + return false; > +} > + > +/* This will be called by the spec parser in gcc.c when it sees > + a %:local_cpu_detect(args) construct. Currently it will be called > + with either "arch", "cpu" or "tune" as argument depending on if > + -march=3Dnative, -mcpu=3Dnative or -mtune=3Dnative is to be substitut= ed. > + > + It returns a string containing new command line parameters to be > + put at the place of the above two options, depending on what CPU > + this is executed. E.g. "-march=3Darmv8-a" on a Cortex-A57 for > + -march=3Dnative. If the routine can't detect a known processor, > + the -march or -mtune option is discarded. > + > + For -mtune and -mcpu arguments it attempts to detect the CPU or > + a big.LITTLE system. > + ARGC and ARGV are set depending on the actual arguments given > + in the spec. */ > + > +const char * > +host_detect_local_cpu (int argc, const char **argv) > +{ > + const char *arch_id =3D NULL; > + const char *res =3D NULL; > + static const int num_exts =3D ARRAY_SIZE (ext_to_feat_string); > + char buf[128]; > + FILE *f =3D NULL; > + bool arch =3D false; > + bool tune =3D false; > + bool cpu =3D false; > + unsigned int i =3D 0; > + unsigned int core_idx =3D 0; > + const char* imps[2] =3D { NULL, NULL }; > + const char* cores[2] =3D { NULL, NULL }; > + unsigned int n_cores =3D 0; > + unsigned int n_imps =3D 0; > + bool processed_exts =3D false; > + const char *ext_string =3D ""; > + > + gcc_assert (argc); > + > + if (!argv[0]) > + goto not_found; > + > + /* Are we processing -march, mtune or mcpu? 
*/ > + arch =3D strcmp (argv[0], "arch") =3D=3D 0; > + if (!arch) > + tune =3D strcmp (argv[0], "tune") =3D=3D 0; > + > + if (!arch && !tune) > + cpu =3D strcmp (argv[0], "cpu") =3D=3D 0; > + > + if (!arch && !tune && !cpu) > + goto not_found; > + > + f =3D fopen ("/proc/cpuinfo", "r"); > + > + if (f =3D=3D NULL) > + goto not_found; > + > + /* Look through /proc/cpuinfo to determine the implementer > + and then the part number that identifies a particular core. */ > + while (fgets (buf, sizeof (buf), f) !=3D NULL) > + { > + if (strstr (buf, "implementer") !=3D NULL) > + { > + for (i =3D 0; cpu_data[i].name !=3D NULL; i++) > + if (strstr (buf, cpu_data[i].implementer_id) !=3D NULL > + && !contains_string_p (imps, cpu_data[i].implementer_id)) > + { > + if (n_imps =3D=3D 2) > + goto not_found; > + > + imps[n_imps++] =3D cpu_data[i].implementer_id; > + > + break; > + } > + continue; > + } > + > + if (strstr (buf, "part") !=3D NULL) > + { > + for (i =3D 0; cpu_data[i].name !=3D NULL; i++) > + if (strstr (buf, cpu_data[i].part_no) !=3D NULL > + && !contains_string_p (cores, cpu_data[i].part_no)) > + { > + if (n_cores =3D=3D 2) > + goto not_found; > + > + cores[n_cores++] =3D cpu_data[i].part_no; > + core_idx =3D i; > + arch_id =3D cpu_data[i].arch; > + break; > + } > + continue; > + } > + if (!tune && !processed_exts && strstr (buf, "Features") !=3D NULL) > + { > + for (i =3D 0; i < num_exts; i++) > + { > + bool enabled =3D true; > + char *p =3D NULL; > + char *feat_string =3D concat (ext_to_feat_string[i].feat_s= tring, NULL); > + > + p =3D strtok (feat_string, " "); > + > + while (p !=3D NULL) > + { > + if (strstr (buf, p) =3D=3D NULL) > + { > + enabled =3D false; > + break; > + } > + p =3D strtok (NULL, " "); > + } > + ext_string =3D concat (ext_string, "+", enabled ? "" : "no= ", > + ext_to_feat_string[i].ext, NULL); > + } > + processed_exts =3D true; > + } > + } > + > + fclose (f); > + f =3D NULL; > + > + /* Weird cpuinfo format that we don't know how to handle. 
*/ > + if (n_cores =3D=3D 0 || n_cores > 2 || n_imps !=3D 1) > + goto not_found; > + > + if (arch && !arch_id) > + goto not_found; > + > + if (arch) > + { > + const char* arch_name =3D get_arch_name_from_id (arch_id); > + > + /* We got some arch indentifier that's not in aarch64-arches.def? = */ > + if (!arch_name) > + goto not_found; > + > + res =3D concat ("-march=3D", arch_name, NULL); > + } > + /* We have big.LITTLE. */ > + else if (n_cores =3D=3D 2) > + { > + for (i =3D 0; cpu_data[i].name !=3D NULL; i++) > + { > + if (strchr (cpu_data[i].part_no, '.') !=3D NULL > + && strncmp (cpu_data[i].implementer_id, imps[0], strlen (i= mps[0]) - 1) =3D=3D 0 > + && valid_bL_string_p (cores, cpu_data[i].part_no)) > + { > + res =3D concat ("-m", cpu ? "cpu" : "tune", "=3D", cpu_dat= a[i].name, NULL); > + break; > + } > + } > + if (!res) > + goto not_found; > + } > + /* The simple, non-big.LITTLE case. */ > + else > + { > + if (strncmp (cpu_data[core_idx].implementer_id, imps[0], > + strlen (imps[0]) - 1) !=3D 0) > + goto not_found; > + > + res =3D concat ("-m", cpu ? "cpu" : "tune", "=3D", > + cpu_data[core_idx].name, NULL); > + } > + > + if (tune) > + return res; > + > + res =3D concat (res, ext_string, NULL); > + > + return res; > + > +not_found: > + { > + /* If detection fails we ignore the option. > + Clean up and return empty string. 
*/ > + > + if (f) > + fclose (f); > + > + return ""; > + } > +} > + > diff --git a/gcc/config/aarch64/x-aarch64 b/gcc/config/aarch64/x-aarch64 > new file mode 100644 > index 0000000..8c09e04 > --- /dev/null > +++ b/gcc/config/aarch64/x-aarch64 > @@ -0,0 +1,3 @@ > +driver-aarch64.o: $(srcdir)/config/aarch64/driver-aarch64.c \ > + $(CONFIG_H) $(SYSTEM_H) > + $(COMPILER) -c $(ALL_COMPILERFLAGS) $(ALL_CPPFLAGS) $(INCLUDES) $< > diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi > index e2918cb..5787524 100644 > --- a/gcc/doc/invoke.texi > +++ b/gcc/doc/invoke.texi > @@ -12318,8 +12318,12 @@ This involves inserting a NOP instruction betwee= n memory instructions and > Specify the name of the target architecture, optionally suffixed by on= e or > more feature modifiers. This option has the form > @option{-march=3D@var{arch}@r{@{}+@r{[}no@r{]}@var{feature}@r{@}*}}, w= here the > -only permissible value for @var{arch} is @samp{armv8-a}. The permissible > -values for @var{feature} are documented in the sub-section below. > +only permissible value for @var{arch} is @samp{armv8-a}. > +The permissible values for @var{feature} are documented in the sub-secti= on > +below. Additionally on native AArch64 GNU/Linux systems the value > +@samp{native} is available. This option causes the compiler to pick the > +architecture of the host system. If the compiler is unable to recognize= the > +architecture of the host system this option has no effect. > > Where conflicting feature modifiers are specified, the right-most feat= ure is > used. > @@ -12343,6 +12347,13 @@ Additionally, this option can specify that GCC s= hould tune the performance > of the code for a big.LITTLE system. Permissible values for this > option are: @samp{cortex-a57.cortex-a53}, @samp{cortex-a72.cortex-a53}. > > +Additionally on native AArch64 GNU/Linux systems the value @samp{native} > +is available. 
> +This option causes the compiler to pick the architecture of and tune the > +performance of the code for the processor of the host system. > +If the compiler is unable to recognize the processor of the host system > +this option has no effect. > + > Where none of @option{-mtune=3D}, @option{-mcpu=3D} or @option{-march= =3D} > are specified, the code is tuned to perform well across a range > of target processors. > @@ -12355,7 +12366,11 @@ Specify the name of the target processor, option= ally suffixed by one or more > feature modifiers. This option has the form > @option{-mcpu=3D@var{cpu}@r{@{}+@r{[}no@r{]}@var{feature}@r{@}*}}, whe= re the > permissible values for @var{cpu} are the same as those available for > -@option{-mtune}. > +@option{-mtune}. Additionally on native AArch64 GNU/Linux systems the > +value @samp{native} is available. > +This option causes the compiler to tune the performance of the code for = the > +processor of the host system. If the compiler is unable to recognize the > +processor of the host system this option has no effect. > > The permissible values for @var{feature} are documented in the sub-sec= tion > below. 
> --------------010502060205000408090005 Content-Type: text/x-patch; name=aarch64-native.patch Content-Transfer-Encoding: quoted-printable Content-Disposition: attachment; filename="aarch64-native.patch" Content-length: 21248 commit 5b8c2958530a96facd16630b89023a4c102af85d Author: Kyrylo Tkachov Date: Fri Apr 10 16:39:27 2015 +0100 [AArch64] Implement -m{tune,cpu,arch}=3Dnative on AArch64 GNU/Linux diff --git a/gcc/config.host b/gcc/config.host index b0f5940..a8896d1 100644 --- a/gcc/config.host +++ b/gcc/config.host @@ -99,6 +99,14 @@ case ${host} in esac =20 case ${host} in + aarch64*-*-linux*) + case ${target} in + aarch64*-*-*) + host_extra_gcc_objs=3D"driver-aarch64.o" + host_xmake_file=3D"${host_xmake_file} aarch64/x-aarch64" + ;; + esac + ;; arm*-*-freebsd* | arm*-*-linux*) case ${target} in arm*-*-*) diff --git a/gcc/config/aarch64/aarch64-cores.def b/gcc/config/aarch64/aarc= h64-cores.def index e46d91b..7c285ba 100644 --- a/gcc/config/aarch64/aarch64-cores.def +++ b/gcc/config/aarch64/aarch64-cores.def @@ -21,7 +21,7 @@ =20 Before using #include to read this file, define a macro: =20 - AARCH64_CORE(CORE_NAME, CORE_IDENT, SCHEDULER_IDENT, ARCH, FLAGS, CO= STS) + AARCH64_CORE(CORE_NAME, CORE_IDENT, SCHEDULER_IDENT, ARCH, FLAGS, CO= STS, IMP, PART) =20 The CORE_NAME is the name of the core, represented as a string constant. The CORE_IDENT is the name of the core, represented as an identifier. @@ -30,18 +30,23 @@ ARCH is the architecture revision implemented by the chip. FLAGS are the bitwise-or of the traits that apply to that core. This need not include flags implied by the architecture. - COSTS is the name of the rtx_costs routine to use. */ + COSTS is the name of the rtx_costs routine to use. + IMP is the implementer ID of the CPU vendor. On a GNU/Linux system it = can + be found in /proc/cpuinfo. + PART is the part number of the CPU. On a GNU/Linux system it can be fo= und + in /proc/cpuinfo. For big.LITTLE systems this should have the form at = of + ".". 
*/ =20 /* V8 Architecture Processors. */ =20 -AARCH64_CORE("cortex-a53", cortexa53, cortexa53, 8, AARCH64_FL_FOR_ARCH8= | AARCH64_FL_CRC, cortexa53) -AARCH64_CORE("cortex-a57", cortexa57, cortexa57, 8, AARCH64_FL_FOR_ARCH8= | AARCH64_FL_CRC, cortexa57) -AARCH64_CORE("cortex-a72", cortexa72, cortexa57, 8, AARCH64_FL_FOR_ARCH8= | AARCH64_FL_CRC, cortexa57) -AARCH64_CORE("exynos-m1", exynosm1, cortexa57, 8, AARCH64_FL_FOR_ARCH8= | AARCH64_FL_CRC | AARCH64_FL_CRYPTO, cortexa57) -AARCH64_CORE("thunderx", thunderx, thunderx, 8, AARCH64_FL_FOR_ARCH8 = | AARCH64_FL_CRC | AARCH64_FL_CRYPTO, thunderx) -AARCH64_CORE("xgene1", xgene1, xgene1, 8, AARCH64_FL_FOR_ARCH8= , xgene1) +AARCH64_CORE("cortex-a53", cortexa53, cortexa53, 8, AARCH64_FL_FOR_ARCH8= | AARCH64_FL_CRC, cortexa53, "0x41", "0xd03") +AARCH64_CORE("cortex-a57", cortexa57, cortexa57, 8, AARCH64_FL_FOR_ARCH8= | AARCH64_FL_CRC, cortexa57, "0x41", "0xd07") +AARCH64_CORE("cortex-a72", cortexa72, cortexa57, 8, AARCH64_FL_FOR_ARCH8= | AARCH64_FL_CRC, cortexa57, "0x41", "0xd08") +AARCH64_CORE("exynos-m1", exynosm1, cortexa57, 8, AARCH64_FL_FOR_ARCH8= | AARCH64_FL_CRC | AARCH64_FL_CRYPTO, cortexa57, "0x53", "0x001") +AARCH64_CORE("thunderx", thunderx, thunderx, 8, AARCH64_FL_FOR_ARCH8= | AARCH64_FL_CRC | AARCH64_FL_CRYPTO, thunderx, "0x43", "0x0a1") +AARCH64_CORE("xgene1", xgene1, xgene1, 8, AARCH64_FL_FOR_ARCH8= , xgene1, "0x50", "0x000") =20 /* V8 big.LITTLE implementations. 
*/
 
-AARCH64_CORE("cortex-a57.cortex-a53", cortexa57cortexa53, cortexa53, 8, AARCH64_FL_FOR_ARCH8 | AARCH64_FL_CRC, cortexa57)
-AARCH64_CORE("cortex-a72.cortex-a53", cortexa72cortexa53, cortexa53, 8, AARCH64_FL_FOR_ARCH8 | AARCH64_FL_CRC, cortexa57)
+AARCH64_CORE("cortex-a57.cortex-a53", cortexa57cortexa53, cortexa53, 8, AARCH64_FL_FOR_ARCH8 | AARCH64_FL_CRC, cortexa57, "0x41", "0xd07.0xd03")
+AARCH64_CORE("cortex-a72.cortex-a53", cortexa72cortexa53, cortexa53, 8, AARCH64_FL_FOR_ARCH8 | AARCH64_FL_CRC, cortexa57, "0x41", "0xd08.0xd03")
diff --git a/gcc/config/aarch64/aarch64-elf.h b/gcc/config/aarch64/aarch64-elf.h
index a5ec8cb..1ce6343 100644
--- a/gcc/config/aarch64/aarch64-elf.h
+++ b/gcc/config/aarch64/aarch64-elf.h
@@ -132,7 +132,8 @@
 #undef DRIVER_SELF_SPECS
 #define DRIVER_SELF_SPECS \
   " %{!mbig-endian:%{!mlittle-endian:" ENDIAN_SPEC "}}" \
-  " %{!mabi=*:" ABI_SPEC "}"
+  " %{!mabi=*:" ABI_SPEC "}" \
+  MCPU_MTUNE_NATIVE_SPECS
 
 #ifdef HAVE_AS_MABI_OPTION
 #define ASM_MABI_SPEC "%{mabi=*:-mabi=%*}"
diff --git a/gcc/config/aarch64/aarch64-option-extensions.def b/gcc/config/aarch64/aarch64-option-extensions.def
index 6ec3ed6..f296296 100644
--- a/gcc/config/aarch64/aarch64-option-extensions.def
+++ b/gcc/config/aarch64/aarch64-option-extensions.def
@@ -21,18 +21,25 @@
 
    Before using #include to read this file, define a macro:
 
-      AARCH64_OPT_EXTENSION(EXT_NAME, FLAGS_ON, FLAGS_OFF)
+      AARCH64_OPT_EXTENSION(EXT_NAME, FLAGS_ON, FLAGS_OFF, FEATURE_STRING)
 
    EXT_NAME is the name of the extension, represented as a string constant.
    FLAGS_ON are the bitwise-or of the features that the extension adds.
-   FLAGS_OFF are the bitwise-or of the features that the extension removes.  */
+   FLAGS_OFF are the bitwise-or of the features that the extension removes.
+   FEATURE_STRING is a string containing the entries in the 'Features' field of
+   /proc/cpuinfo on a GNU/Linux system that correspond to this architecture
+   extension being available.
Sometimes multiple entries are needed to enable
+   the extension (for example, the 'crypto' extension depends on four
+   entries: aes, pmull, sha1, sha2 being present).  In that case this field
+   should contain a whitespace-separated list of the strings in 'Features'
+   that are required.  Their order is not important.  */
 
 /* V8 Architecture Extensions.
    This list currently contains example extensions for CPUs that implement
    AArch64, and therefore serves as a template for adding more CPUs in the
    future.  */
 
-AARCH64_OPT_EXTENSION("fp", AARCH64_FL_FP, AARCH64_FL_FPSIMD | AARCH64_FL_CRYPTO)
-AARCH64_OPT_EXTENSION("simd", AARCH64_FL_FPSIMD, AARCH64_FL_SIMD | AARCH64_FL_CRYPTO)
-AARCH64_OPT_EXTENSION("crypto", AARCH64_FL_CRYPTO | AARCH64_FL_FPSIMD, AARCH64_FL_CRYPTO)
-AARCH64_OPT_EXTENSION("crc", AARCH64_FL_CRC, AARCH64_FL_CRC)
+AARCH64_OPT_EXTENSION("fp", AARCH64_FL_FP, AARCH64_FL_FPSIMD | AARCH64_FL_CRYPTO, "fp")
+AARCH64_OPT_EXTENSION("simd", AARCH64_FL_FPSIMD, AARCH64_FL_SIMD | AARCH64_FL_CRYPTO, "asimd")
+AARCH64_OPT_EXTENSION("crypto", AARCH64_FL_CRYPTO | AARCH64_FL_FPSIMD, AARCH64_FL_CRYPTO, "aes pmull sha1 sha2")
+AARCH64_OPT_EXTENSION("crc", AARCH64_FL_CRC, AARCH64_FL_CRC, "crc32")
diff --git a/gcc/config/aarch64/aarch64-opts.h b/gcc/config/aarch64/aarch64-opts.h
index f88ae5b..ea64cf4 100644
--- a/gcc/config/aarch64/aarch64-opts.h
+++ b/gcc/config/aarch64/aarch64-opts.h
@@ -25,7 +25,7 @@
 /* The various cores that implement AArch64.  */
 enum aarch64_processor
 {
-#define AARCH64_CORE(NAME, INTERNAL_IDENT, SCHED, ARCH, FLAGS, COSTS) \
+#define AARCH64_CORE(NAME, INTERNAL_IDENT, SCHED, ARCH, FLAGS, COSTS, IMP, PART) \
   INTERNAL_IDENT,
 #include "aarch64-cores.def"
 #undef AARCH64_CORE
diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
index a90993b..5999950 100644
--- a/gcc/config/aarch64/aarch64.c
+++ b/gcc/config/aarch64/aarch64.c
@@ -441,7 +441,7 @@ struct processor
 /* Processor cores implementing AArch64.
*/
 static const struct processor all_cores[] =
 {
-#define AARCH64_CORE(NAME, IDENT, SCHED, ARCH, FLAGS, COSTS) \
+#define AARCH64_CORE(NAME, IDENT, SCHED, ARCH, FLAGS, COSTS, IMP, PART) \
   {NAME, SCHED, #ARCH, ARCH, FLAGS, &COSTS##_tunings},
 #include "aarch64-cores.def"
 #undef AARCH64_CORE
@@ -478,7 +478,7 @@ struct aarch64_option_extension
 /* ISA extensions in AArch64.  */
 static const struct aarch64_option_extension all_extensions[] =
 {
-#define AARCH64_OPT_EXTENSION(NAME, FLAGS_ON, FLAGS_OFF) \
+#define AARCH64_OPT_EXTENSION(NAME, FLAGS_ON, FLAGS_OFF, FEATURE_STRING) \
   {NAME, FLAGS_ON, FLAGS_OFF},
 #include "aarch64-option-extensions.def"
 #undef AARCH64_OPT_EXTENSION
diff --git a/gcc/config/aarch64/aarch64.h b/gcc/config/aarch64/aarch64.h
index bf59e40..1f7187b 100644
--- a/gcc/config/aarch64/aarch64.h
+++ b/gcc/config/aarch64/aarch64.h
@@ -506,7 +506,7 @@ enum reg_class
 
 enum target_cpus
 {
-#define AARCH64_CORE(NAME, INTERNAL_IDENT, SCHED, ARCH, FLAGS, COSTS) \
+#define AARCH64_CORE(NAME, INTERNAL_IDENT, SCHED, ARCH, FLAGS, COSTS, IMP, PART) \
   TARGET_CPU_##INTERNAL_IDENT,
 #include "aarch64-cores.def"
 #undef AARCH64_CORE
@@ -929,11 +929,24 @@ extern const char *aarch64_rewrite_mcpu (int argc, const char **argv);
 #define BIG_LITTLE_CPU_SPEC_FUNCTIONS \
   { "rewrite_mcpu", aarch64_rewrite_mcpu },
 
+#if defined(__aarch64__)
+extern const char *host_detect_local_cpu (int argc, const char **argv);
+# define EXTRA_SPEC_FUNCTIONS \
+  { "local_cpu_detect", host_detect_local_cpu }, \
+  BIG_LITTLE_CPU_SPEC_FUNCTIONS
+
+# define MCPU_MTUNE_NATIVE_SPECS \
+  " %{march=native:%.
*/
+
+#include "config.h"
+#include "system.h"
+
+struct arch_extension
+{
+  const char *ext;
+  const char *feat_string;
+};
+
+#define AARCH64_OPT_EXTENSION(EXT_NAME, FLAGS_ON, FLAGS_OFF, FEATURE_STRING) \
+  { EXT_NAME, FEATURE_STRING },
+static struct arch_extension ext_to_feat_string[] =
+{
+#include "aarch64-option-extensions.def"
+};
+#undef AARCH64_OPT_EXTENSION
+
+
+struct aarch64_core_data
+{
+  const char* name;
+  const char* arch;
+  const char* implementer_id;
+  const char* part_no;
+};
+
+#define AARCH64_CORE(CORE_NAME, CORE_IDENT, SCHED, ARCH, FLAGS, COSTS, IMP, PART) \
+  { CORE_NAME, #ARCH, IMP, PART },
+
+static struct aarch64_core_data cpu_data [] =
+{
+#include "aarch64-cores.def"
+  { NULL, NULL, NULL, NULL }
+};
+
+#undef AARCH64_CORE
+
+struct aarch64_arch
+{
+  const char* id;
+  const char* name;
+};
+
+#define AARCH64_ARCH(NAME, CORE, ARCH, FLAGS) \
+  { #ARCH, NAME },
+
+static struct aarch64_arch aarch64_arches [] =
+{
+#include "aarch64-arches.def"
+  {NULL, NULL}
+};
+
+#undef AARCH64_ARCH
+
+/* Return the full architecture name string corresponding to the
+   identifier ID.  */
+
+static const char*
+get_arch_name_from_id (const char* id)
+{
+  unsigned int i = 0;
+
+  for (i = 0; aarch64_arches[i].id != NULL; i++)
+    {
+      if (strcmp (id, aarch64_arches[i].id) == 0)
+        return aarch64_arches[i].name;
+    }
+
+  return NULL;
+}
+
+
+/* Check whether the string CORE contains the same CPU part numbers
+   as BL_STRING.  For example CORE="{0xd03, 0xd07}" and BL_STRING="0xd07.0xd03"
+   should return true.  */
+
+static bool
+valid_bL_string_p (const char** core, const char* bL_string)
+{
+  return strstr (bL_string, core[0]) != NULL
+    && strstr (bL_string, core[1]) != NULL;
+}
+
+/* Return true iff ARR contains STR in one of its two elements.
*/
+
+static bool
+contains_string_p (const char** arr, const char* str)
+{
+  bool res = false;
+
+  if (arr[0] != NULL)
+    {
+      res = strstr (arr[0], str) != NULL;
+      if (res)
+        return res;
+
+      if (arr[1] != NULL)
+        return strstr (arr[1], str) != NULL;
+    }
+
+  return false;
+}
+
+/* This will be called by the spec parser in gcc.c when it sees
+   a %:local_cpu_detect(args) construct.  Currently it will be called
+   with either "arch", "cpu" or "tune" as argument depending on whether
+   -march=native, -mcpu=native or -mtune=native is to be substituted.
+
+   It returns a string containing new command line parameters to be
+   put in place of the above options, depending on which CPU
+   this is executed on.  E.g. "-march=armv8-a" on a Cortex-A57 for
+   -march=native.  If the routine can't detect a known processor,
+   the -march or -mtune option is discarded.
+
+   For -mtune and -mcpu arguments it attempts to detect the CPU or
+   a big.LITTLE system.
+   ARGC and ARGV are set depending on the actual arguments given
+   in the spec.  */
+
+const char *
+host_detect_local_cpu (int argc, const char **argv)
+{
+  const char *arch_id = NULL;
+  const char *res = NULL;
+  static const int num_exts = ARRAY_SIZE (ext_to_feat_string);
+  char buf[128];
+  FILE *f = NULL;
+  bool arch = false;
+  bool tune = false;
+  bool cpu = false;
+  unsigned int i = 0;
+  unsigned int core_idx = 0;
+  const char* imps[2] = { NULL, NULL };
+  const char* cores[2] = { NULL, NULL };
+  unsigned int n_cores = 0;
+  unsigned int n_imps = 0;
+  bool processed_exts = false;
+  const char *ext_string = "";
+
+  gcc_assert (argc);
+
+  if (!argv[0])
+    goto not_found;
+
+  /* Are we processing -march, -mtune or -mcpu?
*/
+  arch = strcmp (argv[0], "arch") == 0;
+  if (!arch)
+    tune = strcmp (argv[0], "tune") == 0;
+
+  if (!arch && !tune)
+    cpu = strcmp (argv[0], "cpu") == 0;
+
+  if (!arch && !tune && !cpu)
+    goto not_found;
+
+  f = fopen ("/proc/cpuinfo", "r");
+
+  if (f == NULL)
+    goto not_found;
+
+  /* Look through /proc/cpuinfo to determine the implementer
+     and then the part number that identifies a particular core.  */
+  while (fgets (buf, sizeof (buf), f) != NULL)
+    {
+      if (strstr (buf, "implementer") != NULL)
+        {
+          for (i = 0; cpu_data[i].name != NULL; i++)
+            if (strstr (buf, cpu_data[i].implementer_id) != NULL
+                && !contains_string_p (imps, cpu_data[i].implementer_id))
+              {
+                if (n_imps == 2)
+                  goto not_found;
+
+                imps[n_imps++] = cpu_data[i].implementer_id;
+
+                break;
+              }
+          continue;
+        }
+
+      if (strstr (buf, "part") != NULL)
+        {
+          for (i = 0; cpu_data[i].name != NULL; i++)
+            if (strstr (buf, cpu_data[i].part_no) != NULL
+                && !contains_string_p (cores, cpu_data[i].part_no))
+              {
+                if (n_cores == 2)
+                  goto not_found;
+
+                cores[n_cores++] = cpu_data[i].part_no;
+                core_idx = i;
+                arch_id = cpu_data[i].arch;
+                break;
+              }
+          continue;
+        }
+      if (!tune && !processed_exts && strstr (buf, "Features") != NULL)
+        {
+          for (i = 0; i < num_exts; i++)
+            {
+              bool enabled = true;
+              char *p = NULL;
+              char *feat_string = concat (ext_to_feat_string[i].feat_string, NULL);
+
+              p = strtok (feat_string, " ");
+
+              while (p != NULL)
+                {
+                  if (strstr (buf, p) == NULL)
+                    {
+                      enabled = false;
+                      break;
+                    }
+                  p = strtok (NULL, " ");
+                }
+              ext_string = concat (ext_string, "+", enabled ? "" : "no",
+                                   ext_to_feat_string[i].ext, NULL);
+            }
+          processed_exts = true;
+        }
+    }
+
+  fclose (f);
+  f = NULL;
+
+  /* Weird cpuinfo format that we don't know how to handle.
*/
+  if (n_cores == 0 || n_cores > 2 || n_imps != 1)
+    goto not_found;
+
+  if (arch && !arch_id)
+    goto not_found;
+
+  if (arch)
+    {
+      const char* arch_name = get_arch_name_from_id (arch_id);
+
+      /* We got some arch identifier that's not in aarch64-arches.def?  */
+      if (!arch_name)
+        goto not_found;
+
+      res = concat ("-march=", arch_name, NULL);
+    }
+  /* We have big.LITTLE.  */
+  else if (n_cores == 2)
+    {
+      for (i = 0; cpu_data[i].name != NULL; i++)
+        {
+          if (strchr (cpu_data[i].part_no, '.') != NULL
+              && strncmp (cpu_data[i].implementer_id, imps[0], strlen (imps[0]) - 1) == 0
+              && valid_bL_string_p (cores, cpu_data[i].part_no))
+            {
+              res = concat ("-m", cpu ? "cpu" : "tune", "=", cpu_data[i].name, NULL);
+              break;
+            }
+        }
+      if (!res)
+        goto not_found;
+    }
+  /* The simple, non-big.LITTLE case.  */
+  else
+    {
+      if (strncmp (cpu_data[core_idx].implementer_id, imps[0],
+                   strlen (imps[0]) - 1) != 0)
+        goto not_found;
+
+      res = concat ("-m", cpu ? "cpu" : "tune", "=",
+                    cpu_data[core_idx].name, NULL);
+    }
+
+  if (tune)
+    return res;
+
+  res = concat (res, ext_string, NULL);
+
+  return res;
+
+not_found:
+  {
+    /* If detection fails we ignore the option.
+       Clean up and return empty string.  */
+
+    if (f)
+      fclose (f);
+
+    return "";
+  }
+}
+
diff --git a/gcc/config/aarch64/x-aarch64 b/gcc/config/aarch64/x-aarch64
new file mode 100644
index 0000000..8c09e04
--- /dev/null
+++ b/gcc/config/aarch64/x-aarch64
@@ -0,0 +1,3 @@
+driver-aarch64.o: $(srcdir)/config/aarch64/driver-aarch64.c \
+  $(CONFIG_H) $(SYSTEM_H)
+	$(COMPILER) -c $(ALL_COMPILERFLAGS) $(ALL_CPPFLAGS) $(INCLUDES) $<
diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index e89e5a8..80dd131 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -12323,8 +12323,12 @@ This involves inserting a NOP instruction between memory instructions and
 Specify the name of the target architecture, optionally suffixed by one or
 more feature modifiers.
This option has the form
 @option{-march=@var{arch}@r{@{}+@r{[}no@r{]}@var{feature}@r{@}*}}, where the
-only permissible value for @var{arch} is @samp{armv8-a}.  The permissible
-values for @var{feature} are documented in the sub-section below.
+only permissible value for @var{arch} is @samp{armv8-a}.
+The permissible values for @var{feature} are documented in the sub-section
+below.  Additionally on native AArch64 GNU/Linux systems the value
+@samp{native} is available.  This option causes the compiler to pick the
+architecture of the host system.  If the compiler is unable to recognize the
+architecture of the host system this option has no effect.
 
 Where conflicting feature modifiers are specified, the right-most feature is
 used.
@@ -12348,6 +12352,13 @@ Additionally, this option can specify that GCC should tune the performance
 of the code for a big.LITTLE system.  Permissible values for this option are:
 @samp{cortex-a57.cortex-a53}, @samp{cortex-a72.cortex-a53}.
 
+Additionally on native AArch64 GNU/Linux systems the value @samp{native}
+is available.
+This option causes the compiler to pick the architecture of and tune the
+performance of the code for the processor of the host system.
+If the compiler is unable to recognize the processor of the host system
+this option has no effect.
+
 Where none of @option{-mtune=}, @option{-mcpu=} or @option{-march=}
 are specified, the code is tuned to perform well across a range
 of target processors.
@@ -12360,7 +12371,11 @@ Specify the name of the target processor, optionally suffixed by one or more
 feature modifiers.  This option has the form
 @option{-mcpu=@var{cpu}@r{@{}+@r{[}no@r{]}@var{feature}@r{@}*}}, where the
 permissible values for @var{cpu} are the same as those available for
-@option{-mtune}.
+@option{-mtune}.  Additionally on native AArch64 GNU/Linux systems the
+value @samp{native} is available.
+This option causes the compiler to tune the performance of the code for the
+processor of the host system.  If the compiler is unable to recognize the
+processor of the host system this option has no effect.
 
 The permissible values for @var{feature} are documented in the sub-section
 below.

--------------010502060205000408090005--
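
As a reading aid for the 'Features' handling in host_detect_local_cpu, here is a minimal standalone sketch of the rule the patch describes: an extension counts as available only if every whitespace-separated entry of its feature string appears in the 'Features' line of /proc/cpuinfo. This is not code from the patch. The helper name ext_enabled_p is hypothetical, and it uses plain malloc rather than GCC's concat so that it builds outside the GCC tree. Like the patch, it uses a substring test, so it assumes no required entry is a substring of an unrelated feature name.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper (not in the patch): return true if every
   space-separated entry of FEAT_STRING occurs somewhere in
   FEATURES_LINE, the 'Features' line read from /proc/cpuinfo.  */
static bool
ext_enabled_p (const char *features_line, const char *feat_string)
{
  assert (features_line != NULL && feat_string != NULL);

  /* strtok modifies its argument, so tokenize a private copy.  */
  size_t len = strlen (feat_string) + 1;
  char *copy = malloc (len);
  if (copy == NULL)
    return false;
  memcpy (copy, feat_string, len);

  bool enabled = true;
  for (char *p = strtok (copy, " "); p != NULL; p = strtok (NULL, " "))
    if (strstr (features_line, p) == NULL)
      {
        /* One required entry is missing: the extension is off.  */
        enabled = false;
        break;
      }

  free (copy);
  return enabled;
}
```

With the Features line from the old-format /proc/cpuinfo example quoted earlier in the thread, ext_enabled_p (line, "aes pmull sha1 sha2") holds, so the driver logic in the patch would append +crypto. For a line that lacks any of the four entries it would append +nocrypto instead.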