Message-ID: <2b71d409-0649-f47c-0626-f4726456256d@linaro.org>
Date: Tue, 11 Jul 2023 10:45:39 -0300
Subject: Re: [PATCH v5] PowerPC: Influence cpu/arch hwcap features via GLIBC_TUNABLES.
To: bmahi496@linux.ibm.com, libc-alpha@sourceware.org
Cc: rajis@linux.ibm.com, bergner@linux.ibm.com, Mahesh Bodapati
References: <20230710182150.2376678-1-bmahi496@linux.ibm.com>
From: Adhemerval Zanella Netto
Organization: Linaro
In-Reply-To: <20230710182150.2376678-1-bmahi496@linux.ibm.com>

On 10/07/23 15:21, bmahi496@linux.ibm.com wrote:
> From: Mahesh Bodapati
>
> This patch enables the option to influence hwcaps used by PowerPC.
> The environment variable, GLIBC_TUNABLES=glibc.cpu.hwcaps=-xxx,yyy,-zzz....,
> can be used to enable CPU/ARCH feature yyy, disable CPU/ARCH feature xxx
> and zzz, where the feature name is case-sensitive and has to match the ones
> mentioned in the file{sysdeps/powerpc/dl-procinfo.c}.
>
> Note that the tunable only handles the features which are really used
> in the IFUNC selection. All others are ignored as the values are only
> used inside glibc.

It is still missing a regression test to check whether the tunable works as
intended.

> ---
>  manual/tunables.texi                          |   5 +-
>  sysdeps/powerpc/cpu-features.c                |  97 ++++++++++++++++-
>  sysdeps/powerpc/cpu-features.h                | 102 ++++++++++++++++++
>  sysdeps/powerpc/dl-tunables.list              |   3 +
>  sysdeps/powerpc/hwcapinfo.c                   |   4 +
>  .../power4/multiarch/ifunc-impl-list.c        |   4 +-
>  .../powerpc32/power4/multiarch/init-arch.h    |  10 +-
>  sysdeps/powerpc/powerpc64/dl-machine.h        |   2 -
>  .../powerpc64/multiarch/ifunc-impl-list.c     |   7 +-
>  9 files changed, 222 insertions(+), 12 deletions(-)
>
> diff --git a/manual/tunables.texi b/manual/tunables.texi
> index 4ca0e42a11..776fd93fd9 100644
> --- a/manual/tunables.texi
> +++ b/manual/tunables.texi
> @@ -513,7 +513,10 @@ On s390x, the supported HWCAP and STFLE features can be found in
>  @code{sysdeps/s390/cpu-features.c}. In addition the user can also set
>  a CPU arch-level like @code{z13} instead of single HWCAP and STFLE features.
>
> -This tunable is specific to i386, x86-64 and s390x.
> +On powerpc, the supported HWCAP and HWCAP2 features can be found in
> +@code{sysdeps/powerpc/dl-procinfo.c}.
> +
> +This tunable is specific to i386, x86-64, s390x and powerpc.
>  @end deftp
>
>  @deftp Tunable glibc.cpu.cached_memopt
> diff --git a/sysdeps/powerpc/cpu-features.c b/sysdeps/powerpc/cpu-features.c
> index 0ef3cf89d2..bf1c5353da 100644
> --- a/sysdeps/powerpc/cpu-features.c
> +++ b/sysdeps/powerpc/cpu-features.c
> @@ -19,14 +19,109 @@
>  #include
>  #include
>  #include
> +#include
> +#include
> +#define MEMCMP_DEFAULT memcmp
> +#define STRLEN_DEFAULT strlen
> +
> +static void
> +TUNABLE_CALLBACK (set_hwcaps) (tunable_val_t *valp)
> +{
> +  /* The current IFUNC selection is always using the most recent
> +     features which are available via AT_HWCAP or AT_HWCAP2. But in
> +     some scenarios it is useful to adjust this selection.
> +
> +     The environment variable:
> +
> +     GLIBC_TUNABLES=glibc.cpu.hwcaps=-xxx,yyy,....
> +
> +     Can be used to enable HWCAP/HWCAP2 feature yyy, disable HWCAP/HWCAP2
> +     feature xxx, where the feature name is case-sensitive and has to match
> +     the ones mentioned in the file{sysdeps/powerpc/dl-procinfo.c}. */
> +
> +  /* Copy the features from dl_powerpc_cpu_features, which contains the
> +     features provided by AT_HWCAP and AT_HWCAP2. */
> +  struct cpu_features *cpu_features = &GLRO(dl_powerpc_cpu_features);
> +  unsigned long int tcbv_hwcap = cpu_features->hwcap;
> +  unsigned long int tcbv_hwcap2 = cpu_features->hwcap2;
> +  unsigned int tun_count;
> +  const char *token = valp->strval;
> +  tun_count = sizeof(hwcap_tunables)/sizeof(hwcap_tunables[0]);
> +  do
> +    {
> +      const char *token_end, *feature;
> +      bool disable;
> +      unsigned short offset=0;
> +      size_t token_len, i, feature_len;
> +      /* Find token separator or end of string. */
> +      for (token_end = token; *token_end != ','; token_end++)
> +        if (*token_end == '\0')
> +          break;
> +
> +      /* Determine feature. */
> +      token_len = token_end - token;
> +      if (*token == '-')
> +        {
> +          disable = true;
> +          feature = token + 1;
> +          feature_len = token_len - 1;
> +        }
> +      else
> +        {
> +          disable = false;
> +          feature = token;
> +          feature_len = token_len;
> +        }
> +      for (i = 0; i < tun_count; ++i)
> +        {
> +          const char *hwcap_name = hwcap_names + offset;
> +          /* Check the tunable name on the supported list. */
> +          if (STRLEN_DEFAULT (hwcap_name) == feature_len
> +              && MEMCMP_DEFAULT (feature, hwcap_name, feature_len)
> +                 == 0)
> +            {
> +              /* Update the hwcap and hwcap2 bits. */
> +              if (disable)
> +                {
> +                  /* Id is 1 for hwcap2 tunable. */
> +                  if (hwcap_tunables[i].id)
> +                    cpu_features->hwcap2 &= ~(hwcap_tunables[i].mask);
> +                  else
> +                    cpu_features->hwcap &= ~(hwcap_tunables[i].mask);
> +                }
> +              else
> +                {
> +                  /* Enable the features and also checking that no unsupported
> +                     features were enabled by user. */
> +                  if (hwcap_tunables[i].id)
> +                    cpu_features->hwcap2 |= (tcbv_hwcap2 & hwcap_tunables[i].mask);
> +                  else
> +                    cpu_features->hwcap |= (tcbv_hwcap & hwcap_tunables[i].mask);
> +                }
> +              break;
> +            }
> +          offset = offset + STRLEN_DEFAULT (hwcap_name) + 1;
> +        }
> +      token += token_len;
> +      /* ... and skip token separator for next round. */
> +      if (*token == ',') token++;
> +    }
> +  while (*token != '\0');
> +}
>
>  static inline void
> -init_cpu_features (struct cpu_features *cpu_features)
> +init_cpu_features (struct cpu_features *cpu_features, uint64_t hwcaps[])
>  {
> +  /* Fill the cpu_features with the supported hwcaps
> +     which are set by __tcb_parse_hwcap_and_convert_at_platform. */
> +  cpu_features->hwcap = hwcaps[0];
> +  cpu_features->hwcap2 = hwcaps[1];
>    /* Default is to use aligned memory access on optimized function unless
>       tunables is enable, since for this case user can explicit disable
>       unaligned optimizations. */
>    int32_t cached_memfunc = TUNABLE_GET (glibc, cpu, cached_memopt, int32_t,
>                                          NULL);
>    cpu_features->use_cached_memopt = (cached_memfunc > 0);
> +  TUNABLE_GET (glibc, cpu, hwcaps, tunable_val_t *,
> +               TUNABLE_CALLBACK (set_hwcaps));
>  }
> diff --git a/sysdeps/powerpc/cpu-features.h b/sysdeps/powerpc/cpu-features.h
> index d316dc3d64..e5fce88e5e 100644
> --- a/sysdeps/powerpc/cpu-features.h
> +++ b/sysdeps/powerpc/cpu-features.h
> @@ -19,10 +19,112 @@
>  # define __CPU_FEATURES_POWERPC_H
>
>  #include
> +#include
>
>  struct cpu_features
>  {
>    bool use_cached_memopt;
> +  unsigned long int hwcap;
> +  unsigned long int hwcap2;
> +};
> +
> +static const char hwcap_names[] = {
> +  "4xxmac\0"
> +  "altivec\0"
> +  "arch_2_05\0"
> +  "arch_2_06\0"
> +  "archpmu\0"
> +  "booke\0"
> +  "cellbe\0"
> +  "dfp\0"
> +  "efpdouble\0"
> +  "efpsingle\0"
> +  "fpu\0"
> +  "ic_snoop\0"
> +  "mmu\0"
> +  "notb\0"
> +  "pa6t\0"
> +  "power4\0"
> +  "power5\0"
> +  "power5+\0"
> +  "power6x\0"
> +  "ppc32\0"
> +  "ppc601\0"
> +  "ppc64\0"
> +  "ppcle\0"
> +  "smt\0"
> +  "spe\0"
> +  "true_le\0"
> +  "ucache\0"
> +  "vsx\0"
> +  "arch_2_07\0"
> +  "dscr\0"
> +  "ebb\0"
> +  "htm\0"
> +  "htm-nosc\0"
> +  "htm-no-suspend\0"
> +  "isel\0"
> +  "tar\0"
> +  "vcrypto\0"
> +  "arch_3_00\0"
> +  "ieee128\0"
> +  "darn\0"
> +  "scv\0"
> +  "arch_3_1\0"
> +  "mma\0"
> +};
> +
> +static const struct
> +{
> +  unsigned int mask;
> +  bool id;
> +} hwcap_tunables[] = {
> +  /* AT_HWCAP tunable masks. */
> +  { PPC_FEATURE_HAS_4xxMAC, 0 },
> +  { PPC_FEATURE_HAS_ALTIVEC, 0 },
> +  { PPC_FEATURE_ARCH_2_05, 0 },
> +  { PPC_FEATURE_ARCH_2_06, 0 },
> +  { PPC_FEATURE_PSERIES_PERFMON_COMPAT, 0 },
> +  { PPC_FEATURE_BOOKE, 0 },
> +  { PPC_FEATURE_CELL_BE, 0 },
> +  { PPC_FEATURE_HAS_DFP, 0 },
> +  { PPC_FEATURE_HAS_EFP_DOUBLE, 0 },
> +  { PPC_FEATURE_HAS_EFP_SINGLE, 0 },
> +  { PPC_FEATURE_HAS_FPU, 0 },
> +  { PPC_FEATURE_ICACHE_SNOOP, 0 },
> +  { PPC_FEATURE_HAS_MMU, 0 },
> +  { PPC_FEATURE_NO_TB, 0 },
> +  { PPC_FEATURE_PA6T, 0 },
> +  { PPC_FEATURE_POWER4, 0 },
> +  { PPC_FEATURE_POWER5, 0 },
> +  { PPC_FEATURE_POWER5_PLUS, 0 },
> +  { PPC_FEATURE_POWER6_EXT, 0 },
> +  { PPC_FEATURE_32, 0 },
> +  { PPC_FEATURE_601_INSTR, 0 },
> +  { PPC_FEATURE_64, 0 },
> +  { PPC_FEATURE_PPC_LE, 0 },
> +  { PPC_FEATURE_SMT, 0 },
> +  { PPC_FEATURE_HAS_SPE, 0 },
> +  { PPC_FEATURE_TRUE_LE, 0 },
> +  { PPC_FEATURE_UNIFIED_CACHE, 0 },
> +  { PPC_FEATURE_HAS_VSX, 0 },
> +
> +  /* AT_HWCAP2 tunable masks. */
> +  { PPC_FEATURE2_ARCH_2_07, 1 },
> +  { PPC_FEATURE2_HAS_DSCR, 1 },
> +  { PPC_FEATURE2_HAS_EBB, 1 },
> +  { PPC_FEATURE2_HAS_HTM, 1 },
> +  { PPC_FEATURE2_HTM_NOSC, 1 },
> +  { PPC_FEATURE2_HTM_NO_SUSPEND, 1 },
> +  { PPC_FEATURE2_HAS_ISEL, 1 },
> +  { PPC_FEATURE2_HAS_TAR, 1 },
> +  { PPC_FEATURE2_HAS_VEC_CRYPTO, 1 },
> +  { PPC_FEATURE2_ARCH_3_00, 1 },
> +  { PPC_FEATURE2_HAS_IEEE128, 1 },
> +  { PPC_FEATURE2_DARN, 1 },
> +  { PPC_FEATURE2_SCV, 1 },
> +  { PPC_FEATURE2_ARCH_3_1, 1 },
> +  { PPC_FEATURE2_MMA, 1 },
>  };
>
>  #endif /* __CPU_FEATURES_H */
> diff --git a/sysdeps/powerpc/dl-tunables.list b/sysdeps/powerpc/dl-tunables.list
> index 87d6235c75..807b7f8013 100644
> --- a/sysdeps/powerpc/dl-tunables.list
> +++ b/sysdeps/powerpc/dl-tunables.list
> @@ -24,5 +24,8 @@ glibc {
>        maxval: 1
>        default: 0
>      }
> +    hwcaps {
> +      type: STRING
> +    }
>    }
>  }
> diff --git a/sysdeps/powerpc/hwcapinfo.c b/sysdeps/powerpc/hwcapinfo.c
> index e26e64d99e..f2c473c556 100644
> --- a/sysdeps/powerpc/hwcapinfo.c
> +++ b/sysdeps/powerpc/hwcapinfo.c
> @@ -19,6 +19,7 @@
>  #include
>  #include
>  #include
> +#include
>
>  tcbhead_t __tcb __attribute__ ((visibility ("hidden")));
>
> @@ -63,6 +64,9 @@ __tcb_parse_hwcap_and_convert_at_platform (void)
>    else if (h1 & PPC_FEATURE_POWER5)
>      h1 |= PPC_FEATURE_POWER4;
>
> +  uint64_t array_hwcaps[] = { h1, h2 };
> +  init_cpu_features (&GLRO(dl_powerpc_cpu_features), array_hwcaps);
> +
>    /* Consolidate both HWCAP and HWCAP2 into a single doubleword so that
>       we can read both in a single load later. */
>    __tcb.hwcap = (h1 << 32) | (h2 & 0xffffffff);
> diff --git a/sysdeps/powerpc/powerpc32/power4/multiarch/ifunc-impl-list.c b/sysdeps/powerpc/powerpc32/power4/multiarch/ifunc-impl-list.c
> index b4f80539e7..986c37d71e 100644
> --- a/sysdeps/powerpc/powerpc32/power4/multiarch/ifunc-impl-list.c
> +++ b/sysdeps/powerpc/powerpc32/power4/multiarch/ifunc-impl-list.c
> @@ -21,6 +21,7 @@
>  #include
>  #include
>  #include
> +#include
>
>  size_t
>  __libc_ifunc_impl_list (const char *name, struct libc_ifunc_impl *array,
> @@ -28,7 +29,8 @@ __libc_ifunc_impl_list (const char *name, struct libc_ifunc_impl *array,
>  {
>    size_t i = max;
>
> -  unsigned long int hwcap = GLRO(dl_hwcap);
> +  const struct cpu_features *features = &GLRO(dl_powerpc_cpu_features);
> +  unsigned long int hwcap = features->hwcap;
>    /* hwcap contains only the latest supported ISA, the code checks which is
>       and fills the previous supported ones. */
>    if (hwcap & PPC_FEATURE_ARCH_2_06)
> diff --git a/sysdeps/powerpc/powerpc32/power4/multiarch/init-arch.h b/sysdeps/powerpc/powerpc32/power4/multiarch/init-arch.h
> index 3dd00e02ee..a0bbd12012 100644
> --- a/sysdeps/powerpc/powerpc32/power4/multiarch/init-arch.h
> +++ b/sysdeps/powerpc/powerpc32/power4/multiarch/init-arch.h
> @@ -16,6 +16,7 @@
>     . */
>
>  #include
> +#include
>
>  /* The code checks if _rtld_global_ro was realocated before trying to access
>     the dl_hwcap field. The assembly is to make the compiler not optimize the
> @@ -32,11 +33,12 @@
>  # define __GLRO(value) GLRO(value)
>  #endif
>
> -/* dl_hwcap contains only the latest supported ISA, the macro checks which is
> -   and fills the previous ones. */
> +/* Get the hardware information post the tunables set , the macro checks
> +   it and fills the previous ones. */
>  #define INIT_ARCH() \
> -  unsigned long int hwcap = __GLRO(dl_hwcap); \
> -  unsigned long int __attribute__((unused)) hwcap2 = __GLRO(dl_hwcap2); \
> +  const struct cpu_features *features = &GLRO(dl_powerpc_cpu_features); \
> +  unsigned long int hwcap = features->hwcap; \
> +  unsigned long int __attribute__((unused)) hwcap2 = features->hwcap2; \
>    bool __attribute__((unused)) use_cached_memopt = \
>      __GLRO(dl_powerpc_cpu_features.use_cached_memopt); \
>    if (hwcap & PPC_FEATURE_ARCH_2_06) \
> diff --git a/sysdeps/powerpc/powerpc64/dl-machine.h b/sysdeps/powerpc/powerpc64/dl-machine.h
> index 9b8943bc91..449208e86f 100644
> --- a/sysdeps/powerpc/powerpc64/dl-machine.h
> +++ b/sysdeps/powerpc/powerpc64/dl-machine.h
> @@ -27,7 +27,6 @@
>  #include
>  #include
>  #include
> -#include
>  #include
>  #include
>  #include
> @@ -297,7 +296,6 @@ static inline void __attribute__ ((unused))
>  dl_platform_init (void)
>  {
>    __tcb_parse_hwcap_and_convert_at_platform ();
> -  init_cpu_features (&GLRO(dl_powerpc_cpu_features));
>  }
>  #endif
>
> diff --git a/sysdeps/powerpc/powerpc64/multiarch/ifunc-impl-list.c b/sysdeps/powerpc/powerpc64/multiarch/ifunc-impl-list.c
> index ebe9434052..fc26dd0e17 100644
> --- a/sysdeps/powerpc/powerpc64/multiarch/ifunc-impl-list.c
> +++ b/sysdeps/powerpc/powerpc64/multiarch/ifunc-impl-list.c
> @@ -17,6 +17,7 @@
>     . */
>
>  #include
> +#include
>  #include
>  #include
>  #include
> @@ -27,9 +28,9 @@ __libc_ifunc_impl_list (const char *name, struct libc_ifunc_impl *array,
>                           size_t max)
>  {
>    size_t i = max;
> -
> -  unsigned long int hwcap = GLRO(dl_hwcap);
> -  unsigned long int hwcap2 = GLRO(dl_hwcap2);
> +  const struct cpu_features *features = &GLRO(dl_powerpc_cpu_features);
> +  unsigned long int hwcap = features->hwcap;
> +  unsigned long int hwcap2 = features->hwcap2;
>  #ifdef SHARED
>    int cacheline_size = GLRO(dl_cache_line_size);
>  #endif
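
As a rough illustration of the behaviour a regression test for this tunable
might pin down, here is a standalone sketch of the same "-xxx,yyy" parsing
idea: names packed in one NUL-separated string, a parallel mask table, and
enabling limited to bits that were present before parsing. It is not glibc
code and not the patch's implementation. The names demo_names, demo_tunables,
demo_caps and demo_apply_hwcaps are hypothetical, and the mask values are
invented rather than taken from the kernel's PPC_FEATURE_* definitions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, cut-down stand-ins for the patch's hwcap_names and
   hwcap_tunables tables.  The mask values are invented for this sketch
   and are NOT the kernel's PPC_FEATURE_* bits.  */
static const char demo_names[] =
  "altivec\0"
  "vsx\0"
  "arch_2_07\0"
  "arch_3_1\0";

static const struct
{
  unsigned long int mask;
  bool is_hwcap2;
} demo_tunables[] = {
  { 0x1, false },  /* altivec */
  { 0x2, false },  /* vsx */
  { 0x1, true },   /* arch_2_07 */
  { 0x2, true },   /* arch_3_1 */
};

struct demo_caps
{
  unsigned long int hwcap;
  unsigned long int hwcap2;
};

/* Apply a comma-separated list such as "-vsx,arch_3_1" to CAPS.  A leading
   '-' clears the feature bit.  A bare name sets the bit only if it was set
   in CAPS on entry, so unsupported features cannot be switched on.
   Unknown names are ignored.  */
static void
demo_apply_hwcaps (const char *str, struct demo_caps *caps)
{
  assert (str != NULL && caps != NULL);
  const struct demo_caps supported = *caps;
  const size_t count = sizeof demo_tunables / sizeof demo_tunables[0];
  const char *token = str;

  while (*token != '\0')
    {
      /* Length of the current token, up to ',' or end of string.  */
      size_t token_len = strcspn (token, ",");
      bool disable = token_len > 0 && *token == '-';
      const char *feature = token + (disable ? 1 : 0);
      size_t feature_len = token_len - (disable ? 1 : 0);
      const char *name = demo_names;

      /* Walk the packed name string in step with the mask table.  */
      for (size_t i = 0; i < count; i++, name += strlen (name) + 1)
        {
          if (strlen (name) != feature_len
              || memcmp (feature, name, feature_len) != 0)
            continue;

          bool second = demo_tunables[i].is_hwcap2;
          unsigned long int *word = second ? &caps->hwcap2 : &caps->hwcap;
          unsigned long int avail = second ? supported.hwcap2
                                           : supported.hwcap;
          if (disable)
            *word &= ~demo_tunables[i].mask;
          else
            *word |= avail & demo_tunables[i].mask;
          break;
        }

      /* Skip the token and its separator, if any.  */
      token += token_len;
      if (*token == ',')
        token++;
    }
}
```

With caps starting as hwcap = 0x3 and hwcap2 = 0x1, the string
"-vsx,arch_3_1,-arch_2_07,bogus" leaves hwcap = 0x1 and hwcap2 = 0x0 in this
sketch: vsx and arch_2_07 are cleared, arch_3_1 is not enabled because its bit
was not set on entry, and the unknown name is ignored. An actual glibc test
would presumably set GLIBC_TUNABLES and check which IFUNC variant gets
selected; that part is outside what this sketch covers.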