References: <20220617035050.1252784-1-goldstein.w.n@gmail.com> <20220622044759.910571-1-goldstein.w.n@gmail.com>
From: "H.J. Lu"
Date: Wed, 22 Jun 2022 08:26:23 -0700
Subject: Re: [PATCH v7 1/2] x86: Add defines / utilities for making ISA specific x86 builds
To: Noah Goldstein
Cc: GNU C Library, "Carlos O'Donell"
Content-Type: text/plain; charset="UTF-8"
List-Id: Libc-alpha mailing list

On Wed, Jun 22, 2022 at 8:13 AM Noah Goldstein wrote:
>
> On Wed, Jun 22, 2022 at 7:20 AM H.J. Lu wrote:
> >
> > On Tue, Jun 21, 2022 at 9:48 PM Noah Goldstein wrote:
> > >
> > > 1. Factor out some of the ISA level defines in isa-level.c to
> > > standalone header isa-level.h
> > >
> > > 2. Add new headers with ISA level dependent macros for handling
> > > ifuncs.
> > >
> > > Note, this file does not change any code.
> > >
> > > Tested with and without multiarch on x86_64 for ISA levels:
> > > {generic, x86-64-v2, x86-64-v3, x86-64-v4}
> > > ---
> > > sysdeps/x86/init-arch.h | 4 +-
> > > sysdeps/x86/isa-ifunc-macros.h | 111 ++++++++++++++++++++++++++++++
> > > sysdeps/x86/isa-level.c | 17 ++---
> > > sysdeps/x86/isa-level.h | 99 ++++++++++++++++++++++++++
> > > sysdeps/x86_64/isa-default-impl.h | 49 +++++++++++++
> > > 5 files changed, 267 insertions(+), 13 deletions(-)
> > > create mode 100644 sysdeps/x86/isa-ifunc-macros.h
> > > create mode 100644 sysdeps/x86/isa-level.h
> > > create mode 100644 sysdeps/x86_64/isa-default-impl.h
> > >
> > > diff --git a/sysdeps/x86/init-arch.h b/sysdeps/x86/init-arch.h
> > > index 277c15f116..a2886a2532 100644
> > > --- a/sysdeps/x86/init-arch.h
> > > +++ b/sysdeps/x86/init-arch.h
> > > @@ -19,7 +19,9 @@
> > > #include
> > > #include
> > >
> > > -#ifndef __x86_64__
> > > +#ifdef __x86_64__
> > > +# include
> > > +#else
> > > /* Due to the reordering and the other nifty extensions in i686, it is
> > > not really good to use heavily i586 optimized code on an i686. It's
> > > better to use i486 code if it isn't an i586. */
> > > diff --git a/sysdeps/x86/isa-ifunc-macros.h b/sysdeps/x86/isa-ifunc-macros.h
> > > new file mode 100644
> > > index 0000000000..2aa8fab000
> > > --- /dev/null
> > > +++ b/sysdeps/x86/isa-ifunc-macros.h
> > > @@ -0,0 +1,111 @@
> > > +/* Common ifunc selection utils
> > > + All versions must be listed in ifunc-impl-list.c.
> > > + Copyright (C) 2022 Free Software Foundation, Inc.
> > > + This file is part of the GNU C Library.
> > > +
> > > + The GNU C Library is free software; you can redistribute it and/or
> > > + modify it under the terms of the GNU Lesser General Public
> > > + License as published by the Free Software Foundation; either
> > > + version 2.1 of the License, or (at your option) any later version.
> > > +
> > > + The GNU C Library is distributed in the hope that it will be useful,
> > > + but WITHOUT ANY WARRANTY; without even the implied warranty of
> > > + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
> > > + Lesser General Public License for more details.
> > > +
> > > + You should have received a copy of the GNU Lesser General Public
> > > + License along with the GNU C Library; if not, see
> > > + . */
> > > +
> > > +#ifndef _ISA_IFUNC_MACROS_H
> > > +#define _ISA_IFUNC_MACROS_H 1
> > > +
> > > +#include
> > > +#include
> > > +#include
> > > +
> > > +/* Only include at the level of the minimum build ISA or higher. I.e
> > > + if built with ISA=V1, then include all implementations. On the
> > > + other hand if built with ISA=V3 only include V3/V4
> > > + implementations. If there is no implementation at or above the
> > > + minimum build ISA level, then include the highest ISA level
> > > + implementation. */
> > > +#if MINIMUM_X86_ISA_LEVEL <= 4
> > > +# define X86_IFUNC_IMPL_ADD_V4(...) IFUNC_IMPL_ADD (__VA_ARGS__)
> > > +# define return_X86_OPTIMIZE_V4(...) return OPTIMIZE (__VA_ARGS__)
> > > +# define return_X86_OPTIMIZE1_V4(...) return OPTIMIZE1 (__VA_ARGS__)
> > > +#endif
> > > +#if MINIMUM_X86_ISA_LEVEL <= 3
> > > +# define X86_IFUNC_IMPL_ADD_V3(...) IFUNC_IMPL_ADD (__VA_ARGS__)
> > > +# define return_X86_OPTIMIZE_V3(...) return OPTIMIZE (__VA_ARGS__)
> > > +# define return_X86_OPTIMIZE1_V3(...) return OPTIMIZE1 (__VA_ARGS__)
> > > +#endif
> > > +#if MINIMUM_X86_ISA_LEVEL <= 2
> > > +# define X86_IFUNC_IMPL_ADD_V2(...) IFUNC_IMPL_ADD (__VA_ARGS__)
> > > +# define return_X86_OPTIMIZE_V2(...) return OPTIMIZE (__VA_ARGS__)
> > > +# define return_X86_OPTIMIZE1_V2(...) return OPTIMIZE1 (__VA_ARGS__)
> > > +#endif
> > > +#if MINIMUM_X86_ISA_LEVEL <= 1
> > > +# define X86_IFUNC_IMPL_ADD_V1(...) IFUNC_IMPL_ADD (__VA_ARGS__)
> > > +# define return_X86_OPTIMIZE_V1(...) return OPTIMIZE (__VA_ARGS__)
> > > +# define return_X86_OPTIMIZE1_V1(...) return OPTIMIZE1 (__VA_ARGS__)
> > > +#endif
> > > +
> > > +#ifndef return_X86_OPTIMIZE_V4
> > > +# define X86_IFUNC_IMPL_ADD_V4(...)
> > > +# define return_X86_OPTIMIZE_V4(...) (void) (0)
> > > +# define return_X86_OPTIMIZE1_V4(...) (void) (0)
> > > +#endif
> > > +#ifndef return_X86_OPTIMIZE_V3
> > > +# define X86_IFUNC_IMPL_ADD_V3(...)
> > > +# define return_X86_OPTIMIZE_V3(...) (void) (0)
> > > +# define return_X86_OPTIMIZE1_V3(...) (void) (0)
> > > +#endif
> > > +#ifndef return_X86_OPTIMIZE_V2
> > > +# define X86_IFUNC_IMPL_ADD_V2(...)
> > > +# define return_X86_OPTIMIZE_V2(...) (void) (0)
> > > +# define return_X86_OPTIMIZE1_V2(...) (void) (0)
> > > +#endif
> > > +#ifndef return_X86_OPTIMIZE_V1
> > > +# define X86_IFUNC_IMPL_ADD_V1(...)
> > > +# define return_X86_OPTIMIZE_V1(...) (void) (0)
> > > +# define return_X86_OPTIMIZE1_V1(...) (void) (0)
> > > +#endif
> > > +
> > > +#if MINIMUM_X86_ISA_LEVEL >= 4
> > > +__errordecl (
> > > + __unreachable_isa_above_4,
> > > + "This code should be unreachable if ISA level >= 4 build ");
> > > +# define X86_ERROR_IF_REACHABLE_V4() __unreachable_isa_above_4 ();
> > > +#else
> > > +# define X86_ERROR_IF_REACHABLE_V4()
> > > +#endif
> > > +
> > > +#if MINIMUM_X86_ISA_LEVEL >= 3
> > > +__errordecl (__unreachable_isa_above_3,
> > > + "This code should be unreachable if ISA level >= 3 build");
> > > +# define X86_ERROR_IF_REACHABLE_V3() __unreachable_isa_above_3 ();
> > > +#else
> > > +# define X86_ERROR_IF_REACHABLE_V3()
> > > +#endif
> > > +
> > > +#if MINIMUM_X86_ISA_LEVEL >= 2
> > > +__errordecl (__unreachable_isa_above_2,
> > > + "This code should be unreachable if ISA level >= 2 build");
> > > +# define X86_ERROR_IF_REACHABLE_V2() __unreachable_isa_above_2 ();
> > > +#else
> > > +# define X86_ERROR_IF_REACHABLE_V2()
> > > +#endif
> >
> > No need for return_X86_OPTIMIZE nor X86_ERROR_IF_REACHABLE.
> > When the minimum ISA level is v3, we will get an undefined
> > symbol linker error if the compiler doesn't optimize out references
> > to v1 and v2 symbols.
>
> Prefer to keep both.
>
> I think in this case there is a meaningful clarity argument. If the build
> fails because of an undefined reference to sse2, it's a less meaningful
> error than if it fails on the exact attr warning.

We will only see an undefined sse2 symbol error when there is a mistake.
Developers who change IFUNC code should know why the sse2 symbol isn't
optimized out properly. These 2 macros make IFUNC code look very
different from others.

> >
> > > +#define X86_ISA_CPU_FEATURE_CONST_CHECK_ENABLED(name) \
> > > + ((name##_X86_ISA_LEVEL) <= MINIMUM_X86_ISA_LEVEL)
> > > +
> > > +#define X86_ISA_CPU_FEATURE_USABLE_P(ptr, name) \
> > > + (X86_ISA_CPU_FEATURE_CONST_CHECK_ENABLED (name) \
> > > + || CPU_FEATURE_USABLE_P (ptr, name))
> > > +
> > > +#define X86_ISA_CPU_FEATURES_ARCH_P(ptr, name) \
> > > + (X86_ISA_CPU_FEATURE_CONST_CHECK_ENABLED (name) \
> > > + || CPU_FEATURES_ARCH_P (ptr, name))
> > > +
> > > +#endif
> > > diff --git a/sysdeps/x86/isa-level.c b/sysdeps/x86/isa-level.c
> > > index 09cd72ab20..5b7a2da870 100644
> > > --- a/sysdeps/x86/isa-level.c
> > > +++ b/sysdeps/x86/isa-level.c
> > > @@ -26,38 +26,31 @@
> > > . */
> > >
> > > #include
> > > -
> > > +#include
> > > /* ELF program property for x86 ISA level. */
> > > #ifdef INCLUDE_X86_ISA_LEVEL
> > > -# if defined __SSE__ && defined __SSE2__
> > > +# if MINIMUM_X86_ISA_LEVEL >= 1
> > > /* NB: ISAs, excluding MMX, in x86-64 ISA level baseline are used. */
> > > # define ISA_BASELINE GNU_PROPERTY_X86_ISA_1_BASELINE
> > > # else
> > > # define ISA_BASELINE 0
> > > # endif
> > >
> > > -# if ISA_BASELINE && defined __GCC_HAVE_SYNC_COMPARE_AND_SWAP_16 \
> > > - && defined HAVE_X86_LAHF_SAHF && defined __POPCNT__ \
> > > - && defined __SSE3__ && defined __SSSE3__ && defined __SSE4_1__ \
> > > - && defined __SSE4_2__
> > > +# if MINIMUM_X86_ISA_LEVEL >= 2
> > > /* NB: ISAs in x86-64 ISA level v2 are used. */
> > > # define ISA_V2 GNU_PROPERTY_X86_ISA_1_V2
> > > # else
> > > # define ISA_V2 0
> > > # endif
> > >
> > > -# if ISA_V2 && defined __AVX__ && defined __AVX2__ && defined __F16C__ \
> > > - && defined __FMA__ && defined __LZCNT__ && defined HAVE_X86_MOVBE \
> > > - && defined __BMI__ && defined __BMI2__
> > > +# if MINIMUM_X86_ISA_LEVEL >= 3
> > > /* NB: ISAs in x86-64 ISA level v3 are used. */
> > > # define ISA_V3 GNU_PROPERTY_X86_ISA_1_V3
> > > # else
> > > # define ISA_V3 0
> > > # endif
> > >
> > > -# if ISA_V3 && defined __AVX512F__ && defined __AVX512BW__ \
> > > - && defined __AVX512CD__ && defined __AVX512DQ__ \
> > > - && defined __AVX512VL__
> > > +# if MINIMUM_X86_ISA_LEVEL >= 4
> > > /* NB: ISAs in x86-64 ISA level v4 are used. */
> > > # define ISA_V4 GNU_PROPERTY_X86_ISA_1_V4
> > > # else
> > > diff --git a/sysdeps/x86/isa-level.h b/sysdeps/x86/isa-level.h
> > > new file mode 100644
> > > index 0000000000..21366b3132
> > > --- /dev/null
> > > +++ b/sysdeps/x86/isa-level.h
> > > @@ -0,0 +1,99 @@
> > > +/* Header defining the minimum x86 ISA level
> > > + Copyright (C) 2022 Free Software Foundation, Inc.
> > > + This file is part of the GNU C Library.
> > > +
> > > + The GNU C Library is free software; you can redistribute it and/or
> > > + modify it under the terms of the GNU Lesser General Public
> > > + License as published by the Free Software Foundation; either
> > > + version 2.1 of the License, or (at your option) any later version.
> > > +
> > > + In addition to the permissions in the GNU Lesser General Public
> > > + License, the Free Software Foundation gives you unlimited
> > > + permission to link the compiled version of this file with other
> > > + programs, and to distribute those programs without any restriction
> > > + coming from the use of this file. (The Lesser General Public
> > > + License restrictions do apply in other respects; for example, they
> > > + cover modification of the file, and distribution when not linked
> > > + into another program.)
> > > +
> > > + The GNU C Library is distributed in the hope that it will be useful,
> > > + but WITHOUT ANY WARRANTY; without even the implied warranty of
> > > + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
> > > + Lesser General Public License for more details.
> > > +
> > > + You should have received a copy of the GNU Lesser General Public
> > > + License along with the GNU C Library; if not, see
> > > + . */
> > > +
> > > +#ifndef _ISA_LEVEL_H
> > > +#define _ISA_LEVEL_H
> > > +
> > > +#if defined __SSE__ && defined __SSE2__
> > > +/* NB: ISAs, excluding MMX, in x86-64 ISA level baseline are used. */
> > > +# define __X86_ISA_V1 1
> > > +#else
> > > +# define __X86_ISA_V1 0
> > > +#endif
> > > +
> > > +#if __X86_ISA_V1 && defined __GCC_HAVE_SYNC_COMPARE_AND_SWAP_16 \
> > > + && defined HAVE_X86_LAHF_SAHF && defined __POPCNT__ && defined __SSE3__ \
> > > + && defined __SSSE3__ && defined __SSE4_1__ && defined __SSE4_2__
> > > +/* NB: ISAs in x86-64 ISA level v2 are used. */
> > > +# define __X86_ISA_V2 1
> > > +#else
> > > +# define __X86_ISA_V2 0
> > > +#endif
> > > +
> > > +#if __X86_ISA_V2 && defined __AVX__ && defined __AVX2__ && defined __F16C__ \
> > > + && defined __FMA__ && defined __LZCNT__ && defined HAVE_X86_MOVBE \
> > > + && defined __BMI__ && defined __BMI2__
> > > +/* NB: ISAs in x86-64 ISA level v3 are used. */
> > > +# define __X86_ISA_V3 1
> > > +#else
> > > +# define __X86_ISA_V3 0
> > > +#endif
> > > +
> > > +#if __X86_ISA_V3 && defined __AVX512F__ && defined __AVX512BW__ \
> > > + && defined __AVX512CD__ && defined __AVX512DQ__ && defined __AVX512VL__
> > > +/* NB: ISAs in x86-64 ISA level v4 are used. */
> > > +# define __X86_ISA_V4 1
> > > +#else
> > > +# define __X86_ISA_V4 0
> > > +#endif
> > > +
> > > +#define MINIMUM_X86_ISA_LEVEL \
> > > + (__X86_ISA_V1 + __X86_ISA_V2 + __X86_ISA_V3 + __X86_ISA_V4)
> > > +
> > > +
> > > +/*
> > > + * CPU Features that are hard coded as enabled depending on ISA build
> > > + * level.
> > > + * - Values > 0 features are always ENABLED if:
> > > + * Value >= MINIMUM_X86_ISA_LEVEL
> > > + */
> > > +
> > > +
> > > +/* ISA level >= 4 guaranteed includes. */
> > > +#define AVX512VL_X86_ISA_LEVEL 4
> > > +#define AVX512BW_X86_ISA_LEVEL 4
> > > +
> > > +/* ISA level >= 3 guaranteed includes. */
> > > +#define AVX2_X86_ISA_LEVEL 3
> > > +#define BMI2_X86_ISA_LEVEL 3
> > > +
> > > +/*
> > > + * NB: This may not be fully assumable for ISA level >= 3. From
> > > + * looking over the architectures supported in cpu-features.h the
> > > + * following CPUs may have an issue with this being default set:
> > > + * - AMD Excavator
> > > + */
> > > +#define AVX_Fast_Unaligned_Load_X86_ISA_LEVEL 3
> > > +
> > > +/*
> > > + * KNL (the only cpu that sets this supported in cpu-features.h)
> > > + * builds with ISA V1 so this shouldn't harm any architectures.
> > > + */
> > > +#define Prefer_No_VZEROUPPER_X86_ISA_LEVEL 3
> > > +
> > > +
> > > +#endif
> > > diff --git a/sysdeps/x86_64/isa-default-impl.h b/sysdeps/x86_64/isa-default-impl.h
> > > new file mode 100644
> > > index 0000000000..34634668e5
> > > --- /dev/null
> > > +++ b/sysdeps/x86_64/isa-default-impl.h
> > > @@ -0,0 +1,49 @@
> > > +/* Utility for including proper default function based on ISA level
> > > + Copyright (C) 2022 Free Software Foundation, Inc.
> > > + This file is part of the GNU C Library.
> > > +
> > > + The GNU C Library is free software; you can redistribute it and/or
> > > + modify it under the terms of the GNU Lesser General Public
> > > + License as published by the Free Software Foundation; either
> > > + version 2.1 of the License, or (at your option) any later version.
> > > +
> > > + The GNU C Library is distributed in the hope that it will be useful,
> > > + but WITHOUT ANY WARRANTY; without even the implied warranty of
> > > + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
> > > + Lesser General Public License for more details.
> > > +
> > > + You should have received a copy of the GNU Lesser General Public
> > > + License along with the GNU C Library; if not, see
> > > + . */
> > > +
> > > +#include
> > > +
> > > +#ifndef DEFAULT_IMPL_V1
> > > +# error "Must have at least ISA V1 Version"
> > > +#endif
> > > +
> > > +#ifndef DEFAULT_IMPL_V2
> > > +# define DEFAULT_IMPL_V2 DEFAULT_IMPL_V1
> > > +#endif
> > > +
> > > +#ifndef DEFAULT_IMPL_V3
> > > +# define DEFAULT_IMPL_V3 DEFAULT_IMPL_V2
> > > +#endif
> > > +
> > > +#ifndef DEFAULT_IMPL_V4
> > > +# define DEFAULT_IMPL_V4 DEFAULT_IMPL_V3
> > > +#endif
> > > +
> > > +#if MINIMUM_X86_ISA_LEVEL == 1
> > > +# define ISA_DEFAULT_IMPL DEFAULT_IMPL_V1
> > > +#elif MINIMUM_X86_ISA_LEVEL == 2
> > > +# define ISA_DEFAULT_IMPL DEFAULT_IMPL_V2
> > > +#elif MINIMUM_X86_ISA_LEVEL == 3
> > > +# define ISA_DEFAULT_IMPL DEFAULT_IMPL_V3
> > > +#elif MINIMUM_X86_ISA_LEVEL == 4
> > > +# define ISA_DEFAULT_IMPL DEFAULT_IMPL_V4
> > > +#else
> > > +# error "Unsupported ISA Level!"
> > > +#endif
> > > +
> > > +#include ISA_DEFAULT_IMPL
> > > --
> > > 2.34.1
> > >
> >
> >
> > --
> > H.J.

--
H.J.
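
For readers following the thread, the compile-time short-circuit behind
X86_ISA_CPU_FEATURE_USABLE_P in the quoted patch can be sketched in
isolation. This is only an illustration, not glibc code: every DEMO_ and
demo_ name is hypothetical, the build is assumed to target x86-64-v3 (so
the V1..V3 flags are hard-coded to 1), and demo_runtime_usable merely
stands in for the real CPU_FEATURE_USABLE_P runtime check.

```c
#include <assert.h>

/* Assumption: pretend the compiler targets x86-64-v3.  In the patch these
   flags are derived from compiler macros such as __SSE2__ and __AVX2__.  */
#define DEMO_ISA_V1 1
#define DEMO_ISA_V2 1
#define DEMO_ISA_V3 1
#define DEMO_ISA_V4 0

/* Same summation as MINIMUM_X86_ISA_LEVEL: each level flag requires the
   previous one, so the sum equals the highest level that is enabled.  */
#define DEMO_MINIMUM_ISA_LEVEL \
  (DEMO_ISA_V1 + DEMO_ISA_V2 + DEMO_ISA_V3 + DEMO_ISA_V4)

static_assert (DEMO_MINIMUM_ISA_LEVEL == 3, "sketch assumes a v3 build");

/* Per-feature levels, named like AVX2_X86_ISA_LEVEL in the patch.  */
#define AVX2_DEMO_ISA_LEVEL 3
#define AVX512VL_DEMO_ISA_LEVEL 4

/* Mirrors X86_ISA_CPU_FEATURE_CONST_CHECK_ENABLED: a constant expression
   that is true when the build level already guarantees the feature.  */
#define DEMO_CONST_CHECK_ENABLED(name) \
  ((name##_DEMO_ISA_LEVEL) <= DEMO_MINIMUM_ISA_LEVEL)

/* Hypothetical runtime probe; counts how often it is actually consulted.  */
static int demo_runtime_checks;

static int
demo_runtime_usable (int answer)
{
  demo_runtime_checks++;
  return answer;
}

/* Mirrors X86_ISA_CPU_FEATURE_USABLE_P: the constant check comes first, so
   the runtime probe is skipped when the build level implies the feature.  */
#define DEMO_FEATURE_USABLE(name, runtime_answer) \
  (DEMO_CONST_CHECK_ENABLED (name) || demo_runtime_usable (runtime_answer))

int
demo_minimum_isa_level (void)
{
  return DEMO_MINIMUM_ISA_LEVEL;
}

int
demo_avx2_usable (int runtime_answer)
{
  return DEMO_FEATURE_USABLE (AVX2, runtime_answer);
}

int
demo_avx512vl_usable (int runtime_answer)
{
  return DEMO_FEATURE_USABLE (AVX512VL, runtime_answer);
}
```

Under the v3 assumption, demo_avx2_usable reports the feature as usable
without ever calling the probe, while demo_avx512vl_usable falls through
to it. That constant folding is what lets the compiler drop lower-level
branches in a selector, which is the behaviour both sides of the
discussion above rely on.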