From mboxrd@z Thu Jan 1 00:00:00 1970
From: "H.J. Lu"
Date: Tue, 21 Jun 2022 14:56:53 -0700
Subject: Re: [PATCH v3 1/2] x86: Add defines / utilities for making ISA specific x86 builds
To: Noah Goldstein
Cc: GNU C Library, "Carlos O'Donell"
References: <20220617035050.1252784-1-goldstein.w.n@gmail.com> <20220621214403.3445837-1-goldstein.w.n@gmail.com>
In-Reply-To: <20220621214403.3445837-1-goldstein.w.n@gmail.com>
MIME-Version: 1.0
Content-Type: text/plain; charset="UTF-8"
List-Id: Libc-alpha mailing list

On Tue, Jun 21, 2022 at 2:44 PM Noah Goldstein wrote:
>
> 1. Factor out some of the ISA level defines in isa-level.c to
>    standalone header isa-level.h
>
> 2. Add new headers with ISA level dependent macros for handling
>    ifuncs.
>
> Note, this file does not change any code.
>
> Tested with and without multiarch on x86_64 for ISA levels:
> {generic, x86-64-v2, x86-64-v3, x86-64-v4}
> ---
>  sysdeps/generic/ifunc-init.h         |   8 ++
>  sysdeps/x86/init-arch.h              |   5 +-
>  sysdeps/x86/isa-cpu-feature-checks.h |  55 +++++++++++++
>  sysdeps/x86/isa-ifunc-macros.h       | 117 +++++++++++++++++++++++++++
>  sysdeps/x86/isa-level.c              |  17 ++--
>  sysdeps/x86/isa-level.h              |  67 +++++++++++++++
>  sysdeps/x86_64/isa-default-impl.h    |  49 +++++++++++
>  7 files changed, 305 insertions(+), 13 deletions(-)
>  create mode 100644 sysdeps/x86/isa-cpu-feature-checks.h
>  create mode 100644 sysdeps/x86/isa-ifunc-macros.h
>  create mode 100644 sysdeps/x86/isa-level.h
>  create mode 100644 sysdeps/x86_64/isa-default-impl.h
>
> diff --git a/sysdeps/generic/ifunc-init.h b/sysdeps/generic/ifunc-init.h
> index 929e22ff5d..76f91c663c 100644
> --- a/sysdeps/generic/ifunc-init.h
> +++ b/sysdeps/generic/ifunc-init.h
> @@ -55,3 +55,11 @@
>  #define OPTIMIZE2(name) EVALUATOR2 (SYMBOL_NAME, name)
>  /* Default is to use OPTIMIZE2. */
>  #define OPTIMIZE(name) OPTIMIZE2(name)
> +
> +/* Syntactic sugar for common usage of the OPTIMIZE and OPTIMIZE1 macros
> +   respectively. */
> +#define OPTIMIZE_DECL(...) \
> +  extern __typeof (REDIRECT_NAME) OPTIMIZE (__VA_ARGS__) attribute_hidden;
> +
> +#define OPTIMIZE_DECL1(...) \
> +  extern __typeof (REDIRECT_NAME) OPTIMIZE1 (__VA_ARGS__) attribute_hidden;
> diff --git a/sysdeps/x86/init-arch.h b/sysdeps/x86/init-arch.h
> index 277c15f116..a9fb4a1975 100644
> --- a/sysdeps/x86/init-arch.h
> +++ b/sysdeps/x86/init-arch.h
> @@ -19,7 +19,10 @@
>  #include
>  #include
>
> -#ifndef __x86_64__
> +#ifdef __x86_64__
> +# include
> +# include
> +#else
>  /* Due to the reordering and the other nifty extensions in i686, it is
>     not really good to use heavily i586 optimized code on an i686. It's
>     better to use i486 code if it isn't an i586. */
> diff --git a/sysdeps/x86/isa-cpu-feature-checks.h b/sysdeps/x86/isa-cpu-feature-checks.h
> new file mode 100644
> index 0000000000..5900a04599
> --- /dev/null
> +++ b/sysdeps/x86/isa-cpu-feature-checks.h
> @@ -0,0 +1,55 @@
> +/* Common ifunc selection utils
> +   All versions must be listed in ifunc-impl-list.c.
> +   Copyright (C) 2022 Free Software Foundation, Inc.
> +   This file is part of the GNU C Library.
> +
> +   The GNU C Library is free software; you can redistribute it and/or
> +   modify it under the terms of the GNU Lesser General Public
> +   License as published by the Free Software Foundation; either
> +   version 2.1 of the License, or (at your option) any later version.
> +
> +   The GNU C Library is distributed in the hope that it will be useful,
> +   but WITHOUT ANY WARRANTY; without even the implied warranty of
> +   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
> +   Lesser General Public License for more details.
> +
> +   You should have received a copy of the GNU Lesser General Public
> +   License along with the GNU C Library; if not, see
> +   . */
> +
> +#ifndef _ISA_CPU_FEATURE_CHECKS_H
> +#define _ISA_CPU_FEATURE_CHECKS_H 1
> +
> +#include
> +
> +/* ISA level >= 4 guaranteed includes. */
> +#define X86_FEATURE_USABLE_P_AVX512VL \
> +  (MINIMUM_X86_ISA_LEVEL >= 4 || CPU_FEATURE_USABLE_P (cpu_features, AVX512VL))
>

cpu_features should be an argument. How about:

#define X86_ISA_CPU_FEATURE_USABLE_P(ptr, name) \
  (name ## _X86_ISA_LEVEL <= MINIMUM_X86_ISA_LEVEL \
   || CPU_FEATURE_USABLE_P (ptr, name))

Also, please limit lines to 72 columns.

> +#define X86_FEATURE_USABLE_P_AVX512BW \
> +  (MINIMUM_X86_ISA_LEVEL >= 4 || CPU_FEATURE_USABLE_P (cpu_features, AVX512BW))
> +
> +/* ISA level >= 3 guaranteed includes. */
> +#define X86_FEATURE_USABLE_P_AVX2 \
> +  (MINIMUM_X86_ISA_LEVEL >= 3 || CPU_FEATURE_USABLE_P (cpu_features, AVX2))
> +
> +#define X86_FEATURE_USABLE_P_BMI2 \
> +  (MINIMUM_X86_ISA_LEVEL >= 3 || CPU_FEATURE_USABLE_P (cpu_features, BMI2))
> +
> +/*
> + * NB: This may not be fully assumable for ISA level >= 3. From looking over
> + * the architectures supported in cpu-features.h the following CPUs may have an
> + * issue with this being default set:
> + * - AMD Excavator
> + */
> +#define X86_FEATURE_ARCH_P_AVX_Fast_Unaligned_Load \
> +  (MINIMUM_X86_ISA_LEVEL >= 3 \
> +   || CPU_FEATURES_ARCH_P (cpu_features, AVX_Fast_Unaligned_Load))
> +
> +/* ISA independent non-guaranteed includes. */
> +#define X86_FEATURE_USABLE_P_RTM CPU_FEATURE_USABLE_P (cpu_features, RTM)
> +
> +#define X86_FEATURE_ARCH_P_Prefer_No_VZEROUPPER \
> +  CPU_FEATURES_ARCH_P (cpu_features, Prefer_No_VZEROUPPER)
> +
> +#endif
> diff --git a/sysdeps/x86/isa-ifunc-macros.h b/sysdeps/x86/isa-ifunc-macros.h
> new file mode 100644
> index 0000000000..69895e26ca
> --- /dev/null
> +++ b/sysdeps/x86/isa-ifunc-macros.h
> @@ -0,0 +1,117 @@
> +/* Common ifunc selection utils
> +   All versions must be listed in ifunc-impl-list.c.
> +   Copyright (C) 2022 Free Software Foundation, Inc.
> +   This file is part of the GNU C Library.
> +
> +   The GNU C Library is free software; you can redistribute it and/or
> +   modify it under the terms of the GNU Lesser General Public
> +   License as published by the Free Software Foundation; either
> +   version 2.1 of the License, or (at your option) any later version.
> +
> +   The GNU C Library is distributed in the hope that it will be useful,
> +   but WITHOUT ANY WARRANTY; without even the implied warranty of
> +   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
> +   Lesser General Public License for more details.
> +
> +   You should have received a copy of the GNU Lesser General Public
> +   License along with the GNU C Library; if not, see
> +   . */
> +
> +#ifndef _ISA_IFUNC_MACROS_H
> +#define _ISA_IFUNC_MACROS_H 1
> +
> +#include
> +#include
> +
> +/* Only include at the level of the minimum build ISA or higher. I.e
> +   if built with ISA=V1, then include all implementations. On the
> +   other hand if built with ISA=V3 only include V3/V4
> +   implementations. If there is no implementation at or above the
> +   minimum build ISA level, then include the highest ISA level
> +   implementation. */
> +#if MINIMUM_X86_ISA_LEVEL <= 4
> +# define X86_IFUNC_IMPL_ADD_V4(...) IFUNC_IMPL_ADD (__VA_ARGS__)
> +# define return_X86_OPTIMIZE_V4(...) return OPTIMIZE (__VA_ARGS__)
> +# define return_X86_OPTIMIZE1_V4(...) return OPTIMIZE1 (__VA_ARGS__)
> +#endif
> +#if MINIMUM_X86_ISA_LEVEL <= 3
> +# define X86_IFUNC_IMPL_ADD_V3(...) IFUNC_IMPL_ADD (__VA_ARGS__)
> +# define return_X86_OPTIMIZE_V3(...) return OPTIMIZE (__VA_ARGS__)
> +# define return_X86_OPTIMIZE1_V3(...) return OPTIMIZE1 (__VA_ARGS__)
> +#endif
> +#if MINIMUM_X86_ISA_LEVEL <= 2
> +# define X86_IFUNC_IMPL_ADD_V2(...) IFUNC_IMPL_ADD (__VA_ARGS__)
> +# define return_X86_OPTIMIZE_V2(...) return OPTIMIZE (__VA_ARGS__)
> +# define return_X86_OPTIMIZE1_V2(...) return OPTIMIZE1 (__VA_ARGS__)
> +#endif
> +#if MINIMUM_X86_ISA_LEVEL <= 1
> +# define X86_IFUNC_IMPL_ADD_V1(...) IFUNC_IMPL_ADD (__VA_ARGS__)
> +# define return_X86_OPTIMIZE_V1(...) return OPTIMIZE (__VA_ARGS__)
> +# define return_X86_OPTIMIZE1_V1(...) return OPTIMIZE1 (__VA_ARGS__)
> +#endif
> +
> +#ifndef return_X86_OPTIMIZE_V4
> +# define X86_IFUNC_IMPL_ADD_V4(...)
> +# define return_X86_OPTIMIZE_V4(...) (void) (0)
> +# define return_X86_OPTIMIZE1_V4(...) (void) (0)
> +#endif
> +#ifndef return_X86_OPTIMIZE_V3
> +# define X86_IFUNC_IMPL_ADD_V3(...)
> +# define return_X86_OPTIMIZE_V3(...) (void) (0)
> +# define return_X86_OPTIMIZE1_V3(...) (void) (0)
> +#endif
> +#ifndef return_X86_OPTIMIZE_V2
> +# define X86_IFUNC_IMPL_ADD_V2(...)
> +# define return_X86_OPTIMIZE_V2(...) (void) (0)
> +# define return_X86_OPTIMIZE1_V2(...) (void) (0)
> +#endif
> +#ifndef return_X86_OPTIMIZE_V1
> +# define X86_IFUNC_IMPL_ADD_V1(...)
> +# define return_X86_OPTIMIZE_V1(...) (void) (0)
> +# define return_X86_OPTIMIZE1_V1(...) (void) (0)
> +#endif
> +
> +#if MINIMUM_X86_ISA_LEVEL == 1
> +# define X86_OPTIMIZE_FALLBACK(v1, ...) OPTIMIZE (v1)
> +#elif MINIMUM_X86_ISA_LEVEL == 2
> +# define X86_OPTIMIZE_FALLBACK(v1, v2, ...) OPTIMIZE (v2)
> +#elif MINIMUM_X86_ISA_LEVEL == 3
> +# define X86_OPTIMIZE_FALLBACK(v1, v2, v3, ...) OPTIMIZE (v3)
> +#elif MINIMUM_X86_ISA_LEVEL == 4
> +# define X86_OPTIMIZE_FALLBACK(v1, v2, v3, v4) OPTIMIZE (v4)
> +#else
> +# error "Unsupported ISA Level"
> +#endif
> +
> +
> +#if MINIMUM_X86_ISA_LEVEL >= 4
> +__errordecl (__unreachable_isa_above_4,
> +             "This code should be unreachable if ISA level >= 4 build ");
> +# define X86_ERROR_IF_REACHABLE_V4() \
> +  __unreachable_isa_above_4 (); \
> +  __builtin_unreachable ();
> +#else
> +# define X86_ERROR_IF_REACHABLE_V4()
> +#endif
> +
> +#if MINIMUM_X86_ISA_LEVEL >= 3
> +__errordecl (__unreachable_isa_above_3,
> +             "This code should be unreachable if ISA level >= 3 build");
> +# define X86_ERROR_IF_REACHABLE_V3() \
> +  __unreachable_isa_above_3 (); \
> +  __builtin_unreachable ();
> +#else
> +# define X86_ERROR_IF_REACHABLE_V3()
> +#endif
> +
> +#if MINIMUM_X86_ISA_LEVEL >= 2
> +__errordecl (__unreachable_isa_above_2,
> +             "This code should be unreachable if ISA level >= 2 build");
> +# define X86_ERROR_IF_REACHABLE_V2() \
> +  __unreachable_isa_above_2 (); \
> +  __builtin_unreachable ();
> +#else
> +# define X86_ERROR_IF_REACHABLE_V2()
> +#endif
> +
> +#endif
> diff --git a/sysdeps/x86/isa-level.c b/sysdeps/x86/isa-level.c
> index 09cd72ab20..5b7a2da870 100644
> --- a/sysdeps/x86/isa-level.c
> +++ b/sysdeps/x86/isa-level.c
> @@ -26,38 +26,31 @@
>     . */
>
>  #include
> -
> +#include
>  /* ELF program property for x86 ISA level. */
>  #ifdef INCLUDE_X86_ISA_LEVEL
> -# if defined __SSE__ && defined __SSE2__
> +# if MINIMUM_X86_ISA_LEVEL >= 1
>  /* NB: ISAs, excluding MMX, in x86-64 ISA level baseline are used. */
>  # define ISA_BASELINE GNU_PROPERTY_X86_ISA_1_BASELINE
>  # else
>  # define ISA_BASELINE 0
>  # endif
>
> -# if ISA_BASELINE && defined __GCC_HAVE_SYNC_COMPARE_AND_SWAP_16 \
> -     && defined HAVE_X86_LAHF_SAHF && defined __POPCNT__ \
> -     && defined __SSE3__ && defined __SSSE3__ && defined __SSE4_1__ \
> -     && defined __SSE4_2__
> +# if MINIMUM_X86_ISA_LEVEL >= 2
>  /* NB: ISAs in x86-64 ISA level v2 are used. */
>  # define ISA_V2 GNU_PROPERTY_X86_ISA_1_V2
>  # else
>  # define ISA_V2 0
>  # endif
>
> -# if ISA_V2 && defined __AVX__ && defined __AVX2__ && defined __F16C__ \
> -     && defined __FMA__ && defined __LZCNT__ && defined HAVE_X86_MOVBE \
> -     && defined __BMI__ && defined __BMI2__
> +# if MINIMUM_X86_ISA_LEVEL >= 3
>  /* NB: ISAs in x86-64 ISA level v3 are used. */
>  # define ISA_V3 GNU_PROPERTY_X86_ISA_1_V3
>  # else
>  # define ISA_V3 0
>  # endif
>
> -# if ISA_V3 && defined __AVX512F__ && defined __AVX512BW__ \
> -     && defined __AVX512CD__ && defined __AVX512DQ__ \
> -     && defined __AVX512VL__
> +# if MINIMUM_X86_ISA_LEVEL >= 4
>  /* NB: ISAs in x86-64 ISA level v4 are used. */
>  # define ISA_V4 GNU_PROPERTY_X86_ISA_1_V4
>  # else
> diff --git a/sysdeps/x86/isa-level.h b/sysdeps/x86/isa-level.h
> new file mode 100644
> index 0000000000..33dec72bde
> --- /dev/null
> +++ b/sysdeps/x86/isa-level.h
> @@ -0,0 +1,67 @@
> +/* Header defining the minimum x86 ISA level
> +   Copyright (C) 2022 Free Software Foundation, Inc.
> +   This file is part of the GNU C Library.
> +
> +   The GNU C Library is free software; you can redistribute it and/or
> +   modify it under the terms of the GNU Lesser General Public
> +   License as published by the Free Software Foundation; either
> +   version 2.1 of the License, or (at your option) any later version.
> +
> +   In addition to the permissions in the GNU Lesser General Public
> +   License, the Free Software Foundation gives you unlimited
> +   permission to link the compiled version of this file with other
> +   programs, and to distribute those programs without any restriction
> +   coming from the use of this file. (The Lesser General Public
> +   License restrictions do apply in other respects; for example, they
> +   cover modification of the file, and distribution when not linked
> +   into another program.)
> +
> +   The GNU C Library is distributed in the hope that it will be useful,
> +   but WITHOUT ANY WARRANTY; without even the implied warranty of
> +   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
> +   Lesser General Public License for more details.
> +
> +   You should have received a copy of the GNU Lesser General Public
> +   License along with the GNU C Library; if not, see
> +   . */
> +
> +#ifndef _ISA_LEVEL_H
> +#define _ISA_LEVEL_H
> +
> +#if defined __SSE__ && defined __SSE2__
> +/* NB: ISAs, excluding MMX, in x86-64 ISA level baseline are used. */
> +# define __X86_ISA_V1 1
> +#else
> +# define __X86_ISA_V1 0
> +#endif
> +
> +#if __X86_ISA_V1 && defined __GCC_HAVE_SYNC_COMPARE_AND_SWAP_16 \
> +    && defined HAVE_X86_LAHF_SAHF && defined __POPCNT__ && defined __SSE3__ \
> +    && defined __SSSE3__ && defined __SSE4_1__ && defined __SSE4_2__
> +/* NB: ISAs in x86-64 ISA level v2 are used. */
> +# define __X86_ISA_V2 1
> +#else
> +# define __X86_ISA_V2 0
> +#endif
> +
> +#if __X86_ISA_V2 && defined __AVX__ && defined __AVX2__ && defined __F16C__ \
> +    && defined __FMA__ && defined __LZCNT__ && defined HAVE_X86_MOVBE \
> +    && defined __BMI__ && defined __BMI2__
> +/* NB: ISAs in x86-64 ISA level v3 are used. */
> +# define __X86_ISA_V3 1
> +#else
> +# define __X86_ISA_V3 0
> +#endif
> +
> +#if __X86_ISA_V3 && defined __AVX512F__ && defined __AVX512BW__ \
> +    && defined __AVX512CD__ && defined __AVX512DQ__ && defined __AVX512VL__
> +/* NB: ISAs in x86-64 ISA level v4 are used. */
> +# define __X86_ISA_V4 1
> +#else
> +# define __X86_ISA_V4 0
> +#endif
> +
> +#define MINIMUM_X86_ISA_LEVEL \
> +  (__X86_ISA_V1 + __X86_ISA_V2 + __X86_ISA_V3 + __X86_ISA_V4)
> +
> +#endif
> diff --git a/sysdeps/x86_64/isa-default-impl.h b/sysdeps/x86_64/isa-default-impl.h
> new file mode 100644
> index 0000000000..db0635c8e7
> --- /dev/null
> +++ b/sysdeps/x86_64/isa-default-impl.h
> @@ -0,0 +1,49 @@
> +/* Utility for including proper default function based on ISA level
> +   Copyright (C) 2022 Free Software Foundation, Inc.
> +   This file is part of the GNU C Library.
> +
> +   The GNU C Library is free software; you can redistribute it and/or
> +   modify it under the terms of the GNU Lesser General Public
> +   License as published by the Free Software Foundation; either
> +   version 2.1 of the License, or (at your option) any later version.
> +
> +   The GNU C Library is distributed in the hope that it will be useful,
> +   but WITHOUT ANY WARRANTY; without even the implied warranty of
> +   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
> +   Lesser General Public License for more details.
> +
> +   You should have received a copy of the GNU Lesser General Public
> +   License along with the GNU C Library; if not, see
> +   . */
> +
> +#include
> +
> +#ifndef DEFAULT_IMPL_V1
> +# error "Must have at least ISA V1 Version"
> +#endif
> +
> +#ifndef DEFAULT_IMPL_V2
> +# define DEFAULT_IMPL_V2 DEFAULT_IMPL_V1
> +#endif
> +
> +#ifndef DEFAULT_IMPL_V3
> +# define DEFAULT_IMPL_V3 DEFAULT_IMPL_V2
> +#endif
> +
> +#ifndef DEFAULT_IMPL_V4
> +# define DEFAULT_IMPL_V4 DEFAULT_IMPL_V3
> +#endif
> +
> +#if MINIMUM_X86_ISA_LEVEL == 1
> +# define ISA_DEFAULT_IMPL DEFAULT_IMPL_V1
> +#elif MINIMUM_X86_ISA_LEVEL == 2
> +# define ISA_DEFAULT_IMPL DEFAULT_IMPL_V2
> +#elif MINIMUM_X86_ISA_LEVEL == 3
> +# define ISA_DEFAULT_IMPL DEFAULT_IMPL_V3
> +#elif MINIMUM_X86_ISA_LEVEL == 4
> +# define ISA_DEFAULT_IMPL DEFAULT_IMPL_V4
> +#else
> +# error "Unsupport ISA Level!"
> +#endif
> +
> +#include ISA_DEFAULT_IMPL
> --
> 2.34.1
>

--
H.J.
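
For illustration only, here is a minimal standalone sketch of the argument-taking macro suggested in the review comment. Everything around the macro is an assumption made so the snippet compiles on its own: the `struct cpu_features` layout, the simplified `CPU_FEATURE_USABLE_P`, the per-feature `AVX2_X86_ISA_LEVEL` / `AVX512VL_X86_ISA_LEVEL` values, the hard-coded build level of 3, and the two helper functions are hypothetical stand-ins, not glibc's definitions.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for glibc's struct cpu_features.  */
struct cpu_features
{
  bool usable_AVX2;
  bool usable_AVX512VL;
};

/* Assumed build-time ISA level; the patch derives the real value
   from compiler-defined macros in isa-level.h.  */
#ifndef MINIMUM_X86_ISA_LEVEL
# define MINIMUM_X86_ISA_LEVEL 3
#endif

/* Assumed table: lowest ISA level that guarantees each feature.  */
#define AVX2_X86_ISA_LEVEL 3
#define AVX512VL_X86_ISA_LEVEL 4

/* Simplified stand-in for the real runtime feature check.  */
#define CPU_FEATURE_USABLE_P(ptr, name) ((ptr)->usable_##name)

/* The feature is usable if the build's minimum ISA level already
   guarantees it; otherwise fall back to the runtime check.  The
   cpu_features pointer is an explicit argument.  */
#define X86_ISA_CPU_FEATURE_USABLE_P(ptr, name) \
  (name##_X86_ISA_LEVEL <= MINIMUM_X86_ISA_LEVEL \
   || CPU_FEATURE_USABLE_P (ptr, name))

/* Hypothetical helpers that exercise the macro.  */
bool
avx2_usable (const struct cpu_features *cpu_features)
{
  return X86_ISA_CPU_FEATURE_USABLE_P (cpu_features, AVX2);
}

bool
avx512vl_usable (const struct cpu_features *cpu_features)
{
  return X86_ISA_CPU_FEATURE_USABLE_P (cpu_features, AVX512VL);
}
```

With the assumed build level of 3, the AVX2 check folds to a compile-time constant, while the AVX512VL check still consults the runtime structure. Pasting `name` onto `_X86_ISA_LEVEL` keeps one generic macro instead of one hand-written macro per feature.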