public inbox for libc-alpha@sourceware.org
 help / color / mirror / Atom feed
From: "H.J. Lu" <hjl.tools@gmail.com>
To: Noah Goldstein <goldstein.w.n@gmail.com>
Cc: GNU C Library <libc-alpha@sourceware.org>,
	"Carlos O'Donell" <carlos@systemhalted.org>
Subject: Re: [PATCH v4] x86: Fix backwards Prefer_No_VZEROUPPER check in ifunc-evex.h
Date: Fri, 24 Jun 2022 15:41:53 -0700	[thread overview]
Message-ID: <CAMe9rOrcv5oApJmz4+XSthg6Vm5K4JaLE0p0_LfbxerS2RXBng@mail.gmail.com> (raw)
In-Reply-To: <20220624222910.1359991-1-goldstein.w.n@gmail.com>

On Fri, Jun 24, 2022 at 3:29 PM Noah Goldstein <goldstein.w.n@gmail.com> wrote:
>
> Add third argument to X86_ISA_CPU_FEATURES_ARCH_P macro so the runtime
> CPU_FEATURES_ARCH_P check can be inverted if the
> MINIMUM_X86_ISA_LEVEL is not high enough to constantly evaluate
> the check.
>
> Use this new macro to correct the backwards check in ifunc-evex.h
> ---
>  sysdeps/x86/isa-ifunc-macros.h        | 46 +++++++++++++++++++++++----
>  sysdeps/x86/isa-level.h               | 32 ++++++++-----------
>  sysdeps/x86_64/multiarch/ifunc-evex.h |  4 +--
>  3 files changed, 56 insertions(+), 26 deletions(-)
>
> diff --git a/sysdeps/x86/isa-ifunc-macros.h b/sysdeps/x86/isa-ifunc-macros.h
> index ba6826d518..4cdc71e40e 100644
> --- a/sysdeps/x86/isa-ifunc-macros.h
> +++ b/sysdeps/x86/isa-ifunc-macros.h
> @@ -56,15 +56,49 @@
>  # define X86_IFUNC_IMPL_ADD_V1(...)
>  #endif
>
> -#define X86_ISA_CPU_FEATURE_CONST_CHECK_ENABLED(name)                  \
> -  ((name##_X86_ISA_LEVEL) <= MINIMUM_X86_ISA_LEVEL)
> +/* Both X86_ISA_CPU_FEATURE_USABLE_P and X86_ISA_CPU_FEATURES_ARCH_P
> +   macros are wrappers for the the respective
> +   CPU_FEATURE{S}_{USABLE|ARCH}_P runtime checks.  They differ in two
> +   ways.
> +
> +    1.  The USABLE_P version is evaluated to true when the feature
> +        is enabled.
> +
> +    2.  The ARCH_P version has a third argument `not`.  The `not`
> +        argument can either be '!' or empty.  If the feature is
> +        enabled above an ISA level, the third argument should be empty
> +        and the expression is evaluated to true when the feature is
> +        enabled.  If the feature is disabled above an ISA level, the
> +        third argument should be `!` and the expression is evaluated
> +        to true when the feature is disabled.
>

No need for the comments below.  These macros are self-explaining.

> +    i.e
> +
> +    X86_ISA_CPU_FEATURE_USABLE_P(p, FEATURE):
> +        A) default value of feature <= MINIMUM_X86_ISA_LEVEL   -> True
> +        B) runtime bit is set                                  -> True
> +
> +    X86_ISA_CPU_FEATURE_ARCH_P(p, FEATURE, <empty>):
> +        A) default value of feature <= MINIMUM_X86_ISA_LEVEL   -> True
> +        B) runtime bit is set                                  -> True
> +
> +    X86_ISA_CPU_FEATURE_ARCH_P(p, FEATURE, !):
> +        A) default value of feature <= MINIMUM_X86_ISA_LEVEL   -> True
> +        B) runtime bit is NOT set                              -> True
> +
> +    All cases not listed will evaluate to false.
> +
> +    Only the A) cases will constantly evaluate.
> + */
>
>  #define X86_ISA_CPU_FEATURE_USABLE_P(ptr, name)                        \
> -  (X86_ISA_CPU_FEATURE_CONST_CHECK_ENABLED (name)                      \
> +  (((name##_X86_ISA_LEVEL) <= MINIMUM_X86_ISA_LEVEL)                   \
>     || CPU_FEATURE_USABLE_P (ptr, name))
>
> -#define X86_ISA_CPU_FEATURES_ARCH_P(ptr, name)                         \
> -  (X86_ISA_CPU_FEATURE_CONST_CHECK_ENABLED (name)                      \
> -   || CPU_FEATURES_ARCH_P (ptr, name))
> +
> +#define X86_ISA_CPU_FEATURES_ARCH_P(ptr, name, not)                    \
> +  (((name##_X86_ISA_LEVEL) <= MINIMUM_X86_ISA_LEVEL)                   \
> +   || not CPU_FEATURES_ARCH_P (ptr, name))
> +
>
>  #endif
> diff --git a/sysdeps/x86/isa-level.h b/sysdeps/x86/isa-level.h
> index 7cae11c228..73937fecdc 100644
> --- a/sysdeps/x86/isa-level.h
> +++ b/sysdeps/x86/isa-level.h
> @@ -64,14 +64,8 @@
>  #define MINIMUM_X86_ISA_LEVEL                                                 \
>    (__X86_ISA_V1 + __X86_ISA_V2 + __X86_ISA_V3 + __X86_ISA_V4)
>
> -
> -/*
> - * CPU Features that are hard coded as enabled depending on ISA build
> - *   level.
> - *    - Values > 0 features are always ENABLED if:
> - *          Value >= MINIMUM_X86_ISA_LEVEL
> - */
> -
> +/* Depending on the minimum ISA level, a feature check result can be a
> +   compile-time constant.. */
>
>  /* ISA level >= 4 guaranteed includes.  */
>  #define AVX512VL_X86_ISA_LEVEL 4
> @@ -81,20 +75,22 @@
>  #define AVX2_X86_ISA_LEVEL 3
>  #define BMI2_X86_ISA_LEVEL 3
>
> -/*
> - * NB: This may not be fully assumable for ISA level >= 3. From
> - * looking over the architectures supported in cpu-features.h the
> - * following CPUs may have an issue with this being default set:
> - *      - AMD Excavator
> - */
> +/* NB: This feature is enabled when ISA level >= 3, which was disabled
> +   for the following CPUs:
> +        - AMD Excavator
> +   when ISA level < 3.  */
>  #define AVX_Fast_Unaligned_Load_X86_ISA_LEVEL 3
>
> -/*
> - * KNL (the only cpu that sets this supported in cpu-features.h)
> - * builds with ISA V1 so this shouldn't harm any architectures.
> - */
> +/* NB: This feature is disabled when ISA level >= 3, which was enabled
> +   for the following CPUs:
> +        - Intel KNL
> +   when ISA level < 3.  */
>  #define Prefer_No_VZEROUPPER_X86_ISA_LEVEL 3
>
> +
> +/* ISA level >= 2 guaranteed includes.  */
> +#define Fast_Unaligned_Load_X86_ISA_LEVEL 2
> +

These belong to a different patch.

>  #define ISA_SHOULD_BUILD(isa_build_level)                              \
>    (MINIMUM_X86_ISA_LEVEL <= (isa_build_level) && IS_IN (libc))         \
>     || defined ISA_DEFAULT_IMPL
> diff --git a/sysdeps/x86_64/multiarch/ifunc-evex.h b/sysdeps/x86_64/multiarch/ifunc-evex.h
> index 856c6261f8..310cfd269f 100644
> --- a/sysdeps/x86_64/multiarch/ifunc-evex.h
> +++ b/sysdeps/x86_64/multiarch/ifunc-evex.h
> @@ -37,7 +37,7 @@ IFUNC_SELECTOR (void)
>    if (X86_ISA_CPU_FEATURE_USABLE_P (cpu_features, AVX2)
>        && X86_ISA_CPU_FEATURE_USABLE_P (cpu_features, BMI2)
>        && X86_ISA_CPU_FEATURES_ARCH_P (cpu_features,
> -                                     AVX_Fast_Unaligned_Load))
> +                                     AVX_Fast_Unaligned_Load, ))
>      {
>        if (X86_ISA_CPU_FEATURE_USABLE_P (cpu_features, AVX512VL)
>           && X86_ISA_CPU_FEATURE_USABLE_P (cpu_features, AVX512BW))
> @@ -52,7 +52,7 @@ IFUNC_SELECTOR (void)
>         return OPTIMIZE (avx2_rtm);
>
>        if (X86_ISA_CPU_FEATURES_ARCH_P (cpu_features,
> -                                      Prefer_No_VZEROUPPER))
> +                                      Prefer_No_VZEROUPPER, !))
>         return OPTIMIZE (avx2);
>      }
>
> --
> 2.34.1
>


-- 
H.J.

  reply	other threads:[~2022-06-24 22:42 UTC|newest]

Thread overview: 24+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-06-24  6:36 [PATCH v1 1/7] x86: Align entry for memrchr to 64-bytes Noah Goldstein
2022-06-24  6:36 ` [PATCH v1 2/7] x86: Rename strstr_sse2 to strstr_generic as it uses string/strstr.c Noah Goldstein
2022-06-24  6:36 ` [PATCH v1 3/7] x86: Add macro for NOT of a cpu arch feature and improve comments Noah Goldstein
2022-06-24 14:32   ` H.J. Lu
2022-06-24 14:49     ` H.J. Lu
2022-06-24 16:43     ` Noah Goldstein
2022-06-24 20:10   ` [PATCH v2] x86: Fix backwards Prefer_No_VZEROUPPER check in ifunc-evex.h Noah Goldstein
2022-06-24 20:32     ` H.J. Lu
2022-06-24 21:26       ` Noah Goldstein
2022-06-24 21:36         ` H.J. Lu
2022-06-24 21:46   ` [PATCH v3] " Noah Goldstein
2022-06-24 22:15     ` H.J. Lu
2022-06-24 22:29       ` Noah Goldstein
2022-06-24 22:29   ` [PATCH v4] " Noah Goldstein
2022-06-24 22:41     ` H.J. Lu [this message]
2022-06-24 22:57       ` Noah Goldstein
2022-06-24 23:05         ` H.J. Lu
2022-06-24 23:16           ` Noah Goldstein
2022-06-24 23:15   ` [PATCH v5] " Noah Goldstein
2022-06-24 23:20     ` H.J. Lu
2022-06-24  6:36 ` [PATCH v1 4/7] x86: Add comment with ISA level for all targets support by GCC12.1 Noah Goldstein
2022-06-24  6:36 ` [PATCH v1 5/7] x86: Use ARCH_P_NOT to check Prefer_No_VZeroupper in ifunc-evex.h Noah Goldstein
2022-06-24  6:36 ` [PATCH v1 6/7] x86: Put wcs{n}len-sse4.1 in the sse4.1 text section Noah Goldstein
2022-06-24  6:36 ` [PATCH v1 7/7] x86: Remove unused file wmemcmp-sse4 Noah Goldstein

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=CAMe9rOrcv5oApJmz4+XSthg6Vm5K4JaLE0p0_LfbxerS2RXBng@mail.gmail.com \
    --to=hjl.tools@gmail.com \
    --cc=carlos@systemhalted.org \
    --cc=goldstein.w.n@gmail.com \
    --cc=libc-alpha@sourceware.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).