From: "H.J. Lu"
Date: Tue, 28 Jun 2022 11:34:34 -0700
Subject: Re: [PATCH v1] x86: Add support for building strstr with explicit ISA level
To: Noah Goldstein
Cc: GNU C Library, "Carlos O'Donell"
References: <20220628152628.17802-1-goldstein.w.n@gmail.com> <20220628152628.17802-2-goldstein.w.n@gmail.com>
List-Id: Libc-alpha mailing list

On Tue, Jun 28, 2022 at 11:24 AM Noah Goldstein wrote:
>
> On Tue, Jun 28, 2022 at 11:21 AM H.J. Lu wrote:
> >
> > On Tue, Jun 28, 2022 at 8:26 AM Noah Goldstein wrote:
> > >
> > > Small changes for this function as the generic implementation remains
> > > the same for all ISA levels.
> > >
> > > Only changes are using the X86_ISA_CPU_FEATURE{S}_{USABLE|ARCH}_P
> > > macros so that some of the checks at least can constant evaluate
> > > and some comments explaining the ISA constraints on the function.
> > > ---
> > >  sysdeps/x86_64/multiarch/ifunc-impl-list.c | 13 +++++++------
> > >  sysdeps/x86_64/multiarch/strstr.c          | 10 +++++-----
> > >  2 files changed, 12 insertions(+), 11 deletions(-)
> > >
> > > diff --git a/sysdeps/x86_64/multiarch/ifunc-impl-list.c b/sysdeps/x86_64/multiarch/ifunc-impl-list.c
> > > index 0d28319905..a1bff560bc 100644
> > > --- a/sysdeps/x86_64/multiarch/ifunc-impl-list.c
> > > +++ b/sysdeps/x86_64/multiarch/ifunc-impl-list.c
> > > @@ -620,12 +620,13 @@ __libc_ifunc_impl_list (const char *name, struct libc_ifunc_impl *array,
> > >
> > >    /* Support sysdeps/x86_64/multiarch/strstr.c.  */
> > >    IFUNC_IMPL (i, name, strstr,
> > > -             IFUNC_IMPL_ADD (array, i, strstr,
> > > -                             (CPU_FEATURE_USABLE (AVX512VL)
> > > -                              && CPU_FEATURE_USABLE (AVX512BW)
> > > -                              && CPU_FEATURE_USABLE (AVX512DQ)
> > > -                              && CPU_FEATURE_USABLE (BMI2)),
> > > -                             __strstr_avx512)
> > > +             /* All implementations of strstr are built at all ISA levels.  */
> > > +             IFUNC_IMPL_ADD (array, i, strstr,
> > > +                             (CPU_FEATURE_USABLE (AVX512VL)
> > > +                              && CPU_FEATURE_USABLE (AVX512BW)
> > > +                              && CPU_FEATURE_USABLE (AVX512DQ)
> > > +                              && CPU_FEATURE_USABLE (BMI2)),
> > > +                             __strstr_avx512)
> > >               IFUNC_IMPL_ADD (array, i, strstr, 1, __strstr_sse2_unaligned)
> > >               IFUNC_IMPL_ADD (array, i, strstr, 1, __strstr_generic))
> > >
> > > diff --git a/sysdeps/x86_64/multiarch/strstr.c b/sysdeps/x86_64/multiarch/strstr.c
> > > index 2b83199245..3f86bfa5f2 100644
> > > --- a/sysdeps/x86_64/multiarch/strstr.c
> > > +++ b/sysdeps/x86_64/multiarch/strstr.c
> > > @@ -49,13 +49,13 @@ IFUNC_SELECTOR (void)
> > >    const struct cpu_features *cpu_features = __get_cpu_features ();
> > >
> > >    if (!CPU_FEATURES_ARCH_P (cpu_features, Prefer_No_AVX512)
> > > -      && CPU_FEATURE_USABLE_P (cpu_features, AVX512VL)
> > > -      && CPU_FEATURE_USABLE_P (cpu_features, AVX512BW)
> > > -      && CPU_FEATURE_USABLE_P (cpu_features, AVX512DQ)
> > > -      && CPU_FEATURE_USABLE_P (cpu_features, BMI2))
> > > +      && X86_ISA_CPU_FEATURE_USABLE_P (cpu_features, AVX512VL)
> > > +      && X86_ISA_CPU_FEATURE_USABLE_P (cpu_features, AVX512BW)
> > > +      && X86_ISA_CPU_FEATURE_USABLE_P (cpu_features, AVX512DQ)
> > > +      && X86_ISA_CPU_FEATURE_USABLE_P (cpu_features, BMI2))
> > >      return __strstr_avx512;
> > >
> > > -  if (CPU_FEATURES_ARCH_P (cpu_features, Fast_Unaligned_Load))
> > > +  if (X86_ISA_CPU_FEATURES_ARCH_P (cpu_features, Fast_Unaligned_Load, ))
> >
> > Is Fast_Unaligned_Load set on all processors before?  If not, we should
> > revert
>
> It's set at ISA level >= 2. AFAICT the reason the bit exists is so that
> CPUs with slow sse42 can fall back on an unaligned sse2 implementation
> if it's available as opposed to the generic / often quite expensive aligned
> sse2 impl.

Is Fast_Unaligned_Load set on Zhaoxin processors?

> Example in strcmp
>
> >
> > /* Feature(s) enabled when ISA level >= 2.  */
> > #define Fast_Unaligned_Load_X86_ISA_LEVEL 2
> >
> > >      return __strstr_sse2_unaligned;
> > >
> > >    return __strstr_generic;
> > > --
> > > 2.34.1
> > >
> >
> >
> > --
> > H.J.

--
H.J.
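
For readers following the thread, here is a minimal standalone sketch of the idea under discussion: a feature check that folds to a compile-time constant when the build's minimum ISA level already guarantees the feature, and otherwise falls back to a runtime bit. This is not glibc's code. MINIMUM_ISA_LEVEL, ISA_FEATURE_P, struct cpu_info and select_strstr are hypothetical names invented for illustration, and placing AVX512VL at level 4 is an assumption. Only the level 2 for Fast_Unaligned_Load comes from the message above. The sketch also leaves out the Prefer_No_AVX512 check and the other AVX512/BMI2 feature bits.

```c
#include <stdbool.h>

/* Hypothetical build-time ISA level; a real build would derive this from
   the compiler flags.  Default to the baseline level.  */
#ifndef MINIMUM_ISA_LEVEL
# define MINIMUM_ISA_LEVEL 1
#endif

/* Level at which each feature is assumed to be guaranteed.  The value 2
   for Fast_Unaligned_Load is taken from the thread; 4 for AVX512VL is an
   assumption made for this sketch.  */
#define Fast_Unaligned_Load_ISA_LEVEL 2
#define AVX512VL_ISA_LEVEL 4

/* Hypothetical runtime feature bits, one field per feature name.  */
struct cpu_info
{
  bool Fast_Unaligned_Load;
  bool AVX512VL;
};

/* True at compile time if the build level implies the feature; otherwise
   the runtime bit decides.  When the first operand is a constant 1, the
   compiler can drop the runtime load and the branch entirely.  */
#define ISA_FEATURE_P(cpu, name) \
  (((name##_ISA_LEVEL) <= MINIMUM_ISA_LEVEL) || (cpu)->name)

/* Mirrors the shape of the selector in the patch; returns a label
   instead of a function pointer to keep the sketch self-contained.  */
static const char *
select_strstr (const struct cpu_info *cpu)
{
  if (ISA_FEATURE_P (cpu, AVX512VL))
    return "avx512";

  if (ISA_FEATURE_P (cpu, Fast_Unaligned_Load))
    return "sse2_unaligned";

  return "generic";
}
```

Built with -DMINIMUM_ISA_LEVEL=2, the second check becomes constant true, so the "generic" return is unreachable. That is why the question of whether the bit is really set on every level-2 processor matters: the runtime bit is no longer consulted at that level.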