From: Noah Goldstein
Date: Tue, 28 Jun 2022 11:24:07 -0700
Subject: Re: [PATCH v1] x86: Add support for building strstr with explicit ISA level
To: "H.J. Lu"
Cc: GNU C Library, "Carlos O'Donell"
References: <20220628152628.17802-1-goldstein.w.n@gmail.com> <20220628152628.17802-2-goldstein.w.n@gmail.com>

On Tue, Jun 28, 2022 at 11:21 AM H.J. Lu wrote:
>
> On Tue, Jun 28, 2022 at 8:26 AM Noah Goldstein wrote:
> >
> > Small changes for this function as the generic implementation remains
> > the same for all ISA levels.
> >
> > Only changes are using the X86_ISA_CPU_FEATURE{S}_{USABLE|ARCH}_P
> > macros so that some of the checks at least can constant evaluate
> > and some comments explaining the ISA constraints on the function.
> > ---
> >  sysdeps/x86_64/multiarch/ifunc-impl-list.c | 13 +++++++------
> >  sysdeps/x86_64/multiarch/strstr.c          | 10 +++++-----
> >  2 files changed, 12 insertions(+), 11 deletions(-)
> >
> > diff --git a/sysdeps/x86_64/multiarch/ifunc-impl-list.c b/sysdeps/x86_64/multiarch/ifunc-impl-list.c
> > index 0d28319905..a1bff560bc 100644
> > --- a/sysdeps/x86_64/multiarch/ifunc-impl-list.c
> > +++ b/sysdeps/x86_64/multiarch/ifunc-impl-list.c
> > @@ -620,12 +620,13 @@ __libc_ifunc_impl_list (const char *name, struct libc_ifunc_impl *array,
> >
> >    /* Support sysdeps/x86_64/multiarch/strstr.c. */
> >    IFUNC_IMPL (i, name, strstr,
> > -              IFUNC_IMPL_ADD (array, i, strstr,
> > -                              (CPU_FEATURE_USABLE (AVX512VL)
> > -                               && CPU_FEATURE_USABLE (AVX512BW)
> > -                               && CPU_FEATURE_USABLE (AVX512DQ)
> > -                               && CPU_FEATURE_USABLE (BMI2)),
> > -                              __strstr_avx512)
> > +              /* All implementations of strstr are built at all ISA levels. */
> > +              IFUNC_IMPL_ADD (array, i, strstr,
> > +                              (CPU_FEATURE_USABLE (AVX512VL)
> > +                               && CPU_FEATURE_USABLE (AVX512BW)
> > +                               && CPU_FEATURE_USABLE (AVX512DQ)
> > +                               && CPU_FEATURE_USABLE (BMI2)),
> > +                              __strstr_avx512)
> >                IFUNC_IMPL_ADD (array, i, strstr, 1, __strstr_sse2_unaligned)
> >                IFUNC_IMPL_ADD (array, i, strstr, 1, __strstr_generic))
> >
> > diff --git a/sysdeps/x86_64/multiarch/strstr.c b/sysdeps/x86_64/multiarch/strstr.c
> > index 2b83199245..3f86bfa5f2 100644
> > --- a/sysdeps/x86_64/multiarch/strstr.c
> > +++ b/sysdeps/x86_64/multiarch/strstr.c
> > @@ -49,13 +49,13 @@ IFUNC_SELECTOR (void)
> >    const struct cpu_features *cpu_features = __get_cpu_features ();
> >
> >    if (!CPU_FEATURES_ARCH_P (cpu_features, Prefer_No_AVX512)
> > -      && CPU_FEATURE_USABLE_P (cpu_features, AVX512VL)
> > -      && CPU_FEATURE_USABLE_P (cpu_features, AVX512BW)
> > -      && CPU_FEATURE_USABLE_P (cpu_features, AVX512DQ)
> > -      && CPU_FEATURE_USABLE_P (cpu_features, BMI2))
> > +      && X86_ISA_CPU_FEATURE_USABLE_P (cpu_features, AVX512VL)
> > +      && X86_ISA_CPU_FEATURE_USABLE_P (cpu_features, AVX512BW)
> > +      && X86_ISA_CPU_FEATURE_USABLE_P (cpu_features, AVX512DQ)
> > +      && X86_ISA_CPU_FEATURE_USABLE_P (cpu_features, BMI2))
> >      return __strstr_avx512;
> >
> > -  if (CPU_FEATURES_ARCH_P (cpu_features, Fast_Unaligned_Load))
> > +  if (X86_ISA_CPU_FEATURES_ARCH_P (cpu_features, Fast_Unaligned_Load, ))
>
> Is Fast_Unaligned_Load set on all processors before? If not, we should
> revert

It's set at ISA level >= 2.
AFAICT the reason the bit exists is so that CPUs with slow sse42 can
fall back on an unaligned sse2 implementation if one is available, as
opposed to the generic (and often quite expensive) aligned sse2 impl.
Example in strcmp.

>
> /* Feature(s) enabled when ISA level >= 2. */
> #define Fast_Unaligned_Load_X86_ISA_LEVEL 2
>
> >      return __strstr_sse2_unaligned;
> >
> >    return __strstr_generic;
> > --
> > 2.34.1
> >
>
>
> --
> H.J.
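
To make the "constant evaluate" point concrete, here is a minimal
sketch of how a check of this shape can fold at build time. It is not
the glibc definition: every SKETCH_* name, the struct, and the single
avx512_usable field (standing in for AVX512VL/BW/DQ and BMI2) are
made up for illustration. The only value taken from this thread is
that Fast_Unaligned_Load is associated with ISA level 2; the level 4
used for the AVX512 stand-in is an assumption.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical build-time baseline.  Override with
   -DSKETCH_MIN_ISA_LEVEL=N to model a different build.  */
#ifndef SKETCH_MIN_ISA_LEVEL
# define SKETCH_MIN_ISA_LEVEL 2
#endif

/* Level at which each bit is assumed to be guaranteed.  */
#define SKETCH_FAST_UNALIGNED_LOAD_LEVEL 2
#define SKETCH_AVX512_LEVEL 4

/* Hypothetical stand-in for the runtime CPU feature data.  */
struct sketch_cpu
{
  bool prefer_no_avx512;
  bool avx512_usable;
  bool fast_unaligned_load;
};

/* If the baseline already guarantees the bit, the left operand is a
   true constant and the runtime field is never read.  Otherwise the
   left operand is a false constant and the runtime field decides.  */
#define SKETCH_ISA_P(cpu, field, level) \
  ((level) <= SKETCH_MIN_ISA_LEVEL || (cpu)->field)

/* Same ordering as the selector in the patch: avx512 first, then
   sse2_unaligned, then generic.  Returns a name instead of a function
   pointer to keep the sketch short.  */
static const char *
sketch_select_strstr (const struct sketch_cpu *cpu)
{
  if (!cpu->prefer_no_avx512
      && SKETCH_ISA_P (cpu, avx512_usable, SKETCH_AVX512_LEVEL))
    return "avx512";

  if (SKETCH_ISA_P (cpu, fast_unaligned_load,
                    SKETCH_FAST_UNALIGNED_LOAD_LEVEL))
    return "sse2_unaligned";

  return "generic";
}
```

With the sketch's default baseline of 2, the second check is constant
true, so the generic return is unreachable in that build. Building
with -DSKETCH_MIN_ISA_LEVEL=1 makes the runtime fast_unaligned_load
field decide again.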