From: Noah Goldstein
Date: Mon, 3 Oct 2022 14:12:36 -0700
Subject: Re: [PATCH v3 8/8] x86-64: Require BMI1/BMI2 for AVX2 strrchr and wcsrchr implementations
To: Aurelien Jarno
Cc: libc-alpha@sourceware.org, "H . J . Lu", Sunil K Pandey
References: <20221003195944.3274548-1-aurelien@aurel32.net> <20221003195944.3274548-9-aurelien@aurel32.net>
In-Reply-To: <20221003195944.3274548-9-aurelien@aurel32.net>

On Mon, Oct 3, 2022 at 12:59 PM Aurelien Jarno wrote:
>
> The AVX2 strrchr and wcsrchr implementation uses the 'blsmsk'
> instruction which belongs to the BMI1 CPU feature and the 'shrx'
> instruction, which belongs to the BMI2 CPU feature.
>
> Fixes: df7e295d18ff ("x86: Optimize {str|wcs}rchr-avx2")
> Partially resolves: BZ #29611
> ---
>  sysdeps/x86/isa-level.h                    |  1 +
>  sysdeps/x86_64/multiarch/ifunc-avx2.h      |  1 +
>  sysdeps/x86_64/multiarch/ifunc-impl-list.c | 17 ++++++++++++++---
>  3 files changed, 16 insertions(+), 3 deletions(-)
>
> diff --git a/sysdeps/x86/isa-level.h b/sysdeps/x86/isa-level.h
> index bbb90f5c5e..06f6c9663e 100644
> --- a/sysdeps/x86/isa-level.h
> +++ b/sysdeps/x86/isa-level.h
> @@ -79,6 +79,7 @@
>  /* ISA level >= 3 guaranteed includes.  */
>  #define AVX_X86_ISA_LEVEL 3
>  #define AVX2_X86_ISA_LEVEL 3
> +#define BMI1_X86_ISA_LEVEL 3
>  #define BMI2_X86_ISA_LEVEL 3
>  #define LZCNT_X86_ISA_LEVEL 3
>  #define MOVBE_X86_ISA_LEVEL 3
> diff --git a/sysdeps/x86_64/multiarch/ifunc-avx2.h b/sysdeps/x86_64/multiarch/ifunc-avx2.h
> index f1741083fd..f2f5e8a211 100644
> --- a/sysdeps/x86_64/multiarch/ifunc-avx2.h
> +++ b/sysdeps/x86_64/multiarch/ifunc-avx2.h
> @@ -36,6 +36,7 @@ IFUNC_SELECTOR (void)
>    const struct cpu_features *cpu_features = __get_cpu_features ();
>
>    if (X86_ISA_CPU_FEATURE_USABLE_P (cpu_features, AVX2)
> +      && X86_ISA_CPU_FEATURE_USABLE_P (cpu_features, BMI1)
>        && X86_ISA_CPU_FEATURE_USABLE_P (cpu_features, BMI2)
>        && X86_ISA_CPU_FEATURE_USABLE_P (cpu_features, LZCNT)
>        && X86_ISA_CPU_FEATURES_ARCH_P (cpu_features,
> diff --git a/sysdeps/x86_64/multiarch/ifunc-impl-list.c b/sysdeps/x86_64/multiarch/ifunc-impl-list.c
> index ec1c5b55fb..00a91123d3 100644
> --- a/sysdeps/x86_64/multiarch/ifunc-impl-list.c
> +++ b/sysdeps/x86_64/multiarch/ifunc-impl-list.c
> @@ -578,13 +578,19 @@ __libc_ifunc_impl_list (const char *name, struct libc_ifunc_impl *array,
>    IFUNC_IMPL (i, name, strrchr,
>                X86_IFUNC_IMPL_ADD_V4 (array, i, strrchr,
>                                       (CPU_FEATURE_USABLE (AVX512VL)
> -                                      && CPU_FEATURE_USABLE (AVX512BW)),
> +                                      && CPU_FEATURE_USABLE (AVX512BW)
> +                                      && CPU_FEATURE_USABLE (BMI1)
> +                                      && CPU_FEATURE_USABLE (BMI2)),
>                                       __strrchr_evex)
>                X86_IFUNC_IMPL_ADD_V3 (array, i, strrchr,
> -                                     CPU_FEATURE_USABLE (AVX2),
> +                                     (CPU_FEATURE_USABLE (AVX2)
> +                                      && CPU_FEATURE_USABLE (BMI1)
> +                                      && CPU_FEATURE_USABLE (BMI2)),
>                                       __strrchr_avx2)
>                X86_IFUNC_IMPL_ADD_V3 (array, i, strrchr,
>                                       (CPU_FEATURE_USABLE (AVX2)
> +                                      && CPU_FEATURE_USABLE (BMI1)
> +                                      && CPU_FEATURE_USABLE (BMI2)
>                                        && CPU_FEATURE_USABLE (RTM)),
>                                       __strrchr_avx2_rtm)
>    /* ISA V2 wrapper for SSE2 implementation because the SSE2
> @@ -797,13 +803,18 @@ __libc_ifunc_impl_list (const char *name, struct libc_ifunc_impl *array,
>                X86_IFUNC_IMPL_ADD_V4 (array, i, wcsrchr,
>                                       (CPU_FEATURE_USABLE (AVX512VL)
>                                        && CPU_FEATURE_USABLE (AVX512BW)
> +                                      && CPU_FEATURE_USABLE (BMI1)
>                                        && CPU_FEATURE_USABLE (BMI2)),
>                                       __wcsrchr_evex)
>                X86_IFUNC_IMPL_ADD_V3 (array, i, wcsrchr,
> -                                     CPU_FEATURE_USABLE (AVX2),
> +                                     (CPU_FEATURE_USABLE (AVX2)
> +                                      && CPU_FEATURE_USABLE (BMI1)
> +                                      && CPU_FEATURE_USABLE (BMI2)),
>                                       __wcsrchr_avx2)
>                X86_IFUNC_IMPL_ADD_V3 (array, i, wcsrchr,
>                                       (CPU_FEATURE_USABLE (AVX2)
> +                                      && CPU_FEATURE_USABLE (BMI1)
> +                                      && CPU_FEATURE_USABLE (BMI2)
>                                        && CPU_FEATURE_USABLE (RTM)),
>                                       __wcsrchr_avx2_rtm)
>    /* ISA V2 wrapper for SSE2 implementation because the SSE2
> --
> 2.35.1
>

LGTM.

Reviewed-by: Noah Goldstein