From: Noah Goldstein
Date: Sun, 2 Oct 2022 17:08:54 -0400
Subject: Re: [PATCH v2 5/6] x86-64: Require BMI1/BMI2 for AVX2 strrchr and wcsrchr implementations
To: Aurelien Jarno
Cc: libc-alpha@sourceware.org, "H . J . Lu", Sunil K Pandey
In-Reply-To: <20221002123424.3079805-6-aurelien@aurel32.net>
References: <20221002123424.3079805-1-aurelien@aurel32.net> <20221002123424.3079805-6-aurelien@aurel32.net>

On Sun, Oct 2, 2022 at 8:34 AM Aurelien Jarno wrote:
>
> The AVX2 strrchr and wcsrchr implementation uses the 'blsmsk'
> instruction which belongs to the BMI1 CPU feature and the 'shrx'
> instruction, which belongs to the BMI2 CPU feature.
>
> Fixes: df7e295d18ff ("x86: Optimize {str|wcs}rchr-avx2")
> Partially resolves: BZ #29611
> ---
>  sysdeps/x86/isa-level.h                    |  1 +
>  sysdeps/x86_64/multiarch/ifunc-avx2.h      |  1 +
>  sysdeps/x86_64/multiarch/ifunc-impl-list.c | 17 ++++++++++++++---
>  3 files changed, 16 insertions(+), 3 deletions(-)
>
> diff --git a/sysdeps/x86/isa-level.h b/sysdeps/x86/isa-level.h
> index bbb90f5c5e..06f6c9663e 100644
> --- a/sysdeps/x86/isa-level.h
> +++ b/sysdeps/x86/isa-level.h
> @@ -79,6 +79,7 @@
>  /* ISA level >= 3 guaranteed includes.  */
>  #define AVX_X86_ISA_LEVEL 3
>  #define AVX2_X86_ISA_LEVEL 3
> +#define BMI1_X86_ISA_LEVEL 3
>  #define BMI2_X86_ISA_LEVEL 3
>  #define LZCNT_X86_ISA_LEVEL 3
>  #define MOVBE_X86_ISA_LEVEL 3
> diff --git a/sysdeps/x86_64/multiarch/ifunc-avx2.h b/sysdeps/x86_64/multiarch/ifunc-avx2.h
> index f1741083fd..f2f5e8a211 100644
> --- a/sysdeps/x86_64/multiarch/ifunc-avx2.h
> +++ b/sysdeps/x86_64/multiarch/ifunc-avx2.h
> @@ -36,6 +36,7 @@ IFUNC_SELECTOR (void)
>    const struct cpu_features *cpu_features = __get_cpu_features ();
>
>    if (X86_ISA_CPU_FEATURE_USABLE_P (cpu_features, AVX2)
> +      && X86_ISA_CPU_FEATURE_USABLE_P (cpu_features, BMI1)
>        && X86_ISA_CPU_FEATURE_USABLE_P (cpu_features, BMI2)
>        && X86_ISA_CPU_FEATURE_USABLE_P (cpu_features, LZCNT)
>        && X86_ISA_CPU_FEATURES_ARCH_P (cpu_features,
> diff --git a/sysdeps/x86_64/multiarch/ifunc-impl-list.c b/sysdeps/x86_64/multiarch/ifunc-impl-list.c
> index 4ee28c99bd..1c8afa229f 100644
> --- a/sysdeps/x86_64/multiarch/ifunc-impl-list.c
> +++ b/sysdeps/x86_64/multiarch/ifunc-impl-list.c
> @@ -575,13 +575,19 @@ __libc_ifunc_impl_list (const char *name, struct libc_ifunc_impl *array,
>    IFUNC_IMPL (i, name, strrchr,
>                X86_IFUNC_IMPL_ADD_V4 (array, i, strrchr,
>                                       (CPU_FEATURE_USABLE (AVX512VL)
> -                                      && CPU_FEATURE_USABLE (AVX512BW)),
> +                                      && CPU_FEATURE_USABLE (AVX512BW)
> +                                      && CPU_FEATURE_USABLE (BMI1)
> +                                      && CPU_FEATURE_USABLE (BMI2)),
>                                       __strrchr_evex)
>                X86_IFUNC_IMPL_ADD_V3 (array, i, strrchr,
> -                                     CPU_FEATURE_USABLE (AVX2),
> +                                     (CPU_FEATURE_USABLE (AVX2)
> +                                      && CPU_FEATURE_USABLE (BMI1)
> +                                      && CPU_FEATURE_USABLE (BMI2)),
>                                       __strrchr_avx2)
>                X86_IFUNC_IMPL_ADD_V3 (array, i, strrchr,
>                                       (CPU_FEATURE_USABLE (AVX2)
> +                                      && CPU_FEATURE_USABLE (BMI1)
> +                                      && CPU_FEATURE_USABLE (BMI2)
>                                        && CPU_FEATURE_USABLE (RTM)),
>                                       __strrchr_avx2_rtm)
>                /* ISA V2 wrapper for SSE2 implementation because the SSE2
> @@ -794,13 +800,18 @@ __libc_ifunc_impl_list (const char *name, struct libc_ifunc_impl *array,
>                X86_IFUNC_IMPL_ADD_V4 (array, i, wcsrchr,
>                                       (CPU_FEATURE_USABLE (AVX512VL)
>                                        && CPU_FEATURE_USABLE (AVX512BW)
> +                                      && CPU_FEATURE_USABLE (BMI1)
>                                        && CPU_FEATURE_USABLE (BMI2)),
>                                       __wcsrchr_evex)
>                X86_IFUNC_IMPL_ADD_V3 (array, i, wcsrchr,
> -                                     CPU_FEATURE_USABLE (AVX2),
> +                                     (CPU_FEATURE_USABLE (AVX2)
> +                                      && CPU_FEATURE_USABLE (BMI1)
> +                                      && CPU_FEATURE_USABLE (BMI2)),
>                                       __wcsrchr_avx2)
>                X86_IFUNC_IMPL_ADD_V3 (array, i, wcsrchr,
>                                       (CPU_FEATURE_USABLE (AVX2)
> +                                      && CPU_FEATURE_USABLE (BMI1)
> +                                      && CPU_FEATURE_USABLE (BMI2)
>                                        && CPU_FEATURE_USABLE (RTM)),
>                                       __wcsrchr_avx2_rtm)
>                /* ISA V2 wrapper for SSE2 implementation because the SSE2
> --
> 2.35.1
>

LGTM.
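
For readers following the thread, here is a minimal, standalone sketch of the
selection rule the patch tightens. It is not glibc code: the feature bits, the
enum names, and select_strrchr() are hypothetical stand-ins for struct
cpu_features and the IFUNC selector, and the real selector checks more
conditions (LZCNT, the EVEX and RTM variants, and so on). The only point is
that an implementation using 'blsmsk' and 'shrx' should be picked only when
AVX2, BMI1 and BMI2 are all usable.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical feature bits for illustration only; glibc tracks usable
   features in struct cpu_features, not in a plain bitmask like this.  */
enum cpu_feature
{
  FEAT_AVX2 = 1 << 0,
  FEAT_BMI1 = 1 << 1,
  FEAT_BMI2 = 1 << 2
};

/* Hypothetical labels for the variants a selector could return.  */
enum strrchr_impl
{
  IMPL_BASELINE,
  IMPL_AVX2
};

/* True only if every bit in REQUIRED is also set in USABLE.  */
bool
has_all (unsigned int usable, unsigned int required)
{
  return (usable & required) == required;
}

/* Simplified selector: the AVX2 variant is eligible only when AVX2,
   BMI1 and BMI2 are all usable, since it executes instructions from
   each of those feature sets.  */
enum strrchr_impl
select_strrchr (unsigned int usable)
{
  if (has_all (usable, FEAT_AVX2 | FEAT_BMI1 | FEAT_BMI2))
    return IMPL_AVX2;
  return IMPL_BASELINE;
}

/* Small self-check covering the cases the patch is concerned with.  */
void
selector_demo (void)
{
  /* AVX2 alone is no longer sufficient.  */
  assert (select_strrchr (FEAT_AVX2) == IMPL_BASELINE);
  /* Missing either BMI level also falls back.  */
  assert (select_strrchr (FEAT_AVX2 | FEAT_BMI2) == IMPL_BASELINE);
  assert (select_strrchr (FEAT_AVX2 | FEAT_BMI1) == IMPL_BASELINE);
  /* All three present: the AVX2 variant is eligible.  */
  assert (select_strrchr (FEAT_AVX2 | FEAT_BMI1 | FEAT_BMI2) == IMPL_AVX2);
}
```

With a check on AVX2 alone, the first case in selector_demo() would select the
AVX2 variant on a system that reports AVX2 without BMI1/BMI2, and that variant
would then execute instructions the system does not advertise.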