From: Noah Goldstein
Date: Sat, 1 Oct 2022 15:06:38 -0700
Subject: Re: [PATCH 4/4] x86-64: Require LZCNT for AVX2 memrchr implementation
To: Aurelien Jarno
Cc: libc-alpha@sourceware.org, "H . J . Lu", Sunil K Pandey
In-Reply-To: <20221001190911.2994478-5-aurelien@aurel32.net>
References: <20221001190911.2994478-1-aurelien@aurel32.net> <20221001190911.2994478-5-aurelien@aurel32.net>

On Sat, Oct 1, 2022 at 12:09 PM Aurelien Jarno wrote:
>
> The AVX2 memrchr implementation uses the lzcntl and lzcntq instructions,
> which belongs to the LZCNT CPU feature.
>
> Fixes: af5306a735eb ("x86: Optimize memrchr-avx2.S")
> Partially resolves: BZ #29611
> ---
>  sysdeps/x86_64/multiarch/ifunc-avx2.h      | 1 +
>  sysdeps/x86_64/multiarch/ifunc-impl-list.c | 7 +++++--
>  2 files changed, 6 insertions(+), 2 deletions(-)
>
> diff --git a/sysdeps/x86_64/multiarch/ifunc-avx2.h b/sysdeps/x86_64/multiarch/ifunc-avx2.h
> index a57a9952f3..f1741083fd 100644
> --- a/sysdeps/x86_64/multiarch/ifunc-avx2.h
> +++ b/sysdeps/x86_64/multiarch/ifunc-avx2.h
> @@ -37,6 +37,7 @@ IFUNC_SELECTOR (void)
>
>    if (X86_ISA_CPU_FEATURE_USABLE_P (cpu_features, AVX2)
>        && X86_ISA_CPU_FEATURE_USABLE_P (cpu_features, BMI2)
> +      && X86_ISA_CPU_FEATURE_USABLE_P (cpu_features, LZCNT)
>        && X86_ISA_CPU_FEATURES_ARCH_P (cpu_features,
>                                        AVX_Fast_Unaligned_Load, ))
>      {
> diff --git a/sysdeps/x86_64/multiarch/ifunc-impl-list.c b/sysdeps/x86_64/multiarch/ifunc-impl-list.c
> index c628462d47..db5a2032d6 100644
> --- a/sysdeps/x86_64/multiarch/ifunc-impl-list.c
> +++ b/sysdeps/x86_64/multiarch/ifunc-impl-list.c
> @@ -209,13 +209,16 @@ __libc_ifunc_impl_list (const char *name, struct libc_ifunc_impl *array,
>    IFUNC_IMPL (i, name, memrchr,
>                X86_IFUNC_IMPL_ADD_V4 (array, i, memrchr,
>                                       (CPU_FEATURE_USABLE (AVX512VL)
> -                                      && CPU_FEATURE_USABLE (AVX512BW)),
> +                                      && CPU_FEATURE_USABLE (AVX512BW)
> +                                      && CPU_FEATURE_USABLE (LZCNT)),

Also needs BMI2 for the `shlx`. Likewise for avx2 versions.

>                                       __memrchr_evex)
>                X86_IFUNC_IMPL_ADD_V3 (array, i, memrchr,
> -                                     CPU_FEATURE_USABLE (AVX2),
> +                                     (CPU_FEATURE_USABLE (AVX2)
> +                                      && CPU_FEATURE_USABLE (LZCNT)),
>                                       __memrchr_avx2)
>                X86_IFUNC_IMPL_ADD_V3 (array, i, memrchr,
>                                       (CPU_FEATURE_USABLE (AVX2)
> +                                      && CPU_FEATURE_USABLE (LZCNT)
>                                        && CPU_FEATURE_USABLE (RTM)),
>                                       __memrchr_avx2_rtm)
>                /* ISA V2 wrapper for SSE2 implementation because the SSE2
> --
> 2.35.1
>
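
To illustrate the combined requirement outside of glibc, here is a minimal standalone sketch of the predicate an AVX2 memrchr entry would need once BMI2 is added alongside LZCNT. The names `x86_feats`, `detect_feats` and `memrchr_avx2_ok` are hypothetical and are not glibc APIs. The sketch reads raw CPUID bits through GCC/Clang's `<cpuid.h>` only; as far as I understand it, glibc's `CPU_FEATURE_USABLE` additionally accounts for OS support of the AVX register state, which is not modelled here.

```c
#include <stdbool.h>

#if defined(__x86_64__) || defined(__i386__)
# include <cpuid.h>
#endif

/* Hypothetical feature record; not a glibc type.  */
struct x86_feats
{
  bool avx2;
  bool bmi2;
  bool lzcnt;
};

/* Read the three CPUID bits of interest.  On non-x86 targets every
   field stays false.  This checks CPU support only, not OS support.  */
static struct x86_feats
detect_feats (void)
{
  struct x86_feats f = { false, false, false };
#if defined(__x86_64__) || defined(__i386__)
  unsigned int eax, ebx, ecx, edx;

  /* Leaf 7, subleaf 0: EBX bit 5 = AVX2, EBX bit 8 = BMI2.  */
  if (__get_cpuid_count (7, 0, &eax, &ebx, &ecx, &edx))
    {
      f.avx2 = (ebx >> 5) & 1;
      f.bmi2 = (ebx >> 8) & 1;
    }

  /* Extended leaf 0x80000001: ECX bit 5 = LZCNT.  */
  if (__get_cpuid (0x80000001, &eax, &ebx, &ecx, &edx))
    f.lzcnt = (ecx >> 5) & 1;
#endif
  return f;
}

/* The implementation uses lzcnt (LZCNT) and shlx (BMI2) in addition to
   AVX2, so all three must be present before it may be selected.  */
static bool
memrchr_avx2_ok (struct x86_feats f)
{
  return f.avx2 && f.bmi2 && f.lzcnt;
}
```

Calling `memrchr_avx2_ok (detect_feats ())` gives the answer for the running CPU; a machine with AVX2 and LZCNT but no BMI2 yields false, which is the case the review comment is about.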