From: Noah Goldstein
Date: Mon, 3 Oct 2022 14:12:02 -0700
Subject: Re: [PATCH v3 3/8] x86-64: Require BMI2 for AVX2 strcmp implementation
To: Aurelien Jarno
Cc: libc-alpha@sourceware.org, "H . J . Lu", Sunil K Pandey
References: <20221003195944.3274548-1-aurelien@aurel32.net> <20221003195944.3274548-4-aurelien@aurel32.net>
In-Reply-To: <20221003195944.3274548-4-aurelien@aurel32.net>

On Mon, Oct 3, 2022 at 12:59 PM Aurelien Jarno wrote:
>
> The AVX2 strcmp implementation uses the 'bzhi' instruction, which
> belongs to the BMI2 CPU feature.
>
> NB: It also uses the 'tzcnt' BMI1 instruction, but it is executed as BSF
> if the CPU doesn't support TZCNT, and produces the same result for
> non-zero input.
>
> Partially fixes: b77b06e0e296 ("x86: Optimize strcmp-avx2.S")
> Partially resolves: BZ #29611
> ---
>  sysdeps/x86_64/multiarch/ifunc-impl-list.c | 4 +++-
>  sysdeps/x86_64/multiarch/strcmp.c          | 4 ++--
>  2 files changed, 5 insertions(+), 3 deletions(-)
>
> diff --git a/sysdeps/x86_64/multiarch/ifunc-impl-list.c b/sysdeps/x86_64/multiarch/ifunc-impl-list.c
> index d208fae4bf..a42b0a4620 100644
> --- a/sysdeps/x86_64/multiarch/ifunc-impl-list.c
> +++ b/sysdeps/x86_64/multiarch/ifunc-impl-list.c
> @@ -591,10 +591,12 @@ __libc_ifunc_impl_list (const char *name, struct libc_ifunc_impl *array,
>                                        && CPU_FEATURE_USABLE (BMI2)),
>                                       __strcmp_evex)
>               X86_IFUNC_IMPL_ADD_V3 (array, i, strcmp,
> -                                    CPU_FEATURE_USABLE (AVX2),
> +                                    (CPU_FEATURE_USABLE (AVX2)
> +                                     && CPU_FEATURE_USABLE (BMI2)),
>                                      __strcmp_avx2)
>               X86_IFUNC_IMPL_ADD_V3 (array, i, strcmp,
>                                      (CPU_FEATURE_USABLE (AVX2)
> +                                     && CPU_FEATURE_USABLE (BMI2)
>                                       && CPU_FEATURE_USABLE (RTM)),
>                                      __strcmp_avx2_rtm)
>               X86_IFUNC_IMPL_ADD_V2 (array, i, strcmp,
> diff --git a/sysdeps/x86_64/multiarch/strcmp.c b/sysdeps/x86_64/multiarch/strcmp.c
> index fdd5afe3af..9d6c9f66ba 100644
> --- a/sysdeps/x86_64/multiarch/strcmp.c
> +++ b/sysdeps/x86_64/multiarch/strcmp.c
> @@ -45,12 +45,12 @@ IFUNC_SELECTOR (void)
>    const struct cpu_features *cpu_features = __get_cpu_features ();
>
>    if (X86_ISA_CPU_FEATURE_USABLE_P (cpu_features, AVX2)
> +      && X86_ISA_CPU_FEATURE_USABLE_P (cpu_features, BMI2)
>        && X86_ISA_CPU_FEATURES_ARCH_P (cpu_features,
>                                        AVX_Fast_Unaligned_Load, ))
>      {
>        if (X86_ISA_CPU_FEATURE_USABLE_P (cpu_features, AVX512VL)
> -          && X86_ISA_CPU_FEATURE_USABLE_P (cpu_features, AVX512BW)
> -          && X86_ISA_CPU_FEATURE_USABLE_P (cpu_features, BMI2))
> +          && X86_ISA_CPU_FEATURE_USABLE_P (cpu_features, AVX512BW))
>         return OPTIMIZE (evex);
>
>        if (CPU_FEATURE_USABLE_P (cpu_features, RTM))
> --
> 2.35.1
>

LGTM.

Reviewed-by: Noah Goldstein
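
For readers following the thread, the asymmetry in the commit message (BMI2 must be checked for 'bzhi', while 'tzcnt' needs no BMI1 check) can be illustrated with a small portable C model of the instruction semantics. This is only a sketch: the function names (model_bzhi32, model_tzcnt32, model_bsf32) are hypothetical and are not part of glibc, the code emulates the 32-bit forms in plain C rather than executing the real instructions, and the -1 returned by model_bsf32 for zero input is just a sentinel for "no defined result".

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of 32-bit BZHI (BMI2): keep the low n bits of src,
   where n is the low byte of index.  If n >= 32, src is returned
   unchanged.  The real instruction is VEX-encoded and has no legacy
   fallback, so a CPU without BMI2 cannot execute it.  */
uint32_t
model_bzhi32 (uint32_t src, uint32_t index)
{
  uint32_t n = index & 0xff;
  if (n >= 32)
    return src;
  return src & ((UINT32_C (1) << n) - 1);
}

/* Hypothetical model of 32-bit TZCNT: number of trailing zero bits,
   defined as 32 when the input is zero.  */
unsigned int
model_tzcnt32 (uint32_t x)
{
  unsigned int n = 0;
  if (x == 0)
    return 32;
  while ((x & 1) == 0)
    {
      x >>= 1;
      n++;
    }
  return n;
}

/* Hypothetical model of 32-bit BSF: index of the lowest set bit.  The
   instruction does not give a defined result for zero input, which
   this model marks with the sentinel -1 (an assumption of the sketch,
   not hardware behaviour).  */
int
model_bsf32 (uint32_t x)
{
  int n = 0;
  if (x == 0)
    return -1;
  while ((x & 1) == 0)
    {
      x >>= 1;
      n++;
    }
  return n;
}

/* The two models agree on every nonzero input tried here, which is the
   property the commit message relies on.  */
void
model_self_check (void)
{
  for (uint32_t bit = 0; bit < 32; bit++)
    {
      uint32_t x = UINT32_C (1) << bit;
      assert (model_tzcnt32 (x) == (unsigned int) model_bsf32 (x));
    }
}
```

Calling model_self_check () exercises every single-bit input. The models differ only for zero, where model_tzcnt32 returns 32 and model_bsf32 has no defined result, which is why the "for non-zero input" qualifier in the commit message matters.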