From mboxrd@z Thu Jan  1 00:00:00 1970
MIME-Version: 1.0
References: <20211101125412.611713-1-hjl.tools@gmail.com> <20211101125412.611713-3-hjl.tools@gmail.com>
In-Reply-To: <20211101125412.611713-3-hjl.tools@gmail.com>
From: Sunil Pandey
Date: Fri, 22 Apr 2022 18:34:23 -0700
Subject: Re: [PATCH 2/2] x86-64: Remove Prefer_AVX2_STRCMP
To: "H.J. Lu", libc-stable@sourceware.org
Cc: GNU C Library
Content-Type: text/plain; charset="UTF-8"
List-Id: Libc-stable mailing list

On Mon, Nov 1, 2021 at 5:54 AM H.J. Lu via Libc-alpha wrote:
>
> Remove Prefer_AVX2_STRCMP to enable EVEX strcmp.  When comparing 2 32-byte
> strings, EVEX strcmp has been improved to require 1 load, 1 VPTESTM, 1
> VPCMP, 1 KMOVD and 1 INCL instead of 2 loads, 3 VPCMPs, 2 KORDs, 1 KMOVD
> and 1 TESTL while AVX2 strcmp requires 1 load, 2 VPCMPEQs, 1 VPMINU, 1
> VPMOVMSKB and 1 TESTL.  EVEX strcmp is now faster than AVX2 strcmp by up
> to 40% on Tiger Lake and Ice Lake.
> ---
>  sysdeps/x86/cpu-features.c                                | 8 --------
>  sysdeps/x86/cpu-tunables.c                                | 2 --
>  .../include/cpu-features-preferred_feature_index_1.def    | 1 -
>  sysdeps/x86_64/multiarch/strcmp.c                         | 3 +--
>  sysdeps/x86_64/multiarch/strncmp.c                        | 3 +--
>  5 files changed, 2 insertions(+), 15 deletions(-)
>
> diff --git a/sysdeps/x86/cpu-features.c b/sysdeps/x86/cpu-features.c
> index 645bba6314..be2498b2e7 100644
> --- a/sysdeps/x86/cpu-features.c
> +++ b/sysdeps/x86/cpu-features.c
> @@ -546,14 +546,6 @@ init_cpu_features (struct cpu_features *cpu_features)
>           if (CPU_FEATURE_USABLE_P (cpu_features, RTM))
>             cpu_features->preferred[index_arch_Prefer_No_VZEROUPPER]
>               |= bit_arch_Prefer_No_VZEROUPPER;
> -
> -         /* Since to compare 2 32-byte strings, 256-bit EVEX strcmp
> -            requires 2 loads, 3 VPCMPs and 2 KORDs while AVX2 strcmp
> -            requires 1 load, 2 VPCMPEQs, 1 VPMINU and 1 VPMOVMSKB,
> -            AVX2 strcmp is faster than EVEX strcmp.  */
> -         if (CPU_FEATURE_USABLE_P (cpu_features, AVX2))
> -           cpu_features->preferred[index_arch_Prefer_AVX2_STRCMP]
> -             |= bit_arch_Prefer_AVX2_STRCMP;
>         }
>
>        /* Avoid avoid short distance REP MOVSB on processor with FSRM.  */
> diff --git a/sysdeps/x86/cpu-tunables.c b/sysdeps/x86/cpu-tunables.c
> index 00fe5045eb..61b05e5b1d 100644
> --- a/sysdeps/x86/cpu-tunables.c
> +++ b/sysdeps/x86/cpu-tunables.c
> @@ -239,8 +239,6 @@ TUNABLE_CALLBACK (set_hwcaps) (tunable_val_t *valp)
>               CHECK_GLIBC_IFUNC_PREFERRED_BOTH (n, cpu_features,
>                                                 Fast_Copy_Backward,
>                                                 disable, 18);
> -             CHECK_GLIBC_IFUNC_PREFERRED_NEED_BOTH
> -               (n, cpu_features, Prefer_AVX2_STRCMP, AVX2, disable, 18);
>             }
>           break;
>         case 19:
> diff --git a/sysdeps/x86/include/cpu-features-preferred_feature_index_1.def b/sysdeps/x86/include/cpu-features-preferred_feature_index_1.def
> index d7c93f00c5..1530d594b3 100644
> --- a/sysdeps/x86/include/cpu-features-preferred_feature_index_1.def
> +++ b/sysdeps/x86/include/cpu-features-preferred_feature_index_1.def
> @@ -32,5 +32,4 @@ BIT (Prefer_ERMS)
>  BIT (Prefer_No_AVX512)
>  BIT (MathVec_Prefer_No_AVX512)
>  BIT (Prefer_FSRM)
> -BIT (Prefer_AVX2_STRCMP)
>  BIT (Avoid_Short_Distance_REP_MOVSB)
> diff --git a/sysdeps/x86_64/multiarch/strcmp.c b/sysdeps/x86_64/multiarch/strcmp.c
> index 62b7abeeee..7c2901bf44 100644
> --- a/sysdeps/x86_64/multiarch/strcmp.c
> +++ b/sysdeps/x86_64/multiarch/strcmp.c
> @@ -43,8 +43,7 @@ IFUNC_SELECTOR (void)
>      {
>        if (CPU_FEATURE_USABLE_P (cpu_features, AVX512VL)
>            && CPU_FEATURE_USABLE_P (cpu_features, AVX512BW)
> -          && CPU_FEATURE_USABLE_P (cpu_features, BMI2)
> -          && !CPU_FEATURES_ARCH_P (cpu_features, Prefer_AVX2_STRCMP))
> +          && CPU_FEATURE_USABLE_P (cpu_features, BMI2))
>          return OPTIMIZE (evex);
>
>        if (CPU_FEATURE_USABLE_P (cpu_features, RTM))
> diff --git a/sysdeps/x86_64/multiarch/strncmp.c b/sysdeps/x86_64/multiarch/strncmp.c
> index 60ba0fe356..f94a421784 100644
> --- a/sysdeps/x86_64/multiarch/strncmp.c
> +++ b/sysdeps/x86_64/multiarch/strncmp.c
> @@ -43,8 +43,7 @@ IFUNC_SELECTOR (void)
>      {
>        if (CPU_FEATURE_USABLE_P (cpu_features, AVX512VL)
>            && CPU_FEATURE_USABLE_P (cpu_features, AVX512BW)
> -          && CPU_FEATURE_USABLE_P (cpu_features, BMI2)
> -          && !CPU_FEATURES_ARCH_P (cpu_features, Prefer_AVX2_STRCMP))
> +          && CPU_FEATURE_USABLE_P (cpu_features, BMI2))
>          return OPTIMIZE (evex);
>
>        if (CPU_FEATURE_USABLE_P (cpu_features, RTM))
> --
> 2.33.1
>

I would like to backport this patch to release branches.
Any comments or objections?

--Sunil
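
For anyone reviewing the backport without the tree at hand, the effect of the strcmp.c and strncmp.c hunks on the EVEX check can be sketched in isolation. This is only an illustration, not glibc code: struct cpu_caps and both helper functions are hypothetical stand-ins for struct cpu_features and the CPU_FEATURE_USABLE_P / CPU_FEATURES_ARCH_P tests, and the rest of the selector (the enclosing condition, the RTM branch and the fallbacks) is deliberately left out because the hunks do not show it.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the parts of glibc's struct cpu_features
   that the quoted strcmp/strncmp hunks test.  Not a glibc type.  */
struct cpu_caps
{
  bool avx512vl;
  bool avx512bw;
  bool bmi2;
  bool prefer_avx2_strcmp;	/* The preference bit the patch removes.  */
};

/* EVEX condition as it reads on the '-' side of the hunks.  */
static bool
evex_selected_before (const struct cpu_caps *c)
{
  return c->avx512vl && c->avx512bw && c->bmi2 && !c->prefer_avx2_strcmp;
}

/* EVEX condition as it reads on the '+' side of the hunks.  */
static bool
evex_selected_after (const struct cpu_caps *c)
{
  return c->avx512vl && c->avx512bw && c->bmi2;
}

/* Two illustrative (made-up) feature sets.  */
static void
check_selector_change (void)
{
  /* All three features usable, preference bit set: only the patched
     condition picks EVEX.  */
  struct cpu_caps full = { true, true, true, true };
  assert (!evex_selected_before (&full));
  assert (evex_selected_after (&full));

  /* BMI2 missing: EVEX is rejected both before and after.  */
  struct cpu_caps no_bmi2 = { true, true, false, false };
  assert (!evex_selected_before (&no_bmi2));
  assert (!evex_selected_after (&no_bmi2));
}
```

The only inputs whose outcome changes are those where the preference bit was set while AVX512VL, AVX512BW and BMI2 were all usable; every other combination is decided the same way before and after the patch.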