From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: by sourceware.org (Postfix, from userid 7852) id 8C3343852770; Mon, 18 Jul 2022 20:58:27 +0000 (GMT) DKIM-Filter: OpenDKIM Filter v2.11.0 sourceware.org 8C3343852770 Content-Type: text/plain; charset="us-ascii" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit From: Sunil Pandey To: glibc-cvs@sourceware.org Subject: [glibc/release/2.35/master] x86: Add sse42 implementation to strcmp's ifunc X-Act-Checkin: glibc X-Git-Author: Noah Goldstein X-Git-Refname: refs/heads/release/2.35/master X-Git-Oldrev: 7f7a728b71ec922e94114103d647eabf5a814206 X-Git-Newrev: 232b7adb14b7ce36f1c25a88dfc834392acac393 Message-Id: <20220718205827.8C3343852770@sourceware.org> Date: Mon, 18 Jul 2022 20:58:27 +0000 (GMT) X-BeenThere: glibc-cvs@sourceware.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Glibc-cvs mailing list List-Unsubscribe: , List-Archive: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 18 Jul 2022 20:58:27 -0000 https://sourceware.org/git/gitweb.cgi?p=glibc.git;h=232b7adb14b7ce36f1c25a88dfc834392acac393 commit 232b7adb14b7ce36f1c25a88dfc834392acac393 Author: Noah Goldstein Date: Tue Jun 14 15:37:28 2022 -0700 x86: Add sse42 implementation to strcmp's ifunc This has been missing since the the ifuncs where added. The performance of SSE4.2 is preferable to to SSE2. Measured on Tigerlake with N = 20 runs. Geometric Mean of all benchmarks SSE4.2 / SSE2: 0.906 (cherry picked from commit ff439c47173565fbff4f0f78d07b0f14e4a7db05) Diff: --- sysdeps/x86_64/multiarch/strcmp.c | 5 +++++ 1 file changed, 5 insertions(+) diff --git a/sysdeps/x86_64/multiarch/strcmp.c b/sysdeps/x86_64/multiarch/strcmp.c index 68cb73baad..08638e2632 100644 --- a/sysdeps/x86_64/multiarch/strcmp.c +++ b/sysdeps/x86_64/multiarch/strcmp.c @@ -29,6 +29,7 @@ extern __typeof (REDIRECT_NAME) OPTIMIZE (sse2) attribute_hidden; extern __typeof (REDIRECT_NAME) OPTIMIZE (sse2_unaligned) attribute_hidden; extern __typeof (REDIRECT_NAME) OPTIMIZE (ssse3) attribute_hidden; +extern __typeof (REDIRECT_NAME) OPTIMIZE (sse42) attribute_hidden; extern __typeof (REDIRECT_NAME) OPTIMIZE (avx2) attribute_hidden; extern __typeof (REDIRECT_NAME) OPTIMIZE (avx2_rtm) attribute_hidden; extern __typeof (REDIRECT_NAME) OPTIMIZE (evex) attribute_hidden; @@ -53,6 +54,10 @@ IFUNC_SELECTOR (void) return OPTIMIZE (avx2); } + if (CPU_FEATURE_USABLE_P (cpu_features, SSE4_2) + && !CPU_FEATURES_ARCH_P (cpu_features, Slow_SSE4_2)) + return OPTIMIZE (sse42); + if (CPU_FEATURES_ARCH_P (cpu_features, Fast_Unaligned_Load)) return OPTIMIZE (sse2_unaligned);