From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: by sourceware.org (Postfix, from userid 7852)
	id CD2D93858022; Tue, 19 Jul 2022 05:11:56 +0000 (GMT)
DKIM-Filter: OpenDKIM Filter v2.11.0 sourceware.org CD2D93858022
Content-Type: text/plain; charset="us-ascii"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
From: Sunil Pandey
To: glibc-cvs@sourceware.org
Subject: [glibc/release/2.34/master] x86: Add sse42 implementation to strcmp's ifunc
X-Act-Checkin: glibc
X-Git-Author: Noah Goldstein
X-Git-Refname: refs/heads/release/2.34/master
X-Git-Oldrev: 6e008c884dad5a25f91085c68d044bb5e2d63761
X-Git-Newrev: 9d50e162eef88e1f870a941b0a973060e984e7ca
Message-Id: <20220719051156.CD2D93858022@sourceware.org>
Date: Tue, 19 Jul 2022 05:11:56 +0000 (GMT)
X-BeenThere: glibc-cvs@sourceware.org
X-Mailman-Version: 2.1.29
Precedence: list
List-Id: Glibc-cvs mailing list
List-Unsubscribe: ,
List-Archive:
List-Help:
List-Subscribe: ,
X-List-Received-Date: Tue, 19 Jul 2022 05:11:56 -0000

https://sourceware.org/git/gitweb.cgi?p=glibc.git;h=9d50e162eef88e1f870a941b0a973060e984e7ca

commit 9d50e162eef88e1f870a941b0a973060e984e7ca
Author: Noah Goldstein
Date:   Tue Jun 14 15:37:28 2022 -0700

    x86: Add sse42 implementation to strcmp's ifunc

    This has been missing since the ifuncs were added.

    The performance of SSE4.2 is preferable to SSE2.

    Measured on Tigerlake with N = 20 runs.
    Geometric Mean of all benchmarks SSE4.2 / SSE2: 0.906

    (cherry picked from commit ff439c47173565fbff4f0f78d07b0f14e4a7db05)

Diff:
---
 sysdeps/x86_64/multiarch/strcmp.c | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/sysdeps/x86_64/multiarch/strcmp.c b/sysdeps/x86_64/multiarch/strcmp.c
index 7c2901bf44..b457fb4c15 100644
--- a/sysdeps/x86_64/multiarch/strcmp.c
+++ b/sysdeps/x86_64/multiarch/strcmp.c
@@ -29,6 +29,7 @@
 extern __typeof (REDIRECT_NAME) OPTIMIZE (sse2) attribute_hidden;
 extern __typeof (REDIRECT_NAME) OPTIMIZE (sse2_unaligned) attribute_hidden;
 extern __typeof (REDIRECT_NAME) OPTIMIZE (ssse3) attribute_hidden;
+extern __typeof (REDIRECT_NAME) OPTIMIZE (sse42) attribute_hidden;
 extern __typeof (REDIRECT_NAME) OPTIMIZE (avx2) attribute_hidden;
 extern __typeof (REDIRECT_NAME) OPTIMIZE (avx2_rtm) attribute_hidden;
 extern __typeof (REDIRECT_NAME) OPTIMIZE (evex) attribute_hidden;
@@ -53,6 +54,10 @@ IFUNC_SELECTOR (void)
 	return OPTIMIZE (avx2);
     }
 
+  if (CPU_FEATURE_USABLE_P (cpu_features, SSE4_2)
+      && !CPU_FEATURES_ARCH_P (cpu_features, Slow_SSE4_2))
+    return OPTIMIZE (sse42);
+
   if (CPU_FEATURES_ARCH_P (cpu_features, Fast_Unaligned_Load))
     return OPTIMIZE (sse2_unaligned);
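
The selection order that the hunk above establishes can be sketched in isolation. This is a minimal illustration, not the glibc source: `struct cpu_flags`, `enum strcmp_impl` and `select_strcmp_impl` are hypothetical names, the AVX2/EVEX branch is collapsed into a single flag, and the final SSE2 fallback is an assumption, since the diff does not show the end of IFUNC_SELECTOR.

```c
#include <stdbool.h>

/* Hypothetical stand-in for glibc's internal cpu_features data.  The real
   CPU_FEATURE_USABLE_P and CPU_FEATURES_ARCH_P macros are glibc-internal
   and are not reproduced here.  */
struct cpu_flags
{
  bool avx2_usable;          /* assumption: AVX2-family branch as one flag */
  bool sse4_2_usable;        /* models CPU_FEATURE_USABLE_P (..., SSE4_2) */
  bool slow_sse4_2;          /* models CPU_FEATURES_ARCH_P (..., Slow_SSE4_2) */
  bool fast_unaligned_load;  /* models CPU_FEATURES_ARCH_P (..., Fast_Unaligned_Load) */
};

enum strcmp_impl
{
  IMPL_AVX2,
  IMPL_SSE42,
  IMPL_SSE2_UNALIGNED,
  IMPL_SSE2
};

/* Pick an implementation in the same priority order as the patched
   selector: AVX2 first, then SSE4.2 unless it is flagged as slow, then
   the unaligned SSE2 variant.  */
static enum strcmp_impl
select_strcmp_impl (const struct cpu_flags *f)
{
  if (f->avx2_usable)
    return IMPL_AVX2;

  /* The check added by this commit: SSE4.2 usable and not marked slow.  */
  if (f->sse4_2_usable && !f->slow_sse4_2)
    return IMPL_SSE42;

  if (f->fast_unaligned_load)
    return IMPL_SSE2_UNALIGNED;

  /* Assumed baseline fallback; not shown in the diff.  */
  return IMPL_SSE2;
}
```

With this ordering, a CPU that has usable SSE4.2 and fast unaligned loads but no AVX2 gets the sse42 variant, where before the patch it would have fallen through to sse2_unaligned. Setting `slow_sse4_2` restores the previous choice.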