From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: by sourceware.org (Postfix, from userid 7852)
	id E087A3858002; Thu, 5 May 2022 17:26:02 +0000 (GMT)
DKIM-Filter: OpenDKIM Filter v2.11.0 sourceware.org E087A3858002
Content-Type: text/plain; charset="us-ascii"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
From: Sunil Pandey
To: glibc-cvs@sourceware.org
Subject: [glibc/release/2.33/master] x86: Remove SSSE3 instruction for broadcast in memset.S (SSE2 Only)
X-Act-Checkin: glibc
X-Git-Author: Noah Goldstein
X-Git-Refname: refs/heads/release/2.33/master
X-Git-Oldrev: ef264d262b0cee60bf1b85fb898b4ab5d0ae8288
X-Git-Newrev: f5e0ea6c0dbfd0761a5ce5d6c2226bdd45835c66
Message-Id: <20220505172602.E087A3858002@sourceware.org>
Date: Thu, 5 May 2022 17:26:02 +0000 (GMT)
X-BeenThere: glibc-cvs@sourceware.org
X-Mailman-Version: 2.1.29
Precedence: list
List-Id: Glibc-cvs mailing list
List-Unsubscribe: ,
List-Archive:
List-Help:
List-Subscribe: ,
X-List-Received-Date: Thu, 05 May 2022 17:26:03 -0000

https://sourceware.org/git/gitweb.cgi?p=glibc.git;h=f5e0ea6c0dbfd0761a5ce5d6c2226bdd45835c66

commit f5e0ea6c0dbfd0761a5ce5d6c2226bdd45835c66
Author: Noah Goldstein
Date:   Mon Feb 7 00:32:23 2022 -0600

    x86: Remove SSSE3 instruction for broadcast in memset.S (SSE2 Only)

    commit b62ace2740a106222e124cc86956448fa07abf4d
    Author: Noah Goldstein
    Date:   Sun Feb 6 00:54:18 2022 -0600

        x86: Improve vec generation in memset-vec-unaligned-erms.S

    Revert usage of 'pshufb' in broadcast logic as it is an SSSE3
    instruction and memset.S is restricted to only SSE2 instructions.

    (cherry picked from commit 1b0c60f95bbe2eded80b2bb5be75c0e45b11cde1)

Diff:
---
 sysdeps/x86_64/memset.S | 7 ++++---
 1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/sysdeps/x86_64/memset.S b/sysdeps/x86_64/memset.S
index 34ee0bfdcb..954471e5a5 100644
--- a/sysdeps/x86_64/memset.S
+++ b/sysdeps/x86_64/memset.S
@@ -30,9 +30,10 @@
 
 # define MEMSET_SET_VEC0_AND_SET_RETURN(d, r) \
   movd d, %xmm0; \
-  pxor %xmm1, %xmm1; \
-  pshufb %xmm1, %xmm0; \
-  movq r, %rax
+  movq r, %rax; \
+  punpcklbw %xmm0, %xmm0; \
+  punpcklwd %xmm0, %xmm0; \
+  pshufd $0, %xmm0, %xmm0
 
 # define WMEMSET_SET_VEC0_AND_SET_RETURN(d, r) \
   movd d, %xmm0; \
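For readers following the change, here is a rough scalar model in C of why the SSE2-only replacement sequence (punpcklbw, punpcklwd, pshufd $0) still broadcasts the low byte to all 16 lanes. It is only a sketch: the `vec128` type and every `model_*` name are hypothetical and do not exist in glibc, and the model assumes the little-endian lane layout of x86 and the case used in the diff, where both operands of each unpack are the same register.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for a 128-bit XMM register, byte 0 = lowest lane. */
typedef struct { uint8_t b[16]; } vec128;

/* movd: put a 32-bit value in the low dword, zero the upper 96 bits. */
static inline vec128 model_movd(uint32_t d)
{
    vec128 v;
    memset(v.b, 0, sizeof v.b);
    for (int i = 0; i < 4; i++)
        v.b[i] = (uint8_t)(d >> (8 * i));
    return v;
}

/* punpcklbw with identical operands: each of the low 8 bytes is doubled. */
static inline vec128 model_punpcklbw_self(vec128 v)
{
    vec128 r;
    for (int i = 0; i < 8; i++) {
        r.b[2 * i] = v.b[i];
        r.b[2 * i + 1] = v.b[i];
    }
    return r;
}

/* punpcklwd with identical operands: each of the low 4 words is doubled. */
static inline vec128 model_punpcklwd_self(vec128 v)
{
    vec128 r;
    for (int i = 0; i < 4; i++) {
        memcpy(&r.b[4 * i], &v.b[2 * i], 2);
        memcpy(&r.b[4 * i + 2], &v.b[2 * i], 2);
    }
    return r;
}

/* pshufd $0: copy dword 0 into all four dword lanes. */
static inline vec128 model_pshufd_0(vec128 v)
{
    vec128 r;
    for (int i = 0; i < 4; i++)
        memcpy(&r.b[4 * i], &v.b[0], 4);
    return r;
}

/* The replacement sequence from the diff, applied in order. */
static inline vec128 model_broadcast_byte(uint32_t d)
{
    vec128 v = model_movd(d);
    v = model_punpcklbw_self(v);   /* b0 b0 b1 b1 b2 b2 b3 b3 ... */
    v = model_punpcklwd_self(v);   /* b0 b0 b0 b0 b1 b1 b1 b1 ... */
    return model_pshufd_0(v);      /* b0 in every byte lane */
}

/* Small self-check: only the low byte of the input should survive. */
static inline void model_self_check(void)
{
    vec128 v = model_broadcast_byte(0x001234ABu);
    for (int i = 0; i < 16; i++)
        assert(v.b[i] == 0xAB);
}
```

In this model the upper three bytes of `d` never reach the result, which matches the behaviour of the removed `pxor`/`pshufb` pair: a pshufb with an all-zero mask also replicates byte 0 into every lane.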