From mboxrd@z Thu Jan 1 00:00:00 1970
Received: by sourceware.org (Postfix, from userid 7852)
	id DA68F3857374; Wed, 27 Apr 2022 02:00:46 +0000 (GMT)
DKIM-Filter: OpenDKIM Filter v2.11.0 sourceware.org DA68F3857374
Content-Type: text/plain; charset="us-ascii"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
From: Sunil Pandey
To: glibc-cvs@sourceware.org
Subject: [glibc/release/2.34/master] x86: Replace sse2 instructions with avx in memcmp-evex-movbe.S
X-Act-Checkin: glibc
X-Git-Author: Noah Goldstein
X-Git-Refname: refs/heads/release/2.34/master
X-Git-Oldrev: 6d18a93dbbde2958001d65dff3080beed7ae675a
X-Git-Newrev: baf3ece63453adac59c5688930324a78ced5b2e4
Message-Id: <20220427020046.DA68F3857374@sourceware.org>
Date: Wed, 27 Apr 2022 02:00:46 +0000 (GMT)
X-BeenThere: glibc-cvs@sourceware.org
X-Mailman-Version: 2.1.29
Precedence: list
List-Id: Glibc-cvs mailing list
X-List-Received-Date: Wed, 27 Apr 2022 02:00:47 -0000

https://sourceware.org/git/gitweb.cgi?p=glibc.git;h=baf3ece63453adac59c5688930324a78ced5b2e4

commit baf3ece63453adac59c5688930324a78ced5b2e4
Author: Noah Goldstein
Date:   Sat Oct 23 01:26:47 2021 -0400

    x86: Replace sse2 instructions with avx in memcmp-evex-movbe.S

    This commit replaces two usages of SSE2 'movups' with AVX 'vmovdqu'.

    It could potentially be dangerous to use SSE2 if this function is ever
    called without using 'vzeroupper' beforehand. While compilers appear
    to use 'vzeroupper' before function calls if AVX2 has been used, using
    SSE2 here is more brittle. Since it is not absolutely necessary it
    should be avoided.

    It costs 2 extra bytes, but the extra bytes should only eat into
    alignment padding.
    Reviewed-by: H.J. Lu

    (cherry picked from commit bad852b61b79503fcb3c5fc379c70f768df3e1fb)

Diff:
---
 sysdeps/x86_64/multiarch/memcmp-evex-movbe.S | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/sysdeps/x86_64/multiarch/memcmp-evex-movbe.S b/sysdeps/x86_64/multiarch/memcmp-evex-movbe.S
index 2761b54f2e..640f6757fa 100644
--- a/sysdeps/x86_64/multiarch/memcmp-evex-movbe.S
+++ b/sysdeps/x86_64/multiarch/memcmp-evex-movbe.S
@@ -561,13 +561,13 @@ L(between_16_31):
 	/* From 16 to 31 bytes. No branch when size == 16. */
 
 	/* Use movups to save code size. */
-	movups	(%rsi), %xmm2
+	vmovdqu	(%rsi), %xmm2
 	VPCMP	$4, (%rdi), %xmm2, %k1
 	kmovd	%k1, %eax
 	testl	%eax, %eax
 	jnz	L(return_vec_0_lv)
 	/* Use overlapping loads to avoid branches. */
-	movups	-16(%rsi, %rdx, CHAR_SIZE), %xmm2
+	vmovdqu	-16(%rsi, %rdx, CHAR_SIZE), %xmm2
 	VPCMP	$4, -16(%rdi, %rdx, CHAR_SIZE), %xmm2, %k1
 	addl	$(CHAR_PER_VEC - (16 / CHAR_SIZE)), %edx
 	kmovd	%k1, %eax
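
Editorial note: the "overlapping loads" comment in the hunk above refers to
covering a 16..31 byte range with two fixed 16-byte accesses, one at the
start and one ending at the last byte, so no branch on the exact size is
needed. The following is only a rough portable C sketch of that idea, not the
glibc implementation: the helper name differs_16_31 is hypothetical, it uses
memcmp on fixed 16-byte windows in place of the vector load and VPCMP, and it
reports only equal/not-equal rather than the ordering that memcmp returns.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not part of glibc): returns 1 if the two buffers
   differ anywhere in their first n bytes, 0 otherwise.
   Assumption: the caller guarantees 16 <= n <= 31.  */
static int
differs_16_31 (const void *a, const void *b, size_t n)
{
  const unsigned char *pa = a;
  const unsigned char *pb = b;

  /* First window: bytes [0, 16).  */
  int head = memcmp (pa, pb, 16) != 0;

  /* Second window: bytes [n - 16, n).  For n < 32 it overlaps the first
     window, so every byte in [0, n) is covered without branching on n.  */
  int tail = memcmp (pa + n - 16, pb + n - 16, 16) != 0;

  return head | tail;
}
```

When n == 16 the two windows coincide, which matches the "No branch when
size == 16" comment in the hunk: the second compare is redundant but harmless.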