From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: by sourceware.org (Postfix, from userid 48)
	id 62F1D3865C20; Mon, 30 Oct 2023 03:10:19 +0000 (GMT)
DKIM-Filter: OpenDKIM Filter v2.11.0 sourceware.org 62F1D3865C20
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gcc.gnu.org;
	s=default; t=1698635419;
	bh=RtPP9zqmePVmcnfutkegaY2xRlskNV5O2ZZJbvAPzY4=;
	h=From:To:Subject:Date:In-Reply-To:References:From;
	b=I3MoiczRioYCve56C7KYxo1FZyCP30betksM6XSHdxMLpU9XbMXohHURY7Bmy+mBN
	 E0C86QLeWiEj3/hBAl9tArIBNIrorCzSwc3G7imil2H9/TvSI1XbYvNTqmagpx3HyV
	 cc9VJip26fDIJ76Wj8y/k0Au8xRPOtfTGMycqIBM=
From: "cvs-commit at gcc dot gnu.org"
To: gcc-bugs@gcc.gnu.org
Subject: [Bug target/104610] memcmp () == 0 can be optimized better for avx512f
Date: Mon, 30 Oct 2023 03:10:17 +0000
X-Bugzilla-Reason: CC
X-Bugzilla-Type: changed
X-Bugzilla-Watch-Reason: None
X-Bugzilla-Product: gcc
X-Bugzilla-Component: target
X-Bugzilla-Version: 12.0
X-Bugzilla-Keywords: missed-optimization
X-Bugzilla-Severity: enhancement
X-Bugzilla-Who: cvs-commit at gcc dot gnu.org
X-Bugzilla-Status: NEW
X-Bugzilla-Resolution:
X-Bugzilla-Priority: P3
X-Bugzilla-Assigned-To: unassigned at gcc dot gnu.org
X-Bugzilla-Target-Milestone: ---
X-Bugzilla-Flags:
X-Bugzilla-Changed-Fields:
Message-ID:
In-Reply-To:
References:
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable
X-Bugzilla-URL: http://gcc.gnu.org/bugzilla/
Auto-Submitted: auto-generated
MIME-Version: 1.0
List-Id:

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104610

--- Comment #23 from CVS Commits ---
The master branch has been updated by hongtao Liu:

https://gcc.gnu.org/g:8c40b72036c967fbb1d1150515cf70aec382f0a2

commit r14-5002-g8c40b72036c967fbb1d1150515cf70aec382f0a2
Author: liuhongt
Date:   Mon Oct 9 15:07:54 2023 +0800

    Improve memcmpeq for 512-bit vector with vpcmpeq + kortest.

    When 2 vectors are equal, kmask is allones and kortest will set CF,
    else CF will be cleared.
    So CF bit can be used to check for the result of the comparison.

    Before:
            vmovdqu (%rsi), %ymm0
            vpxorq  (%rdi), %ymm0, %ymm0
            vptest  %ymm0, %ymm0
            jne     .L2
            vmovdqu 32(%rsi), %ymm0
            vpxorq  32(%rdi), %ymm0, %ymm0
            vptest  %ymm0, %ymm0
            je      .L5
    .L2:
            movl    $1, %eax
            xorl    $1, %eax
            vzeroupper
            ret

    After:
            vmovdqu64       (%rsi), %zmm0
            xorl    %eax, %eax
            vpcmpeqd        (%rdi), %zmm0, %k0
            kortestw        %k0, %k0
            setc    %al
            vzeroupper
            ret

    gcc/ChangeLog:

            PR target/104610
            * config/i386/i386-expand.cc (ix86_expand_branch): Handle
            512-bit vector with vpcmpeq + kortest.
            * config/i386/i386.md (cbranchxi4): New expander.
            * config/i386/sse.md: (cbranch4): Extend to V16SImode and
            V8DImode.

    gcc/testsuite/ChangeLog:

            * gcc.target/i386/pr104610-2.c: New test.
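
For readers who want to see why the carry flag alone is enough, here is a rough portable C model of the "After" sequence. It is only a sketch and is not taken from the commit or from pr104610-2.c: the helper names (model_vpcmpeqd, model_kortestw_cf, memeq64_model, memeq64_ref, check_agreement) are made up for illustration, and the code emulates the mask logic in scalar C rather than using AVX-512 instructions. It assumes a 64-byte comparison, i.e. sixteen 32-bit lanes.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: models vpcmpeqd on two 64-byte blocks.
   Bit i of the returned mask is set when 32-bit lane i compares equal. */
static uint16_t model_vpcmpeqd(const void *a, const void *b)
{
    uint16_t mask = 0;
    for (int lane = 0; lane < 16; lane++) {
        uint32_t x, y;
        memcpy(&x, (const unsigned char *)a + 4 * lane, sizeof x);
        memcpy(&y, (const unsigned char *)b + 4 * lane, sizeof y);
        if (x == y)
            mask |= (uint16_t)(1u << lane);
    }
    return mask;
}

/* Hypothetical helper: models CF after "kortestw %k0, %k0".
   CF is set only when the OR of the two operands is all ones. */
static int model_kortestw_cf(uint16_t k)
{
    return (uint16_t)(k | k) == 0xFFFF;
}

/* Models the whole "After" sequence: the result is what setc stores in %al. */
static int memeq64_model(const void *a, const void *b)
{
    return model_kortestw_cf(model_vpcmpeqd(a, b));
}

/* The kind of source expression the optimization targets. */
static int memeq64_ref(const void *a, const void *b)
{
    return memcmp(a, b, 64) == 0;
}

/* Checks that the model and the plain memcmp form agree on one input pair. */
static void check_agreement(const void *a, const void *b)
{
    assert(memeq64_model(a, b) == memeq64_ref(a, b));
}
```

Calling check_agreement on two identical 64-byte buffers, and again after flipping a single byte in one of them, exercises both outcomes: an all-ones mask gives CF = 1 (equal), and any cleared mask bit gives CF = 0 (not equal).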