From: "cvs-commit at gcc dot gnu.org"
To: gcc-bugs@gcc.gnu.org
Subject: [Bug target/109973] [13/14 Regression] Wrong code for AVX2 since 13.1 by combining VPAND and VPTEST since r13-2006-ga56c1641e9d25e
Date: Thu, 22 Jun 2023 06:44:08 +0000
X-Bugzilla-Reason: CC
X-Bugzilla-Type: changed
X-Bugzilla-Watch-Reason: None
X-Bugzilla-Product: gcc
X-Bugzilla-Component: target
X-Bugzilla-Version: 14.0
X-Bugzilla-Keywords: wrong-code
X-Bugzilla-Severity: normal
X-Bugzilla-Who: cvs-commit at gcc dot gnu.org
X-Bugzilla-Status: ASSIGNED
X-Bugzilla-Priority: P2
X-Bugzilla-Assigned-To: roger at nextmovesoftware dot com
X-Bugzilla-Target-Milestone: 13.2
X-Bugzilla-URL: http://gcc.gnu.org/bugzilla/

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109973

--- Comment #10 from CVS Commits ---
The master branch has been updated by Roger Sayle:

https://gcc.gnu.org/g:5322f009e8f7d1c7a1c9aab7cb4c90c433398fdd

commit r14-2030-g5322f009e8f7d1c7a1c9aab7cb4c90c433398fdd
Author: Roger Sayle
Date:   Thu Jun 22 07:43:07 2023 +0100

    i386: Convert ptestz of pandn into ptestc.

    This patch is the next installment in a set of backend patches around
    improvements to ptest/vptest.  A previous patch optimized the sequence
    t=pand(x,y); ptestz(t,t) into the equivalent ptestz(x,y), using the
    property that ZF is set to (X&Y) == 0.  This patch performs a similar
    transformation, converting t=pandn(x,y); ptestz(t,t) into the (almost)
    equivalent ptestc(y,x), using the property that the CF flag is set to
    (~X&Y) == 0.  The tricky bit is that this sets the CF flag instead of
    the ZF flag, so we can only perform this transformation when we can
    also convert the flags consumer, as well as the producer.

    For the test case:

    int foo (__m128i x, __m128i y)
    {
      __m128i a = x & ~y;
      return __builtin_ia32_ptestz128 (a, a);
    }

    With -O2 -msse4.1 we previously generated:

    foo:    pandn   %xmm0, %xmm1
            xorl    %eax, %eax
            ptest   %xmm1, %xmm1
            sete    %al
            ret

    with this patch we now generate:

    foo:    xorl    %eax, %eax
            ptest   %xmm0, %xmm1
            setc    %al
            ret

    At the same time, this patch also provides alternative fixes for
    PR target/109973 and PR target/110118, by recognizing that ptestc(x,x)
    always sets the carry flag (X&~X is always zero).  This is achieved
    both by recognizing the special case in ix86_expand_sse_ptest and with
    a splitter to convert an eligible ptest into an stc.

    2023-06-22  Roger Sayle
                Uros Bizjak

    gcc/ChangeLog
            * config/i386/i386-expand.cc (ix86_expand_sse_ptest): Recognize
            expansion of ptestc with equal operands as producing const1_rtx.
            * config/i386/i386.cc (ix86_rtx_costs): Provide accurate cost
            estimates of UNSPEC_PTEST, where the ptest performs the PAND
            or PANDN of its operands.
            * config/i386/sse.md (define_split): Transform CCCmode UNSPEC_PTEST
            of reg_equal_p operands into an x86_stc instruction.
            (define_split): Split pandn/ptestz/set{n?}e into ptestc/set{n?}c.
            (define_split): Similar to above for strict_low_part destinations.
            (define_split): Split pandn/ptestz/j{n?}e into ptestc/j{n?}c.

    gcc/testsuite/ChangeLog
            * gcc.target/i386/avx-vptest-4.c: New test case.
            * gcc.target/i386/avx-vptest-5.c: Likewise.
            * gcc.target/i386/avx-vptest-6.c: Likewise.
            * gcc.target/i386/pr109973-1.c: Update test case.
            * gcc.target/i386/pr109973-2.c: Likewise.
            * gcc.target/i386/sse4_1-ptest-4.c: New test case.
            * gcc.target/i386/sse4_1-ptest-5.c: Likewise.
            * gcc.target/i386/sse4_1-ptest-6.c: Likewise.
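
For readers who want to check the flag identities the commit message relies on, here is a minimal scalar sketch in C. It is not code from the patch. The v128 type and the model_ptestz, model_ptestc, foo_before, foo_after and self_check names are hypothetical, invented for illustration. The model assumes only the flag definitions quoted in the message: ZF is set when (X&Y) == 0 and CF is set when (~X&Y) == 0. It deliberately avoids the SSE4.1 intrinsics so that it builds without -msse4.1.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical scalar stand-in for a 128-bit vector register. */
typedef struct { uint64_t lo, hi; } v128;

/* Bitwise AND of two 128-bit values. */
v128 v_and(v128 a, v128 b) { return (v128){ a.lo & b.lo, a.hi & b.hi }; }

/* (~a) & b, the operation the message calls pandn. */
v128 v_andnot(v128 a, v128 b) { return (v128){ ~a.lo & b.lo, ~a.hi & b.hi }; }

/* 1 when every bit is clear. */
int v_is_zero(v128 a) { return (a.lo | a.hi) == 0; }

/* Models the ZF result: 1 when (a & b) == 0. */
int model_ptestz(v128 a, v128 b) { return v_is_zero(v_and(a, b)); }

/* Models the CF result: 1 when (~a & b) == 0. */
int model_ptestc(v128 a, v128 b) { return v_is_zero(v_andnot(a, b)); }

/* The test case as written: a = x & ~y; ptestz(a, a). */
int foo_before(v128 x, v128 y)
{
  v128 a = v_andnot(y, x);
  return model_ptestz(a, a);
}

/* The transformed form: a single ptestc with no separate pandn. */
int foo_after(v128 x, v128 y) { return model_ptestc(y, x); }

/* Checks both identities over a small, arbitrary set of sample values. */
void self_check(void)
{
  const v128 s[] = {
    { 0, 0 }, { 1, 0 }, { 0, 1 }, { ~0ULL, ~0ULL },
    { 0xF0F0, 0x0F0F }, { 0xFFFF, 0x0F0F }
  };
  const int n = (int)(sizeof s / sizeof s[0]);
  for (int i = 0; i < n; i++)
    {
      /* ptestc(x,x) always sets the carry flag, since X&~X is zero. */
      assert(model_ptestc(s[i], s[i]) == 1);
      for (int j = 0; j < n; j++)
        assert(foo_before(s[i], s[j]) == foo_after(s[i], s[j]));
    }
}
```

Calling self_check() from a small driver exercises both properties. The sample values are arbitrary, so this is a sanity check of the reasoning rather than a proof, and it says nothing about the RTL splitters themselves.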