From mboxrd@z Thu Jan 1 00:00:00 1970
From: "jakub at gcc dot gnu.org"
To: gcc-bugs@gcc.gnu.org
Subject: [Bug target/109973] [13/14 Regression] Wrong code for AVX2 since 13.1 by combining VPAND and VPTEST since r13-2006-ga56c1641e9d25e
Date: Fri, 26 May 2023 06:56:50 +0000
X-Bugzilla-Reason: CC
X-Bugzilla-Type: changed
X-Bugzilla-Product: gcc
X-Bugzilla-Component: target
X-Bugzilla-Version: 14.0
X-Bugzilla-Keywords: wrong-code
X-Bugzilla-Severity: normal
X-Bugzilla-Who: jakub at gcc dot gnu.org
X-Bugzilla-Status: NEW
X-Bugzilla-Priority: P2
X-Bugzilla-Assigned-To: unassigned at gcc dot gnu.org
X-Bugzilla-Target-Milestone: 13.2
X-Bugzilla-Changed-Fields: cc

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109973

Jakub Jelinek changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |jakub at gcc dot gnu.org

--- Comment #3 from Jakub Jelinek ---
Guess the optimization is perfectly valid when it is just the ZF flag that
is tested, i.e. in bar:

#include <immintrin.h>

int
foo (__m256i x, __m256i y)
{
  __m256i a = _mm256_and_si256 (x, y);
  return _mm256_testc_si256 (a, a);
}

int
bar (__m256i x, __m256i y)
{
  __m256i a = _mm256_and_si256 (x, y);
  return _mm256_testz_si256 (a, a);
}

_mm256_testc_si256 (a, a) is dumb (it always returns non-zero, because a & ~a
is 0), so perhaps we could fold it to 1 during gimple folding.  Still, I'm
afraid that at the RTL level we can't rely on that folding having happened.
One option could be to use CCZmode instead of CCmode for the _mm*_testz*
cases and perform this optimization solely for CCZmode, not for the CCmode
that would be used for _mm*_testc*.  That has the disadvantage that we'd
likely not be able to merge _mm256_testc_si256 (a, b) + _mm256_testz_si256
(a, b) (or vice versa).
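
For context, VPTEST sets ZF when (a & b) == 0 and CF when (~a & b) == 0;
_mm256_testz_si256 returns ZF and _mm256_testc_si256 returns CF.  Here is a
minimal scalar sketch of those semantics (model_testz/model_testc are
hypothetical helpers modelling the flag outputs on 64-bit values, not the
actual 256-bit intrinsics):

#include <stdint.h>
#include <stdio.h>

/* Hypothetical scalar model of the VPTEST flag outputs:
   ZF = ((a & b) == 0), CF = ((~a & b) == 0).  */
static int
model_testz (uint64_t a, uint64_t b)
{
  return (a & b) == 0;
}

static int
model_testc (uint64_t a, uint64_t b)
{
  return (~a & b) == 0;
}

int
main (void)
{
  uint64_t v = 0x00f0000000000000ULL;

  /* testz (v, v) is 1 iff v is all-zero.  */
  printf ("testz (v, v) = %d\n", model_testz (v, v));  /* prints 0 */
  printf ("testz (0, 0) = %d\n", model_testz (0, 0));  /* prints 1 */

  /* testc (v, v) is always 1, because ~v & v is 0.  */
  printf ("testc (v, v) = %d\n", model_testc (v, v));  /* prints 1 */
  printf ("testc (0, 0) = %d\n", model_testc (0, 0));  /* prints 1 */
  return 0;
}

This is why combining the VPAND into the VPTEST is only safe when just ZF is
consumed: ptest (x & y, x & y) and ptest (x, y) agree on ZF, but not on CF.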