From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <gcc-bugzilla@gcc.gnu.org>
Received: by sourceware.org (Postfix, from userid 48)
 id 21AE23858400; Wed,  5 Jan 2022 09:00:29 +0000 (GMT)
DKIM-Filter: OpenDKIM Filter v2.11.0 sourceware.org 21AE23858400
From: "crazylht at gmail dot com" <gcc-bugzilla@gcc.gnu.org>
To: gcc-bugs@gcc.gnu.org
Subject: [Bug target/103771] [12 Regression] Missed vectorization under
 -mavx512f -mavx512vl after r12-5489
Date: Wed, 05 Jan 2022 09:00:28 +0000
X-Bugzilla-Reason: CC
X-Bugzilla-Type: changed
X-Bugzilla-Watch-Reason: None
X-Bugzilla-Product: gcc
X-Bugzilla-Component: target
X-Bugzilla-Version: 12.0
X-Bugzilla-Keywords: missed-optimization
X-Bugzilla-Severity: normal
X-Bugzilla-Who: crazylht at gmail dot com
X-Bugzilla-Status: UNCONFIRMED
X-Bugzilla-Resolution: 
X-Bugzilla-Priority: P3
X-Bugzilla-Assigned-To: unassigned at gcc dot gnu.org
X-Bugzilla-Target-Milestone: 12.0
X-Bugzilla-Flags: 
X-Bugzilla-Changed-Fields: 
Message-ID: <bug-103771-4-1uaca9grJ0@http.gcc.gnu.org/bugzilla/>
In-Reply-To: <bug-103771-4@http.gcc.gnu.org/bugzilla/>
References: <bug-103771-4@http.gcc.gnu.org/bugzilla/>
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable
X-Bugzilla-URL: http://gcc.gnu.org/bugzilla/
Auto-Submitted: auto-generated
MIME-Version: 1.0
X-BeenThere: gcc-bugs@gcc.gnu.org
X-Mailman-Version: 2.1.29
Precedence: list
List-Id: Gcc-bugs mailing list <gcc-bugs.gcc.gnu.org>
List-Unsubscribe: <https://gcc.gnu.org/mailman/options/gcc-bugs>,
 <mailto:gcc-bugs-request@gcc.gnu.org?subject=unsubscribe>
List-Archive: <https://gcc.gnu.org/pipermail/gcc-bugs/>
List-Post: <mailto:gcc-bugs@gcc.gnu.org>
List-Help: <mailto:gcc-bugs-request@gcc.gnu.org?subject=help>
List-Subscribe: <https://gcc.gnu.org/mailman/listinfo/gcc-bugs>,
 <mailto:gcc-bugs-request@gcc.gnu.org?subject=subscribe>
X-List-Received-Date: Wed, 05 Jan 2022 09:00:29 -0000

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103771
--- Comment #2 from Hongtao.liu <crazylht at gmail dot com> ---
(In reply to Tamar Christina from comment #1)
> Looks like the change causes the simpler conditional to be detected by the
> vectorizer as a masked operation, which in principle makes sense:
>
> note:   vect_recog_mask_conversion_pattern: detected: iftmp.0_21 = x.1_14 > 255 ? iftmp.0_19 : iftmp.0_20;
> note:   mask_conversion pattern recognized: patt_43 = patt_42 ? iftmp.0_19 : iftmp.0_20;
> note:   extra pattern stmt: patt_40 = x.1_14 > 255;
> note:   extra pattern stmt: patt_42 = (<signed-boolean:8>) patt_40;
>
> However not quite sure how the masking works on x86.  The additional
> statement generated for patt_42 causes it to fail during vectorization:
>
> note:   ==> examining pattern def statement: patt_42 = (<signed-boolean:8>) patt_40;
> note:   ==> examining statement: patt_42 = (<signed-boolean:8>) patt_40;
> note:   vect_is_simple_use: operand x.1_14 > 255, type of def: internal
> note:   vect_is_simple_use: vectype vector(8) <signed-boolean:1>
> missed:   conversion not supported by target.
> note:   vect_is_simple_use: operand x.1_14 > 255, type of def: internal
> note:   vect_is_simple_use: vectype vector(8) <signed-boolean:1>
> note:   vect_is_simple_use: operand x.1_14 > 255, type of def: internal
> note:   vect_is_simple_use: vectype vector(8) <signed-boolean:1>
> missed:   not vectorized: relevant stmt not supported: patt_42 = (<signed-boolean:8>) patt_40;
> missed:  bad operation or unsupported loop bound.
> note:  ***** Analysis  failed with vector mode V32QI
>
> as there are no conversion patterns for `VEC_UNPACK_LO_EXPR` between bool and
> a mask.
With AVX512, we're using scalar mode for the mask; can we use VEC_UNPACKS_SBOOL_LO_
here?
We have vec_unpacks_sbool_lo/hi_qi, which should be used for conversion
from vector<8> <signed-boolean:1> to vector<4> <signed-boolean:1>.
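
To make the question concrete, here is a rough C model of what a lo/hi
unpack means when the mask lives in a scalar register, one bit per lane.
This is only a conceptual sketch, not GCC's implementation; the helper
names mask_unpack_lo and mask_unpack_hi are made up for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Conceptual model only (hypothetical helpers, not GCC code): a mask
   for NUNITS lanes is held in a scalar integer, one bit per lane.  */

/* Low half: keep the bits for lanes 0 .. nunits/2 - 1.  */
static inline uint8_t
mask_unpack_lo (uint8_t mask, unsigned nunits)
{
  assert (nunits == 8 || nunits == 4);
  return mask & (uint8_t) ((1u << (nunits / 2)) - 1);
}

/* High half: move the bits for lanes nunits/2 .. nunits - 1 down to
   bit 0, then drop anything above the half width.  */
static inline uint8_t
mask_unpack_hi (uint8_t mask, unsigned nunits)
{
  assert (nunits == 8 || nunits == 4);
  return (uint8_t) (mask >> (nunits / 2))
	 & (uint8_t) ((1u << (nunits / 2)) - 1);
}
```

In this model no element-wise conversion is needed at all: going from an
8-lane mask to a 4-lane mask is just a bit select or a shift on a scalar.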
>
> which explains why it works for AVX2 and AVX512BW. AVX512F doesn't seem to
> allow any QI mode conversions [1] so it fails.
>
> Not sure why it's doing the replacement without checking to see that the
> target is able to vectorize the statements it generates later. Specifically
> it doesn't check if what's returned by build_mask_conversion is supported or
> not.
>
> My guess is because vectorizable_condition will fail anyway without the type
> of the conditional being a vector boolean.
>
> With -mavx512vl, V32QI seems to generate in the pattern mask conversions
> between vector(8) <signed-boolean:1>, and without it vector(32)
> <signed-boolean:8>. I think some x86 person needs to give a hint here :)
>
> [1] https://www.felixcloutier.com/x86/kunpckbw:kunpckwd:kunpckdq
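
For anyone following along, a loop of roughly this shape is the kind that
gives the vectorizer a conditional on a "> 255" compare with narrow
(8-bit) results, as in the dump above. It is a hypothetical sketch and
not necessarily the testcase from this PR; the names clip_uint8 and
scale_clip are assumptions.

```c
#include <stdint.h>

/* Hypothetical helper: saturate an int to the range 0..255.  */
static inline uint8_t
clip_uint8 (int x)
{
  /* (x & ~255) is nonzero exactly when x is outside 0..255; a compiler
     may canonicalize that test to (unsigned) x > 255.  */
  return (x & ~255) ? (uint8_t) (x < 0 ? 0 : 255) : (uint8_t) x;
}

/* Scale each source byte and clip the result back to a byte.  */
void
scale_clip (uint8_t *restrict dst, const uint8_t *restrict src,
	    int n, int scale)
{
  for (int i = 0; i < n; i++)
    dst[i] = clip_uint8 (src[i] * scale);
}
```

Compiling something like this with -mavx512f -mavx512vl plus
-fopt-info-vec-missed (and an optimization level that enables the
vectorizer, e.g. -O3, which is my assumption here) may show whether the
mask conversion is what blocks vectorization.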