From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: by sourceware.org (Postfix, from userid 48) id 21AE23858400; Wed, 5 Jan 2022 09:00:29 +0000 (GMT)
DKIM-Filter: OpenDKIM Filter v2.11.0 sourceware.org 21AE23858400
From: "crazylht at gmail dot com"
To: gcc-bugs@gcc.gnu.org
Subject: [Bug target/103771] [12 Regression] Missed vectorization under -mavx512f -mavx512vl after r12-5489
Date: Wed, 05 Jan 2022 09:00:28 +0000
X-Bugzilla-Reason: CC
X-Bugzilla-Type: changed
X-Bugzilla-Watch-Reason: None
X-Bugzilla-Product: gcc
X-Bugzilla-Component: target
X-Bugzilla-Version: 12.0
X-Bugzilla-Keywords: missed-optimization
X-Bugzilla-Severity: normal
X-Bugzilla-Who: crazylht at gmail dot com
X-Bugzilla-Status: UNCONFIRMED
X-Bugzilla-Resolution:
X-Bugzilla-Priority: P3
X-Bugzilla-Assigned-To: unassigned at gcc dot gnu.org
X-Bugzilla-Target-Milestone: 12.0
X-Bugzilla-Flags:
X-Bugzilla-Changed-Fields:
Message-ID:
In-Reply-To:
References:
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable
X-Bugzilla-URL: http://gcc.gnu.org/bugzilla/
Auto-Submitted: auto-generated
MIME-Version: 1.0
X-BeenThere: gcc-bugs@gcc.gnu.org
X-Mailman-Version: 2.1.29
Precedence: list
List-Id: Gcc-bugs mailing list
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
X-List-Received-Date: Wed, 05 Jan 2022 09:00:29 -0000

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103771

--- Comment #2 from Hongtao.liu ---
(In reply to Tamar Christina from comment #1)
> Looks like the change causes the simpler conditional to be detected by the
> vectorizer as a masked operation, which in principle makes sense:
>
> note: vect_recog_mask_conversion_pattern: detected: iftmp.0_21 = x.1_14 > 255 ? iftmp.0_19 : iftmp.0_20;
> note: mask_conversion pattern recognized: patt_43 = patt_42 ? iftmp.0_19 : iftmp.0_20;
> note: extra pattern stmt: patt_40 = x.1_14 > 255;
> note: extra pattern stmt: patt_42 = () patt_40;
>
> However not quite sure how the masking works on x86. The additional
> statement generated for patt_42 causes it to fail during vectorization:
>
> note: ==> examining pattern def statement: patt_42 = () patt_40;
> note: ==> examining statement: patt_42 = () patt_40;
> note: vect_is_simple_use: operand x.1_14 > 255, type of def: internal
> note: vect_is_simple_use: vectype vector(8)
> missed: conversion not supported by target.
> note: vect_is_simple_use: operand x.1_14 > 255, type of def: internal
> note: vect_is_simple_use: vectype vector(8)
> note: vect_is_simple_use: operand x.1_14 > 255, type of def: internal
> note: vect_is_simple_use: vectype vector(8)
> missed: not vectorized: relevant stmt not supported: patt_42 = () patt_40;
> missed: bad operation or unsupported loop bound.
> note: ***** Analysis failed with vector mode V32QI
>
> as there's no conversion patterns for `VEC_UNPACK_LO_EXPR` between bool and
> a mask.

With AVX512 we use a scalar mode for the mask, so can we use
VEC_UNPACKS_SBOOL_LO_ here? We have vec_unpacks_sbool_lo/hi_qi, which should
be used for the conversion from vector<8> to vector<4>.

>
> which explains why it works for AVX2 and AVX512BW. AVX512F doesn't seem to
> allow any QI mode conversions [1] so it fails..
>
> Not sure why it's doing the replacement without checking to see that the
> target is able to vectorize the statements it generates later. Specifically
> it doesn't check if what's returned by build_mask_conversion is supported or
> not.
>
> My guess is because vectorizable_condition will fail anyway without the type
> of the conditional being a vector boolean.
>
> With -mavx512vl V32QI seems to generate in the pattern mask conversions
> between vector (8) and without it vector(32). I think some x86 person needs
> to give a hint here :)
>
> [1] https://www.felixcloutier.com/x86/kunpckbw:kunpckwd:kunpckdq
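
To make the scalar-mask question above concrete, here is a rough conceptual
sketch in plain C of what unpacking an 8-lane mask into two 4-lane masks
amounts to when the mask is held in a scalar integer (bit i = lane i). It is
not GCC's implementation and not the bug's testcase. The type mask8 and the
helpers mask_unpack_lo and mask_unpack_hi are hypothetical names used only for
illustration, and the bit layout is an assumption of this sketch.

```c
#include <assert.h>
#include <stdint.h>

/* Conceptual model only (assumption): a mask for up to 8 lanes is stored
   in one 8-bit scalar, with bit i holding the boolean for lane i.  */
typedef uint8_t mask8;

/* Low half of an nunits-lane mask: keep the low nunits/2 bits.  */
static mask8 mask_unpack_lo(mask8 m, unsigned nunits)
{
    assert(nunits == 8 || nunits == 4 || nunits == 2);
    unsigned half = nunits / 2;
    return (mask8)(m & ((1u << half) - 1u));
}

/* High half of an nunits-lane mask: shift the upper nunits/2 bits down
   and discard anything above them.  */
static mask8 mask_unpack_hi(mask8 m, unsigned nunits)
{
    assert(nunits == 8 || nunits == 4 || nunits == 2);
    unsigned half = nunits / 2;
    return (mask8)((m >> half) & ((1u << half) - 1u));
}
```

In this model the 8-lane mask 0xA5 splits into a low 4-lane mask 0x5 and a
high 4-lane mask 0xA. No per-lane widening is involved, which is why a
scalar-mask unpack is a different operation from a vector unpack on boolean
vectors.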