From: "crazylht at gmail dot com"
To: gcc-bugs@gcc.gnu.org
Subject: [Bug tree-optimization/97770] [ICELAKE]Missing vectorization for vpopcnt
Date: Fri, 04 Jun 2021 07:41:43 +0000
X-Bugzilla-Product: gcc
X-Bugzilla-Component: tree-optimization
X-Bugzilla-Version: 11.0
X-Bugzilla-Keywords: missed-optimization
X-Bugzilla-Severity: normal
X-Bugzilla-Status: NEW
X-Bugzilla-Priority: P3
X-Bugzilla-Assigned-To: unassigned at gcc dot gnu.org

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97770

--- Comment #15 from Hongtao.liu ---
(In reply to Richard Biener from comment #14)
> So we vectorize to
>
>   _18 = .POPCOUNT (vect__5.7_22);
>   _17 = .POPCOUNT (vect__5.7_21);
>   vect__6.8_16 = VEC_PACK_TRUNC_EXPR <_18, _17>;
>   _6 = 0;
>   _7 = dest_13(D) + _2;
>   vect__8.9_10 = [vec_unpack_lo_expr] vect__6.8_16;
>   vect__8.9_9 = [vec_unpack_hi_expr] vect__6.8_16;
>   _8 = (long long int) _6;
>
> which is exactly the issue that in the scalar code we have a 'int' producing
> popcount with long long argument but the vector IFN produces a result of the
> same width as the argument.
> So the vectorizer compensates for that
> (VEC_PACK_TRUNC_EXPR) and then vectorizes the widening that's in the scalar
> code (vec_unpack_{lo,hi}_expr).  The fix for this and for the missing
> byte and word variants is to add a pattern to tree-vect-patterns.c for this
> case matching it to the .POPCOUNT internal function.  That possibly applies
> to other bitops, too, like parity, ctz, ffs, etc.  There's quite some
> _widen helpers in the pattern recog code so I'm not sure how complicated
> it is to match
>
>   (long)popcountl(long)
>
> and
>
>   (short)popcount((int)short)
>
> Richard may have a good idea since he did the last "big" surgery there.

Any suggestions for this? Should we change the prototype of the builtins, or
add a vec_recog_popcnt_pattern to the vectorizer?
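
For reference, the two shapes quoted above correspond to scalar loops roughly
like the ones below. This is only a minimal sketch, not the testcase attached
to the bug: the function names popcount_u64 and popcount_u16 are made up for
illustration. Whether either loop ends up as a single vpopcnt per vector will
depend on the GCC version and on flags such as -O3 -march=icelake-server.

```c
#include <stddef.h>
#include <stdint.h>

/* Shape "(long)popcountl(long)": the builtin takes a 64-bit operand but
   returns int, and the store widens that int back to 64 bits.  */
void popcount_u64(uint64_t *dest, const uint64_t *src, size_t n)
{
    for (size_t i = 0; i < n; i++)
        dest[i] = (uint64_t)__builtin_popcountll(src[i]);
}

/* Shape "(short)popcount((int)short)": the 16-bit operand is promoted
   to int for the builtin, and the int result is truncated on store.  */
void popcount_u16(uint16_t *dest, const uint16_t *src, size_t n)
{
    for (size_t i = 0; i < n; i++)
        dest[i] = (uint16_t)__builtin_popcount(src[i]);
}
```

In both loops the element width of the load and the store is the same; only
the builtin's int result (or int argument) introduces the narrowing and
widening that the vectorizer currently mirrors with pack/unpack statements.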