From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <gcc-bugzilla@gcc.gnu.org>
Received: by sourceware.org (Postfix, from userid 48)
 id C5D1E398300A; Tue, 10 Nov 2020 08:55:45 +0000 (GMT)
DKIM-Filter: OpenDKIM Filter v2.11.0 sourceware.org C5D1E398300A
From: "rguenth at gcc dot gnu.org" <gcc-bugzilla@gcc.gnu.org>
To: gcc-bugs@gcc.gnu.org
Subject: [Bug target/97770] [ICELAKE]Missing vectorization for vpopcnt
Date: Tue, 10 Nov 2020 08:55:45 +0000
X-Bugzilla-Reason: CC
X-Bugzilla-Type: changed
X-Bugzilla-Watch-Reason: None
X-Bugzilla-Product: gcc
X-Bugzilla-Component: target
X-Bugzilla-Version: 11.0
X-Bugzilla-Keywords: missed-optimization
X-Bugzilla-Severity: normal
X-Bugzilla-Who: rguenth at gcc dot gnu.org
X-Bugzilla-Status: NEW
X-Bugzilla-Resolution: 
X-Bugzilla-Priority: P3
X-Bugzilla-Assigned-To: unassigned at gcc dot gnu.org
X-Bugzilla-Target-Milestone: ---
X-Bugzilla-Flags: 
X-Bugzilla-Changed-Fields: cc
Message-ID: <bug-97770-4-d7FtpkLxxZ@http.gcc.gnu.org/bugzilla/>
In-Reply-To: <bug-97770-4@http.gcc.gnu.org/bugzilla/>
References: <bug-97770-4@http.gcc.gnu.org/bugzilla/>
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable
X-Bugzilla-URL: http://gcc.gnu.org/bugzilla/
Auto-Submitted: auto-generated
MIME-Version: 1.0
X-BeenThere: gcc-bugs@gcc.gnu.org
X-Mailman-Version: 2.1.29
Precedence: list
List-Id: Gcc-bugs mailing list <gcc-bugs.gcc.gnu.org>
List-Unsubscribe: <https://gcc.gnu.org/mailman/options/gcc-bugs>,
 <mailto:gcc-bugs-request@gcc.gnu.org?subject=unsubscribe>
List-Archive: <https://gcc.gnu.org/pipermail/gcc-bugs/>
List-Post: <mailto:gcc-bugs@gcc.gnu.org>
List-Help: <mailto:gcc-bugs-request@gcc.gnu.org?subject=help>
List-Subscribe: <https://gcc.gnu.org/mailman/listinfo/gcc-bugs>,
 <mailto:gcc-bugs-request@gcc.gnu.org?subject=subscribe>
X-List-Received-Date: Tue, 10 Nov 2020 08:55:46 -0000

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97770

Richard Biener <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |rguenth at gcc dot gnu.org
--- Comment #6 from Richard Biener <rguenth at gcc dot gnu.org> ---
(In reply to Hongtao.liu from comment #5)
> (In reply to Richard Biener from comment #4)
> > What's missing is middle-end folding support to narrow popcount to the
> > appropriate internal function call with byte/half-word width when target
> > support is available.  But I'm quite sure there's no scalar popcount
> > instruction operating on half-word or byte pieces of a GPR?
> >
> > Alternatively the vectorizer can use patterns to do this.
>
> Yes, but for 64bit width, vectorizer generate suboptimal code.
>
> sse #c3
>
>   vector(2) long long unsigned int vect__4.6;
>   vector(2) long long unsigned int vect__4.5;
>   vector(2) long long unsigned int _8;
>   vector(2) long long unsigned int _26;
>
>   ...
>   ...
>
>   _8 = .POPCOUNT (vect__4.5_16);
>   _26 = .POPCOUNT (vect__4.6_9);
>   vect__5.7_22 = VEC_PACK_TRUNC_EXPR <_8, _26>; --- Why do we do this?
>   vector(4) int vect__5.7;
>
>
> It could generate directly
>
>   v4di = .POPCOUNT (v4di);

I guess the vectorized popcount IFN is defined as VnDI -> VnDI, but we
want VnSImode results.  Does this mean the instruction is wrongly modeled
in vectorized form?
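
To make the type question concrete, here is a rough scalar model in C of
the GIMPLE quoted above.  It is only a sketch: the struct and function
names (v2di_model, v4si_model, popcount_v2di, pack_trunc) are made up for
illustration and are not GCC internals, and it assumes VEC_PACK_TRUNC_EXPR
truncates each element to half width and concatenates its two operands.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-ins for vector(2) long long unsigned int and
   vector(4) int; these are not GCC types.  */
typedef struct { uint64_t e[2]; } v2di_model;
typedef struct { int32_t e[4]; } v4si_model;

/* Models .POPCOUNT on a two-element 64-bit vector: the result keeps the
   64-bit element type (VnDI -> VnDI).  */
static v2di_model popcount_v2di(v2di_model x)
{
    v2di_model r;
    for (int i = 0; i < 2; i++)
        r.e[i] = (uint64_t)__builtin_popcountll(x.e[i]);
    return r;
}

/* Models VEC_PACK_TRUNC_EXPR <lo, hi> under the assumption stated above:
   each 64-bit element is truncated to 32 bits, with lo's elements first
   and hi's elements second.  */
static v4si_model pack_trunc(v2di_model lo, v2di_model hi)
{
    v4si_model r;
    for (int i = 0; i < 2; i++) {
        r.e[i] = (int32_t)(uint32_t)lo.e[i];
        r.e[i + 2] = (int32_t)(uint32_t)hi.e[i];
    }
    return r;
}
```

In this model the pack step is what turns two VnDI results into the one
VnSI vector that the int-typed store needs.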

Note the vectorizer isn't very good at handling narrowing operations here.

If you can push the missing patterns I can have a look.  Bonus points for
a correctness testcase (from the above I think we're generating wrong code).
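
A possible starting point for such a testcase, as a sketch only: the
function names, the array size N and the input generator are my own
choices, not taken from the bug or the GCC testsuite.  It runs the loop
shape under discussion (64-bit inputs, int results) and compares against a
bit-by-bit reference that does not use the builtin.

```c
#include <assert.h>
#include <stddef.h>

#define N 64  /* arbitrary size, chosen for illustration */

/* Loop shape under discussion: popcount of 64-bit values stored as int.  */
__attribute__((noinline))
static void popcount_loop(int *restrict dst,
                          const unsigned long long *restrict src, size_t n)
{
    for (size_t i = 0; i < n; i++)
        dst[i] = __builtin_popcountll(src[i]);
}

/* Reference: count the set bits one position at a time.  */
static int popcount_ref(unsigned long long x)
{
    int c = 0;
    for (int i = 0; i < 64; i++)
        c += (int)((x >> i) & 1);
    return c;
}

/* Returns the number of elements where the loop and the reference differ.  */
static int check_popcount_loop(void)
{
    unsigned long long src[N];
    int dst[N];
    unsigned long long v = 0x9E3779B97F4A7C15ULL;

    /* Simple multiply-add sequence to get varied bit patterns.  */
    for (size_t i = 0; i < N; i++) {
        src[i] = v;
        v = v * 6364136223846793005ULL + 1442695040888963407ULL;
    }
    src[0] = 0;        /* edge case: no bits set */
    src[1] = ~0ULL;    /* edge case: all 64 bits set */

    popcount_loop(dst, src, N);

    int bad = 0;
    for (size_t i = 0; i < N; i++)
        if (dst[i] != popcount_ref(src[i]))
            bad++;
    return bad;
}
```

A driver would call check_popcount_loop () and abort on a nonzero result.
To exercise the vectorized path it would presumably need to be built with
vectorization enabled for a target that has the vector popcount
instruction; the exact options are left open here.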