From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: by sourceware.org (Postfix, from userid 48)
	id 0A182385802E; Tue, 27 Sep 2022 15:33:17 +0000 (GMT)
DKIM-Filter: OpenDKIM Filter v2.11.0 sourceware.org 0A182385802E
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gcc.gnu.org;
	s=default; t=1664292797;
	bh=zMyPQDIm/o2wmNcd26Q52LQa1TjhAktBBPfgPyVF8/4=;
	h=From:To:Subject:Date:In-Reply-To:References:From;
	b=RD7gJJTfV+MoOVwS+4nXe3IpnEKY/xjAcRClkfmY3naaUigRrNc/D3ycSlbSCpjNU
	 VYSSClOnIAxYuej8n9USQWD536NmQ9Rkwp7C+e2TdS8uwPCjfi2ukloJ/Y3l2pIuUb
	 XKKOjBVxpjAg+m17JX0FAtiL98fQGrQhBOSuwPhg=
From: "aldyh at gcc dot gnu.org"
To: gcc-bugs@gcc.gnu.org
Subject: [Bug tree-optimization/107052] Range of __builtin_popcount can be improved with nonzerobits
Date: Tue, 27 Sep 2022 15:33:16 +0000
X-Bugzilla-Reason: CC
X-Bugzilla-Type: changed
X-Bugzilla-Watch-Reason: None
X-Bugzilla-Product: gcc
X-Bugzilla-Component: tree-optimization
X-Bugzilla-Version: 12.0
X-Bugzilla-Keywords: missed-optimization
X-Bugzilla-Severity: enhancement
X-Bugzilla-Who: aldyh at gcc dot gnu.org
X-Bugzilla-Status: NEW
X-Bugzilla-Resolution:
X-Bugzilla-Priority: P3
X-Bugzilla-Assigned-To: unassigned at gcc dot gnu.org
X-Bugzilla-Target-Milestone: ---
X-Bugzilla-Flags:
X-Bugzilla-Changed-Fields: cf_reconfirmed_on cc bug_status everconfirmed
Message-ID:
In-Reply-To:
References:
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable
X-Bugzilla-URL: http://gcc.gnu.org/bugzilla/
Auto-Submitted: auto-generated
MIME-Version: 1.0
List-Id:

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107052

Aldy Hernandez changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
   Last reconfirmed|                            |2022-09-27
                 CC|                            |amacleod at redhat dot com
             Status|UNCONFIRMED                 |NEW
     Ever confirmed|0                           |1

--- Comment #2 from Aldy Hernandez ---
Don't you mean the only values for popcount are 0-2? I mean, there are only two bits that could be 1 with a mask of 0x300.
Or am I missing something?

Either way, your check is for b > 3, and we should be able to fold that away.

There are two problems here. The cast of a_4 to a.0_1 dropped the nonzero mask. I would've expected that a cast to a number of the same precision would keep the 0x300 mask; instead we have:

    a.0_1 : [irange] unsigned int [0, 768] NONZERO 0x3ff

If we had the 0x300 mask available in cfn_popcount::fold_range(), then we could fold it. The second problem is that cfn_popcount does not look at the nonzero bits at all.

=========== BB 2 ============
Imports: a_3(D)
Exports: a.0_1  a_3(D)  a_4  b_5
         a.0_1 : a_3(D)(I)  a_4
         a_4 : a_3(D)(I)
         b_5 : a.0_1  a_3(D)(I)  a_4
a_3(D)  [irange] int VARYING
     :
    a_4 = a_3(D) & 768;
    a.0_1 = (unsigned int) a_4;
    b_5 = __builtin_popcount (a.0_1);
    if (b_5 > 3)
      goto ; [INV]
    else
      goto ; [INV]

a.0_1 : [irange] unsigned int [0, 768] NONZERO 0x3ff
a_4 : [irange] int [0, 768] NONZERO 0x300
b_5 : [irange] int [0, 10] NONZERO 0xf
2->3  (T) a.0_1 : [irange] unsigned int [0, 768] NONZERO 0x3ff
2->3  (T) a_4 : [irange] int [0, 768] NONZERO 0x300
2->3  (T) b_5 : [irange] int [4, 10] NONZERO 0xf
2->4  (F) a.0_1 : [irange] unsigned int [0, 768] NONZERO 0x3ff
2->4  (F) a_4 : [irange] int [0, 768] NONZERO 0x300
2->4  (F) b_5 : [irange] int [0, 3] NONZERO 0x3
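
To make the arithmetic behind the two problems concrete, here is a minimal standalone sketch in C. It is not GCC's cfn_popcount::fold_range() and uses none of the ranger's internal API; the helper names popcount_upper_bound, max_popcount_under_mask and cast_keeps_mask are hypothetical, and it assumes a compiler that provides __builtin_popcount. The idea is that if every possibly-set bit of a value lies inside a nonzero mask, its popcount cannot exceed the popcount of that mask, and that a same-precision cast from int to unsigned int does not set any bit outside the mask.

```c
#include <assert.h>

/* Hypothetical helper: upper bound on popcount(x) for any x whose set
   bits all lie within nonzero_mask.  The lower bound is simply 0.  */
static int popcount_upper_bound(unsigned int nonzero_mask)
{
    return __builtin_popcount(nonzero_mask);
}

/* Brute-force cross-check: walk every value that is a subset of the
   mask bits and return the largest popcount seen.  */
static int max_popcount_under_mask(unsigned int nonzero_mask)
{
    int max_seen = 0;
    unsigned int sub = nonzero_mask;

    for (;;) {
        int n = __builtin_popcount(sub);
        assert(n <= popcount_upper_bound(nonzero_mask));
        if (n > max_seen)
            max_seen = n;
        if (sub == 0)
            break;
        /* Standard trick for stepping to the next smaller subset.  */
        sub = (sub - 1) & nonzero_mask;
    }
    return max_seen;
}

/* Mirrors a_4 = a & mask; a.0_1 = (unsigned int) a_4; and reports
   whether the cast result still has no bits outside the mask.  */
static int cast_keeps_mask(int a, unsigned int nonzero_mask)
{
    int masked = a & (int) nonzero_mask;
    unsigned int u = (unsigned int) masked;
    return (u & ~nonzero_mask) == 0;
}
```

Under these assumptions, popcount_upper_bound(0x300) is 2, so a test such as b > 3 can never be true once the 0x300 mask is known, while the wider 0x3ff mask only gives a bound of 10.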