From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <gcc-bugzilla@gcc.gnu.org>
Received: by sourceware.org (Postfix, from userid 48)
	id 7F3413858438; Tue, 27 Sep 2022 14:29:01 +0000 (GMT)
DKIM-Filter: OpenDKIM Filter v2.11.0 sourceware.org 7F3413858438
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gcc.gnu.org;
	s=default; t=1664288941;
	bh=38OfBb/OVM8YbB7vHLutMTkJU1cwYrQej/KSte+Ekgk=;
	h=From:To:Subject:Date:In-Reply-To:References:From;
	b=sulE6lyALQzlya1eoO+nui56iHsf3XqIgIHOYeYu0nPPCVpmy+oaUDNl/Zs1naLm0
	 gZd84cvAcJIkIZpdEqa5zFZiuMZ67PVLRhUsuihcpSV6G1v/TC789BnTdqq6Zid719
	 lXXkfxPYzp9cKw6Vlt6F+5udfEMyxSs7t0W7LXDw=
From: "aldyh at gcc dot gnu.org" <gcc-bugzilla@gcc.gnu.org>
To: gcc-bugs@gcc.gnu.org
Subject: [Bug tree-optimization/107043] range information not used in
 popcount
Date: Tue, 27 Sep 2022 14:29:01 +0000
X-Bugzilla-Reason: CC
X-Bugzilla-Type: changed
X-Bugzilla-Watch-Reason: None
X-Bugzilla-Product: gcc
X-Bugzilla-Component: tree-optimization
X-Bugzilla-Version: 13.0
X-Bugzilla-Keywords: missed-optimization
X-Bugzilla-Severity: enhancement
X-Bugzilla-Who: aldyh at gcc dot gnu.org
X-Bugzilla-Status: NEW
X-Bugzilla-Resolution: 
X-Bugzilla-Priority: P3
X-Bugzilla-Assigned-To: unassigned at gcc dot gnu.org
X-Bugzilla-Target-Milestone: ---
X-Bugzilla-Flags: 
X-Bugzilla-Changed-Fields: cc
Message-ID: <bug-107043-4-skSq8ApXfg@http.gcc.gnu.org/bugzilla/>
In-Reply-To: <bug-107043-4@http.gcc.gnu.org/bugzilla/>
References: <bug-107043-4@http.gcc.gnu.org/bugzilla/>
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable
X-Bugzilla-URL: http://gcc.gnu.org/bugzilla/
Auto-Submitted: auto-generated
MIME-Version: 1.0
List-Id: <gcc-bugs.sourceware.org>

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107043

Aldy Hernandez <aldyh at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |aldyh at gcc dot gnu.org
--- Comment #3 from Aldy Hernandez <aldyh at gcc dot gnu.org> ---
(In reply to Andrew Pinski from comment #1)

> int g1(int n)
> {
>   int n1 = n & 0x8000;
>   if (n1 == 0)
>     return 1;
>   // n>>15 will be xxxxxx1 here.
>   return (n >> 15) & 0x1;
> }

Interestingly, getting this one requires us to track something completely
different, the bits that are *definitely* set.

[Right now we track the nonzero mask, which is a misnomer because we're not
tracking the bits that are nonzero but the bits that *may* be nonzero.  Or more
precisely the inverse of the bits that are known to be 0.  For example, a
"nonzero" mask of 0xfffffff0 means the least significant 4 bits are known to be
zero, and the rest of the bits are unknown.  So we're tracking the "and mask"
of a number?  Or the maybe_nonzero bits?  The name is inherited from legacy
VRP, which called it that.]
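
As a small illustration of the "may be nonzero" reading, here is a C sketch.
The helpers fits_nonzero_mask and shl4 are made-up names for this example and
are not part of GCC; the point is only that a value is compatible with a mask
when it has no bit set outside the mask.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper, not GCC code: true if VALUE is compatible with a
   "nonzero" mask, i.e. every bit that MASK says is known to be 0 is in
   fact clear in VALUE.  */
bool
fits_nonzero_mask (uint32_t value, uint32_t mask)
{
  return (value & ~mask) == 0;
}

/* x << 4 always has its low 4 bits clear, so 0xfffffff0 is a valid
   "nonzero" mask for the result, whatever x is.  */
uint32_t
shl4 (uint32_t x)
{
  return x << 4;
}
```

Note that 0 fits the mask too: the mask says nothing about which of the
remaining bits are actually set, only which ones cannot be.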

To get the above, we'd need to track the bits that are definitely 1 (the "or
mask" of a number?).  For example, on the 2->4 edge we'd need to know that n_3
has the 0x8000 bit set:

  <bb 2> :
  n1_4 = n_3(D) & 32768;
  if (n1_4 == 0)
    goto <bb 3>; [INV]
  else
    goto <bb 4>; [INV]

  <bb 3> :
  goto <bb 5>; [INV]

  <bb 4> :
  _1 = n_3(D) >> 15;
  _5 = _1 & 1;

  <bb 5> :
  # _2 = PHI <1(3), _5(4)>
  return _2;

What we're looking for is solving n_3:

[not-zero] = n_3 & 32768

which should give us:

[-INF,-1][32768, +INF] ORMASK [0x8000]

or whatever the hell we want to call it.  I hate these names.  Please someone,
come up with a name that makes sense to us all!
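
To sketch how a known-ones mask would carry through the testcase, here is a
small C example.  All three function names are hypothetical and this is not
how range-ops is written; it only follows the 0x8000 bit from the condition,
through the shift, to the final AND.

```c
#include <assert.h>
#include <stdint.h>

/* On the edge where (x & MASK) != 0 and MASK has exactly one bit set,
   that bit must be 1 in x.  For a multi-bit MASK no single bit is
   guaranteed, so nothing is known and 0 is returned.  */
uint32_t
or_mask_from_and_nonzero (uint32_t mask)
{
  int single_bit = mask != 0 && (mask & (mask - 1)) == 0;
  return single_bit ? mask : 0;
}

/* A right shift moves known-one bits down with the value.  Shifting the
   mask logically says nothing about the vacated high bits, which stays
   sound for a signed (arithmetic) shift as well.  */
uint32_t
or_mask_shift_right (uint32_t known_ones, unsigned shift)
{
  assert (shift < 32);
  return known_ones >> shift;
}

/* For x & C, a result bit is known to be 1 only if it is known to be 1
   in x and is set in C.  */
uint32_t
or_mask_and_const (uint32_t known_ones, uint32_t c)
{
  return known_ones & c;
}
```

Starting from 0x8000, the shift by 15 gives known ones 0x1 and the AND with 1
keeps it.  Since the existing "nonzero" mask of _5 is also 0x1, every bit of
_5 would be determined and the result could fold to 1.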

Andrew M and I had a plan for this earlier this cycle, but got sidetracked by
floats.  What we'd need is a way to track or-masks in addition to and-masks.

There's actual infrastructure missing here, but it should be as easy as what we
did for "nonzero" tracking in commit 4e82205b68024f5c1a9006fe2b62e1a0fa7f1245
(plus supporting patches).  Basically we need to add a slot for the or-mask in
the irange, add union/intersect code, and then add some glue in range-ops to
solve:

1 = x & mask
x = y | mask
etc etc.
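
A rough C sketch of what the union/intersect code and the "x = y | mask" case
could look like.  The struct and function names are invented for this example
and do not reflect the actual irange layout or the range-ops interfaces.

```c
#include <stdint.h>

/* Hypothetical pair of masks, not GCC's irange.
   maybe_nonzero: bits that may be 1 (a clear bit is known to be 0).
   known_one:     bits that are definitely 1.
   For a non-empty set of values, known_one is a subset of maybe_nonzero.  */
struct bitmasks
{
  uint32_t maybe_nonzero;
  uint32_t known_one;
};

/* Union: the value comes from A or from B.  A bit may be 1 if it may be
   1 on either side, and is known 1 only if it is known 1 on both.  */
struct bitmasks
bitmasks_union (struct bitmasks a, struct bitmasks b)
{
  struct bitmasks r = { a.maybe_nonzero | b.maybe_nonzero,
			a.known_one & b.known_one };
  return r;
}

/* Intersect: both facts hold for the same value.  If the result has a
   known_one bit outside maybe_nonzero, the facts contradict each other
   and no value satisfies both.  */
struct bitmasks
bitmasks_intersect (struct bitmasks a, struct bitmasks b)
{
  struct bitmasks r = { a.maybe_nonzero & b.maybe_nonzero,
			a.known_one | b.known_one };
  return r;
}

/* Forward case for x = y | MASK: every bit of MASK is definitely 1 in x,
   and therefore may also be nonzero.  */
struct bitmasks
bitmasks_or_const (struct bitmasks y, uint32_t mask)
{
  struct bitmasks r = { y.maybe_nonzero | mask, y.known_one | mask };
  return r;
}
```

The two masks move in opposite directions: union widens maybe_nonzero and
narrows known_one, while intersect does the reverse.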

Thanks for the testcase, it's quite useful.