From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <gcc-bugzilla@gcc.gnu.org>
Received: by sourceware.org (Postfix, from userid 48)
 id E0BA83858423; Mon,  4 Oct 2021 17:15:10 +0000 (GMT)
DKIM-Filter: OpenDKIM Filter v2.11.0 sourceware.org E0BA83858423
From: "amacleod at redhat dot com" <gcc-bugzilla@gcc.gnu.org>
To: gcc-bugs@gcc.gnu.org
Subject: [Bug tree-optimization/102540] [12 Regression] Dead Code Elimination
 Regression at -O3 since r12-476-gd846f225c25c5885
Date: Mon, 04 Oct 2021 17:15:10 +0000
X-Bugzilla-Reason: CC
X-Bugzilla-Type: changed
X-Bugzilla-Watch-Reason: None
X-Bugzilla-Product: gcc
X-Bugzilla-Component: tree-optimization
X-Bugzilla-Version: 12.0
X-Bugzilla-Keywords: 
X-Bugzilla-Severity: normal
X-Bugzilla-Who: amacleod at redhat dot com
X-Bugzilla-Status: NEW
X-Bugzilla-Resolution: 
X-Bugzilla-Priority: P3
X-Bugzilla-Assigned-To: unassigned at gcc dot gnu.org
X-Bugzilla-Target-Milestone: 12.0
X-Bugzilla-Flags: 
X-Bugzilla-Changed-Fields: 
Message-ID: <bug-102540-4-vnnsuuQq0h@http.gcc.gnu.org/bugzilla/>
In-Reply-To: <bug-102540-4@http.gcc.gnu.org/bugzilla/>
References: <bug-102540-4@http.gcc.gnu.org/bugzilla/>
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable
X-Bugzilla-URL: http://gcc.gnu.org/bugzilla/
Auto-Submitted: auto-generated
MIME-Version: 1.0
X-BeenThere: gcc-bugs@gcc.gnu.org
X-Mailman-Version: 2.1.29
Precedence: list
List-Id: Gcc-bugs mailing list <gcc-bugs.gcc.gnu.org>
List-Unsubscribe: <https://gcc.gnu.org/mailman/options/gcc-bugs>,
 <mailto:gcc-bugs-request@gcc.gnu.org?subject=unsubscribe>
List-Archive: <https://gcc.gnu.org/pipermail/gcc-bugs/>
List-Post: <mailto:gcc-bugs@gcc.gnu.org>
List-Help: <mailto:gcc-bugs-request@gcc.gnu.org?subject=help>
List-Subscribe: <https://gcc.gnu.org/mailman/listinfo/gcc-bugs>,
 <mailto:gcc-bugs-request@gcc.gnu.org?subject=subscribe>
X-List-Received-Date: Mon, 04 Oct 2021 17:15:11 -0000

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102540
--- Comment #6 from Andrew Macleod <amacleod at redhat dot com> ---

>
> >  It removes a
> > relationship between c_10 and _2. The reason ranger no longer can fold _2 == 0
> > is because the sequence is now:
> >
> >     a.0_1 = a;
> >     _2 = (unsigned int) a.0_1;
> >     b = _2;
> >     _6 = a.0_1 & 4294967295;
> >     c_10 = _6;
> >     if (c_10 != 0)
> >       goto <bb 3>; [INV]
> >
> > We do not find _2 is non-zero on the outgoing edge because _2 is not related to
> > the calculation in the condition.  (ie c_10 no longer has a dependency on _2)
> >
> > We do recalculate _2 based on the outgoing range of a.0_1, but with it being a
> > 64 bit value and _2 being 32 bits, we only know the outgoing range of a.0_1 is
> > non-zero.. we dont track any of the upper bits...
> >  2->3  (T) a.0_1 :       long int [-INF, -1][1, +INF]
> > And when we recalculate _2 using that value, we still get varying because
> > 0xFFFF0000 in not zero, but can still produce a zero in _2.
> >
> > The problem is that the condition c_10 != 0 no longer related to the value of
> > _2 in the IL... so ranger never sees it. and we cant represent the 2^16
> > subranges that end in [1,0xFFFF].
> >
> > Before that transformation,
> >   _2 = (unsigned int) a.0_1;
> >    b = _2;
> >   c_10 = (long int) _2;
> > The relationship is obvious, and ranger would relate the c_10 != 0 to _2 no
> > problem.
>
> I see - too bad.  Note the transform made the dependence chain of _6
> one instruction shorter without increasing the number of instructions
> so it's a profitable transform.
>
> Btw, the relation is still there but only indirectly via a.0_1.  The
> old (E)VRP had this find_asserts(?) that produced assertions based
> on the definitions - sth that now range-ops does(?), so it would
> eventually have built assertions for a.0_1 for both conditions and
> allow relations based on that?  I can't seem to find my way around
> the VRP code now - pieces moved all over the place and so my mind
> fails me on the searching task :/

We do know that a.0_1 is non-zero on that edge:
2->3  (T) a.0_1 :       long int [-INF, -1][1, +INF]

The problem is that we can't currently represent that the bitmask operation
rules out all patterns ending in 0x00000000; we just leave it at ~[0,0],
which isn't sufficient for this use case.
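
To make that concrete, here is a minimal C sketch of why ~[0,0] on the 64 bit
value says nothing about the 32 bit cast. It assumes a 32 bit unsigned int and
a 64 bit long int (modelled with uint32_t and int64_t); the helper names are
made up for illustration and are not part of GCC.

```c
#include <assert.h>
#include <stdint.h>

/* Models _2 = (unsigned int) a.0_1: keeps only the low 32 bits. */
static uint32_t low32_cast(int64_t a) { return (uint32_t) a; }

/* Returns 1 when a lies in ~[0,0] and yet the cast still yields zero. */
static int nonzero_but_truncates_to_zero(int64_t a)
{
    return a != 0 && low32_cast(a) == 0;
}

static void check_truncation(void)
{
    /* Upper bits set, low 32 bits clear: non-zero, but _2 would be 0. */
    assert(nonzero_but_truncates_to_zero(INT64_C(0x100000000)));
    assert(nonzero_but_truncates_to_zero(INT64_MIN));
    /* Any value with a low bit set keeps _2 non-zero. */
    assert(!nonzero_but_truncates_to_zero(1));
    assert(!nonzero_but_truncates_to_zero(-1));
}
```

Calling check_truncation() passes: every value whose low 32 bits are clear is
still inside ~[0,0], so that range alone cannot fold _2 == 0.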

We don't currently track any equivalences between values of different
precision (even though ranger once did).  Handling it as a general
equivalence was fraught with issues.

We might be able to add a new equivalence class, "slice" or something. I had
considered it but hadn't seen a compelling need for it.  This would make _6 a
32 bit slice of a.0_1 with range [1, 0xffffffff].
Then when we are querying for the cast
  _2 = (unsigned int) a.0_1;
we could also query the 32 bit equivalence slices of a.0_1, find _6, get
the outgoing range of [1, 0xffffffff], and apply that value.
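
A rough sketch of what such a lookup could look like, written as a toy model
in C. Everything here (struct urange, struct slice, cast_range) is
hypothetical and only illustrates the idea; it is not ranger's API, and a real
implementation would hang off the existing equivalence machinery.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical unsigned range [lo, hi]; a stand-in, not ranger's own type. */
struct urange { uint64_t lo, hi; };

/* Hypothetical record: "name" is the low "bits" bits of "source" and is
   known to lie in "range" on the edge being queried. */
struct slice {
    const char *name;
    const char *source;
    unsigned bits;
    struct urange range;
};

/* Range for a truncating cast of "source" to "bits" bits (bits in 1..64).
   Starts from the full range of the narrow type, then intersects with every
   recorded slice of the same source and width. */
static struct urange
cast_range(const struct slice *slices, size_t n,
           const char *source, unsigned bits)
{
    struct urange r = { 0, bits >= 64 ? UINT64_MAX
                                      : (UINT64_C(1) << bits) - 1 };
    for (size_t i = 0; i < n; i++) {
        if (slices[i].bits != bits || strcmp(slices[i].source, source) != 0)
            continue;
        if (slices[i].range.lo > r.lo) r.lo = slices[i].range.lo;
        if (slices[i].range.hi < r.hi) r.hi = slices[i].range.hi;
    }
    return r;
}

/* On edge 2->3, _6 is a 32 bit slice of a.0_1 with range [1, 0xffffffff]. */
static void check_slice_lookup(void)
{
    const struct slice known[] = {
        { "_6", "a.0_1", 32, { 1, 0xffffffffu } },
    };
    struct urange r = cast_range(known, 1, "a.0_1", 32);
    assert(r.lo == 1 && r.hi == 0xffffffffu);   /* _2 cannot be zero */

    r = cast_range(NULL, 0, "a.0_1", 32);
    assert(r.lo == 0 && r.hi == 0xffffffffu);   /* no slice: varying */
}
```

With the slice recorded, the cast picks up [1, 0xffffffff] and _2 == 0 can be
folded; without it the cast stays varying, which is the current behaviour.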

It would probably resolve an entire class of things where we don't recognize
an equivalence between a cast and a bitmask of equivalent precision.

This would also mean the reverse would apply, i.e. if we instead branched on
_2 != 0 we would also understand that _6 will be non-zero.
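
The two-way claim rests on the cast and the mask always holding the same low
32 bits. A small C check of that identity, again assuming a 32 bit unsigned
int and 64 bit long int, with an illustrative function name that does not
exist in GCC:

```c
#include <assert.h>
#include <stdint.h>

/* Returns 1 when the cast (_2) and the mask (_6) hold the same value for a. */
static int cast_and_mask_agree(int64_t a)
{
    uint32_t cast = (uint32_t) a;             /* _2 = (unsigned int) a.0_1 */
    int64_t mask = a & INT64_C(0xFFFFFFFF);   /* _6 = a.0_1 & 4294967295   */
    return (int64_t) cast == mask;
}

static void check_both_directions(void)
{
    const int64_t samples[] = {
        0, 1, -1, INT64_MIN, INT64_MAX,
        INT64_C(0x100000000), INT64_C(0xFFFF0000),
    };
    for (unsigned i = 0; i < sizeof samples / sizeof samples[0]; i++)
        assert(cast_and_mask_agree(samples[i]));
}
```

Since the values are equal, a non-zero test on either one carries over to the
other, whichever of the two the branch happens to use.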