From: "amacleod at redhat dot com"
To: gcc-bugs@gcc.gnu.org
Subject: [Bug tree-optimization/102546] [12 Regression] Missed Dead Code Elimination regression (trunk vs 11.2.0) at -O3
Date: Fri, 01 Oct 2021 13:02:24 +0000
X-Bugzilla-Reason: CC
X-Bugzilla-Type: changed
X-Bugzilla-Watch-Reason: None
X-Bugzilla-Product: gcc
X-Bugzilla-Component: tree-optimization
X-Bugzilla-Version: 12.0
X-Bugzilla-Keywords: missed-optimization
X-Bugzilla-Severity: normal
X-Bugzilla-Who: amacleod at redhat dot com
X-Bugzilla-Status: NEW
X-Bugzilla-Priority: P3
X-Bugzilla-Assigned-To: aldyh at gcc dot gnu.org
X-Bugzilla-Target-Milestone: 12.0
X-Bugzilla-URL: http://gcc.gnu.org/bugzilla/

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102546

--- Comment #8 from Andrew Macleod ---
On 10/1/21 5:18 AM, aldyh at gcc dot gnu.org wrote:
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102546
>
> Aldy Hernandez changed:
>
>            What    |Removed                       |Added
> ----------------------------------------------------------------------------
>    Last reconfirmed|                              |2021-10-01
>              Status|UNCONFIRMED                   |NEW
>      Ever confirmed|0                             |1
>            Assignee|unassigned at gcc dot gnu.org |aldyh at gcc dot gnu.org
>
> --- Comment #5 from Aldy Hernandez ---
> (In reply to Richard Biener from comment #4)
>> (In reply to Aldy Hernandez from comment #2)
>>> By VRP1 we seem to be calculating the following:
>>>
>>> (f_8 << f_8) && (f_8 == 0)
>>>
>>> This would fold to false, which would elide the foo():
>>>
>>> [local count: 59055800]:
>>> b = 0;
>>> _3 = f_8 << f_8;
>>> _4 = (char) _3;
>>> _5 = (int) _4;
>>> if (_4 > 0)
>>> goto ; [64.06%]
>>> else
>>> goto ; [35.94%]
>>>
>>> [local count: 34842922]:
>>> if (f_8 == 0)
>>> goto ; [71.10%]
>>> else
>>> goto ; [28.90%]
>>>
>>> [local count: 12809203]:
>>> foo ();
>> I think it's similar to the other PR: with old EVRP, when visiting BB 8
>> we pushed [1, +INF] as the global range for _4, then supposedly ranger
>> manages to evaluate f_8 == 0 with its backward inferring somehow.
>>
>> We no longer do this "path sensitive" adjustment of (global) ranges since
>> you removed the EVRP DOM walk algorithm.
> The hybrid threader does path sensitive ranges and relationals.  What's missing
> is the range-op entry for the following relation:
>
> ~[0,0] = x << x
>
> In this case, we know that X cannot be 0.  Fixing this causes all the right
> things to happen.
>
> However, I see that none of the op1_range entries are being called with a
> relation.  Presumably this was an oversight on Andrew's part, but can easily be
> fixed.

Not an oversight.  I believe I added the infrastructure to pass
relations to GORI when I introduced relations to range-ops for folding,
but there has not been time to flesh out actually utilizing them yet.

I can maybe take a look at that next week.  Maybe.  It also opens up some
possibilities for solving unsigned overflow questions:

c_1 = a_2 + 2
if (c_1 < a_2)  // check overflow condition

On the true edge, solving [1,1] = c_1 < a_2... would propagate c_1 <
a_2 to the defining insn as: [varying] = a_2 + 2, (LHS < a_2).
op1_range for PLUS can use that relation to determine that a_2 must be
fully contained in [INF - 1, INF] on the TRUE side, and therefore c_1
is [0,1].
The false side would then also calculate a_2 = [0, INF - 2] and c_1 as
[2, INF].

> Interestingly in this case, the VRP threader shouldn't even need to step up
> here.  VRP1 should have folded the conditional in BB8.  The new evrp can
> though.  If I tweak range-ops, and call execute_early_vrp() from VRP1, evrp
> folds the conditional and there's no need to thread.
>
> Now before Andrew asks why evrp doesn't clean this up earlier (with the
> range-ops tweak), it's because the IL is different.  See my question about the
> "a" present in earlier passes. ;-)
>
> So...mine.  I'll address all the issues pointed out.
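
As a sanity check on the overflow ranges worked out earlier in this message for c_1 = a_2 + 2, here is a minimal brute-force sketch in C.  It assumes an 8-bit unsigned type, so INF is 255.  The names range8, observe_edge and check_overflow_ranges are made up for illustration and are not part of GCC or range-ops; this only enumerates values, it does not model how op1_range would actually consume the relation.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Closed interval [lo, hi] of observed 8-bit unsigned values.  */
struct range8
{
  unsigned lo;
  unsigned hi;
};

/* Hypothetical helper, not a GCC API.  Scan every a_2 for which the
   test "c_1 < a_2" evaluates to TAKEN, where c_1 = a_2 + 2 is computed
   in 8 bits, and record the smallest and largest a_2 and c_1 seen.  */
static void
observe_edge (bool taken, struct range8 *a, struct range8 *c)
{
  a->lo = c->lo = UINT8_MAX;
  a->hi = c->hi = 0;

  for (unsigned v = 0; v <= UINT8_MAX; v++)
    {
      uint8_t a_2 = (uint8_t) v;
      uint8_t c_1 = (uint8_t) (a_2 + 2);   /* wraps modulo 256 */

      if ((c_1 < a_2) != taken)
        continue;                          /* value lies on the other edge */

      if (a_2 < a->lo) a->lo = a_2;
      if (a_2 > a->hi) a->hi = a_2;
      if (c_1 < c->lo) c->lo = c_1;
      if (c_1 > c->hi) c->hi = c_1;
    }
}

/* Compare the observed bounds with the ranges stated in the message,
   taking INF = 255.  */
static void
check_overflow_ranges (void)
{
  struct range8 a, c;

  /* TRUE edge: a_2 in [INF - 1, INF], c_1 in [0, 1].  */
  observe_edge (true, &a, &c);
  assert (a.lo == 254 && a.hi == 255);
  assert (c.lo == 0 && c.hi == 1);

  /* FALSE edge: a_2 in [0, INF - 2], c_1 in [2, INF].  */
  observe_edge (false, &a, &c);
  assert (a.lo == 0 && a.hi == 253);
  assert (c.lo == 2 && c.hi == 255);
}
```

Calling check_overflow_ranges () from a test driver exercises both edges.  The 8-bit width is only there to keep the enumeration tiny; the same bounds hold for wider unsigned types with INF scaled accordingly.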