From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: by sourceware.org (Postfix, from userid 48) id 991D13858003; Wed, 1 Dec 2021 19:59:52 +0000 (GMT) DKIM-Filter: OpenDKIM Filter v2.11.0 sourceware.org 991D13858003 From: "cvs-commit at gcc dot gnu.org" To: gcc-bugs@gcc.gnu.org Subject: [Bug tree-optimization/101145] niter analysis fails for until-wrap condition Date: Wed, 01 Dec 2021 19:59:52 +0000 X-Bugzilla-Reason: CC X-Bugzilla-Type: changed X-Bugzilla-Watch-Reason: None X-Bugzilla-Product: gcc X-Bugzilla-Component: tree-optimization X-Bugzilla-Version: 12.0 X-Bugzilla-Keywords: missed-optimization X-Bugzilla-Severity: normal X-Bugzilla-Who: cvs-commit at gcc dot gnu.org X-Bugzilla-Status: NEW X-Bugzilla-Resolution: X-Bugzilla-Priority: P3 X-Bugzilla-Assigned-To: unassigned at gcc dot gnu.org X-Bugzilla-Target-Milestone: --- X-Bugzilla-Flags: X-Bugzilla-Changed-Fields: Message-ID: In-Reply-To: References: Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable X-Bugzilla-URL: http://gcc.gnu.org/bugzilla/ Auto-Submitted: auto-generated MIME-Version: 1.0 X-BeenThere: gcc-bugs@gcc.gnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Gcc-bugs mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 01 Dec 2021 19:59:52 -0000 https://gcc.gnu.org/bugzilla/show_bug.cgi?id=3D101145 --- Comment #11 from CVS Commits --- The master branch has been updated by Roger Sayle : https://gcc.gnu.org/g:de3e5aae6c4b540e808c822c1e878b0a3304d09c commit r12-5699-gde3e5aae6c4b540e808c822c1e878b0a3304d09c Author: Roger Sayle Date: Wed Dec 1 19:58:40 2021 +0000 Final value replacement improvements for until-wrap loops. This middle-end patch is inspired by the Richard Beiner's until-wrap loop example in PR tree-optimization/101145. unsigned foo(unsigned val, unsigned start) { unsigned cnt =3D 0; for (unsigned i =3D start; i > val; ++i) cnt++; return cnt; } For this loop, the tree optimizers currently generate: unsigned int foo (unsigned int val, unsigned int start) { unsigned int cnt; unsigned int _1; unsigned int _5; [local count: 118111600]: if (start_3(D) > val_4(D)) goto ; [89.00%] else goto ; [11.00%] [local count: 105119324]: _1 =3D start_3(D) + 1; _5 =3D -start_3(D); cnt_2 =3D _1 > val_4(D) ? _5 : 1; [local count: 118111600]: # cnt_11 =3D PHI return cnt_11; } or perhaps slightly easier to read: if (start > val) { cnt =3D (start+1) > val ? -start : 1; } else cnt =3D 0; In this snippet, if we know start > val, then (start+1) > val unless start+1 overflows, i.e. (start+1) =3D=3D 0 and start =3D=3D ~0. We can use this (loop header) context to simplify the ternary expression to "(start !=3D -1) ? -start : 1", which with a little help from match.pd can be folded to -start. Hence the optimal final value replacement should be: cnt =3D (start > val) ? -start : 0; Or as now generated by this patch: unsigned int foo (unsigned int val, unsigned int start) { unsigned int cnt; [local count: 118111600]: if (start_3(D) > val_4(D)) goto ; [89.00%] else goto ; [11.00%] [local count: 105119324]: cnt_2 =3D -start_3(D); [local count: 118111600]: # cnt_11 =3D PHI return cnt_11; } We can also improve until-wrap loops that don't have a (suitable) loop header, as determined by simplify_using_initial_conditions. unsigned bar(unsigned val, unsigned start) { unsigned cnt =3D 0; unsigned i =3D start; do { cnt++; i++; } while (i > val); return cnt; } which is currently optimized to: unsigned int foo (unsigned int val, unsigned int start) { unsigned int cnt; unsigned int _9; unsigned int _10; [local count: 118111600]: _9 =3D start_4(D) + 1; _10 =3D -start_4(D); cnt_3 =3D val_7(D) < _9 ? _10 : 1; return cnt_3; } Here we have "val < (start+1) ? -start : 1", which again with the help of match.pd can be slightly simplified to "val <=3D start ? -start= : 1" when dealing with unsigned types, because at the complicating value whe= re start =3D=3D ~0, we fortunately have -start =3D=3D 1, hence it doesn't = matter whether the second or third operand of the ternary operator is returned. To summarize, this patch (in addition to tweaking may_be_zero in number_of_iterations_until_wrap) adds three new constant folding transforms to match.pd. X !=3D C1 ? -X : C2 simplifies to -X when -C1 =3D=3D C2. which is the generalized form of the simplification above. X !=3D C1 ? ~X : C2 simplifies to ~X when ~C1 =3D=3D C2. which is the BIT_NOT_EXPR analog of the NEGATE_EXPR case. and the "until-wrap final value replacement without context": (X + 1) > Y ? -X : 1 simplifies to X >=3D Y ? -X : 1 when X is unsigned, as when X + 1 overflows, X is -1, so -X =3D=3D 1. 2021-12-01 Roger Sayle Richard Biener gcc/ChangeLog * tree-ssa-loop-niter.c (number_of_iterations_until_wrap): Check if simplify_using_initial_conditions allows us to simplify the expression for may_be_zero. * match.pd (X !=3D C ? -X : -C -> -X): New transform. (X !=3D C ? ~X : ~C -> ~X): Likewise. ((X+1) > Y ? -X : 1 -> X >=3D Y ? -X : 1): Likewise. gcc/testsuite/ChangeLog * gcc.dg/fold-condneg-1.c: New test case. * gcc.dg/fold-condneg-2.c: New test case. * gcc.dg/fold-condnot-1.c: New test case. * gcc.dg/pr101145-1.c: New test case. * gcc.dg/pr101145-2.c: New test case.=