From mboxrd@z Thu Jan 1 00:00:00 1970
From: "rguenth at gcc dot gnu.org"
To: gcc-bugs@gcc.gnu.org
Subject: [Bug tree-optimization/103300] [12 Regression] wrong code at -O3 on x86_64-linux-gnu
Date: Thu, 18 Nov 2021 10:37:19 +0000
X-Bugzilla-Product: gcc
X-Bugzilla-Component: tree-optimization
X-Bugzilla-Version: 12.0
X-Bugzilla-Keywords: needs-bisection, wrong-code
X-Bugzilla-Severity: normal
X-Bugzilla-Status: NEW
X-Bugzilla-Priority: P1
X-Bugzilla-Assigned-To: unassigned at gcc dot gnu.org
X-Bugzilla-Target-Milestone: 12.0
X-Bugzilla-Changed-Fields: cc

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103300

Richard Biener changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |matz at gcc dot gnu.org

--- Comment #4 from Richard Biener ---
(In reply to Andrew Pinski from comment #3)
> (In reply to hubicka from comment #2)
> > Needs -O2 -floop-unroll-and-jam --param early-inlining-insns=14
> > to fail, so I guess it may be issue with unrol-and-jam.
>
> The major difference I see between GCC 11 and GCC 12 is how tree-loop-im
> handles the load/store of a and c.
> In GCC 11, it was an unconditional move
> of the store of a and c while in GCC 12 we get some interesting branches:
>   [local count: 35059055]:
>   # a_lsm.21_25 = PHI <_20(D)(6), _15(8)>
>   # a_lsm_flag.22_8 = PHI <0(6), 1(8)>
>   # c_lsm.23_22 = PHI <0(6), _5(8)>
>   if (c_lsm.23_22 <= 2)
>     goto ; [94.50%]
>   else
>     goto ; [5.50%]
>
>   [local count: 1928248]:
>   # a_lsm_flag.22_14 = PHI
>   # a_lsm.21_28 = PHI
>   c_lsm.23_27 = 3;
>   if (a_lsm_flag.22_14 != 0)
>     goto ; [66.67%]
>   else
>     goto ; [33.33%]
>
>   [local count: 1285499]:
>   c = c_lsm.23_27;
>
>   [local count: 1285499]:
>   if (a_lsm_flag.22_14 != 0)
>     goto ; [66.67%]
>   else
>     goto ; [33.33%]
>
>   [local count: 856999]:
>   a = a_lsm.21_28;
>
>   [local count: 1928248]:

That's likely a missed threading / header copying; the stores are
conditional now and thus need protecting against store data races.

What unroll-and-jam does is make the inner loop always enter, considering
the loop header check only for the second iteration, and it also fails to
include the increment.  That's likely a latent issue, maybe because the
latch of the outer loop is not empty?

Testcase that fails with -O2 -floop-unroll-and-jam:

int a, b[2], c, d, e, f;
int g(int h, int i) { return !i || h && i == 1 ? 0 : h % i; }
static void j() {
  while (1)
    while (1) {
      if (d)
      L:
        if (f)
          break;
      if (e)
        goto L;
      return;
    }
}
int main() {
  j();
  for (c = 0; c < 3; c++)
    for (a = 0; a < 2; a++)
      if (g(0, b[a]++))
        while (1)
          ;
  if (b[1] != 3)
    __builtin_abort();
  return 0;
}

Micha?
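
For reference, here is a rough hand-written sketch of what a correct
unroll-and-jam by 2 of the c/a nest in main() should amount to, checked
against the plain nest.  This is only an illustration, not what the pass
actually emits: the function names are made up, the hang on a nonzero
g() result is dropped, and b is passed in as a local array instead of
being a global.  The point is that the fused inner loop keeps its header
check for every iteration and carries both copies of the body, including
the increment, so b[1] still ends up as 3.

```c
#include <assert.h>

enum { OUTER = 3, INNER = 2 };

/* Same expression as g() in the testcase, with the implicit precedence
   spelled out; named g_sketch here to mark it as illustrative. */
int g_sketch(int h, int i) { return (!i || (h && i == 1)) ? 0 : h % i; }

/* The loop nest as written in the testcase, without the while (1) hang. */
void nest_reference(int *b) {
  for (int c = 0; c < OUTER; c++)
    for (int a = 0; a < INNER; a++)
      (void)g_sketch(0, b[a]++);
}

/* Hypothetical hand-applied unroll-and-jam by 2: two outer iterations
   share one inner loop, and a remainder loop handles the odd trip. */
void nest_unroll_and_jam(int *b) {
  int c = 0;
  for (; c + 1 < OUTER; c += 2)
    for (int a = 0; a < INNER; a++) {  /* header check on every iteration */
      (void)g_sketch(0, b[a]++);       /* body copy for outer iteration c */
      (void)g_sketch(0, b[a]++);       /* body copy for outer iteration c + 1 */
    }
  for (; c < OUTER; c++)               /* remainder outer iteration */
    for (int a = 0; a < INNER; a++)
      (void)g_sketch(0, b[a]++);
}

/* Runs one of the nests on a zeroed array and returns its last element. */
int last_element_after(void (*nest)(int *)) {
  int b[INNER] = {0, 0};
  nest(b);
  return b[INNER - 1];
}

/* Both forms must agree with what the testcase expects for b[1]. */
void self_check(void) {
  assert(last_element_after(nest_reference) == OUTER);
  assert(last_element_after(nest_unroll_and_jam) == OUTER);
}
```

Jamming is fine here because the two fused copies touch b[a] in the same
order as the original outer iterations did, and different values of a
never touch the same element.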