From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path:
Received: by sourceware.org (Postfix, from userid 48)
	id 6FA153858D37; Wed, 15 Nov 2023 10:36:22 +0000 (GMT)
DKIM-Filter: OpenDKIM Filter v2.11.0 sourceware.org 6FA153858D37
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gcc.gnu.org;
	s=default; t=1700044582;
	bh=rMYDpkfRALn/d+eNJ1Zh4CpbzIiJIvhY3J8xI3wU1+U=;
	h=From:To:Subject:Date:In-Reply-To:References:From;
	b=xyYCaoibV+ZUce9VYQrSt3xg2YNIScIcUh+S6ZuML8z5mhK/VJaQDUHVaVR4XZKX9
	 CapC9+LgoF+vOfkruPCMX8dyMzrxQplF9wupEz68PNbFLqlH5bcHBwpQ8fUckDsOPH
	 GIBX1FXuHifvT7kLtUmbUEj1c+V7Buu43ZPzO/Sg=
From: "rguenth at gcc dot gnu.org"
To: gcc-bugs@gcc.gnu.org
Subject: [Bug tree-optimization/112282] [14 Regression] wrong code (generated code hangs) at -O3 on x86_64-linux-gnu since r14-4777-g88c27070c25309
Date: Wed, 15 Nov 2023 10:36:21 +0000
X-Bugzilla-Reason: CC
X-Bugzilla-Type: changed
X-Bugzilla-Watch-Reason: None
X-Bugzilla-Product: gcc
X-Bugzilla-Component: tree-optimization
X-Bugzilla-Version: 14.0
X-Bugzilla-Keywords: wrong-code
X-Bugzilla-Severity: normal
X-Bugzilla-Who: rguenth at gcc dot gnu.org
X-Bugzilla-Status: ASSIGNED
X-Bugzilla-Resolution:
X-Bugzilla-Priority: P1
X-Bugzilla-Assigned-To: rguenth at gcc dot gnu.org
X-Bugzilla-Target-Milestone: 14.0
X-Bugzilla-Flags:
X-Bugzilla-Changed-Fields: assigned_to bug_status
Message-ID:
In-Reply-To:
References:
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable
X-Bugzilla-URL: http://gcc.gnu.org/bugzilla/
Auto-Submitted: auto-generated
MIME-Version: 1.0
List-Id:

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112282

Richard Biener changed:

           What    |Removed                       |Added
----------------------------------------------------------------------------
           Assignee|unassigned at gcc dot gnu.org |rguenth at gcc dot gnu.org
             Status|NEW                           |ASSIGNED

--- Comment #13 from Richard Biener ---
So I can cut off bitfield lowering completely, the important part is that we
version the loop and thus try to BB vectorize the
loop header (yeah, we don't BB vectorize the whole body - or rather, we think
the header _is_ the full body).

But a key to the failure seems to be that we BB vectorize the unrolled

  for (; ac < 1; ac++)
    for (k = 0; k < 9; k++)
      am[k] = 0;

and doing that not from SLP but from loop vectorization of the if-conversion
versioned (but otherwise unchanged) loop. It's also solely triggered by
unrolling the 'z' loop. Disabling all following passes will still reproduce
it. The region VN triggered by ifconversion/vectorization/unrolling isn't
needed either (I disabled it).

Maybe PR111572 is related (but it doesn't change unrolling and disabling
ch_vect doesn't avoid the problem).

Unrolling does

Analyzing # of iterations of loop 2
  exit condition [23, + , 4294967295] != 0
  bounds on difference of bases: -23 ... -23
  result:
    # of iterations 23, bounded by 23
Removed pointless exit: if (ivtmp_1055 != 0)

because we computed loop->nb_iterations_upper_bound to 21:

Statement (exit)if (ivtmp_1055 != 0)
 is executed at most 23 (bounded by 23) + 1 times in loop 2.
Induction variable (int) 21 + -1 * iteration does not wrap in statement _4 = ~u.13_485;
 in loop 2.
Statement _4 = ~u.13_485;
 is executed at most 21 (bounded by 21) + 1 times in loop 2.
Induction variable (int) -21 + 1 * iteration does not wrap in statement _19 = u.13_485 + 1;
 in loop 2.
Statement _19 = u.13_485 + 1;
 is executed at most 23 (bounded by 23) + 1 times in loop 2.
Reducing loop iteration estimate by 1; undefined statement must be executed at
the last iteration.

We're SCEV analyzing _4 here, computing {21, +, -1}_2, and VRP1 computed
[irange] int [0, +INF] somehow for it. u.13_485 has a global range of
[-2147483647, 1], so obviously it must infer something else here somehow, and
wrongly so? That very same def also appears with plain -O3.

Global Exported: _4 = [irange] int [0, +INF]

Hmm.
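As a sanity check on the numbers in the dump, here is a minimal C sketch (not GCC code; the helper names count_iterations and not_range are made up for illustration, and two's complement int is assumed). It simulates the 32-bit unsigned IV {23, +, 4294967295} with exit test != 0, and computes the range of ~u for u in [-2147483647, 1].

```c
#include <limits.h>
#include <stdint.h>

/* Hypothetical helper: count how often the body runs for an unsigned
   32-bit IV {base, +, step} whose exit test is `iv != 0`.  Adding
   4294967295 is the same as subtracting 1 modulo 2^32.  */
unsigned count_iterations(uint32_t base, uint32_t step)
{
    unsigned n = 0;
    for (uint32_t iv = base; iv != 0; iv += step)
        n++;
    return n;
}

/* Hypothetical helper: range of ~u for u in [lo, hi].  With two's
   complement, ~u == -u - 1, which is decreasing in u, so the bounds
   swap.  */
void not_range(int lo, int hi, int *out_lo, int *out_hi)
{
    *out_lo = ~hi;
    *out_hi = ~lo;
}
```

count_iterations(23, 4294967295) returns 23, in line with the "# of iterations 23" result. not_range(-INT_MAX, 1, ...) yields [-2, 2147483646], so from the global range of u alone ~u can be negative; a range of [0, +INF] for _4 can only hold at a point where u is already known to be negative.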

We have

Folding statement: _64 = ~u.13_20;
Global Exported: _64 = [irange] int [-2, -1] MASK 0x1 VALUE 0xfffffffe
Folding statement: _4 = ~u.13_20;
Global Exported: _4 = [irange] int [0, +INF]

but the if-conversion pass hoists that before the .LOOP_VECTORIZED. Properly
resetting flow-sensitive info on the hoisted stmts fixes this.

Meh. Premature duplicate transforms ...
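
To illustrate why a range recorded under a condition goes stale once the statement is hoisted, here is a toy C sketch. It is not GCC's irange or its API; struct toy_range and all function names are hypothetical. It assumes that the [0, +INF] range for ~u was derived on a path where u < 0 is known, which is my reading of the two dumps above, not something the dump states.

```c
#include <limits.h>
#include <stdbool.h>

/* Toy stand-in for range info attached to an SSA name (hypothetical,
   not GCC's irange).  */
struct toy_range { int min, max; };

/* Range of ~u recorded where u < 0 is already known (assumption):
   ~u == -u - 1 >= 0 there.  */
struct toy_range range_of_not_when_negative(void)
{
    struct toy_range r = { 0, INT_MAX };
    return r;
}

/* What hoisting has to do in this toy model: once the statement runs
   unconditionally, drop the path-dependent range and fall back to the
   full range of the type.  */
struct toy_range reset_flow_sensitive(struct toy_range r)
{
    (void) r;  /* the old range is discarded on purpose */
    struct toy_range varying = { INT_MIN, INT_MAX };
    return varying;
}

/* True if value v is consistent with range r.  */
bool in_range(struct toy_range r, int v)
{
    return r.min <= v && v <= r.max;
}
```

With u = 1 (allowed by the global range of u), ~u is -2. That value lies outside the guarded range [0, INT_MAX] but inside the reset one, so any later pass that trusts the stale range on the hoisted statement is working from a false premise.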