From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: by sourceware.org (Postfix, from userid 48) id 201F6384803F; Mon, 31 May 2021 14:08:25 +0000 (GMT) DKIM-Filter: OpenDKIM Filter v2.11.0 sourceware.org 201F6384803F From: "cvs-commit at gcc dot gnu.org" To: gcc-bugs@gcc.gnu.org Subject: [Bug rtl-optimization/100342] [10/11 Regression] wrong code with -O2 -fno-dse -fno-forward-propagate -mno-sse2 Date: Mon, 31 May 2021 14:08:24 +0000 X-Bugzilla-Reason: CC X-Bugzilla-Type: changed X-Bugzilla-Watch-Reason: None X-Bugzilla-Product: gcc X-Bugzilla-Component: rtl-optimization X-Bugzilla-Version: 11.1.1 X-Bugzilla-Keywords: wrong-code X-Bugzilla-Severity: normal X-Bugzilla-Who: cvs-commit at gcc dot gnu.org X-Bugzilla-Status: NEW X-Bugzilla-Resolution: X-Bugzilla-Priority: P2 X-Bugzilla-Assigned-To: unassigned at gcc dot gnu.org X-Bugzilla-Target-Milestone: 10.4 X-Bugzilla-Flags: X-Bugzilla-Changed-Fields: Message-ID: In-Reply-To: References: Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable X-Bugzilla-URL: http://gcc.gnu.org/bugzilla/ Auto-Submitted: auto-generated MIME-Version: 1.0 X-BeenThere: gcc-bugs@gcc.gnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Gcc-bugs mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 31 May 2021 14:08:25 -0000 https://gcc.gnu.org/bugzilla/show_bug.cgi?id=3D100342 --- Comment #15 from CVS Commits --- The releases/gcc-11 branch has been updated by Jakub Jelinek : https://gcc.gnu.org/g:81c2cd08fafcdc0ff48f6cf440bef86f98822a23 commit r11-8486-g81c2cd08fafcdc0ff48f6cf440bef86f98822a23 Author: Jakub Jelinek Date: Sat May 15 10:12:11 2021 +0200 regcprop: Fix another cprop_hardreg bug [PR100342] On Tue, Jan 19, 2021 at 04:10:33PM +0000, Richard Sandiford via Gcc-pat= ches wrote: > Ah, ok, thanks for the extra context. > > So AIUI the problem when recording xmm2<-di isn't just: > > [A] partial_subreg_p (vd->e[sr].mode, GET_MODE (src)) > > but also that: > > [B] partial_subreg_p (vd->e[sr].mode, vd->e[vd->e[sr].oldest_regno].mode) > > For example, all registers in this sequence can be part of the same chain: > > (set (reg:HI R1) (reg:HI R0)) > (set (reg:SI R2) (reg:SI R1)) // [A] > (set (reg:DI R3) (reg:DI R2)) // [A] > (set (reg:SI R4) (reg:SI R[0-3])) > (set (reg:HI R5) (reg:HI R[0-4])) > > But: > > (set (reg:SI R1) (reg:SI R0)) > (set (reg:HI R2) (reg:HI R1)) > (set (reg:SI R3) (reg:SI R2)) // [A] && [B] > > is problematic because it dips below the precision of the oldest regno > and then increases again. > > When this happens, I guess we have two choices: > > (1) what the patch does: treat R3 as the start of a new chain. > (2) pretend that the copy occured in vd->e[sr].mode instead > (i.e. copy vd->e[sr].mode to vd->e[dr].mode) > > I guess (2) would need to be subject to REG_CAN_CHANGE_MODE_P. > Maybe the optimisation provided by (2) compared to (1) isn't common > enough to be worth the complication. > > I think we should test [B] as well as [A] though. The pass is set > up to do some quite elaborate mode changes and I think rejecting > [A] on its own would make some of the other code redundant. > It also feels like it should be a seperate =C3=A2if=C3=A2 or =C3=A2el= se if=C3=A2, > with its own comment. Unfortunately, we now have a testcase that shows that testing also [B] is a problem (unfortunately now latent on the trunk, only reproduces on 10 and 11 branches). The comment in the patch tries to list just the interesting instruction= s, we have a 64-bit value, copy low 8 bit of those to another register, copy full 64 bits to another register and then clobber the original register. Before that (set (reg:DI r14) (const_int ...)) we have a chain DI r14, QI si, DI bp , that instruction drops the DI r14 from that chai= n, so we have QI si, DI bp , si being the oldest_regno. Next DI si is copied into DI dx. Only the low 8 bits of that are defin= ed, the rest is unspecified, but we would add DI dx into that same chain at= the end, so QI si, DI bp, DI dx [*]. Next si is overwritten, so the chain = is DI bp, DI dx. And then we see (set (reg:DI dx) (reg:DI bp)) and remove= it as redundant, because we think bp and dx are already equivalent, when in reality that is true only for the lowpart 8 bits. I believe the [*] marked step above is where the bug is. The committed regcprop.c (copy_value) change (but only committed to trunk/11, not to 10) added else if (partial_subreg_p (vd->e[sr].mode, GET_MODE (src)) && partial_subreg_p (vd->e[sr].mode, vd->e[vd->e[sr].oldest_regno].mode)) return; and while the first partial_subreg_p call returns true, the second one doesn't; before the (set (reg:DI r14) (const_int ...)) insn it would be true and we'd return, but as that reg got clobbered, si became the olde= st regno in the chain and so vd->e[vd->e[sr].oldest_regno].mode is QImode and vd->e[sr].mode is QImode too, so the second partial_subreg_p is fal= se. But as the testcase shows, what is the oldest_regno in the chain is something that changes over time, so relying on it for anything is problematic, something could have a different oldest_regno and later on get a different oldest_regno (perhaps with different mode) because the oldest_regno got overwritten and it can change both ways. The following patch effectively implements your (2) above. 2021-05-15 Jakub Jelinek PR rtl-optimization/100342 * regcprop.c (copy_value): When copying a source reg in a wider mode than it has recorded for the value, adjust recorded destination mode too or punt if !REG_CAN_CHANGE_MODE_P. * gcc.target/i386/pr100342.c: New test. (cherry picked from commit 425ad87dcfacbb326d8f448a0f2b4d6b53dcd98f)=