From: "vmakarov at gcc dot gnu.org"
To: gcc-bugs@gcc.gnu.org
Subject: [Bug rtl-optimization/102178] [12 Regression] SPECFP 2006 470.lbm regressions on AMD Zen CPUs after r12-897-gde56f95afaaa22
Date: Fri, 28 Jan 2022 16:02:13 +0000
X-Bugzilla-Product: gcc
X-Bugzilla-Component: rtl-optimization
X-Bugzilla-Version: 12.0
X-Bugzilla-Keywords: missed-optimization, ra
X-Bugzilla-Severity: normal
X-Bugzilla-Status: NEW
X-Bugzilla-Priority: P1
X-Bugzilla-Target-Milestone: 12.0

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102178

--- Comment #27 from Vladimir Makarov ---
(In reply to Richard Biener from comment #17)
> So in .reload we have (with unpatched trunk)
>
>   401: NOTE_INSN_BASIC_BLOCK 6
>   462: ax:DF=[`*.LC0']
>       REG_EQUAL 9.850689999999999724167309977929107844829559326171875e-1
>   407: xmm2:DF=ax:DF
>   463: ax:DF=[`*.LC0']
>       REG_EQUAL 9.850689999999999724167309977929107844829559326171875e-1
>   408: xmm4:DF=ax:DF
>
> why??! We can load .LC0 into xmm4 directly.
> IRA sees
>
>   401: NOTE_INSN_BASIC_BLOCK 6
>   407: r118:DF=r482:DF
>   408: r119:DF=r482:DF
>
> now I cannot really decipher IRA or LRA dumps but my guess would be that
> inheritance (causing us to load from LC0) interferes badly with register
> class assignment?
>
> Changing pseudo 482 in operand 1 of insn 407 on equiv
> 9.850689999999999724167309977929107844829559326171875e-1
> ...
>   alt=21,overall=9,losers=1,rld_nregs=1
>   Choosing alt 21 in insn 407: (0) v (1) r {*movdf_internal}
>   Creating newreg=525, assigning class GENERAL_REGS to r525
>   407: r118:DF=r525:DF
>   Inserting insn reload before:
>   462: r525:DF=[`*.LC0']
>       REG_EQUAL 9.850689999999999724167309977929107844829559326171875e-1
>
> we should have preferred alt 14 I think (0) v (1) m, but that has
>
>   alt=14,overall=13,losers=1,rld_nregs=0
>   0 Spill pseudo into memory: reject+=3
>   Using memory insn operand 0: reject+=3
>   0 Non input pseudo reload: reject++
>   1 Non-pseudo reload: reject+=2
>   1 Non input pseudo reload: reject++
>   alt=15,overall=28,losers=3 -- refuse
>   0 Costly set: reject++
>   alt=16: Bad operand -- refuse
>   0 Costly set: reject++
>   1 Costly loser: reject++
>   1 Non-pseudo reload: reject+=2
>   1 Non input pseudo reload: reject++
>   alt=17,overall=17,losers=2 -- refuse
>   0 Costly set: reject++
>   1 Spill Non-pseudo into memory: reject+=3
>   Using memory insn operand 1: reject+=3
>   1 Non input pseudo reload: reject++
>   alt=18,overall=14,losers=1 -- refuse
>   0 Spill pseudo into memory: reject+=3
>   Using memory insn operand 0: reject+=3
>   0 Non input pseudo reload: reject++
>   1 Costly loser: reject++
>   1 Non-pseudo reload: reject+=2
>   1 Non input pseudo reload: reject++
>   alt=19,overall=29,losers=3 -- refuse
>   0 Non-prefered reload: reject+=600
>   0 Non input pseudo reload: reject++
>   alt=20,overall=607,losers=1 -- refuse
>   1 Non-pseudo reload: reject+=2
>   1 Non input pseudo reload: reject++
>
> I'm not sure I can decipher the reasoning but I don't understand how it
> doesn't seem to anticipate the cost of reloading the GPR in the alternative
> it chooses?
>
> Vlad?

All these diagnostics are just a description of the voodoo from the old
reload pass. LRA chooses alternatives the same way as the old reload pass
(I doubt that any other approach could avoid breaking existing targets).
The old reload pass simply does not report its decisions in the dump.

The LRA code that chooses insn alternatives
(lra-constraints.cc::process_alt_operands), like the old reload pass, does
not use any memory or register move costs. Instead, the alternative is
chosen by heuristics and insn constraint hints (like ? and !). The only
case where these costs are used is when we have reg:=reg and the register
move cost for it is 2. In this case LRA (like reload) does not bother to
check the insn constraints.
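
To make the "heuristics, not costs" point concrete, here is a rough sketch of the arithmetic behind the quoted dump. It is not the lra-constraints.cc code. It assumes that each group of reject lines belongs to the alt= line that follows it (with the two trailing reject lines belonging to alt 21), and that a loser is weighted by 6, which is the factor that reproduces every overall value in the quote. All names in the sketch are hypothetical.

```python
# Illustrative sketch only: names are hypothetical, not GCC identifiers.
# Assumption: overall = 6 * losers + sum of reject increments. The factor 6
# is chosen because it reproduces every "overall" value in the quoted dump.
LOSER_COST_FACTOR = 6


def overall(losers, rejects):
    """Score of one alternative: weighted losers plus summed reject penalties."""
    return LOSER_COST_FACTOR * losers + sum(rejects)


# alt -> (losers, reject increments read from the dump, overall reported in the dump)
# Assumption: reject lines are attributed to the alt= line that follows them.
dump = {
    15: (3, [3, 3, 1, 2, 1], 28),
    17: (2, [1, 1, 2, 1], 17),
    18: (1, [1, 3, 3, 1], 14),
    19: (3, [3, 3, 1, 1, 2, 1], 29),
    20: (1, [600, 1], 607),
    21: (1, [2, 1], 9),
}

recomputed = {alt: overall(losers, rejects)
              for alt, (losers, rejects, _) in dump.items()}
mismatches = [alt for alt, (_, _, reported) in dump.items()
              if recomputed[alt] != reported]

# The reject lines for alt 14 are elided in the quote, so its reported
# overall value is used as-is instead of being recomputed.
scores = dict(recomputed)
scores[14] = 13

# Lowest score wins in this simplified model; no move cost appears anywhere.
best = min(scores, key=scores.get)

print("mismatches:", mismatches)
print("best alternative:", best)
```

In this simplified model alt 21 (v <- r, overall 9) beats alt 14 (v <- m, overall 13) purely on reject counts, so the cost of the extra GPR load never enters the comparison. The real selection in process_alt_operands has further refusal rules and tie-breakers that are not modelled here.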