From: "rguenth at gcc dot gnu.org"
To: gcc-bugs@gcc.gnu.org
Subject: [Bug rtl-optimization/102178] [12 Regression] SPECFP 2006 470.lbm regressions on AMD Zen CPUs after r12-897-gde56f95afaaa22
Date: Thu, 10 Feb 2022 07:45:34 +0000

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102178

--- Comment #29 from Richard Biener ---
(In reply to Vladimir Makarov from comment #28)
> Could somebody benchmark the following patch on zen2 470.lbm.
>
> diff --git a/gcc/lra-constraints.cc b/gcc/lra-constraints.cc
> index 9cee17479ba..76619aca8eb 100644
> --- a/gcc/lra-constraints.cc
> +++ b/gcc/lra-constraints.cc
> @@ -5084,7 +5089,9 @@ lra_constraints (bool first_p)
>                      (x, lra_get_allocno_class (i)) == NO_REGS))
>                  || contains_symbol_ref_p (x))))
>            ira_reg_equiv[i].defined_p = false;
> -        if (contains_reg_p (x, false, true))
> +        if (contains_reg_p (x, false, true)
> +            || (CONST_DOUBLE_P (x)
> +                && maybe_ge (GET_MODE_SIZE (GET_MODE (x)), 8)))
>            ira_reg_equiv[i].profitable_p = false;
>        if (get_equiv (reg) != reg)
>          bitmap_ior_into (equiv_insn_bitmap,
>                           &lra_reg_info[i].insn_bitmap);
>
> If it improves the performance, I'll commit this patch.
>
> The expander unconditionally uses the memory pool for double constants.
> I think the analogous treatment could be done for equiv double constants
> in LRA.
>
> As far as I know, only x86_64 permits 64-bit constants as immediates for
> moving them into general regs.  As double fp operations are not done in
> general regs in most cases, the constants have to be moved into fp regs,
> and this is costly, as Jan wrote.  So it makes sense to prohibit using
> equiv double constant values in LRA unconditionally.  If in the future
> we have a target that can move a double immediate into fp regs, we can
> introduce a target hook to deal with equiv double constants.  But right
> now I think there is no need for a new hook.

Code generation changes quite a bit; with the patch the offending function
is 16 bytes larger.  I see no large immediate moves to GPRs anymore, but
there is still a lot of spilling of XMMs to GPRs.
Performance is unchanged by the patch:

470.lbm    13740    128    107 S    13740    128    107 S
470.lbm    13740    128    107 *    13740    128    107 S
470.lbm    13740    128    107 S    13740    128    107 *

I've used

diff --git a/gcc/lra-constraints.cc b/gcc/lra-constraints.cc
index 9cee17479ba..a0ec608c056 100644
--- a/gcc/lra-constraints.cc
+++ b/gcc/lra-constraints.cc
@@ -5084,7 +5084,9 @@ lra_constraints (bool first_p)
                     (x, lra_get_allocno_class (i)) == NO_REGS))
                 || contains_symbol_ref_p (x))))
           ira_reg_equiv[i].defined_p = false;
-        if (contains_reg_p (x, false, true))
+        if (contains_reg_p (x, false, true)
+            || (CONST_DOUBLE_P (x)
+                && maybe_ge (GET_MODE_SIZE (GET_MODE (x)), UNITS_PER_WORD)))
           ira_reg_equiv[i].profitable_p = false;
       if (get_equiv (reg) != reg)
         bitmap_ior_into (equiv_insn_bitmap,
                          &lra_reg_info[i].insn_bitmap);

note UNITS_PER_WORD vs. the literal 8.

Without knowing much of this code, I wonder if we can check whether the
move will be to a reg in GENERAL_REGS?  That is, do we know whether there
are (besides some special constants like zero) immediate moves to the
destination register class?

That said, given the result on LBM I'd not change this at this point.
Honza wanted to look at the move pattern to try to mitigate the GPR
spilling of XMMs.  I do think that we need to take costs into account at
some point and get rid of the reload-style hand-waving with !?* in the
move patterns.