From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: by sourceware.org (Postfix, from userid 48) id 3171F3A6F09C; Fri, 7 Jun 2024 13:04:30 +0000 (GMT) DKIM-Filter: OpenDKIM Filter v2.11.0 sourceware.org 3171F3A6F09C DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gcc.gnu.org; s=default; t=1717765470; bh=RfYfADjep6IHc3zPH8285FvTAAQYK7eB8vANOVj6CjY=; h=From:To:Subject:Date:In-Reply-To:References:From; b=oPngXGiu3mC5EGZTBzGdfFYJTiwNxq6XF4oLYH7r0qwvs1R2n3FiFj+Jk/j01YyK6 EWh2dcBNsLAY649fBCJN8Ks2O1sCNP26tQuCageYJttlJF6cqvTcX7Oac+s96v2Z4Y hSohypPy2SS7Cm5nLlQfRFBoI3+XtlaGrC/DCeVs= From: "cvs-commit at gcc dot gnu.org" To: gcc-bugs@gcc.gnu.org Subject: [Bug target/115351] [14/15 regression] pointless movs when passing by value on x86-64 Date: Fri, 07 Jun 2024 13:04:29 +0000 X-Bugzilla-Reason: CC X-Bugzilla-Type: changed X-Bugzilla-Watch-Reason: None X-Bugzilla-Product: gcc X-Bugzilla-Component: target X-Bugzilla-Version: 14.1.0 X-Bugzilla-Keywords: missed-optimization, needs-bisection X-Bugzilla-Severity: normal X-Bugzilla-Who: cvs-commit at gcc dot gnu.org X-Bugzilla-Status: NEW X-Bugzilla-Resolution: X-Bugzilla-Priority: P3 X-Bugzilla-Assigned-To: roger at nextmovesoftware dot com X-Bugzilla-Target-Milestone: 14.2 X-Bugzilla-Flags: X-Bugzilla-Changed-Fields: Message-ID: In-Reply-To: References: Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable X-Bugzilla-URL: http://gcc.gnu.org/bugzilla/ Auto-Submitted: auto-generated MIME-Version: 1.0 List-Id: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=3D115351 --- Comment #4 from GCC Commits --- The master branch has been updated by Roger Sayle : https://gcc.gnu.org/g:fb3e4c549d16d5050e10114439ad77149f33c597 commit r15-1101-gfb3e4c549d16d5050e10114439ad77149f33c597 Author: Roger Sayle Date: Fri Jun 7 14:03:20 2024 +0100 i386: PR target/115351: RTX costs for *concatditi3 and *insvti_highpart. This patch addresses PR target/115351, which is a code quality regressi= on on x86 when passing floating point complex numbers. The ABI considers these arguments to have TImode, requiring interunit moves to place the FP values (which are actually passed in SSE registers) into the upper and lower parts of a TImode pseudo, and then similar moves back again before they can be used. The cause of the regression is that changes in how TImode initialization is represented in RTL now prevents the RTL optimizers from eliminating these redundant moves. The specific cause is that the *concatditi3 pattern, (zext(hi)<<64)|zext(lo), has an inappropriately high (default) rtx_cost, preventing fwprop1 from propagating it. This pattern just sets the hipart and lopart of a double-word register, typically two instructions (less if reload can allocate things appropriately) but the current ix86_rtx_costs actually returns INSN_COSTS(13), i.e. 52. propagating insn 5 into insn 6, replacing: (set (reg:TI 110) (ior:TI (and:TI (reg:TI 110) (const_wide_int 0x0ffffffffffffffff)) (ashift:TI (zero_extend:TI (subreg:DI (reg:DF 112 [ zD.2796+8 ]) 0)) (const_int 64 [0x40])))) successfully matched this instruction to *concatditi3_3: (set (reg:TI 110) (ior:TI (ashift:TI (zero_extend:TI (subreg:DI (reg:DF 112 [ zD.2796= +8 ]) 0)) (const_int 64 [0x40])) (zero_extend:TI (subreg:DI (reg:DF 111 [ zD.2796 ]) 0)))) change not profitable (cost 50 -> cost 52) This issue is resolved by having ix86_rtx_costs return more reasonable values for these (place-holder) patterns. 2024-06-07 Roger Sayle gcc/ChangeLog PR target/115351 * config/i386/i386.cc (ix86_rtx_costs): Provide estimates for the *concatditi3 and *insvti_highpart patterns, about two insns. gcc/testsuite/ChangeLog PR target/115351 * g++.target/i386/pr115351.C: New test case.=