From: "roger at nextmovesoftware dot com"
To: gcc-bugs@gcc.gnu.org
Subject: [Bug rtl-optimization/111267] [14 Regression] Codegen regression from i386 argument passing changes
Date: Fri, 12 Jan 2024 19:06:40 +0000
X-Bugzilla-Reason: CC
X-Bugzilla-Type: changed
X-Bugzilla-Product: gcc
X-Bugzilla-Component: rtl-optimization
X-Bugzilla-Version: 14.0
X-Bugzilla-Keywords: missed-optimization
X-Bugzilla-Severity: normal
X-Bugzilla-Who: roger at nextmovesoftware dot com
X-Bugzilla-Status: NEW
X-Bugzilla-Priority: P1
X-Bugzilla-Assigned-To: unassigned at gcc dot gnu.org
X-Bugzilla-Target-Milestone: 14.0
X-Bugzilla-URL: http://gcc.gnu.org/bugzilla/

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111267

--- Comment #6 from Roger Sayle ---
Sorry for the delay in replying/answering Jakub's questions/comments.
Yes, using a define_insn_and_split in the backend fixes/works around the issue
(and I agree your implementation/refinement in comment #5 is better than mine
in comment #2), but I've a feeling that this approach isn't the ideal solution.
Nothing about this split is specific to these x86 instructions or even to the
i386 backend.

A more generic fix might be to teach combine.cc that it can split parallels of
two independent sets, with no interdependencies, into two insns if the total
cost of the two new instructions is less than that of the original two, i.e. a
2 insn -> 2 insn combination.

But then even this doesn't feel like the perfect approach... The reason combine
doesn't already support 2->2 combinations is that they're not normally
required; these types of problems are usually handled by GCSE or CSE or PRE
(or ?).

The pattern is that insn1 sets REG1 to a complicated expression, and REG1 is
live in several locations, so this instruction can't be eliminated. However, if
the definition of REG1 is provided to insn2, which sets REG2, this second
instruction can be significantly simplified. This feels like a classic
(non-)constant propagation problem.

I'm thinking perhaps want_to_gcse_p (or somewhere similar) could be tweaked.

For people just joining the discussion (hopefully Jeff or a Richard):

  (set (REG:DI 1) (concat:DI (REG:SI 2) (REG:SI 3)))
  ...
  (set (REG:SI 4) (low_part (REG:DI 1)))

can be simplified so that the second assignment becomes just:

  (set (REG:SI 4) (REG:SI 2))

and similarly for high_part vs. low_part. These don't even need to be in the
same basic block.

In actuality, "concat" is a large ugly expression, and high_part/low_part are
actually SUBREGs (or could be TRUNCATE or SHIFT+TRUNCATE), but the theory
should remain the same.

I'm trying to figure out which pass (or cselib?) is normally responsible for
handling this type of pseudo-reg propagation.
But the define_insn_and_split certainly papers over the deficiency in the
middle-end's RTL optimizers and fixes this (very) specific case/regression.
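
For readers who want the concat/low_part propagation spelled out, it can be modelled outside GCC with a small Python sketch. Everything here is an illustrative assumption rather than GCC code: expressions are nested tuples, the helper names (simplify, substitute, size, propagate) are hypothetical, each pseudo is assumed to be set exactly once, control flow is ignored, and node count stands in for a real cost model.

```python
# Toy model only: none of these helpers exist in GCC.
# A register is ("reg", mode, number); other expressions are (code, operand...).

def simplify(expr):
    """Fold low_part/high_part applied directly to a concat."""
    if isinstance(expr, tuple) and expr[0] in ("low_part", "high_part"):
        inner = expr[1]
        if isinstance(inner, tuple) and inner[0] == "concat":
            # concat is modelled as ("concat", low_half, high_half)
            return inner[1] if expr[0] == "low_part" else inner[2]
    return expr


def substitute(expr, defs):
    """Replace registers that have a recorded definition by that definition."""
    if isinstance(expr, tuple):
        if expr[0] == "reg":
            return defs.get(expr, expr)
        return (expr[0],) + tuple(substitute(op, defs) for op in expr[1:])
    return expr


def size(expr):
    """Crude stand-in for a cost model: number of nodes in the expression."""
    if isinstance(expr, tuple) and expr[0] != "reg":
        return 1 + sum(size(op) for op in expr[1:])
    return 1


def propagate(insns):
    """Forward-propagate definitions; keep a rewrite only if it is smaller."""
    defs = {}
    out = []
    for dest, src in insns:
        candidate = simplify(substitute(src, defs))
        if size(candidate) < size(src):
            src = candidate
        out.append((dest, src))
        defs[dest] = src  # assumes each pseudo is set exactly once
    return out


r1 = ("reg", "DI", 1)
r2 = ("reg", "SI", 2)
r3 = ("reg", "SI", 3)
r4 = ("reg", "SI", 4)
r5 = ("reg", "SI", 5)

insns = [
    (r1, ("concat", r2, r3)),   # insn1: stays, since REG1 is live elsewhere
    (r4, ("low_part", r1)),     # insn2: becomes a plain copy of REG2
    (r5, ("high_part", r1)),    # likewise for the high part
]
result = propagate(insns)
```

The definition of REG1 is left in place and only its uses are rewritten, which matches the 2 insn -> 2 insn shape described above: the insn count does not drop, but the second insn becomes a simple register copy.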