From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: by sourceware.org (Postfix, from userid 48) id 904F23857C4F; Fri, 18 Feb 2022 07:26:56 +0000 (GMT)
DKIM-Filter: OpenDKIM Filter v2.11.0 sourceware.org 904F23857C4F
From: "rguenth at gcc dot gnu.org"
To: gcc-bugs@gcc.gnu.org
Subject: [Bug tree-optimization/104582] [11/12 Regression] Unoptimal code for __negdi2 (and others) from libgcc2 due to unwanted vectorization
Date: Fri, 18 Feb 2022 07:26:56 +0000
X-Bugzilla-Reason: CC
X-Bugzilla-Type: changed
X-Bugzilla-Watch-Reason: None
X-Bugzilla-Product: gcc
X-Bugzilla-Component: tree-optimization
X-Bugzilla-Version: 12.0
X-Bugzilla-Keywords: missed-optimization
X-Bugzilla-Severity: normal
X-Bugzilla-Who: rguenth at gcc dot gnu.org
X-Bugzilla-Status: NEW
X-Bugzilla-Resolution:
X-Bugzilla-Priority: P2
X-Bugzilla-Assigned-To: unassigned at gcc dot gnu.org
X-Bugzilla-Target-Milestone: 11.3
X-Bugzilla-Flags:
X-Bugzilla-Changed-Fields:
Message-ID:
In-Reply-To:
References:
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable
X-Bugzilla-URL: http://gcc.gnu.org/bugzilla/
Auto-Submitted: auto-generated
MIME-Version: 1.0
X-BeenThere: gcc-bugs@gcc.gnu.org
X-Mailman-Version: 2.1.29
Precedence: list
List-Id: Gcc-bugs mailing list
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
X-List-Received-Date: Fri, 18 Feb 2022 07:26:56 -0000

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104582

--- Comment #7 from Richard Biener ---
(In reply to Jakub Jelinek from comment #5)
> The costs look weird:
> _1 1 times scalar_store costs 12 in body
> _5 1 times scalar_store costs 12 in body
> _1 1 times vector_store costs 12 in body
> 1 times vec_construct costs 8 in prologue
> vec_construct is certainly more expensive than a store (especially in this
> case when it is a store into a TImode variable which isn't addressable and
> will not be in memory at all).

x86 can do cheap move low/hi so the construct isn't expensive.
Note it only gets expensive in the end because the "memory" isn't really
memory and the return ABI isn't exposed.

Just as a wild idea, maybe we can pessimize vector stores into
!TREE_ADDRESSABLE automatic variables ...

We do already have some "weird" code in vect_model_store_cost employing
hard_function_value to deal with stores to RESULT_DECLs, but here 'w'
isn't a RESULT_DECL. In the code we assume that what happens is a spill
of the vector followed by loads of the components.

What's missing in the CTOR cost is the move from GPR to XMM regs when we
are not dealing with FP or vector components (or direct memory sources).
Getting that applied only for the relevant cases isn't easy since it
requires looking at the defs.

One could try to amend the vect_model_store_cost handling: at the
beginning of the SLP pass, analyze stmts from the function return, mark
in some way the decls we return a loaded value from, and handle those in
a similar way.
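
For reference, a minimal sketch of the code shape under discussion, modelled
on __negdi2 from libgcc2 but not copied from it. The names word_t, uword_t,
dword_t, dw_union and neg_dword are made up for illustration, and the layout
assumes a 64-bit little-endian target with __int128 support (GCC or Clang).
The two adjacent word stores into the non-addressable local 'w' would
correspond to the two scalar_store entries in the quoted cost dump, and they
are what SLP can turn into a vec_construct plus one vector_store.

```c
#include <stdint.h>

/* Hypothetical stand-ins for libgcc2's word and double-word types. */
typedef int64_t word_t;
typedef uint64_t uword_t;
typedef __int128 dword_t; /* TImode on x86_64; GCC/Clang extension */

/* Assumes little-endian layout: low word first. */
typedef union {
  struct { word_t low, high; } s;
  dword_t ll;
} dw_union;

/* Negate a double-word value one word at a time. */
dword_t neg_dword(dword_t u)
{
  dw_union uu, w;
  uu.ll = u;

  /* Work in unsigned arithmetic so that negation cannot overflow. */
  uword_t lo = 0 - (uword_t) uu.s.low;
  /* Borrow from the high word whenever the low word is nonzero. */
  uword_t hi = 0 - (uword_t) uu.s.high - (lo > 0);

  /* Two adjacent word stores into 'w', then one double-word load of
     'w' for the return value.  */
  w.s.low = (word_t) lo;
  w.s.high = (word_t) hi;
  return w.ll;
}
```

Since 'w' is never address-taken and is only read back as a whole for the
return, it should not need to live in memory at all, which is why a vector
store into it only pays off on paper.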