From: "gjl at gcc dot gnu.org"
To: gcc-bugs@gcc.gnu.org
Subject: [Bug rtl-optimization/90706] [10/11/12/13 Regression] Useless code generated for stack / register operations on AVR
Date: Tue, 01 Nov 2022 15:55:30 +0000
X-Bugzilla-Product: gcc
X-Bugzilla-Component: rtl-optimization
X-Bugzilla-Version: 9.1.0
X-Bugzilla-Keywords: missed-optimization, ra
X-Bugzilla-Severity: normal
X-Bugzilla-Who: gjl at gcc dot gnu.org
X-Bugzilla-Status: NEW
X-Bugzilla-Priority: P4
X-Bugzilla-Assigned-To: unassigned at gcc dot gnu.org
X-Bugzilla-Target-Milestone: 10.5
X-Bugzilla-Changed-Fields: attachments.created
X-Bugzilla-URL: http://gcc.gnu.org/bugzilla/

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90706

--- Comment #13 from Georg-Johann Lay ---
Created attachment 53812
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=53812&action=edit
Test case with 32-bit integer.

This problem is still present in current master (future v13) and also
occurs with 32-bit integers.

> avr-gcc -S -Os mul.c -fdump-rtl-ira

With v8, mul.s has 15 instructions.
With newer versions, mul.s has 26 additional instructions:

* 12 silly, useless stores into / loads from frame.
* 12 instructions to set up the frame.
* More instructions due to sub-optimal register alloc.
* Uses 6 bytes stack frame where v8 needs no frame at all.

In the IRA dump, there is:

Pass 0 for finding pseudo/allocno costs

    a0 (r53,l0) best NO_REGS, allocno NO_REGS
    a2 (r49,l0) best GENERAL_REGS, allocno GENERAL_REGS
    a1 (r48,l0) best NO_REGS, allocno NO_REGS
...
Pass 1 for finding pseudo/allocno costs

    r53: preferred NO_REGS, alternative NO_REGS, allocno NO_REGS
    r49: preferred GENERAL_REGS, alternative NO_REGS, allocno GENERAL_REGS
    r48: preferred NO_REGS, alternative NO_REGS, allocno NO_REGS
...
      Spill a0(r53,l0)
      Spill a1(r48,l0)
      Allocno a2r49 of GENERAL_REGS(30) ...

So there are 2 register spills for no reason, and these lead to the
code bloat.
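
To check other test cases or compiler versions for the same symptom, the
spill lines can be pulled out of the IRA dump mechanically. The following
is only a minimal sketch in Python, assuming the dump keeps the
"Spill aN(rM,lK)" line format quoted above. The helper name
spilled_pseudos is made up for illustration and is not part of GCC.

```python
import re

# Matches IRA dump lines of the form "Spill a0(r53,l0)" as quoted above.
# Assumption: other GCC versions print spills in the same format.
SPILL_RE = re.compile(r"Spill\s+a(\d+)\(r(\d+),l(\d+)\)")


def spilled_pseudos(dump_text):
    """Return the pseudo register numbers that the dump reports as spilled."""
    return [int(m.group(2)) for m in SPILL_RE.finditer(dump_text)]


# Excerpt copied from the IRA dump in this comment.
excerpt = """
      Spill a0(r53,l0)
      Spill a1(r48,l0)
      Allocno a2r49 of GENERAL_REGS(30) ...
"""

print(spilled_pseudos(excerpt))  # [53, 48]
```

For a real run, the text of the dump file written by -fdump-rtl-ira would
be read and passed in instead of the excerpt. An empty result would mean
that no pseudo was spilled, which matches what is expected of v8 here
given that it needs no frame at all.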