From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <gcc-bugzilla@gcc.gnu.org>
Received: by sourceware.org (Postfix, from userid 48)
	id BC36C3854818; Tue,  1 Nov 2022 15:55:32 +0000 (GMT)
DKIM-Filter: OpenDKIM Filter v2.11.0 sourceware.org BC36C3854818
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gcc.gnu.org;
	s=default; t=1667318132;
	bh=SRx2TxO/0lSkGDg5sryUbrwj/Vi7F4sbGYST4WGyUnA=;
	h=From:To:Subject:Date:In-Reply-To:References:From;
	b=g1F0S7p94gP6FJkAyWKY4qwQNDmsle7Zfe6rg0pNOMCg+4dlRcnOTPRjxncWN7zH+
	 GRC1SI92TMYfdK4pkn+8b8vRR3Vh3fKsAe0W/5e/Ek1Xzpz4PDsEVt1GLOxzhu7RCP
	 ClRT5rz5JyGfV8PuSvEyzd+a7bXNl1khlpKuKDe4=
From: "gjl at gcc dot gnu.org" <gcc-bugzilla@gcc.gnu.org>
To: gcc-bugs@gcc.gnu.org
Subject: [Bug rtl-optimization/90706] [10/11/12/13 Regression] Useless code
 generated for stack / register operations on AVR
Date: Tue, 01 Nov 2022 15:55:30 +0000
X-Bugzilla-Reason: CC
X-Bugzilla-Type: changed
X-Bugzilla-Watch-Reason: None
X-Bugzilla-Product: gcc
X-Bugzilla-Component: rtl-optimization
X-Bugzilla-Version: 9.1.0
X-Bugzilla-Keywords: missed-optimization, ra
X-Bugzilla-Severity: normal
X-Bugzilla-Who: gjl at gcc dot gnu.org
X-Bugzilla-Status: NEW
X-Bugzilla-Resolution: 
X-Bugzilla-Priority: P4
X-Bugzilla-Assigned-To: unassigned at gcc dot gnu.org
X-Bugzilla-Target-Milestone: 10.5
X-Bugzilla-Flags: 
X-Bugzilla-Changed-Fields: attachments.created
Message-ID: <bug-90706-4-1kSpOwdDjh@http.gcc.gnu.org/bugzilla/>
In-Reply-To: <bug-90706-4@http.gcc.gnu.org/bugzilla/>
References: <bug-90706-4@http.gcc.gnu.org/bugzilla/>
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable
X-Bugzilla-URL: http://gcc.gnu.org/bugzilla/
Auto-Submitted: auto-generated
MIME-Version: 1.0
List-Id: <gcc-bugs.sourceware.org>

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90706
--- Comment #13 from Georg-Johann Lay <gjl at gcc dot gnu.org> ---
Created attachment 53812
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=53812&action=edit
Test case with 32-bit integer.

This problem is still present in current master (future v13) and also occurs
with 32-bit integers.

> avr-gcc -S -Os mul.c -fdump-rtl-ira

With v8, mul.s has 15 instructions.

With newer versions, mul.s has 26 additional instructions:
* 12 silly, useless stores into / loads from the frame.
* 12 instructions to set up the frame.
* More instructions due to sub-optimal register allocation.
* Uses a 6-byte stack frame where v8 needs no frame at all.

In the IRA dump, there is:

Pass 0 for finding pseudo/allocno costs
    a0 (r53,l0) best NO_REGS, allocno NO_REGS
    a2 (r49,l0) best GENERAL_REGS, allocno GENERAL_REGS
    a1 (r48,l0) best NO_REGS, allocno NO_REGS
...
Pass 1 for finding pseudo/allocno costs
    r53: preferred NO_REGS, alternative NO_REGS, allocno NO_REGS
    r49: preferred GENERAL_REGS, alternative NO_REGS, allocno GENERAL_REGS
    r48: preferred NO_REGS, alternative NO_REGS, allocno NO_REGS
...
      Spill a0(r53,l0)
      Spill a1(r48,l0)
      Allocno a2r49 of GENERAL_REGS(30) ...

So there are 2 register spills for no reason that lead to that code bloat.
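
To compare compiler versions, the spill lines can be pulled out of the IRA dump mechanically. The following is only a minimal Python sketch, assuming the dump uses exactly the "Spill aN(rM,lK)" form quoted above. The names find_spills and SAMPLE are made up for illustration and are not part of GCC, and SAMPLE just repeats the three lines from the excerpt.

```python
import re

# Matches IRA dump lines such as "      Spill a0(r53,l0)".
# The pattern is an assumption based on the excerpt quoted in this comment.
SPILL_RE = re.compile(r"^\s*Spill a(\d+)\(r(\d+),l(\d+)\)")


def find_spills(dump_text):
    """Return (allocno, pseudo_register) pairs, one per Spill line."""
    spills = []
    for line in dump_text.splitlines():
        m = SPILL_RE.match(line)
        if m:
            spills.append((int(m.group(1)), int(m.group(2))))
    return spills


# The tail of the dump excerpt from this comment.
SAMPLE = """\
      Spill a0(r53,l0)
      Spill a1(r48,l0)
      Allocno a2r49 of GENERAL_REGS(30) ...
"""

print(find_spills(SAMPLE))
```

Running the same function over the text of a dump produced by v8 and by a newer version would show whether r53 and r48 are spilled only by the newer one.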