From mboxrd@z Thu Jan 1 00:00:00 1970
Received: by sourceware.org (Postfix, from userid 48) id 6C69D385735A; Tue, 14 Jun 2022 09:42:42 +0000 (GMT)
DKIM-Filter: OpenDKIM Filter v2.11.0 sourceware.org 6C69D385735A
From: "jakub at gcc dot gnu.org"
To: gcc-bugs@gcc.gnu.org
Subject: [Bug target/105930] [12/13 Regression] Excessive stack spill generation on 32-bit x86
Date: Tue, 14 Jun 2022 09:42:42 +0000
X-Bugzilla-Reason: CC
X-Bugzilla-Type: changed
X-Bugzilla-Watch-Reason: None
X-Bugzilla-Product: gcc
X-Bugzilla-Component: target
X-Bugzilla-Version: 12.1.1
X-Bugzilla-Keywords: missed-optimization
X-Bugzilla-Severity: normal
X-Bugzilla-Who: jakub at gcc dot gnu.org
X-Bugzilla-Status: NEW
X-Bugzilla-Resolution:
X-Bugzilla-Priority: P3
X-Bugzilla-Assigned-To: unassigned at gcc dot gnu.org
X-Bugzilla-Target-Milestone: 12.2
X-Bugzilla-Flags:
X-Bugzilla-Changed-Fields: keywords
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable
X-Bugzilla-URL: http://gcc.gnu.org/bugzilla/
Auto-Submitted: auto-generated
MIME-Version: 1.0
X-BeenThere: gcc-bugs@gcc.gnu.org
X-Mailman-Version: 2.1.29
Precedence: list
List-Id: Gcc-bugs mailing list
X-List-Received-Date: Tue, 14 Jun 2022 09:42:42 -0000

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105930

Jakub Jelinek changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Keywords|needs-bisection             |

--- Comment #17 from Jakub Jelinek ---
So, I've tried:
--- gcc/config/i386/i386.md.jj	2022-06-13 10:53:26.739290704 +0200
+++ gcc/config/i386/i386.md	2022-06-14 11:09:24.467024047 +0200
@@ -13734,14 +13734,13 @@
 ;; shift instructions and a scratch register.
 (define_insn_and_split "ix86_rotl3_doubleword"
- [(set (match_operand: 0 "register_operand" "=r")
-       (rotate: (match_operand: 1 "register_operand" "0")
-                (match_operand:QI 2 "" "")))
-  (clobber (reg:CC FLAGS_REG))
-  (clobber (match_scratch:DWIH 3 "=&r"))]
-  ""
+ [(set (match_operand: 0 "register_operand")
+       (rotate: (match_operand: 1 "register_operand")
+                (match_operand:QI 2 "")))
+  (clobber (reg:CC FLAGS_REG))]
+  "ix86_pre_reload_split ()"
   "#"
-  "reload_completed"
+  "&& 1"
   [(set (match_dup 3) (match_dup 4))
    (parallel
     [(set (match_dup 4)
@@ -13764,6 +13763,7 @@
                                  (match_dup 6)))) 0)))
      (clobber (reg:CC FLAGS_REG))])]
 {
+  operands[3] = gen_reg_rtx (mode);
   operands[6] = GEN_INT (GET_MODE_BITSIZE (mode) - 1);
   operands[7] = GEN_INT (GET_MODE_BITSIZE (mode));

@@ -13771,14 +13771,13 @@
 })

 (define_insn_and_split "ix86_rotr3_doubleword"
- [(set (match_operand: 0 "register_operand" "=r")
-       (rotatert: (match_operand: 1 "register_operand" "0")
-                  (match_operand:QI 2 "" "")))
-  (clobber (reg:CC FLAGS_REG))
-  (clobber (match_scratch:DWIH 3 "=&r"))]
-  ""
+ [(set (match_operand: 0 "register_operand")
+       (rotatert: (match_operand: 1 "register_operand")
+                  (match_operand:QI 2 "")))
+  (clobber (reg:CC FLAGS_REG))]
+  "ix86_pre_reload_split ()"
   "#"
-  "reload_completed"
+  "&& 1"
   [(set (match_dup 3) (match_dup 4))
    (parallel
     [(set (match_dup 4)
@@ -13801,6 +13800,7 @@
                                  (match_dup 6)))) 0)))
      (clobber (reg:CC FLAGS_REG))])]
 {
+  operands[3] = gen_reg_rtx (mode);
   operands[6] = GEN_INT (GET_MODE_BITSIZE (mode) - 1);
   operands[7] = GEN_INT (GET_MODE_BITSIZE (mode));

On the #c0 test with -O2 -m32 -mno-mmx -mno-sse it makes some difference, but
not as much as one would hope for:
Numbers from gcc 11.3.1 20220614, 11.3.1 20220614 with the patch, 13.0.0
20220610, 13.0.0 20220614 with the patch:
sub on %esp      428   2556   2620   2556
fn size in B   21657  23186  28413  23534
.s lines        6199   3942   7260   4198
So, trunk patched with the above patch results in significantly fewer
instructions, but
larger (more of them use 32-bit immediates, mostly in form of whatever(%esp) memory source operand). And the stack usage is high. I think the patch is still a good idea, it gives the RA more options, but we should investigate why it consumes so much more stack and results in larger code.
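
For readers less familiar with these patterns, here is a minimal C sketch of what a doubleword rotate split into word-sized shifts plus one scratch register computes. It is only an illustration, not the code GCC emits: the names shld32, rotl64_by_halves and rotl64_ref are hypothetical, and the mapping of the two parallels in the split onto a shld-style operation is an assumption based on the visible parts of the pattern (the diff context elides the middle of the split). It only covers rotate counts strictly between 0 and 32.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper modelling a double-precision left shift:
   shift dst left by n and fill the vacated low bits from the top
   of src.  Only valid for 0 < n < 32. */
static uint32_t shld32(uint32_t dst, uint32_t src, unsigned n)
{
    return (dst << n) | (src >> (32 - n));
}

/* Rotate a 64-bit value left by n using only 32-bit operations.
   The scratch copy plays the role of operand 3 in the pattern:
   it preserves the old low word, which is overwritten before the
   high word is computed. */
uint64_t rotl64_by_halves(uint64_t x, unsigned n)
{
    assert(n > 0 && n < 32);

    uint32_t lo = (uint32_t)x;
    uint32_t hi = (uint32_t)(x >> 32);

    uint32_t scratch = lo;        /* (set (match_dup 3) (match_dup 4)) */
    lo = shld32(lo, hi, n);       /* low word takes bits from old high */
    hi = shld32(hi, scratch, n);  /* high word takes bits from old low */

    return ((uint64_t)hi << 32) | lo;
}

/* Straightforward 64-bit reference for comparison, 0 < n < 64. */
uint64_t rotl64_ref(uint64_t x, unsigned n)
{
    return (x << n) | (x >> (64 - n));
}
```

The scratch copy is the value that the old pattern requested via match_scratch after reload and that the patched pattern creates with gen_reg_rtx before reload, so the register allocator sees it as an ordinary pseudo.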