From mboxrd@z Thu Jan 1 00:00:00 1970
Message-ID: <912096ce-3910-dd5f-cfb3-f196bf5b9ae5@yahoo.co.jp>
Date: Fri, 2 Sep 2022 19:18:52 +0900
MIME-Version: 1.0
User-Agent: Mozilla/5.0 (Windows NT 6.1; Win64; x64; rv:102.0) Gecko/20100101 Thunderbird/102.2.1
To: GCC Patches
Cc: Max Filippov
From: Takayuki 'January June' Suwa
Subject: [PATCH v2 1/2] xtensa: Eliminate unused stack frame allocation/freeing
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 7bit

Changes from v1:
  (xtensa_expand_epilogue): Fixed a failure to consider
  hard_frame_pointer_rtx when sharing code.

---

In the example below, 'x' is first placed on the stack frame and then read
into registers as the argument value of bar():

    /* example */
    struct foo {
      int a, b;
    };
    extern struct foo bar(struct foo);
    struct foo test(void) {
      struct foo x = { 0, 1 };
      return bar(x);
    }

Thanks to dead store elimination, the initialization of 'x' turns into
merely loading the immediates into registers, but the corresponding stack
frame growth is not rolled back.  As a result:

    ;; prereq: the CALL0 ABI
    ;; before
    test:
        addi    sp, sp, -16     // unused stack frame allocation/freeing
        movi.n  a2, 0
        movi.n  a3, 1
        addi    sp, sp, 16      // because no instructions that refer to
        j.l     bar, a9         // the stack pointer between the two

This patch eliminates such unused stack frame allocation/freeing:

    ;; after
    test:
        movi.n  a2, 0
        movi.n  a3, 1
        j.l     bar, a9

gcc/ChangeLog:

        * config/xtensa/xtensa.cc (machine_function): New member to track
        the insns for stack pointer adjustment inside of the pro/epilogue.
        (xtensa_emit_adjust_stack_ptr): New function to share the common
        code and to record the insns for stack pointer adjustment.
        (xtensa_expand_prologue): Change to use the function mentioned
        above when using the CALL0 ABI.
        (xtensa_expand_epilogue): Ditto.  And also change to cancel
        emitting the insns for the stack pointer adjustment if the stack
        pointer is only used for its own adjustment.
---
 gcc/config/xtensa/xtensa.cc | 230 ++++++++++++++++++------------------
 1 file changed, 118 insertions(+), 112 deletions(-)

diff --git a/gcc/config/xtensa/xtensa.cc b/gcc/config/xtensa/xtensa.cc
index b673b6764da..17416fc6c3f 100644
--- a/gcc/config/xtensa/xtensa.cc
+++ b/gcc/config/xtensa/xtensa.cc
@@ -102,6 +102,7 @@ struct GTY(()) machine_function
   int callee_save_size;
   bool frame_laid_out;
   bool epilogue_done;
+  hash_set<rtx_insn *> *logues_a1_adjusts;
 };
 
 /* Vector, indexed by hard register number, which contains 1 for a
@@ -3048,7 +3049,7 @@ xtensa_output_literal (FILE *file, rtx x, machine_mode mode, int labelno)
 }
 
 static bool
-xtensa_call_save_reg(int regno)
+xtensa_call_save_reg (int regno)
 {
   if (TARGET_WINDOWED_ABI)
     return false;
@@ -3084,7 +3085,7 @@ compute_frame_size (poly_int64 size)
   cfun->machine->callee_save_size = 0;
   for (regno = 0; regno < FIRST_PSEUDO_REGISTER; ++regno)
     {
-      if (xtensa_call_save_reg(regno))
+      if (xtensa_call_save_reg (regno))
         cfun->machine->callee_save_size += UNITS_PER_WORD;
     }
 
@@ -3143,6 +3144,51 @@ xtensa_initial_elimination_offset (int from, int to ATTRIBUTE_UNUSED)
    and the total number of words must be a multiple of 128 bits.  */
 #define MIN_FRAME_SIZE (8 * UNITS_PER_WORD)
 
+#define ADJUST_SP_NONE      0x0
+#define ADJUST_SP_NEED_NOTE 0x1
+#define ADJUST_SP_FRAME_PTR 0x2
+static rtx_insn *
+xtensa_emit_adjust_stack_ptr (HOST_WIDE_INT offset, int flags)
+{
+  rtx_insn *insn;
+  rtx ptr = (flags & ADJUST_SP_FRAME_PTR) ? hard_frame_pointer_rtx
+                                          : stack_pointer_rtx;
+
+  if (xtensa_simm8 (offset)
+      || xtensa_simm8x256 (offset))
+    insn = emit_insn (gen_addsi3 (stack_pointer_rtx, ptr, GEN_INT (offset)));
+  else
+    {
+      rtx tmp_reg = gen_rtx_REG (Pmode, A9_REG);
+      rtx_insn* tmp_insn;
+
+      if (offset < 0)
+        {
+          tmp_insn = emit_move_insn (tmp_reg, GEN_INT (-offset));
+          insn = emit_insn (gen_subsi3 (stack_pointer_rtx, ptr, tmp_reg));
+        }
+      else
+        {
+          tmp_insn = emit_move_insn (tmp_reg, GEN_INT (offset));
+          insn = emit_insn (gen_addsi3 (stack_pointer_rtx, ptr, tmp_reg));
+        }
+      cfun->machine->logues_a1_adjusts->add (tmp_insn);
+    }
+
+  if (flags & ADJUST_SP_NEED_NOTE)
+    {
+      rtx note_rtx = gen_rtx_SET (stack_pointer_rtx,
+                                  plus_constant (Pmode, stack_pointer_rtx,
+                                                 offset));
+
+      RTX_FRAME_RELATED_P (insn) = 1;
+      add_reg_note (insn, REG_FRAME_RELATED_EXPR, note_rtx);
+    }
+
+  cfun->machine->logues_a1_adjusts->add (insn);
+  return insn;
+}
+
 void
 xtensa_expand_prologue (void)
 {
@@ -3175,16 +3221,13 @@ xtensa_expand_prologue (void)
       HOST_WIDE_INT offset = 0;
       int callee_save_size = cfun->machine->callee_save_size;
 
+      cfun->machine->logues_a1_adjusts = new hash_set<rtx_insn *>;
+
       /* -128 is a limit of single addi instruction. */
       if (IN_RANGE (total_size, 1, 128))
         {
-          insn = emit_insn (gen_addsi3 (stack_pointer_rtx, stack_pointer_rtx,
-                                        GEN_INT (-total_size)));
-          RTX_FRAME_RELATED_P (insn) = 1;
-          note_rtx = gen_rtx_SET (stack_pointer_rtx,
-                                  plus_constant (Pmode, stack_pointer_rtx,
-                                                 -total_size));
-          add_reg_note (insn, REG_FRAME_RELATED_EXPR, note_rtx);
+          insn = xtensa_emit_adjust_stack_ptr (-total_size,
+                                               ADJUST_SP_NEED_NOTE);
           offset = total_size - UNITS_PER_WORD;
         }
       else if (callee_save_size)
@@ -3194,75 +3237,35 @@ xtensa_expand_prologue (void)
            * move it to its final location. */
           if (total_size > 1024)
             {
-              insn = emit_insn (gen_addsi3 (stack_pointer_rtx, stack_pointer_rtx,
-                                            GEN_INT (-callee_save_size)));
-              RTX_FRAME_RELATED_P (insn) = 1;
-              note_rtx = gen_rtx_SET (stack_pointer_rtx,
-                                      plus_constant (Pmode, stack_pointer_rtx,
-                                                     -callee_save_size));
-              add_reg_note (insn, REG_FRAME_RELATED_EXPR, note_rtx);
+              insn = xtensa_emit_adjust_stack_ptr (-callee_save_size,
+                                                   ADJUST_SP_NEED_NOTE);
               offset = callee_save_size - UNITS_PER_WORD;
             }
           else
             {
-              if (xtensa_simm8x256 (-total_size))
-                insn = emit_insn (gen_addsi3 (stack_pointer_rtx,
-                                              stack_pointer_rtx,
-                                              GEN_INT (-total_size)));
-              else
-                {
-                  rtx tmp_reg = gen_rtx_REG (Pmode, A9_REG);
-                  emit_move_insn (tmp_reg, GEN_INT (total_size));
-                  insn = emit_insn (gen_subsi3 (stack_pointer_rtx,
-                                                stack_pointer_rtx, tmp_reg));
-                }
-              RTX_FRAME_RELATED_P (insn) = 1;
-              note_rtx = gen_rtx_SET (stack_pointer_rtx,
-                                      plus_constant (Pmode, stack_pointer_rtx,
-                                                     -total_size));
-              add_reg_note (insn, REG_FRAME_RELATED_EXPR, note_rtx);
+              insn = xtensa_emit_adjust_stack_ptr (-total_size,
+                                                   ADJUST_SP_NEED_NOTE);
               offset = total_size - UNITS_PER_WORD;
             }
         }
 
       for (regno = 0; regno < FIRST_PSEUDO_REGISTER; ++regno)
-        {
-          if (xtensa_call_save_reg(regno))
-            {
-              rtx x = gen_rtx_PLUS (Pmode, stack_pointer_rtx, GEN_INT (offset));
-              rtx mem = gen_frame_mem (SImode, x);
-              rtx reg = gen_rtx_REG (SImode, regno);
-
-              offset -= UNITS_PER_WORD;
-              insn = emit_move_insn (mem, reg);
-              RTX_FRAME_RELATED_P (insn) = 1;
-              add_reg_note (insn, REG_FRAME_RELATED_EXPR,
-                            gen_rtx_SET (mem, reg));
-            }
-        }
+        if (xtensa_call_save_reg (regno))
+          {
+            rtx x = gen_rtx_PLUS (Pmode, stack_pointer_rtx, GEN_INT (offset));
+            rtx mem = gen_frame_mem (SImode, x);
+            rtx reg = gen_rtx_REG (SImode, regno);
+
+            offset -= UNITS_PER_WORD;
+            insn = emit_move_insn (mem, reg);
+            RTX_FRAME_RELATED_P (insn) = 1;
+            add_reg_note (insn, REG_FRAME_RELATED_EXPR,
+                          gen_rtx_SET (mem, reg));
+          }
       if (total_size > 1024
           || (!callee_save_size && total_size > 128))
-        {
-          if (xtensa_simm8x256 (callee_save_size - total_size))
-            insn = emit_insn (gen_addsi3 (stack_pointer_rtx,
-                                          stack_pointer_rtx,
-                                          GEN_INT (callee_save_size -
-                                                   total_size)));
-          else
-            {
-              rtx tmp_reg = gen_rtx_REG (Pmode, A9_REG);
-              emit_move_insn (tmp_reg, GEN_INT (total_size -
-                                                callee_save_size));
-              insn = emit_insn (gen_subsi3 (stack_pointer_rtx,
-                                            stack_pointer_rtx, tmp_reg));
-            }
-          RTX_FRAME_RELATED_P (insn) = 1;
-          note_rtx = gen_rtx_SET (stack_pointer_rtx,
-                                  plus_constant (Pmode, stack_pointer_rtx,
-                                                 callee_save_size -
-                                                 total_size));
-          add_reg_note (insn, REG_FRAME_RELATED_EXPR, note_rtx);
-        }
+        insn = xtensa_emit_adjust_stack_ptr (callee_save_size - total_size,
+                                             ADJUST_SP_NEED_NOTE);
     }
 
   if (frame_pointer_needed)
@@ -3326,24 +3329,17 @@ xtensa_expand_epilogue (bool sibcall_p)
     {
       int regno;
       HOST_WIDE_INT offset;
+      df_ref ref;
+      bool stack_pointer_used = false;
 
-      if (cfun->machine->current_frame_size > (frame_pointer_needed ? 127 : 1024))
+      if (cfun->machine->current_frame_size > (frame_pointer_needed ? 127
+                                                                    : 1024))
         {
-          if (xtensa_simm8x256 (cfun->machine->current_frame_size -
-                                cfun->machine->callee_save_size))
-            emit_insn (gen_addsi3 (stack_pointer_rtx, frame_pointer_needed ?
-                                   hard_frame_pointer_rtx : stack_pointer_rtx,
-                                   GEN_INT (cfun->machine->current_frame_size -
-                                            cfun->machine->callee_save_size)));
-          else
-            {
-              rtx tmp_reg = gen_rtx_REG (Pmode, A9_REG);
-              emit_move_insn (tmp_reg, GEN_INT (cfun->machine->current_frame_size -
-                                                cfun->machine->callee_save_size));
-              emit_insn (gen_addsi3 (stack_pointer_rtx, frame_pointer_needed ?
-                                     hard_frame_pointer_rtx : stack_pointer_rtx,
-                                     tmp_reg));
-            }
+          xtensa_emit_adjust_stack_ptr (cfun->machine->current_frame_size -
+                                        cfun->machine->callee_save_size,
+                                        frame_pointer_needed
+                                        ? ADJUST_SP_FRAME_PTR
+                                        : ADJUST_SP_NONE);
           offset = cfun->machine->callee_save_size - UNITS_PER_WORD;
         }
       else
@@ -3359,19 +3355,18 @@ xtensa_expand_epilogue (bool sibcall_p)
       emit_insn (gen_blockage ());
 
       for (regno = 0; regno < FIRST_PSEUDO_REGISTER; ++regno)
-        {
-          if (xtensa_call_save_reg(regno))
-            {
-              rtx x = gen_rtx_PLUS (Pmode, stack_pointer_rtx, GEN_INT (offset));
-              rtx reg;
-
-              offset -= UNITS_PER_WORD;
-              emit_move_insn (reg = gen_rtx_REG (SImode, regno),
-                              gen_frame_mem (SImode, x));
-              if (regno == A0_REG && sibcall_p)
-                emit_use (reg);
-            }
-        }
+        if (xtensa_call_save_reg (regno))
+          {
+            rtx x = gen_rtx_PLUS (Pmode, stack_pointer_rtx, GEN_INT (offset));
+            rtx reg;
+
+            offset -= UNITS_PER_WORD;
+            emit_move_insn (reg = gen_rtx_REG (SImode, regno),
+                            gen_frame_mem (SImode, x));
+            if (regno == A0_REG && sibcall_p)
+              emit_use (reg);
+            stack_pointer_used = true;
+          }
 
       if (cfun->machine->current_frame_size > 0)
         {
@@ -3384,31 +3379,42 @@ xtensa_expand_epilogue (bool sibcall_p)
               else
                 offset = cfun->machine->callee_save_size;
               if (offset)
-                emit_insn (gen_addsi3 (stack_pointer_rtx,
-                                       stack_pointer_rtx,
-                                       GEN_INT (offset)));
+                xtensa_emit_adjust_stack_ptr (offset, ADJUST_SP_NONE);
             }
           else
-            {
-              if (xtensa_simm8x256 (cfun->machine->current_frame_size))
-                emit_insn (gen_addsi3 (stack_pointer_rtx, stack_pointer_rtx,
-                        GEN_INT (cfun->machine->current_frame_size)));
-              else
-                {
-                  rtx tmp_reg = gen_rtx_REG (Pmode, A9_REG);
-                  emit_move_insn (tmp_reg,
-                                  GEN_INT (cfun->machine->current_frame_size));
-                  emit_insn (gen_addsi3 (stack_pointer_rtx, stack_pointer_rtx,
-                                         tmp_reg));
-                }
-            }
+            xtensa_emit_adjust_stack_ptr (cfun->machine->current_frame_size,
+                                          ADJUST_SP_NONE);
         }
 
       if (crtl->calls_eh_return)
         emit_insn (gen_add3_insn (stack_pointer_rtx,
                                   stack_pointer_rtx,
                                   EH_RETURN_STACKADJ_RTX));
+
+      /* Check if the function body uses the stack pointer.  */
+      for (ref = DF_REG_USE_CHAIN (A1_REG);
+           ref; ref = DF_REF_NEXT_REG (ref))
+        if (DF_REF_CLASS (ref) == DF_REF_REGULAR)
+          {
+            stack_pointer_used = true;
+            break;
+          }
+
+      /* Undo the insns if the stack pointer is only used for its own
+         adjustment.  */
+      if (! (stack_pointer_used
+             || frame_pointer_needed
+             || crtl->calls_eh_return))
+        for (const auto &insn : *cfun->machine->logues_a1_adjusts)
+          {
+            rtx note = find_reg_note (insn, REG_FRAME_RELATED_EXPR, NULL);
+
+            if (note)
+              remove_note (insn, note);
+            PATTERN (insn) = const0_rtx;
+          }
     }
+
   cfun->machine->epilogue_done = true;
   if (!sibcall_p)
     emit_jump_insn (gen_return ());
-- 
2.20.1
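
For readers who want to see the record-then-cancel idea outside of GCC, here
is a minimal standalone sketch.  It is only a toy model and not the code in
this patch: ToyInsn, ToyFunction and build_example are hypothetical names,
insns are plain strings, and the "uses the stack pointer" bit is set by hand
instead of being read from the DF use chain.  It also leaves out the
frame_pointer_needed and crtl->calls_eh_return conditions, and it marks insns
as deleted rather than replacing their pattern with const0_rtx.

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <string>
#include <vector>

// Toy stand-in for an insn; hypothetical, not GCC's rtx_insn.
struct ToyInsn {
  std::string text;
  bool uses_sp;   // true if the insn reads or writes the stack pointer
  bool deleted;   // true once the insn has been cancelled
};

// Toy stand-in for a function's insn stream plus the recorded adjustments.
struct ToyFunction {
  std::vector<ToyInsn> insns;
  std::set<std::size_t> sp_adjusts;  // indices of pro/epilogue sp adjustments

  // Emit an ordinary insn and return its index.
  std::size_t emit(const std::string &text, bool uses_sp) {
    insns.push_back(ToyInsn{text, uses_sp, false});
    return insns.size() - 1;
  }

  // Emit a stack pointer adjustment and record it for possible cancellation.
  std::size_t emit_sp_adjust(const std::string &text) {
    std::size_t idx = emit(text, true);
    sp_adjusts.insert(idx);
    return idx;
  }

  // Cancel the recorded adjustments only if no other insn touches the
  // stack pointer.  Returns true if the adjustments were cancelled.
  bool cancel_unused_sp_adjusts() {
    for (std::size_t i = 0; i < insns.size(); ++i)
      if (insns[i].uses_sp && sp_adjusts.count(i) == 0)
        return false;
    for (std::size_t idx : sp_adjusts)
      insns[idx].deleted = true;
    return true;
  }

  // Text of the insns that survive, in emission order.
  std::vector<std::string> live_text() const {
    std::vector<std::string> out;
    for (const ToyInsn &insn : insns)
      if (!insn.deleted)
        out.push_back(insn.text);
    return out;
  }
};

// The "before" sequence from the commit message, as toy insns.
ToyFunction build_example() {
  ToyFunction f;
  f.emit_sp_adjust("addi sp, sp, -16");
  f.emit("movi.n a2, 0", false);
  f.emit("movi.n a3, 1", false);
  f.emit_sp_adjust("addi sp, sp, 16");
  f.emit("j.l bar, a9", false);
  return f;
}
```

Calling cancel_unused_sp_adjusts() on build_example() leaves only the two
movi.n insns and the jump, mirroring the "after" listing.  If any other insn
that touches the stack pointer is emitted first (for example a callee-save
store), the adjustments are kept.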