From: Takayuki 'January June' Suwa
Date: Sat, 7 Jan 2023 11:55:11 +0900
Subject: [PATCH v2] xtensa: Optimize stack frame adjustment more
To: GCC Patches
Cc: Max Filippov
Message-ID: <19010306-4056-6f84-e555-e744f4f5061e@yahoo.co.jp>

This patch introduces a helper function for adding an integer immediate
to a register, using a scratch register as needed.  The helper splits the
immediate and emits either up to two ADDI/ADDMI machine instructions, or
an integer immediate load followed by an addition by register (the load
may later be transformed by constantsynth).

Using the helper function simplifies the stack frame adjustment logic and
reduces the instruction count in some cases.

gcc/ChangeLog:

	* config/xtensa/xtensa.cc (xtensa_split_imm_two_addends,
	xtensa_emit_add_imm): New helper functions.
	(xtensa_set_return_address, xtensa_output_mi_thunk): Change to use the
	helper function.
	(xtensa_emit_adjust_stack_ptr): Ditto.  And also change to try reusing
	the content of scratch register A9 if the register is not modified in
	the function body.
---
 gcc/config/xtensa/xtensa.cc | 151 +++++++++++++++++++++++++-----------
 1 file changed, 106 insertions(+), 45 deletions(-)

diff --git a/gcc/config/xtensa/xtensa.cc b/gcc/config/xtensa/xtensa.cc
index ae44199bc98..a1f184950ae 100644
--- a/gcc/config/xtensa/xtensa.cc
+++ b/gcc/config/xtensa/xtensa.cc
@@ -104,6 +104,7 @@ struct GTY(()) machine_function
   bool frame_laid_out;
   bool epilogue_done;
   bool inhibit_logues_a1_adjusts;
+  rtx last_logues_a9_content;
 };

 /* Vector, indexed by hard register number, which contains 1 for a
@@ -2518,6 +2519,86 @@ xtensa_split_DI_reg_imm (rtx *operands)
 }


+/* Try to split an integer value into what are suitable for two consecutive
+   immediate addition instructions, ADDI or ADDMI.  */
+
+static bool
+xtensa_split_imm_two_addends (HOST_WIDE_INT imm, HOST_WIDE_INT v[2])
+{
+  HOST_WIDE_INT v0, v1;
+
+  if (imm < -32768)
+    v0 = -32768, v1 = imm + 32768;
+  else if (imm > 32512)
+    v0 = 32512, v1 = imm - 32512;
+  else if (TARGET_DENSITY && xtensa_simm12b (imm))
+    /* A pair of MOVI(.N) and ADD.N is one or two bytes less than two
+       immediate additions if TARGET_DENSITY.  */
+    return false;
+  else
+    v0 = (imm + 128) & ~255L, v1 = imm - v0;
+
+  if (xtensa_simm8 (v1) || xtensa_simm8x256 (v1))
+    {
+      v[0] = v0, v[1] = v1;
+      return true;
+    }
+
+  return false;
+}
+
+
+/* Helper function for integer immediate addition with scratch register
+   as needed, that splits and emits either up to two ADDI/ADDMI machine
+   instructions or an addition by register following an integer immediate
+   load (which may later be transformed by constantsynth).
+
+   If 'scratch' is NULL_RTX but still needed, a new pseudo-register will
+   be allocated.  Thus, after the reload/LRA pass, the specified scratch
+   register must be a hard one.  */
+
+static bool
+xtensa_emit_add_imm (rtx dst, rtx src, HOST_WIDE_INT imm, rtx scratch,
+                     bool need_note)
+{
+  bool retval = false;
+  HOST_WIDE_INT v[2];
+  rtx_insn *insn;
+
+  if (imm == 0)
+    return false;
+
+  if (xtensa_simm8 (imm) || xtensa_simm8x256 (imm))
+    insn = emit_insn (gen_addsi3 (dst, src, GEN_INT (imm)));
+  else if (xtensa_split_imm_two_addends (imm, v))
+    {
+      if (!scratch)
+        scratch = gen_reg_rtx (SImode);
+      emit_insn (gen_addsi3 (scratch, src, GEN_INT (v[0])));
+      insn = emit_insn (gen_addsi3 (dst, scratch, GEN_INT (v[1])));
+    }
+  else
+    {
+      if (scratch)
+        emit_move_insn (scratch, GEN_INT (imm));
+      else
+        scratch = force_reg (SImode, GEN_INT (imm));
+      retval = true;
+      insn = emit_insn (gen_addsi3 (dst, src, scratch));
+    }
+
+  if (need_note)
+    {
+      rtx note_rtx = gen_rtx_SET (dst, plus_constant (Pmode, src, imm));
+
+      RTX_FRAME_RELATED_P (insn) = 1;
+      add_reg_note (insn, REG_FRAME_RELATED_EXPR, note_rtx);
+    }
+
+  return retval;
+}
+
+
 /* Implement TARGET_CANNOT_FORCE_CONST_MEM.  */

 static bool
@@ -3245,41 +3326,33 @@ xtensa_initial_elimination_offset (int from, int to ATTRIBUTE_UNUSED)
 static void
 xtensa_emit_adjust_stack_ptr (HOST_WIDE_INT offset, int flags)
 {
+  rtx src, scratch;
   rtx_insn *insn;
-  rtx ptr = (flags & ADJUST_SP_FRAME_PTR) ? hard_frame_pointer_rtx
-                                          : stack_pointer_rtx;

   if (cfun->machine->inhibit_logues_a1_adjusts)
     return;

-  if (xtensa_simm8 (offset)
-      || xtensa_simm8x256 (offset))
-    insn = emit_insn (gen_addsi3 (stack_pointer_rtx, ptr, GEN_INT (offset)));
-  else
-    {
-      rtx tmp_reg = gen_rtx_REG (Pmode, A9_REG);
+  src = (flags & ADJUST_SP_FRAME_PTR)
+         ? hard_frame_pointer_rtx : stack_pointer_rtx;
+  scratch = gen_rtx_REG (Pmode, A9_REG);

-      if (offset < 0)
-        {
-          emit_move_insn (tmp_reg, GEN_INT (-offset));
-          insn = emit_insn (gen_subsi3 (stack_pointer_rtx, ptr, tmp_reg));
-        }
-      else
-        {
-          emit_move_insn (tmp_reg, GEN_INT (offset));
-          insn = emit_insn (gen_addsi3 (stack_pointer_rtx, ptr, tmp_reg));
-        }
-    }
-
-  if (flags & ADJUST_SP_NEED_NOTE)
+  if (df && DF_REG_DEF_COUNT (A9_REG) == 0
+      && cfun->machine->last_logues_a9_content
+      && -INTVAL (cfun->machine->last_logues_a9_content) == offset)
     {
-      rtx note_rtx = gen_rtx_SET (stack_pointer_rtx,
-                                  plus_constant (Pmode, stack_pointer_rtx,
-                                                 offset));
+      insn = emit_insn (gen_subsi3 (stack_pointer_rtx, src, scratch));
+      if (flags & ADJUST_SP_NEED_NOTE)
+        {
+          rtx note_rtx = gen_rtx_SET (stack_pointer_rtx,
+                                      plus_constant (Pmode, src, offset));

-      RTX_FRAME_RELATED_P (insn) = 1;
-      add_reg_note (insn, REG_FRAME_RELATED_EXPR, note_rtx);
+          RTX_FRAME_RELATED_P (insn) = 1;
+          add_reg_note (insn, REG_FRAME_RELATED_EXPR, note_rtx);
+        }
     }
+  else if (xtensa_emit_add_imm (stack_pointer_rtx, src, offset, scratch,
+                                (flags & ADJUST_SP_NEED_NOTE)))
+    cfun->machine->last_logues_a9_content = GEN_INT (offset);
 }

 /* minimum frame = reg save area (4 words) plus static chain (1 word)
@@ -3307,8 +3380,9 @@ xtensa_expand_prologue (void)
           /* Use a8 as a temporary since a0-a7 may be live.  */
           rtx tmp_reg = gen_rtx_REG (Pmode, A8_REG);
           emit_insn (gen_entry (GEN_INT (MIN_FRAME_SIZE)));
-          emit_move_insn (tmp_reg, GEN_INT (total_size - MIN_FRAME_SIZE));
-          emit_insn (gen_subsi3 (tmp_reg, stack_pointer_rtx, tmp_reg));
+          xtensa_emit_add_imm (tmp_reg, stack_pointer_rtx,
+                               MIN_FRAME_SIZE - total_size,
+                               tmp_reg, false);
           insn = emit_insn (gen_movsi (stack_pointer_rtx, tmp_reg));
         }
     }
@@ -3540,8 +3614,8 @@ xtensa_set_return_address (rtx address, rtx scratch)

   if (total_size > 1024)
     {
-      emit_move_insn (scratch, GEN_INT (total_size - UNITS_PER_WORD));
-      emit_insn (gen_addsi3 (scratch, frame, scratch));
+      xtensa_emit_add_imm (scratch, frame, total_size - UNITS_PER_WORD,
+                           scratch, false);
       a0_addr = scratch;
     }

@@ -5101,15 +5175,7 @@ xtensa_output_mi_thunk (FILE *file, tree thunk ATTRIBUTE_UNUSED,
   this_rtx = gen_rtx_REG (Pmode, A0_REG + this_reg_no);

   if (delta)
-    {
-      if (xtensa_simm8 (delta))
-        emit_insn (gen_addsi3 (this_rtx, this_rtx, GEN_INT (delta)));
-      else
-        {
-          emit_move_insn (temp0, GEN_INT (delta));
-          emit_insn (gen_addsi3 (this_rtx, this_rtx, temp0));
-        }
-    }
+    xtensa_emit_add_imm (this_rtx, this_rtx, delta, temp0, false);

   if (vcall_offset)
     {
@@ -5119,13 +5185,8 @@ xtensa_output_mi_thunk (FILE *file, tree thunk ATTRIBUTE_UNUSED,
       emit_move_insn (temp0, gen_rtx_MEM (Pmode, this_rtx));
       if (xtensa_uimm8x4 (vcall_offset))
         addr = plus_constant (Pmode, temp0, vcall_offset);
-      else if (xtensa_simm8 (vcall_offset))
-        emit_insn (gen_addsi3 (temp1, temp0, GEN_INT (vcall_offset)));
       else
-        {
-          emit_move_insn (temp1, GEN_INT (vcall_offset));
-          emit_insn (gen_addsi3 (temp1, temp0, temp1));
-        }
+        xtensa_emit_add_imm (temp1, temp0, vcall_offset, temp1, false);
       emit_move_insn (temp1, gen_rtx_MEM (Pmode, addr));
       emit_insn (gen_add2_insn (this_rtx, temp1));
     }
-- 
2.30.2
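
Illustration only, not part of the patch: to see which offsets the
two-addend split in xtensa_split_imm_two_addends accepts, the logic can be
tried outside GCC with a small standalone sketch.  The names simm8,
simm8x256, simm12b and split_imm_two_addends below are hypothetical
stand-ins, the "density" flag stands in for TARGET_DENSITY, and the
predicate ranges are assumptions based on the ADDI, ADDMI and MOVI
immediate encodings rather than copies of the GCC predicates.

```c
#include <stdbool.h>

/* Assumed range of an ADDI immediate: signed 8-bit.  */
static bool simm8 (long v)
{
  return v >= -128 && v <= 127;
}

/* Assumed range of an ADDMI immediate: signed 8-bit value times 256.  */
static bool simm8x256 (long v)
{
  return (v & 255) == 0 && v >= -32768 && v <= 32512;
}

/* Assumed range of a MOVI immediate: signed 12-bit.  */
static bool simm12b (long v)
{
  return v >= -2048 && v <= 2047;
}

/* Mirror of the split logic in the patch: pick an ADDMI-sized first
   addend v[0] and check that the remainder v[1] fits ADDI or ADDMI.
   Returns false when no two-instruction split is used.  */
static bool split_imm_two_addends (long imm, bool density, long v[2])
{
  long v0, v1;

  if (imm < -32768)
    v0 = -32768, v1 = imm + 32768;
  else if (imm > 32512)
    v0 = 32512, v1 = imm - 32512;
  else if (density && simm12b (imm))
    return false;               /* MOVI(.N) + ADD.N is shorter here.  */
  else
    v0 = (imm + 128) & ~255L, v1 = imm - v0;  /* Round to nearest 256.  */

  if (simm8 (v1) || simm8x256 (v1))
    {
      v[0] = v0;
      v[1] = v1;
      return true;
    }

  return false;
}
```

Under these assumptions, an offset of 1000 without the density option
splits into 1024 and -24, while 40000 is rejected because the remainder
7488 fits neither form, so the helper would fall back to loading the
immediate into the scratch register.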