From mboxrd@z Thu Jan 1 00:00:00 1970
Received: by sourceware.org (Postfix, from userid 7877)
	id F086F3858C2C; Fri, 8 Jul 2022 03:33:34 +0000 (GMT)
DKIM-Filter: OpenDKIM Filter v2.11.0 sourceware.org F086F3858C2C
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
Content-Type: text/plain; charset="utf-8"
From: LuluCheng
To: gcc-cvs@gcc.gnu.org
Subject: [gcc r12-8558] LoongArch: Modify fp_sp_offset and gp_sp_offset's calculation method when frame->mask or frame->fmas
X-Act-Checkin: gcc
X-Git-Author: Lulu Cheng
X-Git-Refname: refs/heads/releases/gcc-12
X-Git-Oldrev: e02edb338f530ba86ad944327f540e52bb709959
X-Git-Newrev: e623829c18ec2949f8b43a5a13775659e0cd1cbf
Message-Id: <20220708033334.F086F3858C2C@sourceware.org>
Date: Fri, 8 Jul 2022 03:33:34 +0000 (GMT)
X-BeenThere: gcc-cvs@gcc.gnu.org
X-Mailman-Version: 2.1.29
Precedence: list
List-Id: Gcc-cvs mailing list
X-List-Received-Date: Fri, 08 Jul 2022 03:33:35 -0000

https://gcc.gnu.org/g:e623829c18ec2949f8b43a5a13775659e0cd1cbf

commit r12-8558-ge623829c18ec2949f8b43a5a13775659e0cd1cbf
Author: Lulu Cheng
Date:   Thu Jul 7 18:07:28 2022 +0800

    LoongArch: Modify fp_sp_offset and gp_sp_offset's calculation method when frame->mask or frame->fmask is zero.

    On LoongArch, when the stack must be dropped too far for a single
    adjustment, the drop is split into two steps:
    step1: Drop the stack by a first amount, then save the callee-saved
           registers on the stack.
    step2: Drop the stack by the remainder.

    The stack drop is optimized when frame->total_size minus
    frame->fp_sp_offset is an integer multiple of 4096, which reduces the
    number of instructions required to drop the stack. However, the
    original calculation of fp_sp_offset and gp_sp_offset keeps this
    optimization from taking effect.

    Consider the following case:

      int main()
      {
        char buf[1024 * 12];
        printf ("%p\n", buf);
        return 0;
      }

    As the generated assembly shows, the old GCC emits two more
    instructions than the new GCC: lines 14 and 24 of the old listing.

      new                                 old
     10 main:                           | 11 main:
     11   addi.d   $r3,$r3,-16          | 12   lu12i.w  $r13,-12288>>12
     12   lu12i.w  $r13,-12288>>12      | 13   addi.d   $r3,$r3,-2032
     13   lu12i.w  $r5,-12288>>12       | 14   ori      $r13,$r13,2016
     14   lu12i.w  $r12,12288>>12       | 15   lu12i.w  $r5,-12288>>12
     15   st.d     $r1,$r3,8            | 16   lu12i.w  $r12,12288>>12
     16   add.d    $r12,$r12,$r5        | 17   st.d     $r1,$r3,2024
     17   add.d    $r3,$r3,$r13         | 18   add.d    $r12,$r12,$r5
     18   add.d    $r5,$r12,$r3         | 19   add.d    $r3,$r3,$r13
     19   la.local $r4,.LC0             | 20   add.d    $r5,$r12,$r3
     20   bl       %plt(printf)         | 21   la.local $r4,.LC0
     21   lu12i.w  $r13,12288>>12       | 22   bl       %plt(printf)
     22   add.d    $r3,$r3,$r13         | 23   lu12i.w  $r13,8192>>12
     23   ld.d     $r1,$r3,8            | 24   ori      $r13,$r13,2080
     24   or       $r4,$r0,$r0          | 25   add.d    $r3,$r3,$r13
     25   addi.d   $r3,$r3,16           | 26   ld.d     $r1,$r3,2024
     26   jr       $r1                  | 27   or       $r4,$r0,$r0
                                        | 28   addi.d   $r3,$r3,2032
                                        | 29   jr       $r1

    gcc/ChangeLog:

            * config/loongarch/loongarch.cc (loongarch_compute_frame_info):
            Modify fp_sp_offset and gp_sp_offset's calculation method,
            when frame->mask or frame->fmask is zero, don't minus
            UNITS_PER_WORD or UNITS_PER_FP_REG.

    gcc/testsuite/ChangeLog:

            * gcc.target/loongarch/prolog-opt.c: New test.

    (cherry picked from commit aa8fd7f65683ef9c3b6d2e9306bea2f28b5cadf7)

Diff:
---
 gcc/config/loongarch/loongarch.cc               | 12 +++++++++---
 gcc/testsuite/gcc.target/loongarch/prolog-opt.c | 15 +++++++++++++++
 2 files changed, 24 insertions(+), 3 deletions(-)

diff --git a/gcc/config/loongarch/loongarch.cc b/gcc/config/loongarch/loongarch.cc
index d72b256df51..5c9a33c14f7 100644
--- a/gcc/config/loongarch/loongarch.cc
+++ b/gcc/config/loongarch/loongarch.cc
@@ -917,8 +917,12 @@ loongarch_compute_frame_info (void)
   frame->frame_pointer_offset = offset;
   /* Next are the callee-saved FPRs. */
   if (frame->fmask)
-    offset += LARCH_STACK_ALIGN (num_f_saved * UNITS_PER_FP_REG);
-  frame->fp_sp_offset = offset - UNITS_PER_FP_REG;
+    {
+      offset += LARCH_STACK_ALIGN (num_f_saved * UNITS_PER_FP_REG);
+      frame->fp_sp_offset = offset - UNITS_PER_FP_REG;
+    }
+  else
+    frame->fp_sp_offset = offset;
   /* Next are the callee-saved GPRs. */
   if (frame->mask)
     {
@@ -931,8 +935,10 @@ loongarch_compute_frame_info (void)
         frame->save_libcall_adjustment = x_save_size;
 
       offset += x_save_size;
+      frame->gp_sp_offset = offset - UNITS_PER_WORD;
     }
-  frame->gp_sp_offset = offset - UNITS_PER_WORD;
+  else
+    frame->gp_sp_offset = offset;
   /* The hard frame pointer points above the callee-saved GPRs. */
   frame->hard_frame_pointer_offset = offset;
   /* Above the hard frame pointer is the callee-allocated varags save area. */
diff --git a/gcc/testsuite/gcc.target/loongarch/prolog-opt.c b/gcc/testsuite/gcc.target/loongarch/prolog-opt.c
new file mode 100644
index 00000000000..0470a1f1eee
--- /dev/null
+++ b/gcc/testsuite/gcc.target/loongarch/prolog-opt.c
@@ -0,0 +1,15 @@
+/* Test that LoongArch backend stack drop operation optimized. */
+
+/* { dg-do compile } */
+/* { dg-options "-O2 -mabi=lp64d" } */
+/* { dg-final { scan-assembler "addi.d\t\\\$r3,\\\$r3,-16" } } */
+
+extern int printf (char *, ...);
+
+int main()
+{
+  char buf[1024 * 12];
+  printf ("%p\n", buf);
+  return 0;
+}
+
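
For readers who want to follow the arithmetic behind the two prologues, here is a rough model of how the size of the first stack drop could be chosen. It is a sketch under stated assumptions, not the code in loongarch.cc: the helper names (align_up, fits_imm12, first_stack_step), the 16-byte stack alignment, and the 4096-byte reach of a signed 12-bit immediate are all assumed for illustration. The total frame size of 12304 bytes is read off the assembly in the commit message; the fp_sp_offset values of 12280 (before the patch) and 12288 (after) are assumed from the patched calculation with no callee-saved FPRs.

```c
#include <assert.h>

/* Assumed constants for this sketch (not copied from GCC).  */
#define IMM_REACH   4096L  /* Span covered by a signed 12-bit immediate.  */
#define STACK_ALIGN 16L    /* Assumed stack alignment in bytes.  */

/* Round X up to the assumed stack alignment.  */
static long
align_up (long x)
{
  return (x + STACK_ALIGN - 1) & -STACK_ALIGN;
}

/* Return nonzero if X fits in a signed 12-bit immediate.  */
static int
fits_imm12 (long x)
{
  return x >= -2048 && x <= 2047;
}

/* Hypothetical model: pick the size of the first stack drop.  The first
   drop must at least cover the register save area, i.e. everything
   above fp_sp_offset.  */
long
first_stack_step (long total_size, long fp_sp_offset)
{
  /* A small frame is dropped in one step.  */
  if (fits_imm12 (total_size))
    return total_size;

  long min_first_step = align_up (total_size - fp_sp_offset);
  long max_first_step = IMM_REACH / 2 - STACK_ALIGN;
  long min_second_step = total_size - max_first_step;
  assert (min_first_step <= max_first_step);

  /* Optimized case: drop only the low bits first, so the second drop
     is a multiple of 4096 and needs no extra ori.  */
  if (!fits_imm12 (min_second_step)
      && total_size % IMM_REACH < IMM_REACH / 2
      && total_size % IMM_REACH >= min_first_step)
    return total_size % IMM_REACH;

  return max_first_step;
}
```

Under these assumptions, first_stack_step (12304, 12280) gives 2032 and first_stack_step (12304, 12288) gives 16, which matches the addi.d immediates in the old and new listings. With the old offset, the save area rounds up to 32 bytes, which exceeds the 16 low-order bytes of the frame size, so the optimized case is rejected.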