From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (qmail 6624 invoked by alias); 16 Oct 2015 20:21:45 -0000 Mailing-List: contact gcc-patches-help@gcc.gnu.org; run by ezmlm Precedence: bulk List-Id: List-Archive: List-Post: List-Help: Sender: gcc-patches-owner@gcc.gnu.org Received: (qmail 6614 invoked by uid 89); 16 Oct 2015 20:21:45 -0000 Authentication-Results: sourceware.org; auth=none X-Virus-Found: No X-Spam-SWARE-Status: No, score=-2.4 required=5.0 tests=AWL,BAYES_00,RCVD_IN_DNSWL_LOW,SPF_PASS,T_RP_MATCHES_RCVD autolearn=ham version=3.3.2 X-HELO: mailapp01.imgtec.com Received: from mailapp01.imgtec.com (HELO mailapp01.imgtec.com) (195.59.15.196) by sourceware.org (qpsmtpd/0.93/v0.84-503-g423c35a) with ESMTP; Fri, 16 Oct 2015 20:21:43 +0000 Received: from KLMAIL01.kl.imgtec.org (unknown [192.168.5.35]) by Websense Email Security Gateway with ESMTPS id 21EFDFDEFDBD9; Fri, 16 Oct 2015 21:21:35 +0100 (IST) Received: from hhmail02.hh.imgtec.org (10.100.10.20) by KLMAIL01.kl.imgtec.org (192.168.5.35) with Microsoft SMTP Server (TLS) id 14.3.195.1; Fri, 16 Oct 2015 21:21:38 +0100 Received: from BAMAIL02.ba.imgtec.org (10.20.40.28) by hhmail02.hh.imgtec.org (10.100.10.20) with Microsoft SMTP Server (TLS) id 14.3.235.1; Fri, 16 Oct 2015 21:21:38 +0100 Received: from ubuntu-sellcey.mips.com (10.20.3.58) by bamail02.ba.imgtec.org (10.20.40.28) with Microsoft SMTP Server id 14.3.174.1; Fri, 16 Oct 2015 13:21:33 -0700 Received: by ubuntu-sellcey.mips.com (sSMTP sendmail emulation); Fri, 16 Oct 2015 13:21:33 -0700 From: "Steve Ellcey " Date: Fri, 16 Oct 2015 21:08:00 -0000 To: CC: , Subject: [Patch, MIPS] Frame header optimization for MIPS (part 2) User-Agent: Heirloom mailx 12.5 6/20/10 MIME-Version: 1.0 Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: 7bit Message-ID: X-SW-Source: 2015-10/txt/msg01647.txt.bz2 Here is the second part of the MIPS frame header optimization patch. The first part avoided allocating a frame header if it knew that none of the functions that it called would need it. This part uses a frame header to save callee saved registers if doing so will allow the function to avoid having to allocate stack space (thus avoiding the need to increment and decrement the stack pointer at all). I did a little reorganization of my first patch as part of this, seperating the is_leaf_function check from needs_frame_header_p and making it a seperate check from callees_functions_use_frame_header. This allowed me to reuse the needs_frame_header_p function and the does_not_use_frame_header field that stores its value in this new patch. The optimimization may be more conservative than it has to be in some cases like functions with floating point arguments but it catches the most important cases and we can extend it later if needed. Tested with no regressions using the mips-mti-linux-gnu toolchain. OK to checkin? Steve Ellcey sellcey@imgtec.com 2015-10-16 Steve Ellcey * frame-header-opt.c (gate): Check for optimize > 0. (has_inlined_assembly): New function. (needs_frame_header_p): Remove is_leaf_function check, add argument type check. (callees_functions_use_frame_header): Add is_leaf_function and has_inlined_assembly calls.. (set_callers_may_not_allocate_frame): New function. (frame_header_opt): Add is_leaf_function call, add set_callers_may_not_allocate_frame call. * config/mips/mips.c (mips_compute_frame_info): Add check to see if callee saved regs can be put in frame header. (mips_expand_prologue): Add check to see if step1 is zero, fix cfa restores when using frame header to store regs. (mips_can_use_return_insn): Check to see if registers are stored in frame header. * config/mips/mips.h (machine_function): Add callers_may_not_allocate_frame and use_frame_header_for_callee_saved_regs fields. diff --git a/gcc/config/mips/frame-header-opt.c b/gcc/config/mips/frame-header-opt.c index 7c7b1f2..5512838 100644 --- a/gcc/config/mips/frame-header-opt.c +++ b/gcc/config/mips/frame-header-opt.c @@ -79,7 +79,7 @@ public: /* This optimization has no affect if TARGET_NEWABI. If optimize is not at least 1 then the data needed for the optimization is not available and nothing will be done anyway. */ - return TARGET_OLDABI && flag_frame_header_optimization; + return TARGET_OLDABI && flag_frame_header_optimization && (optimize > 0); } virtual unsigned int execute (function *) { return frame_header_opt (); } @@ -125,6 +125,29 @@ is_leaf_function (function *fn) return true; } +/* Return true if this function has inline assembly code or if we cannot + be certain that it does not. False if know that there is no inline + assembly. */ + +static bool +has_inlined_assembly (function *fn) +{ + basic_block bb; + gimple_stmt_iterator gsi; + + /* If we do not have a cfg for this function be conservative and assume + it is may have inline assembly. */ + if (fn->cfg == NULL) + return true; + + FOR_EACH_BB_FN (bb, fn) + for (gsi = gsi_start_bb (bb); !gsi_end_p (gsi); gsi_next (&gsi)) + if (gimple_code (gsi_stmt (gsi)) == GIMPLE_ASM) + return true; + + return false; +} + /* Return true if this function will use the stack space allocated by its caller or if we cannot determine for certain that it does not. */ @@ -136,20 +159,26 @@ needs_frame_header_p (function *fn) if (fn->decl == NULL) return true; - if (fn->stdarg || !is_leaf_function (fn)) + if (fn->stdarg) return true; for (t = DECL_ARGUMENTS (fn->decl); t; t = TREE_CHAIN (t)) { if (!use_register_for_decl (t)) - return true; + return true; + + /* Some 64 bit types may get copied to general registers using the frame + header, see mips_output_64bit_xfer. Checking for SImode only may be + overly restrictive but it is gauranteed to be safe. */ + if (DECL_MODE (t) != SImode) + return true; } return false; } -/* Returns TRUE if the argument stack space allocated by function FN is used. - Returns FALSE if the space is needed or if the need for the space cannot +/* Return true if the argument stack space allocated by function FN is used. + Return false if the space is needed or if the need for the space cannot be determined. */ static bool @@ -177,6 +206,8 @@ callees_functions_use_frame_header (function *fn) called_fn = DECL_STRUCT_FUNCTION (called_fn_tree); if (called_fn == NULL || DECL_WEAK (called_fn_tree) + || has_inlined_assembly (called_fn) + || !is_leaf_function (called_fn) || !called_fn->machine->does_not_use_frame_header) return true; } @@ -188,6 +219,41 @@ callees_functions_use_frame_header (function *fn) return false; } +/* Set the callers_may_not_allocate_frame flag for any function which + function FN calls because FN may not allocate a frame header. */ + +static void +set_callers_may_not_allocate_frame (function *fn) +{ + basic_block bb; + gimple_stmt_iterator gsi; + gimple *stmt; + tree called_fn_tree; + function *called_fn; + + if (fn->cfg == NULL) + return; + + FOR_EACH_BB_FN (bb, fn) + { + for (gsi = gsi_start_bb (bb); !gsi_end_p (gsi); gsi_next (&gsi)) + { + stmt = gsi_stmt (gsi); + if (is_gimple_call (stmt)) + { + called_fn_tree = gimple_call_fndecl (stmt); + if (called_fn_tree != NULL) + { + called_fn = DECL_STRUCT_FUNCTION (called_fn_tree); + if (called_fn != NULL) + called_fn->machine->callers_may_not_allocate_frame = true; + } + } + } + } + return; +} + /* Scan each function to determine those that need its frame headers. Perform a second scan to determine if the allocation can be skipped because none of their callees require the frame header. */ @@ -209,8 +275,17 @@ frame_header_opt () { fn = node->get_fun (); if (fn != NULL) - fn->machine->optimize_call_stack - = !callees_functions_use_frame_header (fn); + fn->machine->optimize_call_stack + = (!callees_functions_use_frame_header (fn)) + && (!is_leaf_function (fn)); } + + FOR_EACH_DEFINED_FUNCTION (node) + { + fn = node->get_fun (); + if ((fn != NULL) && fn->machine->optimize_call_stack) + set_callers_may_not_allocate_frame (fn); + } + return 0; } diff --git a/gcc/config/mips/mips.c b/gcc/config/mips/mips.c index 521b587..0367d19 100644 --- a/gcc/config/mips/mips.c +++ b/gcc/config/mips/mips.c @@ -10481,6 +10481,35 @@ mips_compute_frame_info (void) frame->cop0_sp_offset = offset - UNITS_PER_WORD; } + /* Determine if we can save the callee saved registers in the frame + header. Restrict this to functions where there is no other reason + to allocate stack space so that we can completely eliminate the + instructions that modify the stack pointer. */ + + if (TARGET_OLDABI + && (optimize > 0) + && flag_frame_header_optimization + && !MAIN_NAME_P (DECL_NAME (current_function_decl)) + && (cfun->machine->varargs_size == 0) + && (crtl->args.pretend_args_size == 0) + && (frame->var_size == 0) + && (frame->num_acc == 0) + && (frame->num_cop0_regs == 0) + && (frame->num_fp == 0) + && (frame->num_gp > 0) + && (frame->num_gp <= MAX_ARGS_IN_REGISTERS) + && (!GENERATE_MIPS16E_SAVE_RESTORE) + && (!cfun->machine->interrupt_handler_p) + && (cfun->machine->does_not_use_frame_header) + && (cfun->machine->optimize_call_stack) + && (!cfun->machine->callers_may_not_allocate_frame) + && !mips_cfun_has_cprestore_slot_p ()) + { + offset = 0; + frame->gp_sp_offset = REG_PARM_STACK_SPACE(cfun) - UNITS_PER_WORD; + cfun->machine->use_frame_header_for_callee_saved_regs = true; + } + /* Move above the callee-allocated varargs save area. */ offset += MIPS_STACK_ALIGN (cfun->machine->varargs_size); frame->arg_pointer_offset = offset; @@ -11599,12 +11628,15 @@ mips_expand_prologue (void) } else { - rtx insn = gen_add3_insn (stack_pointer_rtx, - stack_pointer_rtx, - GEN_INT (-step1)); - RTX_FRAME_RELATED_P (emit_insn (insn)) = 1; - mips_frame_barrier (); - size -= step1; + if (step1 != 0) + { + rtx insn = gen_add3_insn (stack_pointer_rtx, + stack_pointer_rtx, + GEN_INT (-step1)); + RTX_FRAME_RELATED_P (emit_insn (insn)) = 1; + mips_frame_barrier (); + size -= step1; + } } mips_for_each_saved_acc (size, mips_save_reg); mips_for_each_saved_gpr_and_fpr (size, mips_save_reg); @@ -11729,9 +11761,9 @@ mips_epilogue_emit_cfa_restores (void) rtx_insn *insn; insn = get_last_insn (); - gcc_assert (insn && !REG_NOTES (insn)); if (mips_epilogue.cfa_restores) { + gcc_assert (insn && !REG_NOTES (insn)); RTX_FRAME_RELATED_P (insn) = 1; REG_NOTES (insn) = mips_epilogue.cfa_restores; mips_epilogue.cfa_restores = 0; @@ -11982,7 +12014,9 @@ mips_expand_epilogue (bool sibcall_p) mips_deallocate_stack (stack_pointer_rtx, GEN_INT (step2), 0); } - if (!use_jraddiusp_p) + if (cfun->machine->use_frame_header_for_callee_saved_regs) + mips_epilogue_emit_cfa_restores (); + else if (!use_jraddiusp_p) gcc_assert (!mips_epilogue.cfa_restores); /* Add in the __builtin_eh_return stack adjustment. We need to @@ -12084,7 +12118,8 @@ mips_can_use_return_insn (void) if (mips16_cfun_returns_in_fpr_p ()) return false; - return cfun->machine->frame.total_size == 0; + return ((cfun->machine->frame.total_size == 0) + && !cfun->machine->use_frame_header_for_callee_saved_regs); } /* Return true if register REGNO can store a value of mode MODE. diff --git a/gcc/config/mips/mips.h b/gcc/config/mips/mips.h index 501d283..7a4a0ba 100644 --- a/gcc/config/mips/mips.h +++ b/gcc/config/mips/mips.h @@ -3273,6 +3273,13 @@ struct GTY(()) machine_function { /* True if none of the functions that are called by this function need stack space allocated for their arguments. */ bool optimize_call_stack; + + /* True if one of the functions calling this function may not allocate + a frame header. */ + bool callers_may_not_allocate_frame; + + /* True if GCC stored callee saved registers in the frame header. */ + bool use_frame_header_for_callee_saved_regs; }; #endif 2015-10-16 Steve Ellcey * gcc.target/mips/frame-header-4.c: New test. diff --git a/gcc/testsuite/gcc.target/mips/frame-header-4.c b/gcc/testsuite/gcc.target/mips/frame-header-4.c index e69de29..3cddba1 100644 --- a/gcc/testsuite/gcc.target/mips/frame-header-4.c +++ b/gcc/testsuite/gcc.target/mips/frame-header-4.c @@ -0,0 +1,20 @@ +/* Verify that we can optimize away the frame header allocation in bar + by having it use its frame header to store $31 in before calling foo. */ + +/* { dg-do compile } */ +/* { dg-options "-mframe-header-opt -mabi=32" } */ +/* { dg-skip-if "code quality test" { *-*-* } { "-O0" } { "" } } */ +/* { dg-final { scan-assembler-not "\taddiu\t\\\$sp" } } */ + +int __attribute__ ((noinline)) +foo (int a, int b) +{ + return a + b; +} + +int __attribute__ ((noinline)) +bar (int a, int b) +{ + return 1 + foo(a,b); +} +