Vladimir,

this patch adds analysis of register usage of functions, for use by IRA.

The patch:
- adds analysis in pass_final to track which hard registers are set or
  clobbered by the function body, and stores that information in a
  struct cgraph_node.
- adds a target hook fn_other_hard_reg_usage to list hard registers that
  are set or clobbered by a call to a function, but are not listed as
  such in the function body, e.g. registers clobbered by veneers
  inserted by the linker.
- adds a reg-note REG_CALL_DECL, to be able to easily link call_insns to
  their corresponding declaration, even after the calls may have been
  split into an insn (set register to function address) and a call_insn
  (call register), which can happen for e.g. sh, and mips with
  -mabicalls.
- uses the register analysis in IRA.
- adds an option -fuse-caller-save to control the optimization, on by
  default at -Os and at -O2 and higher.

The patch (original version by Radovan Obradovic) is similar to your
patch ( http://gcc.gnu.org/ml/gcc-patches/2007-01/msg01625.html ) from
2007, but this patch doesn't implement save area stack slot sharing.
( Btw, I've borrowed the struct cgraph_node field name and comment from
the 2007 patch ).

[ Steven, you mentioned in this discussion
  ( http://gcc.gnu.org/ml/gcc/2012-10/msg00213.html ) that you are
  working on porting the 2007 patch to trunk. What is the status of that
  effort? ]

As an example of the functionality, consider foo and bar from test-case
aru-1.c:
...
static int __attribute__((noinline))
bar (int x)
{
  return x + 3;
}

int __attribute__((noinline))
foo (int y)
{
  return y + bar (y);
}
...

Compiled at -O2, bar only sets register $2 (the first return register):
...
bar:
        .frame  $sp,0,$31       # vars= 0, regs= 0/0, args= 0, gp= 0
        .mask   0x00000000,0
        .fmask  0x00000000,0
        .set    noreorder
        .set    nomacro
        j       $31
        addiu   $2,$4,3
...
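To make the idea behind the analysis concrete, here is a minimal sketch in C. It is a simplified model, not code from the patch: the type reg_set, the struct fn_reg_info, and the functions collect_used_regs, call_clobbered_regs and reg_survives_call_to_bar are hypothetical names chosen for illustration. The caller-saved register mask is an assumed, simplified approximation of the mips convention, and intersecting the callee's set with that mask is a modelling choice here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for GCC's HARD_REG_SET: one bit per hard register.  */
typedef uint32_t reg_set;
#define REG_BIT(n) ((reg_set) 1u << (n))

/* Hypothetical per-function record, loosely modelled on the fields the
   patch adds to struct cgraph_node.  */
struct fn_reg_info
{
  reg_set used_regs;     /* Registers set or clobbered by the body.  */
  bool used_regs_valid;  /* False if the body has not been analyzed.  */
};

/* Union the registers written by each insn of a body with any extra
   registers reported by the target (e.g. for linker veneers).  */
static reg_set
collect_used_regs (const reg_set *insn_sets, size_t n_insns,
                   reg_set target_extra)
{
  reg_set used = target_extra;
  for (size_t i = 0; i < n_insns; i++)
    used |= insn_sets[i];
  return used;
}

/* Registers a call must be assumed to clobber: the callee's analyzed set
   restricted to the ABI's call-clobbered set if the callee is known,
   otherwise the whole ABI set.  */
static reg_set
call_clobbered_regs (const struct fn_reg_info *callee, reg_set abi_clobbered)
{
  if (callee != NULL && callee->used_regs_valid)
    return callee->used_regs & abi_clobbered;
  return abi_clobbered;
}

/* Example mirroring bar above, whose only register write is to $2.  */
static bool
reg_survives_call_to_bar (unsigned regno)
{
  /* Assumed, simplified caller-saved set: $2..$15, $24, $25, $31.  */
  reg_set abi_clobbered = REG_BIT (24) | REG_BIT (25) | REG_BIT (31);
  for (unsigned r = 2; r <= 15; r++)
    abi_clobbered |= REG_BIT (r);

  reg_set bar_insns[] = { REG_BIT (2) };  /* addiu $2,$4,3 */
  struct fn_reg_info bar = { collect_used_regs (bar_insns, 1, 0), true };
  assert (bar.used_regs == REG_BIT (2));

  return (call_clobbered_regs (&bar, abi_clobbered) & REG_BIT (regno)) == 0;
}
```

In this model, reg_survives_call_to_bar (3) is true and reg_survives_call_to_bar (2) is false: since bar writes only $2, a value kept in $3 is still intact after the call. When the callee is unknown or not yet analyzed, the model falls back to the full caller-saved set.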
foo can then use register $3 (the second return register) instead of
register $16 to save the value in register $4 (the first argument
register) over the call, as demonstrated here in a
-fno-use-caller-save vs. -fuse-caller-save diff:
...
foo:                                       foo:
  # vars= 0, regs= 2/0, args= 16, gp= 8  |   # vars= 0, regs= 1/0, args= 16, gp= 8
  .frame  $sp,32,$31                         .frame  $sp,32,$31
  .mask   0x80010000,-4                  |   .mask   0x80000000,-4
  .fmask  0x00000000,0                       .fmask  0x00000000,0
  .set    noreorder                          .set    noreorder
  .set    nomacro                            .set    nomacro
  addiu   $sp,$sp,-32                        addiu   $sp,$sp,-32
  sw      $31,28($sp)                        sw      $31,28($sp)
  sw      $16,24($sp)                    <
  .option pic0                               .option pic0
  jal     bar                                jal     bar
  .option pic2                               .option pic2
  move    $16,$4                         |   move    $3,$4
  lw      $31,28($sp)                        lw      $31,28($sp)
  addu    $2,$2,$16                      |   addu    $2,$2,$3
  lw      $16,24($sp)                    <
  j       $31                                j       $31
  addiu   $sp,$sp,32                         addiu   $sp,$sp,32
...
That way we skip the save and restore of register $16, which is
callee-saved; no save and restore is necessary for $3. Btw, a further
improvement could be to reuse $4 after the call, and eliminate the move.

A version of this patch on top of 4.6 ran into trouble with the epilogue
on arm, where a register was clobbered by a stack pop instruction, while
that was not visible in the rtl representation. This instruction was
introduced in arm_output_epilogue by code marked with the comment 'pop
call clobbered registers if it avoids a separate stack adjustment'.
I cannot reproduce that issue on trunk. Looking at the generated rtl, it
seems that the epilogue instructions now list all registers they set, so
collect_fn_hard_reg_usage is able to analyze all clobbered registers.

Bootstrapped and reg-tested on x86_64, including Ada. Built and
reg-tested on mips, arm, ppc and sh. No issues found.

OK for stage1 trunk?

Thanks,
- Tom

2013-01-24  Radovan Obradovic
            Tom de Vries

        * hooks.c (hook_void_hard_reg_set_containerp): New function.
        * hooks.h (hook_void_hard_reg_set_containerp): Declare.
        * target.def (fn_other_hard_reg_usage): New DEFHOOK.
        * config/arm/arm.c (TARGET_FN_OTHER_HARD_REG_USAGE): Redefine as
        arm_fn_other_hard_reg_usage.
        (arm_fn_other_hard_reg_usage): New function.
        * doc/tm.texi.in (@node Stack and Calling): Add Miscellaneous
        Register Hooks to @menu.
        (@node Miscellaneous Register Hooks): New node.
        (@hook TARGET_FN_OTHER_HARD_REG_USAGE): New hook.
        * doc/tm.texi: Regenerate.
        * reg-notes.def (REG_NOTE (CALL_DECL)): New reg-note
        REG_CALL_DECL.
        * calls.c (expand_call, emit_library_call_value_1): Add
        REG_CALL_DECL reg-note.
        * combine.c (distribute_notes): Handle REG_CALL_DECL reg-note.
        * emit-rtl.c (try_split): Same.
        * rtlanal.c (find_all_hard_reg_sets): Add bool implicit parameter
        and handle.
        * rtl.h (find_all_hard_reg_sets): Add bool parameter.
        * haifa-sched.c (recompute_todo_spec, check_clobbered_conditions):
        Add new argument to find_all_hard_reg_sets call.
        * cgraph.h (struct cgraph_node): Add function_used_regs,
        function_used_regs_initialized and function_used_regs_valid
        fields.
        * common.opt (fuse-caller-save): New option.
        * opts.c (default_options_table): Add OPT_LEVELS_2_PLUS entry with
        OPT_fuse_caller_save.
        * final.c: Move include of hard-reg-set.h to before rtl.h to
        declare find_all_hard_reg_sets.
        (collect_fn_hard_reg_usage, get_call_fndecl, get_call_cgraph_node)
        (get_call_reg_set_usage): New functions.
        (rest_of_handle_final): Use collect_fn_hard_reg_usage.
        * regs.h (get_call_reg_set_usage): Declare.
        * df-scan.c (df_get_call_refs): Use get_call_reg_set_usage.
        * caller-save.c (setup_save_areas, save_call_clobbered_regs): Use
        get_call_reg_set_usage.
        * resource.c (mark_set_resources, mark_target_live_regs): Use
        get_call_reg_set_usage.
        * ira-int.h (struct ira_allocno): Add crossed_calls_clobbered_regs
        field.
        (ALLOCNO_CROSSED_CALLS_CLOBBERED_REGS): Define.
        * ira-lives.c (process_bb_node_lives): Use get_call_reg_set_usage.
        Calculate ALLOCNO_CROSSED_CALLS_CLOBBERED_REGS.
        * ira-build.c (ira_create_allocno): Init
        ALLOCNO_CROSSED_CALLS_CLOBBERED_REGS.
        (create_cap_allocno, propagate_allocno_info)
        (propagate_some_info_from_allocno)
        (copy_info_to_removed_store_destinations): Handle
        ALLOCNO_CROSSED_CALLS_CLOBBERED_REGS.
        * ira-costs.c (ira_tune_allocno_costs): Use
        ALLOCNO_CROSSED_CALLS_CLOBBERED_REGS to adjust costs.
        * doc/invoke.texi (@item Optimization Options): Add
        -fuse-caller-save to gccoptlist.
        (@item -fuse-caller-save): New item.
        * lib/target-supports.exp (check_effective_target_mips16)
        (check_effective_target_micromips): New procs.
        * gcc.target/mips/mips.exp: Add use-caller-save to -ffoo/-fno-foo
        options.  Add -save-temps to mips_option_groups.
        * gcc.target/mips/aru-1.c: New test.