public inbox for gcc-patches@gcc.gnu.org
* [PATCH][IRA] Analysis of register usage of functions for usage by IRA.
@ 2013-01-25 13:05 Tom de Vries
  2013-01-25 15:46 ` Vladimir Makarov
  2014-09-01 16:41 ` Ulrich Weigand
  0 siblings, 2 replies; 59+ messages in thread
From: Tom de Vries @ 2013-01-25 13:05 UTC
  To: Vladimir Makarov, Steven Bosscher; +Cc: gcc-patches, Radovan Obradovic

[-- Attachment #1: Type: text/plain, Size: 7098 bytes --]

Vladimir,

this patch adds analysis of register usage of functions for usage by IRA.

The patch:
- adds analysis in pass_final to track which hard registers are set or clobbered
  by the function body, and stores that information in a struct cgraph_node.
- adds a target hook fn_other_hard_reg_usage to list hard registers that are
  set or clobbered by a call to a function, but are not listed as such in the
  function body, for instance registers clobbered by veneers inserted by the
  linker.
- adds a reg-note REG_CALL_DECL, to be able to easily link call_insns to their
  corresponding declaration, even after the calls may have been split into an
  insn (set register to function address) and a call_insn (call register), which
  can happen on, for instance, sh, and on mips with -mabicalls.
- uses the register analysis in IRA.
- adds an option -fuse-caller-save to control the optimization, on by default
  at -Os and -O2 and higher.
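
As a rough illustration of how the analysis and the lookup fit together, here
is a toy model. It is not the patch code: reg_set, fn_info, toy_insn,
collect_usage and call_reg_set_usage are made-up names, and a 32-bit mask
stands in for HARD_REG_SET. It only shows the shape of the idea: union the
registers written by each insn, fold in what known callees write, and give up
(leaving the result invalid) as soon as a callee is unknown.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Toy stand-in for HARD_REG_SET: one bit per hard register.  */
typedef uint32_t reg_set;

/* Toy stand-in for the per-function data kept in struct cgraph_node.  */
struct fn_info
{
  reg_set used_regs;     /* registers set or clobbered by the function */
  bool used_regs_valid;  /* false until the body was analyzed successfully */
};

/* Toy insn: the registers it writes, and the callee if it is a call.  */
struct toy_insn
{
  reg_set sets;
  bool is_call;
  const struct fn_info *callee;  /* NULL if the callee is not known */
};

/* Store in *OUT the registers a call to CALLEE may clobber.  If CALLEE was
   analyzed, that is its recorded set restricted to DEFAULT_SET; otherwise
   fall back to DEFAULT_SET and return false.  */
bool
call_reg_set_usage (const struct fn_info *callee, reg_set default_set,
                    reg_set *out)
{
  assert (out != NULL);
  if (callee != NULL && callee->used_regs_valid)
    {
      *out = callee->used_regs & default_set;
      return true;
    }
  *out = default_set;
  return false;
}

/* Analyze the N insns of a function body and record the result in FN.
   CALL_USED is the ABI's set of call-clobbered registers.  */
void
collect_usage (struct fn_info *fn, const struct toy_insn *insns, int n,
               reg_set call_used)
{
  reg_set acc = 0;

  fn->used_regs = 0;
  fn->used_regs_valid = false;

  for (int i = 0; i < n; i++)
    {
      reg_set s = insns[i].sets;
      if (insns[i].is_call)
        {
          reg_set call_set;
          if (!call_reg_set_usage (insns[i].callee, call_used, &call_set))
            return;  /* unknown callee: leave the result invalid */
          s |= call_set;
        }
      acc |= s;
    }

  fn->used_regs = acc;
  fn->used_regs_valid = true;
}
```

The toy leaves out everything the real analysis adds conservatively (fixed and
global registers, stack registers, and the target hook).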


The patch (original version by Radovan Obradovic) is similar to your patch
( http://gcc.gnu.org/ml/gcc-patches/2007-01/msg01625.html ) from 2007.
But this patch doesn't implement save area stack slot sharing.
( Btw, I've borrowed the struct cgraph_node field name and comment from the 2007
patch ).

[ Steven, you mentioned in this discussion
  ( http://gcc.gnu.org/ml/gcc/2012-10/msg00213.html ) that you are working on
  porting the 2007 patch to trunk. What is the status of that effort?
]


As an example of the functionality, consider foo and bar from test-case aru-1.c:
...
static int __attribute__((noinline))
bar (int x)
{
  return x + 3;
}

int __attribute__((noinline))
foo (int y)
{
  return y + bar (y);
}
...

Compiled at -O2, bar only sets register $2 (the first return register):
...
bar:
        .frame  $sp,0,$31               # vars= 0, regs= 0/0, args= 0, gp= 0
        .mask   0x00000000,0
        .fmask  0x00000000,0
        .set    noreorder
        .set    nomacro
        j       $31
        addiu   $2,$4,3
...

foo then can use register $3 (the second return register) instead of register
$16 to save the value in register $4 (the first argument register) over the
call, as demonstrated here in a -fno-use-caller-save vs. -fuse-caller-save diff:
...
foo:                                    foo:
# vars= 0, regs= 2/0, args= 16, gp= 8 | # vars= 0, regs= 1/0, args= 16, gp= 8
.frame  $sp,32,$31                      .frame  $sp,32,$31
.mask   0x80010000,-4                 | .mask   0x80000000,-4
.fmask  0x00000000,0                    .fmask  0x00000000,0
.set    noreorder                       .set    noreorder
.set    nomacro                         .set    nomacro
addiu   $sp,$sp,-32                     addiu   $sp,$sp,-32
sw      $31,28($sp)                     sw      $31,28($sp)
sw      $16,24($sp)                   <
.option pic0                            .option pic0
jal     bar                             jal     bar
.option pic2                            .option pic2
move    $16,$4                        | move    $3,$4

lw      $31,28($sp)                     lw      $31,28($sp)
addu    $2,$2,$16                     | addu    $2,$2,$3
lw      $16,24($sp)                   <
j       $31                             j       $31
addiu   $sp,$sp,32                      addiu   $sp,$sp,32
...
That way we skip the save and restore of register $16. $16 is callee-saved, so
foo has to preserve it when it uses it; $3 is call-clobbered, so no save is
necessary for it. Btw, a further improvement could be to reuse $4 after the
call, and eliminate the move.
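
The choice between $16 and $3 above can be modeled with a small sketch. This
is an illustration only, not how IRA computes costs: cost_of_holding, the
hold_cost enum and the 32-bit reg_set mask are made-up names, and the ABI mask
used in the usage note is an assumption covering just $2..$15.

```c
#include <assert.h>
#include <stdint.h>

/* Toy stand-in for HARD_REG_SET: one bit per hard register.  */
typedef uint32_t reg_set;

/* What it costs to keep a value in a register across one call.  */
enum hold_cost
{
  HOLD_FREE,         /* call-clobbered by the ABI, but this callee leaves it
                        alone: nothing to save */
  HOLD_CALLER_SAVE,  /* this callee may clobber it: save/restore around the
                        call */
  HOLD_CALLEE_SAVE   /* callee-saved: the current function must save and
                        restore it in its prologue and epilogue */
};

/* Classify register REGNO for a value that is live across a call.
   ABI_CALL_CLOBBERED is the set of registers any call may clobber according
   to the ABI; CALLEE_CLOBBERS is the set this particular callee is known to
   clobber (equal to ABI_CALL_CLOBBERED when nothing is known).  */
enum hold_cost
cost_of_holding (unsigned regno, reg_set abi_call_clobbered,
                 reg_set callee_clobbers)
{
  assert (regno < 32);
  reg_set bit = (reg_set) 1 << regno;

  if (!(abi_call_clobbered & bit))
    return HOLD_CALLEE_SAVE;
  if (callee_clobbers & bit)
    return HOLD_CALLER_SAVE;
  return HOLD_FREE;
}
```

With an ABI mask of 0xfffc and bar known to clobber only $2, $3 and $4 come
out as HOLD_FREE and $16 as HOLD_CALLEE_SAVE. Without the analysis, the callee
set defaults to the ABI mask, and $3 becomes HOLD_CALLER_SAVE.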


A version of this patch on top of 4.6 ran into trouble with the epilogue on arm,
where a register was clobbered by a stack pop instruction, while that was not
visible in the rtl representation. This instruction was introduced in
arm_output_epilogue by code marked with the comment 'pop call clobbered
registers if it avoids a separate stack adjustment'.
I cannot reproduce that issue on trunk. Looking at the generated rtl, it seems
that the epilogue instructions now list all registers they set, so
collect_fn_hard_reg_usage is able to analyze all clobbered registers.
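
The general rule behind this is that any register clobbered without being
visible in the rtl has to reach the function's set some other way, and the
fn_other_hard_reg_usage hook is the place for that. A toy sketch of that last
step follows. It is not the patch code: reg_set_container, hook_none,
hook_linker_scratch, finalize_usage and the choice of register 12 are all made
up for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Toy stand-in for HARD_REG_SET: one bit per hard register.  */
typedef uint32_t reg_set;

/* Toy stand-in for struct hard_reg_set_container.  */
struct reg_set_container
{
  reg_set set;
};

/* Shape of the hook: add registers to REGS->set, return nothing.  */
typedef void (*other_usage_hook) (struct reg_set_container *regs);

/* Default hook: the rtl already shows everything, so add nothing.  */
void
hook_none (struct reg_set_container *regs)
{
  (void) regs;
}

/* Hypothetical target hook: register 12 may be clobbered by code that the
   compiler never sees (for example a linker-inserted veneer).  */
enum { TOY_SCRATCH_REGNO = 12 };

void
hook_linker_scratch (struct reg_set_container *regs)
{
  regs->set |= (reg_set) 1 << TOY_SCRATCH_REGNO;
}

/* Combine the registers found by scanning the body with whatever the
   target hook adds.  */
reg_set
finalize_usage (reg_set body_sets, other_usage_hook hook)
{
  struct reg_set_container other = { 0 };

  assert (hook != NULL);
  hook (&other);
  return body_sets | other.set;
}
```

Over-reporting here only costs some optimization opportunities, while
under-reporting can produce wrong code, so a hook should err on the side of
adding registers.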


Bootstrapped and reg-tested on x86_64, including Ada. Built and reg-tested on
mips, arm, ppc and sh. No issues found. OK for stage1 trunk?

Thanks,
- Tom

2013-01-24  Radovan Obradovic  <robradovic@mips.com>
	    Tom de Vries  <tom@codesourcery.com>

	* hooks.c (hook_void_hard_reg_set_containerp): New function.
	* hooks.h (hook_void_hard_reg_set_containerp): Declare.
	* target.def (fn_other_hard_reg_usage): New DEFHOOK.
	* config/arm/arm.c (TARGET_FN_OTHER_HARD_REG_USAGE): Redefine as
	arm_fn_other_hard_reg_usage.
	(arm_fn_other_hard_reg_usage): New function.
	* doc/tm.texi.in (@node Stack and Calling): Add Miscellaneous Register
	Hooks to @menu.
	(@node Miscellaneous Register Hooks): New node.
	(@hook TARGET_FN_OTHER_HARD_REG_USAGE): New hook.
	* doc/tm.texi: Regenerate.
	* reg-notes.def (REG_NOTE (CALL_DECL)): New reg-note REG_CALL_DECL.
	* calls.c (expand_call, emit_library_call_value_1): Add REG_CALL_DECL
	reg-note.
	* combine.c (distribute_notes): Handle REG_CALL_DECL reg-note.
	* emit-rtl.c (try_split): Same.
	* rtlanal.c (find_all_hard_reg_sets): Add bool implicit parameter and
	handle.
	* rtl.h (find_all_hard_reg_sets): Add bool parameter.
	* haifa-sched.c (recompute_todo_spec, check_clobbered_conditions): Add
	new argument to find_all_hard_reg_sets call.
	* cgraph.h (struct cgraph_node): Add function_used_regs,
	function_used_regs_initialized and function_used_regs_valid fields.
	* common.opt (fuse-caller-save): New option.
	* opts.c (default_options_table): Add OPT_LEVELS_2_PLUS entry with
	OPT_fuse_caller_save.
	* final.c: Move include of hard-reg-set.h to before rtl.h to declare
	find_all_hard_reg_sets.
	(collect_fn_hard_reg_usage, get_call_fndecl, get_call_cgraph_node)
	(get_call_reg_set_usage): New functions.
	(rest_of_handle_final): Use collect_fn_hard_reg_usage.
	* regs.h (get_call_reg_set_usage): Declare.
	* df-scan.c (df_get_call_refs): Use get_call_reg_set_usage.
	* caller-save.c (setup_save_areas, save_call_clobbered_regs): Use
	get_call_reg_set_usage.
	* resource.c (mark_set_resources, mark_target_live_regs): Use
	get_call_reg_set_usage.
	* ira-int.h (struct ira_allocno): Add crossed_calls_clobbered_regs
	field.
	(ALLOCNO_CROSSED_CALLS_CLOBBERED_REGS): Define.
	* ira-lives.c (process_bb_node_lives): Use get_call_reg_set_usage.
	Calculate ALLOCNO_CROSSED_CALLS_CLOBBERED_REGS.
	* ira-build.c (ira_create_allocno): Init
	ALLOCNO_CROSSED_CALLS_CLOBBERED_REGS.
	(create_cap_allocno, propagate_allocno_info)
	(propagate_some_info_from_allocno)
	(copy_info_to_removed_store_destinations): Handle
	ALLOCNO_CROSSED_CALLS_CLOBBERED_REGS.
	* ira-costs.c (ira_tune_allocno_costs): Use
	ALLOCNO_CROSSED_CALLS_CLOBBERED_REGS to adjust costs.
	* doc/invoke.texi (@item Optimization Options): Add -fuse-caller-save to
	gccoptlist.
	(@item -fuse-caller-save): New item.

	* lib/target-supports.exp (check_effective_target_mips16)
	(check_effective_target_micromips): New procs.
	* gcc.target/mips/mips.exp: Add use-caller-save to -ffoo/-fno-foo
	options.  Add -save-temps to mips_option_groups.
	* gcc.target/mips/aru-1.c: New test.

[-- Attachment #2: aru-upstream.5.patch --]
[-- Type: text/x-patch, Size: 32348 bytes --]

Index: gcc/hooks.c
===================================================================
--- gcc/hooks.c (revision 195240)
+++ gcc/hooks.c (working copy)
@@ -446,3 +446,11 @@ void
 hook_void_gcc_optionsp (struct gcc_options *opts ATTRIBUTE_UNUSED)
 {
 }
+
+/* Generic hook that takes a struct hard_reg_set_container * and returns
+   void.  */
+
+void
+hook_void_hard_reg_set_containerp (struct hard_reg_set_container *regs ATTRIBUTE_UNUSED)
+{
+}
Index: gcc/hooks.h
===================================================================
--- gcc/hooks.h (revision 195240)
+++ gcc/hooks.h (working copy)
@@ -69,6 +69,7 @@ extern void hook_void_tree (tree);
 extern void hook_void_tree_treeptr (tree, tree *);
 extern void hook_void_int_int (int, int);
 extern void hook_void_gcc_optionsp (struct gcc_options *);
+extern void hook_void_hard_reg_set_containerp (struct hard_reg_set_container *);
 
 extern int hook_int_uint_mode_1 (unsigned int, enum machine_mode);
 extern int hook_int_const_tree_0 (const_tree);
Index: gcc/target.def
===================================================================
--- gcc/target.def (revision 195240)
+++ gcc/target.def (working copy)
@@ -2859,6 +2859,17 @@ DEFHOOK
  void, (bitmap regs),
  hook_void_bitmap)
 
+/* For targets that need to mark extra registers as clobbered on entry to
+   the function, they should define this target hook and set their
+   bits in the struct hard_reg_set_container passed in.  */
+DEFHOOK
+(fn_other_hard_reg_usage,
+ "Add any hard registers to @var{regs} that are set or clobbered by a call to\
+ the function.  This hook only needs to be defined to provide registers that\
+ cannot be found by examination of the final RTL representation of a function.",
+ void, (struct hard_reg_set_container *regs),
+ hook_void_hard_reg_set_containerp)
+
 /* Fill in additional registers set up by prologue into a regset.  */
 DEFHOOK
 (set_up_by_prologue,
Index: gcc/cgraph.h
===================================================================
--- gcc/cgraph.h (revision 195240)
+++ gcc/cgraph.h (working copy)
@@ -251,6 +251,15 @@ struct GTY(()) cgraph_node {
   /* Unique id of the node.  */
   int uid;
 
+  /* Call unsaved hard registers really used by the corresponding
+     function (including ones used by functions called by the
+     function).  */
+  HARD_REG_SET function_used_regs;
+  /* Set if function_used_regs is initialized.  */
+  unsigned function_used_regs_initialized: 1;
+  /* Set if function_used_regs is valid.  */
+  unsigned function_used_regs_valid: 1;
+
   /* Set when decl is an abstract function pointed to by the
      ABSTRACT_DECL_ORIGIN of a reachable function.  */
   unsigned abstract_and_needed : 1;
Index: gcc/rtlanal.c
===================================================================
--- gcc/rtlanal.c (revision 195240)
+++ gcc/rtlanal.c (working copy)
@@ -1028,13 +1028,13 @@ record_hard_reg_sets (rtx x, const_rtx p
 /* Examine INSN, and compute the set of hard registers written by it.
    Store it in *PSET.  Should only be called after reload.  */
 void
-find_all_hard_reg_sets (const_rtx insn, HARD_REG_SET *pset)
+find_all_hard_reg_sets (const_rtx insn, HARD_REG_SET *pset, bool implicit)
 {
   rtx link;
 
   CLEAR_HARD_REG_SET (*pset);
   note_stores (PATTERN (insn), record_hard_reg_sets, pset);
-  if (CALL_P (insn))
+  if (implicit && CALL_P (insn))
     IOR_HARD_REG_SET (*pset, call_used_reg_set);
   for (link = REG_NOTES (insn); link; link = XEXP (link, 1))
     if (REG_NOTE_KIND (link) == REG_INC)
Index: gcc/final.c
===================================================================
--- gcc/final.c (revision 195240)
+++ gcc/final.c (working copy)
@@ -48,6 +48,7 @@ along with GCC; see the file COPYING3.
 #include "tm.h"
 
 #include "tree.h"
+#include "hard-reg-set.h"
 #include "rtl.h"
 #include "tm_p.h"
 #include "regs.h"
@@ -56,7 +57,6 @@ along with GCC; see the file COPYING3.
 #include "recog.h"
 #include "conditions.h"
 #include "flags.h"
-#include "hard-reg-set.h"
 #include "output.h"
 #include "except.h"
 #include "function.h"
@@ -219,6 +219,7 @@ static int alter_cond (rtx);
 static int final_addr_vec_align (rtx);
 #endif
 static int align_fuzz (rtx, rtx, int, unsigned);
+static void collect_fn_hard_reg_usage (void);
 \f
 /* Initialize data in final at the beginning of a compilation.  */
 
@@ -4277,6 +4278,8 @@ rest_of_handle_final (void)
   rtx x;
   const char *fnname;
 
+  collect_fn_hard_reg_usage ();
+
   /* Get the function's name, as described by its RTL.  This may be
      different from the DECL_NAME name used in the source file.  */
 
@@ -4533,3 +4536,121 @@ struct rtl_opt_pass pass_clean_state =
   0                                     /* todo_flags_finish */
  }
 };
+
+/* Collect hard register usage for the current function.  */
+
+static void
+collect_fn_hard_reg_usage (void)
+{
+  rtx insn;
+  int i;
+  struct cgraph_node *node;
+  struct hard_reg_set_container other_usage;
+
+  if (!flag_use_caller_save)
+    return;
+
+  node = cgraph_get_node (current_function_decl);
+  gcc_assert (node != NULL);
+
+  gcc_assert (!node->function_used_regs_initialized);
+  node->function_used_regs_initialized = 1;
+
+  for (insn = get_insns (); insn != NULL_RTX; insn = next_insn (insn))
+    {
+      HARD_REG_SET insn_used_regs;
+
+      if (!NONDEBUG_INSN_P (insn))
+	continue;
+
+      find_all_hard_reg_sets (insn, &insn_used_regs, false);
+
+      if (CALL_P (insn)
+	  && !get_call_reg_set_usage (insn, &insn_used_regs, call_used_reg_set))
+	{
+	  CLEAR_HARD_REG_SET (node->function_used_regs);
+	  return;
+	}
+
+      IOR_HARD_REG_SET (node->function_used_regs, insn_used_regs);
+    }
+
+  /* Be conservative - mark fixed and global registers as used.  */
+  IOR_HARD_REG_SET (node->function_used_regs, fixed_reg_set);
+  for (i = 0; i < FIRST_PSEUDO_REGISTER; i++)
+    if (global_regs[i])
+      SET_HARD_REG_BIT (node->function_used_regs, i);
+
+#ifdef STACK_REGS
+  /* Handle STACK_REGS conservatively, since the df-framework does not
+     provide accurate information for them.  */
+
+  for (i = FIRST_STACK_REG; i <= LAST_STACK_REG; i++)
+    SET_HARD_REG_BIT (node->function_used_regs, i);
+#endif
+
+  CLEAR_HARD_REG_SET (other_usage.set);
+  targetm.fn_other_hard_reg_usage (&other_usage);
+  IOR_HARD_REG_SET (node->function_used_regs, other_usage.set);
+
+  node->function_used_regs_valid = 1;
+}
+
+/* Get the declaration of the function called by INSN.  */
+
+static tree
+get_call_fndecl (rtx insn)
+{
+  rtx note, datum;
+
+  if (!flag_use_caller_save)
+    return NULL_TREE;
+
+  note = find_reg_note (insn, REG_CALL_DECL, NULL_RTX);
+  if (note == NULL_RTX)
+    return NULL_TREE;
+
+  datum = XEXP (note, 0);
+  if (datum != NULL_RTX)
+    return SYMBOL_REF_DECL (datum);
+
+  return NULL_TREE;
+}
+
+static struct cgraph_node *
+get_call_cgraph_node (rtx insn)
+{
+  tree fndecl;
+
+  if (insn == NULL_RTX)
+    return NULL;
+
+  fndecl = get_call_fndecl (insn);
+  if (fndecl == NULL_TREE
+      || !targetm.binds_local_p (fndecl))
+    return NULL;
+
+  return cgraph_get_node (fndecl);
+}
+
+/* Find hard registers used by function call instruction INSN, and return them
+   in REG_SET.  Return DEFAULT_SET in REG_SET if not found.  */
+
+bool
+get_call_reg_set_usage (rtx insn, HARD_REG_SET *reg_set,
+			HARD_REG_SET default_set)
+{
+  struct cgraph_node *node = get_call_cgraph_node (insn);
+  if (node != NULL
+      && node->function_used_regs_valid)
+    {
+      COPY_HARD_REG_SET (*reg_set, node->function_used_regs);
+      AND_HARD_REG_SET (*reg_set, default_set);
+      return true;
+    }
+  else
+    {
+      COPY_HARD_REG_SET (*reg_set, default_set);
+      return false;
+    }
+}
Index: gcc/regs.h
===================================================================
--- gcc/regs.h (revision 195240)
+++ gcc/regs.h (working copy)
@@ -419,4 +419,8 @@ range_in_hard_reg_set_p (const HARD_REG_
   return true;
 }
 
+/* Get registers used by given function call instruction.  */
+extern bool get_call_reg_set_usage (rtx insn, HARD_REG_SET *reg_set,
+				    HARD_REG_SET default_set);
+
 #endif /* GCC_REGS_H */
Index: gcc/df-scan.c
===================================================================
--- gcc/df-scan.c (revision 195240)
+++ gcc/df-scan.c (working copy)
@@ -3363,10 +3363,13 @@ df_get_call_refs (struct df_collection_r
   bool is_sibling_call;
   unsigned int i;
   HARD_REG_SET defs_generated;
+  HARD_REG_SET fn_reg_set_usage;
 
   CLEAR_HARD_REG_SET (defs_generated);
   df_find_hard_reg_defs (PATTERN (insn_info->insn), &defs_generated);
   is_sibling_call = SIBLING_CALL_P (insn_info->insn);
+  get_call_reg_set_usage (insn_info->insn, &fn_reg_set_usage,
+			  regs_invalidated_by_call);
 
   for (i = 0; i < FIRST_PSEUDO_REGISTER; i++)
     {
@@ -3391,6 +3394,7 @@ df_get_call_refs (struct df_collection_r
 	    }
 	}
       else if (TEST_HARD_REG_BIT (regs_invalidated_by_call, i)
+	       && TEST_HARD_REG_BIT (fn_reg_set_usage, i)
 	       /* no clobbers for regs that are the result of the call */
 	       && !TEST_HARD_REG_BIT (defs_generated, i)
 	       && (!is_sibling_call
Index: gcc/haifa-sched.c
===================================================================
--- gcc/haifa-sched.c (revision 195240)
+++ gcc/haifa-sched.c (working copy)
@@ -1271,7 +1271,7 @@ recompute_todo_spec (rtx next, bool for_
 	  {
 	    HARD_REG_SET t;
 
-	    find_all_hard_reg_sets (prev, &t);
+	    find_all_hard_reg_sets (prev, &t, true);
 	    if (TEST_HARD_REG_BIT (t, regno))
 	      return HARD_DEP;
 	    if (prev == pro)
@@ -3041,7 +3041,7 @@ check_clobbered_conditions (rtx insn)
   if ((current_sched_info->flags & DO_PREDICATION) == 0)
     return;
 
-  find_all_hard_reg_sets (insn, &t);
+  find_all_hard_reg_sets (insn, &t, true);
 
  restart:
   for (i = 0; i < ready.n_ready; i++)
Index: gcc/caller-save.c
===================================================================
--- gcc/caller-save.c (revision 195240)
+++ gcc/caller-save.c (working copy)
@@ -441,7 +441,7 @@ setup_save_areas (void)
       freq = REG_FREQ_FROM_BB (BLOCK_FOR_INSN (insn));
       REG_SET_TO_HARD_REG_SET (hard_regs_to_save,
 			       &chain->live_throughout);
-      COPY_HARD_REG_SET (used_regs, call_used_reg_set);
+      get_call_reg_set_usage (insn, &used_regs, call_used_reg_set);
 
       /* Record all registers set in this call insn.  These don't
 	 need to be saved.  N.B. the call insn might set a subreg
@@ -525,7 +525,7 @@ setup_save_areas (void)
 
 	  REG_SET_TO_HARD_REG_SET (hard_regs_to_save,
 				   &chain->live_throughout);
-	  COPY_HARD_REG_SET (used_regs, call_used_reg_set);
+	  get_call_reg_set_usage (insn, &used_regs, call_used_reg_set);
 
 	  /* Record all registers set in this call insn.  These don't
 	     need to be saved.  N.B. the call insn might set a subreg
@@ -804,6 +804,7 @@ save_call_clobbered_regs (void)
 	    {
 	      unsigned regno;
 	      HARD_REG_SET hard_regs_to_save;
+	      HARD_REG_SET call_def_reg_set;
 	      reg_set_iterator rsi;
 	      rtx cheap;
 
@@ -854,7 +855,9 @@ save_call_clobbered_regs (void)
 	      AND_COMPL_HARD_REG_SET (hard_regs_to_save, call_fixed_reg_set);
 	      AND_COMPL_HARD_REG_SET (hard_regs_to_save, this_insn_sets);
 	      AND_COMPL_HARD_REG_SET (hard_regs_to_save, hard_regs_saved);
-	      AND_HARD_REG_SET (hard_regs_to_save, call_used_reg_set);
+	      get_call_reg_set_usage (insn, &call_def_reg_set,
+				      call_used_reg_set);
+	      AND_HARD_REG_SET (hard_regs_to_save, call_def_reg_set);
 
 	      for (regno = 0; regno < FIRST_PSEUDO_REGISTER; regno++)
 		if (TEST_HARD_REG_BIT (hard_regs_to_save, regno))
Index: gcc/ira-int.h
===================================================================
--- gcc/ira-int.h (revision 195240)
+++ gcc/ira-int.h (working copy)
@@ -374,6 +374,8 @@ struct ira_allocno
   /* The number of calls across which it is live, but which should not
      affect register preferences.  */
   int cheap_calls_crossed_num;
+  /* Registers clobbered by intersected calls.  */
+   HARD_REG_SET crossed_calls_clobbered_regs;
   /* Array of usage costs (accumulated and the one updated during
      coloring) for each hard register of the allocno class.  The
      member value can be NULL if all costs are the same and equal to
@@ -417,6 +419,8 @@ struct ira_allocno
 #define ALLOCNO_CALL_FREQ(A) ((A)->call_freq)
 #define ALLOCNO_CALLS_CROSSED_NUM(A) ((A)->calls_crossed_num)
 #define ALLOCNO_CHEAP_CALLS_CROSSED_NUM(A) ((A)->cheap_calls_crossed_num)
+#define ALLOCNO_CROSSED_CALLS_CLOBBERED_REGS(A) \
+  ((A)->crossed_calls_clobbered_regs)
 #define ALLOCNO_MEM_OPTIMIZED_DEST(A) ((A)->mem_optimized_dest)
 #define ALLOCNO_MEM_OPTIMIZED_DEST_P(A) ((A)->mem_optimized_dest_p)
 #define ALLOCNO_SOMEWHERE_RENAMED_P(A) ((A)->somewhere_renamed_p)
Index: gcc/opts.c
===================================================================
--- gcc/opts.c (revision 195240)
+++ gcc/opts.c (working copy)
@@ -484,6 +484,7 @@ static const struct default_options defa
     { OPT_LEVELS_2_PLUS, OPT_ftree_tail_merge, NULL, 1 },
     { OPT_LEVELS_2_PLUS_SPEED_ONLY, OPT_foptimize_strlen, NULL, 1 },
     { OPT_LEVELS_2_PLUS, OPT_fhoist_adjacent_loads, NULL, 1 },
+    { OPT_LEVELS_2_PLUS, OPT_fuse_caller_save, NULL, 1 },
 
     /* -O3 optimizations.  */
     { OPT_LEVELS_3_PLUS, OPT_ftree_loop_distribute_patterns, NULL, 1 },
Index: gcc/ira-lives.c
===================================================================
--- gcc/ira-lives.c (revision 195240)
+++ gcc/ira-lives.c (working copy)
@@ -1273,6 +1273,10 @@ process_bb_node_lives (ira_loop_tree_nod
 		  ira_object_t obj = ira_object_id_map[i];
 		  ira_allocno_t a = OBJECT_ALLOCNO (obj);
 		  int num = ALLOCNO_NUM (a);
+		  HARD_REG_SET this_call_used_reg_set;
+
+		  get_call_reg_set_usage (insn, &this_call_used_reg_set,
+					  call_used_reg_set);
 
 		  /* Don't allocate allocnos that cross setjmps or any
 		     call, if this function receives a nonlocal
@@ -1287,9 +1291,9 @@ process_bb_node_lives (ira_loop_tree_nod
 		  if (can_throw_internal (insn))
 		    {
 		      IOR_HARD_REG_SET (OBJECT_CONFLICT_HARD_REGS (obj),
-					call_used_reg_set);
+					this_call_used_reg_set);
 		      IOR_HARD_REG_SET (OBJECT_TOTAL_CONFLICT_HARD_REGS (obj),
-					call_used_reg_set);
+					this_call_used_reg_set);
 		    }
 
 		  if (sparseset_bit_p (allocnos_processed, num))
@@ -1306,6 +1310,8 @@ process_bb_node_lives (ira_loop_tree_nod
 		  /* Mark it as saved at the next call.  */
 		  allocno_saved_at_call[num] = last_call_num + 1;
 		  ALLOCNO_CALLS_CROSSED_NUM (a)++;
+		  IOR_HARD_REG_SET (ALLOCNO_CROSSED_CALLS_CLOBBERED_REGS (a),
+				    this_call_used_reg_set);
 		  if (cheap_reg != NULL_RTX
 		      && ALLOCNO_REGNO (a) == (int) REGNO (cheap_reg))
 		    ALLOCNO_CHEAP_CALLS_CROSSED_NUM (a)++;
Index: gcc/ira-build.c
===================================================================
--- gcc/ira-build.c (revision 195240)
+++ gcc/ira-build.c (working copy)
@@ -506,6 +506,7 @@ ira_create_allocno (int regno, bool cap_
   ALLOCNO_CALL_FREQ (a) = 0;
   ALLOCNO_CALLS_CROSSED_NUM (a) = 0;
   ALLOCNO_CHEAP_CALLS_CROSSED_NUM (a) = 0;
+  CLEAR_HARD_REG_SET (ALLOCNO_CROSSED_CALLS_CLOBBERED_REGS (a));
 #ifdef STACK_REGS
   ALLOCNO_NO_STACK_REG_P (a) = false;
   ALLOCNO_TOTAL_NO_STACK_REG_P (a) = false;
@@ -903,6 +904,8 @@ create_cap_allocno (ira_allocno_t a)
 
   ALLOCNO_CALLS_CROSSED_NUM (cap) = ALLOCNO_CALLS_CROSSED_NUM (a);
   ALLOCNO_CHEAP_CALLS_CROSSED_NUM (cap) = ALLOCNO_CHEAP_CALLS_CROSSED_NUM (a);
+  IOR_HARD_REG_SET (ALLOCNO_CROSSED_CALLS_CLOBBERED_REGS (cap),
+		    ALLOCNO_CROSSED_CALLS_CLOBBERED_REGS (a));
   if (internal_flag_ira_verbose > 2 && ira_dump_file != NULL)
     {
       fprintf (ira_dump_file, "    Creating cap ");
@@ -1822,6 +1825,8 @@ propagate_allocno_info (void)
 	    += ALLOCNO_CALLS_CROSSED_NUM (a);
 	  ALLOCNO_CHEAP_CALLS_CROSSED_NUM (parent_a)
 	    += ALLOCNO_CHEAP_CALLS_CROSSED_NUM (a);
+ 	  IOR_HARD_REG_SET (ALLOCNO_CROSSED_CALLS_CLOBBERED_REGS (parent_a),
+ 			    ALLOCNO_CROSSED_CALLS_CLOBBERED_REGS (a));
 	  ALLOCNO_EXCESS_PRESSURE_POINTS_NUM (parent_a)
 	    += ALLOCNO_EXCESS_PRESSURE_POINTS_NUM (a);
 	  aclass = ALLOCNO_CLASS (a);
@@ -2202,6 +2207,9 @@ propagate_some_info_from_allocno (ira_al
   ALLOCNO_CALLS_CROSSED_NUM (a) += ALLOCNO_CALLS_CROSSED_NUM (from_a);
   ALLOCNO_CHEAP_CALLS_CROSSED_NUM (a)
     += ALLOCNO_CHEAP_CALLS_CROSSED_NUM (from_a);
+  IOR_HARD_REG_SET (ALLOCNO_CROSSED_CALLS_CLOBBERED_REGS (a),
+ 		    ALLOCNO_CROSSED_CALLS_CLOBBERED_REGS (from_a));
+
   ALLOCNO_EXCESS_PRESSURE_POINTS_NUM (a)
     += ALLOCNO_EXCESS_PRESSURE_POINTS_NUM (from_a);
   if (! ALLOCNO_BAD_SPILL_P (from_a))
@@ -2827,6 +2835,8 @@ copy_info_to_removed_store_destinations
 	+= ALLOCNO_CALLS_CROSSED_NUM (a);
       ALLOCNO_CHEAP_CALLS_CROSSED_NUM (parent_a)
 	+= ALLOCNO_CHEAP_CALLS_CROSSED_NUM (a);
+      IOR_HARD_REG_SET (ALLOCNO_CROSSED_CALLS_CLOBBERED_REGS (parent_a),
+ 			ALLOCNO_CROSSED_CALLS_CLOBBERED_REGS (a));
       ALLOCNO_EXCESS_PRESSURE_POINTS_NUM (parent_a)
 	+= ALLOCNO_EXCESS_PRESSURE_POINTS_NUM (a);
       merged_p = true;
Index: gcc/calls.c
===================================================================
--- gcc/calls.c (revision 195240)
+++ gcc/calls.c (working copy)
@@ -3158,6 +3158,19 @@ expand_call (tree exp, rtx target, int i
 		   next_arg_reg, valreg, old_inhibit_defer_pop, call_fusage,
 		   flags, args_so_far);
 
+      if (flag_use_caller_save)
+	{
+	  rtx last, datum = NULL_RTX;
+	  if (fndecl != NULL_TREE)
+	    {
+	      datum = XEXP (DECL_RTL (fndecl), 0);
+	      gcc_assert (datum != NULL_RTX
+			  && GET_CODE (datum) == SYMBOL_REF);
+	    }
+	  last = last_call_insn ();
+	  add_reg_note (last, REG_CALL_DECL, datum);
+	}
+
       /* If the call setup or the call itself overlaps with anything
 	 of the argument setup we probably clobbered our call address.
 	 In that case we can't do sibcalls.  */
@@ -4183,6 +4196,14 @@ emit_library_call_value_1 (int retval, r
 	       valreg,
 	       old_inhibit_defer_pop + 1, call_fusage, flags, args_so_far);
 
+  if (flag_use_caller_save)
+    {
+      rtx last, datum = orgfun;
+      gcc_assert (GET_CODE (datum) == SYMBOL_REF);
+      last = last_call_insn ();
+      add_reg_note (last, REG_CALL_DECL, datum);
+    }
+
   /* Right-shift returned value if necessary.  */
   if (!pcc_struct_value
       && TYPE_MODE (tfom) != BLKmode
Index: gcc/emit-rtl.c
===================================================================
--- gcc/emit-rtl.c (revision 195240)
+++ gcc/emit-rtl.c (working copy)
@@ -3517,6 +3517,7 @@ try_split (rtx pat, rtx trial, int last)
   int probability;
   rtx insn_last, insn;
   int njumps = 0;
+  rtx call_insn = NULL_RTX;
 
   /* We're not good at redistributing frame information.  */
   if (RTX_FRAME_RELATED_P (trial))
@@ -3589,6 +3590,9 @@ try_split (rtx pat, rtx trial, int last)
 	  {
 	    rtx next, *p;
 
+	    gcc_assert (call_insn == NULL_RTX);
+	    call_insn = insn;
+
 	    /* Add the old CALL_INSN_FUNCTION_USAGE to whatever the
 	       target may have explicitly specified.  */
 	    p = &CALL_INSN_FUNCTION_USAGE (insn);
@@ -3660,6 +3664,11 @@ try_split (rtx pat, rtx trial, int last)
 	  fixup_args_size_notes (NULL_RTX, insn_last, INTVAL (XEXP (note, 0)));
 	  break;
 
+	case REG_CALL_DECL:
+	  gcc_assert (call_insn != NULL_RTX);
+	  add_reg_note (call_insn, REG_NOTE_KIND (note), XEXP (note, 0));
+	  break;
+
 	default:
 	  break;
 	}
Index: gcc/common.opt
===================================================================
--- gcc/common.opt (revision 195240)
+++ gcc/common.opt (working copy)
@@ -2540,4 +2540,8 @@ Create a position independent executable
 z
 Driver Joined Separate
 
+fuse-caller-save
+Common Report Var(flag_use_caller_save) Optimization
+Use caller save register across calls if possible
+
 ; This comment is to ensure we retain the blank line above.
Index: gcc/ira-costs.c
===================================================================
--- gcc/ira-costs.c (revision 195240)
+++ gcc/ira-costs.c (working copy)
@@ -2082,6 +2082,7 @@ ira_tune_allocno_costs (void)
   ira_allocno_object_iterator oi;
   ira_object_t obj;
   bool skip_p;
+  HARD_REG_SET *crossed_calls_clobber_regs;
 
   FOR_EACH_ALLOCNO (a, ai)
     {
@@ -2116,17 +2117,24 @@ ira_tune_allocno_costs (void)
 		continue;
 	      rclass = REGNO_REG_CLASS (regno);
 	      cost = 0;
-	      if (ira_hard_reg_set_intersection_p (regno, mode, call_used_reg_set)
-		  || HARD_REGNO_CALL_PART_CLOBBERED (regno, mode))
-		cost += (ALLOCNO_CALL_FREQ (a)
-			 * (ira_memory_move_cost[mode][rclass][0]
-			    + ira_memory_move_cost[mode][rclass][1]));
+	      crossed_calls_clobber_regs
+		= &(ALLOCNO_CROSSED_CALLS_CLOBBERED_REGS (a));
+	      if (ira_hard_reg_set_intersection_p (regno, mode,
+						   *crossed_calls_clobber_regs))
+		{
+		  if (ira_hard_reg_set_intersection_p (regno, mode,
+						       call_used_reg_set)
+		      || HARD_REGNO_CALL_PART_CLOBBERED (regno, mode))
+		    cost += (ALLOCNO_CALL_FREQ (a)
+			     * (ira_memory_move_cost[mode][rclass][0]
+				+ ira_memory_move_cost[mode][rclass][1]));
 #ifdef IRA_HARD_REGNO_ADD_COST_MULTIPLIER
-	      cost += ((ira_memory_move_cost[mode][rclass][0]
-			+ ira_memory_move_cost[mode][rclass][1])
-		       * ALLOCNO_FREQ (a)
-		       * IRA_HARD_REGNO_ADD_COST_MULTIPLIER (regno) / 2);
+		  cost += ((ira_memory_move_cost[mode][rclass][0]
+			    + ira_memory_move_cost[mode][rclass][1])
+			   * ALLOCNO_FREQ (a)
+			   * IRA_HARD_REGNO_ADD_COST_MULTIPLIER (regno) / 2);
 #endif
+		}
 	      if (INT_MAX - cost < reg_costs[j])
 		reg_costs[j] = INT_MAX;
 	      else
Index: gcc/rtl.h
===================================================================
--- gcc/rtl.h (revision 195240)
+++ gcc/rtl.h (working copy)
@@ -2039,7 +2039,7 @@ extern const_rtx set_of (const_rtx, cons
 extern void record_hard_reg_sets (rtx, const_rtx, void *);
 extern void record_hard_reg_uses (rtx *, void *);
 #ifdef HARD_CONST
-extern void find_all_hard_reg_sets (const_rtx, HARD_REG_SET *);
+extern void find_all_hard_reg_sets (const_rtx, HARD_REG_SET *, bool);
 #endif
 extern void note_stores (const_rtx, void (*) (rtx, const_rtx, void *), void *);
 extern void note_uses (rtx *, void (*) (rtx *, void *), void *);
Index: gcc/combine.c
===================================================================
--- gcc/combine.c (revision 195240)
+++ gcc/combine.c (working copy)
@@ -13188,6 +13188,7 @@ distribute_notes (rtx notes, rtx from_in
 	case REG_NORETURN:
 	case REG_SETJMP:
 	case REG_TM:
+	case REG_CALL_DECL:
 	  /* These notes must remain with the call.  It should not be
 	     possible for both I2 and I3 to be a call.  */
 	  if (CALL_P (i3))
Index: gcc/resource.c
===================================================================
--- gcc/resource.c (revision 195240)
+++ gcc/resource.c (working copy)
@@ -649,10 +649,12 @@ mark_set_resources (rtx x, struct resour
       if (mark_type == MARK_SRC_DEST_CALL)
 	{
 	  rtx link;
+	  HARD_REG_SET regs;
 
 	  res->cc = res->memory = 1;
 
-	  IOR_HARD_REG_SET (res->regs, regs_invalidated_by_call);
+	  get_call_reg_set_usage (x, &regs, regs_invalidated_by_call);
+	  IOR_HARD_REG_SET (res->regs, regs);
 
 	  for (link = CALL_INSN_FUNCTION_USAGE (x);
 	       link; link = XEXP (link, 1))
@@ -998,11 +1000,15 @@ mark_target_live_regs (rtx insns, rtx ta
 
 	  if (CALL_P (real_insn))
 	    {
+	      HARD_REG_SET regs_invalidated_by_this_call;
 	      /* CALL clobbers all call-used regs that aren't fixed except
 		 sp, ap, and fp.  Do this before setting the result of the
 		 call live.  */
-	      AND_COMPL_HARD_REG_SET (current_live_regs,
+	      get_call_reg_set_usage (real_insn,
+				      &regs_invalidated_by_this_call,
 				      regs_invalidated_by_call);
+	      AND_COMPL_HARD_REG_SET (current_live_regs,
+				      regs_invalidated_by_this_call);
 
 	      /* A CALL_INSN sets any global register live, since it may
 		 have been modified by the call.  */
Index: gcc/reg-notes.def
===================================================================
--- gcc/reg-notes.def (revision 195240)
+++ gcc/reg-notes.def (working copy)
@@ -216,3 +216,8 @@ REG_NOTE (ARGS_SIZE)
    that the return value of a call can be used to reinitialize a
    pseudo reg.  */
 REG_NOTE (RETURNED)
+
+/* Used to mark a call with the function decl called by the call.
+   The decl might not be available in the call due to splitting of the call
+   insn.  This note is a SYMBOL_REF.  */
+REG_NOTE (CALL_DECL)
Index: gcc/doc/tm.texi
===================================================================
--- gcc/doc/tm.texi (revision 195418)
+++ gcc/doc/tm.texi (working copy)
@@ -3074,6 +3074,7 @@ This describes the stack layout and call
 * Profiling::
 * Tail Calls::
 * Stack Smashing Protection::
+* Miscellaneous Register Hooks::
 @end menu
 
 @node Frame Layout
@@ -4999,6 +5000,14 @@ normally defined in @file{libgcc2.c}.
 Whether this target supports splitting the stack when the options described in @var{opts} have been passed.  This is called after options have been parsed, so the target may reject splitting the stack in some configurations.  The default version of this hook returns false.  If @var{report} is true, this function may issue a warning or error; if @var{report} is false, it must simply return a value
 @end deftypefn
 
+@node Miscellaneous Register Hooks
+@subsection Miscellaneous register hooks
+@cindex miscellaneous register hooks
+
+@deftypefn {Target Hook} void TARGET_FN_OTHER_HARD_REG_USAGE (struct hard_reg_set_container *@var{regs})
+Add any hard registers to @var{regs} that are set or clobbered by a call to the function.  This hook only needs to be defined to provide registers that cannot be found by examination of the final RTL representation of a function.
+@end deftypefn
+
 @node Varargs
 @section Implementing the Varargs Macros
 @cindex varargs implementation
Index: gcc/doc/tm.texi.in
===================================================================
--- gcc/doc/tm.texi.in (revision 195418)
+++ gcc/doc/tm.texi.in (working copy)
@@ -3042,6 +3042,7 @@ This describes the stack layout and call
 * Profiling::
 * Tail Calls::
 * Stack Smashing Protection::
+* Miscellaneous Register Hooks::
 @end menu
 
 @node Frame Layout
@@ -4922,6 +4923,12 @@ normally defined in @file{libgcc2.c}.
 
 @hook TARGET_SUPPORTS_SPLIT_STACK
 
+@node Miscellaneous Register Hooks
+@subsection Miscellaneous register hooks
+@cindex miscellaneous register hooks
+
+@hook TARGET_FN_OTHER_HARD_REG_USAGE
+
 @node Varargs
 @section Implementing the Varargs Macros
 @cindex varargs implementation
Index: gcc/doc/invoke.texi
===================================================================
--- gcc/doc/invoke.texi (revision 195418)
+++ gcc/doc/invoke.texi (working copy)
@@ -419,8 +419,8 @@ Objective-C and Objective-C++ Dialects}.
 -ftree-ter -ftree-vect-loop-version -ftree-vectorize -ftree-vrp @gol
 -funit-at-a-time -funroll-all-loops -funroll-loops @gol
 -funsafe-loop-optimizations -funsafe-math-optimizations -funswitch-loops @gol
--fvariable-expansion-in-unroller -fvect-cost-model -fvpt -fweb @gol
--fwhole-program -fwpa -fuse-ld=@var{linker} -fuse-linker-plugin @gol
+-fuse-caller-save -fvariable-expansion-in-unroller -fvect-cost-model -fvpt @gol
+-fweb -fwhole-program -fwpa -fuse-ld=@var{linker} -fuse-linker-plugin @gol
 --param @var{name}=@var{value}
 -O  -O0  -O1  -O2  -O3  -Os -Ofast -Og}
 
@@ -7355,6 +7355,14 @@ and then tries to find ways to combine t
 
 Enabled by default at @option{-O1} and higher.
 
+@item -fuse-caller-save
+Use caller save registers for allocation if those registers are not used by
+any called function.  In that case it is not necessary to save and restore
+them around calls.  This is only possible if the called functions are part of
+the same compilation unit as the current function and are compiled before it.
+
+Enabled at levels @option{-O2}, @option{-O3}, @option{-Os}.
+
 @item -fconserve-stack
 @opindex fconserve-stack
 Attempt to minimize stack usage.  The compiler attempts to use less
Index: gcc/config/arm/arm.c
===================================================================
--- gcc/config/arm/arm.c (revision 195240)
+++ gcc/config/arm/arm.c (working copy)
@@ -270,6 +270,7 @@ static bool arm_vectorize_vec_perm_const
 					     const unsigned char *sel);
 static void arm_canonicalize_comparison (int *code, rtx *op0, rtx *op1,
 					 bool op0_preserve_value);
+static void arm_fn_other_hard_reg_usage (struct hard_reg_set_container *);
 \f
 /* Table of machine attributes.  */
 static const struct attribute_spec arm_attribute_table[] =
@@ -633,6 +634,10 @@ static const struct attribute_spec arm_a
 #define TARGET_CANONICALIZE_COMPARISON \
   arm_canonicalize_comparison
 
+#undef TARGET_FN_OTHER_HARD_REG_USAGE
+#define TARGET_FN_OTHER_HARD_REG_USAGE \
+  arm_fn_other_hard_reg_usage
+
 struct gcc_target targetm = TARGET_INITIALIZER;
 \f
 /* Obstack for minipool constant handling.  */
@@ -3695,6 +3700,19 @@ arm_canonicalize_comparison (int *code,
     }
 }
 
+/* Implement TARGET_FN_OTHER_HARD_REG_USAGE.  */
+
+static void
+arm_fn_other_hard_reg_usage (struct hard_reg_set_container *regs)
+{
+  if (TARGET_AAPCS_BASED)
+    {
+      /* For AAPCS, IP and CC can be clobbered by veneers inserted by the
+	 linker.  */
+      SET_HARD_REG_BIT (regs->set, IP_REGNUM);
+      SET_HARD_REG_BIT (regs->set, CC_REGNUM);
+    }
+}
 
 /* Define how to find the value returned by a function.  */
 
Index: gcc/testsuite/lib/target-supports.exp
===================================================================
--- gcc/testsuite/lib/target-supports.exp (revision 195240)
+++ gcc/testsuite/lib/target-supports.exp (working copy)
@@ -897,6 +897,26 @@ proc check_effective_target_mips16_attri
     } [add_options_for_mips16_attribute ""]]
 }
 
+# Return 1 if the target generates mips16 code by default.
+
+proc check_effective_target_mips16 { } {
+    return [check_no_compiler_messages mips16 assembly {
+	#if !(defined __mips16)
+	#error FOO
+	#endif
+    } ""]
+}
+
+# Return 1 if the target generates micromips code by default.
+
+proc check_effective_target_micromips { } {
+    return [check_no_compiler_messages micromips assembly {
+	#if !(defined __mips_micromips)
+	#error FOO
+	#endif
+    } ""]
+}
+
 # Return 1 if the target supports long double larger than double when
 # using the new ABI, 0 otherwise.
 
Index: gcc/testsuite/gcc.target/mips/mips.exp
===================================================================
--- gcc/testsuite/gcc.target/mips/mips.exp (revision 195240)
+++ gcc/testsuite/gcc.target/mips/mips.exp (working copy)
@@ -245,6 +245,7 @@ set mips_option_groups {
     small-data "-G[0-9]+"
     warnings "-w"
     dump "-fdump-.*"
+    save_temps "-save-temps"
 }
 
 # Add -mfoo/-mno-foo options to mips_option_groups.
@@ -301,6 +302,7 @@ foreach option {
     tree-vectorize
     unroll-all-loops
     unroll-loops
+    use-caller-save
 } {
     lappend mips_option_groups $option "-f(no-|)$option"
 }
Index: gcc/testsuite/gcc.target/mips/aru-1.c
===================================================================
--- /dev/null (new file)
+++ gcc/testsuite/gcc.target/mips/aru-1.c (revision 0)
@@ -0,0 +1,38 @@
+/* { dg-do run } */
+/* { dg-options "-fuse-caller-save -save-temps" } */
+/* { dg-skip-if "" { *-*-* }  { "*" } { "-Os" } } */
+/* Testing -fuse-caller-save optimization option.  */
+
+static int __attribute__((noinline))
+bar (int x)
+{
+  return x + 3;
+}
+
+int __attribute__((noinline))
+foo (int y)
+{
+  return y + bar (y);
+}
+
+int
+main (void)
+{
+  return !(foo (5) == 13);
+}
+
+/* Check that there are only 2 stack-saves: r31 in main and foo.  */
+
+/* Variant not mips16.  Check that there are only 2 sw/sd.  */
+/* { dg-final { scan-assembler-times "(?n)s\[wd\]\t\\\$.*,.*\\(\\\$sp\\)" 2 { target { ! mips16 } } } } */
+
+/* Variant not mips16, Subvariant micromips.  Additionally check there's no
+   swm.  */
+/* { dg-final { scan-assembler-times "(?n)swm\t\\\$.*,.*\\(\\\$sp\\)" 0 {target micromips } } } */
+
+/* Variant mips16.  The save can save 1 or more registers, check that only 1 is
+   saved, twice in total.  */
+/* { dg-final { scan-assembler-times "(?n)save\t\[0-9\]*,\\\$\[^,\]*\$" 2 { target mips16 } } } */
+
+/* Check that the first caller-save register is unused.  */
+/* { dg-final { scan-assembler-not "(\\\$16)" } } */

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [PATCH][IRA] Analysis of register usage of functions for usage by IRA.
  2013-01-25 13:05 [PATCH][IRA] Analysis of register usage of functions for usage by IRA Tom de Vries
@ 2013-01-25 15:46 ` Vladimir Makarov
  2013-02-07 19:12   ` Tom de Vries
  2014-09-01 16:41 ` Ulrich Weigand
  1 sibling, 1 reply; 59+ messages in thread
From: Vladimir Makarov @ 2013-01-25 15:46 UTC (permalink / raw)
  To: Tom de Vries; +Cc: Steven Bosscher, gcc-patches, Radovan Obradovic

On 01/25/2013 08:05 AM, Tom de Vries wrote:
> Vladimir,
>
> this patch adds analysis of register usage of functions for usage by IRA.
>
> The patch:
> - adds analysis in pass_final to track which hard registers are set or clobbered
>    by the function body, and stores that information in a struct cgraph_node.
> - adds a target hook fn_other_hard_reg_usage to list hard registers that are
>    set or clobbered by a call to a function, but are not listed as such in the
>    function body, such as f.i. registers clobbered by veneers inserted by the
>    linker.
> - adds a reg-note REG_CALL_DECL, to be able to easily link call_insns to their
>    corresponding declaration, even after the calls may have been split into an
>    insn (set register to function address) and a call_insn (call register), which
>    can happen for f.i. sh, and mips with -mabi-calls.
> - uses the register analysis in IRA.
> - adds an option -fuse-caller-save to control the optimization, on by default
>    at -Os and -O2 and higher.
>
>
> The patch (original version by Radovan Obradovic) is similar to your patch
> ( http://gcc.gnu.org/ml/gcc-patches/2007-01/msg01625.html ) from 2007.
> But this patch doesn't implement save area stack slot sharing.
> ( Btw, I've borrowed the struct cgraph_node field name and comment from the 2007
> patch ).
>
> [ Steven, you mentioned in this discussion
>    ( http://gcc.gnu.org/ml/gcc/2012-10/msg00213.html ) that you are working on
>    porting the 2007 patch to trunk. What is the status of that effort?
> ]
>
>
> As an example of the functionality, consider foo and bar from test-case aru-1.c:
> ...
> static int __attribute__((noinline))
> bar (int x)
> {
>    return x + 3;
> }
>
> int __attribute__((noinline))
> foo (int y)
> {
>    return y + bar (y);
> }
> ...
>
> Compiled at -O2, bar only sets register $2 (the first return register):
> ...
> bar:
>          .frame  $sp,0,$31               # vars= 0, regs= 0/0, args= 0, gp= 0
>          .mask   0x00000000,0
>          .fmask  0x00000000,0
>          .set    noreorder
>          .set    nomacro
>          j       $31
>          addiu   $2,$4,3
> ...
>
> foo then can use register $3 (the second return register) instead of register
> $16 to save the value in register $4 (the first argument register) over the
> call, as demonstrated here in a -fno-use-caller-save vs. -fuse-caller-save diff:
> ...
> foo:                                    foo:
> # vars= 0, regs= 2/0, args= 16, gp= 8 | # vars= 0, regs= 1/0, args= 16, gp= 8
> .frame  $sp,32,$31                      .frame  $sp,32,$31
> .mask   0x80010000,-4                 | .mask   0x80000000,-4
> .fmask  0x00000000,0                    .fmask  0x00000000,0
> .set    noreorder                       .set    noreorder
> .set    nomacro                         .set    nomacro
> addiu   $sp,$sp,-32                     addiu   $sp,$sp,-32
> sw      $31,28($sp)                     sw      $31,28($sp)
> sw      $16,24($sp)                   <
> .option pic0                            .option pic0
> jal     bar                             jal     bar
> .option pic2                            .option pic2
> move    $16,$4                        | move    $3,$4
>
> lw      $31,28($sp)                     lw      $31,28($sp)
> addu    $2,$2,$16                     | addu    $2,$2,$3
> lw      $16,24($sp)                   <
> j       $31                             j       $31
> addiu   $sp,$sp,32                      addiu   $sp,$sp,32
> ...
> That way we skip the save and restore of register $16, which is not necessary
> for $3. Btw, a further improvement could be to reuse $4 after the call, and
> eliminate the move.
>
>
> A version of this patch on top of 4.6 ran into trouble with the epilogue on arm,
> where a register was clobbered by a stack pop instruction, while that was not
> visible in the rtl representation. This instruction was introduced in
> arm_output_epilogue by code marked with the comment 'pop call clobbered
> registers if it avoids a separate stack adjustment'.
> I cannot reproduce that issue on trunk. Looking at the generated rtl, it seems
> that the epilogue instructions now list all registers set by it, so
> collect_fn_hard_reg_usage is able to analyze all clobbered registers.
>
>
> Bootstrapped and reg-tested on x86_64, Ada inclusive. Build and reg-tested on
> mips, arm, ppc and sh. No issues found. OK for stage1 trunk?
>
>
Thanks for the patch.  I'll look at it during the next week.

Right now I see that the code is based on reload, which uses
caller-save.c.  LRA does not use caller-save.c at all.  Right now we
have LRA support only for x86/x86-64, but the next version will probably
have a few more targets based on LRA.  Fortunately, the LRA modification
will be pretty easy with all this machinery.

I am going to use the ira-improv branch for some of my future work for
GCC 4.9, and I am going to merge trunk into it regularly (about once per
month).  So if you want, you could use the branch for your work too.  But
this is absolutely up to you.  I don't mind if you put this patch directly
on the trunk at stage1 when the review is finished.

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [PATCH][IRA] Analysis of register usage of functions for usage by IRA.
  2013-01-25 15:46 ` Vladimir Makarov
@ 2013-02-07 19:12   ` Tom de Vries
  2013-02-13 22:35     ` Vladimir Makarov
  0 siblings, 1 reply; 59+ messages in thread
From: Tom de Vries @ 2013-02-07 19:12 UTC (permalink / raw)
  To: Vladimir Makarov; +Cc: Steven Bosscher, gcc-patches, Radovan Obradovic

Vladimir,

On 25/01/13 16:36, Vladimir Makarov wrote:
> On 01/25/2013 08:05 AM, Tom de Vries wrote:
>> Vladimir,
>>
>> this patch adds analysis of register usage of functions for usage by IRA.
>>
>> The patch:
>> - adds analysis in pass_final to track which hard registers are set or clobbered
>>    by the function body, and stores that information in a struct cgraph_node.
>> - adds a target hook fn_other_hard_reg_usage to list hard registers that are
>>    set or clobbered by a call to a function, but are not listed as such in the
>>    function body, such as f.i. registers clobbered by veneers inserted by the
>>    linker.
>> - adds a reg-note REG_CALL_DECL, to be able to easily link call_insns to their
>>    corresponding declaration, even after the calls may have been split into an
>>    insn (set register to function address) and a call_insn (call register), which
>>    can happen for f.i. sh, and mips with -mabi-calls.
>> - uses the register analysis in IRA.
>> - adds an option -fuse-caller-save to control the optimization, on by default
>>    at -Os and -O2 and higher.

<SNIP>

>> Bootstrapped and reg-tested on x86_64, Ada inclusive. Build and reg-tested on
>> mips, arm, ppc and sh. No issues found. OK for stage1 trunk?
>>
>>
> Thanks for the patch.  I'll look at it during the next week.
> 

Did you get a chance to look at this?

> Right now I see that the code is based on reload which uses 
> caller-saves.c.  LRA does not use caller-saves.c at all.  Right now we 
> have LRA support only for x86/x86-64 but the next version will probably 
> have a few more targets based on LRA.  Fortunately, LRA modification 
> will be pretty easy with all this machinery.
> 

I see, thanks for noticing that. Btw I'm now working on a testsuite construct
dg-size-compare to be able to do
  dg-size-compare "text" "-fuse-caller-save" "<" "-fno-use-caller-save"
which I could have used to create a generic testcase, which would have
demonstrated that the optimization didn't work for x86_64.

I'm also currently looking at how to use the analysis in LRA.
AFAIU, in lra-constraints.c we do a backward scan over the insns, and keep track
of how many calls we've seen (calls_num), and mark insns with that number. Then
when looking at a live-range segment consisting of a def or use insn a and a
following use insn b, we can compare the number of calls seen for each insn, and
if they're not equal there is at least one call between the 2 insns, and if the
corresponding hard register is clobbered by calls, we spill after insn a and
restore before insn b.

That is too coarse-grained to use with our analysis, since we need to know which
calls occur in between insn a and insn b, and more precisely which registers
those calls clobbered.

I wonder though if we can do something similar: we keep an array
call_clobbers_num[FIRST_PSEUDO_REGISTER], initialized to 0 when we start
scanning.
When encountering a call, we increment the call_clobbers_num entries for the
hard registers clobbered by the call.
When encountering a use, we set the call_clobbers_num field of the use to
call_clobbers_num[reg_renumber[original_regno]].
And when looking at a live-range segment, we compare the call_clobbers_num
field of insn a and insn b, and if they are not equal, the hard register was
clobbered by at least one call between insn a and insn b.
Would that work? WDYT?
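
To make the idea concrete, here is a minimal stand-alone sketch of the
per-register counters.  It is a toy model, not LRA code: struct insn,
NUM_HARD_REGS, scan_backward and clobbered_between are all made-up names,
hard register sets are modelled as bit masks, and hard_regno stands in for
reg_renumber[original_regno].

```c
#include <assert.h>
#include <stdbool.h>

#define NUM_HARD_REGS 8		/* Hypothetical; stands in for the real
				   number of hard registers.  */

enum insn_kind { INSN_CALL, INSN_REF };

struct insn
{
  enum insn_kind kind;
  unsigned clobber_mask;	/* INSN_CALL: hard regs clobbered by the callee.  */
  int hard_regno;		/* INSN_REF: hard reg assigned to the pseudo.  */
  int call_clobbers_num;	/* INSN_REF: counter snapshot, set by the scan.  */
};

/* Scan backward over INSNS.  Each call bumps the counters of the hard
   registers it clobbers; each def/use records the current counter of the
   hard register it refers to.  */
static void
scan_backward (struct insn *insns, int n)
{
  int call_clobbers_num[NUM_HARD_REGS] = { 0 };

  for (int i = n - 1; i >= 0; i--)
    {
      struct insn *insn = &insns[i];

      if (insn->kind == INSN_CALL)
	{
	  for (int r = 0; r < NUM_HARD_REGS; r++)
	    if (insn->clobber_mask & (1u << r))
	      call_clobbers_num[r]++;
	}
      else
	{
	  assert (insn->hard_regno >= 0 && insn->hard_regno < NUM_HARD_REGS);
	  insn->call_clobbers_num = call_clobbers_num[insn->hard_regno];
	}
    }
}

/* A and B delimit a live-range segment of the same hard register.  The
   register was clobbered by some call in between iff the snapshots differ.  */
static bool
clobbered_between (const struct insn *a, const struct insn *b)
{
  return a->call_clobbers_num != b->call_clobbers_num;
}
```

With a call that only clobbers register 2, a segment living in register 3
across that call compares equal (no spill needed), while a segment living in
register 2 across it compares unequal.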

> I am going to use ira-improv branch for some my future work for gcc4.9.  
> And I am going to regularly (about once per month) merge trunk into it.  
> So if you want you could use the branch for your work too.  But this is 
> absolutely up to you.  I don't mind if you put this patch directly to 
> the trunk at stage1 when the review is finished.
> 

OK, I'd say stage1 then unless during review a reason pops up why it's better to
use the ira-improv branch.

Thanks,
- Tom

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [PATCH][IRA] Analysis of register usage of functions for usage by IRA.
  2013-02-07 19:12   ` Tom de Vries
@ 2013-02-13 22:35     ` Vladimir Makarov
  2013-03-14  9:35       ` Tom de Vries
  0 siblings, 1 reply; 59+ messages in thread
From: Vladimir Makarov @ 2013-02-13 22:35 UTC (permalink / raw)
  To: Tom de Vries; +Cc: gcc-patches

On 13-02-07 2:11 PM, Tom de Vries wrote:
> Vladimir,
>
> On 25/01/13 16:36, Vladimir Makarov wrote:
>> On 01/25/2013 08:05 AM, Tom de Vries wrote:
>>> Vladimir,
>>>
>>> this patch adds analysis of register usage of functions for usage by IRA.
>>>
>>> The patch:
>>> - adds analysis in pass_final to track which hard registers are set or clobbered
>>>     by the function body, and stores that information in a struct cgraph_node.
>>> - adds a target hook fn_other_hard_reg_usage to list hard registers that are
>>>     set or clobbered by a call to a function, but are not listed as such in the
>>>     function body, such as f.i. registers clobbered by veneers inserted by the
>>>     linker.
>>> - adds a reg-note REG_CALL_DECL, to be able to easily link call_insns to their
>>>     corresponding declaration, even after the calls may have been split into an
>>>     insn (set register to function address) and a call_insn (call register), which
>>>     can happen for f.i. sh, and mips with -mabi-calls.
>>> - uses the register analysis in IRA.
>>> - adds an option -fuse-caller-save to control the optimization, on by default
>>>     at -Os and -O2 and higher.
> <SNIP>
>
>>> Bootstrapped and reg-tested on x86_64, Ada inclusive. Build and reg-tested on
>>> mips, arm, ppc and sh. No issues found. OK for stage1 trunk?
>>>
>>>
>> Thanks for the patch.  I'll look at it during the next week.
>>
> Did you get a chance to look at this?
Sorry for the delay with the answer.  I was and am quite busy with other 
more urgent things.  I'll work on it when I have more free time.  In any 
case, I'll do it before stage1 to have your patch ready.
>> Right now I see that the code is based on reload which uses
>> caller-saves.c.  LRA does not use caller-saves.c at all.  Right now we
>> have LRA support only for x86/x86-64 but the next version will probably
>> have a few more targets based on LRA.  Fortunately, LRA modification
>> will be pretty easy with all this machinery.
>>
> I see, thanks for noticing that. Btw I'm now working on a testsuite construct
> dg-size-compare to be able to do
>    dg-size-compare "text" "-fuse-caller-save" "<" "-fno-use-caller-save"
> which I could have used to create a generic testcase, which would have
> demonstrated that the optimization didn't work for x86_64.
I thought about implementing your optimization for LRA by myself. But it 
is ok if you decide to work on it.  At least, I am not going to start 
this work for a month.
> I'm also currently looking at how to use the analysis in LRA.
> AFAIU, in lra-constraints.c we do a backward scan over the insns, and keep track
> of how many calls we've seen (calls_num), and mark insns with that number. Then
> when looking at a live-range segment consisting of a def or use insn a and a
> following use insn b, we can compare the number of calls seen for each insn, and
> if they're not equal there is at least one call between the 2 insns, and if the
> corresponding hard register is clobbered by calls, we spill after insn a and
> restore before insn b.
>
> That is too coarse-grained to use with our analysis, since we need to know which
> calls occur in between insn a and insn b, and more precisely which registers
> those calls clobbered.

> I wonder though if we can do something similar: we keep an array
> call_clobbers_num[FIRST_PSEUDO_REG], initialized at 0 when we start scanning.
> When encountering a call, we increase the call_clobbers_num entries for the hard
> registers clobbered by the call.
> When encountering a use, we set the call_clobbers_num field of the use to
> call_clobbers_num[reg_renumber[original_regno]].
> And when looking at a live-range segment, we compare the clobbers_num field of
> insn a and insn b, and if it is not equal, the hard register was clobbered by at
> least one call between insn a and insn b.
> Would that work? WDYT?
>
As I understand it, you looked at the live-range splitting code in
lra-constraints.c.  To get the necessary info, you should look at ira-lives.c.
>> I am going to use ira-improv branch for some my future work for gcc4.9.
>> And I am going to regularly (about once per month) merge trunk into it.
>> So if you want you could use the branch for your work too.  But this is
>> absolutely up to you.  I don't mind if you put this patch directly to
>> the trunk at stage1 when the review is finished.
>>
> OK, I'd say stage1 then unless during review a reason pops up why it's better to
> use the ira-improv branch.
>
That is ok.  Stage1 then.

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [PATCH][IRA] Analysis of register usage of functions for usage by IRA.
  2013-02-13 22:35     ` Vladimir Makarov
@ 2013-03-14  9:35       ` Tom de Vries
  2013-03-14 15:22         ` Vladimir Makarov
  2013-12-06  0:47         ` [PATCH][IRA] Analysis of register usage of functions for usage by IRA Tom de Vries
  0 siblings, 2 replies; 59+ messages in thread
From: Tom de Vries @ 2013-03-14  9:35 UTC (permalink / raw)
  To: Vladimir Makarov; +Cc: gcc-patches

On 13/02/13 23:35, Vladimir Makarov wrote:
> On 13-02-07 2:11 PM, Tom de Vries wrote:
>> Vladimir,
>>
>> On 25/01/13 16:36, Vladimir Makarov wrote:
>>> On 01/25/2013 08:05 AM, Tom de Vries wrote:
>>>> Vladimir,
>>>>
>>>> this patch adds analysis of register usage of functions for usage by IRA.
>>>>
>>>> The patch:
>>>> - adds analysis in pass_final to track which hard registers are set or clobbered
>>>>     by the function body, and stores that information in a struct cgraph_node.
>>>> - adds a target hook fn_other_hard_reg_usage to list hard registers that are
>>>>     set or clobbered by a call to a function, but are not listed as such in the
>>>>     function body, such as f.i. registers clobbered by veneers inserted by the
>>>>     linker.
>>>> - adds a reg-note REG_CALL_DECL, to be able to easily link call_insns to their
>>>>     corresponding declaration, even after the calls may have been split into an
>>>>     insn (set register to function address) and a call_insn (call register), which
>>>>     can happen for f.i. sh, and mips with -mabi-calls.
>>>> - uses the register analysis in IRA.
>>>> - adds an option -fuse-caller-save to control the optimization, on by default
>>>>     at -Os and -O2 and higher.
>> <SNIP>
>>
>>>> Bootstrapped and reg-tested on x86_64, Ada inclusive. Build and reg-tested on
>>>> mips, arm, ppc and sh. No issues found. OK for stage1 trunk?
>>>>
>>>>
>>> Thanks for the patch.  I'll look at it during the next week.
>>>
>> Did you get a chance to look at this?
> Sorry for the delay with the answer.  I was and am quite busy with other 
> more urgent things.  I'll work on it when I have more free time.  In any 
> case, I'll do it before stage1 to have your patch ready.

Vladimir,

do you have an ETA on this review?

>>> Right now I see that the code is based on reload which uses
>>> caller-saves.c.  LRA does not use caller-saves.c at all.  Right now we
>>> have LRA support only for x86/x86-64 but the next version will probably
>>> have a few more targets based on LRA.  Fortunately, LRA modification
>>> will be pretty easy with all this machinery.
>>>
>> I see, thanks for noticing that. Btw I'm now working on a testsuite construct
>> dg-size-compare to be able to do
>>    dg-size-compare "text" "-fuse-caller-save" "<" "-fno-use-caller-save"
>> which I could have used to create a generic testcase, which would have
>> demonstrated that the optimization didn't work for x86_64.
> I thought about implementing your optimization for LRA by myself. But it 
> is ok if you decide to work on it.  At least, I am not going to start 
> this work for a month.
>> I'm also currently looking at how to use the analysis in LRA.
>> AFAIU, in lra-constraints.c we do a backward scan over the insns, and keep track
>> of how many calls we've seen (calls_num), and mark insns with that number. Then
>> when looking at a live-range segment consisting of a def or use insn a and a
>> following use insn b, we can compare the number of calls seen for each insn, and
>> if they're not equal there is at least one call between the 2 insns, and if the
>> corresponding hard register is clobbered by calls, we spill after insn a and
>> restore before insn b.
>>
>> That is too coarse-grained to use with our analysis, since we need to know which
>> calls occur in between insn a and insn b, and more precisely which registers
>> those calls clobbered.
> 
>> I wonder though if we can do something similar: we keep an array
>> call_clobbers_num[FIRST_PSEUDO_REG], initialized at 0 when we start scanning.
>> When encountering a call, we increase the call_clobbers_num entries for the hard
>> registers clobbered by the call.
>> When encountering a use, we set the call_clobbers_num field of the use to
>> call_clobbers_num[reg_renumber[original_regno]].
>> And when looking at a live-range segment, we compare the clobbers_num field of
>> insn a and insn b, and if it is not equal, the hard register was clobbered by at
>> least one call between insn a and insn b.
>> Would that work? WDYT?
>>
> As I understand you looked at live-range splitting code in 
> lra-constraints.c.  To get necessary info you should look at ira-lives.c.

Unfortunately I haven't been able to find time to work further on the LRA part.
So if you're still willing to pick up that part, that would be great.

Thanks,
- Tom

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [PATCH][IRA] Analysis of register usage of functions for usage by IRA.
  2013-03-14  9:35       ` Tom de Vries
@ 2013-03-14 15:22         ` Vladimir Makarov
  2013-03-29 12:54           ` Tom de Vries
  2013-12-06  0:47         ` [PATCH][IRA] Analysis of register usage of functions for usage by IRA Tom de Vries
  1 sibling, 1 reply; 59+ messages in thread
From: Vladimir Makarov @ 2013-03-14 15:22 UTC (permalink / raw)
  To: Tom de Vries; +Cc: gcc-patches

On 03/14/2013 05:34 AM, Tom de Vries wrote:
> On 13/02/13 23:35, Vladimir Makarov wrote:
>>
>> Sorry for the delay with the answer.  I was and am quite busy with other
>> more urgent things.  I'll work on it when I have more free time.  In any
>> case, I'll do it before stage1 to have your patch ready.
> Vladimir,
>
> do you have an ETA on this review?
>
>
Actually, I am done with it.  In general, it is OK, although I have
some minor comments:

In the ChangeLog, you missed a '*' before cgraph.h:

     * haifa-sched.c (recompute_todo_spec, check_clobbered_conditions): Add
     new argument to find_all_hard_reg_sets call.
     cgraph.h (struct cgraph_node): Add function_used_regs,
     function_used_regs_initialized and function_used_regs_valid fields.


@@ -3391,6 +3394,7 @@ df_get_call_refs (struct df_collection_r
          }
      }
        else if (TEST_HARD_REG_BIT (regs_invalidated_by_call, i)

I'd remove the test of regs_invalidated_by_call.

+           && TEST_HARD_REG_BIT (fn_reg_set_usage, i)
             /* no clobbers for regs that are the result of the call */
             && !TEST_HARD_REG_BIT (defs_generated, i)

+static void
+collect_fn_hard_reg_usage (void)
+{
+  rtx insn;
+  int i;
+  struct cgraph_node *node;
+  struct hard_reg_set_container other_usage;
+
+  if (!flag_use_caller_save)
+    return;
+
+  node = cgraph_get_node (current_function_decl);
+  gcc_assert (node != NULL);
+
+  gcc_assert (!node->function_used_regs_initialized);
+  node->function_used_regs_initialized = 1;
+
+  for (insn = get_insns (); insn != NULL_RTX; insn = next_insn (insn))
+    {
+      HARD_REG_SET insn_used_regs;
+
+      if (!NONDEBUG_INSN_P (insn))
+        continue;
+
+      find_all_hard_reg_sets (insn, &insn_used_regs, false);
+
+      if (CALL_P (insn)
+          && !get_call_reg_set_usage (insn, &insn_used_regs,
+                                      call_used_reg_set))
+        {
+          CLEAR_HARD_REG_SET (node->function_used_regs);
+          return;
+        }
+

I'd put it before find_all_hard_reg_sets

+      IOR_HARD_REG_SET (node->function_used_regs, insn_used_regs);
+    }
+



But you can ignore my last two comments.
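
To illustrate the second comment, here is a stand-alone toy model of the
collection loop with the call check done first.  It is only a sketch: reg_set,
struct callee, call_reg_set_usage, collect_usage and DEFAULT_CALL_CLOBBERS are
made-up names, and the helper's contract (store the callee's set and return
true when it is known, otherwise store the default set and return false) is an
assumption based on how get_call_reg_set_usage is called in the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef unsigned reg_set;		/* Toy stand-in for HARD_REG_SET.  */

#define DEFAULT_CALL_CLOBBERS 0x0fu	/* Hypothetical ABI call-clobbered set.  */

struct callee
{
  bool usage_valid;		/* Has the callee already been analyzed?  */
  reg_set used_regs;		/* Regs it sets or clobbers, if so.  */
};

struct insn
{
  reg_set sets;			/* Regs set or clobbered by the insn itself.  */
  bool is_call;
  const struct callee *callee;	/* NULL if the call target is unknown.  */
};

/* Assumed contract: precise set and true if known, DFLT and false if not.  */
static bool
call_reg_set_usage (const struct insn *call, reg_set *out, reg_set dflt)
{
  if (call->callee != NULL && call->callee->usage_valid)
    {
      *out = call->callee->used_regs;
      return true;
    }
  *out = dflt;
  return false;
}

/* Accumulate the regs used by INSNS into *RESULT.  Return false, with
   *RESULT cleared, as soon as a call without precise info is seen.  */
static bool
collect_usage (const struct insn *insns, size_t n, reg_set *result)
{
  reg_set used = 0;

  assert (result != NULL);
  for (size_t i = 0; i < n; i++)
    {
      reg_set call_regs = 0;

      /* Do the bail-out test before any per-insn scanning work.  */
      if (insns[i].is_call
	  && !call_reg_set_usage (&insns[i], &call_regs,
				  DEFAULT_CALL_CLOBBERS))
	{
	  *result = 0;
	  return false;
	}

      used |= insns[i].sets | call_regs;
    }

  *result = used;
  return true;
}
```

The point of the reordering in this model is just that the scan of the insn's
own sets is skipped when the function is going to give up anyway.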

The patch is OK with me for trunk at stage1.  But I think you need
formal approval for df-scan.c, arm.c, mips.c, and the GCC testsuite expect
files (lib/target-supports.exp and gcc.target/mips/mips.exp), as I am not a
maintainer of these parts, although these changes look OK to me.

Thanks for your hard work and sorry for the review delay.

I guess you need to pay attention to reported problems for some time 
after you commit the patch as it affects all targets.

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [PATCH][IRA] Analysis of register usage of functions for usage by IRA.
  2013-03-14 15:22         ` Vladimir Makarov
@ 2013-03-29 12:54           ` Tom de Vries
  2013-03-29 13:06             ` [PATCH][06/10] -fuse-caller-save - Collect register usage information Tom de Vries
                               ` (20 more replies)
  0 siblings, 21 replies; 59+ messages in thread
From: Tom de Vries @ 2013-03-29 12:54 UTC (permalink / raw)
  To: Vladimir Makarov; +Cc: gcc-patches

On 14/03/13 16:11, Vladimir Makarov wrote:
> On 03/14/2013 05:34 AM, Tom de Vries wrote:
>> On 13/02/13 23:35, Vladimir Makarov wrote:
>>
> Actually, I am done with it.  In general, it is ok.  Although I have 
> some minors comments:
> 

Vladimir,

Thanks for the review.

I split the patch up into 10 patches, to facilitate further review:
...
0001-Add-command-line-option.patch
0002-Add-new-reg-note-REG_CALL_DECL.patch
0003-Add-implicit-parameter-to-find_all_hard_reg_sets.patch
0004-Add-TARGET_FN_OTHER_HARD_REG_USAGE-hook.patch
0005-Implement-TARGET_FN_OTHER_HARD_REG_USAGE-hook-for-ARM.patch
0006-Collect-register-usage-information.patch
0007-Use-collected-register-usage-information.patch
0008-Enable-by-default-at-O2-and-higher.patch
0009-Add-documentation.patch
0010-Add-test-case.patch
...
I'll post these in reply to this email.

> In the ChangeLog, you missed '*' before cgraph.h:
> 
>      * haifa-sched.c (recompute_todo_spec, check_clobbered_conditions): Add
>      new argument to find_all_hard_reg_sets call.
>      cgraph.h (struct cgraph_node): Add function_used_regs,
>      function_used_regs_initialized and function_used_regs_valid fields.
> 

Fixed (in the log of 0006-Collect-register-usage-information.patch).

> 
> @@ -3391,6 +3394,7 @@ df_get_call_refs (struct df_collection_r
>           }
>       }
>         else if (TEST_HARD_REG_BIT (regs_invalidated_by_call, i)
> 
> I'd remove the test of regs_invalidated_by_call.
> 
> +           && TEST_HARD_REG_BIT (fn_reg_set_usage, i)
>              /* no clobbers for regs that are the result of the call */
>              && !TEST_HARD_REG_BIT (defs_generated, i)
> 

Fixed (in 0007-Use-collected-register-usage-information.patch).

> +static void
> +collect_fn_hard_reg_usage (void)
> +{
> +  rtx insn;
> +  int i;
> +  struct cgraph_node *node;
> +  struct hard_reg_set_container other_usage;
> +
> +  if (!flag_use_caller_save)
> +    return;
> +
> +  node = cgraph_get_node (current_function_decl);
> +  gcc_assert (node != NULL);
> +
> +  gcc_assert (!node->function_used_regs_initialized);
> +  node->function_used_regs_initialized = 1;
> +
> +  for (insn = get_insns (); insn != NULL_RTX; insn = next_insn (insn))
> +    {
> +      HARD_REG_SET insn_used_regs;
> +
> +      if (!NONDEBUG_INSN_P (insn))
> +    continue;
> +
> +      find_all_hard_reg_sets (insn, &insn_used_regs, false);
> +
> +      if (CALL_P (insn)
> +      && !get_call_reg_set_usage (insn, &insn_used_regs, call_used_reg_set))
> +    {
> +      CLEAR_HARD_REG_SET (node->function_used_regs);
> +      return;
> +    }
> +
> 
> I'd put it before find_all_hard_reg_sets
> 
> +      IOR_HARD_REG_SET (node->function_used_regs, insn_used_regs);
> +    }
> +
> 
> 

insn_used_regs is set by both find_all_hard_reg_sets, and by
get_call_reg_set_usage. If we move the IOR to before find_all_hard_reg_sets,
we're using an undefined value.
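
To illustrate the point about the undefined value, here is a minimal sketch in
plain C.  It assumes a 32-bit mask as a stand-in for HARD_REG_SET;
toy_write_sets and toy_fold_insn are hypothetical names, not functions from the
patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint32_t toy_reg_set;	/* Stand-in for HARD_REG_SET.  */

/* Stand-in for a writer such as find_all_hard_reg_sets: it overwrites
   *SET, so *SET holds no defined value before this call.  */
static void
toy_write_sets (toy_reg_set insn_sets, toy_reg_set *set)
{
  assert (set != NULL);
  *set = insn_sets;
}

/* Fold one insn into ACC.  The OR has to follow the writer; placed
   before it, the OR would read the uninitialized local.  */
static toy_reg_set
toy_fold_insn (toy_reg_set acc, toy_reg_set insn_sets)
{
  toy_reg_set insn_used;	/* Indeterminate at this point.  */

  toy_write_sets (insn_sets, &insn_used);
  return acc | insn_used;	/* Defined only after the writer ran.  */
}
```

The same reasoning applies to any second writer of the per-insn set: the
accumulation step comes last.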

> 
> But you can ignore my last 2 comments.
> 
> The patch is ok for me for trunk at stage1.  But I think you need a 
> formal approval for df-scan.c, arm.c, mips.c, GCC testsuite expect files 
> (lib/target-supports.exp and gcc.target/mips/mips.exp) as I am not a 
> maintainer of these parts although these changes look ok for me.
> 

I'm assuming you've ok'ed patches 1, 2, 3, 4, 6, 8, 9 and the non-df-scan part of 7.

I'll ask other maintainers about the other parts (5, 10 and the df-scan part of 7).

Thanks,
- Tom

^ permalink raw reply	[flat|nested] 59+ messages in thread

* [PATCH][06/10] -fuse-caller-save - Collect register usage information
  2013-03-29 12:54           ` Tom de Vries
@ 2013-03-29 13:06             ` Tom de Vries
  2013-03-29 13:06             ` [PATCH][09/10] -fuse-caller-save - Add documentation Tom de Vries
                               ` (19 subsequent siblings)
  20 siblings, 0 replies; 59+ messages in thread
From: Tom de Vries @ 2013-03-29 13:06 UTC (permalink / raw)
  To: Vladimir Makarov; +Cc: gcc-patches

[-- Attachment #1: Type: text/plain, Size: 701 bytes --]

Vladimir,

This patch adds analysis in pass_final to track which hard registers are set or
clobbered by the function body, and stores that information in a
struct cgraph_node.
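
As a rough illustration of the collection step, here is a self-contained sketch
in plain C.  It is a simplified model, not the code in the attachment: a 32-bit
mask stands in for HARD_REG_SET, and struct toy_node, struct toy_insn and
toy_collect are hypothetical names.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef uint32_t toy_reg_set;	/* Stand-in for HARD_REG_SET.  */

struct toy_node			/* Stand-in for the new cgraph_node fields.  */
{
  toy_reg_set used_regs;
  bool initialized;
  bool valid;
};

struct toy_insn
{
  toy_reg_set sets;		/* Registers written by the insn itself.  */
  const struct toy_node *callee;	/* Summary of the callee, if known.  */
  bool is_call;
};

/* Summarize the registers written by a body of N insns into NODE.
   CALL_USED is the call-clobbered set, FIXED the fixed registers.
   NODE->valid stays false if any callee lacks a valid summary.  */
static void
toy_collect (struct toy_node *node, const struct toy_insn *insns, size_t n,
	     toy_reg_set call_used, toy_reg_set fixed)
{
  assert (!node->initialized);
  node->initialized = true;
  node->valid = false;
  node->used_regs = 0;

  for (size_t i = 0; i < n; i++)
    {
      toy_reg_set insn_used = insns[i].sets;

      if (insns[i].is_call)
	{
	  const struct toy_node *callee = insns[i].callee;

	  if (callee == NULL || !callee->valid)
	    {
	      /* Unknown callee: give up and leave the summary invalid.  */
	      node->used_regs = 0;
	      return;
	    }
	  /* Known callee: its summary, limited to call-clobbered regs.  */
	  insn_used = callee->used_regs & call_used;
	}
      node->used_regs |= insn_used;
    }

  node->used_regs |= fixed;	/* Be conservative about fixed registers.  */
  node->valid = true;
}
```

A caller would process functions callee-first, so that a callee's summary is
already valid when its callers are scanned.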



Thanks,
  -Tom

2013-03-29  Radovan Obradovic  <robradovic@mips.com>
            Tom de Vries  <tom@codesourcery.com>

	* cgraph.h (struct cgraph_node): Add function_used_regs,
	function_used_regs_initialized and function_used_regs_valid fields.
	* final.c: Move include of hard-reg-set.h to before rtl.h to declare
	find_all_hard_reg_sets.
	(collect_fn_hard_reg_usage, get_call_fndecl, get_call_cgraph_node)
	(get_call_reg_set_usage): New function.
	(rest_of_handle_final): Use collect_fn_hard_reg_usage.

[-- Attachment #2: 0006-Collect-register-usage-information.patch --]
[-- Type: text/x-patch, Size: 5486 bytes --]

diff --git a/gcc/cgraph.h b/gcc/cgraph.h
index 8ab7ae1..2132d91 100644
--- a/gcc/cgraph.h
+++ b/gcc/cgraph.h
@@ -251,6 +251,15 @@ struct GTY(()) cgraph_node {
   /* Unique id of the node.  */
   int uid;
 
+  /* Call unsaved hard registers really used by the corresponding
+     function (including ones used by functions called by the
+     function).  */
+  HARD_REG_SET function_used_regs;
+  /* Set if function_used_regs is initialized.  */
+  unsigned function_used_regs_initialized: 1;
+  /* Set if function_used_regs is valid.  */
+  unsigned function_used_regs_valid: 1;
+
   /* Set when decl is an abstract function pointed to by the
      ABSTRACT_DECL_ORIGIN of a reachable function.  */
   unsigned abstract_and_needed : 1;
diff --git a/gcc/final.c b/gcc/final.c
index d25b8e0..4e0fd01 100644
--- a/gcc/final.c
+++ b/gcc/final.c
@@ -48,6 +48,7 @@ along with GCC; see the file COPYING3.  If not see
 #include "tm.h"
 
 #include "tree.h"
+#include "hard-reg-set.h"
 #include "rtl.h"
 #include "tm_p.h"
 #include "regs.h"
@@ -56,7 +57,6 @@ along with GCC; see the file COPYING3.  If not see
 #include "recog.h"
 #include "conditions.h"
 #include "flags.h"
-#include "hard-reg-set.h"
 #include "output.h"
 #include "except.h"
 #include "function.h"
@@ -222,6 +222,7 @@ static int alter_cond (rtx);
 static int final_addr_vec_align (rtx);
 #endif
 static int align_fuzz (rtx, rtx, int, unsigned);
+static void collect_fn_hard_reg_usage (void);
 \f
 /* Initialize data in final at the beginning of a compilation.  */
 
@@ -4328,6 +4329,8 @@ rest_of_handle_final (void)
   rtx x;
   const char *fnname;
 
+  collect_fn_hard_reg_usage ();
+
   /* Get the function's name, as described by its RTL.  This may be
      different from the DECL_NAME name used in the source file.  */
 
@@ -4584,3 +4587,121 @@ struct rtl_opt_pass pass_clean_state =
   0                                     /* todo_flags_finish */
  }
 };
+
+/* Collect hard register usage for the current function.  */
+
+static void
+collect_fn_hard_reg_usage (void)
+{
+  rtx insn;
+  int i;
+  struct cgraph_node *node;
+  struct hard_reg_set_container other_usage;
+
+  if (!flag_use_caller_save)
+    return;
+
+  node = cgraph_get_node (current_function_decl);
+  gcc_assert (node != NULL);
+
+  gcc_assert (!node->function_used_regs_initialized);
+  node->function_used_regs_initialized = 1;
+
+  for (insn = get_insns (); insn != NULL_RTX; insn = next_insn (insn))
+    {
+      HARD_REG_SET insn_used_regs;
+
+      if (!NONDEBUG_INSN_P (insn))
+	continue;
+
+      find_all_hard_reg_sets (insn, &insn_used_regs, false);
+
+      if (CALL_P (insn)
+	  && !get_call_reg_set_usage (insn, &insn_used_regs, call_used_reg_set))
+	{
+	  CLEAR_HARD_REG_SET (node->function_used_regs);
+	  return;
+	}
+
+      IOR_HARD_REG_SET (node->function_used_regs, insn_used_regs);
+    }
+
+  /* Be conservative - mark fixed and global registers as used.  */
+  IOR_HARD_REG_SET (node->function_used_regs, fixed_reg_set);
+  for (i = 0; i < FIRST_PSEUDO_REGISTER; i++)
+    if (global_regs[i])
+      SET_HARD_REG_BIT (node->function_used_regs, i);
+
+#ifdef STACK_REGS
+  /* Handle STACK_REGS conservatively, since the df-framework does not
+     provide accurate information for them.  */
+
+  for (i = FIRST_STACK_REG; i <= LAST_STACK_REG; i++)
+    SET_HARD_REG_BIT (node->function_used_regs, i);
+#endif
+
+  CLEAR_HARD_REG_SET (other_usage.set);
+  targetm.fn_other_hard_reg_usage (&other_usage);
+  IOR_HARD_REG_SET (node->function_used_regs, other_usage.set);
+
+  node->function_used_regs_valid = 1;
+}
+
+/* Get the declaration of the function called by INSN.  */
+
+static tree
+get_call_fndecl (rtx insn)
+{
+  rtx note, datum;
+
+  if (!flag_use_caller_save)
+    return NULL_TREE;
+
+  note = find_reg_note (insn, REG_CALL_DECL, NULL_RTX);
+  if (note == NULL_RTX)
+    return NULL_TREE;
+
+  datum = XEXP (note, 0);
+  if (datum != NULL_RTX)
+    return SYMBOL_REF_DECL (datum);
+
+  return NULL_TREE;
+}
+
+static struct cgraph_node *
+get_call_cgraph_node (rtx insn)
+{
+  tree fndecl;
+
+  if (insn == NULL_RTX)
+    return NULL;
+
+  fndecl = get_call_fndecl (insn);
+  if (fndecl == NULL_TREE
+      || !targetm.binds_local_p (fndecl))
+    return NULL;
+
+  return cgraph_get_node (fndecl);
+}
+
+/* Find hard registers used by function call instruction INSN, and return them
+   in REG_SET.  Return DEFAULT_SET in REG_SET if not found.  */
+
+bool
+get_call_reg_set_usage (rtx insn, HARD_REG_SET *reg_set,
+			HARD_REG_SET default_set)
+{
+  struct cgraph_node *node = get_call_cgraph_node (insn);
+  if (node != NULL
+      && node->function_used_regs_valid)
+    {
+      COPY_HARD_REG_SET (*reg_set, node->function_used_regs);
+      AND_HARD_REG_SET (*reg_set, default_set);
+      return true;
+    }
+  else
+    {
+      COPY_HARD_REG_SET (*reg_set, default_set);
+      return false;
+    }
+}
diff --git a/gcc/regs.h b/gcc/regs.h
index 090d6b6..ec71ad4 100644
--- a/gcc/regs.h
+++ b/gcc/regs.h
@@ -421,4 +421,8 @@ range_in_hard_reg_set_p (const HARD_REG_SET set, unsigned regno, int nregs)
   return true;
 }
 
+/* Get registers used by given function call instruction.  */
+extern bool get_call_reg_set_usage (rtx insn, HARD_REG_SET *reg_set,
+				    HARD_REG_SET default_set);
+
 #endif /* GCC_REGS_H */

^ permalink raw reply	[flat|nested] 59+ messages in thread

* [PATCH][05/10] -fuse-caller-save - Implement TARGET_FN_OTHER_HARD_REG_USAGE hook for ARM
  2013-03-29 12:54           ` Tom de Vries
                               ` (3 preceding siblings ...)
  2013-03-29 13:06             ` [PATCH][01/10] -fuse-caller-save - Add command line option Tom de Vries
@ 2013-03-29 13:06             ` Tom de Vries
  2013-03-29 13:06             ` [PATCH][04/10] -fuse-caller-save - Add TARGET_FN_OTHER_HARD_REG_USAGE hook Tom de Vries
                               ` (15 subsequent siblings)
  20 siblings, 0 replies; 59+ messages in thread
From: Tom de Vries @ 2013-03-29 13:06 UTC (permalink / raw)
  To: Richard Earnshaw; +Cc: gcc-patches

[-- Attachment #1: Type: text/plain, Size: 692 bytes --]

Richard,

This patch series adds analysis of register usage of functions for usage by IRA.
The original post is here
( http://gcc.gnu.org/ml/gcc-patches/2013-01/msg01234.html ).

This patch implements the target hook TARGET_FN_OTHER_HARD_REG_USAGE for ARM.
The target hook TARGET_FN_OTHER_HARD_REG_USAGE was introduced in the previous
patch in this patch series.

Built and reg-tested on ARM.

OK for trunk?

Thanks,
  -Tom

2013-03-29  Radovan Obradovic  <robradovic@mips.com>
            Tom de Vries  <tom@codesourcery.com>

	* config/arm/arm.c (TARGET_FN_OTHER_HARD_REG_USAGE): Redefine as
	arm_fn_other_hard_reg_usage.
	(arm_fn_other_hard_reg_usage): New function.

[-- Attachment #2: 0005-Implement-TARGET_FN_OTHER_HARD_REG_USAGE-hook-for-ARM.patch --]
[-- Type: text/x-patch, Size: 1439 bytes --]

diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c
index 5f63a2e..341fa86 100644
--- a/gcc/config/arm/arm.c
+++ b/gcc/config/arm/arm.c
@@ -280,6 +280,7 @@ static unsigned arm_add_stmt_cost (void *data, int count,
 
 static void arm_canonicalize_comparison (int *code, rtx *op0, rtx *op1,
 					 bool op0_preserve_value);
+static void arm_fn_other_hard_reg_usage (struct hard_reg_set_container *);
 \f
 /* Table of machine attributes.  */
 static const struct attribute_spec arm_attribute_table[] =
@@ -649,6 +650,10 @@ static const struct attribute_spec arm_attribute_table[] =
 #define TARGET_CANONICALIZE_COMPARISON \
   arm_canonicalize_comparison
 
+#undef TARGET_FN_OTHER_HARD_REG_USAGE
+#define TARGET_FN_OTHER_HARD_REG_USAGE \
+  arm_fn_other_hard_reg_usage
+
 struct gcc_target targetm = TARGET_INITIALIZER;
 \f
 /* Obstack for minipool constant handling.  */
@@ -3762,6 +3767,19 @@ arm_canonicalize_comparison (int *code, rtx *op0, rtx *op1,
     }
 }
 
+/* Implement TARGET_FN_OTHER_HARD_REG_USAGE.  */
+
+static void
+arm_fn_other_hard_reg_usage (struct hard_reg_set_container *regs)
+{
+  if (TARGET_AAPCS_BASED)
+    {
+      /* For AAPCS, IP and CC can be clobbered by veneers inserted by the
+	 linker.  */
+      SET_HARD_REG_BIT (regs->set, IP_REGNUM);
+      SET_HARD_REG_BIT (regs->set, CC_REGNUM);
+    }
+}
 
 /* Define how to find the value returned by a function.  */
 

^ permalink raw reply	[flat|nested] 59+ messages in thread

* [PATCH][08/10] -fuse-caller-save - Enable by default at O2 and higher
  2013-03-29 12:54           ` Tom de Vries
                               ` (5 preceding siblings ...)
  2013-03-29 13:06             ` [PATCH][04/10] -fuse-caller-save - Add TARGET_FN_OTHER_HARD_REG_USAGE hook Tom de Vries
@ 2013-03-29 13:06             ` Tom de Vries
  2013-03-29 13:06             ` [PATCH][10/10] -fuse-caller-save - Add test-case Tom de Vries
                               ` (13 subsequent siblings)
  20 siblings, 0 replies; 59+ messages in thread
From: Tom de Vries @ 2013-03-29 13:06 UTC (permalink / raw)
  To: Vladimir Makarov; +Cc: gcc-patches

[-- Attachment #1: Type: text/plain, Size: 299 bytes --]

Vladimir,

This patch enables the -fuse-caller-save optimization by default at -O2 and
higher.

Thanks,
  -Tom

2013-03-29  Radovan Obradovic  <robradovic@mips.com>
            Tom de Vries  <tom@codesourcery.com>

	* opts.c (default_options_table): Add OPT_LEVELS_2_PLUS entry with
	OPT_fuse_caller_save.

[-- Attachment #2: 0008-Enable-by-default-at-O2-and-higher.patch --]
[-- Type: text/x-patch, Size: 552 bytes --]

diff --git a/gcc/opts.c b/gcc/opts.c
index 45b12fe..52a42b9 100644
--- a/gcc/opts.c
+++ b/gcc/opts.c
@@ -486,6 +486,7 @@ static const struct default_options default_options_table[] =
     { OPT_LEVELS_2_PLUS, OPT_ftree_tail_merge, NULL, 1 },
     { OPT_LEVELS_2_PLUS_SPEED_ONLY, OPT_foptimize_strlen, NULL, 1 },
     { OPT_LEVELS_2_PLUS, OPT_fhoist_adjacent_loads, NULL, 1 },
+    { OPT_LEVELS_2_PLUS, OPT_fuse_caller_save, NULL, 1 },
 
     /* -O3 optimizations.  */
     { OPT_LEVELS_3_PLUS, OPT_ftree_loop_distribute_patterns, NULL, 1 },

^ permalink raw reply	[flat|nested] 59+ messages in thread

* [PATCH][10/10] -fuse-caller-save - Add test-case
  2013-03-29 12:54           ` Tom de Vries
                               ` (6 preceding siblings ...)
  2013-03-29 13:06             ` [PATCH][08/10] -fuse-caller-save - Enable by default at O2 and higher Tom de Vries
@ 2013-03-29 13:06             ` Tom de Vries
  2013-03-29 13:06             ` [PATCH][03/10] -fuse-caller-save - Add implicit parameter to find_all_hard_reg_sets Tom de Vries
                               ` (12 subsequent siblings)
  20 siblings, 0 replies; 59+ messages in thread
From: Tom de Vries @ 2013-03-29 13:06 UTC (permalink / raw)
  To: Richard Sandiford; +Cc: gcc-patches

[-- Attachment #1: Type: text/plain, Size: 786 bytes --]

Richard,

This patch series adds analysis of register usage of functions for usage by IRA.
The original post is here
( http://gcc.gnu.org/ml/gcc-patches/2013-01/msg01234.html ).

This patch adds a test-case for -fuse-caller-save.  Since the test-case has
different output for mips16 and micromips, new effective targets are introduced.

Built and reg-tested on mips.

OK for trunk?

Thanks,
  -Tom

2013-03-29  Radovan Obradovic  <robradovic@mips.com>
            Tom de Vries  <tom@codesourcery.com>

	* lib/target-supports.exp (check_effective_target_mips16)
	(check_effective_target_micromips): New proc.
	* gcc.target/mips/mips.exp: Add use-caller-save to -ffoo/-fno-foo
	options.  Add -save-temps to mips_option_groups.
	* gcc.target/mips/aru-1.c: New test.

[-- Attachment #2: 0010-Add-test-case.patch --]
[-- Type: text/x-patch, Size: 2965 bytes --]

diff --git a/gcc/testsuite/gcc.target/mips/aru-1.c b/gcc/testsuite/gcc.target/mips/aru-1.c
new file mode 100644
index 0000000..71515a9
--- /dev/null
+++ b/gcc/testsuite/gcc.target/mips/aru-1.c
@@ -0,0 +1,38 @@
+/* { dg-do run } */
+/* { dg-options "-fuse-caller-save -save-temps" } */
+/* { dg-skip-if "" { *-*-* }  { "*" } { "-Os" } } */
+/* Testing -fuse-caller-save optimization option.  */
+
+static int __attribute__((noinline))
+bar (int x)
+{
+  return x + 3;
+}
+
+int __attribute__((noinline))
+foo (int y)
+{
+  return y + bar (y);
+}
+
+int
+main (void)
+{
+  return !(foo (5) == 13);
+}
+
+/* Check that there are only 2 stack-saves: r31 in main and foo.  */
+
+/* Variant not mips16.  Check that there only 2 sw/sd.  */
+/* { dg-final { scan-assembler-times "(?n)s\[wd\]\t\\\$.*,.*\\(\\\$sp\\)" 2 { target { ! mips16 } } } } */
+
+/* Variant not mips16, Subvariant micromips.  Additionally check there's no
+   swm.  */
+/* { dg-final { scan-assembler-times "(?n)swm\t\\\$.*,.*\\(\\\$sp\\)" 0 {target micromips } } } */
+
+/* Variant mips16.  The save can save 1 or more registers, check that only 1 is
+   saved, twice in total.  */
+/* { dg-final { scan-assembler-times "(?n)save\t\[0-9\]*,\\\$\[^,\]*\$" 2 { target mips16 } } } */
+
+/* Check that the first caller-save register is unused.  */
+/* { dg-final { scan-assembler-not "(\\\$16)" } } */
diff --git a/gcc/testsuite/gcc.target/mips/mips.exp b/gcc/testsuite/gcc.target/mips/mips.exp
index 15b1386..63570bd 100644
--- a/gcc/testsuite/gcc.target/mips/mips.exp
+++ b/gcc/testsuite/gcc.target/mips/mips.exp
@@ -246,6 +246,7 @@ set mips_option_groups {
     small-data "-G[0-9]+"
     warnings "-w"
     dump "-fdump-.*"
+    save_temps "-save-temps"
 }
 
 # Add -mfoo/-mno-foo options to mips_option_groups.
@@ -302,6 +303,7 @@ foreach option {
     tree-vectorize
     unroll-all-loops
     unroll-loops
+    use-caller-save
 } {
     lappend mips_option_groups $option "-f(no-|)$option"
 }
diff --git a/gcc/testsuite/lib/target-supports.exp b/gcc/testsuite/lib/target-supports.exp
index a146f17..dbd0037 100644
--- a/gcc/testsuite/lib/target-supports.exp
+++ b/gcc/testsuite/lib/target-supports.exp
@@ -918,6 +918,26 @@ proc check_effective_target_mips16_attribute { } {
     } [add_options_for_mips16_attribute ""]]
 }
 
+# Return 1 if the target generates mips16 code by default.
+
+proc check_effective_target_mips16 { } {
+    return [check_no_compiler_messages mips16 assembly {
+	#if !(defined __mips16)
+	#error FOO
+	#endif
+    } ""]
+}
+
+# Return 1 if the target generates micromips code by default.
+
+proc check_effective_target_micromips { } {
+    return [check_no_compiler_messages micromips assembly {
+	#if !(defined __mips_micromips)
+	#error FOO
+	#endif
+    } ""]
+}
+
 # Return 1 if the target supports long double larger than double when
 # using the new ABI, 0 otherwise.
 

^ permalink raw reply	[flat|nested] 59+ messages in thread

* [PATCH][03/10] -fuse-caller-save - Add implicit parameter to find_all_hard_reg_sets
  2013-03-29 12:54           ` Tom de Vries
                               ` (7 preceding siblings ...)
  2013-03-29 13:06             ` [PATCH][10/10] -fuse-caller-save - Add test-case Tom de Vries
@ 2013-03-29 13:06             ` Tom de Vries
  2013-03-29 13:06             ` [PATCH][07/10] -fuse-caller-save - Use collected register usage information Tom de Vries
                               ` (11 subsequent siblings)
  20 siblings, 0 replies; 59+ messages in thread
From: Tom de Vries @ 2013-03-29 13:06 UTC (permalink / raw)
  To: Vladimir Makarov; +Cc: gcc-patches

[-- Attachment #1: Type: text/plain, Size: 463 bytes --]

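Vladimir,

This patch adds a bool parameter, implicit, to find_all_hard_reg_sets.  The
parameter controls whether call_used_reg_set is added to the result for call
insns; the existing callers in haifa-sched.c pass true to keep their behavior.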
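
The effect of the new parameter can be sketched in plain C as follows.  This is
a toy model only: a 32-bit mask stands in for HARD_REG_SET, and toy_find_sets is
a hypothetical name, not the function in rtlanal.c.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

typedef uint32_t toy_reg_set;	/* Stand-in for HARD_REG_SET.  */

/* Return the registers written by an insn.  PATTERN_SETS are the
   registers set explicitly by its pattern.  When IMPLICIT is true and
   the insn is a call, the call-clobbered set CALL_USED is added too.  */
static toy_reg_set
toy_find_sets (toy_reg_set pattern_sets, bool is_call, bool implicit,
	       toy_reg_set call_used)
{
  toy_reg_set set = pattern_sets;

  if (implicit && is_call)
    set |= call_used;

  assert ((set & pattern_sets) == pattern_sets);
  return set;
}
```

Passing false leaves only the explicit sets, which lets a caller substitute a
more precise per-callee set for calls.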



Thanks,
  -Tom

2013-03-29  Radovan Obradovic  <robradovic@mips.com>
            Tom de Vries  <tom@codesourcery.com>

	* rtlanal.c (find_all_hard_reg_sets): Add bool implicit parameter and
	handle.
	* rtl.h (find_all_hard_reg_sets): Add bool parameter.
	* haifa-sched.c (recompute_todo_spec, check_clobbered_conditions): Add
	new argument to find_all_hard_reg_sets call.

[-- Attachment #2: 0003-Add-implicit-parameter-to-find_all_hard_reg_sets.patch --]
[-- Type: text/x-patch, Size: 2126 bytes --]

diff --git a/gcc/haifa-sched.c b/gcc/haifa-sched.c
index c4591bfe..fe24d43 100644
--- a/gcc/haifa-sched.c
+++ b/gcc/haifa-sched.c
@@ -1271,7 +1271,7 @@ recompute_todo_spec (rtx next, bool for_backtrack)
 	  {
 	    HARD_REG_SET t;
 
-	    find_all_hard_reg_sets (prev, &t);
+	    find_all_hard_reg_sets (prev, &t, true);
 	    if (TEST_HARD_REG_BIT (t, regno))
 	      return HARD_DEP;
 	    if (prev == pro)
@@ -3041,7 +3041,7 @@ check_clobbered_conditions (rtx insn)
   if ((current_sched_info->flags & DO_PREDICATION) == 0)
     return;
 
-  find_all_hard_reg_sets (insn, &t);
+  find_all_hard_reg_sets (insn, &t, true);
 
  restart:
   for (i = 0; i < ready.n_ready; i++)
diff --git a/gcc/rtl.h b/gcc/rtl.h
index b9defcc..6486f20 100644
--- a/gcc/rtl.h
+++ b/gcc/rtl.h
@@ -2038,7 +2038,7 @@ extern const_rtx set_of (const_rtx, const_rtx);
 extern void record_hard_reg_sets (rtx, const_rtx, void *);
 extern void record_hard_reg_uses (rtx *, void *);
 #ifdef HARD_CONST
-extern void find_all_hard_reg_sets (const_rtx, HARD_REG_SET *);
+extern void find_all_hard_reg_sets (const_rtx, HARD_REG_SET *, bool);
 #endif
 extern void note_stores (const_rtx, void (*) (rtx, const_rtx, void *), void *);
 extern void note_uses (rtx *, void (*) (rtx *, void *), void *);
diff --git a/gcc/rtlanal.c b/gcc/rtlanal.c
index b198685..27c1974 100644
--- a/gcc/rtlanal.c
+++ b/gcc/rtlanal.c
@@ -1028,13 +1028,13 @@ record_hard_reg_sets (rtx x, const_rtx pat ATTRIBUTE_UNUSED, void *data)
 /* Examine INSN, and compute the set of hard registers written by it.
    Store it in *PSET.  Should only be called after reload.  */
 void
-find_all_hard_reg_sets (const_rtx insn, HARD_REG_SET *pset)
+find_all_hard_reg_sets (const_rtx insn, HARD_REG_SET *pset, bool implicit)
 {
   rtx link;
 
   CLEAR_HARD_REG_SET (*pset);
   note_stores (PATTERN (insn), record_hard_reg_sets, pset);
-  if (CALL_P (insn))
+  if (implicit && CALL_P (insn))
     IOR_HARD_REG_SET (*pset, call_used_reg_set);
   for (link = REG_NOTES (insn); link; link = XEXP (link, 1))
     if (REG_NOTE_KIND (link) == REG_INC)

^ permalink raw reply	[flat|nested] 59+ messages in thread

* [PATCH][04/10] -fuse-caller-save - Add TARGET_FN_OTHER_HARD_REG_USAGE hook
  2013-03-29 12:54           ` Tom de Vries
                               ` (4 preceding siblings ...)
  2013-03-29 13:06             ` [PATCH][05/10] -fuse-caller-save - Implement TARGET_FN_OTHER_HARD_REG_USAGE hook for ARM Tom de Vries
@ 2013-03-29 13:06             ` Tom de Vries
  2013-03-29 13:06             ` [PATCH][08/10] -fuse-caller-save - Enable by default at O2 and higher Tom de Vries
                               ` (14 subsequent siblings)
  20 siblings, 0 replies; 59+ messages in thread
From: Tom de Vries @ 2013-03-29 13:06 UTC (permalink / raw)
  To: Vladimir Makarov; +Cc: gcc-patches

[-- Attachment #1: Type: text/plain, Size: 805 bytes --]

Vladimir,

This patch adds a TARGET_FN_OTHER_HARD_REG_USAGE hook.  The hook is used to
list hard registers that are set or clobbered by a call to a function, but are
not listed as such in the function body, for instance registers clobbered by
veneers inserted by the linker.
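
The shape of such a hook can be sketched in plain C as follows.  This is an
illustration under assumptions, not the target.def machinery: the container
holds a 32-bit mask, the register numbers are arbitrary placeholders, and all
toy_* names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in for struct hard_reg_set_container.  */
struct toy_reg_set_container
{
  uint32_t set;
};

typedef void (*toy_other_usage_hook) (struct toy_reg_set_container *);

/* Placeholder register numbers, chosen only for the example.  */
enum { TOY_SCRATCH_REGNUM = 12, TOY_FLAGS_REGNUM = 24 };

/* Default hook: nothing beyond what the function body shows.  */
static void
toy_hook_default (struct toy_reg_set_container *regs)
{
  (void) regs;
}

/* Target hook for a port where linker-inserted veneers may clobber a
   scratch register and the condition flags.  */
static void
toy_hook_veneer (struct toy_reg_set_container *regs)
{
  assert (regs != NULL);
  regs->set |= UINT32_C (1) << TOY_SCRATCH_REGNUM;
  regs->set |= UINT32_C (1) << TOY_FLAGS_REGNUM;
}

/* Fold the hook-reported registers into a function's summary.  */
static uint32_t
toy_add_other_usage (uint32_t fn_used, toy_other_usage_hook hook)
{
  struct toy_reg_set_container other = { 0 };

  hook (&other);
  return fn_used | other.set;
}
```

The hook only ever adds registers, so a port that does not define it simply
keeps the summary computed from the function body.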



Thanks,
  -Tom

2013-03-29  Radovan Obradovic  <robradovic@mips.com>
            Tom de Vries  <tom@codesourcery.com>

	* hooks.c (hook_void_hard_reg_set_containerp): New function.
	* hooks.h (hook_void_hard_reg_set_containerp): Declare.
	* target.def (fn_other_hard_reg_usage): New DEFHOOK.
	* doc/tm.texi.in (@node Stack and Calling): Add Miscellaneous Register
	Hooks to @menu.
	(@node Miscellaneous Register Hooks): New node.
	(@hook TARGET_FN_OTHER_HARD_REG_USAGE): New hook.
	* doc/tm.texi: Regenerate.

[-- Attachment #2: 0004-Add-TARGET_FN_OTHER_HARD_REG_USAGE-hook.patch --]
[-- Type: text/x-patch, Size: 3934 bytes --]

diff --git a/gcc/doc/tm.texi b/gcc/doc/tm.texi
index cbbc82d..3bf7abe 100644
--- a/gcc/doc/tm.texi
+++ b/gcc/doc/tm.texi
@@ -3074,6 +3074,7 @@ This describes the stack layout and calling conventions.
 * Profiling::
 * Tail Calls::
 * Stack Smashing Protection::
+* Miscellaneous Register Hooks::
 @end menu
 
 @node Frame Layout
@@ -4999,6 +5000,14 @@ normally defined in @file{libgcc2.c}.
 Whether this target supports splitting the stack when the options described in @var{opts} have been passed.  This is called after options have been parsed, so the target may reject splitting the stack in some configurations.  The default version of this hook returns false.  If @var{report} is true, this function may issue a warning or error; if @var{report} is false, it must simply return a value
 @end deftypefn
 
+@node Miscellaneous Register Hooks
+@subsection Miscellaneous register hooks
+@cindex miscellaneous register hooks
+
+@deftypefn {Target Hook} void TARGET_FN_OTHER_HARD_REG_USAGE (struct hard_reg_set_container *@var{regs})
+Add any hard registers to @var{regs} that are set or clobbered by a call to the function.  This hook only needs to be defined to provide registers that cannot be found by examination of the final RTL representation of a function.
+@end deftypefn
+
 @node Varargs
 @section Implementing the Varargs Macros
 @cindex varargs implementation
diff --git a/gcc/doc/tm.texi.in b/gcc/doc/tm.texi.in
index dfba947..4dfd8aa 100644
--- a/gcc/doc/tm.texi.in
+++ b/gcc/doc/tm.texi.in
@@ -3042,6 +3042,7 @@ This describes the stack layout and calling conventions.
 * Profiling::
 * Tail Calls::
 * Stack Smashing Protection::
+* Miscellaneous Register Hooks::
 @end menu
 
 @node Frame Layout
@@ -4922,6 +4923,12 @@ normally defined in @file{libgcc2.c}.
 
 @hook TARGET_SUPPORTS_SPLIT_STACK
 
+@node Miscellaneous Register Hooks
+@subsection Miscellaneous register hooks
+@cindex miscellaneous register hooks
+
+@hook TARGET_FN_OTHER_HARD_REG_USAGE
+
 @node Varargs
 @section Implementing the Varargs Macros
 @cindex varargs implementation
diff --git a/gcc/hooks.c b/gcc/hooks.c
index 3b54dfa..e038a95 100644
--- a/gcc/hooks.c
+++ b/gcc/hooks.c
@@ -446,3 +446,11 @@ void
 hook_void_gcc_optionsp (struct gcc_options *opts ATTRIBUTE_UNUSED)
 {
 }
+
+/* Generic hook that takes a struct hard_reg_set_container * and returns
+   void.  */
+
+void
+hook_void_hard_reg_set_containerp (struct hard_reg_set_container *regs ATTRIBUTE_UNUSED)
+{
+}
diff --git a/gcc/hooks.h b/gcc/hooks.h
index 50bcc6a..44decdf 100644
--- a/gcc/hooks.h
+++ b/gcc/hooks.h
@@ -69,6 +69,7 @@ extern void hook_void_tree (tree);
 extern void hook_void_tree_treeptr (tree, tree *);
 extern void hook_void_int_int (int, int);
 extern void hook_void_gcc_optionsp (struct gcc_options *);
+extern void hook_void_hard_reg_set_containerp (struct hard_reg_set_container *);
 
 extern int hook_int_uint_mode_1 (unsigned int, enum machine_mode);
 extern int hook_int_const_tree_0 (const_tree);
diff --git a/gcc/target.def b/gcc/target.def
index 831cad8..e8f7c4a 100644
--- a/gcc/target.def
+++ b/gcc/target.def
@@ -2850,6 +2850,17 @@ DEFHOOK
  void, (bitmap regs),
  hook_void_bitmap)
 
+/* For targets that need to mark extra registers as clobbered on entry to
+   the function, they should define this target hook and set their
+   bits in the struct hard_reg_set_container passed in.  */
+DEFHOOK
+(fn_other_hard_reg_usage,
+ "Add any hard registers to @var{regs} that are set or clobbered by a call to\
+ the function.  This hook only needs to be defined to provide registers that\
+ cannot be found by examination of the final RTL representation of a function.",
+ void, (struct hard_reg_set_container *regs),
+ hook_void_hard_reg_set_containerp)
+
 /* Fill in additional registers set up by prologue into a regset.  */
 DEFHOOK
 (set_up_by_prologue,

^ permalink raw reply	[flat|nested] 59+ messages in thread

* [PATCH][09/10] -fuse-caller-save - Add documentation
  2013-03-29 12:54           ` Tom de Vries
  2013-03-29 13:06             ` [PATCH][06/10] -fuse-caller-save - Collect register usage information Tom de Vries
@ 2013-03-29 13:06             ` Tom de Vries
  2013-03-29 13:06             ` [PATCH][02/10] -fuse-caller-save - Add new reg-note REG_CALL_DECL Tom de Vries
                               ` (18 subsequent siblings)
  20 siblings, 0 replies; 59+ messages in thread
From: Tom de Vries @ 2013-03-29 13:06 UTC (permalink / raw)
  To: Vladimir Makarov; +Cc: gcc-patches

[-- Attachment #1: Type: text/plain, Size: 324 bytes --]

Vladimir,

This patch adds the documentation of -fuse-caller-save.

Thanks,
  -Tom

2013-03-29  Radovan Obradovic  <robradovic@mips.com>
            Tom de Vries  <tom@codesourcery.com>

	* doc/invoke.texi (@item Optimization Options): Add -fuse-caller-save to
	gccoptlist.
	(@item -fuse-caller-save): New item.

[-- Attachment #2: 0009-Add-documentation.patch --]
[-- Type: text/x-patch, Size: 1437 bytes --]

diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index 475dcf0..efb8a1a 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -421,8 +421,8 @@ Objective-C and Objective-C++ Dialects}.
 -ftree-ter -ftree-vect-loop-version -ftree-vectorize -ftree-vrp @gol
 -funit-at-a-time -funroll-all-loops -funroll-loops @gol
 -funsafe-loop-optimizations -funsafe-math-optimizations -funswitch-loops @gol
--fvariable-expansion-in-unroller -fvect-cost-model -fvpt -fweb @gol
--fwhole-program -fwpa -fuse-ld=@var{linker} -fuse-linker-plugin @gol
+-fuse-caller-save -fvariable-expansion-in-unroller -fvect-cost-model -fvpt @gol
+-fweb -fwhole-program -fwpa -fuse-ld=@var{linker} -fuse-linker-plugin @gol
 --param @var{name}=@var{value}
 -O  -O0  -O1  -O2  -O3  -Os -Ofast -Og}
 
@@ -7382,6 +7382,14 @@ and then tries to find ways to combine them.
 
 Enabled by default at @option{-O1} and higher.
 
+@item -fuse-caller-save
+Use caller save registers for allocation if those registers are not used by
+any called function.  In that case it is not necessary to save and restore
+them around calls.  This is only possible if called functions are part of
+same compilation unit as current function and they are compiled before it.
+
+Enabled at levels @option{-O2}, @option{-O3}, @option{-Os}.
+
 @item -fconserve-stack
 @opindex fconserve-stack
 Attempt to minimize stack usage.  The compiler attempts to use less

^ permalink raw reply	[flat|nested] 59+ messages in thread

* [PATCH][07/10] -fuse-caller-save - Use collected register usage information
  2013-03-29 12:54           ` Tom de Vries
                               ` (8 preceding siblings ...)
  2013-03-29 13:06             ` [PATCH][03/10] -fuse-caller-save - Add implicit parameter to find_all_hard_reg_sets Tom de Vries
@ 2013-03-29 13:06             ` Tom de Vries
  2013-03-30 16:10             ` [PATCH][IRA] Analysis of register usage of functions for usage by IRA Tom de Vries
                               ` (10 subsequent siblings)
  20 siblings, 0 replies; 59+ messages in thread
From: Tom de Vries @ 2013-03-29 13:06 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: gcc-patches

[-- Attachment #1: Type: text/plain, Size: 1412 bytes --]

Paolo,

This patch series adds analysis of register usage of functions for usage by IRA.
The original post is here
( http://gcc.gnu.org/ml/gcc-patches/2013-01/msg01234.html ).

This patch uses the information of which registers are clobbered by a call
in IRA and df-scan.
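
How the collected information narrows the set of registers to save around a
call can be sketched in plain C.  This is a simplified model loosely based on
the caller-save.c part of this patch; a 32-bit mask stands in for HARD_REG_SET
and toy_regs_to_save is a hypothetical name.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

typedef uint32_t toy_reg_set;	/* Stand-in for HARD_REG_SET.  */

/* Return the registers to save around a call.  LIVE_ACROSS are the
   registers live across the call, CALL_SETS the registers the call
   insn itself sets, CALL_USED the call-clobbered set.  If the callee's
   usage CALLEE_USED is known, only the call-clobbered registers it
   really writes count as clobbered; otherwise all of CALL_USED does.  */
static toy_reg_set
toy_regs_to_save (toy_reg_set live_across, toy_reg_set call_sets,
		  bool callee_known, toy_reg_set callee_used,
		  toy_reg_set call_used)
{
  toy_reg_set clobbered
    = callee_known ? (callee_used & call_used) : call_used;

  assert ((clobbered & ~call_used) == 0);
  return live_across & ~call_sets & clobbered;
}
```

With a known callee that writes few registers, most live call-clobbered
registers drop out of the result and need no save and restore.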



Bootstrapped and reg-tested on x86_64, including Ada.  Built and reg-tested on
mips, arm, ppc and sh.

Can you approve the df-scan part for trunk?

Thanks,
  -Tom



2013-03-29  Radovan Obradovic  <robradovic@mips.com>
            Tom de Vries  <tom@codesourcery.com>

	* df-scan.c (df_get_call_refs): Use get_call_reg_set_usage.
	* caller-save.c (setup_save_areas, save_call_clobbered_regs): Use
	get_call_reg_set_usage.
	* resource.c (mark_set_resources, mark_target_live_regs): Use
	get_call_reg_set_usage.
	* ira-int.h (struct ira_allocno): Add crossed_calls_clobbered_regs
	field.
	(ALLOCNO_CROSSED_CALLS_CLOBBERED_REGS): Define.
	* ira-lives.c (process_bb_node_lives): Use get_call_reg_set_usage.
	Calculate ALLOCNO_CROSSED_CALLS_CLOBBERED_REGS.
	* ira-build.c (ira_create_allocno): Init
	ALLOCNO_CROSSED_CALLS_CLOBBERED_REGS.
	(create_cap_allocno, propagate_allocno_info)
	(propagate_some_info_from_allocno)
	(copy_info_to_removed_store_destinations): Handle
	ALLOCNO_CROSSED_CALLS_CLOBBERED_REGS.
	* ira-costs.c (ira_tune_allocno_costs): Use
	ALLOCNO_CROSSED_CALLS_CLOBBERED_REGS to adjust costs.

[-- Attachment #2: 0007-Use-collected-register-usage-information.patch --]
[-- Type: text/x-patch, Size: 10919 bytes --]

diff --git a/gcc/caller-save.c b/gcc/caller-save.c

index 5e65294..39d75ad 100644

--- a/gcc/caller-save.c

+++ b/gcc/caller-save.c

@@ -441,7 +441,7 @@ setup_save_areas (void)

       freq = REG_FREQ_FROM_BB (BLOCK_FOR_INSN (insn));

       REG_SET_TO_HARD_REG_SET (hard_regs_to_save,

 			       &chain->live_throughout);

-      COPY_HARD_REG_SET (used_regs, call_used_reg_set);

+      get_call_reg_set_usage (insn, &used_regs, call_used_reg_set);

 

       /* Record all registers set in this call insn.  These don't

 	 need to be saved.  N.B. the call insn might set a subreg

@@ -525,7 +525,7 @@ setup_save_areas (void)

 

 	  REG_SET_TO_HARD_REG_SET (hard_regs_to_save,

 				   &chain->live_throughout);

-	  COPY_HARD_REG_SET (used_regs, call_used_reg_set);

+	  get_call_reg_set_usage (insn, &used_regs, call_used_reg_set);

 

 	  /* Record all registers set in this call insn.  These don't

 	     need to be saved.  N.B. the call insn might set a subreg

@@ -804,6 +804,7 @@ save_call_clobbered_regs (void)

 	    {

 	      unsigned regno;

 	      HARD_REG_SET hard_regs_to_save;

+	      HARD_REG_SET call_def_reg_set;

 	      reg_set_iterator rsi;

 	      rtx cheap;

 

@@ -854,7 +855,9 @@ save_call_clobbered_regs (void)

 	      AND_COMPL_HARD_REG_SET (hard_regs_to_save, call_fixed_reg_set);

 	      AND_COMPL_HARD_REG_SET (hard_regs_to_save, this_insn_sets);

 	      AND_COMPL_HARD_REG_SET (hard_regs_to_save, hard_regs_saved);

-	      AND_HARD_REG_SET (hard_regs_to_save, call_used_reg_set);

+	      get_call_reg_set_usage (insn, &call_def_reg_set,

+				      call_used_reg_set);

+	      AND_HARD_REG_SET (hard_regs_to_save, call_def_reg_set);

 

 	      for (regno = 0; regno < FIRST_PSEUDO_REGISTER; regno++)

 		if (TEST_HARD_REG_BIT (hard_regs_to_save, regno))

diff --git a/gcc/df-scan.c b/gcc/df-scan.c

index fdfa931..898454c 100644

--- a/gcc/df-scan.c

+++ b/gcc/df-scan.c

@@ -3398,10 +3398,13 @@ df_get_call_refs (struct df_collection_rec *collection_rec,

   bool is_sibling_call;

   unsigned int i;

   HARD_REG_SET defs_generated;

+  HARD_REG_SET fn_reg_set_usage;

 

   CLEAR_HARD_REG_SET (defs_generated);

   df_find_hard_reg_defs (PATTERN (insn_info->insn), &defs_generated);

   is_sibling_call = SIBLING_CALL_P (insn_info->insn);

+  get_call_reg_set_usage (insn_info->insn, &fn_reg_set_usage,

+			  regs_invalidated_by_call);

 

   for (i = 0; i < FIRST_PSEUDO_REGISTER; i++)

     {

@@ -3425,7 +3428,7 @@ df_get_call_refs (struct df_collection_rec *collection_rec,

 			       NULL, bb, insn_info, DF_REF_REG_DEF, flags);

 	    }

 	}

-      else if (TEST_HARD_REG_BIT (regs_invalidated_by_call, i)

+      else if (TEST_HARD_REG_BIT (fn_reg_set_usage, i)

 	       /* no clobbers for regs that are the result of the call */

 	       && !TEST_HARD_REG_BIT (defs_generated, i)

 	       && (!is_sibling_call

diff --git a/gcc/ira-build.c b/gcc/ira-build.c

index b1e481b..054139a 100644

--- a/gcc/ira-build.c

+++ b/gcc/ira-build.c

@@ -507,6 +507,7 @@ ira_create_allocno (int regno, bool cap_p,

   ALLOCNO_CALL_FREQ (a) = 0;

   ALLOCNO_CALLS_CROSSED_NUM (a) = 0;

   ALLOCNO_CHEAP_CALLS_CROSSED_NUM (a) = 0;

+  CLEAR_HARD_REG_SET (ALLOCNO_CROSSED_CALLS_CLOBBERED_REGS (a));

 #ifdef STACK_REGS

   ALLOCNO_NO_STACK_REG_P (a) = false;

   ALLOCNO_TOTAL_NO_STACK_REG_P (a) = false;

@@ -904,6 +905,8 @@ create_cap_allocno (ira_allocno_t a)

 

   ALLOCNO_CALLS_CROSSED_NUM (cap) = ALLOCNO_CALLS_CROSSED_NUM (a);

   ALLOCNO_CHEAP_CALLS_CROSSED_NUM (cap) = ALLOCNO_CHEAP_CALLS_CROSSED_NUM (a);

+  IOR_HARD_REG_SET (ALLOCNO_CROSSED_CALLS_CLOBBERED_REGS (cap),

+		    ALLOCNO_CROSSED_CALLS_CLOBBERED_REGS (a));

   if (internal_flag_ira_verbose > 2 && ira_dump_file != NULL)

     {

       fprintf (ira_dump_file, "    Creating cap ");

@@ -1823,6 +1826,8 @@ propagate_allocno_info (void)

 	    += ALLOCNO_CALLS_CROSSED_NUM (a);

 	  ALLOCNO_CHEAP_CALLS_CROSSED_NUM (parent_a)

 	    += ALLOCNO_CHEAP_CALLS_CROSSED_NUM (a);

+ 	  IOR_HARD_REG_SET (ALLOCNO_CROSSED_CALLS_CLOBBERED_REGS (parent_a),

+ 			    ALLOCNO_CROSSED_CALLS_CLOBBERED_REGS (a));

 	  ALLOCNO_EXCESS_PRESSURE_POINTS_NUM (parent_a)

 	    += ALLOCNO_EXCESS_PRESSURE_POINTS_NUM (a);

 	  aclass = ALLOCNO_CLASS (a);

@@ -2203,6 +2208,9 @@ propagate_some_info_from_allocno (ira_allocno_t a, ira_allocno_t from_a)

   ALLOCNO_CALLS_CROSSED_NUM (a) += ALLOCNO_CALLS_CROSSED_NUM (from_a);

   ALLOCNO_CHEAP_CALLS_CROSSED_NUM (a)

     += ALLOCNO_CHEAP_CALLS_CROSSED_NUM (from_a);

+  IOR_HARD_REG_SET (ALLOCNO_CROSSED_CALLS_CLOBBERED_REGS (a),

+ 		    ALLOCNO_CROSSED_CALLS_CLOBBERED_REGS (from_a));

+

   ALLOCNO_EXCESS_PRESSURE_POINTS_NUM (a)

     += ALLOCNO_EXCESS_PRESSURE_POINTS_NUM (from_a);

   if (! ALLOCNO_BAD_SPILL_P (from_a))

@@ -2828,6 +2836,8 @@ copy_info_to_removed_store_destinations (int regno)

 	+= ALLOCNO_CALLS_CROSSED_NUM (a);

       ALLOCNO_CHEAP_CALLS_CROSSED_NUM (parent_a)

 	+= ALLOCNO_CHEAP_CALLS_CROSSED_NUM (a);

+      IOR_HARD_REG_SET (ALLOCNO_CROSSED_CALLS_CLOBBERED_REGS (parent_a),

+ 			ALLOCNO_CROSSED_CALLS_CLOBBERED_REGS (a));

       ALLOCNO_EXCESS_PRESSURE_POINTS_NUM (parent_a)

 	+= ALLOCNO_EXCESS_PRESSURE_POINTS_NUM (a);

       merged_p = true;

diff --git a/gcc/ira-costs.c b/gcc/ira-costs.c

index 1de0061..8dbef3f 100644

--- a/gcc/ira-costs.c

+++ b/gcc/ira-costs.c

@@ -2082,6 +2082,7 @@ ira_tune_allocno_costs (void)

   ira_allocno_object_iterator oi;

   ira_object_t obj;

   bool skip_p;

+  HARD_REG_SET *crossed_calls_clobber_regs;

 

   FOR_EACH_ALLOCNO (a, ai)

     {

@@ -2116,17 +2117,24 @@ ira_tune_allocno_costs (void)

 		continue;

 	      rclass = REGNO_REG_CLASS (regno);

 	      cost = 0;

-	      if (ira_hard_reg_set_intersection_p (regno, mode, call_used_reg_set)

-		  || HARD_REGNO_CALL_PART_CLOBBERED (regno, mode))

-		cost += (ALLOCNO_CALL_FREQ (a)

-			 * (ira_memory_move_cost[mode][rclass][0]

-			    + ira_memory_move_cost[mode][rclass][1]));

+	      crossed_calls_clobber_regs

+		= &(ALLOCNO_CROSSED_CALLS_CLOBBERED_REGS (a));

+	      if (ira_hard_reg_set_intersection_p (regno, mode,

+						   *crossed_calls_clobber_regs))

+		{

+		  if (ira_hard_reg_set_intersection_p (regno, mode,

+						       call_used_reg_set)

+		      || HARD_REGNO_CALL_PART_CLOBBERED (regno, mode))

+		    cost += (ALLOCNO_CALL_FREQ (a)

+			     * (ira_memory_move_cost[mode][rclass][0]

+				+ ira_memory_move_cost[mode][rclass][1]));

 #ifdef IRA_HARD_REGNO_ADD_COST_MULTIPLIER

-	      cost += ((ira_memory_move_cost[mode][rclass][0]

-			+ ira_memory_move_cost[mode][rclass][1])

-		       * ALLOCNO_FREQ (a)

-		       * IRA_HARD_REGNO_ADD_COST_MULTIPLIER (regno) / 2);

+		  cost += ((ira_memory_move_cost[mode][rclass][0]

+			    + ira_memory_move_cost[mode][rclass][1])

+			   * ALLOCNO_FREQ (a)

+			   * IRA_HARD_REGNO_ADD_COST_MULTIPLIER (regno) / 2);

 #endif

+		}

 	      if (INT_MAX - cost < reg_costs[j])

 		reg_costs[j] = INT_MAX;

 	      else

diff --git a/gcc/ira-int.h b/gcc/ira-int.h

index 096f330..3d0e5ec 100644

--- a/gcc/ira-int.h

+++ b/gcc/ira-int.h

@@ -371,6 +371,8 @@ struct ira_allocno

   /* The number of calls across which it is live, but which should not

      affect register preferences.  */

   int cheap_calls_crossed_num;

+  /* Registers clobbered by intersected calls.  */

+   HARD_REG_SET crossed_calls_clobbered_regs;

   /* Array of usage costs (accumulated and the one updated during

      coloring) for each hard register of the allocno class.  The

      member value can be NULL if all costs are the same and equal to

@@ -414,6 +416,8 @@ struct ira_allocno

 #define ALLOCNO_CALL_FREQ(A) ((A)->call_freq)

 #define ALLOCNO_CALLS_CROSSED_NUM(A) ((A)->calls_crossed_num)

 #define ALLOCNO_CHEAP_CALLS_CROSSED_NUM(A) ((A)->cheap_calls_crossed_num)

+#define ALLOCNO_CROSSED_CALLS_CLOBBERED_REGS(A) \

+  ((A)->crossed_calls_clobbered_regs)

 #define ALLOCNO_MEM_OPTIMIZED_DEST(A) ((A)->mem_optimized_dest)

 #define ALLOCNO_MEM_OPTIMIZED_DEST_P(A) ((A)->mem_optimized_dest_p)

 #define ALLOCNO_SOMEWHERE_RENAMED_P(A) ((A)->somewhere_renamed_p)

diff --git a/gcc/ira-lives.c b/gcc/ira-lives.c

index 31635dd..45a00c2 100644

--- a/gcc/ira-lives.c

+++ b/gcc/ira-lives.c

@@ -1273,6 +1273,10 @@ process_bb_node_lives (ira_loop_tree_node_t loop_tree_node)

 		  ira_object_t obj = ira_object_id_map[i];

 		  ira_allocno_t a = OBJECT_ALLOCNO (obj);

 		  int num = ALLOCNO_NUM (a);

+		  HARD_REG_SET this_call_used_reg_set;

+

+		  get_call_reg_set_usage (insn, &this_call_used_reg_set,

+					  call_used_reg_set);

 

 		  /* Don't allocate allocnos that cross setjmps or any

 		     call, if this function receives a nonlocal

@@ -1287,9 +1291,9 @@ process_bb_node_lives (ira_loop_tree_node_t loop_tree_node)

 		  if (can_throw_internal (insn))

 		    {

 		      IOR_HARD_REG_SET (OBJECT_CONFLICT_HARD_REGS (obj),

-					call_used_reg_set);

+					this_call_used_reg_set);

 		      IOR_HARD_REG_SET (OBJECT_TOTAL_CONFLICT_HARD_REGS (obj),

-					call_used_reg_set);

+					this_call_used_reg_set);

 		    }

 

 		  if (sparseset_bit_p (allocnos_processed, num))

@@ -1306,6 +1310,8 @@ process_bb_node_lives (ira_loop_tree_node_t loop_tree_node)

 		  /* Mark it as saved at the next call.  */

 		  allocno_saved_at_call[num] = last_call_num + 1;

 		  ALLOCNO_CALLS_CROSSED_NUM (a)++;

+		  IOR_HARD_REG_SET (ALLOCNO_CROSSED_CALLS_CLOBBERED_REGS (a),

+				    this_call_used_reg_set);

 		  if (cheap_reg != NULL_RTX

 		      && ALLOCNO_REGNO (a) == (int) REGNO (cheap_reg))

 		    ALLOCNO_CHEAP_CALLS_CROSSED_NUM (a)++;

diff --git a/gcc/resource.c b/gcc/resource.c

index 333c28f..111708c 100644

--- a/gcc/resource.c

+++ b/gcc/resource.c

@@ -647,10 +647,12 @@ mark_set_resources (rtx x, struct resources *res, int in_dest,

       if (mark_type == MARK_SRC_DEST_CALL)

 	{

 	  rtx link;

+	  HARD_REG_SET regs;

 

 	  res->cc = res->memory = 1;

 

-	  IOR_HARD_REG_SET (res->regs, regs_invalidated_by_call);

+	  get_call_reg_set_usage (x, &regs, regs_invalidated_by_call);

+	  IOR_HARD_REG_SET (res->regs, regs);

 

 	  for (link = CALL_INSN_FUNCTION_USAGE (x);

 	       link; link = XEXP (link, 1))

@@ -996,11 +998,15 @@ mark_target_live_regs (rtx insns, rtx target, struct resources *res)

 

 	  if (CALL_P (real_insn))

 	    {

+	      HARD_REG_SET regs_invalidated_by_this_call;

 	      /* CALL clobbers all call-used regs that aren't fixed except

 		 sp, ap, and fp.  Do this before setting the result of the

 		 call live.  */

-	      AND_COMPL_HARD_REG_SET (current_live_regs,

+	      get_call_reg_set_usage (real_insn,

+				      &regs_invalidated_by_this_call,

 				      regs_invalidated_by_call);

+	      AND_COMPL_HARD_REG_SET (current_live_regs,

+				      regs_invalidated_by_this_call);

 

 	      /* A CALL_INSN sets any global register live, since it may

 		 have been modified by the call.  */

^ permalink raw reply	[flat|nested] 59+ messages in thread

* [PATCH][01/10] -fuse-caller-save - Add command line option
  2013-03-29 12:54           ` Tom de Vries
                               ` (2 preceding siblings ...)
  2013-03-29 13:06             ` [PATCH][02/10] -fuse-caller-save - Add new reg-note REG_CALL_DECL Tom de Vries
@ 2013-03-29 13:06             ` Tom de Vries
  2013-03-29 13:06             ` [PATCH][05/10] -fuse-caller-save - Implement TARGET_FN_OTHER_HARD_REG_USAGE hook for ARM Tom de Vries
                               ` (16 subsequent siblings)
  20 siblings, 0 replies; 59+ messages in thread
From: Tom de Vries @ 2013-03-29 13:06 UTC (permalink / raw)
  To: Vladimir Makarov; +Cc: gcc-patches

[-- Attachment #1: Type: text/plain, Size: 246 bytes --]

Vladimir,

This patch adds the -fuse-caller-save command line option.

Thanks,
  -Tom

2013-03-29  Radovan Obradovic  <robradovic@mips.com>
            Tom de Vries  <tom@codesourcery.com>

	* common.opt (fuse-caller-save): New option.

[-- Attachment #2: 0001-Add-command-line-option.patch --]
[-- Type: text/x-patch, Size: 407 bytes --]

diff --git a/gcc/common.opt b/gcc/common.opt

index bdbd3b6..d29b0a0 100644

--- a/gcc/common.opt

+++ b/gcc/common.opt

@@ -2549,4 +2549,8 @@ Create a position independent executable

 z

 Driver Joined Separate

 

+fuse-caller-save

+Common Report Var(flag_use_caller_save) Optimization

+Use caller save register across calls if possible

+

 ; This comment is to ensure we retain the blank line above.

^ permalink raw reply	[flat|nested] 59+ messages in thread

* [PATCH][02/10] -fuse-caller-save - Add new reg-note REG_CALL_DECL
  2013-03-29 12:54           ` Tom de Vries
  2013-03-29 13:06             ` [PATCH][06/10] -fuse-caller-save - Collect register usage information Tom de Vries
  2013-03-29 13:06             ` [PATCH][09/10] -fuse-caller-save - Add documentation Tom de Vries
@ 2013-03-29 13:06             ` Tom de Vries
  2013-03-29 13:06             ` [PATCH][01/10] -fuse-caller-save - Add command line option Tom de Vries
                               ` (17 subsequent siblings)
  20 siblings, 0 replies; 59+ messages in thread
From: Tom de Vries @ 2013-03-29 13:06 UTC (permalink / raw)
  To: Vladimir Makarov; +Cc: gcc-patches

[-- Attachment #1: Type: text/plain, Size: 715 bytes --]

Vladimir,

This patch adds the REG_CALL_DECL reg-note.  Using the reg-note we are able to
easily link call_insns to their corresponding declaration, even after the calls
may have been split into an insn (set register to function address) and a
call_insn (call register), which can happen for e.g. sh, and mips
with -mabicalls.
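
As a rough illustration of why a note on the call helps, the sketch below
models a call insn that carries a list of notes.  All types and names here
(struct note, struct call_insn, called_symbol) are hypothetical and much
simpler than GCC's rtx-based notes; the point is only that the callee can be
recovered from the note even when the call itself goes through a register.

```c
#include <stddef.h>

/* Hypothetical note kinds; only the call-decl kind matters here.  */
enum note_kind { NOTE_OTHER, NOTE_CALL_DECL };

/* Hypothetical note: a kind, a symbol name and a link to the next note.  */
struct note
{
  enum note_kind kind;
  const char *symbol;   /* NULL when the callee is not known.  */
  struct note *next;
};

/* Hypothetical call insn that jumps through a register, so the callee is
   not visible in the insn itself.  */
struct call_insn
{
  struct note *notes;
};

/* Return the symbol recorded in the call-decl note of INSN, or NULL if
   there is no such note or the callee is unknown.  */
static const char *
called_symbol (const struct call_insn *insn)
{
  for (const struct note *n = insn->notes; n != NULL; n = n->next)
    if (n->kind == NOTE_CALL_DECL)
      return n->symbol;
  return NULL;
}
```

In the patch the corresponding lookup is done with find_reg_note and
SYMBOL_REF_DECL, as shown in get_call_fndecl in patch 06/10.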



Thanks,
  -Tom

2013-03-29  Radovan Obradovic  <robradovic@mips.com>
            Tom de Vries  <tom@codesourcery.com>

	* reg-notes.def (REG_NOTE (CALL_DECL)): New reg-note REG_CALL_DECL.
	* calls.c (expand_call, emit_library_call_value_1): Add REG_CALL_DECL
	reg-note.
	* combine.c (distribute_notes): Handle REG_CALL_DECL reg-note.
	* emit-rtl.c (try_split): Same.

[-- Attachment #2: 0002-Add-new-reg-note-REG_CALL_DECL.patch --]
[-- Type: text/x-patch, Size: 3285 bytes --]

diff --git a/gcc/calls.c b/gcc/calls.c

index cdab8e0..39571da 100644

--- a/gcc/calls.c

+++ b/gcc/calls.c

@@ -3158,6 +3158,19 @@ expand_call (tree exp, rtx target, int ignore)

 		   next_arg_reg, valreg, old_inhibit_defer_pop, call_fusage,

 		   flags, args_so_far);

 

+      if (flag_use_caller_save)

+	{

+	  rtx last, datum = NULL_RTX;

+	  if (fndecl != NULL_TREE)

+	    {

+	      datum = XEXP (DECL_RTL (fndecl), 0);

+	      gcc_assert (datum != NULL_RTX

+			  && GET_CODE (datum) == SYMBOL_REF);

+	    }

+	  last = last_call_insn ();

+	  add_reg_note (last, REG_CALL_DECL, datum);

+	}

+

       /* If the call setup or the call itself overlaps with anything

 	 of the argument setup we probably clobbered our call address.

 	 In that case we can't do sibcalls.  */

@@ -4185,6 +4198,14 @@ emit_library_call_value_1 (int retval, rtx orgfun, rtx value,

 	       valreg,

 	       old_inhibit_defer_pop + 1, call_fusage, flags, args_so_far);

 

+  if (flag_use_caller_save)

+    {

+      rtx last, datum = orgfun;

+      gcc_assert (GET_CODE (datum) == SYMBOL_REF);

+      last = last_call_insn ();

+      add_reg_note (last, REG_CALL_DECL, datum);

+    }

+

   /* Right-shift returned value if necessary.  */

   if (!pcc_struct_value

       && TYPE_MODE (tfom) != BLKmode

diff --git a/gcc/combine.c b/gcc/combine.c

index acb4cb4..191eb71 100644

--- a/gcc/combine.c

+++ b/gcc/combine.c

@@ -13187,6 +13187,7 @@ distribute_notes (rtx notes, rtx from_insn, rtx i3, rtx i2, rtx elim_i2,

 	case REG_NORETURN:

 	case REG_SETJMP:

 	case REG_TM:

+	case REG_CALL_DECL:

 	  /* These notes must remain with the call.  It should not be

 	     possible for both I2 and I3 to be a call.  */

 	  if (CALL_P (i3))

diff --git a/gcc/emit-rtl.c b/gcc/emit-rtl.c

index e412bef..e4843fe 100644

--- a/gcc/emit-rtl.c

+++ b/gcc/emit-rtl.c

@@ -3473,6 +3473,7 @@ try_split (rtx pat, rtx trial, int last)

   int probability;

   rtx insn_last, insn;

   int njumps = 0;

+  rtx call_insn = NULL_RTX;

 

   /* We're not good at redistributing frame information.  */

   if (RTX_FRAME_RELATED_P (trial))

@@ -3545,6 +3546,9 @@ try_split (rtx pat, rtx trial, int last)

 	  {

 	    rtx next, *p;

 

+	    gcc_assert (call_insn == NULL_RTX);

+	    call_insn = insn;

+

 	    /* Add the old CALL_INSN_FUNCTION_USAGE to whatever the

 	       target may have explicitly specified.  */

 	    p = &CALL_INSN_FUNCTION_USAGE (insn);

@@ -3616,6 +3620,11 @@ try_split (rtx pat, rtx trial, int last)

 	  fixup_args_size_notes (NULL_RTX, insn_last, INTVAL (XEXP (note, 0)));

 	  break;

 

+	case REG_CALL_DECL:

+	  gcc_assert (call_insn != NULL_RTX);

+	  add_reg_note (call_insn, REG_NOTE_KIND (note), XEXP (note, 0));

+	  break;

+

 	default:

 	  break;

 	}

diff --git a/gcc/reg-notes.def b/gcc/reg-notes.def

index db61c09..f0b6dad 100644

--- a/gcc/reg-notes.def

+++ b/gcc/reg-notes.def

@@ -216,3 +216,8 @@ REG_NOTE (ARGS_SIZE)

    that the return value of a call can be used to reinitialize a

    pseudo reg.  */

 REG_NOTE (RETURNED)

+

+/* Used to mark a call with the function decl called by the call.

+   The decl might not be available in the call due to splitting of the call

+   insn.  This note is a SYMBOL_REF.  */

+REG_NOTE (CALL_DECL)

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [PATCH][IRA] Analysis of register usage of functions for usage by IRA.
  2013-03-29 12:54           ` Tom de Vries
                               ` (9 preceding siblings ...)
  2013-03-29 13:06             ` [PATCH][07/10] -fuse-caller-save - Use collected register usage information Tom de Vries
@ 2013-03-30 16:10             ` Tom de Vries
  2014-01-09 14:42               ` Richard Earnshaw
  2013-03-30 17:11             ` [PATCH][05/10] -fuse-caller-save - Implement TARGET_FN_OTHER_HARD_REG_USAGE hook for ARM Tom de Vries
                               ` (9 subsequent siblings)
  20 siblings, 1 reply; 59+ messages in thread
From: Tom de Vries @ 2013-03-30 16:10 UTC (permalink / raw)
  To: Vladimir Makarov, Richard Earnshaw, Richard Sandiford, Paolo Bonzini
  Cc: gcc-patches

On 29/03/13 13:54, Tom de Vries wrote:
> I split the patch up into 10 patches, to facilitate further review:
> ...
> 0001-Add-command-line-option.patch
> 0002-Add-new-reg-note-REG_CALL_DECL.patch
> 0003-Add-implicit-parameter-to-find_all_hard_reg_sets.patch
> 0004-Add-TARGET_FN_OTHER_HARD_REG_USAGE-hook.patch
> 0005-Implement-TARGET_FN_OTHER_HARD_REG_USAGE-hook-for-ARM.patch
> 0006-Collect-register-usage-information.patch
> 0007-Use-collected-register-usage-information.patch
> 0008-Enable-by-default-at-O2-and-higher.patch
> 0009-Add-documentation.patch
> 0010-Add-test-case.patch
> ...
> I'll post these in reply to this email.
> 

Something went wrong with those emails, which were generated.

I tested the emails by sending them to my work email, where they looked fine.
I managed to reproduce the problem by sending them to my private email.
It seems the problem was inconsistent EOL format.

I've written a python script to handle composing the email, and posted it here
using that script: http://gcc.gnu.org/ml/gcc-patches/2013-03/msg01311.html.
Given that that email looks ok, I think I've addressed the problems now.

I'll repost the patches. Sorry about the noise.

Thanks,
- Tom

^ permalink raw reply	[flat|nested] 59+ messages in thread

* [PATCH][05/10] -fuse-caller-save - Implement TARGET_FN_OTHER_HARD_REG_USAGE hook for ARM
  2013-03-29 12:54           ` Tom de Vries
                               ` (10 preceding siblings ...)
  2013-03-30 16:10             ` [PATCH][IRA] Analysis of register usage of functions for usage by IRA Tom de Vries
@ 2013-03-30 17:11             ` Tom de Vries
  2013-12-06  0:54               ` Tom de Vries
  2013-03-30 17:11             ` [PATCH][02/10] -fuse-caller-save - Add new reg-note REG_CALL_DECL Tom de Vries
                               ` (8 subsequent siblings)
  20 siblings, 1 reply; 59+ messages in thread
From: Tom de Vries @ 2013-03-30 17:11 UTC (permalink / raw)
  To: Richard Earnshaw; +Cc: gcc-patches

[-- Attachment #1: Type: text/plain, Size: 670 bytes --]

Richard,

This patch series adds analysis of register usage of functions for usage by IRA.
The original post is here
( http://gcc.gnu.org/ml/gcc-patches/2013-01/msg01234.html ).

This patch implements the target hook TARGET_FN_OTHER_HARD_REG_USAGE for ARM.
The target hook TARGET_FN_OTHER_HARD_REG_USAGE was introduced in the previous
patch in this patch series.
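
A minimal sketch of what the hook contributes, assuming a bitmask in place of
HARD_REG_SET.  The bit positions IP_BIT and CC_BIT and both function names are
hypothetical; the real register numbers come from the ARM back end.  The hook
only adds registers that the function body analysis cannot see, and the
result is or-ed into the set computed from the body.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for HARD_REG_SET: one bit per hard register.  */
typedef uint64_t reg_set;

/* Hypothetical bit positions, chosen only for illustration.  */
enum { IP_BIT = 12, CC_BIT = 40 };

/* Model of the ARM hook: registers a call may clobber even though the
   callee's body never mentions them, e.g. through linker veneers.  */
static reg_set
other_hard_reg_usage (bool aapcs_based)
{
  reg_set extra = 0;

  if (aapcs_based)
    {
      extra |= (reg_set) 1 << IP_BIT;
      extra |= (reg_set) 1 << CC_BIT;
    }
  return extra;
}

/* Registers a caller must assume are clobbered by a call: those set in
   the body plus those reported by the hook.  */
static reg_set
function_clobbers (reg_set body_sets, bool aapcs_based)
{
  return body_sets | other_hard_reg_usage (aapcs_based);
}
```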

Built and reg-tested on ARM.

OK for trunk?

Thanks,
  -Tom

2013-03-30  Radovan Obradovic  <robradovic@mips.com>
            Tom de Vries  <tom@codesourcery.com>

	* config/arm/arm.c (TARGET_FN_OTHER_HARD_REG_USAGE): Redefine as
	arm_fn_other_hard_reg_usage.
	(arm_fn_other_hard_reg_usage): New function.

[-- Attachment #2: 0005-Implement-TARGET_FN_OTHER_HARD_REG_USAGE-hook-for-ARM.patch --]
[-- Type: text/x-patch, Size: 1397 bytes --]

diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c
index 5f63a2e..341fa86 100644
--- a/gcc/config/arm/arm.c
+++ b/gcc/config/arm/arm.c
@@ -280,6 +280,7 @@ static unsigned arm_add_stmt_cost (void *data, int count,
 
 static void arm_canonicalize_comparison (int *code, rtx *op0, rtx *op1,
 					 bool op0_preserve_value);
+static void arm_fn_other_hard_reg_usage (struct hard_reg_set_container *);
 \f
 /* Table of machine attributes.  */
 static const struct attribute_spec arm_attribute_table[] =
@@ -649,6 +650,10 @@ static const struct attribute_spec arm_attribute_table[] =
 #define TARGET_CANONICALIZE_COMPARISON \
   arm_canonicalize_comparison
 
+#undef TARGET_FN_OTHER_HARD_REG_USAGE
+#define TARGET_FN_OTHER_HARD_REG_USAGE \
+  arm_fn_other_hard_reg_usage
+
 struct gcc_target targetm = TARGET_INITIALIZER;
 \f
 /* Obstack for minipool constant handling.  */
@@ -3762,6 +3767,19 @@ arm_canonicalize_comparison (int *code, rtx *op0, rtx *op1,
     }
 }
 
+/* Implement TARGET_FN_OTHER_HARD_REG_USAGE.  */
+
+static void
+arm_fn_other_hard_reg_usage (struct hard_reg_set_container *regs)
+{
+  if (TARGET_AAPCS_BASED)
+    {
+      /* For AAPCS, IP and CC can be clobbered by veneers inserted by the
+	 linker.  */
+      SET_HARD_REG_BIT (regs->set, IP_REGNUM);
+      SET_HARD_REG_BIT (regs->set, CC_REGNUM);
+    }
+}
 
 /* Define how to find the value returned by a function.  */
 

^ permalink raw reply	[flat|nested] 59+ messages in thread

* [PATCH][06/10] -fuse-caller-save - Collect register usage information
  2013-03-29 12:54           ` Tom de Vries
                               ` (12 preceding siblings ...)
  2013-03-30 17:11             ` [PATCH][02/10] -fuse-caller-save - Add new reg-note REG_CALL_DECL Tom de Vries
@ 2013-03-30 17:11             ` Tom de Vries
  2013-03-30 17:11             ` [PATCH][01/10] -fuse-caller-save - Add command line option Tom de Vries
                               ` (6 subsequent siblings)
  20 siblings, 0 replies; 59+ messages in thread
From: Tom de Vries @ 2013-03-30 17:11 UTC (permalink / raw)
  To: Vladimir Makarov; +Cc: gcc-patches

[-- Attachment #1: Type: text/plain, Size: 683 bytes --]

Vladimir,

This patch adds analysis in pass_final to track which hard registers are set or
clobbered by the function body, and stores that information in a
struct cgraph_node.
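
How the stored information is consumed can be sketched as follows.  This is a
simplified model under stated assumptions: reg_set stands in for
HARD_REG_SET, struct fn_info for the new cgraph_node fields, and
call_reg_set_usage is a made-up name for the logic of get_call_reg_set_usage.
When the callee has been analyzed, the default clobber set is narrowed to the
registers the callee really touches; otherwise the default is kept.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for HARD_REG_SET: one bit per hard register.  */
typedef uint32_t reg_set;

/* Hypothetical stand-in for the fields added to struct cgraph_node.  */
struct fn_info
{
  reg_set used_regs;      /* Registers set or clobbered by the body.  */
  bool used_regs_valid;   /* True once the analysis has completed.  */
};

/* Store in *RESULT the registers clobbered by a call to CALLEE.  CALLEE
   is NULL when the callee is unknown or does not bind locally.  Return
   true if DEFAULT_SET could be narrowed.  */
static bool
call_reg_set_usage (const struct fn_info *callee, reg_set default_set,
                    reg_set *result)
{
  if (callee != NULL && callee->used_regs_valid)
    {
      *result = callee->used_regs & default_set;
      return true;
    }

  /* Conservative fallback: assume the whole default set is clobbered.  */
  *result = default_set;
  return false;
}
```

The fallback path is what keeps the optimization safe for callees that are
external, not yet compiled, or whose analysis was abandoned.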

Thanks,
  -Tom

2013-03-30  Radovan Obradovic  <robradovic@mips.com>
            Tom de Vries  <tom@codesourcery.com>

	* cgraph.h (struct cgraph_node): Add function_used_regs,
	function_used_regs_initialized and function_used_regs_valid fields.
	* final.c: Move include of hard-reg-set.h to before rtl.h to declare
	find_all_hard_reg_sets.
	(collect_fn_hard_reg_usage, get_call_fndecl, get_call_cgraph_node)
	(get_call_reg_set_usage): New functions.
	(rest_of_handle_final): Use collect_fn_hard_reg_usage.

[-- Attachment #2: 0006-Collect-register-usage-information.patch --]
[-- Type: text/x-patch, Size: 5295 bytes --]

diff --git a/gcc/cgraph.h b/gcc/cgraph.h
index 8ab7ae1..2132d91 100644
--- a/gcc/cgraph.h
+++ b/gcc/cgraph.h
@@ -251,6 +251,15 @@ struct GTY(()) cgraph_node {
   /* Unique id of the node.  */
   int uid;
 
+  /* Call unsaved hard registers really used by the corresponding
+     function (including ones used by functions called by the
+     function).  */
+  HARD_REG_SET function_used_regs;
+  /* Set if function_used_regs is initialized.  */
+  unsigned function_used_regs_initialized: 1;
+  /* Set if function_used_regs is valid.  */
+  unsigned function_used_regs_valid: 1;
+
   /* Set when decl is an abstract function pointed to by the
      ABSTRACT_DECL_ORIGIN of a reachable function.  */
   unsigned abstract_and_needed : 1;
diff --git a/gcc/final.c b/gcc/final.c
index d25b8e0..4e0fd01 100644
--- a/gcc/final.c
+++ b/gcc/final.c
@@ -48,6 +48,7 @@ along with GCC; see the file COPYING3.  If not see
 #include "tm.h"
 
 #include "tree.h"
+#include "hard-reg-set.h"
 #include "rtl.h"
 #include "tm_p.h"
 #include "regs.h"
@@ -56,7 +57,6 @@ along with GCC; see the file COPYING3.  If not see
 #include "recog.h"
 #include "conditions.h"
 #include "flags.h"
-#include "hard-reg-set.h"
 #include "output.h"
 #include "except.h"
 #include "function.h"
@@ -222,6 +222,7 @@ static int alter_cond (rtx);
 static int final_addr_vec_align (rtx);
 #endif
 static int align_fuzz (rtx, rtx, int, unsigned);
+static void collect_fn_hard_reg_usage (void);
 \f
 /* Initialize data in final at the beginning of a compilation.  */
 
@@ -4328,6 +4329,8 @@ rest_of_handle_final (void)
   rtx x;
   const char *fnname;
 
+  collect_fn_hard_reg_usage ();
+
   /* Get the function's name, as described by its RTL.  This may be
      different from the DECL_NAME name used in the source file.  */
 
@@ -4584,3 +4587,121 @@ struct rtl_opt_pass pass_clean_state =
   0                                     /* todo_flags_finish */
  }
 };
+
+/* Collect hard register usage for the current function.  */
+
+static void
+collect_fn_hard_reg_usage (void)
+{
+  rtx insn;
+  int i;
+  struct cgraph_node *node;
+  struct hard_reg_set_container other_usage;
+
+  if (!flag_use_caller_save)
+    return;
+
+  node = cgraph_get_node (current_function_decl);
+  gcc_assert (node != NULL);
+
+  gcc_assert (!node->function_used_regs_initialized);
+  node->function_used_regs_initialized = 1;
+
+  for (insn = get_insns (); insn != NULL_RTX; insn = next_insn (insn))
+    {
+      HARD_REG_SET insn_used_regs;
+
+      if (!NONDEBUG_INSN_P (insn))
+	continue;
+
+      find_all_hard_reg_sets (insn, &insn_used_regs, false);
+
+      if (CALL_P (insn)
+	  && !get_call_reg_set_usage (insn, &insn_used_regs, call_used_reg_set))
+	{
+	  CLEAR_HARD_REG_SET (node->function_used_regs);
+	  return;
+	}
+
+      IOR_HARD_REG_SET (node->function_used_regs, insn_used_regs);
+    }
+
+  /* Be conservative - mark fixed and global registers as used.  */
+  IOR_HARD_REG_SET (node->function_used_regs, fixed_reg_set);
+  for (i = 0; i < FIRST_PSEUDO_REGISTER; i++)
+    if (global_regs[i])
+      SET_HARD_REG_BIT (node->function_used_regs, i);
+
+#ifdef STACK_REGS
+  /* Handle STACK_REGS conservatively, since the df-framework does not
+     provide accurate information for them.  */
+
+  for (i = FIRST_STACK_REG; i <= LAST_STACK_REG; i++)
+    SET_HARD_REG_BIT (node->function_used_regs, i);
+#endif
+
+  CLEAR_HARD_REG_SET (other_usage.set);
+  targetm.fn_other_hard_reg_usage (&other_usage);
+  IOR_HARD_REG_SET (node->function_used_regs, other_usage.set);
+
+  node->function_used_regs_valid = 1;
+}
+
+/* Get the declaration of the function called by INSN.  */
+
+static tree
+get_call_fndecl (rtx insn)
+{
+  rtx note, datum;
+
+  if (!flag_use_caller_save)
+    return NULL_TREE;
+
+  note = find_reg_note (insn, REG_CALL_DECL, NULL_RTX);
+  if (note == NULL_RTX)
+    return NULL_TREE;
+
+  datum = XEXP (note, 0);
+  if (datum != NULL_RTX)
+    return SYMBOL_REF_DECL (datum);
+
+  return NULL_TREE;
+}
+
+static struct cgraph_node *
+get_call_cgraph_node (rtx insn)
+{
+  tree fndecl;
+
+  if (insn == NULL_RTX)
+    return NULL;
+
+  fndecl = get_call_fndecl (insn);
+  if (fndecl == NULL_TREE
+      || !targetm.binds_local_p (fndecl))
+    return NULL;
+
+  return cgraph_get_node (fndecl);
+}
+
+/* Find hard registers used by function call instruction INSN, and return them
+   in REG_SET.  Return DEFAULT_SET in REG_SET if not found.  */
+
+bool
+get_call_reg_set_usage (rtx insn, HARD_REG_SET *reg_set,
+			HARD_REG_SET default_set)
+{
+  struct cgraph_node *node = get_call_cgraph_node (insn);
+  if (node != NULL
+      && node->function_used_regs_valid)
+    {
+      COPY_HARD_REG_SET (*reg_set, node->function_used_regs);
+      AND_HARD_REG_SET (*reg_set, default_set);
+      return true;
+    }
+  else
+    {
+      COPY_HARD_REG_SET (*reg_set, default_set);
+      return false;
+    }
+}
diff --git a/gcc/regs.h b/gcc/regs.h
index 090d6b6..ec71ad4 100644
--- a/gcc/regs.h
+++ b/gcc/regs.h
@@ -421,4 +421,8 @@ range_in_hard_reg_set_p (const HARD_REG_SET set, unsigned regno, int nregs)
   return true;
 }
 
+/* Get registers used by given function call instruction.  */
+extern bool get_call_reg_set_usage (rtx insn, HARD_REG_SET *reg_set,
+				    HARD_REG_SET default_set);
+
 #endif /* GCC_REGS_H */

^ permalink raw reply	[flat|nested] 59+ messages in thread

* [PATCH][03/10] -fuse-caller-save - Add implicit parameter to find_all_hard_reg_sets
  2013-03-29 12:54           ` Tom de Vries
                               ` (15 preceding siblings ...)
  2013-03-30 17:11             ` [PATCH][04/10] -fuse-caller-save - Add TARGET_FN_OTHER_HARD_REG_USAGE hook Tom de Vries
@ 2013-03-30 17:11             ` Tom de Vries
  2013-03-30 17:12             ` [PATCH][07/10] -fuse-caller-save - Use collected register usage information Tom de Vries
                               ` (3 subsequent siblings)
  20 siblings, 0 replies; 59+ messages in thread
From: Tom de Vries @ 2013-03-30 17:11 UTC (permalink / raw)
  To: Vladimir Makarov; +Cc: gcc-patches

[-- Attachment #1: Type: text/plain, Size: 449 bytes --]

Vladimir,

This patch adds an implicit parameter to find_all_hard_reg_sets.
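
The role of the new parameter can be sketched with a bitmask model.  The
names reg_set and insn_hard_reg_sets are hypothetical, and the REG_INC
handling of the real function is omitted.  With implicit set to true the
old behavior is kept: a call is assumed to write every call-used register.
With implicit set to false only the registers explicitly written by the insn
are reported, which is what the per-function analysis in patch 06/10 needs.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for HARD_REG_SET: one bit per hard register.  */
typedef uint32_t reg_set;

/* Model of find_all_hard_reg_sets: EXPLICIT_SETS are the registers
   written by the insn pattern itself.  */
static reg_set
insn_hard_reg_sets (reg_set explicit_sets, bool is_call, bool implicit,
                    reg_set call_used)
{
  reg_set result = explicit_sets;

  /* Only add the blanket call clobbers when the caller asks for them.  */
  if (implicit && is_call)
    result |= call_used;

  return result;
}
```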

Thanks,
  -Tom

2013-03-30  Radovan Obradovic  <robradovic@mips.com>
            Tom de Vries  <tom@codesourcery.com>

	* rtlanal.c (find_all_hard_reg_sets): Add bool implicit parameter and
	handle.
	* rtl.h (find_all_hard_reg_sets): Add bool parameter.
	* haifa-sched.c (recompute_todo_spec, check_clobbered_conditions): Add
	new argument to find_all_hard_reg_sets call.

[-- Attachment #2: 0003-Add-implicit-parameter-to-find_all_hard_reg_sets.patch --]
[-- Type: text/x-patch, Size: 2072 bytes --]

diff --git a/gcc/haifa-sched.c b/gcc/haifa-sched.c
index c4591bfe..fe24d43 100644
--- a/gcc/haifa-sched.c
+++ b/gcc/haifa-sched.c
@@ -1271,7 +1271,7 @@ recompute_todo_spec (rtx next, bool for_backtrack)
 	  {
 	    HARD_REG_SET t;
 
-	    find_all_hard_reg_sets (prev, &t);
+	    find_all_hard_reg_sets (prev, &t, true);
 	    if (TEST_HARD_REG_BIT (t, regno))
 	      return HARD_DEP;
 	    if (prev == pro)
@@ -3041,7 +3041,7 @@ check_clobbered_conditions (rtx insn)
   if ((current_sched_info->flags & DO_PREDICATION) == 0)
     return;
 
-  find_all_hard_reg_sets (insn, &t);
+  find_all_hard_reg_sets (insn, &t, true);
 
  restart:
   for (i = 0; i < ready.n_ready; i++)
diff --git a/gcc/rtl.h b/gcc/rtl.h
index b9defcc..6486f20 100644
--- a/gcc/rtl.h
+++ b/gcc/rtl.h
@@ -2038,7 +2038,7 @@ extern const_rtx set_of (const_rtx, const_rtx);
 extern void record_hard_reg_sets (rtx, const_rtx, void *);
 extern void record_hard_reg_uses (rtx *, void *);
 #ifdef HARD_CONST
-extern void find_all_hard_reg_sets (const_rtx, HARD_REG_SET *);
+extern void find_all_hard_reg_sets (const_rtx, HARD_REG_SET *, bool);
 #endif
 extern void note_stores (const_rtx, void (*) (rtx, const_rtx, void *), void *);
 extern void note_uses (rtx *, void (*) (rtx *, void *), void *);
diff --git a/gcc/rtlanal.c b/gcc/rtlanal.c
index b198685..27c1974 100644
--- a/gcc/rtlanal.c
+++ b/gcc/rtlanal.c
@@ -1028,13 +1028,13 @@ record_hard_reg_sets (rtx x, const_rtx pat ATTRIBUTE_UNUSED, void *data)
 /* Examine INSN, and compute the set of hard registers written by it.
    Store it in *PSET.  Should only be called after reload.  */
 void
-find_all_hard_reg_sets (const_rtx insn, HARD_REG_SET *pset)
+find_all_hard_reg_sets (const_rtx insn, HARD_REG_SET *pset, bool implicit)
 {
   rtx link;
 
   CLEAR_HARD_REG_SET (*pset);
   note_stores (PATTERN (insn), record_hard_reg_sets, pset);
-  if (CALL_P (insn))
+  if (implicit && CALL_P (insn))
     IOR_HARD_REG_SET (*pset, call_used_reg_set);
   for (link = REG_NOTES (insn); link; link = XEXP (link, 1))
     if (REG_NOTE_KIND (link) == REG_INC)


* [PATCH][04/10] -fuse-caller-save - Add TARGET_FN_OTHER_HARD_REG_USAGE hook
  2013-03-29 12:54           ` Tom de Vries
                               ` (14 preceding siblings ...)
  2013-03-30 17:11             ` [PATCH][01/10] -fuse-caller-save - Add command line option Tom de Vries
@ 2013-03-30 17:11             ` Tom de Vries
  2013-12-07 15:07               ` [PATCH] -fuse-caller-save - Implement TARGET_FN_OTHER_HARD_REG_USAGE hook for MIPS Tom de Vries
  2013-03-30 17:11             ` [PATCH][03/10] -fuse-caller-save - Add implicit parameter to find_all_hard_reg_sets Tom de Vries
                               ` (4 subsequent siblings)
  20 siblings, 1 reply; 59+ messages in thread
From: Tom de Vries @ 2013-03-30 17:11 UTC (permalink / raw)
  To: Vladimir Makarov; +Cc: gcc-patches

[-- Attachment #1: Type: text/plain, Size: 785 bytes --]

Vladimir,

This patch adds a TARGET_FN_OTHER_HARD_REG_USAGE hook.  The hook is used to
list hard registers that are set or clobbered by a call to a function, but are
not listed as such in the function body, for instance registers clobbered by
veneers inserted by the linker.
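
As a rough sketch of how such a hook could be consumed (a toy model, not the
code in this series): the default hook does nothing, and a target that knows
about veneers adds its scratch register.  All toy_* names, the 32-bit mask
representation and the choice of register 12 are assumptions made only for
this illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Toy stand-in for HARD_REG_SET: one bit per hard register.  */
typedef uint32_t toy_reg_set;

/* Toy stand-in for struct hard_reg_set_container.  */
struct toy_reg_container
{
  toy_reg_set set;
};

/* Shape of the hook: add registers to REGS, return nothing.  */
typedef void (*toy_fn_other_hard_reg_usage_hook) (struct toy_reg_container *);

/* Default hook, modelled on hook_void_hard_reg_set_containerp: no-op.  */
static void
toy_default_hook (struct toy_reg_container *regs)
{
  (void) regs;
}

/* Hypothetical target hook: linker veneers may clobber register 12.  */
#define TOY_VENEER_SCRATCH_REGNO 12

static void
toy_veneer_hook (struct toy_reg_container *regs)
{
  assert (regs != NULL);
  regs->set |= (toy_reg_set) 1 << TOY_VENEER_SCRATCH_REGNO;
}

/* Combine the registers found in the function body with whatever the
   target hook adds on top.  */
static toy_reg_set
toy_function_clobbers (toy_reg_set body_sets,
                       toy_fn_other_hard_reg_usage_hook hook)
{
  struct toy_reg_container regs = { body_sets };

  assert (hook != NULL);
  hook (&regs);
  return regs.set;
}
```

The hook only ever adds registers, so a target that does not define it gets
exactly the set found in the function body.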

Thanks,
  -Tom

2013-03-30  Radovan Obradovic  <robradovic@mips.com>
            Tom de Vries  <tom@codesourcery.com>

	* hooks.c (hook_void_hard_reg_set_containerp): New function.
	* hooks.h (hook_void_hard_reg_set_containerp): Declare.
	* target.def (fn_other_hard_reg_usage): New DEFHOOK.
	* doc/tm.texi.in (@node Stack and Calling): Add Miscellaneous Register
	Hooks to @menu.
	(@node Miscellaneous Register Hooks): New node.
	(@hook TARGET_FN_OTHER_HARD_REG_USAGE): New hook.
	* doc/tm.texi: Regenerate.

[-- Attachment #2: 0004-Add-TARGET_FN_OTHER_HARD_REG_USAGE-hook.patch --]
[-- Type: text/x-patch, Size: 3833 bytes --]

diff --git a/gcc/doc/tm.texi b/gcc/doc/tm.texi
index cbbc82d..3bf7abe 100644
--- a/gcc/doc/tm.texi
+++ b/gcc/doc/tm.texi
@@ -3074,6 +3074,7 @@ This describes the stack layout and calling conventions.
 * Profiling::
 * Tail Calls::
 * Stack Smashing Protection::
+* Miscellaneous Register Hooks::
 @end menu
 
 @node Frame Layout
@@ -4999,6 +5000,14 @@ normally defined in @file{libgcc2.c}.
 Whether this target supports splitting the stack when the options described in @var{opts} have been passed.  This is called after options have been parsed, so the target may reject splitting the stack in some configurations.  The default version of this hook returns false.  If @var{report} is true, this function may issue a warning or error; if @var{report} is false, it must simply return a value
 @end deftypefn
 
+@node Miscellaneous Register Hooks
+@subsection Miscellaneous register hooks
+@cindex miscellaneous register hooks
+
+@deftypefn {Target Hook} void TARGET_FN_OTHER_HARD_REG_USAGE (struct hard_reg_set_container *@var{regs})
+Add any hard registers to @var{regs} that are set or clobbered by a call to the function.  This hook only needs to be defined to provide registers that cannot be found by examination of the final RTL representation of a function.
+@end deftypefn
+
 @node Varargs
 @section Implementing the Varargs Macros
 @cindex varargs implementation
diff --git a/gcc/doc/tm.texi.in b/gcc/doc/tm.texi.in
index dfba947..4dfd8aa 100644
--- a/gcc/doc/tm.texi.in
+++ b/gcc/doc/tm.texi.in
@@ -3042,6 +3042,7 @@ This describes the stack layout and calling conventions.
 * Profiling::
 * Tail Calls::
 * Stack Smashing Protection::
+* Miscellaneous Register Hooks::
 @end menu
 
 @node Frame Layout
@@ -4922,6 +4923,12 @@ normally defined in @file{libgcc2.c}.
 
 @hook TARGET_SUPPORTS_SPLIT_STACK
 
+@node Miscellaneous Register Hooks
+@subsection Miscellaneous register hooks
+@cindex miscellaneous register hooks
+
+@hook TARGET_FN_OTHER_HARD_REG_USAGE
+
 @node Varargs
 @section Implementing the Varargs Macros
 @cindex varargs implementation
diff --git a/gcc/hooks.c b/gcc/hooks.c
index 3b54dfa..e038a95 100644
--- a/gcc/hooks.c
+++ b/gcc/hooks.c
@@ -446,3 +446,11 @@ void
 hook_void_gcc_optionsp (struct gcc_options *opts ATTRIBUTE_UNUSED)
 {
 }
+
+/* Generic hook that takes a struct hard_reg_set_container * and returns
+   void.  */
+
+void
+hook_void_hard_reg_set_containerp (struct hard_reg_set_container *regs ATTRIBUTE_UNUSED)
+{
+}
diff --git a/gcc/hooks.h b/gcc/hooks.h
index 50bcc6a..44decdf 100644
--- a/gcc/hooks.h
+++ b/gcc/hooks.h
@@ -69,6 +69,7 @@ extern void hook_void_tree (tree);
 extern void hook_void_tree_treeptr (tree, tree *);
 extern void hook_void_int_int (int, int);
 extern void hook_void_gcc_optionsp (struct gcc_options *);
+extern void hook_void_hard_reg_set_containerp (struct hard_reg_set_container *);
 
 extern int hook_int_uint_mode_1 (unsigned int, enum machine_mode);
 extern int hook_int_const_tree_0 (const_tree);
diff --git a/gcc/target.def b/gcc/target.def
index 831cad8..e8f7c4a 100644
--- a/gcc/target.def
+++ b/gcc/target.def
@@ -2850,6 +2850,17 @@ DEFHOOK
  void, (bitmap regs),
  hook_void_bitmap)
 
+/* For targets that need to mark extra registers as clobbered on entry to
+   the function, they should define this target hook and set their
+   bits in the struct hard_reg_set_container passed in.  */
+DEFHOOK
+(fn_other_hard_reg_usage,
+ "Add any hard registers to @var{regs} that are set or clobbered by a call to\
+ the function.  This hook only needs to be defined to provide registers that\
+ cannot be found by examination of the final RTL representation of a function.",
+ void, (struct hard_reg_set_container *regs),
+ hook_void_hard_reg_set_containerp)
+
 /* Fill in additional registers set up by prologue into a regset.  */
 DEFHOOK
 (set_up_by_prologue,


* [PATCH][02/10] -fuse-caller-save - Add new reg-note REG_CALL_DECL
  2013-03-29 12:54           ` Tom de Vries
                               ` (11 preceding siblings ...)
  2013-03-30 17:11             ` [PATCH][05/10] -fuse-caller-save - Implement TARGET_FN_OTHER_HARD_REG_USAGE hook for ARM Tom de Vries
@ 2013-03-30 17:11             ` Tom de Vries
  2013-03-30 17:11             ` [PATCH][06/10] -fuse-caller-save - Collect register usage information Tom de Vries
                               ` (7 subsequent siblings)
  20 siblings, 0 replies; 59+ messages in thread
From: Tom de Vries @ 2013-03-30 17:11 UTC (permalink / raw)
  To: Vladimir Makarov; +Cc: gcc-patches

[-- Attachment #1: Type: text/plain, Size: 697 bytes --]

Vladimir,

This patch adds the REG_CALL_DECL reg-note.  Using the reg-note we are able to
easily link call_insns to their corresponding declaration, even after the calls
may have been split into an insn (set register to function address) and a
call_insn (call register), which can happen for instance for sh, and for mips
with -mabicalls.
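
The idea can be pictured with a small toy model (hypothetical types, not RTL):
a direct call is split into an address load and a call through a register, and
the note naming the callee has to end up on the call half, because that insn
no longer mentions the symbol itself.

```c
#include <assert.h>
#include <stddef.h>

/* Toy insn kinds for this sketch only.  */
enum toy_insn_kind
{
  TOY_SET_ADDRESS,    /* Set register to function address.  */
  TOY_CALL_SYMBOL,    /* Direct call naming the callee.  */
  TOY_CALL_REGISTER   /* Call through a register.  */
};

struct toy_rtl_insn
{
  enum toy_insn_kind kind;
  /* Models the REG_CALL_DECL payload: the callee's symbol name, or NULL
     when the insn carries no such note.  */
  const char *call_decl_note;
};

/* Split the direct call ORIG into OUT[0] (address load) and OUT[1] (call
   through register), moving the note onto the call half.  */
static void
toy_split_call (const struct toy_rtl_insn *orig, struct toy_rtl_insn out[2])
{
  assert (orig != NULL && orig->kind == TOY_CALL_SYMBOL);

  out[0].kind = TOY_SET_ADDRESS;
  out[0].call_decl_note = NULL;

  out[1].kind = TOY_CALL_REGISTER;
  out[1].call_decl_note = orig->call_decl_note;
}
```

In the patch itself this job is done by the REG_CALL_DECL case added to
try_split.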

Thanks,
  -Tom

2013-03-30  Radovan Obradovic  <robradovic@mips.com>
            Tom de Vries  <tom@codesourcery.com>

	* reg-notes.def (REG_NOTE (CALL_DECL)): New reg-note REG_CALL_DECL.
	* calls.c (expand_call, emit_library_call_value_1): Add REG_CALL_DECL
	reg-note.
	* combine.c (distribute_notes): Handle REG_CALL_DECL reg-note.
	* emit-rtl.c (try_split): Same.

[-- Attachment #2: 0002-Add-new-reg-note-REG_CALL_DECL.patch --]
[-- Type: text/x-patch, Size: 3188 bytes --]

diff --git a/gcc/calls.c b/gcc/calls.c
index cdab8e0..39571da 100644
--- a/gcc/calls.c
+++ b/gcc/calls.c
@@ -3158,6 +3158,19 @@ expand_call (tree exp, rtx target, int ignore)
 		   next_arg_reg, valreg, old_inhibit_defer_pop, call_fusage,
 		   flags, args_so_far);
 
+      if (flag_use_caller_save)
+	{
+	  rtx last, datum = NULL_RTX;
+	  if (fndecl != NULL_TREE)
+	    {
+	      datum = XEXP (DECL_RTL (fndecl), 0);
+	      gcc_assert (datum != NULL_RTX
+			  && GET_CODE (datum) == SYMBOL_REF);
+	    }
+	  last = last_call_insn ();
+	  add_reg_note (last, REG_CALL_DECL, datum);
+	}
+
       /* If the call setup or the call itself overlaps with anything
 	 of the argument setup we probably clobbered our call address.
 	 In that case we can't do sibcalls.  */
@@ -4185,6 +4198,14 @@ emit_library_call_value_1 (int retval, rtx orgfun, rtx value,
 	       valreg,
 	       old_inhibit_defer_pop + 1, call_fusage, flags, args_so_far);
 
+  if (flag_use_caller_save)
+    {
+      rtx last, datum = orgfun;
+      gcc_assert (GET_CODE (datum) == SYMBOL_REF);
+      last = last_call_insn ();
+      add_reg_note (last, REG_CALL_DECL, datum);
+    }
+
   /* Right-shift returned value if necessary.  */
   if (!pcc_struct_value
       && TYPE_MODE (tfom) != BLKmode
diff --git a/gcc/combine.c b/gcc/combine.c
index acb4cb4..191eb71 100644
--- a/gcc/combine.c
+++ b/gcc/combine.c
@@ -13187,6 +13187,7 @@ distribute_notes (rtx notes, rtx from_insn, rtx i3, rtx i2, rtx elim_i2,
 	case REG_NORETURN:
 	case REG_SETJMP:
 	case REG_TM:
+	case REG_CALL_DECL:
 	  /* These notes must remain with the call.  It should not be
 	     possible for both I2 and I3 to be a call.  */
 	  if (CALL_P (i3))
diff --git a/gcc/emit-rtl.c b/gcc/emit-rtl.c
index e412bef..e4843fe 100644
--- a/gcc/emit-rtl.c
+++ b/gcc/emit-rtl.c
@@ -3473,6 +3473,7 @@ try_split (rtx pat, rtx trial, int last)
   int probability;
   rtx insn_last, insn;
   int njumps = 0;
+  rtx call_insn = NULL_RTX;
 
   /* We're not good at redistributing frame information.  */
   if (RTX_FRAME_RELATED_P (trial))
@@ -3545,6 +3546,9 @@ try_split (rtx pat, rtx trial, int last)
 	  {
 	    rtx next, *p;
 
+	    gcc_assert (call_insn == NULL_RTX);
+	    call_insn = insn;
+
 	    /* Add the old CALL_INSN_FUNCTION_USAGE to whatever the
 	       target may have explicitly specified.  */
 	    p = &CALL_INSN_FUNCTION_USAGE (insn);
@@ -3616,6 +3620,11 @@ try_split (rtx pat, rtx trial, int last)
 	  fixup_args_size_notes (NULL_RTX, insn_last, INTVAL (XEXP (note, 0)));
 	  break;
 
+	case REG_CALL_DECL:
+	  gcc_assert (call_insn != NULL_RTX);
+	  add_reg_note (call_insn, REG_NOTE_KIND (note), XEXP (note, 0));
+	  break;
+
 	default:
 	  break;
 	}
diff --git a/gcc/reg-notes.def b/gcc/reg-notes.def
index db61c09..f0b6dad 100644
--- a/gcc/reg-notes.def
+++ b/gcc/reg-notes.def
@@ -216,3 +216,8 @@ REG_NOTE (ARGS_SIZE)
    that the return value of a call can be used to reinitialize a
    pseudo reg.  */
 REG_NOTE (RETURNED)
+
+/* Used to mark a call with the function decl called by the call.
+   The decl might not be available in the call due to splitting of the call
+   insn.  This note is a SYMBOL_REF.  */
+REG_NOTE (CALL_DECL)


* [PATCH][01/10] -fuse-caller-save - Add command line option
  2013-03-29 12:54           ` Tom de Vries
                               ` (13 preceding siblings ...)
  2013-03-30 17:11             ` [PATCH][06/10] -fuse-caller-save - Collect register usage information Tom de Vries
@ 2013-03-30 17:11             ` Tom de Vries
  2013-03-30 17:11             ` [PATCH][04/10] -fuse-caller-save - Add TARGET_FN_OTHER_HARD_REG_USAGE hook Tom de Vries
                               ` (5 subsequent siblings)
  20 siblings, 0 replies; 59+ messages in thread
From: Tom de Vries @ 2013-03-30 17:11 UTC (permalink / raw)
  To: Vladimir Makarov; +Cc: gcc-patches

[-- Attachment #1: Type: text/plain, Size: 236 bytes --]

Vladimir,

This patch adds the -fuse-caller-save command line option.

Thanks,
  -Tom

2013-03-30  Radovan Obradovic  <robradovic@mips.com>
            Tom de Vries  <tom@codesourcery.com>

	* common.opt (fuse-caller-save): New option.

[-- Attachment #2: 0001-Add-command-line-option.patch --]
[-- Type: text/x-patch, Size: 395 bytes --]

diff --git a/gcc/common.opt b/gcc/common.opt
index bdbd3b6..d29b0a0 100644
--- a/gcc/common.opt
+++ b/gcc/common.opt
@@ -2549,4 +2549,8 @@ Create a position independent executable
 z
 Driver Joined Separate
 
+fuse-caller-save
+Common Report Var(flag_use_caller_save) Optimization
+Use caller save register across calls if possible
+
 ; This comment is to ensure we retain the blank line above.


* [PATCH][09/10] -fuse-caller-save - Add documentation
  2013-03-29 12:54           ` Tom de Vries
                               ` (18 preceding siblings ...)
  2013-03-30 17:12             ` [PATCH][10/10] -fuse-caller-save - Add test-case Tom de Vries
@ 2013-03-30 17:12             ` Tom de Vries
  2013-03-30 17:12             ` [PATCH][08/10] -fuse-caller-save - Enable by default at O2 and higher Tom de Vries
  20 siblings, 0 replies; 59+ messages in thread
From: Tom de Vries @ 2013-03-30 17:12 UTC (permalink / raw)
  To: Vladimir Makarov; +Cc: gcc-patches

[-- Attachment #1: Type: text/plain, Size: 312 bytes --]

Vladimir,

This patch adds the documentation of -fuse-caller-save.

Thanks,
  -Tom

2013-03-30  Radovan Obradovic  <robradovic@mips.com>
            Tom de Vries  <tom@codesourcery.com>

	* doc/invoke.texi (@item Optimization Options): Add -fuse-caller-save to
	gccoptlist.
	(@item -fuse-caller-save): New item.

[-- Attachment #2: 0009-Add-documentation.patch --]
[-- Type: text/x-patch, Size: 1408 bytes --]

diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index 475dcf0..efb8a1a 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -421,8 +421,8 @@ Objective-C and Objective-C++ Dialects}.
 -ftree-ter -ftree-vect-loop-version -ftree-vectorize -ftree-vrp @gol
 -funit-at-a-time -funroll-all-loops -funroll-loops @gol
 -funsafe-loop-optimizations -funsafe-math-optimizations -funswitch-loops @gol
--fvariable-expansion-in-unroller -fvect-cost-model -fvpt -fweb @gol
--fwhole-program -fwpa -fuse-ld=@var{linker} -fuse-linker-plugin @gol
+-fuse-caller-save -fvariable-expansion-in-unroller -fvect-cost-model -fvpt @gol
+-fweb -fwhole-program -fwpa -fuse-ld=@var{linker} -fuse-linker-plugin @gol
 --param @var{name}=@var{value}
 -O  -O0  -O1  -O2  -O3  -Os -Ofast -Og}
 
@@ -7382,6 +7382,14 @@ and then tries to find ways to combine them.
 
 Enabled by default at @option{-O1} and higher.
 
+@item -fuse-caller-save
+Use caller save registers for allocation if those registers are not used by
+any called function.  In that case it is not necessary to save and restore
+them around calls.  This is only possible if called functions are part of the
+same compilation unit as the current function and they are compiled before it.
+
+Enabled at levels @option{-O2}, @option{-O3}, @option{-Os}.
+
 @item -fconserve-stack
 @opindex fconserve-stack
 Attempt to minimize stack usage.  The compiler attempts to use less


* [PATCH][07/10] -fuse-caller-save - Use collected register usage information
  2013-03-29 12:54           ` Tom de Vries
                               ` (16 preceding siblings ...)
  2013-03-30 17:11             ` [PATCH][03/10] -fuse-caller-save - Add implicit parameter to find_all_hard_reg_sets Tom de Vries
@ 2013-03-30 17:12             ` Tom de Vries
  2013-12-06  0:56               ` Tom de Vries
  2013-03-30 17:12             ` [PATCH][10/10] -fuse-caller-save - Add test-case Tom de Vries
                               ` (2 subsequent siblings)
  20 siblings, 1 reply; 59+ messages in thread
From: Tom de Vries @ 2013-03-30 17:12 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: gcc-patches

[-- Attachment #1: Type: text/plain, Size: 1375 bytes --]

Paolo,

This patch series adds analysis of register usage of functions for usage by IRA.
The original post is here
( http://gcc.gnu.org/ml/gcc-patches/2013-01/msg01234.html ).

This patch uses the information of which registers are clobbered by a call
in IRA and df-scan.
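
The pattern used throughout this patch can be sketched as follows.  This is a
toy model under stated assumptions: the body of get_call_reg_set_usage is not
part of this mail, so the fallback to default_set when nothing is known about
the callee is inferred from its signature, and all toy_* names and the 32-bit
mask are invented for the illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Toy stand-in for HARD_REG_SET: one bit per hard register.  */
typedef uint32_t toy_reg_set;

/* Hypothetical per-callee record of collected register usage.  */
struct toy_callee_info
{
  bool usage_valid;       /* True once the callee has been analysed.  */
  toy_reg_set used_regs;  /* Registers the callee sets or clobbers.  */
};

/* Assumed behaviour: use the collected set if available and return true,
   otherwise fall back to DEFAULT_SET and return false.  */
static bool
toy_get_call_reg_set_usage (const struct toy_callee_info *callee,
                            toy_reg_set *reg_set, toy_reg_set default_set)
{
  assert (reg_set != NULL);

  if (callee != NULL && callee->usage_valid)
    {
      *reg_set = callee->used_regs;
      return true;
    }
  *reg_set = default_set;
  return false;
}

/* Mirrors the shape of the save_call_clobbered_regs change: of the
   registers live across the call, only those the call may clobber need
   saving.  */
static toy_reg_set
toy_regs_to_save (toy_reg_set live_across_call,
                  const struct toy_callee_info *callee,
                  toy_reg_set default_set)
{
  toy_reg_set clobbered;

  toy_get_call_reg_set_usage (callee, &clobbered, default_set);
  return live_across_call & clobbered;
}
```

With a callee known to touch only registers 0 and 1, a value kept in register
2 needs no save around the call; with an unknown callee the result is the same
as before the patch.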

Bootstrapped and reg-tested on x86_64, including Ada.  Built and reg-tested on
mips, arm, ppc and sh.

Can you approve the df-scan part for trunk?

Thanks,
  -Tom

2013-03-30  Radovan Obradovic  <robradovic@mips.com>
            Tom de Vries  <tom@codesourcery.com>

	* df-scan.c (df_get_call_refs): Use get_call_reg_set_usage.
	* caller-save.c (setup_save_areas, save_call_clobbered_regs): Use
	get_call_reg_set_usage.
	* resource.c (mark_set_resources, mark_target_live_regs): Use
	get_call_reg_set_usage.
	* ira-int.h (struct ira_allocno): Add crossed_calls_clobbered_regs
	field.
	(ALLOCNO_CROSSED_CALLS_CLOBBERED_REGS): Define.
	* ira-lives.c (process_bb_node_lives): Use get_call_reg_set_usage.
	Calculate ALLOCNO_CROSSED_CALLS_CLOBBERED_REGS.
	* ira-build.c (ira_create_allocno): Init
	ALLOCNO_CROSSED_CALLS_CLOBBERED_REGS.
	(create_cap_allocno, propagate_allocno_info)
	(propagate_some_info_from_allocno)
	(copy_info_to_removed_store_destinations): Handle
	ALLOCNO_CROSSED_CALLS_CLOBBERED_REGS.
	* ira-costs.c (ira_tune_allocno_costs): Use
	ALLOCNO_CROSSED_CALLS_CLOBBERED_REGS to adjust costs.

[-- Attachment #2: 0007-Use-collected-register-usage-information.patch --]
[-- Type: text/x-patch, Size: 10664 bytes --]

diff --git a/gcc/caller-save.c b/gcc/caller-save.c
index 5e65294..39d75ad 100644
--- a/gcc/caller-save.c
+++ b/gcc/caller-save.c
@@ -441,7 +441,7 @@ setup_save_areas (void)
       freq = REG_FREQ_FROM_BB (BLOCK_FOR_INSN (insn));
       REG_SET_TO_HARD_REG_SET (hard_regs_to_save,
 			       &chain->live_throughout);
-      COPY_HARD_REG_SET (used_regs, call_used_reg_set);
+      get_call_reg_set_usage (insn, &used_regs, call_used_reg_set);
 
       /* Record all registers set in this call insn.  These don't
 	 need to be saved.  N.B. the call insn might set a subreg
@@ -525,7 +525,7 @@ setup_save_areas (void)
 
 	  REG_SET_TO_HARD_REG_SET (hard_regs_to_save,
 				   &chain->live_throughout);
-	  COPY_HARD_REG_SET (used_regs, call_used_reg_set);
+	  get_call_reg_set_usage (insn, &used_regs, call_used_reg_set);
 
 	  /* Record all registers set in this call insn.  These don't
 	     need to be saved.  N.B. the call insn might set a subreg
@@ -804,6 +804,7 @@ save_call_clobbered_regs (void)
 	    {
 	      unsigned regno;
 	      HARD_REG_SET hard_regs_to_save;
+	      HARD_REG_SET call_def_reg_set;
 	      reg_set_iterator rsi;
 	      rtx cheap;
 
@@ -854,7 +855,9 @@ save_call_clobbered_regs (void)
 	      AND_COMPL_HARD_REG_SET (hard_regs_to_save, call_fixed_reg_set);
 	      AND_COMPL_HARD_REG_SET (hard_regs_to_save, this_insn_sets);
 	      AND_COMPL_HARD_REG_SET (hard_regs_to_save, hard_regs_saved);
-	      AND_HARD_REG_SET (hard_regs_to_save, call_used_reg_set);
+	      get_call_reg_set_usage (insn, &call_def_reg_set,
+				      call_used_reg_set);
+	      AND_HARD_REG_SET (hard_regs_to_save, call_def_reg_set);
 
 	      for (regno = 0; regno < FIRST_PSEUDO_REGISTER; regno++)
 		if (TEST_HARD_REG_BIT (hard_regs_to_save, regno))
diff --git a/gcc/df-scan.c b/gcc/df-scan.c
index fdfa931..898454c 100644
--- a/gcc/df-scan.c
+++ b/gcc/df-scan.c
@@ -3398,10 +3398,13 @@ df_get_call_refs (struct df_collection_rec *collection_rec,
   bool is_sibling_call;
   unsigned int i;
   HARD_REG_SET defs_generated;
+  HARD_REG_SET fn_reg_set_usage;
 
   CLEAR_HARD_REG_SET (defs_generated);
   df_find_hard_reg_defs (PATTERN (insn_info->insn), &defs_generated);
   is_sibling_call = SIBLING_CALL_P (insn_info->insn);
+  get_call_reg_set_usage (insn_info->insn, &fn_reg_set_usage,
+			  regs_invalidated_by_call);
 
   for (i = 0; i < FIRST_PSEUDO_REGISTER; i++)
     {
@@ -3425,7 +3428,7 @@ df_get_call_refs (struct df_collection_rec *collection_rec,
 			       NULL, bb, insn_info, DF_REF_REG_DEF, flags);
 	    }
 	}
-      else if (TEST_HARD_REG_BIT (regs_invalidated_by_call, i)
+      else if (TEST_HARD_REG_BIT (fn_reg_set_usage, i)
 	       /* no clobbers for regs that are the result of the call */
 	       && !TEST_HARD_REG_BIT (defs_generated, i)
 	       && (!is_sibling_call
diff --git a/gcc/ira-build.c b/gcc/ira-build.c
index b1e481b..054139a 100644
--- a/gcc/ira-build.c
+++ b/gcc/ira-build.c
@@ -507,6 +507,7 @@ ira_create_allocno (int regno, bool cap_p,
   ALLOCNO_CALL_FREQ (a) = 0;
   ALLOCNO_CALLS_CROSSED_NUM (a) = 0;
   ALLOCNO_CHEAP_CALLS_CROSSED_NUM (a) = 0;
+  CLEAR_HARD_REG_SET (ALLOCNO_CROSSED_CALLS_CLOBBERED_REGS (a));
 #ifdef STACK_REGS
   ALLOCNO_NO_STACK_REG_P (a) = false;
   ALLOCNO_TOTAL_NO_STACK_REG_P (a) = false;
@@ -904,6 +905,8 @@ create_cap_allocno (ira_allocno_t a)
 
   ALLOCNO_CALLS_CROSSED_NUM (cap) = ALLOCNO_CALLS_CROSSED_NUM (a);
   ALLOCNO_CHEAP_CALLS_CROSSED_NUM (cap) = ALLOCNO_CHEAP_CALLS_CROSSED_NUM (a);
+  IOR_HARD_REG_SET (ALLOCNO_CROSSED_CALLS_CLOBBERED_REGS (cap),
+		    ALLOCNO_CROSSED_CALLS_CLOBBERED_REGS (a));
   if (internal_flag_ira_verbose > 2 && ira_dump_file != NULL)
     {
       fprintf (ira_dump_file, "    Creating cap ");
@@ -1823,6 +1826,8 @@ propagate_allocno_info (void)
 	    += ALLOCNO_CALLS_CROSSED_NUM (a);
 	  ALLOCNO_CHEAP_CALLS_CROSSED_NUM (parent_a)
 	    += ALLOCNO_CHEAP_CALLS_CROSSED_NUM (a);
+ 	  IOR_HARD_REG_SET (ALLOCNO_CROSSED_CALLS_CLOBBERED_REGS (parent_a),
+ 			    ALLOCNO_CROSSED_CALLS_CLOBBERED_REGS (a));
 	  ALLOCNO_EXCESS_PRESSURE_POINTS_NUM (parent_a)
 	    += ALLOCNO_EXCESS_PRESSURE_POINTS_NUM (a);
 	  aclass = ALLOCNO_CLASS (a);
@@ -2203,6 +2208,9 @@ propagate_some_info_from_allocno (ira_allocno_t a, ira_allocno_t from_a)
   ALLOCNO_CALLS_CROSSED_NUM (a) += ALLOCNO_CALLS_CROSSED_NUM (from_a);
   ALLOCNO_CHEAP_CALLS_CROSSED_NUM (a)
     += ALLOCNO_CHEAP_CALLS_CROSSED_NUM (from_a);
+  IOR_HARD_REG_SET (ALLOCNO_CROSSED_CALLS_CLOBBERED_REGS (a),
+ 		    ALLOCNO_CROSSED_CALLS_CLOBBERED_REGS (from_a));
+
   ALLOCNO_EXCESS_PRESSURE_POINTS_NUM (a)
     += ALLOCNO_EXCESS_PRESSURE_POINTS_NUM (from_a);
   if (! ALLOCNO_BAD_SPILL_P (from_a))
@@ -2828,6 +2836,8 @@ copy_info_to_removed_store_destinations (int regno)
 	+= ALLOCNO_CALLS_CROSSED_NUM (a);
       ALLOCNO_CHEAP_CALLS_CROSSED_NUM (parent_a)
 	+= ALLOCNO_CHEAP_CALLS_CROSSED_NUM (a);
+      IOR_HARD_REG_SET (ALLOCNO_CROSSED_CALLS_CLOBBERED_REGS (parent_a),
+ 			ALLOCNO_CROSSED_CALLS_CLOBBERED_REGS (a));
       ALLOCNO_EXCESS_PRESSURE_POINTS_NUM (parent_a)
 	+= ALLOCNO_EXCESS_PRESSURE_POINTS_NUM (a);
       merged_p = true;
diff --git a/gcc/ira-costs.c b/gcc/ira-costs.c
index 1de0061..8dbef3f 100644
--- a/gcc/ira-costs.c
+++ b/gcc/ira-costs.c
@@ -2082,6 +2082,7 @@ ira_tune_allocno_costs (void)
   ira_allocno_object_iterator oi;
   ira_object_t obj;
   bool skip_p;
+  HARD_REG_SET *crossed_calls_clobber_regs;
 
   FOR_EACH_ALLOCNO (a, ai)
     {
@@ -2116,17 +2117,24 @@ ira_tune_allocno_costs (void)
 		continue;
 	      rclass = REGNO_REG_CLASS (regno);
 	      cost = 0;
-	      if (ira_hard_reg_set_intersection_p (regno, mode, call_used_reg_set)
-		  || HARD_REGNO_CALL_PART_CLOBBERED (regno, mode))
-		cost += (ALLOCNO_CALL_FREQ (a)
-			 * (ira_memory_move_cost[mode][rclass][0]
-			    + ira_memory_move_cost[mode][rclass][1]));
+	      crossed_calls_clobber_regs
+		= &(ALLOCNO_CROSSED_CALLS_CLOBBERED_REGS (a));
+	      if (ira_hard_reg_set_intersection_p (regno, mode,
+						   *crossed_calls_clobber_regs))
+		{
+		  if (ira_hard_reg_set_intersection_p (regno, mode,
+						       call_used_reg_set)
+		      || HARD_REGNO_CALL_PART_CLOBBERED (regno, mode))
+		    cost += (ALLOCNO_CALL_FREQ (a)
+			     * (ira_memory_move_cost[mode][rclass][0]
+				+ ira_memory_move_cost[mode][rclass][1]));
 #ifdef IRA_HARD_REGNO_ADD_COST_MULTIPLIER
-	      cost += ((ira_memory_move_cost[mode][rclass][0]
-			+ ira_memory_move_cost[mode][rclass][1])
-		       * ALLOCNO_FREQ (a)
-		       * IRA_HARD_REGNO_ADD_COST_MULTIPLIER (regno) / 2);
+		  cost += ((ira_memory_move_cost[mode][rclass][0]
+			    + ira_memory_move_cost[mode][rclass][1])
+			   * ALLOCNO_FREQ (a)
+			   * IRA_HARD_REGNO_ADD_COST_MULTIPLIER (regno) / 2);
 #endif
+		}
 	      if (INT_MAX - cost < reg_costs[j])
 		reg_costs[j] = INT_MAX;
 	      else
diff --git a/gcc/ira-int.h b/gcc/ira-int.h
index 096f330..3d0e5ec 100644
--- a/gcc/ira-int.h
+++ b/gcc/ira-int.h
@@ -371,6 +371,8 @@ struct ira_allocno
   /* The number of calls across which it is live, but which should not
      affect register preferences.  */
   int cheap_calls_crossed_num;
+  /* Registers clobbered by intersected calls.  */
+   HARD_REG_SET crossed_calls_clobbered_regs;
   /* Array of usage costs (accumulated and the one updated during
      coloring) for each hard register of the allocno class.  The
      member value can be NULL if all costs are the same and equal to
@@ -414,6 +416,8 @@ struct ira_allocno
 #define ALLOCNO_CALL_FREQ(A) ((A)->call_freq)
 #define ALLOCNO_CALLS_CROSSED_NUM(A) ((A)->calls_crossed_num)
 #define ALLOCNO_CHEAP_CALLS_CROSSED_NUM(A) ((A)->cheap_calls_crossed_num)
+#define ALLOCNO_CROSSED_CALLS_CLOBBERED_REGS(A) \
+  ((A)->crossed_calls_clobbered_regs)
 #define ALLOCNO_MEM_OPTIMIZED_DEST(A) ((A)->mem_optimized_dest)
 #define ALLOCNO_MEM_OPTIMIZED_DEST_P(A) ((A)->mem_optimized_dest_p)
 #define ALLOCNO_SOMEWHERE_RENAMED_P(A) ((A)->somewhere_renamed_p)
diff --git a/gcc/ira-lives.c b/gcc/ira-lives.c
index 31635dd..45a00c2 100644
--- a/gcc/ira-lives.c
+++ b/gcc/ira-lives.c
@@ -1273,6 +1273,10 @@ process_bb_node_lives (ira_loop_tree_node_t loop_tree_node)
 		  ira_object_t obj = ira_object_id_map[i];
 		  ira_allocno_t a = OBJECT_ALLOCNO (obj);
 		  int num = ALLOCNO_NUM (a);
+		  HARD_REG_SET this_call_used_reg_set;
+
+		  get_call_reg_set_usage (insn, &this_call_used_reg_set,
+					  call_used_reg_set);
 
 		  /* Don't allocate allocnos that cross setjmps or any
 		     call, if this function receives a nonlocal
@@ -1287,9 +1291,9 @@ process_bb_node_lives (ira_loop_tree_node_t loop_tree_node)
 		  if (can_throw_internal (insn))
 		    {
 		      IOR_HARD_REG_SET (OBJECT_CONFLICT_HARD_REGS (obj),
-					call_used_reg_set);
+					this_call_used_reg_set);
 		      IOR_HARD_REG_SET (OBJECT_TOTAL_CONFLICT_HARD_REGS (obj),
-					call_used_reg_set);
+					this_call_used_reg_set);
 		    }
 
 		  if (sparseset_bit_p (allocnos_processed, num))
@@ -1306,6 +1310,8 @@ process_bb_node_lives (ira_loop_tree_node_t loop_tree_node)
 		  /* Mark it as saved at the next call.  */
 		  allocno_saved_at_call[num] = last_call_num + 1;
 		  ALLOCNO_CALLS_CROSSED_NUM (a)++;
+		  IOR_HARD_REG_SET (ALLOCNO_CROSSED_CALLS_CLOBBERED_REGS (a),
+				    this_call_used_reg_set);
 		  if (cheap_reg != NULL_RTX
 		      && ALLOCNO_REGNO (a) == (int) REGNO (cheap_reg))
 		    ALLOCNO_CHEAP_CALLS_CROSSED_NUM (a)++;
diff --git a/gcc/resource.c b/gcc/resource.c
index 333c28f..111708c 100644
--- a/gcc/resource.c
+++ b/gcc/resource.c
@@ -647,10 +647,12 @@ mark_set_resources (rtx x, struct resources *res, int in_dest,
       if (mark_type == MARK_SRC_DEST_CALL)
 	{
 	  rtx link;
+	  HARD_REG_SET regs;
 
 	  res->cc = res->memory = 1;
 
-	  IOR_HARD_REG_SET (res->regs, regs_invalidated_by_call);
+	  get_call_reg_set_usage (x, &regs, regs_invalidated_by_call);
+	  IOR_HARD_REG_SET (res->regs, regs);
 
 	  for (link = CALL_INSN_FUNCTION_USAGE (x);
 	       link; link = XEXP (link, 1))
@@ -996,11 +998,15 @@ mark_target_live_regs (rtx insns, rtx target, struct resources *res)
 
 	  if (CALL_P (real_insn))
 	    {
+	      HARD_REG_SET regs_invalidated_by_this_call;
 	      /* CALL clobbers all call-used regs that aren't fixed except
 		 sp, ap, and fp.  Do this before setting the result of the
 		 call live.  */
-	      AND_COMPL_HARD_REG_SET (current_live_regs,
+	      get_call_reg_set_usage (real_insn,
+				      &regs_invalidated_by_this_call,
 				      regs_invalidated_by_call);
+	      AND_COMPL_HARD_REG_SET (current_live_regs,
+				      regs_invalidated_by_this_call);
 
 	      /* A CALL_INSN sets any global register live, since it may
 		 have been modified by the call.  */


* [PATCH][08/10] -fuse-caller-save - Enable by default at O2 and higher
  2013-03-29 12:54           ` Tom de Vries
                               ` (19 preceding siblings ...)
  2013-03-30 17:12             ` [PATCH][09/10] -fuse-caller-save - Add documentation Tom de Vries
@ 2013-03-30 17:12             ` Tom de Vries
  20 siblings, 0 replies; 59+ messages in thread
From: Tom de Vries @ 2013-03-30 17:12 UTC (permalink / raw)
  To: Vladimir Makarov; +Cc: gcc-patches

[-- Attachment #1: Type: text/plain, Size: 288 bytes --]

Vladimir,

This patch enables the -fuse-caller-save optimization by default.

Thanks,
  -Tom

2013-03-30  Radovan Obradovic  <robradovic@mips.com>
            Tom de Vries  <tom@codesourcery.com>

	* opts.c (default_options_table): Add OPT_LEVELS_2_PLUS entry with
	OPT_fuse_caller_save.

[-- Attachment #2: 0008-Enable-by-default-at-O2-and-higher.patch --]
[-- Type: text/x-patch, Size: 541 bytes --]

diff --git a/gcc/opts.c b/gcc/opts.c
index 45b12fe..52a42b9 100644
--- a/gcc/opts.c
+++ b/gcc/opts.c
@@ -486,6 +486,7 @@ static const struct default_options default_options_table[] =
     { OPT_LEVELS_2_PLUS, OPT_ftree_tail_merge, NULL, 1 },
     { OPT_LEVELS_2_PLUS_SPEED_ONLY, OPT_foptimize_strlen, NULL, 1 },
     { OPT_LEVELS_2_PLUS, OPT_fhoist_adjacent_loads, NULL, 1 },
+    { OPT_LEVELS_2_PLUS, OPT_fuse_caller_save, NULL, 1 },
 
     /* -O3 optimizations.  */
     { OPT_LEVELS_3_PLUS, OPT_ftree_loop_distribute_patterns, NULL, 1 },


* [PATCH][10/10] -fuse-caller-save - Add test-case
  2013-03-29 12:54           ` Tom de Vries
                               ` (17 preceding siblings ...)
  2013-03-30 17:12             ` [PATCH][07/10] -fuse-caller-save - Use collected register usage information Tom de Vries
@ 2013-03-30 17:12             ` Tom de Vries
  2013-04-28 10:57               ` Richard Sandiford
  2013-03-30 17:12             ` [PATCH][09/10] -fuse-caller-save - Add documentation Tom de Vries
  2013-03-30 17:12             ` [PATCH][08/10] -fuse-caller-save - Enable by default at O2 and higher Tom de Vries
  20 siblings, 1 reply; 59+ messages in thread
From: Tom de Vries @ 2013-03-30 17:12 UTC (permalink / raw)
  To: Richard Sandiford; +Cc: gcc-patches

[-- Attachment #1: Type: text/plain, Size: 763 bytes --]

Richard,

This patch series adds analysis of register usage of functions for usage by IRA.
The original post is here
( http://gcc.gnu.org/ml/gcc-patches/2013-01/msg01234.html ).

This patch adds a test-case for -fuse-caller-save.  Since the test-case has
different output for mips16 and micromips, new effective targets are introduced.

Built and reg-tested on mips.

OK for trunk?

Thanks,
  -Tom

2013-03-30  Radovan Obradovic  <robradovic@mips.com>
            Tom de Vries  <tom@codesourcery.com>

	* lib/target-supports.exp (check_effective_target_mips16)
	(check_effective_target_micromips): New proc.
	* gcc.target/mips/mips.exp: Add use-caller-save to -ffoo/-fno-foo
	options.  Add -save-temps to mips_option_groups.
	* gcc.target/mips/aru-1.c: New test.

[-- Attachment #2: 0010-Add-test-case.patch --]
[-- Type: text/x-patch, Size: 2871 bytes --]

diff --git a/gcc/testsuite/gcc.target/mips/aru-1.c b/gcc/testsuite/gcc.target/mips/aru-1.c
new file mode 100644
index 0000000..71515a9
--- /dev/null
+++ b/gcc/testsuite/gcc.target/mips/aru-1.c
@@ -0,0 +1,38 @@
+/* { dg-do run } */
+/* { dg-options "-fuse-caller-save -save-temps" } */
+/* { dg-skip-if "" { *-*-* }  { "*" } { "-Os" } } */
+/* Testing -fuse-caller-save optimization option.  */
+
+static int __attribute__((noinline))
+bar (int x)
+{
+  return x + 3;
+}
+
+int __attribute__((noinline))
+foo (int y)
+{
+  return y + bar (y);
+}
+
+int
+main (void)
+{
+  return !(foo (5) == 13);
+}
+
+/* Check that there are only 2 stack-saves: r31 in main and foo.  */
+
+/* Variant not mips16.  Check that there are only 2 sw/sd.  */
+/* { dg-final { scan-assembler-times "(?n)s\[wd\]\t\\\$.*,.*\\(\\\$sp\\)" 2 { target { ! mips16 } } } } */
+
+/* Variant not mips16, Subvariant micromips.  Additionally check there's no
+   swm.  */
+/* { dg-final { scan-assembler-times "(?n)swm\t\\\$.*,.*\\(\\\$sp\\)" 0 {target micromips } } } */
+
+/* Variant mips16.  The save can save 1 or more registers, check that only 1 is
+   saved, twice in total.  */
+/* { dg-final { scan-assembler-times "(?n)save\t\[0-9\]*,\\\$\[^,\]*\$" 2 { target mips16 } } } */
+
+/* Check that the first caller-save register is unused.  */
+/* { dg-final { scan-assembler-not "(\\\$16)" } } */
diff --git a/gcc/testsuite/gcc.target/mips/mips.exp b/gcc/testsuite/gcc.target/mips/mips.exp
index 15b1386..63570bd 100644
--- a/gcc/testsuite/gcc.target/mips/mips.exp
+++ b/gcc/testsuite/gcc.target/mips/mips.exp
@@ -246,6 +246,7 @@ set mips_option_groups {
     small-data "-G[0-9]+"
     warnings "-w"
     dump "-fdump-.*"
+    save_temps "-save-temps"
 }
 
 # Add -mfoo/-mno-foo options to mips_option_groups.
@@ -302,6 +303,7 @@ foreach option {
     tree-vectorize
     unroll-all-loops
     unroll-loops
+    use-caller-save
 } {
     lappend mips_option_groups $option "-f(no-|)$option"
 }
diff --git a/gcc/testsuite/lib/target-supports.exp b/gcc/testsuite/lib/target-supports.exp
index a146f17..dbd0037 100644
--- a/gcc/testsuite/lib/target-supports.exp
+++ b/gcc/testsuite/lib/target-supports.exp
@@ -918,6 +918,26 @@ proc check_effective_target_mips16_attribute { } {
     } [add_options_for_mips16_attribute ""]]
 }
 
+# Return 1 if the target generates mips16 code by default.
+
+proc check_effective_target_mips16 { } {
+    return [check_no_compiler_messages mips16 assembly {
+	#if !(defined __mips16)
+	#error FOO
+	#endif
+    } ""]
+}
+
+# Return 1 if the target generates micromips code by default.
+
+proc check_effective_target_micromips { } {
+    return [check_no_compiler_messages micromips assembly {
+	#if !(defined __mips_micromips)
+	#error FOO
+	#endif
+    } ""]
+}
+
 # Return 1 if the target supports long double larger than double when
 # using the new ABI, 0 otherwise.
 


* Re: [PATCH][10/10] -fuse-caller-save - Add test-case
  2013-03-30 17:12             ` [PATCH][10/10] -fuse-caller-save - Add test-case Tom de Vries
@ 2013-04-28 10:57               ` Richard Sandiford
  2013-12-06  0:34                 ` Tom de Vries
  0 siblings, 1 reply; 59+ messages in thread
From: Richard Sandiford @ 2013-04-28 10:57 UTC (permalink / raw)
  To: Tom de Vries; +Cc: gcc-patches

Tom de Vries <tom@codesourcery.com> writes:
> +/* { dg-do run } */
> +/* { dg-options "-fuse-caller-save -save-temps" } */
> +/* { dg-skip-if "" { *-*-* }  { "*" } { "-Os" } } */
> +/* Testing -fuse-caller-save optimization option.  */
> +
> +static int __attribute__((noinline))
> +bar (int x)
> +{
> +  return x + 3;
> +}
> +
> +int __attribute__((noinline))
> +foo (int y)
> +{
> +  return y + bar (y);
> +}
> +
> +int
> +main (void)
> +{
> +  return !(foo (5) == 13);
> +}
> +
> +/* Check that there are only 2 stack-saves: r31 in main and foo.  */
> +
> +/* Variant not mips16.  Check that there only 2 sw/sd.  */
> +/* { dg-final { scan-assembler-times "(?n)s\[wd\]\t\\\$.*,.*\\(\\\$sp\\)" 2 { target { ! mips16 } } } } */
> +
> +/* Variant not mips16, Subvariant micromips.  Additionally check there's no
> +   swm.  */
> +/* { dg-final { scan-assembler-times "(?n)swm\t\\\$.*,.*\\(\\\$sp\\)" 0 {target micromips } } } */
> +
> +/* Variant mips16.  The save can save 1 or more registers, check that only 1 is
> +   saved, twice in total.  */
> +/* { dg-final { scan-assembler-times "(?n)save\t\[0-9\]*,\\\$\[^,\]*\$" 2 { target mips16 } } } */
> +
> +/* Check that the first caller-save register is unused.  */
> +/* { dg-final { scan-assembler-not "(\\\$16)" } } */

Sorry to ask, but I think it would be better to split this up into
a compile test and a run test.  The run test shouldn't be skipped at -Os.
It should probably also go somewhere more general than gcc.target/mips.

I've tried to avoid conditional scan-assemblers in gcc.target/mips
wherever possible.  The directory has been set up so that you can force
any subtarget you like, so that (for example) -mips16 output is tested
by every run, not just things like mips-sim/-mips16.

In this case I think that means using NOCOMPRESSION to force the
functions to use the standard ISA encoding and making the scan-assemblers
test only for that.  Since you've already done the work :-), bonus points
for creating two copies, one for micromips (dg-options "-micromips ...")
and one for mips16 (dg-options "-mips16 ...").  That's certainly not a
requirement though.

Thanks,
Richard


* Re: [PATCH][10/10] -fuse-caller-save - Add test-case
  2013-04-28 10:57               ` Richard Sandiford
@ 2013-12-06  0:34                 ` Tom de Vries
  2013-12-06  8:51                   ` Richard Sandiford
  0 siblings, 1 reply; 59+ messages in thread
From: Tom de Vries @ 2013-12-06  0:34 UTC (permalink / raw)
  To: Tom de Vries, gcc-patches, rdsandiford

[-- Attachment #1: Type: text/plain, Size: 2458 bytes --]

On 27-04-13 12:01, Richard Sandiford wrote:
> Tom de Vries <tom@codesourcery.com> writes:
>> +/* { dg-do run } */
>> +/* { dg-options "-fuse-caller-save -save-temps" } */
>> +/* { dg-skip-if "" { *-*-* }  { "*" } { "-Os" } } */
>> +/* Testing -fuse-caller-save optimization option.  */
>> +
>> +static int __attribute__((noinline))
>> +bar (int x)
>> +{
>> +  return x + 3;
>> +}
>> +
>> +int __attribute__((noinline))
>> +foo (int y)
>> +{
>> +  return y + bar (y);
>> +}
>> +
>> +int
>> +main (void)
>> +{
>> +  return !(foo (5) == 13);
>> +}
>> +
>> +/* Check that there are only 2 stack-saves: r31 in main and foo.  */
>> +
>> +/* Variant not mips16.  Check that there only 2 sw/sd.  */
>> +/* { dg-final { scan-assembler-times "(?n)s\[wd\]\t\\\$.*,.*\\(\\\$sp\\)" 2 { target { ! mips16 } } } } */
>> +
>> +/* Variant not mips16, Subvariant micromips.  Additionally check there's no
>> +   swm.  */
>> +/* { dg-final { scan-assembler-times "(?n)swm\t\\\$.*,.*\\(\\\$sp\\)" 0 {target micromips } } } */
>> +
>> +/* Variant mips16.  The save can save 1 or more registers, check that only 1 is
>> +   saved, twice in total.  */
>> +/* { dg-final { scan-assembler-times "(?n)save\t\[0-9\]*,\\\$\[^,\]*\$" 2 { target mips16 } } } */
>> +
>> +/* Check that the first caller-save register is unused.  */
>> +/* { dg-final { scan-assembler-not "(\\\$16)" } } */
>
> Sorry to ask, but I think it would be better to split this up into
> a compile test and a run test.  The run test shouldn't be skipped at -Os.
> It should probably also go somewhere more general than gcc.target/mips.
>

Richard,

Done. There's now run test gcc.dg/fuse-caller-save.c and compile test 
gcc.target/mips/fuse-caller-save.c.

> I've tried to avoid conditional scan-assemblers in gcc.target/mips
> wherever possible.  The directory has been set up so that you can force
> any subtarget you like, so that (for example) -mips16 output is tested
> by every run, not just things like mips-sim/-mips16.
>
> In this case I think that means using NOCOMPRESSION to force the
> functions to use the standard ISA encoding and making the scan-assemblers
> test only for that.

Done.

> Since you've already done the work :-), bonus points
> for creating two copies, one for micromips (dg-options "-micromips ...")
> and one for mips16 (dg-options "-mips16 ...").  That's certainly not a
> requirement though.
>

I've left that out for now.

OK for stage1?

Thanks,
- Tom

> Thanks,
> Richard
>


[-- Attachment #2: fuse-caller-save-test.patch --]
[-- Type: text/x-patch, Size: 2182 bytes --]

2013-12-04  Radovan Obradovic  <robradovic@mips.com>
            Tom de Vries  <tom@codesourcery.com>

	* gcc.target/mips/mips.exp: Add use-caller-save to -ffoo/-fno-foo
	options.
	* gcc.dg/fuse-caller-save.c: New test.
	* gcc.target/mips/fuse-caller-save.c: Same.

diff --git a/gcc/testsuite/gcc.dg/fuse-caller-save.c b/gcc/testsuite/gcc.dg/fuse-caller-save.c
new file mode 100644
index 0000000..561a66d
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/fuse-caller-save.c
@@ -0,0 +1,21 @@
+/* { dg-do run } */
+/* { dg-options "-fuse-caller-save" } */
+/* Testing -fuse-caller-save optimization option.  */
+
+static int __attribute__((noinline))
+bar (int x)
+{
+  return x + 3;
+}
+
+int __attribute__((noinline))
+foo (int y)
+{
+  return y + bar (y);
+}
+
+int
+main (void)
+{
+  return !(foo (5) == 13);
+}
diff --git a/gcc/testsuite/gcc.target/mips/fuse-caller-save.c b/gcc/testsuite/gcc.target/mips/fuse-caller-save.c
new file mode 100644
index 0000000..212ca45
--- /dev/null
+++ b/gcc/testsuite/gcc.target/mips/fuse-caller-save.c
@@ -0,0 +1,30 @@
+/* { dg-do compile } */
+/* { dg-options "-fuse-caller-save" } */
+/* { dg-skip-if "" { *-*-* }  { "*" } { "-Os" } } */
+/* Testing -fuse-caller-save optimization option.  */
+
+static int __attribute__((noinline)) NOCOMPRESSION
+bar (int x)
+{
+  return x + 3;
+}
+
+int __attribute__((noinline)) NOCOMPRESSION
+foo (int y)
+{
+  return y + bar (y);
+}
+
+int NOCOMPRESSION
+main (void)
+{
+  return !(foo (5) == 13);
+}
+
+/* Check that there are only 2 stack-saves: r31 in main and foo.  */
+
+/* Check that there are only 2 sw/sd.  */
+/* { dg-final { scan-assembler-times "(?n)s\[wd\]\t\\\$.*,.*\\(\\\$sp\\)" 2 } } */
+
+/* Check that the first caller-save register is unused.  */
+/* { dg-final { scan-assembler-not "(\\\$16)" } } */
diff --git a/gcc/testsuite/gcc.target/mips/mips.exp b/gcc/testsuite/gcc.target/mips/mips.exp
index 1f0d0d6..5bfadec 100644
--- a/gcc/testsuite/gcc.target/mips/mips.exp
+++ b/gcc/testsuite/gcc.target/mips/mips.exp
@@ -305,6 +305,7 @@ foreach option {
     tree-vectorize
     unroll-all-loops
     unroll-loops
+    use-caller-save
 } {
     lappend mips_option_groups $option "-f(no-|)$option"
 }


* Re: [PATCH][IRA] Analysis of register usage of functions for usage by IRA.
  2013-03-14  9:35       ` Tom de Vries
  2013-03-14 15:22         ` Vladimir Makarov
@ 2013-12-06  0:47         ` Tom de Vries
  2014-01-14 19:36           ` Vladimir Makarov
  1 sibling, 1 reply; 59+ messages in thread
From: Tom de Vries @ 2013-12-06  0:47 UTC (permalink / raw)
  To: Vladimir Makarov; +Cc: gcc-patches

[-- Attachment #1: Type: text/plain, Size: 2241 bytes --]

On 14-03-13 10:34, Tom de Vries wrote:
>> I thought about implementing your optimization for LRA by myself. But it
>> >is ok if you decide to work on it.  At least, I am not going to start
>> >this work for a month.
>>> >>I'm also currently looking at how to use the analysis in LRA.
>>> >>AFAIU, in lra-constraints.c we do a backward scan over the insns, and keep track
>>> >>of how many calls we've seen (calls_num), and mark insns with that number. Then
>>> >>when looking at a live-range segment consisting of a def or use insn a and a
>>> >>following use insn b, we can compare the number of calls seen for each insn, and
>>> >>if they're not equal there is at least one call between the 2 insns, and if the
>>> >>corresponding hard register is clobbered by calls, we spill after insn a and
>>> >>restore before insn b.
>>> >>
>>> >>That is too coarse-grained to use with our analysis, since we need to know which
>>> >>calls occur in between insn a and insn b, and more precisely which registers
>>> >>those calls clobbered.
>> >
>>> >>I wonder though if we can do something similar: we keep an array
>>> >>call_clobbers_num[FIRST_PSEUDO_REG], initialized at 0 when we start scanning.
>>> >>When encountering a call, we increase the call_clobbers_num entries for the hard
>>> >>registers clobbered by the call.
>>> >>When encountering a use, we set the call_clobbers_num field of the use to
>>> >>call_clobbers_num[reg_renumber[original_regno]].
>>> >>And when looking at a live-range segment, we compare the clobbers_num field of
>>> >>insn a and insn b, and if it is not equal, the hard register was clobbered by at
>>> >>least one call between insn a and insn b.
>>> >>Would that work? WDYT?
>>> >>
>> >As I understand you looked at live-range splitting code in
>> >lra-constraints.c.  To get necessary info you should look at ira-lives.c.
> Unfortunately I haven't been able to find time to work further on the LRA part.
> So if you're still willing to pick up that part, that would be great.
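
The counter scheme proposed in the quoted text above can be sketched in
isolation.  This is only a minimal model under stated assumptions, not LRA
code: NUM_HARD_REGS, struct clobber_counts and the three helper functions are
hypothetical names, and a 32-bit mask stands in for HARD_REG_SET.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for FIRST_PSEUDO_REGISTER.  */
#define NUM_HARD_REGS 32

/* One counter per hard register: the number of calls seen so far that
   clobber that register.  */
struct clobber_counts
{
  unsigned int num[NUM_HARD_REGS];
};

/* Record a call.  Bit N of CLOBBER_MASK set means the call clobbers hard
   register N.  */
static void
note_call (struct clobber_counts *c, uint32_t clobber_mask)
{
  for (int r = 0; r < NUM_HARD_REGS; r++)
    if (clobber_mask & ((uint32_t) 1 << r))
      c->num[r]++;
}

/* Value to store on an insn that defines or uses HARD_REGNO.  */
static unsigned int
snapshot (const struct clobber_counts *c, int hard_regno)
{
  assert (hard_regno >= 0 && hard_regno < NUM_HARD_REGS);
  return c->num[hard_regno];
}

/* True if at least one call between the two snapshots clobbered the
   register the snapshots were taken for.  */
static bool
clobbered_between_p (unsigned int snap_a, unsigned int snap_b)
{
  return snap_a != snap_b;
}
```

A call that clobbers only other registers leaves the snapshots for this
register equal, so no save/restore would be needed around it.  The sketch
ignores pseudos that occupy more than one hard register.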

Vladimir,

I gave this a try. The attached patch works for the included test-case for x86_64.
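
As a rough model of the attached patch: each pseudo accumulates the union of
the clobber sets of the calls it is live across, and the call-save check
consults that union, falling back to the default call-used set when the union
is empty.  The sketch below is a simplification under stated assumptions:
reg_set, struct pseudo_info and both functions are hypothetical stand-ins, and
the calls_num comparison and the HARD_REGNO_CALL_PART_CLOBBERED test from
need_for_call_save_p are left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for HARD_REG_SET: bit N is hard register N.  */
typedef uint32_t reg_set;

struct pseudo_info
{
  int hard_regno;            /* Hard register assigned to the pseudo.  */
  reg_set actual_call_used;  /* Union of clobber sets of crossed calls.  */
};

/* Called for each call insn that a live pseudo crosses.  */
static void
note_crossed_call (struct pseudo_info *p, reg_set call_clobbers)
{
  p->actual_call_used |= call_clobbers;
}

/* Decide whether P must be saved around calls.  An empty per-pseudo set is
   treated as "no information", so DEFAULT_CALL_USED is used instead.  */
static bool
needs_call_save_p (const struct pseudo_info *p, reg_set default_call_used,
		   bool use_caller_save)
{
  assert (p->hard_regno >= 0 && p->hard_regno < 32);
  reg_set set = (use_caller_save && p->actual_call_used != 0
		 ? p->actual_call_used
		 : default_call_used);
  return ((set >> p->hard_regno) & 1u) != 0;
}
```

With the option off, or with no recorded calls, the answer is the same as
before the patch; with the option on, a pseudo in a call-used register needs
no save as long as none of the crossed calls actually clobbers that register.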

I've bootstrapped and reg-tested the patch (in combination with the other 
patches from the series) on x86_64.

OK for stage1?

Thanks,
- Tom

[-- Attachment #2: fuse-caller-save-lra.patch --]
[-- Type: text/x-patch, Size: 6862 bytes --]

2013-12-04  Tom de Vries  <tom@codesourcery.com>

	* lra-int.h (struct lra_reg): Add field actual_call_used_reg_set.
	* lra.c (initialize_lra_reg_info_element): Add init of
	actual_call_used_reg_set field.
	(lra): Call lra_create_live_ranges before lra_inheritance for
	-fuse-caller-save.
	* lra-assigns.c (lra_assign): Allow call_used_regs to cross calls for
	-fuse-caller-save.
	* lra-constraints.c (need_for_call_save_p): Use actual_call_used_reg_set
	instead of call_used_reg_set for -fuse-caller-save.
	* lra-lives.c (process_bb_lives): Calculate actual_call_used_reg_set.

	* gcc.target/i386/fuse-caller-save.c: New test.
	* gcc.dg/ira-shrinkwrap-prep-1.c: Run with -fno-use-caller-save.

diff --git a/gcc/lra-assigns.c b/gcc/lra-assigns.c
index 88fc693..943b349 100644
--- a/gcc/lra-assigns.c
+++ b/gcc/lra-assigns.c
@@ -1413,6 +1413,7 @@ lra_assign (void)
   bitmap_head insns_to_process;
   bool no_spills_p;
   int max_regno = max_reg_num ();
+  unsigned int call_used_reg_crosses_call = 0;
 
   timevar_push (TV_LRA_ASSIGN);
   init_lives ();
@@ -1425,14 +1426,22 @@ lra_assign (void)
   bitmap_initialize (&all_spilled_pseudos, &reg_obstack);
   create_live_range_start_chains ();
   setup_live_pseudos_and_spill_after_risky_transforms (&all_spilled_pseudos);
-#ifdef ENABLE_CHECKING
   for (i = FIRST_PSEUDO_REGISTER; i < max_regno; i++)
     if (lra_reg_info[i].nrefs != 0 && reg_renumber[i] >= 0
 	&& lra_reg_info[i].call_p
 	&& overlaps_hard_reg_set_p (call_used_reg_set,
 				    PSEUDO_REGNO_MODE (i), reg_renumber[i]))
-      gcc_unreachable ();
-#endif
+      {
+	if (!flag_use_caller_save)
+	  gcc_unreachable ();
+	call_used_reg_crosses_call++;
+      }
+  if (lra_dump_file
+      && call_used_reg_crosses_call > 0)
+    fprintf (lra_dump_file,
+	     "Found %u pseudo(s) with a call used reg crossing a call.\n"
+	     "Allowing due to -fuse-caller-save\n",
+	     call_used_reg_crosses_call);
   /* Setup insns to process on the next constraint pass.  */
   bitmap_initialize (&changed_pseudo_bitmap, &reg_obstack);
   init_live_reload_and_inheritance_pseudos ();
diff --git a/gcc/lra-constraints.c b/gcc/lra-constraints.c
index bb5242a..d0939dc 100644
--- a/gcc/lra-constraints.c
+++ b/gcc/lra-constraints.c
@@ -4438,7 +4438,10 @@ need_for_call_save_p (int regno)
   lra_assert (regno >= FIRST_PSEUDO_REGISTER && reg_renumber[regno] >= 0);
   return (usage_insns[regno].calls_num < calls_num
 	  && (overlaps_hard_reg_set_p
-	      (call_used_reg_set,
+	      ((flag_use_caller_save &&
+		! hard_reg_set_empty_p (lra_reg_info[regno].actual_call_used_reg_set))
+	       ? lra_reg_info[regno].actual_call_used_reg_set
+	       : call_used_reg_set,
 	       PSEUDO_REGNO_MODE (regno), reg_renumber[regno])
 	      || HARD_REGNO_CALL_PART_CLOBBERED (reg_renumber[regno],
 						 PSEUDO_REGNO_MODE (regno))));
diff --git a/gcc/lra-int.h b/gcc/lra-int.h
index 6d8d80f..f2b8079 100644
--- a/gcc/lra-int.h
+++ b/gcc/lra-int.h
@@ -77,6 +77,10 @@ struct lra_reg
   /* The following fields are defined only for pseudos.	 */
   /* Hard registers with which the pseudo conflicts.  */
   HARD_REG_SET conflict_hard_regs;
+  /* Call used registers with which the pseudo conflicts, taking into account
+     the registers used by functions called from calls which cross the
+     pseudo.  */
+  HARD_REG_SET actual_call_used_reg_set;
   /* We assign hard registers to reload pseudos which can occur in few
      places.  So two hard register preferences are enough for them.
      The following fields define the preferred hard registers.	If
diff --git a/gcc/lra-lives.c b/gcc/lra-lives.c
index efc19f2..774d6c2 100644
--- a/gcc/lra-lives.c
+++ b/gcc/lra-lives.c
@@ -624,6 +624,17 @@ process_bb_lives (basic_block bb, int &curr_point)
 
       if (call_p)
 	{
+	  if (flag_use_caller_save)
+	    {
+	      HARD_REG_SET this_call_used_reg_set;
+	      get_call_reg_set_usage (curr_insn, &this_call_used_reg_set,
+				      call_used_reg_set);
+
+	      EXECUTE_IF_SET_IN_SPARSESET (pseudos_live, j)
+		IOR_HARD_REG_SET (lra_reg_info[j].actual_call_used_reg_set,
+				  this_call_used_reg_set);
+	    }
+
 	  sparseset_ior (pseudos_live_through_calls,
 			 pseudos_live_through_calls, pseudos_live);
 	  if (cfun->has_nonlocal_label
diff --git a/gcc/lra.c b/gcc/lra.c
index d0d9bcb..599f95a 100644
--- a/gcc/lra.c
+++ b/gcc/lra.c
@@ -1427,6 +1427,7 @@ initialize_lra_reg_info_element (int i)
   lra_reg_info[i].no_stack_p = false;
 #endif
   CLEAR_HARD_REG_SET (lra_reg_info[i].conflict_hard_regs);
+  CLEAR_HARD_REG_SET (lra_reg_info[i].actual_call_used_reg_set);
   lra_reg_info[i].preferred_hard_regno1 = -1;
   lra_reg_info[i].preferred_hard_regno2 = -1;
   lra_reg_info[i].preferred_hard_regno_profit1 = 0;
@@ -2343,7 +2344,18 @@ lra (FILE *f)
 	  lra_eliminate (false, false);
 	  /* Do inheritance only for regular algorithms.  */
 	  if (! lra_simple_p)
-	    lra_inheritance ();
+	    {
+	      if (flag_use_caller_save)
+		{
+		  if (live_p)
+		    lra_clear_live_ranges ();
+		  /* As a side-effect of lra_create_live_ranges, we calculate
+		     actual_call_used_reg_set, which is needed during
+		     lra_inheritance.  */
+		  lra_create_live_ranges (true);
+		}
+	      lra_inheritance ();
+	    }
 	  if (live_p)
 	    lra_clear_live_ranges ();
 	  /* We need live ranges for lra_assign -- so build them.  */
diff --git a/gcc/testsuite/gcc.dg/ira-shrinkwrap-prep-1.c b/gcc/testsuite/gcc.dg/ira-shrinkwrap-prep-1.c
index 54d3e76..a386fab 100644
--- a/gcc/testsuite/gcc.dg/ira-shrinkwrap-prep-1.c
+++ b/gcc/testsuite/gcc.dg/ira-shrinkwrap-prep-1.c
@@ -1,5 +1,5 @@
 /* { dg-do compile { target { { x86_64-*-* && lp64 } || { powerpc*-*-* && lp64 } } } } */
-/* { dg-options "-O3 -fdump-rtl-ira -fdump-rtl-pro_and_epilogue"  } */
+/* { dg-options "-O3 -fdump-rtl-ira -fdump-rtl-pro_and_epilogue -fno-use-caller-save"  } */
 
 long __attribute__((noinline, noclone))
 foo (long a)
diff --git a/gcc/testsuite/gcc.target/i386/fuse-caller-save.c b/gcc/testsuite/gcc.target/i386/fuse-caller-save.c
new file mode 100644
index 0000000..c5d620c
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/fuse-caller-save.c
@@ -0,0 +1,26 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -fuse-caller-save -fdump-rtl-reload" } */
+/* Testing -fuse-caller-save optimization option.  */
+
+static int __attribute__((noinline))
+bar (int x)
+{
+  return x + 3;
+}
+
+int __attribute__((noinline))
+foo (int y)
+{
+  return y + bar (y);
+}
+
+int
+main (void)
+{
+  return !(foo (5) == 13);
+}
+
+/* { dg-final { scan-rtl-dump-times "Found 1 pseudo.* with a call used reg crossing a call" 1 "reload" } } */
+/* { dg-final { scan-rtl-dump-times "Found .* pseudo.* with a call used reg crossing a call" 1 "reload" } } */
+/* { dg-final { scan-rtl-dump-times "Allowing due to -fuse-caller-save" 1 "reload" } } */
+/* { dg-final { cleanup-rtl-dump "reload" } } */


* Re: [PATCH][05/10] -fuse-caller-save - Implement TARGET_FN_OTHER_HARD_REG_USAGE hook for ARM
  2013-03-30 17:11             ` [PATCH][05/10] -fuse-caller-save - Implement TARGET_FN_OTHER_HARD_REG_USAGE hook for ARM Tom de Vries
@ 2013-12-06  0:54               ` Tom de Vries
  2013-12-09 10:03                 ` Richard Earnshaw
  0 siblings, 1 reply; 59+ messages in thread
From: Tom de Vries @ 2013-12-06  0:54 UTC (permalink / raw)
  To: Richard Earnshaw; +Cc: gcc-patches

On 30-03-13 18:11, Tom de Vries wrote:
> Richard,
>
> This patch series adds analysis of register usage of functions for usage by IRA.
> The original post is here
> ( http://gcc.gnu.org/ml/gcc-patches/2013-01/msg01234.html ).
>
> This patch implements the target hook TARGET_FN_OTHER_HARD_REG_USAGE for ARM.
> The target hook TARGET_FN_OTHER_HARD_REG_USAGE was introduced in the previous
> patch in this patch series.
>
> Build and reg-tested on ARM.
>
> OK for trunk?
>

Richard,

Ping. OK for stage1?

Thanks,
- Tom

> Thanks,
>    -Tom
>
> 2013-03-30  Radovan Obradovic  <robradovic@mips.com>
>              Tom de Vries  <tom@codesourcery.com>
>
> 	* config/arm/arm.c (TARGET_FN_OTHER_HARD_REG_USAGE): Redefine as
> 	arm_fn_other_hard_reg_usage.
> 	(arm_fn_other_hard_reg_usage): New function.
>


* Re: [PATCH][07/10] -fuse-caller-save - Use collected register usage information
  2013-03-30 17:12             ` [PATCH][07/10] -fuse-caller-save - Use collected register usage information Tom de Vries
@ 2013-12-06  0:56               ` Tom de Vries
  2013-12-06  9:11                 ` Paolo Bonzini
  0 siblings, 1 reply; 59+ messages in thread
From: Tom de Vries @ 2013-12-06  0:56 UTC (permalink / raw)
  To: Tom de Vries, Paolo Bonzini; +Cc: gcc-patches

On 30-03-13 18:11, Tom de Vries wrote:
> Paolo,
>
> This patch series adds analysis of register usage of functions for usage by IRA.
> The original post is here
> ( http://gcc.gnu.org/ml/gcc-patches/2013-01/msg01234.html ).
>
> This patch uses the information of which registers are clobbered by a call
> in IRA and df-scan.
>
> Bootstrapped and reg-tested on x86_64, Ada inclusive. Build and reg-tested on
> mips, arm, ppc and sh.
>
> Can you approve the df-scan part for trunk?
>

Paolo,

Ping. df-scan part OK for stage1?

Thanks,
- Tom

> Thanks,
>    -Tom
>
> 2013-03-30  Radovan Obradovic  <robradovic@mips.com>
>              Tom de Vries  <tom@codesourcery.com>
>
> 	* df-scan.c (df_get_call_refs): Use get_call_reg_set_usage.
> 	* caller-save.c (setup_save_areas, save_call_clobbered_regs): Use
> 	get_call_reg_set_usage.
> 	* resource.c (mark_set_resources, mark_target_live_regs): Use
> 	get_call_reg_set_usage.
> 	* ira-int.h (struct ira_allocno): Add crossed_calls_clobbered_regs
> 	field.
> 	(ALLOCNO_CROSSED_CALLS_CLOBBERED_REGS): Define.
> 	* ira-lives.c (process_bb_node_lives): Use get_call_reg_set_usage.
> 	Calculate ALLOCNO_CROSSED_CALLS_CLOBBERED_REGS.
> 	* ira-build.c (ira_create_allocno): Init
> 	ALLOCNO_CROSSED_CALLS_CLOBBERED_REGS.
> 	(create_cap_allocno, propagate_allocno_info)
> 	(propagate_some_info_from_allocno)
> 	(copy_info_to_removed_store_destinations): Handle
> 	ALLOCNO_CROSSED_CALLS_CLOBBERED_REGS.
> 	* ira-costs.c (ira_tune_allocno_costs): Use
> 	ALLOCNO_CROSSED_CALLS_CLOBBERED_REGS to adjust costs.
>


* Re: [PATCH][10/10] -fuse-caller-save - Add test-case
  2013-12-06  0:34                 ` Tom de Vries
@ 2013-12-06  8:51                   ` Richard Sandiford
  0 siblings, 0 replies; 59+ messages in thread
From: Richard Sandiford @ 2013-12-06  8:51 UTC (permalink / raw)
  To: Tom de Vries; +Cc: Tom de Vries, gcc-patches

Tom de Vries <Tom_deVries@mentor.com> writes:
> On 27-04-13 12:01, Richard Sandiford wrote:
>> Tom de Vries <tom@codesourcery.com> writes:
>>> +/* { dg-do run } */
>>> +/* { dg-options "-fuse-caller-save -save-temps" } */
>>> +/* { dg-skip-if "" { *-*-* }  { "*" } { "-Os" } } */
>>> +/* Testing -fuse-caller-save optimization option.  */
>>> +
>>> +static int __attribute__((noinline))
>>> +bar (int x)
>>> +{
>>> +  return x + 3;
>>> +}
>>> +
>>> +int __attribute__((noinline))
>>> +foo (int y)
>>> +{
>>> +  return y + bar (y);
>>> +}
>>> +
>>> +int
>>> +main (void)
>>> +{
>>> +  return !(foo (5) == 13);
>>> +}
>>> +
>>> +/* Check that there are only 2 stack-saves: r31 in main and foo.  */
>>> +
>>> +/* Variant not mips16.  Check that there only 2 sw/sd.  */
>>> +/* { dg-final { scan-assembler-times
>>> "(?n)s\[wd\]\t\\\$.*,.*\\(\\\$sp\\)" 2 { target { ! mips16 } } } } */
>>> +
>>> +/* Variant not mips16, Subvariant micromips.  Additionally check there's no
>>> +   swm.  */
>>> +/* { dg-final { scan-assembler-times
>>> "(?n)swm\t\\\$.*,.*\\(\\\$sp\\)" 0 {target micromips } } } */
>>> +
>>> +/* Variant mips16.  The save can save 1 or more registers, check
>>> that only 1 is
>>> +   saved, twice in total.  */
>>> +/* { dg-final { scan-assembler-times
>>> "(?n)save\t\[0-9\]*,\\\$\[^,\]*\$" 2 { target mips16 } } } */
>>> +
>>> +/* Check that the first caller-save register is unused.  */
>>> +/* { dg-final { scan-assembler-not "(\\\$16)" } } */
>>
>> Sorry to ask, but I think it would be better to split this up into
>> a compile test and a run test.  The run test shouldn't be skipped at -Os.
>> It should probably also go somewhere more general than gcc.target/mips.
>>
>
> Richard,
>
> Done. There's now run test gcc.dg/fuse-caller-save.c and compile test 
> gcc.target/mips/fuse-caller-save.c.
>
>> I've tried to avoid conditional scan-assemblers in gcc.target/mips
>> wherever possible.  The directory has been set up so that you can force
>> any subtarget you like, so that (for example) -mips16 output is tested
>> by every run, not just things like mips-sim/-mips16.
>>
>> In this case I think that means using NOCOMPRESSION to force the
>> functions to use the standard ISA encoding and making the scan-assemblers
>> test only for that.
>
> Done.
>
>> Since you've already done the work :-), bonus points
>> for creating two copies, one for micromips (dg-options "-micromips ...")
>> and one for mips16 (dg-options "-mips16 ...").  That's certainly not a
>> requirement though.
>>
>
> I've left that out for now.

OK for the MIPS part, except:

> +/* Check that the first caller-save register is unused.  */
> +/* { dg-final { scan-assembler-not "(\\\$16)" } } */

the (...) regexp grouping seems redundant here.

Thanks,
Richard


* Re: [PATCH][07/10] -fuse-caller-save - Use collected register usage information
  2013-12-06  0:56               ` Tom de Vries
@ 2013-12-06  9:11                 ` Paolo Bonzini
  0 siblings, 0 replies; 59+ messages in thread
From: Paolo Bonzini @ 2013-12-06  9:11 UTC (permalink / raw)
  To: Tom de Vries; +Cc: Tom de Vries, gcc-patches

Il 06/12/2013 01:56, Tom de Vries ha scritto:
>>
>>
>> This patch series adds analysis of register usage of functions for
>> usage by IRA.
>> The original post is here
>> ( http://gcc.gnu.org/ml/gcc-patches/2013-01/msg01234.html ).
>>
>> This patch uses the information of which registers are clobbered by a
>> call
>> in IRA and df-scan.
>>
>> Bootstrapped and reg-tested on x86_64, Ada inclusive. Build and
>> reg-tested on
>> mips, arm, ppc and sh.
>>
>> Can you approve the df-scan part for trunk?
>>
> 
> Paolo,
> 
> Ping. df-scan part OK for stage1?

Yes, thanks.

Paolo


* [PATCH] -fuse-caller-save - Implement TARGET_FN_OTHER_HARD_REG_USAGE hook for MIPS
  2013-03-30 17:11             ` [PATCH][04/10] -fuse-caller-save - Add TARGET_FN_OTHER_HARD_REG_USAGE hook Tom de Vries
@ 2013-12-07 15:07               ` Tom de Vries
  2013-12-25 13:02                 ` Tom de Vries
  0 siblings, 1 reply; 59+ messages in thread
From: Tom de Vries @ 2013-12-07 15:07 UTC (permalink / raw)
  To: rdsandiford; +Cc: Vladimir Makarov, gcc-patches

[-- Attachment #1: Type: text/plain, Size: 291 bytes --]

Richard,

This patch implements the target hook TARGET_FN_OTHER_HARD_REG_USAGE (posted
here: http://gcc.gnu.org/ml/gcc-patches/2013-03/msg01318.html) for MIPS, to
address the issue that $6 is used as a temporary when a call is split after
register allocation, which the unsplit call insn does not record as a clobber.
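
The effect of the hook can be illustrated with a small standalone model.  It
is only a sketch under stated assumptions: struct reg_set_container is a
hypothetical stand-in for struct hard_reg_set_container, the boolean parameter
stands in for the TARGET_EXPLICIT_RELOCS && TARGET_CALL_CLOBBERED_GP condition,
and GP_ARG_FIRST is taken to be 4 so that POST_CALL_TMP_REG is $6.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed values: $4 is the first GP argument register, so the post-call
   temporary is $6.  */
#define GP_ARG_FIRST 4
#define POST_CALL_TMP_REG (GP_ARG_FIRST + 2)

/* Hypothetical stand-in for struct hard_reg_set_container: bit N of SET is
   hard register N.  */
struct reg_set_container
{
  uint32_t set;
};

/* Model of the hook: if calls to the function may be split after register
   allocation, report the post-call temporary as clobbered in addition to
   whatever the function body itself clobbers.  */
static void
model_fn_other_hard_reg_usage (struct reg_set_container *fn_used_regs,
			       bool calls_may_be_split)
{
  assert (fn_used_regs != 0);
  if (calls_may_be_split)
    fn_used_regs->set |= (uint32_t) 1 << POST_CALL_TMP_REG;
}
```

Registers already recorded for the function body are left untouched; the hook
only adds the register that a later split of the call would clobber.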

Built and reg-tested on MIPS.

OK for stage1?

Thanks,
   - Tom


[-- Attachment #2: fuse-caller-save-mips.patch --]
[-- Type: text/x-patch, Size: 3399 bytes --]

2013-11-12  Chung-Lin Tang  <cltang@codesourcery.com>
            Tom de Vries  <tom@codesourcery.com>

	* config/mips/mips.c (POST_CALL_TMP_REG): Define.
	(mips_split_call): Use POST_CALL_TMP_REG.
	(mips_fn_other_hard_reg_usage): New function.
	(TARGET_FN_OTHER_HARD_REG_USAGE): Define targhook using new function.

diff --git a/gcc/config/mips/mips.c b/gcc/config/mips/mips.c
index 36ba6df..3f60f5b 100644
--- a/gcc/config/mips/mips.c
+++ b/gcc/config/mips/mips.c
@@ -175,6 +175,11 @@ along with GCC; see the file COPYING3.  If not see
 /* Return the usual opcode for a nop.  */
 #define MIPS_NOP 0
 
+/* Temporary register that is used after a call, and suitable for both
+   MIPS16 and non-MIPS16 code.  $4 and $5 are used for returning complex double
+   values in soft-float code, so $6 is the first suitable candidate.  */
+#define POST_CALL_TMP_REG (GP_ARG_FIRST + 2)
+
 /* Classifies an address.
 
    ADDRESS_REG
@@ -6990,10 +6995,8 @@ mips_split_call (rtx insn, rtx call_pattern)
 {
   emit_call_insn (call_pattern);
   if (!find_reg_note (insn, REG_NORETURN, 0))
-    /* Pick a temporary register that is suitable for both MIPS16 and
-       non-MIPS16 code.  $4 and $5 are used for returning complex double
-       values in soft-float code, so $6 is the first suitable candidate.  */
-    mips_restore_gp_from_cprestore_slot (gen_rtx_REG (Pmode, GP_ARG_FIRST + 2));
+    mips_restore_gp_from_cprestore_slot (gen_rtx_REG (Pmode,
+						      POST_CALL_TMP_REG));
 }
 
 /* Return true if a call to DECL may need to use JALX.  */
@@ -18699,6 +18702,32 @@ mips_case_values_threshold (void)
   else
     return default_case_values_threshold ();
 }
+
+/* Implement TARGET_FN_OTHER_HARD_REG_USAGE.  */
+
+static void
+mips_fn_other_hard_reg_usage (struct hard_reg_set_container *fn_used_regs)
+{
+  /* POST_CALL_TMP_REG is used in splitting calls after register allocation.
+     With -fno-use-caller-save, the register is available because register
+     allocation ensures that members of call_used_regs are not live across
+     calls.
+     With -fuse-caller-save that's not the case, so we're missing a clobber on
+     the unsplit call insn to tell register allocation that the register is used
+     by the split call insn(s) after register allocation (we don't need the
+     clobber for a non-returning call, but we don't expect there will be a
+     penalty if we add the clobber for both returning and non-returning calls).
+
+     For the sake of simplicity we don't add the individual clobbers, but we use
+     this hook to mark the reg as clobbered.  This is a bit ugly, since this
+     hook is called during the final pass on a function, and we're expressing
+     here that the insn after a call to this function will clobber a register.
+
+     The condition is the pass-independent part of TARGET_SPLIT_CALLS.  */
+  if (TARGET_EXPLICIT_RELOCS
+      && TARGET_CALL_CLOBBERED_GP)
+    SET_HARD_REG_BIT (fn_used_regs->set, POST_CALL_TMP_REG);
+}
 \f
 /* Initialize the GCC target structure.  */
 #undef TARGET_ASM_ALIGNED_HI_OP
@@ -18933,6 +18962,9 @@ mips_case_values_threshold (void)
 #undef TARGET_CASE_VALUES_THRESHOLD
 #define TARGET_CASE_VALUES_THRESHOLD mips_case_values_threshold
 
+#undef TARGET_FN_OTHER_HARD_REG_USAGE
+#define TARGET_FN_OTHER_HARD_REG_USAGE mips_fn_other_hard_reg_usage
+
 struct gcc_target targetm = TARGET_INITIALIZER;
 \f
 #include "gt-mips.h"


* Re: [PATCH][05/10] -fuse-caller-save - Implement TARGET_FN_OTHER_HARD_REG_USAGE hook for ARM
  2013-12-06  0:54               ` Tom de Vries
@ 2013-12-09 10:03                 ` Richard Earnshaw
  0 siblings, 0 replies; 59+ messages in thread
From: Richard Earnshaw @ 2013-12-09 10:03 UTC (permalink / raw)
  To: Tom de Vries; +Cc: gcc-patches

On 06/12/13 00:54, Tom de Vries wrote:
> On 30-03-13 18:11, Tom de Vries wrote:
>> Richard,
>>
>> This patch series adds analysis of register usage of functions for usage by IRA.
>> The original post is here
>> ( http://gcc.gnu.org/ml/gcc-patches/2013-01/msg01234.html ).
>>
>> This patch implements the target hook TARGET_FN_OTHER_HARD_REG_USAGE for ARM.
>> The target hook TARGET_FN_OTHER_HARD_REG_USAGE was introduced in the previous
>> patch in this patch series.
>>
>> Build and reg-tested on ARM.
>>
>> OK for trunk?
>>
> 
> Richard,
> 
> Ping. OK for stage1?
> 
> Thanks,
> - Tom
> 
>> Thanks,
>>    -Tom
>>
>> 2013-03-30  Radovan Obradovic  <robradovic@mips.com>
>>              Tom de Vries  <tom@codesourcery.com>
>>
>> 	* config/arm/arm.c (TARGET_FN_OTHER_HARD_REG_USAGE): Redefine as
>> 	arm_fn_other_hard_reg_usage.
>> 	(arm_fn_other_hard_reg_usage): New function.
>>
> 
> 

OK.

R.


* Re: [PATCH] -fuse-caller-save - Implement TARGET_FN_OTHER_HARD_REG_USAGE hook for MIPS
  2013-12-07 15:07               ` [PATCH] -fuse-caller-save - Implement TARGET_FN_OTHER_HARD_REG_USAGE hook for MIPS Tom de Vries
@ 2013-12-25 13:02                 ` Tom de Vries
  2014-01-09 13:51                   ` [PING^2][PATCH] " Tom de Vries
  0 siblings, 1 reply; 59+ messages in thread
From: Tom de Vries @ 2013-12-25 13:02 UTC (permalink / raw)
  To: rdsandiford; +Cc: Vladimir Makarov, gcc-patches

On 07-12-13 16:07, Tom de Vries wrote:
> Richard,
>
> This patch implements the target hook TARGET_FN_OTHER_HARD_REG_USAGE (posted
> here: http://gcc.gnu.org/ml/gcc-patches/2013-03/msg01318.html) for MIPS, to
> address the issue that $6 is sometimes used in split calls.
>
> Build and reg-tested on MIPS.
>
> OK for stage1?
>

Richard,

Ping.

This patch was submitted here ( 
http://gcc.gnu.org/ml/gcc-patches/2013-12/msg00771.html ) and is required for 
the -fuse-caller-save optimization which was submitted here ( 
http://gcc.gnu.org/ml/gcc-patches/2013-01/msg01234.html ).

The patch fixes a correctness issue with -fuse-caller-save for MIPS.

OK for stage1?

Thanks,
- Tom

> Thanks,
>    - Tom
>


* [PING^2][PATCH] -fuse-caller-save - Implement TARGET_FN_OTHER_HARD_REG_USAGE hook for MIPS
  2013-12-25 13:02                 ` Tom de Vries
@ 2014-01-09 13:51                   ` Tom de Vries
  2014-01-09 15:31                     ` Richard Sandiford
  0 siblings, 1 reply; 59+ messages in thread
From: Tom de Vries @ 2014-01-09 13:51 UTC (permalink / raw)
  To: rdsandiford; +Cc: Vladimir Makarov, gcc-patches

On 25/12/13 14:02, Tom de Vries wrote:
> On 07-12-13 16:07, Tom de Vries wrote:
>> Richard,
>>
>> This patch implements the target hook TARGET_FN_OTHER_HARD_REG_USAGE (posted
>> here: http://gcc.gnu.org/ml/gcc-patches/2013-03/msg01318.html) for MIPS, to
>> address the issue that $6 is sometimes used in split calls.
>>
>> Build and reg-tested on MIPS.
>>
>> OK for stage1?
>>
> 

Richard,

Ping.

This patch is the only part of -fuse-caller-save that still needs approval.

> This patch was submitted here ( 
> http://gcc.gnu.org/ml/gcc-patches/2013-12/msg00771.html ) and is required for 
> the -fuse-caller-save optimization which was submitted here ( 
> http://gcc.gnu.org/ml/gcc-patches/2013-01/msg01234.html ).
> 
> The patch fixes a correctness issue with -fuse-caller-save for MIPS.
> 
> OK for stage1?
> 

Thanks,
- Tom


* Re: [PATCH][IRA] Analysis of register usage of functions for usage by IRA.
  2013-03-30 16:10             ` [PATCH][IRA] Analysis of register usage of functions for usage by IRA Tom de Vries
@ 2014-01-09 14:42               ` Richard Earnshaw
  2014-01-09 20:56                 ` Tom de Vries
  0 siblings, 1 reply; 59+ messages in thread
From: Richard Earnshaw @ 2014-01-09 14:42 UTC (permalink / raw)
  To: Tom de Vries
  Cc: Vladimir Makarov, Richard Sandiford, Paolo Bonzini, gcc-patches

On 30/03/13 16:10, Tom de Vries wrote:
> On 29/03/13 13:54, Tom de Vries wrote:
>> I split the patch up into 10 patches, to facilitate further review:
>> ...
>> 0001-Add-command-line-option.patch
>> 0002-Add-new-reg-note-REG_CALL_DECL.patch
>> 0003-Add-implicit-parameter-to-find_all_hard_reg_sets.patch
>> 0004-Add-TARGET_FN_OTHER_HARD_REG_USAGE-hook.patch
>> 0005-Implement-TARGET_FN_OTHER_HARD_REG_USAGE-hook-for-ARM.patch
>> 0006-Collect-register-usage-information.patch
>> 0007-Use-collected-register-usage-information.patch
>> 0008-Enable-by-default-at-O2-and-higher.patch
>> 0009-Add-documentation.patch
>> 0010-Add-test-case.patch
>> ...
>> I'll post these in reply to this email.
>>
> 
> Something went wrong with those emails, which were generated.
> 
> I tested the emails by sending them to my work email, where they looked fine.
> I managed to reproduce the problem by sending them to my private email.
> It seems the problem was inconsistent EOL format.
> 
> I've written a python script to handle composing the email, and posted it here
> using that script: http://gcc.gnu.org/ml/gcc-patches/2013-03/msg01311.html.
> Given that that email looks ok, I think I've addressed the problems now.
> 
> I'll repost the patches. Sorry about the noise.
> 
> Thanks,
> - Tom
> 
> 

It's unfortunate that this feature doesn't fail safe when a port has not
explicitly defined what should happen.

Consequently, you'll need to add a patch for AArch64 which has two
registers clobbered by PLT-based calls.

R.


* Re: [PING^2][PATCH] -fuse-caller-save - Implement TARGET_FN_OTHER_HARD_REG_USAGE hook for MIPS
  2014-01-09 13:51                   ` [PING^2][PATCH] " Tom de Vries
@ 2014-01-09 15:31                     ` Richard Sandiford
  2014-01-09 23:43                       ` Tom de Vries
  0 siblings, 1 reply; 59+ messages in thread
From: Richard Sandiford @ 2014-01-09 15:31 UTC (permalink / raw)
  To: Tom de Vries; +Cc: Vladimir Makarov, gcc-patches

Tom de Vries <Tom_deVries@mentor.com> writes:
> On 25/12/13 14:02, Tom de Vries wrote:
>> On 07-12-13 16:07, Tom de Vries wrote:
>>> Richard,
>>>
>>> This patch implements the target hook TARGET_FN_OTHER_HARD_REG_USAGE (posted
>>> here: http://gcc.gnu.org/ml/gcc-patches/2013-03/msg01318.html) for MIPS, to
>>> address the issue that $6 is sometimes used in split calls.
>>>
>>> Build and reg-tested on MIPS.
>>>
>>> OK for stage1?
>>>
>> 
>
> Richard,
>
> Ping.
>
> This patch is the only part of -fuse-caller-save that still needs approval.

Hmm, where were parts 4 and 6 approved?  Was looking for the discussion
in the hope that it would answer the question I don't really understand,
which is: this hook is only used during final, is that right?  And the
clobber that you're adding is exposed at the rtl level.  So why do we
need the hook at all?  Why not just collect the usage information at
the end of final rather than at the beginning, so that all splits during
final have been done?  For other cases (where the usage isn't explicit
at the rtl level), why not record the usage in CALL_INSN_FUNCTION_USAGE
instead?

Thanks,
Richard


* Re: [PATCH][IRA] Analysis of register usage of functions for usage by IRA.
  2014-01-09 14:42               ` Richard Earnshaw
@ 2014-01-09 20:56                 ` Tom de Vries
  2014-01-09 21:10                   ` Andi Kleen
  2014-01-10 11:39                   ` Richard Earnshaw
  0 siblings, 2 replies; 59+ messages in thread
From: Tom de Vries @ 2014-01-09 20:56 UTC (permalink / raw)
  To: Richard Earnshaw
  Cc: Vladimir Makarov, Richard Sandiford, Paolo Bonzini, gcc-patches

[-- Attachment #1: Type: text/plain, Size: 1926 bytes --]

On 09-01-14 15:41, Richard Earnshaw wrote:
> On 30/03/13 16:10, Tom de Vries wrote:
>> On 29/03/13 13:54, Tom de Vries wrote:
>>> I split the patch up into 10 patches, to facilitate further review:
>>> ...
>>> 0001-Add-command-line-option.patch
>>> 0002-Add-new-reg-note-REG_CALL_DECL.patch
>>> 0003-Add-implicit-parameter-to-find_all_hard_reg_sets.patch
>>> 0004-Add-TARGET_FN_OTHER_HARD_REG_USAGE-hook.patch
>>> 0005-Implement-TARGET_FN_OTHER_HARD_REG_USAGE-hook-for-ARM.patch
>>> 0006-Collect-register-usage-information.patch
>>> 0007-Use-collected-register-usage-information.patch
>>> 0008-Enable-by-default-at-O2-and-higher.patch
>>> 0009-Add-documentation.patch
>>> 0010-Add-test-case.patch
>>> ...
>>> I'll post these in reply to this email.
>>>
>>
>> Something went wrong with those emails, which were generated.
>>
>> I tested the emails by sending them to my work email, where they looked fine.
>> I managed to reproduce the problem by sending them to my private email.
>> It seems the problem was inconsistent EOL format.
>>
>> I've written a python script to handle composing the email, and posted it here
>> using that script: http://gcc.gnu.org/ml/gcc-patches/2013-03/msg01311.html.
>> Given that that email looks ok, I think I've addressed the problems now.
>>
>> I'll repost the patches. Sorry about the noise.
>>
>> Thanks,
>> - Tom
>>
>>
>
> It's unfortunate that this feature doesn't fail safe when a port has not
> explicitly defined what should happen.
>

Richard,

Attached tentative patch (an update of patch 4 in the series) changes the hook 
in the way you propose.

Is this patch OK for stage1 (after proper retesting)?
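
As a rough illustration of the fail-safe behaviour proposed here, the sketch
below shows how a user of the hook might fall back to the conservative
call-clobbered set when the hook returns false.  Everything in it is
hypothetical: reg_set_t, the hook typedef, collect_call_clobbers and the
example target hook are toy stand-ins, not GCC's data structures, and the
attached patch itself only changes the hook's return type and default.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Toy model: one bit per hard register (not GCC's HARD_REG_SET).  */
typedef uint32_t reg_set_t;

/* Hypothetical shape of the hook: add extra clobbers to *REGS and return
   true, or return false if the target cannot tell.  */
typedef bool (*other_reg_usage_hook) (reg_set_t *regs);

/* Models the default version of the hook: it knows nothing.  */
static bool
default_hook (reg_set_t *regs)
{
  (void) regs;
  return false;
}

/* Models a target whose calls may clobber two extra registers, here
   numbered 16 and 17 purely for illustration.  */
static bool
example_target_hook (reg_set_t *regs)
{
  *regs |= (1u << 16) | (1u << 17);
  return true;
}

/* Return the registers a call is assumed to clobber.  If the hook cannot
   answer, fall back to CONSERVATIVE_REGS instead of trusting BODY_REGS.  */
static reg_set_t
collect_call_clobbers (reg_set_t body_regs, reg_set_t conservative_regs,
		       other_reg_usage_hook hook)
{
  reg_set_t regs = body_regs;

  assert (hook != NULL);
  if (!hook (&regs))
    return conservative_regs;
  return regs;
}
```

With the default hook the result is always the conservative set, so a port
that has not defined the hook simply gets no benefit from the optimization.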

> Consequently, you'll need to add a patch for AArch64 which has two
> registers clobbered by PLT-based calls.
>

Thanks for pointing that out. That's r16 and r17, right? I can propose the hook 
for AArch64, once we all agree on how the hook should look.

Thanks,
- Tom

> R.
>


[-- Attachment #2: fuse-caller-save-hook.patch --]
[-- Type: text/x-patch, Size: 4725 bytes --]

2013-04-29  Radovan Obradovic  <robradovic@mips.com>
            Tom de Vries  <tom@codesourcery.com>

	* hooks.c (hook_bool_hard_reg_set_containerp_false): New function.
	* hooks.h (hook_bool_hard_reg_set_containerp_false): Declare.
	* target.def (fn_other_hard_reg_usage): New DEFHOOK.
	* doc/tm.texi.in (@node Stack and Calling): Add Miscellaneous Register
	Hooks to @menu.
	(@node Miscellaneous Register Hooks): New node.
	(@hook TARGET_FN_OTHER_HARD_REG_USAGE): New hook.
	* doc/tm.texi: Regenerate.

diff --git a/gcc/doc/tm.texi b/gcc/doc/tm.texi
index f204936..1bae6bb 100644
--- a/gcc/doc/tm.texi
+++ b/gcc/doc/tm.texi
@@ -3091,6 +3091,7 @@ This describes the stack layout and calling conventions.
 * Profiling::
 * Tail Calls::
 * Stack Smashing Protection::
+* Miscellaneous Register Hooks::
 @end menu
 
 @node Frame Layout
@@ -5016,6 +5017,14 @@ normally defined in @file{libgcc2.c}.
 Whether this target supports splitting the stack when the options described in @var{opts} have been passed.  This is called after options have been parsed, so the target may reject splitting the stack in some configurations.  The default version of this hook returns false.  If @var{report} is true, this function may issue a warning or error; if @var{report} is false, it must simply return a value
 @end deftypefn
 
+@node Miscellaneous Register Hooks
+@subsection Miscellaneous register hooks
+@cindex miscellaneous register hooks
+
+@deftypefn {Target Hook} bool TARGET_FN_OTHER_HARD_REG_USAGE (struct hard_reg_set_container *@var{regs})
+Add any hard registers to @var{regs} that are set or clobbered by a call to the function.  This hook only needs to add registers that cannot be found by examination of the final RTL representation of a function.  This hook returns true if it managed to determine which registers need to be added.  The default version of this hook returns false.
+@end deftypefn
+
 @node Varargs
 @section Implementing the Varargs Macros
 @cindex varargs implementation
diff --git a/gcc/doc/tm.texi.in b/gcc/doc/tm.texi.in
index 50f412c..bf75446 100644
--- a/gcc/doc/tm.texi.in
+++ b/gcc/doc/tm.texi.in
@@ -2720,6 +2720,7 @@ This describes the stack layout and calling conventions.
 * Profiling::
 * Tail Calls::
 * Stack Smashing Protection::
+* Miscellaneous Register Hooks::
 @end menu
 
 @node Frame Layout
@@ -3985,6 +3986,12 @@ the function prologue.  Normally, the profiling code comes after.
 
 @hook TARGET_SUPPORTS_SPLIT_STACK
 
+@node Miscellaneous Register Hooks
+@subsection Miscellaneous register hooks
+@cindex miscellaneous register hooks
+
+@hook TARGET_FN_OTHER_HARD_REG_USAGE
+
 @node Varargs
 @section Implementing the Varargs Macros
 @cindex varargs implementation
diff --git a/gcc/hooks.c b/gcc/hooks.c
index 1c67bdf..44f1d06 100644
--- a/gcc/hooks.c
+++ b/gcc/hooks.c
@@ -467,3 +467,12 @@ void
 hook_void_gcc_optionsp (struct gcc_options *opts ATTRIBUTE_UNUSED)
 {
 }
+
+/* Generic hook that takes a struct hard_reg_set_container * and returns
+   false.  */
+
+bool
+hook_bool_hard_reg_set_containerp_false (struct hard_reg_set_container *regs ATTRIBUTE_UNUSED)
+{
+  return false;
+}
diff --git a/gcc/hooks.h b/gcc/hooks.h
index 896b41d..f0afdbd 100644
--- a/gcc/hooks.h
+++ b/gcc/hooks.h
@@ -73,6 +73,7 @@ extern void hook_void_tree (tree);
 extern void hook_void_tree_treeptr (tree, tree *);
 extern void hook_void_int_int (int, int);
 extern void hook_void_gcc_optionsp (struct gcc_options *);
+extern bool hook_bool_hard_reg_set_containerp_false (struct hard_reg_set_container *);
 
 extern int hook_int_uint_mode_1 (unsigned int, enum machine_mode);
 extern int hook_int_const_tree_0 (const_tree);
diff --git a/gcc/target.def b/gcc/target.def
index 3a64cd1..8bee4c3 100644
--- a/gcc/target.def
+++ b/gcc/target.def
@@ -5130,6 +5130,19 @@ FRAME_POINTER_REGNUM, ARG_POINTER_REGNUM, and the PIC_OFFSET_TABLE_REGNUM.",
  void, (bitmap regs),
  hook_void_bitmap)
 
+/* Targets should define this target hook to mark which registers are clobbered
+   on entry to the function.  They should set their bits in the struct
+   hard_reg_set_container passed in, and return true.  */
+DEFHOOK
+(fn_other_hard_reg_usage,
+ "Add any hard registers to @var{regs} that are set or clobbered by a call to\
+ the function.  This hook only needs to add registers that cannot be found by\
+ examination of the final RTL representation of a function.  This hook returns\
+ true if it managed to determine which registers need to be added.  The\
+ default version of this hook returns false.",
+ bool, (struct hard_reg_set_container *regs),
+ hook_bool_hard_reg_set_containerp_false)
+
 /* Fill in additional registers set up by prologue into a regset.  */
 DEFHOOK
 (set_up_by_prologue,


* Re: [PATCH][IRA] Analysis of register usage of functions for usage by IRA.
  2014-01-09 20:56                 ` Tom de Vries
@ 2014-01-09 21:10                   ` Andi Kleen
  2014-01-10  0:22                     ` Tom de Vries
  2014-01-10 11:39                   ` Richard Earnshaw
  1 sibling, 1 reply; 59+ messages in thread
From: Andi Kleen @ 2014-01-09 21:10 UTC (permalink / raw)
  To: Tom de Vries
  Cc: Richard Earnshaw, Vladimir Makarov, Richard Sandiford,
	Paolo Bonzini, gcc-patches

Tom de Vries <Tom_deVries@mentor.com> writes:
>
> Is this patch OK for stage1 (after proper retesting)?

Could you perhaps post the latest series first? 

I don't think it made it to the mailing list.

-Andi


* Re: [PING^2][PATCH] -fuse-caller-save - Implement TARGET_FN_OTHER_HARD_REG_USAGE hook for MIPS
  2014-01-09 15:31                     ` Richard Sandiford
@ 2014-01-09 23:43                       ` Tom de Vries
  2014-01-10  8:47                         ` Richard Sandiford
  0 siblings, 1 reply; 59+ messages in thread
From: Tom de Vries @ 2014-01-09 23:43 UTC (permalink / raw)
  To: rdsandiford; +Cc: Vladimir Makarov, gcc-patches

On 09-01-14 16:31, Richard Sandiford wrote:
> Tom de Vries <Tom_deVries@mentor.com> writes:
>> On 25/12/13 14:02, Tom de Vries wrote:
>>> On 07-12-13 16:07, Tom de Vries wrote:
>>>> Richard,
>>>>
>>>> This patch implements the target hook TARGET_FN_OTHER_HARD_REG_USAGE (posted
>>>> here: http://gcc.gnu.org/ml/gcc-patches/2013-03/msg01318.html) for MIPS, to
>>>> address the issue that $6 is sometimes used in split calls.
>>>>
>>>> Build and reg-tested on MIPS.
>>>>
>>>> OK for stage1?
>>>>
>>>
>>
>> Richard,
>>
>> Ping.
>>
>> This patch is the only part of -fuse-caller-save that still needs approval.
>

Richard,

thanks for the review.

> Hmm, where were parts 4 and 6 approved?

In http://gcc.gnu.org/ml/gcc-patches/2013-03/msg00508.html, Vladimir wrote:
...
The patch is ok for me for trunk at stage1. But I think you need a formal 
approval for df-scan.c, arm.c, mips.c, GCC testsuite expect files 
(lib/target-supports.exp and gcc.target/mips/mips.exp) as I am not a maintainer 
of these parts although these changes look ok for me.
...

In reaction to that, I split up the patch into a patch series, and replied in 
http://gcc.gnu.org/ml/gcc-patches/2013-03/msg01255.html:
...
I'm assuming you've ok'ed patch 1, 2, 3, 4, 6, 8, 9 and the non-df-scan part of 7.

I'll ask other maintainers about the other parts (5, 10 and the df-scan part of 7).
...

>  Was looking for the discussion
> in the hope that it would answer the question I don't really understand,
> which is: this hook is only used during final, is that right?

Yes.

> And the
> clobber that you're adding is exposed at the rtl level.

Yes, after the calls are split, but not before.

> So why do we
> need the hook at all?

In general we need the hook for registers that are clobbered during a call to a 
function, while the registers are not present in the final rtl representation of 
that function.

For MIPS, we don't need the hook for that purpose.

But for MIPS there's the following issue: the unsplit call clobbers r6, but the 
clobber is not explicit in the rtl. The clobber becomes explicit in the rtl only 
after splitting.

In general, that's not a problem because r6 is a member of the set of registers 
clobbered by a call (CALL_REALLY_USED_REGISTERS), so it's implicitly clobbered.

But for -fuse-caller-save, when we find a call, we ignore 
CALL_REALLY_USED_REGISTERS and use a potentially smaller set of implicit 
clobbers: the union of:
- the register usage analysis of the final rtl representation of the called
   function
- the registers marked by the hook.
So before the call is split, there's nothing to tell us that r6 is clobbered by 
that call.  As a result, register allocation may use r6 as if it were not 
clobbered, which causes errors.
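
To make the set arithmetic above concrete, here is a minimal sketch in C that
models hard register sets as 32-bit masks.  The type reg_set_t, the helper
names and the contents of call_used_regs_model are hypothetical and are not
GCC's actual data structures; only the role of r6 follows the description
above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Toy model: one bit per hard register (not GCC's HARD_REG_SET).  */
typedef uint32_t reg_set_t;

#define REG_BIT(regno) ((reg_set_t) 1 << (regno))

/* Illustrative stand-in for the conservative call-clobbered set:
   r2..r15 in this toy model, so r6 is a member.  */
static const reg_set_t call_used_regs_model = 0x0000fffcu;

/* Return the registers assumed clobbered by a call.  Without the
   optimization this is the conservative set; with it, it is the union of
   what the callee's body touches and what the hook adds.  */
static reg_set_t
call_clobbers (bool use_caller_save, reg_set_t callee_body_regs,
	       reg_set_t hook_regs)
{
  if (!use_caller_save)
    return call_used_regs_model;
  return callee_body_regs | hook_regs;
}

/* Return true if register REGNO is a member of SET.  */
static bool
reg_clobbered_p (reg_set_t set, unsigned regno)
{
  assert (regno < 32);
  return (set & REG_BIT (regno)) != 0;
}
```

In this model, a callee that only touches r2 leaves r6 looking free with the
optimization enabled, unless the hook contributes REG_BIT (6).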

>  Why not just collect the usage information at
> the end of final rather than at the beginning, so that all splits during
> final have been done?

If we have a call to a leaf function, the final rtl representation of that 
function does not contain calls. The problem does not lie in the final pass 
where the callee is analyzed, but in the caller, where the information is used, 
and where the unsplit call is missing the clobber of r6.

> For other cases (where the usage isn't explicit
> at the rtl level), why not record the usage in CALL_INSN_FUNCTION_USAGE
> instead?
>

Right, we could add the r6 clobber that way. But to keep things simple, I've 
used the hook instead.

Thanks,
- Tom

> Thanks,
> Richard
>


* Re: [PATCH][IRA] Analysis of register usage of functions for usage by IRA.
  2014-01-09 21:10                   ` Andi Kleen
@ 2014-01-10  0:22                     ` Tom de Vries
  0 siblings, 0 replies; 59+ messages in thread
From: Tom de Vries @ 2014-01-10  0:22 UTC (permalink / raw)
  To: Andi Kleen
  Cc: Richard Earnshaw, Vladimir Makarov, Richard Sandiford,
	Paolo Bonzini, gcc-patches

On 09-01-14 22:10, Andi Kleen wrote:
> Tom de Vries <Tom_deVries@mentor.com> writes:
>>
>> Is this patch OK for stage1 (after proper retesting)?
>
> Could you perhaps post the latest series first?
>
> I don't think it made it to the mailing list.
>

Andi,

the current status is:
- toplevel of patch series:
   http://gcc.gnu.org/ml/gcc-patches/2013-03/msg01255.html
- the approved version (nitpick aside) of the test-cases patch is here:
   http://gcc.gnu.org/ml/gcc-patches/2013-12/msg00585.html
- the mips implementation of the hook (not a part of the original series,
   but necessary) is discussed here:
   http://gcc.gnu.org/ml/gcc-patches/2014-01/msg00570.html
- all of the above is accumulated here:
   http://gcc.gnu.org/git/?p=gcc.git;a=shortlog;h=refs/heads/vries/fuse-caller-save
- the hook as such is discussed here:
   http://gcc.gnu.org/ml/gcc-patches/2014-01/msg00555.html.

Thanks,
- Tom

> -Andi
>


* Re: [PING^2][PATCH] -fuse-caller-save - Implement TARGET_FN_OTHER_HARD_REG_USAGE hook for MIPS
  2014-01-09 23:43                       ` Tom de Vries
@ 2014-01-10  8:47                         ` Richard Sandiford
  2014-01-13 15:04                           ` Tom de Vries
  0 siblings, 1 reply; 59+ messages in thread
From: Richard Sandiford @ 2014-01-10  8:47 UTC (permalink / raw)
  To: Tom de Vries; +Cc: Vladimir Makarov, gcc-patches

Tom de Vries <Tom_deVries@mentor.com> writes:
>>  Why not just collect the usage information at
>> the end of final rather than at the beginning, so that all splits during
>> final have been done?
>
> If we have a call to a leaf function, the final rtl representation does not 
> contain calls. The problem does not lie in the final pass where the callee is 
> analyzed, but in the caller, where information is used, and where the unsplit 
> call is missing the clobber of r6.

Ah, so when you're using this hook in final, you're actually adding in
the set of registers that will be clobbered by a future caller's CALL_INSN,
as well as the registers that are clobbered by the callee itself?
That seems a bit error-prone, since we don't know at this stage what
the future caller will look like.  (Things like the target attribute
make this harder to predict.)

I think it would be cleaner to just calculate the callee-clobbered
registers during final and leave the caller to say what it clobbers.

FWIW, I still think it'd be better to collect the set at the end of final
(after any final splits) rather than at the beginning.

>> For other cases (where the usage isn't explicit
>> at the rtl level), why not record the usage in CALL_INSN_FUNCTION_USAGE
>> instead?
>>
>
> Right, we could add the r6 clobber that way. But to keep things simple, I've 
> used the hook instead.

Why's it simpler though?  That's the kind of thing CALL_INSN_FUNCTION_USAGE
is there for.
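
A minimal sketch of the alternative being suggested, under the assumption
that each call site carries its own extra clobbers while the callee analysis
records only what the callee body touches.  The struct and function names are
hypothetical toy stand-ins; real CALL_INSN_FUNCTION_USAGE is an rtx list
attached to the call insn, not a bit mask.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Toy model: one bit per hard register (not GCC's HARD_REG_SET).  */
typedef uint32_t reg_set_t;

/* Hypothetical model of one call site.  */
struct call_site_model
{
  reg_set_t callee_regs;	/* From analysing the callee's final rtl.  */
  reg_set_t site_clobbers;	/* Clobbers recorded on this call insn.  */
};

/* Return everything this particular call is assumed to clobber.  */
static reg_set_t
call_site_clobbers (const struct call_site_model *call)
{
  assert (call != NULL);
  return call->callee_regs | call->site_clobbers;
}
```

Two calls to the same callee can then report different clobber sets, for
example when only one of them will later be split into a sequence that uses
a temporary register.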

Thanks,
Richard


* Re: [PATCH][IRA] Analysis of register usage of functions for usage by IRA.
  2014-01-09 20:56                 ` Tom de Vries
  2014-01-09 21:10                   ` Andi Kleen
@ 2014-01-10 11:39                   ` Richard Earnshaw
  2014-01-10 16:44                     ` Tom de Vries
  2014-01-13 16:16                     ` Tom de Vries
  1 sibling, 2 replies; 59+ messages in thread
From: Richard Earnshaw @ 2014-01-10 11:39 UTC (permalink / raw)
  To: Tom de Vries
  Cc: Vladimir Makarov, Richard Sandiford, Paolo Bonzini, gcc-patches

On 09/01/14 20:42, Tom de Vries wrote:
> On 09-01-14 15:41, Richard Earnshaw wrote:
>> On 30/03/13 16:10, Tom de Vries wrote:
>>> On 29/03/13 13:54, Tom de Vries wrote:
>>>> I split the patch up into 10 patches, to facilitate further review:
>>>> ...
>>>> 0001-Add-command-line-option.patch
>>>> 0002-Add-new-reg-note-REG_CALL_DECL.patch
>>>> 0003-Add-implicit-parameter-to-find_all_hard_reg_sets.patch
>>>> 0004-Add-TARGET_FN_OTHER_HARD_REG_USAGE-hook.patch
>>>> 0005-Implement-TARGET_FN_OTHER_HARD_REG_USAGE-hook-for-ARM.patch
>>>> 0006-Collect-register-usage-information.patch
>>>> 0007-Use-collected-register-usage-information.patch
>>>> 0008-Enable-by-default-at-O2-and-higher.patch
>>>> 0009-Add-documentation.patch
>>>> 0010-Add-test-case.patch
>>>> ...
>>>> I'll post these in reply to this email.
>>>>
>>>
>>> Something went wrong with those emails, which were generated.
>>>
>>> I tested the emails by sending them to my work email, where they looked fine.
>>> I managed to reproduce the problem by sending them to my private email.
>>> It seems the problem was inconsistent EOL format.
>>>
>>> I've written a python script to handle composing the email, and posted it here
>>> using that script: http://gcc.gnu.org/ml/gcc-patches/2013-03/msg01311.html.
>>> Given that that email looks ok, I think I've addressed the problems now.
>>>
>>> I'll repost the patches. Sorry about the noise.
>>>
>>> Thanks,
>>> - Tom
>>>
>>>
>>
>> It's unfortunate that this feature doesn't fail safe when a port has not
>> explicitly defined what should happen.
>>
> 
> Richard,
> 
> Attached tentative patch (an update of patch 4 in the series) changes the hook 
> in the way you propose.
> 
> Is this patch OK for stage1 (after proper retesting)?

I certainly think that's safer.  Though of course it means that target
maintainers will now have to explicitly enable this when appropriate.
C'est la vie.

> 
>> Consequently, you'll need to add a patch for AArch64 which has two
>> registers clobbered by PLT-based calls.
>>
> 
> Thanks for pointing that out. That's r16 and r17, right? I can propose the hook 
> for AArch64, once we all agree on how the hook should look.
> 

Yes; and thanks!

R.

> Thanks,
> - Tom
> 
>> R.
>>
>>
>> fuse-caller-save-hook.patch
>>
>>
>> 2013-04-29  Radovan Obradovic  <robradovic@mips.com>
>>             Tom de Vries  <tom@codesourcery.com>
>>
>> 	* hooks.c (hook_bool_hard_reg_set_containerp_false): New function.
>> 	* hooks.h (hook_bool_hard_reg_set_containerp_false): Declare.
>> 	* target.def (fn_other_hard_reg_usage): New DEFHOOK.
>> 	* doc/tm.texi.in (@node Stack and Calling): Add Miscellaneous Register
>> 	Hooks to @menu.
>> 	(@node Miscellaneous Register Hooks): New node.
>> 	(@hook TARGET_FN_OTHER_HARD_REG_USAGE): New hook.
>> 	* doc/tm.texi: Regenerate.
>>
>> diff --git a/gcc/doc/tm.texi b/gcc/doc/tm.texi
>> index f204936..1bae6bb 100644
>> --- a/gcc/doc/tm.texi
>> +++ b/gcc/doc/tm.texi
>> @@ -3091,6 +3091,7 @@ This describes the stack layout and calling conventions.
>>  * Profiling::
>>  * Tail Calls::
>>  * Stack Smashing Protection::
>> +* Miscellaneous Register Hooks::
>>  @end menu
>>  
>>  @node Frame Layout
>> @@ -5016,6 +5017,14 @@ normally defined in @file{libgcc2.c}.
>>  Whether this target supports splitting the stack when the options described in @var{opts} have been passed.  This is called after options have been parsed, so the target may reject splitting the stack in some configurations.  The default version of this hook returns false.  If @var{report} is true, this function may issue a warning or error; if @var{report} is false, it must simply return a value
>>  @end deftypefn
>>  
>> +@node Miscellaneous Register Hooks
>> +@subsection Miscellaneous register hooks
>> +@cindex miscellaneous register hooks
>> +
>> +@deftypefn {Target Hook} bool TARGET_FN_OTHER_HARD_REG_USAGE (struct hard_reg_set_container *@var{regs})
>> +Add any hard registers to @var{regs} that are set or clobbered by a call to the function.  This hook only needs to add registers that cannot be found by examination of the final RTL representation of a function.  This hook returns true if it managed to determine which registers need to be added.  The default version of this hook returns false.
>> +@end deftypefn
>> +
>>  @node Varargs
>>  @section Implementing the Varargs Macros
>>  @cindex varargs implementation
>> diff --git a/gcc/doc/tm.texi.in b/gcc/doc/tm.texi.in
>> index 50f412c..bf75446 100644
>> --- a/gcc/doc/tm.texi.in
>> +++ b/gcc/doc/tm.texi.in
>> @@ -2720,6 +2720,7 @@ This describes the stack layout and calling conventions.
>>  * Profiling::
>>  * Tail Calls::
>>  * Stack Smashing Protection::
>> +* Miscellaneous Register Hooks::
>>  @end menu
>>  
>>  @node Frame Layout
>> @@ -3985,6 +3986,12 @@ the function prologue.  Normally, the profiling code comes after.
>>  
>>  @hook TARGET_SUPPORTS_SPLIT_STACK
>>  
>> +@node Miscellaneous Register Hooks
>> +@subsection Miscellaneous register hooks
>> +@cindex miscellaneous register hooks
>> +
>> +@hook TARGET_FN_OTHER_HARD_REG_USAGE
>> +
>>  @node Varargs
>>  @section Implementing the Varargs Macros
>>  @cindex varargs implementation
>> diff --git a/gcc/hooks.c b/gcc/hooks.c
>> index 1c67bdf..44f1d06 100644
>> --- a/gcc/hooks.c
>> +++ b/gcc/hooks.c
>> @@ -467,3 +467,12 @@ void
>>  hook_void_gcc_optionsp (struct gcc_options *opts ATTRIBUTE_UNUSED)
>>  {
>>  }
>> +
>> +/* Generic hook that takes a struct hard_reg_set_container * and returns
>> +   false.  */
>> +
>> +bool
>> +hook_bool_hard_reg_set_containerp_false (struct hard_reg_set_container *regs ATTRIBUTE_UNUSED)
>> +{
>> +  return false;
>> +}
>> diff --git a/gcc/hooks.h b/gcc/hooks.h
>> index 896b41d..f0afdbd 100644
>> --- a/gcc/hooks.h
>> +++ b/gcc/hooks.h
>> @@ -73,6 +73,7 @@ extern void hook_void_tree (tree);
>>  extern void hook_void_tree_treeptr (tree, tree *);
>>  extern void hook_void_int_int (int, int);
>>  extern void hook_void_gcc_optionsp (struct gcc_options *);
>> +extern bool hook_bool_hard_reg_set_containerp_false (struct hard_reg_set_container *);
>>  
>>  extern int hook_int_uint_mode_1 (unsigned int, enum machine_mode);
>>  extern int hook_int_const_tree_0 (const_tree);
>> diff --git a/gcc/target.def b/gcc/target.def
>> index 3a64cd1..8bee4c3 100644
>> --- a/gcc/target.def
>> +++ b/gcc/target.def
>> @@ -5130,6 +5130,19 @@ FRAME_POINTER_REGNUM, ARG_POINTER_REGNUM, and the PIC_OFFSET_TABLE_REGNUM.",
>>   void, (bitmap regs),
>>   hook_void_bitmap)
>>  
>> +/* Targets should define this target hook to mark which registers are clobbered
>> +   on entry to the function.  They should set their bits in the struct
>> +   hard_reg_set_container passed in, and return true.  */
>> +DEFHOOK
>> +(fn_other_hard_reg_usage,
>> + "Add any hard registers to @var{regs} that are set or clobbered by a call to\
>> + the function.  This hook only needs to add registers that cannot be found by\
>> + examination of the final RTL representation of a function.  This hook returns\
>> + true if it managed to determine which registers need to be added.  The\
>> + default version of this hook returns false.",
>> + bool, (struct hard_reg_set_container *regs),
>> + hook_bool_hard_reg_set_containerp_false)
>> +
>>  /* Fill in additional registers set up by prologue into a regset.  */
>>  DEFHOOK
>>  (set_up_by_prologue,



* Re: [PATCH][IRA] Analysis of register usage of functions for usage by IRA.
  2014-01-10 11:39                   ` Richard Earnshaw
@ 2014-01-10 16:44                     ` Tom de Vries
  2014-01-13 16:16                     ` Tom de Vries
  1 sibling, 0 replies; 59+ messages in thread
From: Tom de Vries @ 2014-01-10 16:44 UTC (permalink / raw)
  To: Vladimir Makarov
  Cc: Richard Earnshaw, Richard Sandiford, Paolo Bonzini, gcc-patches

On 10-01-14 12:39, Richard Earnshaw wrote:
> On 09/01/14 20:42, Tom de Vries wrote:
>> On 09-01-14 15:41, Richard Earnshaw wrote:
>>> On 30/03/13 16:10, Tom de Vries wrote:
>>>> On 29/03/13 13:54, Tom de Vries wrote:
>>>>> I split the patch up into 10 patches, to facilitate further review:
>>>>> ...
>>>>> 0001-Add-command-line-option.patch
>>>>> 0002-Add-new-reg-note-REG_CALL_DECL.patch
>>>>> 0003-Add-implicit-parameter-to-find_all_hard_reg_sets.patch
>>>>> 0004-Add-TARGET_FN_OTHER_HARD_REG_USAGE-hook.patch
>>>>> 0005-Implement-TARGET_FN_OTHER_HARD_REG_USAGE-hook-for-ARM.patch
>>>>> 0006-Collect-register-usage-information.patch
>>>>> 0007-Use-collected-register-usage-information.patch
>>>>> 0008-Enable-by-default-at-O2-and-higher.patch
>>>>> 0009-Add-documentation.patch
>>>>> 0010-Add-test-case.patch
>>>>> ...
>>>>> I'll post these in reply to this email.
>>>>>
>>>>
>>>> Something went wrong with those emails, which were generated.
>>>>
>>>> I tested the emails by sending them to my work email, where they looked fine.
>>>> I managed to reproduce the problem by sending them to my private email.
>>>> It seems the problem was inconsistent EOL format.
>>>>
>>>> I've written a python script to handle composing the email, and posted it here
>>>> using that script: http://gcc.gnu.org/ml/gcc-patches/2013-03/msg01311.html.
>>>> Given that that email looks ok, I think I've addressed the problems now.
>>>>
>>>> I'll repost the patches. Sorry about the noise.
>>>>
>>>> Thanks,
>>>> - Tom
>>>>
>>>>
>>>
>>> It's unfortunate that this feature doesn't fail safe when a port has not
>>> explicitly defined what should happen.
>>>
>>
>> Richard,
>>
>> Attached tentative patch (an update of patch 4 in the series) changes the hook
>> in the way you propose.
>>
>> Is this patch OK for stage1 (after proper retesting)?
>
> I certainly think that's safer.
> Though of course it means that target
> maintainers will now have to explicitly enable this when appropriate.
> C'est la vie.
>

Vladimir,

is this patch (optimization only on by default for architectures that define the 
hook) OK for stage1? Or do you prefer the previous, already approved patch 
(optimization on by default)?
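
To make the difference between the two variants concrete, here is a minimal
standalone sketch. It is a toy model and not GCC code: the register set is a
plain 64-bit mask, and reg_set_t, hook_default_false, hook_port_defined and
call_clobbers are names invented for illustration (register 12 is an arbitrary
choice). It only shows the fail-safe behaviour: if the hook returns false, the
caller falls back to the full call-used set.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef uint64_t reg_set_t;   /* toy stand-in for a hard register set */
typedef bool (*other_usage_hook) (reg_set_t *regs);

/* Models the fail-safe default: the port has not described its extra
   clobbers, so the hook adds nothing and reports failure.  */
static bool
hook_default_false (reg_set_t *regs)
{
  (void) regs;
  return false;
}

/* Models a port that knows one extra register (12 here, arbitrary) is
   clobbered by something not visible in the function body.  */
static bool
hook_port_defined (reg_set_t *regs)
{
  *regs |= (reg_set_t) 1 << 12;
  return true;
}

/* Registers a caller has to assume are clobbered by a call: the analysed
   set plus the hook's additions if the hook succeeded, otherwise the
   whole conventional call-used set.  */
static reg_set_t
call_clobbers (reg_set_t body_clobbers, reg_set_t call_used,
	       other_usage_hook hook)
{
  reg_set_t regs = body_clobbers;

  assert (hook != NULL);
  if (!hook (&regs))
    return call_used;
  return regs;
}
```

With the default hook the optimization is effectively off for that port; a
port opts in by providing a hook that returns true.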

Thanks,
- Tom

>>
>>> Consequently, you'll need to add a patch for AArch64 which has two
>>> registers clobbered by PLT-based calls.
>>>
>>
>> Thanks for pointing that out. That's r16 and r17, right? I can propose the hook
>> for AArch64, once we all agree on how the hook should look.
>>
>
> Yes; and thanks!
>
> R.
>
>> Thanks,
>> - Tom
>>
>>> R.
>>>
>>>
>>> fuse-caller-save-hook.patch
>>>
>>>
>>> 2013-04-29  Radovan Obradovic  <robradovic@mips.com>
>>>              Tom de Vries  <tom@codesourcery.com>
>>>
>>> 	* hooks.c (hook_bool_hard_reg_set_containerp_false): New function.
>>> 	* hooks.h (hook_bool_hard_reg_set_containerp_false): Declare.
>>> 	* target.def (fn_other_hard_reg_usage): New DEFHOOK.
>>> 	* doc/tm.texi.in (@node Stack and Calling): Add Miscellaneous Register
>>> 	Hooks to @menu.
>>> 	(@node Miscellaneous Register Hooks): New node.
>>> 	(@hook TARGET_FN_OTHER_HARD_REG_USAGE): New hook.
>>> 	* doc/tm.texi: Regenerate.
>>>
>>> diff --git a/gcc/doc/tm.texi b/gcc/doc/tm.texi
>>> index f204936..1bae6bb 100644
>>> --- a/gcc/doc/tm.texi
>>> +++ b/gcc/doc/tm.texi
>>> @@ -3091,6 +3091,7 @@ This describes the stack layout and calling conventions.
>>>   * Profiling::
>>>   * Tail Calls::
>>>   * Stack Smashing Protection::
>>> +* Miscellaneous Register Hooks::
>>>   @end menu
>>>
>>>   @node Frame Layout
>>> @@ -5016,6 +5017,14 @@ normally defined in @file{libgcc2.c}.
>>>   Whether this target supports splitting the stack when the options described in @var{opts} have been passed.  This is called after options have been parsed, so the target may reject splitting the stack in some configurations.  The default version of this hook returns false.  If @var{report} is true, this function may issue a warning or error; if @var{report} is false, it must simply return a value
>>>   @end deftypefn
>>>
>>> +@node Miscellaneous Register Hooks
>>> +@subsection Miscellaneous register hooks
>>> +@cindex miscellaneous register hooks
>>> +
>>> +@deftypefn {Target Hook} bool TARGET_FN_OTHER_HARD_REG_USAGE (struct hard_reg_set_container *@var{regs})
>>> +Add any hard registers to @var{regs} that are set or clobbered by a call to the function.  This hook only needs to add registers that cannot be found by examination of the final RTL representation of a function.  This hook returns true if it managed to determine which registers need to be added.  The default version of this hook returns false.
>>> +@end deftypefn
>>> +
>>>   @node Varargs
>>>   @section Implementing the Varargs Macros
>>>   @cindex varargs implementation
>>> diff --git a/gcc/doc/tm.texi.in b/gcc/doc/tm.texi.in
>>> index 50f412c..bf75446 100644
>>> --- a/gcc/doc/tm.texi.in
>>> +++ b/gcc/doc/tm.texi.in
>>> @@ -2720,6 +2720,7 @@ This describes the stack layout and calling conventions.
>>>   * Profiling::
>>>   * Tail Calls::
>>>   * Stack Smashing Protection::
>>> +* Miscellaneous Register Hooks::
>>>   @end menu
>>>
>>>   @node Frame Layout
>>> @@ -3985,6 +3986,12 @@ the function prologue.  Normally, the profiling code comes after.
>>>
>>>   @hook TARGET_SUPPORTS_SPLIT_STACK
>>>
>>> +@node Miscellaneous Register Hooks
>>> +@subsection Miscellaneous register hooks
>>> +@cindex miscellaneous register hooks
>>> +
>>> +@hook TARGET_FN_OTHER_HARD_REG_USAGE
>>> +
>>>   @node Varargs
>>>   @section Implementing the Varargs Macros
>>>   @cindex varargs implementation
>>> diff --git a/gcc/hooks.c b/gcc/hooks.c
>>> index 1c67bdf..44f1d06 100644
>>> --- a/gcc/hooks.c
>>> +++ b/gcc/hooks.c
>>> @@ -467,3 +467,12 @@ void
>>>   hook_void_gcc_optionsp (struct gcc_options *opts ATTRIBUTE_UNUSED)
>>>   {
>>>   }
>>> +
>>> +/* Generic hook that takes a struct hard_reg_set_container * and returns
>>> +   false.  */
>>> +
>>> +bool
>>> +hook_bool_hard_reg_set_containerp_false (struct hard_reg_set_container *regs ATTRIBUTE_UNUSED)
>>> +{
>>> +  return false;
>>> +}
>>> diff --git a/gcc/hooks.h b/gcc/hooks.h
>>> index 896b41d..f0afdbd 100644
>>> --- a/gcc/hooks.h
>>> +++ b/gcc/hooks.h
>>> @@ -73,6 +73,7 @@ extern void hook_void_tree (tree);
>>>   extern void hook_void_tree_treeptr (tree, tree *);
>>>   extern void hook_void_int_int (int, int);
>>>   extern void hook_void_gcc_optionsp (struct gcc_options *);
>>> +extern bool hook_bool_hard_reg_set_containerp_false (struct hard_reg_set_container *);
>>>
>>>   extern int hook_int_uint_mode_1 (unsigned int, enum machine_mode);
>>>   extern int hook_int_const_tree_0 (const_tree);
>>> diff --git a/gcc/target.def b/gcc/target.def
>>> index 3a64cd1..8bee4c3 100644
>>> --- a/gcc/target.def
>>> +++ b/gcc/target.def
>>> @@ -5130,6 +5130,19 @@ FRAME_POINTER_REGNUM, ARG_POINTER_REGNUM, and the PIC_OFFSET_TABLE_REGNUM.",
>>>    void, (bitmap regs),
>>>    hook_void_bitmap)
>>>
>>> +/* Targets should define this target hook to mark which registers are clobbered
>>> +   on entry to the function.  They should set their bits in the struct
>>> +   hard_reg_set_container passed in, and return true.  */
>>> +DEFHOOK
>>> +(fn_other_hard_reg_usage,
>>> + "Add any hard registers to @var{regs} that are set or clobbered by a call to\
>>> + the function.  This hook only needs to add registers that cannot be found by\
>>> + examination of the final RTL representation of a function.  This hook returns\
>>> + true if it managed to determine which registers need to be added.  The\
>>> + default version of this hook returns false.",
>>> + bool, (struct hard_reg_set_container *regs),
>>> + hook_bool_hard_reg_set_containerp_false)
>>> +
>>>   /* Fill in additional registers set up by prologue into a regset.  */
>>>   DEFHOOK
>>>   (set_up_by_prologue,
>
>


* Re: [PING^2][PATCH] -fuse-caller-save - Implement TARGET_FN_OTHER_HARD_REG_USAGE hook for MIPS
  2014-01-10  8:47                         ` Richard Sandiford
@ 2014-01-13 15:04                           ` Tom de Vries
  0 siblings, 0 replies; 59+ messages in thread
From: Tom de Vries @ 2014-01-13 15:04 UTC (permalink / raw)
  To: Vladimir Makarov, gcc-patches, rdsandiford

[-- Attachment #1: Type: text/plain, Size: 2019 bytes --]

On 10-01-14 09:47, Richard Sandiford wrote:
> Tom de Vries <Tom_deVries@mentor.com> writes:
>>>   Why not just collect the usage information at
>>> the end of final rather than at the beginning, so that all splits during
>>> final have been done?
>>
>> If we have a call to a leaf function, the final rtl representation does not
>> contain calls. The problem does not lie in the final pass where the callee is
>> analyzed, but in the caller, where information is used, and where the unsplit
>> call is missing the clobber of r6.
>
> Ah, so when you're using this hook in final, you're actually adding in
> the set of registers that will be clobbered by a future caller's CALL_INSN,
> as well as the registers that are clobbered by the callee itself?

Right. The first part is not the intended usage of the hook, but it was the 
simplest fix.

> That seems a bit error-prone, since we don't know at this stage what
> the future caller will look like.  (Things like the target attribute
> make this harder to predict.)
>
> I think it would be cleaner to just calculate the callee-clobbered
> registers during final and leave the caller to say what it clobbers.
>

Agree. I've rewritten the patch as such.

> FWIW, I still think it'd be better to collect the set at the end of final
> (after any final splits) rather than at the beginning.
>

Hmm. I was not aware that splits can happen during final. I'll try to update 
that patch as well.

>>> For other cases (where the usage isn't explicit
>>> at the rtl level), why not record the usage in CALL_INSN_FUNCTION_USAGE
>>> instead?
>>>
>>
>> Right, we could add the r6 clobber that way. But to keep things simple, I've
>> used the hook instead.
>
> Why's it simpler though?  That's the kind of thing CALL_INSN_FUNCTION_USAGE
> is there for.
>

It was simpler to implement. But you're right, using CALL_INSN_FUNCTION_USAGE 
was simple as well.
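
As a rough illustration of the approach (a toy model, not the rtl code from
the attached patch): the per-call usage list is modelled as a singly linked
list, and the clobber is prepended unless the call is noreturn. The types
struct usage and struct call_insn and the helpers below are made up for this
sketch; the TARGET_EXPLICIT_RELOCS and TARGET_CALL_CLOBBERED_GP conditions
from the patch are left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* One entry of a toy per-call usage list: a register the call clobbers.  */
struct usage
{
  int clobbered_regno;
  struct usage *next;
};

/* Toy call insn: a noreturn flag plus its usage list.  */
struct call_insn
{
  bool noreturn;
  struct usage *function_usage;
};

enum { POST_CALL_TMP_REG = 6 };	/* $6, as in the patch */

/* Prepend a clobber of POST_CALL_TMP_REG to the call's usage list, unless
   the call never returns (then no post-call code runs).  */
static void
add_post_call_clobber (struct call_insn *insn)
{
  struct usage *u;

  if (insn->noreturn)
    return;
  u = malloc (sizeof *u);
  assert (u != NULL);
  u->clobbered_regno = POST_CALL_TMP_REG;
  u->next = insn->function_usage;
  insn->function_usage = u;
}

/* Return true if the call's usage list records a clobber of REGNO.  */
static bool
call_clobbers_reg (const struct call_insn *insn, int regno)
{
  const struct usage *u;

  for (u = insn->function_usage; u != NULL; u = u->next)
    if (u->clobbered_regno == regno)
      return true;
  return false;
}

/* Release the usage list.  */
static void
free_usage (struct call_insn *insn)
{
  while (insn->function_usage != NULL)
    {
      struct usage *next = insn->function_usage->next;
      free (insn->function_usage);
      insn->function_usage = next;
    }
}
```

The point is that the clobber lives on the call site itself, so the caller
describes what it clobbers and the callee analysis does not have to guess.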

Built and reg-tested on MIPS. OK for stage1? (You've already OK-ed the test-case 
part).

Thanks,
- Tom

> Thanks,
> Richard
>


[-- Attachment #2: fuse-caller-save-mips-hook.patch --]
[-- Type: text/x-patch, Size: 5521 bytes --]

2014-01-12  Radovan Obradovic  <robradovic@mips.com>
            Tom de Vries  <tom@codesourcery.com>

	* config/mips/mips.c (POST_CALL_TMP_REG): Define.
	(mips_split_call): Use POST_CALL_TMP_REG.
	(mips_fn_other_hard_reg_usage): New function.
	(TARGET_FN_OTHER_HARD_REG_USAGE): Define targhook using new function.
	(mips_expand_call): Add POST_CALL_TMP_REG clobber.

	* gcc.target/mips/mips.exp: Add use-caller-save to -ffoo/-fno-foo
	options.
	* gcc.target/mips/fuse-caller-save.c: New test.
---
 gcc/config/mips/mips.c                           | 41 +++++++++++++++++++++---
 gcc/testsuite/gcc.target/mips/fuse-caller-save.c | 30 +++++++++++++++++
 gcc/testsuite/gcc.target/mips/mips.exp           |  1 +
 3 files changed, 67 insertions(+), 5 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/mips/fuse-caller-save.c

diff --git a/gcc/config/mips/mips.c b/gcc/config/mips/mips.c
index 617391c..ef7a3f9 100644
--- a/gcc/config/mips/mips.c
+++ b/gcc/config/mips/mips.c
@@ -175,6 +175,11 @@ along with GCC; see the file COPYING3.  If not see
 /* Return the usual opcode for a nop.  */
 #define MIPS_NOP 0
 
+/* Temporary register that is used after a call, and suitable for both
+   MIPS16 and non-MIPS16 code.  $4 and $5 are used for returning complex double
+   values in soft-float code, so $6 is the first suitable candidate.  */
+#define POST_CALL_TMP_REG (GP_ARG_FIRST + 2)
+
 /* Classifies an address.
 
    ADDRESS_REG
@@ -6906,11 +6911,19 @@ mips_expand_call (enum mips_call_type type, rtx result, rtx addr,
 {
   rtx orig_addr, pattern, insn;
   int fp_code;
+  rtx post_call_tmp_reg = gen_rtx_REG (word_mode, POST_CALL_TMP_REG);
 
   fp_code = aux == 0 ? 0 : (int) GET_MODE (aux);
   insn = mips16_build_call_stub (result, &addr, args_size, fp_code);
   if (insn)
     {
+      if (TARGET_EXPLICIT_RELOCS
+	  && TARGET_CALL_CLOBBERED_GP
+	  && !find_reg_note (insn, REG_NORETURN, 0))
+	CALL_INSN_FUNCTION_USAGE (insn)
+	  = gen_rtx_EXPR_LIST (VOIDmode,
+			       gen_rtx_CLOBBER (VOIDmode, post_call_tmp_reg),
+			       CALL_INSN_FUNCTION_USAGE (insn));
       gcc_assert (!lazy_p && type == MIPS_CALL_NORMAL);
       return insn;
     }
@@ -6966,7 +6979,16 @@ mips_expand_call (enum mips_call_type type, rtx result, rtx addr,
       pattern = fn (result, addr, args_size);
     }
 
-  return mips_emit_call_insn (pattern, orig_addr, addr, lazy_p);
+  insn = mips_emit_call_insn (pattern, orig_addr, addr, lazy_p);
+  if (TARGET_EXPLICIT_RELOCS
+      && TARGET_CALL_CLOBBERED_GP
+      && !find_reg_note (insn, REG_NORETURN, 0))
+    CALL_INSN_FUNCTION_USAGE (insn)
+      = gen_rtx_EXPR_LIST (VOIDmode,
+			   gen_rtx_CLOBBER (VOIDmode, post_call_tmp_reg),
+			   CALL_INSN_FUNCTION_USAGE (insn));
+
+  return insn;
 }
 
 /* Split call instruction INSN into a $gp-clobbering call and
@@ -6978,10 +7000,8 @@ mips_split_call (rtx insn, rtx call_pattern)
 {
   emit_call_insn (call_pattern);
   if (!find_reg_note (insn, REG_NORETURN, 0))
-    /* Pick a temporary register that is suitable for both MIPS16 and
-       non-MIPS16 code.  $4 and $5 are used for returning complex double
-       values in soft-float code, so $6 is the first suitable candidate.  */
-    mips_restore_gp_from_cprestore_slot (gen_rtx_REG (Pmode, GP_ARG_FIRST + 2));
+    mips_restore_gp_from_cprestore_slot (gen_rtx_REG (Pmode,
+						      POST_CALL_TMP_REG));
 }
 
 /* Return true if a call to DECL may need to use JALX.  */
@@ -18687,6 +18707,14 @@ mips_case_values_threshold (void)
   else
     return default_case_values_threshold ();
 }
+
+/* Implement TARGET_FN_OTHER_HARD_REG_USAGE.  */
+
+static bool
+mips_fn_other_hard_reg_usage (struct hard_reg_set_container *fn_used_regs)
+{
+  return true;
+}
 \f
 /* Initialize the GCC target structure.  */
 #undef TARGET_ASM_ALIGNED_HI_OP
@@ -18921,6 +18949,9 @@ mips_case_values_threshold (void)
 #undef TARGET_CASE_VALUES_THRESHOLD
 #define TARGET_CASE_VALUES_THRESHOLD mips_case_values_threshold
 
+#undef TARGET_FN_OTHER_HARD_REG_USAGE
+#define TARGET_FN_OTHER_HARD_REG_USAGE mips_fn_other_hard_reg_usage
+
 struct gcc_target targetm = TARGET_INITIALIZER;
 \f
 #include "gt-mips.h"
diff --git a/gcc/testsuite/gcc.target/mips/fuse-caller-save.c b/gcc/testsuite/gcc.target/mips/fuse-caller-save.c
new file mode 100644
index 0000000..1fd6c7d
--- /dev/null
+++ b/gcc/testsuite/gcc.target/mips/fuse-caller-save.c
@@ -0,0 +1,30 @@
+/* { dg-do compile } */
+/* { dg-options "-fuse-caller-save" } */
+/* { dg-skip-if "" { *-*-* }  { "*" } { "-Os" } } */
+/* Testing -fuse-caller-save optimization option.  */
+
+static int __attribute__((noinline)) NOCOMPRESSION
+bar (int x)
+{
+  return x + 3;
+}
+
+int __attribute__((noinline)) NOCOMPRESSION
+foo (int y)
+{
+  return y + bar (y);
+}
+
+int NOCOMPRESSION
+main (void)
+{
+  return !(foo (5) == 13);
+}
+
+/* Check that there are only 2 stack-saves: r31 in main and foo.  */
+
+/* Check that there are only 2 sw/sd.  */
+/* { dg-final { scan-assembler-times "(?n)s\[wd\]\t\\\$.*,.*\\(\\\$sp\\)" 2 } } */
+
+/* Check that the first callee-saved register ($16) is unused.  */
+/* { dg-final { scan-assembler-not "\\\$16" } } */
diff --git a/gcc/testsuite/gcc.target/mips/mips.exp b/gcc/testsuite/gcc.target/mips/mips.exp
index 8c72cff..6ad8160 100644
--- a/gcc/testsuite/gcc.target/mips/mips.exp
+++ b/gcc/testsuite/gcc.target/mips/mips.exp
@@ -305,6 +305,7 @@ foreach option {
     tree-vectorize
     unroll-all-loops
     unroll-loops
+    use-caller-save
 } {
     lappend mips_option_groups $option "-f(no-|)$option"
 }
-- 
1.8.3.2



* Re: [PATCH][IRA] Analysis of register usage of functions for usage by IRA.
  2014-01-10 11:39                   ` Richard Earnshaw
  2014-01-10 16:44                     ` Tom de Vries
@ 2014-01-13 16:16                     ` Tom de Vries
  2014-01-14 10:00                       ` Richard Earnshaw
  1 sibling, 1 reply; 59+ messages in thread
From: Tom de Vries @ 2014-01-13 16:16 UTC (permalink / raw)
  To: Richard Earnshaw
  Cc: Vladimir Makarov, Richard Sandiford, Paolo Bonzini, gcc-patches

[-- Attachment #1: Type: text/plain, Size: 1564 bytes --]

On 10-01-14 12:39, Richard Earnshaw wrote:
>>> Consequently, you'll need to add a patch for AArch64 which has two
>>> registers clobbered by PLT-based calls.
>>>
>>
>> Thanks for pointing that out. That's r16 and r17, right? I can propose the hook
>> for AArch64, once we all agree on how the hook should look.
>>
> Yes; and thanks!

Hi Richard,

I'm posting this patch that implements the TARGET_FN_OTHER_HARD_REG_USAGE hook 
for aarch64. It uses the conservative hook format for now.

I've built gcc and cc1 with the patch, and observed the impact on this code snippet:
...
static int
bar (int x)
{
   return x + 3;
}

int
foo (int y)
{
   return y + bar (y);
}
...

AFAICT, that looks as expected:
...
$ gcc fuse-caller-save.c -mno-lra -fno-use-caller-save -O2 -S -o- > 1
$ gcc fuse-caller-save.c -mno-lra -fuse-caller-save -O2 -S -o- > 2
$ diff -u 1 2
--- 1	2014-01-13 16:51:24.000000000 +0100
+++ 2	2014-01-13 16:51:19.000000000 +0100
@@ -11,14 +11,12 @@
  	.global	foo
  	.type	foo, %function
  foo:
-	stp	x29, x30, [sp, -32]!
+	stp	x29, x30, [sp, -16]!
+	mov	w1, w0
  	add	x29, sp, 0
-	str	x19, [sp,16]
-	mov	w19, w0
  	bl	bar
-	add	w0, w0, w19
-	ldr	x19, [sp,16]
-	ldp	x29, x30, [sp], 32
+	ldp	x29, x30, [sp], 16
+	add	w0, w0, w1
  	ret
  	.size	foo, .-foo
  	.section	.text.startup,"ax",%progbits
...

Btw, the results are the same for -mno-lra and -mlra. I'm just using the 
-mno-lra version here because the -mlra version of -fuse-caller-save is still in 
review ( http://gcc.gnu.org/ml/gcc-patches/2013-12/msg00586.html ).
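
The reasoning behind the diff can be sketched in a few lines of standalone C.
This is a toy model under stated assumptions, not the IRA code: register sets
are 64-bit masks, the helper names are invented, and it assumes bar's body
only writes register 0 (w0). The hook then adds r16 and r17 on top of that.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

typedef uint64_t reg_set_t;	/* toy stand-in for a hard register set */
enum { NUM_REGS = 32 };

/* Mask with only REGNO set.  */
static reg_set_t
reg_bit (int regno)
{
  assert (regno >= 0 && regno < NUM_REGS);
  return (reg_set_t) 1 << regno;
}

/* Registers a call may clobber: what the callee's body clobbers, plus
   r16 and r17 as added by the hook in the attached patch.  */
static reg_set_t
call_clobber_set (reg_set_t body_clobbers)
{
  return body_clobbers | reg_bit (16) | reg_bit (17);
}

/* A value kept in REGNO has to be saved around the call only if the
   call may clobber REGNO.  */
static bool
needs_save_across_call (int regno, reg_set_t clobbers)
{
  return (clobbers & reg_bit (regno)) != 0;
}
```

Under these assumptions w1 does not need a save around the call to bar, which
matches the second version of foo: y stays in w1 and the x19 save/restore goes
away.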

Thanks,
- Tom


[-- Attachment #2: fuse-caller-save-aarch64-hook.patch --]
[-- Type: text/x-patch, Size: 1325 bytes --]

2014-01-11  Tom de Vries  <tom@codesourcery.com>

	* config/aarch64/aarch64.c (TARGET_FN_OTHER_HARD_REG_USAGE): Redefine as
	aarch64_fn_other_hard_reg_usage.
	(aarch64_fn_other_hard_reg_usage): New function.
---
 gcc/config/aarch64/aarch64.c | 15 +++++++++++++++
 1 file changed, 15 insertions(+)

diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
index 3b1f6b5..295fd5d 100644
--- a/gcc/config/aarch64/aarch64.c
+++ b/gcc/config/aarch64/aarch64.c
@@ -3287,6 +3287,16 @@ aarch64_fixed_condition_code_regs (unsigned int *p1, unsigned int *p2)
   return true;
 }
 
+/* Implement TARGET_FN_OTHER_HARD_REG_USAGE.  */
+
+static bool
+aarch64_fn_other_hard_reg_usage (struct hard_reg_set_container *regs)
+{
+  SET_HARD_REG_BIT (regs->set, R16_REGNUM);
+  SET_HARD_REG_BIT (regs->set, R17_REGNUM);
+  return true;
+}
+
 enum machine_mode
 aarch64_select_cc_mode (RTX_CODE code, rtx x, rtx y)
 {
@@ -8472,6 +8482,11 @@ aarch64_vectorize_vec_perm_const_ok (enum machine_mode vmode,
 #undef TARGET_FIXED_CONDITION_CODE_REGS
 #define TARGET_FIXED_CONDITION_CODE_REGS aarch64_fixed_condition_code_regs
 
+#undef TARGET_FN_OTHER_HARD_REG_USAGE
+#define TARGET_FN_OTHER_HARD_REG_USAGE \
+  aarch64_fn_other_hard_reg_usage
+
+
 struct gcc_target targetm = TARGET_INITIALIZER;
 
 #include "gt-aarch64.h"
-- 
1.8.3.2



* Re: [PATCH][IRA] Analysis of register usage of functions for usage by IRA.
  2014-01-13 16:16                     ` Tom de Vries
@ 2014-01-14 10:00                       ` Richard Earnshaw
  0 siblings, 0 replies; 59+ messages in thread
From: Richard Earnshaw @ 2014-01-14 10:00 UTC (permalink / raw)
  To: Tom de Vries
  Cc: Vladimir Makarov, Richard Sandiford, Paolo Bonzini, gcc-patches

On 13/01/14 16:16, Tom de Vries wrote:
> On 10-01-14 12:39, Richard Earnshaw wrote:
>>>>>> Consequently, you'll need to add a patch for AArch64 which has two
>>>>>> registers clobbered by PLT-based calls.
>>>>>>
>>>>
>>>> Thanks for pointing that out. That's r16 and r17, right? I can propose the hook
>>>> for AArch64, once we all agree on how the hook should look.
>>>>
>> Yes; and thanks!
> 
> Hi Richard,
> 
> I'm posting this patch that implements the TARGET_FN_OTHER_HARD_REG_USAGE hook 
> for aarch64. It uses the conservative hook format for now.
> 
> I've built gcc and cc1 with the patch, and observed the impact on this code snippet:
> ...
> static int
> bar (int x)
> {
>    return x + 3;
> }
> 
> int
> foo (int y)
> {
>    return y + bar (y);
> }
> ...
> 
> AFAICT, that looks as expected:
> ...
> $ gcc fuse-caller-save.c -mno-lra -fno-use-caller-save -O2 -S -o- > 1
> $ gcc fuse-caller-save.c -mno-lra -fuse-caller-save -O2 -S -o- > 2
> $ diff -u 1 2
> --- 1	2014-01-13 16:51:24.000000000 +0100
> +++ 2	2014-01-13 16:51:19.000000000 +0100
> @@ -11,14 +11,12 @@
>   	.global	foo
>   	.type	foo, %function
>   foo:
> -	stp	x29, x30, [sp, -32]!
> +	stp	x29, x30, [sp, -16]!
> +	mov	w1, w0
>   	add	x29, sp, 0
> -	str	x19, [sp,16]
> -	mov	w19, w0
>   	bl	bar
> -	add	w0, w0, w19
> -	ldr	x19, [sp,16]
> -	ldp	x29, x30, [sp], 32
> +	ldp	x29, x30, [sp], 16
> +	add	w0, w0, w1
>   	ret
>   	.size	foo, .-foo
>   	.section	.text.startup,"ax",%progbits
> ...
> 
> Btw, the results are the same for -mno-lra and -mlra. I'm just using the 
> -mno-lra version here because the -mlra version of -fuse-caller-save is still in 
> review ( http://gcc.gnu.org/ml/gcc-patches/2013-12/msg00586.html ).
> 
> Thanks,
> - Tom
> 
> 
> fuse-caller-save-aarch64-hook.patch
> 
> 
> 2014-01-11  Tom de Vries  <tom@codesourcery.com>
> 
> 	* config/aarch64/aarch64.c (TARGET_FN_OTHER_HARD_REG_USAGE): Redefine as
> 	aarch64_fn_other_hard_reg_usage.
> 	(aarch64_fn_other_hard_reg_usage): New function.
> ---
>  gcc/config/aarch64/aarch64.c | 15 +++++++++++++++
>  1 file changed, 15 insertions(+)
> 
> diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
> index 3b1f6b5..295fd5d 100644
> --- a/gcc/config/aarch64/aarch64.c
> +++ b/gcc/config/aarch64/aarch64.c
> @@ -3287,6 +3287,16 @@ aarch64_fixed_condition_code_regs (unsigned int *p1, unsigned int *p2)
>    return true;
>  }
>  
> +/* Implement TARGET_FN_OTHER_HARD_REG_USAGE.  */
> +
> +static bool
> +aarch64_fn_other_hard_reg_usage (struct hard_reg_set_container *regs)
> +{
> +  SET_HARD_REG_BIT (regs->set, R16_REGNUM);
> +  SET_HARD_REG_BIT (regs->set, R17_REGNUM);
> +  return true;
> +}


I think that in this context using IP0_REGNUM and IP1_REGNUM would be
slightly clearer; since it is because these registers are the
inter-procedure-call scratch registers that they aren't safe to use in
this context.

Otherwise, this is OK.
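
For illustration, a standalone model of what the renamed version could look
like. This is a sketch and not the committed code: struct
toy_reg_set_container and the _model suffix are invented here, the set is a
64-bit mask, and the enum assumes IP0_REGNUM and IP1_REGNUM name the same
hard registers as R16_REGNUM and R17_REGNUM (x16 and x17 are IP0 and IP1 in
the AArch64 procedure call standard).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed numbering: the IP names alias r16 and r17.  */
enum
{
  R16_REGNUM = 16, IP0_REGNUM = 16,
  R17_REGNUM = 17, IP1_REGNUM = 17
};

/* Toy stand-in for struct hard_reg_set_container.  */
struct toy_reg_set_container
{
  uint64_t set;
};

/* Same effect as the posted hook, but spelled with the names that say
   why the registers are unsafe: they are the intra-procedure-call
   scratch registers.  */
static bool
aarch64_fn_other_hard_reg_usage_model (struct toy_reg_set_container *regs)
{
  assert (regs != NULL);
  regs->set |= (uint64_t) 1 << IP0_REGNUM;
  regs->set |= (uint64_t) 1 << IP1_REGNUM;
  return true;
}
```

The generated set is identical either way; only the spelling changes.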

R.





* Re: [PATCH][IRA] Analysis of register usage of functions for usage by IRA.
  2013-12-06  0:47         ` [PATCH][IRA] Analysis of register usage of functions for usage by IRA Tom de Vries
@ 2014-01-14 19:36           ` Vladimir Makarov
  2014-05-30  9:20             ` Tom de Vries
  0 siblings, 1 reply; 59+ messages in thread
From: Vladimir Makarov @ 2014-01-14 19:36 UTC (permalink / raw)
  To: Tom de Vries; +Cc: gcc-patches

On 12/05/2013 07:47 PM, Tom de Vries wrote:
> On 14-03-13 10:34, Tom de Vries wrote:
>>> I thought about implementing your optimization for LRA by myself.  But it
>>> is ok if you decide to work on it.  At least, I am not going to start
>>> this work for a month.
>>>> I'm also currently looking at how to use the analysis in LRA.
>>>> AFAIU, in lra-constraints.c we do a backward scan over the insns, and keep track
>>>> of how many calls we've seen (calls_num), and mark insns with that number. Then
>>>> when looking at a live-range segment consisting of a def or use insn a and a
>>>> following use insn b, we can compare the number of calls seen for each insn, and
>>>> if they're not equal there is at least one call between the 2 insns, and if the
>>>> corresponding hard register is clobbered by calls, we spill after insn a and
>>>> restore before insn b.
>>>>
>>>> That is too coarse-grained to use with our analysis, since we need to know which
>>>> calls occur in between insn a and insn b, and more precisely which registers
>>>> those calls clobbered.
>>>
>>>> I wonder though if we can do something similar: we keep an array
>>>> call_clobbers_num[FIRST_PSEUDO_REG], initialized at 0 when we start scanning.
>>>> When encountering a call, we increase the call_clobbers_num entries for the hard
>>>> registers clobbered by the call.
>>>> When encountering a use, we set the call_clobbers_num field of the use to
>>>> call_clobbers_num[reg_renumber[original_regno]].
>>>> And when looking at a live-range segment, we compare the clobbers_num field of
>>>> insn a and insn b, and if it is not equal, the hard register was clobbered by at
>>>> least one call between insn a and insn b.
>>>> Would that work? WDYT?
>>>>
>>> As I understand you looked at live-range splitting code in
>>> lra-constraints.c.  To get necessary info you should look at ira-lives.c.
>> Unfortunately I haven't been able to find time to work further on the
>> LRA part.
>> So if you're still willing to pick up that part, that would be great.
>
> Vladimir,
>
> I gave this a try. The attached patch works for the included test-case
> for x86_64.
>
> I've bootstrapped and reg-tested the patch (in combination with the
> other patches from the series) on x86_64.
>
> OK for stage1?
>
Yes, it is ok for stage1.  Thanks for not forgetting LRA and sorry for
the delay with the answer (it is not a high priority patch for me right
now).

I believe, this patch helps to improve code also because of better
spilling into SSE regs.  Spilling into SSE regs instead of memory has a
rare probability right now as all SSE regs are call clobbered.
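
For reference, the per-register counter idea quoted above
(call_clobbers_num) can be modelled in standalone C roughly as follows. This
is a toy and not what the patch does: struct insn, scan_backward and
clobbered_between are invented names, the insn stream is a plain array, and
each use is assumed to already know its hard register.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

enum { NUM_HARD_REGS = 8 };

/* Toy insn: either a call with a clobber mask, or a use of one hard
   register.  clobbers_num is filled in by scan_backward.  */
struct insn
{
  bool is_call;
  uint32_t clobbers;		/* calls only: mask of clobbered regs */
  int hard_regno;		/* uses only */
  unsigned clobbers_num;	/* uses only: counter value at this insn */
};

/* Backward scan: bump the counter of every register a call clobbers, and
   stamp each use with the current counter of its register.  */
static void
scan_backward (struct insn *insns, size_t n)
{
  unsigned counts[NUM_HARD_REGS] = { 0 };
  size_t i;
  int r;

  for (i = n; i-- > 0;)
    if (insns[i].is_call)
      {
	for (r = 0; r < NUM_HARD_REGS; r++)
	  if ((insns[i].clobbers >> r) & 1)
	    counts[r]++;
      }
    else
      {
	assert (insns[i].hard_regno >= 0
		&& insns[i].hard_regno < NUM_HARD_REGS);
	insns[i].clobbers_num = counts[insns[i].hard_regno];
      }
}

/* Two uses of the same register have different stamps exactly when some
   call between them clobbers that register.  */
static bool
clobbered_between (const struct insn *a, const struct insn *b)
{
  assert (!a->is_call && !b->is_call && a->hard_regno == b->hard_regno);
  return a->clobbers_num != b->clobbers_num;
}
```

A call that only clobbers other registers leaves the stamps equal, so no
save/restore would be needed for that segment.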

Thanks again, Tom.

 


* Re: [PATCH][IRA] Analysis of register usage of functions for usage by IRA.
  2014-01-14 19:36           ` Vladimir Makarov
@ 2014-05-30  9:20             ` Tom de Vries
  0 siblings, 0 replies; 59+ messages in thread
From: Tom de Vries @ 2014-05-30  9:20 UTC (permalink / raw)
  To: Vladimir Makarov; +Cc: gcc-patches

[-- Attachment #1: Type: text/plain, Size: 1482 bytes --]

On 14-01-14 20:36, Vladimir Makarov wrote:
>>> Unfortunately I haven't been able to find time to work further on the
>>> LRA part.
>>> So if you're still willing to pick up that part, that would be great.
>>
>> Vladimir,
>>
>> I gave this a try. The attached patch works for the included test-case
>> for x86_64.
>>
>> I've bootstrapped and reg-tested the patch (in combination with the
>> other patches from the series) on x86_64.
>>
>> OK for stage1?
>>
> Yes, it is ok for stage1.  Thanks for not forgetting LRA and sorry for
> the delay with the answer (it is not a high priority patch for me right
> now).
>
> I believe, this patch helps to improve code also because of better
> spilling into SSE regs.  Spilling into SSE regs instead of memory has a
> rare probability right now as all SSE regs are call clobbered.
>

Vladimir,

After committing the original patch, Martin Liška told me on IRC that the patch 
broke the build with --enable-checking=release.

The bit in lra_assign used the call_p field unconditionally, while the 
definition of the call_p field is guarded with #ifdef ENABLE_CHECKING.
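
The failure mode is the usual one for conditionally defined fields. A minimal
standalone sketch (struct reg_info and crosses_call_for_checking are invented
names, not the lra-int.h definitions): any use of the field has to sit under
the same guard as its definition, otherwise a build without ENABLE_CHECKING
does not compile.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy version of a per-register record whose call_p field only exists
   in checking builds.  */
struct reg_info
{
  int nrefs;
#ifdef ENABLE_CHECKING
  bool call_p;
#endif
};

/* Every access to call_p is guarded the same way as the field itself.
   An unguarded info->call_p here would break the build when
   ENABLE_CHECKING is not defined.  */
static bool
crosses_call_for_checking (const struct reg_info *info)
{
  assert (info != NULL);
#ifdef ENABLE_CHECKING
  return info->call_p;
#else
  return false;
#endif
}
```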

I've reverted the original patch, and bootstrapped and reg-tested this version 
of the patch, which has a simplified bit for lra_assign.

The only functional difference between the patches is that the new version no 
longer adds a debug message in lra_assign. Committed (since the difference 
between the approved and new patch is trivial).
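
For readers skimming the attached patch, the core of the LRA change can be
modelled in standalone C as below. It is a simplified sketch: register sets
are 64-bit masks, struct pseudo_info, record_call and need_call_save are
invented names, and the HARD_REGNO_CALL_PART_CLOBBERED part of the real check
is left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef uint64_t reg_set_t;	/* toy stand-in for HARD_REG_SET */

/* Toy per-pseudo record: whether it is live at the current point, and
   the union of the clobber sets of all calls it has been live across.  */
struct pseudo_info
{
  bool live;
  reg_set_t actual_call_used;
};

/* At a call insn: OR the call's clobber set into every live pseudo.  */
static void
record_call (struct pseudo_info *pseudos, size_t n, reg_set_t this_call_used)
{
  size_t i;

  for (i = 0; i < n; i++)
    if (pseudos[i].live)
      pseudos[i].actual_call_used |= this_call_used;
}

/* A pseudo assigned to HARD_REGNO needs a save around calls if the
   register is in the relevant clobber set.  With the optimization on and
   a non-empty per-pseudo set, that set is used; otherwise fall back to
   the conventional call-used set.  */
static bool
need_call_save (const struct pseudo_info *p, int hard_regno,
		reg_set_t call_used, bool use_caller_save)
{
  reg_set_t set;

  assert (hard_regno >= 0 && hard_regno < 64);
  set = (use_caller_save && p->actual_call_used != 0
	 ? p->actual_call_used : call_used);
  return ((set >> hard_regno) & 1) != 0;
}
```

The fallback on an empty set mirrors the hard_reg_set_empty_p test in
need_for_call_save_p.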

Thanks,
- Tom


[-- Attachment #2: 0001-fuse-caller-save-Support-in-lra.patch --]
[-- Type: text/x-patch, Size: 4851 bytes --]

2014-05-30  Tom de Vries  <tom@codesourcery.com>

	* lra-int.h (struct lra_reg): Add field actual_call_used_reg_set.
	* lra.c (initialize_lra_reg_info_element): Add init of
	actual_call_used_reg_set field.
	(lra): Call lra_create_live_ranges before lra_inheritance for
	-fuse-caller-save.
	* lra-assigns.c (lra_assign): Allow call_used_regs to cross calls for
	-fuse-caller-save.
	* lra-constraints.c (need_for_call_save_p): Use actual_call_used_reg_set
	instead of call_used_reg_set for -fuse-caller-save.
	* lra-lives.c (process_bb_lives): Calculate actual_call_used_reg_set.

diff --git a/gcc/lra-assigns.c b/gcc/lra-assigns.c
index f7bb86b..03c2506 100644
--- a/gcc/lra-assigns.c
+++ b/gcc/lra-assigns.c
@@ -1460,12 +1460,13 @@ lra_assign (void)
   create_live_range_start_chains ();
   setup_live_pseudos_and_spill_after_risky_transforms (&all_spilled_pseudos);
 #ifdef ENABLE_CHECKING
-  for (i = FIRST_PSEUDO_REGISTER; i < max_regno; i++)
-    if (lra_reg_info[i].nrefs != 0 && reg_renumber[i] >= 0
-	&& lra_reg_info[i].call_p
-	&& overlaps_hard_reg_set_p (call_used_reg_set,
-				    PSEUDO_REGNO_MODE (i), reg_renumber[i]))
-      gcc_unreachable ();
+  if (!flag_use_caller_save)
+    for (i = FIRST_PSEUDO_REGISTER; i < max_regno; i++)
+      if (lra_reg_info[i].nrefs != 0 && reg_renumber[i] >= 0
+	  && lra_reg_info[i].call_p
+	  && overlaps_hard_reg_set_p (call_used_reg_set,
+				      PSEUDO_REGNO_MODE (i), reg_renumber[i]))
+	gcc_unreachable ();
 #endif
   /* Setup insns to process on the next constraint pass.  */
   bitmap_initialize (&changed_pseudo_bitmap, &reg_obstack);
diff --git a/gcc/lra-constraints.c b/gcc/lra-constraints.c
index 2df841a..7eb9dbc 100644
--- a/gcc/lra-constraints.c
+++ b/gcc/lra-constraints.c
@@ -4605,7 +4605,10 @@ need_for_call_save_p (int regno)
   lra_assert (regno >= FIRST_PSEUDO_REGISTER && reg_renumber[regno] >= 0);
   return (usage_insns[regno].calls_num < calls_num
 	  && (overlaps_hard_reg_set_p
-	      (call_used_reg_set,
+	      ((flag_use_caller_save &&
+		! hard_reg_set_empty_p (lra_reg_info[regno].actual_call_used_reg_set))
+	       ? lra_reg_info[regno].actual_call_used_reg_set
+	       : call_used_reg_set,
 	       PSEUDO_REGNO_MODE (regno), reg_renumber[regno])
 	      || HARD_REGNO_CALL_PART_CLOBBERED (reg_renumber[regno],
 						 PSEUDO_REGNO_MODE (regno))));
diff --git a/gcc/lra-int.h b/gcc/lra-int.h
index 41c9849..3c89734 100644
--- a/gcc/lra-int.h
+++ b/gcc/lra-int.h
@@ -77,6 +77,10 @@ struct lra_reg
   /* The following fields are defined only for pseudos.	 */
   /* Hard registers with which the pseudo conflicts.  */
   HARD_REG_SET conflict_hard_regs;
+  /* Call used registers with which the pseudo conflicts, taking into account
+     the registers used by functions called from calls which cross the
+     pseudo.  */
+  HARD_REG_SET actual_call_used_reg_set;
   /* We assign hard registers to reload pseudos which can occur in few
      places.  So two hard register preferences are enough for them.
      The following fields define the preferred hard registers.	If
diff --git a/gcc/lra-lives.c b/gcc/lra-lives.c
index 8444ade..26ba0d2 100644
--- a/gcc/lra-lives.c
+++ b/gcc/lra-lives.c
@@ -624,6 +624,17 @@ process_bb_lives (basic_block bb, int &curr_point)
 
       if (call_p)
 	{
+	  if (flag_use_caller_save)
+	    {
+	      HARD_REG_SET this_call_used_reg_set;
+	      get_call_reg_set_usage (curr_insn, &this_call_used_reg_set,
+				      call_used_reg_set);
+
+	      EXECUTE_IF_SET_IN_SPARSESET (pseudos_live, j)
+		IOR_HARD_REG_SET (lra_reg_info[j].actual_call_used_reg_set,
+				  this_call_used_reg_set);
+	    }
+
 	  sparseset_ior (pseudos_live_through_calls,
 			 pseudos_live_through_calls, pseudos_live);
 	  if (cfun->has_nonlocal_label
diff --git a/gcc/lra.c b/gcc/lra.c
index ecec890..d199a81 100644
--- a/gcc/lra.c
+++ b/gcc/lra.c
@@ -1427,6 +1427,7 @@ initialize_lra_reg_info_element (int i)
   lra_reg_info[i].no_stack_p = false;
 #endif
   CLEAR_HARD_REG_SET (lra_reg_info[i].conflict_hard_regs);
+  CLEAR_HARD_REG_SET (lra_reg_info[i].actual_call_used_reg_set);
   lra_reg_info[i].preferred_hard_regno1 = -1;
   lra_reg_info[i].preferred_hard_regno2 = -1;
   lra_reg_info[i].preferred_hard_regno_profit1 = 0;
@@ -2344,7 +2345,18 @@ lra (FILE *f)
 	  lra_eliminate (false, false);
 	  /* Do inheritance only for regular algorithms.  */
 	  if (! lra_simple_p)
-	    lra_inheritance ();
+	    {
+	      if (flag_use_caller_save)
+		{
+		  if (live_p)
+		    lra_clear_live_ranges ();
+		  /* As a side-effect of lra_create_live_ranges, we calculate
+		     actual_call_used_reg_set,  which is needed during
+		     lra_inheritance.  */
+		  lra_create_live_ranges (true);
+		}
+	      lra_inheritance ();
+	    }
 	  if (live_p)
 	    lra_clear_live_ranges ();
 	  /* We need live ranges for lra_assign -- so build them.  */
-- 
1.9.1
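
[ Illustrative note on the LRA hunks above. The following is a minimal sketch,
not GCC code: reg_mask, pseudo_info, record_call, effective_call_used and
needs_call_save are hypothetical names, and a plain bitmask stands in for
HARD_REG_SET. It shows the accumulation done in process_bb_lives and the
fallback to call_used_reg_set in need_for_call_save_p when no per-pseudo
information was collected. The calls_num comparison and the
HARD_REGNO_CALL_PART_CLOBBERED test are left out. ]

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for HARD_REG_SET: one bit per hard register.  */
typedef unsigned reg_mask;

/* Hypothetical stand-in for the new field in struct lra_reg.  */
struct pseudo_info
{
  reg_mask actual_call_used;
};

/* At a call, add the registers clobbered by that call to every pseudo
   that is live across it (mirrors the loop over pseudos_live).  */
void
record_call (struct pseudo_info *info, const bool *live, size_t n,
             reg_mask this_call_used)
{
  assert (info != NULL && live != NULL);
  for (size_t i = 0; i < n; i++)
    if (live[i])
      info[i].actual_call_used |= this_call_used;
}

/* Pick the per-pseudo set when the option is on and the set is non-empty,
   otherwise fall back to the default call-used set.  */
reg_mask
effective_call_used (bool use_caller_save, reg_mask actual,
                     reg_mask default_call_used)
{
  return (use_caller_save && actual != 0) ? actual : default_call_used;
}

/* Simplified save decision: does the assigned hard register overlap the
   effective clobber set?  */
bool
needs_call_save (bool use_caller_save, reg_mask assigned, reg_mask actual,
                 reg_mask default_call_used)
{
  return (assigned
          & effective_call_used (use_caller_save, actual,
                                 default_call_used)) != 0;
}
```

[ In this sketch a pseudo assigned to a call-used register that the crossed
callees do not actually clobber needs no save, while a pseudo with an empty
collected set is treated conservatively. ]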



* Re: [PATCH][IRA] Analysis of register usage of functions for usage by IRA.
  2013-01-25 13:05 [PATCH][IRA] Analysis of register usage of functions for usage by IRA Tom de Vries
  2013-01-25 15:46 ` Vladimir Makarov
@ 2014-09-01 16:41 ` Ulrich Weigand
  2014-09-03 16:58   ` Tom de Vries
  1 sibling, 1 reply; 59+ messages in thread
From: Ulrich Weigand @ 2014-09-01 16:41 UTC (permalink / raw)
  To: Tom de Vries
  Cc: Vladimir Makarov, Steven Bosscher, gcc-patches, Radovan Obradovic

Tom de Vries wrote:

> 	* ira-costs.c (ira_tune_allocno_costs): Use
> 	ALLOCNO_CROSSED_CALLS_CLOBBERED_REGS to adjust costs.

In debugging PR 53864 on s390x-linux, I ran into a weird change in behavior
that occurs when the following part of this patch was checked in:

> -	      if (ira_hard_reg_set_intersection_p (regno, mode, call_used_reg_set)
> -		  || HARD_REGNO_CALL_PART_CLOBBERED (regno, mode))
> -		cost += (ALLOCNO_CALL_FREQ (a)
> -			 * (ira_memory_move_cost[mode][rclass][0]
> -			    + ira_memory_move_cost[mode][rclass][1]));
> +	      crossed_calls_clobber_regs
> +		= &(ALLOCNO_CROSSED_CALLS_CLOBBERED_REGS (a));
> +	      if (ira_hard_reg_set_intersection_p (regno, mode,
> +						   *crossed_calls_clobber_regs))
> +		{
> +		  if (ira_hard_reg_set_intersection_p (regno, mode,
> +						       call_used_reg_set)
> +		      || HARD_REGNO_CALL_PART_CLOBBERED (regno, mode))
> +		    cost += (ALLOCNO_CALL_FREQ (a)
> +			     * (ira_memory_move_cost[mode][rclass][0]
> +				+ ira_memory_move_cost[mode][rclass][1]));
>  #ifdef IRA_HARD_REGNO_ADD_COST_MULTIPLIER
> -	      cost += ((ira_memory_move_cost[mode][rclass][0]
> -			+ ira_memory_move_cost[mode][rclass][1])
> -		       * ALLOCNO_FREQ (a)
> -		       * IRA_HARD_REGNO_ADD_COST_MULTIPLIER (regno) / 2);
> +		  cost += ((ira_memory_move_cost[mode][rclass][0]
> +			    + ira_memory_move_cost[mode][rclass][1])
> +			   * ALLOCNO_FREQ (a)
> +			   * IRA_HARD_REGNO_ADD_COST_MULTIPLIER (regno) / 2);
>  #endif
> +		}

Before that patch, this code would penalize all call-clobbered registers
(if the allocno is used across a call), and it would penalize *all* registers
in a target-dependent way if IRA_HARD_REGNO_ADD_COST_MULTIPLIER is defined;
the latter is completely independent of the presence of any calls.

However, after that patch, the IRA_HARD_REGNO_ADD_COST_MULTIPLIER penalty
is only applied for registers clobbered by calls in this function.  This
seems a completely unrelated change, and looks just wrong to me ...
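
[ Illustrative note: the difference described above can be reduced to a small
model. This is a sketch under simplifying assumptions, not the code in
ira-costs.c: struct cost_inputs and both function names are hypothetical, and
each field stands in for the macro or array named in its comment. It compares
the nesting from the quoted hunk with a version where the multiplier penalty
does not depend on calls. ]

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, simplified inputs for one (allocno, hard register) pair.  */
struct cost_inputs
{
  bool crossed_calls_clobber; /* regno hits ALLOCNO_CROSSED_CALLS_CLOBBERED_REGS */
  bool call_clobbered;        /* regno hits call_used_reg_set or is part-clobbered */
  int call_freq;              /* ALLOCNO_CALL_FREQ (a) */
  int freq;                   /* ALLOCNO_FREQ (a) */
  int mem_move_cost;          /* sum of the two ira_memory_move_cost entries */
  int multiplier;             /* IRA_HARD_REGNO_ADD_COST_MULTIPLIER (regno) */
};

/* Nesting as in the quoted hunk: the multiplier penalty is only reached
   when the register is clobbered by a crossed call.  */
int
cost_nested (const struct cost_inputs *in)
{
  int cost = 0;
  assert (in->multiplier >= 0);
  if (in->crossed_calls_clobber)
    {
      if (in->call_clobbered)
        cost += in->call_freq * in->mem_move_cost;
      cost += in->mem_move_cost * in->freq * in->multiplier / 2;
    }
  return cost;
}

/* Multiplier penalty applied to every register, independent of calls.  */
int
cost_independent (const struct cost_inputs *in)
{
  int cost = 0;
  assert (in->multiplier >= 0);
  if (in->crossed_calls_clobber && in->call_clobbered)
    cost += in->call_freq * in->mem_move_cost;
  cost += in->mem_move_cost * in->freq * in->multiplier / 2;
  return cost;
}
```

[ For an allocno that crosses no calls, cost_nested returns 0 while
cost_independent still charges the target-specific penalty; for a register
clobbered by a crossed call the two agree. ]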

Was this done intentionally or is this just an oversight?

Bye,
Ulrich

-- 
  Dr. Ulrich Weigand
  GNU/Linux compilers and toolchain
  Ulrich.Weigand@de.ibm.com


* Re: [PATCH][IRA] Analysis of register usage of functions for usage by IRA.
  2014-09-01 16:41 ` Ulrich Weigand
@ 2014-09-03 16:58   ` Tom de Vries
  2014-09-03 18:12     ` Ulrich Weigand
  2014-09-04  7:37     ` Tom de Vries
  0 siblings, 2 replies; 59+ messages in thread
From: Tom de Vries @ 2014-09-03 16:58 UTC (permalink / raw)
  To: Ulrich Weigand
  Cc: Vladimir Makarov, Steven Bosscher, gcc-patches, Radovan Obradovic

[-- Attachment #1: Type: text/plain, Size: 2647 bytes --]

On 01-09-14 18:41, Ulrich Weigand wrote:
> Tom de Vries wrote:
>
>> 	* ira-costs.c (ira_tune_allocno_costs): Use
>> 	ALLOCNO_CROSSED_CALLS_CLOBBERED_REGS to adjust costs.
>
> In debugging PR 53864 on s390x-linux, I ran into a weird change in behavior
> that occurs when the following part of this patch was checked in:
>
>> -	      if (ira_hard_reg_set_intersection_p (regno, mode, call_used_reg_set)
>> -		  || HARD_REGNO_CALL_PART_CLOBBERED (regno, mode))
>> -		cost += (ALLOCNO_CALL_FREQ (a)
>> -			 * (ira_memory_move_cost[mode][rclass][0]
>> -			    + ira_memory_move_cost[mode][rclass][1]));
>> +	      crossed_calls_clobber_regs
>> +		= &(ALLOCNO_CROSSED_CALLS_CLOBBERED_REGS (a));
>> +	      if (ira_hard_reg_set_intersection_p (regno, mode,
>> +						   *crossed_calls_clobber_regs))
>> +		{
>> +		  if (ira_hard_reg_set_intersection_p (regno, mode,
>> +						       call_used_reg_set)
>> +		      || HARD_REGNO_CALL_PART_CLOBBERED (regno, mode))
>> +		    cost += (ALLOCNO_CALL_FREQ (a)
>> +			     * (ira_memory_move_cost[mode][rclass][0]
>> +				+ ira_memory_move_cost[mode][rclass][1]));
>>   #ifdef IRA_HARD_REGNO_ADD_COST_MULTIPLIER
>> -	      cost += ((ira_memory_move_cost[mode][rclass][0]
>> -			+ ira_memory_move_cost[mode][rclass][1])
>> -		       * ALLOCNO_FREQ (a)
>> -		       * IRA_HARD_REGNO_ADD_COST_MULTIPLIER (regno) / 2);
>> +		  cost += ((ira_memory_move_cost[mode][rclass][0]
>> +			    + ira_memory_move_cost[mode][rclass][1])
>> +			   * ALLOCNO_FREQ (a)
>> +			   * IRA_HARD_REGNO_ADD_COST_MULTIPLIER (regno) / 2);
>>   #endif
>> +		}
>
> Before that patch, this code would penalize all call-clobbered registers
> (if the allocno is used across a call), and it would penalize *all* registers
> in a target-dependent way if IRA_HARD_REGNO_ADD_COST_MULTIPLIER is defined;
> the latter is completely independent of the presence of any calls.
>
> However, after that patch, the IRA_HARD_REGNO_ADD_COST_MULTIPLIER penalty
> is only applied for registers clobbered by calls in this function.  This
> seems a completely unrelated change, and looks just wrong to me ...
>
> Was this done intentionally or is this just an oversight?
>

Ulrich,

thanks for noticing this. I agree, this looks wrong, and is probably an 
oversight. [ It seems that s390 is the only target defining 
IRA_HARD_REGNO_ADD_COST_MULTIPLIER, so this problem didn't show up on any other 
target. ]

I think the attached patch fixes it.

I've built the patch and run the fuse-caller-save tests, and I'm currently
bootstrapping and reg-testing it on x86_64.

Can you check whether this patch fixes the issue for s390?

Thanks,
- Tom

> Bye,
> Ulrich
>


[-- Attachment #2: 0001-Fix-IRA_HARD_REGNO_ADD_COST_MULTIPLIER-in-ira_tune_a.patch --]
[-- Type: text/x-patch, Size: 1456 bytes --]

diff --git a/gcc/ira-costs.c b/gcc/ira-costs.c
index 774a958..57239f5 100644
--- a/gcc/ira-costs.c
+++ b/gcc/ira-costs.c
@@ -2217,21 +2217,19 @@ ira_tune_allocno_costs (void)
 	      crossed_calls_clobber_regs
 		= &(ALLOCNO_CROSSED_CALLS_CLOBBERED_REGS (a));
 	      if (ira_hard_reg_set_intersection_p (regno, mode,
-						   *crossed_calls_clobber_regs))
-		{
-		  if (ira_hard_reg_set_intersection_p (regno, mode,
+						   *crossed_calls_clobber_regs)
+		  && (ira_hard_reg_set_intersection_p (regno, mode,
 						       call_used_reg_set)
-		      || HARD_REGNO_CALL_PART_CLOBBERED (regno, mode))
-		    cost += (ALLOCNO_CALL_FREQ (a)
-			     * (ira_memory_move_cost[mode][rclass][0]
-				+ ira_memory_move_cost[mode][rclass][1]));
+		      || HARD_REGNO_CALL_PART_CLOBBERED (regno, mode)))
+		cost += (ALLOCNO_CALL_FREQ (a)
+			 * (ira_memory_move_cost[mode][rclass][0]
+			    + ira_memory_move_cost[mode][rclass][1]));
 #ifdef IRA_HARD_REGNO_ADD_COST_MULTIPLIER
-		  cost += ((ira_memory_move_cost[mode][rclass][0]
-			    + ira_memory_move_cost[mode][rclass][1])
-			   * ALLOCNO_FREQ (a)
-			   * IRA_HARD_REGNO_ADD_COST_MULTIPLIER (regno) / 2);
+	      cost += ((ira_memory_move_cost[mode][rclass][0]
+			+ ira_memory_move_cost[mode][rclass][1])
+		       * ALLOCNO_FREQ (a)
+		       * IRA_HARD_REGNO_ADD_COST_MULTIPLIER (regno) / 2);
 #endif
-		}
 	      if (INT_MAX - cost < reg_costs[j])
 		reg_costs[j] = INT_MAX;
 	      else
-- 
1.9.1



* Re: [PATCH][IRA] Analysis of register usage of functions for usage by IRA.
  2014-09-03 16:58   ` Tom de Vries
@ 2014-09-03 18:12     ` Ulrich Weigand
  2014-09-03 22:24       ` Tom de Vries
  2014-09-04  7:37     ` Tom de Vries
  1 sibling, 1 reply; 59+ messages in thread
From: Ulrich Weigand @ 2014-09-03 18:12 UTC (permalink / raw)
  To: Tom de Vries
  Cc: Vladimir Makarov, Steven Bosscher, gcc-patches, Radovan Obradovic

Tom de Vries wrote:

> thanks for noticing this. I agree, this looks wrong, and is probably an 
> oversight. [ It seems that s390 is the only target defining 
> IRA_HARD_REGNO_ADD_COST_MULTIPLIER, so this problem didn't show up on any other 
> target. ]
> 
> I think the attached patch fixes it.
> 
> I've built the patch and run the fuse-caller-save tests, and I'm currently
> bootstrapping and reg-testing it on x86_64.

Thanks!

> Can you check whether this patch fixes the issue for s390?

Yes, this (which is equivalent to a patch I had been using) does fix
the s390 issue again.


Just for my curiosity, why is the second condition (after &&)
needed in this clause in the first place?

>  	      if (ira_hard_reg_set_intersection_p (regno, mode,
> +						   *crossed_calls_clobber_regs)
> +		  && (ira_hard_reg_set_intersection_p (regno, mode,
>  						       call_used_reg_set)
> -		      || HARD_REGNO_CALL_PART_CLOBBERED (regno, mode))

If a register is in crossed_calls_clobber_regs, can it ever *not*
be a call-clobbered register?
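
[ Illustrative note: the question above is whether A && (B || C) collapses to
A. That holds if every register in crossed_calls_clobber_regs is also in
call_used_reg_set, which is an assumption here and is not confirmed in this
thread. The sketch below checks the equivalence exhaustively on a hypothetical
4-register machine; reg_mask and all function names are made up for
illustration, and a bitmask stands in for HARD_REG_SET. ]

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for HARD_REG_SET: one bit per hard register.  */
typedef unsigned reg_mask;
enum { NUM_HARD_REGS = 4 };

/* The condition as written in the patch.  REGS is the set of hard
   registers occupied by (regno, mode).  */
bool
full_condition (reg_mask regs, reg_mask crossed, reg_mask call_used,
                bool part_clobbered)
{
  return (regs & crossed) != 0
         && ((regs & call_used) != 0 || part_clobbered);
}

/* The condition with the second clause dropped.  */
bool
short_condition (reg_mask regs, reg_mask crossed)
{
  return (regs & crossed) != 0;
}

/* Try every combination in which CROSSED is a subset of CALL_USED and
   report whether the two conditions always agree.  */
bool
conditions_agree_when_subset (void)
{
  assert (NUM_HARD_REGS < 8 * sizeof (reg_mask));
  const reg_mask limit = 1u << NUM_HARD_REGS;
  for (reg_mask call_used = 0; call_used < limit; call_used++)
    for (reg_mask crossed = 0; crossed < limit; crossed++)
      {
        if ((crossed & ~call_used) != 0)
          continue; /* not a subset, outside the assumption */
        for (reg_mask regs = 1; regs < limit; regs++)
          for (int part = 0; part <= 1; part++)
            if (full_condition (regs, crossed, call_used, part != 0)
                != short_condition (regs, crossed))
              return false;
      }
  return true;
}
```

[ If the subset assumption does not hold, the two conditions can differ: with
regs = 0x1, crossed = 0x1 and call_used = 0x2, short_condition is true but
full_condition is false. ]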

Bye,
Ulrich

-- 
  Dr. Ulrich Weigand
  GNU/Linux compilers and toolchain
  Ulrich.Weigand@de.ibm.com


* Re: [PATCH][IRA] Analysis of register usage of functions for usage by IRA.
  2014-09-03 18:12     ` Ulrich Weigand
@ 2014-09-03 22:24       ` Tom de Vries
  0 siblings, 0 replies; 59+ messages in thread
From: Tom de Vries @ 2014-09-03 22:24 UTC (permalink / raw)
  To: Ulrich Weigand
  Cc: Vladimir Makarov, Steven Bosscher, gcc-patches, Radovan Obradovic

On 03-09-14 20:12, Ulrich Weigand wrote:
> Just for my curiosity, why is the second condition (after &&)
> needed in this clause in the first place?
>
>>  	      if (ira_hard_reg_set_intersection_p (regno, mode,
>> +						   *crossed_calls_clobber_regs)
>> +		  && (ira_hard_reg_set_intersection_p (regno, mode,
>>  						       call_used_reg_set)
>> -		      || HARD_REGNO_CALL_PART_CLOBBERED (regno, mode))
> If a register is in crossed_calls_clobber_regs, can it ever *not*
> be a call-clobbered register?

I *think* you're right that the second condition is not needed. But I'll leave 
that for a follow-up patch.

Thanks,
- Tom


* Re: [PATCH][IRA] Analysis of register usage of functions for usage by IRA.
  2014-09-03 16:58   ` Tom de Vries
  2014-09-03 18:12     ` Ulrich Weigand
@ 2014-09-04  7:37     ` Tom de Vries
  2014-09-04 14:55       ` Vladimir Makarov
  1 sibling, 1 reply; 59+ messages in thread
From: Tom de Vries @ 2014-09-04  7:37 UTC (permalink / raw)
  To: Vladimir Makarov
  Cc: Ulrich Weigand, Steven Bosscher, gcc-patches, Radovan Obradovic

On 03-09-14 18:58, Tom de Vries wrote:
> I've built the patch and run the fuse-caller-save tests, and I'm currently
> bootstrapping and reg-testing it on x86_64.
>

Vladimir,

This patch fixes a problem (found on s390) in one of the committed 
fuse-caller-save patches. s390 is the only user of the 
IRA_HARD_REGNO_ADD_COST_MULTIPLIER target macro. The problem in the 
fuse-caller-save patch is that the code guarded by 
IRA_HARD_REGNO_ADD_COST_MULTIPLIER in ira_tune_allocno_costs is not 
call-related, but is now conditional on an ALLOCNO_CROSSED_CALLS_CLOBBERED_REGS
test. This patch fixes that.

Bootstrapped and reg-tested on x86_64. No issues found ( other than a 
non-reproducible failure while testing the non-bootstrap version: 
https://gcc.gnu.org/ml/gcc/2014-09/msg00065.html ).

OK for trunk ?

Thanks,
- Tom

2014-09-04  Tom de Vries  <tom@codesourcery.com>

	* ira-costs.c (ira_tune_allocno_costs): Don't conditionalize
	IRA_HARD_REGNO_ADD_COST_MULTIPLIER code on
	ALLOCNO_CROSSED_CALLS_CLOBBERED_REGS.

>
>
> 0001-Fix-IRA_HARD_REGNO_ADD_COST_MULTIPLIER-in-ira_tune_a.patch
>
>
> diff --git a/gcc/ira-costs.c b/gcc/ira-costs.c
> index 774a958..57239f5 100644
> --- a/gcc/ira-costs.c
> +++ b/gcc/ira-costs.c
> @@ -2217,21 +2217,19 @@ ira_tune_allocno_costs (void)
>   	      crossed_calls_clobber_regs
>   		= &(ALLOCNO_CROSSED_CALLS_CLOBBERED_REGS (a));
>   	      if (ira_hard_reg_set_intersection_p (regno, mode,
> -						   *crossed_calls_clobber_regs))
> -		{
> -		  if (ira_hard_reg_set_intersection_p (regno, mode,
> +						   *crossed_calls_clobber_regs)
> +		  && (ira_hard_reg_set_intersection_p (regno, mode,
>   						       call_used_reg_set)
> -		      || HARD_REGNO_CALL_PART_CLOBBERED (regno, mode))
> -		    cost += (ALLOCNO_CALL_FREQ (a)
> -			     * (ira_memory_move_cost[mode][rclass][0]
> -				+ ira_memory_move_cost[mode][rclass][1]));
> +		      || HARD_REGNO_CALL_PART_CLOBBERED (regno, mode)))
> +		cost += (ALLOCNO_CALL_FREQ (a)
> +			 * (ira_memory_move_cost[mode][rclass][0]
> +			    + ira_memory_move_cost[mode][rclass][1]));
>   #ifdef IRA_HARD_REGNO_ADD_COST_MULTIPLIER
> -		  cost += ((ira_memory_move_cost[mode][rclass][0]
> -			    + ira_memory_move_cost[mode][rclass][1])
> -			   * ALLOCNO_FREQ (a)
> -			   * IRA_HARD_REGNO_ADD_COST_MULTIPLIER (regno) / 2);
> +	      cost += ((ira_memory_move_cost[mode][rclass][0]
> +			+ ira_memory_move_cost[mode][rclass][1])
> +		       * ALLOCNO_FREQ (a)
> +		       * IRA_HARD_REGNO_ADD_COST_MULTIPLIER (regno) / 2);
>   #endif
> -		}
>   	      if (INT_MAX - cost < reg_costs[j])
>   		reg_costs[j] = INT_MAX;
>   	      else


* Re: [PATCH][IRA] Analysis of register usage of functions for usage by IRA.
  2014-09-04  7:37     ` Tom de Vries
@ 2014-09-04 14:55       ` Vladimir Makarov
  0 siblings, 0 replies; 59+ messages in thread
From: Vladimir Makarov @ 2014-09-04 14:55 UTC (permalink / raw)
  To: Tom de Vries
  Cc: Ulrich Weigand, Steven Bosscher, gcc-patches, Radovan Obradovic

On 2014-09-04 3:37 AM, Tom de Vries wrote:
> On 03-09-14 18:58, Tom de Vries wrote:
>> I've built the patch and run the fuse-caller-save tests, and I'm
>> currently
>> bootstrapping and reg-testing it on x86_64.
>>
>
> Vladimir,
>
> This patch fixes a problem (found on s390) in one of the committed
> fuse-caller-save patches. s390 is the only user of the
> IRA_HARD_REGNO_ADD_COST_MULTIPLIER target macro. The problem in the
> fuse-caller-save patch is that the code guarded by
> IRA_HARD_REGNO_ADD_COST_MULTIPLIER in ira_tune_allocno_costs is not
> call-related, but is now conditional on an
> ALLOCNO_CROSSED_CALLS_CLOBBERED_REGS test. This patch fixes that.
>
> Bootstrapped and reg-tested on x86_64. No issues found ( other than a
> non-reproducible failure while testing the non-bootstrap version:
> https://gcc.gnu.org/ml/gcc/2014-09/msg00065.html ).
>
> OK for trunk ?
>

Yes, Tom.  Thanks for fixing the problem Ulrich found.



end of thread, other threads:[~2014-09-04 14:55 UTC | newest]

Thread overview: 59+ messages
2013-01-25 13:05 [PATCH][IRA] Analysis of register usage of functions for usage by IRA Tom de Vries
2013-01-25 15:46 ` Vladimir Makarov
2013-02-07 19:12   ` Tom de Vries
2013-02-13 22:35     ` Vladimir Makarov
2013-03-14  9:35       ` Tom de Vries
2013-03-14 15:22         ` Vladimir Makarov
2013-03-29 12:54           ` Tom de Vries
2013-03-29 13:06             ` [PATCH][06/10] -fuse-caller-save - Collect register usage information Tom de Vries
2013-03-29 13:06             ` [PATCH][09/10] -fuse-caller-save - Add documentation Tom de Vries
2013-03-29 13:06             ` [PATCH][02/10] -fuse-caller-save - Add new reg-note REG_CALL_DECL Tom de Vries
2013-03-29 13:06             ` [PATCH][01/10] -fuse-caller-save - Add command line option Tom de Vries
2013-03-29 13:06             ` [PATCH][05/10] -fuse-caller-save - Implement TARGET_FN_OTHER_HARD_REG_USAGE hook for ARM Tom de Vries
2013-03-29 13:06             ` [PATCH][04/10] -fuse-caller-save - Add TARGET_FN_OTHER_HARD_REG_USAGE hook Tom de Vries
2013-03-29 13:06             ` [PATCH][08/10] -fuse-caller-save - Enable by default at O2 and higher Tom de Vries
2013-03-29 13:06             ` [PATCH][10/10] -fuse-caller-save - Add test-case Tom de Vries
2013-03-29 13:06             ` [PATCH][03/10] -fuse-caller-save - Add implicit parameter to find_all_hard_reg_sets Tom de Vries
2013-03-29 13:06             ` [PATCH][07/10] -fuse-caller-save - Use collected register usage information Tom de Vries
2013-03-30 16:10             ` [PATCH][IRA] Analysis of register usage of functions for usage by IRA Tom de Vries
2014-01-09 14:42               ` Richard Earnshaw
2014-01-09 20:56                 ` Tom de Vries
2014-01-09 21:10                   ` Andi Kleen
2014-01-10  0:22                     ` Tom de Vries
2014-01-10 11:39                   ` Richard Earnshaw
2014-01-10 16:44                     ` Tom de Vries
2014-01-13 16:16                     ` Tom de Vries
2014-01-14 10:00                       ` Richard Earnshaw
2013-03-30 17:11             ` [PATCH][05/10] -fuse-caller-save - Implement TARGET_FN_OTHER_HARD_REG_USAGE hook for ARM Tom de Vries
2013-12-06  0:54               ` Tom de Vries
2013-12-09 10:03                 ` Richard Earnshaw
2013-03-30 17:11             ` [PATCH][02/10] -fuse-caller-save - Add new reg-note REG_CALL_DECL Tom de Vries
2013-03-30 17:11             ` [PATCH][06/10] -fuse-caller-save - Collect register usage information Tom de Vries
2013-03-30 17:11             ` [PATCH][01/10] -fuse-caller-save - Add command line option Tom de Vries
2013-03-30 17:11             ` [PATCH][04/10] -fuse-caller-save - Add TARGET_FN_OTHER_HARD_REG_USAGE hook Tom de Vries
2013-12-07 15:07               ` [PATCH] -fuse-caller-save - Implement TARGET_FN_OTHER_HARD_REG_USAGE hook for MIPS Tom de Vries
2013-12-25 13:02                 ` Tom de Vries
2014-01-09 13:51                   ` [PING^2][PATCH] " Tom de Vries
2014-01-09 15:31                     ` Richard Sandiford
2014-01-09 23:43                       ` Tom de Vries
2014-01-10  8:47                         ` Richard Sandiford
2014-01-13 15:04                           ` Tom de Vries
2013-03-30 17:11             ` [PATCH][03/10] -fuse-caller-save - Add implicit parameter to find_all_hard_reg_sets Tom de Vries
2013-03-30 17:12             ` [PATCH][07/10] -fuse-caller-save - Use collected register usage information Tom de Vries
2013-12-06  0:56               ` Tom de Vries
2013-12-06  9:11                 ` Paolo Bonzini
2013-03-30 17:12             ` [PATCH][10/10] -fuse-caller-save - Add test-case Tom de Vries
2013-04-28 10:57               ` Richard Sandiford
2013-12-06  0:34                 ` Tom de Vries
2013-12-06  8:51                   ` Richard Sandiford
2013-03-30 17:12             ` [PATCH][09/10] -fuse-caller-save - Add documentation Tom de Vries
2013-03-30 17:12             ` [PATCH][08/10] -fuse-caller-save - Enable by default at O2 and higher Tom de Vries
2013-12-06  0:47         ` [PATCH][IRA] Analysis of register usage of functions for usage by IRA Tom de Vries
2014-01-14 19:36           ` Vladimir Makarov
2014-05-30  9:20             ` Tom de Vries
2014-09-01 16:41 ` Ulrich Weigand
2014-09-03 16:58   ` Tom de Vries
2014-09-03 18:12     ` Ulrich Weigand
2014-09-03 22:24       ` Tom de Vries
2014-09-04  7:37     ` Tom de Vries
2014-09-04 14:55       ` Vladimir Makarov
