public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed
* [Ping] [patch 0/3] New macro PREFERRED_RENAME_CLASS
@ 2010-11-04  5:31 Yao Qi
  2010-11-05  9:53 ` Eric Botcazou
  2010-11-20 12:05 ` Eric Botcazou
  0 siblings, 2 replies; 26+ messages in thread
From: Yao Qi @ 2010-11-04  5:31 UTC (permalink / raw)
  To: gcc-patches

I sent three patches to add some logic in register-rename pass to prefer 
for different register classes.  After some reviewers' review, three 
patches are updated and refactored like this,

Patch 1: does everything except adding target hook 
preferred_rename_class .  It is a separate improvement to register-rename.
http://gcc.gnu.org/ml/gcc-patches/2010-10/msg02197.html

Patch 2: add target hook with doc, and use this hook in regrename.c
http://gcc.gnu.org/ml/gcc-patches/2010-11/msg00029.html

Patch 3: it is arm specific patch.
http://gcc.gnu.org/ml/gcc-patches/2010-10/msg01785.html

Middle-End maintainers,
could you review patch 1 and patch 2?

-- 
Yao Qi
CodeSourcery
yao@codesourcery.com
(650) 331-3385 x739

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [Ping] [patch 0/3] New macro PREFERRED_RENAME_CLASS
  2010-11-04  5:31 [Ping] [patch 0/3] New macro PREFERRED_RENAME_CLASS Yao Qi
@ 2010-11-05  9:53 ` Eric Botcazou
  2010-11-05 16:08   ` Yao Qi
  2010-11-20 12:05 ` Eric Botcazou
  1 sibling, 1 reply; 26+ messages in thread
From: Eric Botcazou @ 2010-11-05  9:53 UTC (permalink / raw)
  To: Yao Qi; +Cc: gcc-patches

> Patch 1: does everything except adding target hook
> preferred_rename_class .  It is a separate improvement to register-rename.
> http://gcc.gnu.org/ml/gcc-patches/2010-10/msg02197.html

You didn't really inline the callback, did you?  If you want to keep the 
comparer function standalone, then remove the superfluous indirection.


@@ -51,6 +51,8 @@ struct du_head
   struct du_head *next_chain;
   /* The first and last elements of this chain.  */
   struct du_chain *first, *last;
+  /* The length of du_chain for non-debug insns.  */
+  int length;
   /* Describes the register being tracked.  */
   unsigned regno, nregs;
 
du_chain is a structure tag, so you need to elaborate.



@@ -154,6 +156,141 @@ merge_overlapping_regs (HARD_REG_SET *pset, struct 
du_head *head)
     }
 }
 
+/* Get the next LENGTH element in list from START.  If length of
+   list from START is less than or equal to LENGTH, return the 
+   last element.  */
+static struct du_head *
+get_tail_du_head (struct du_head *start, int length)
+{
+  while (length-- && start->next_chain != NULL)
+      start = start->next_chain;
+
+  return start;

Avoid TAIL and HEAD in the same function name.  GET_ELEMENT is clearer.  And 
LENGTH is a poor choice for the parameter name, use N instead.

/* Return the Nth element in LIST.  If LIST contains less than N elements,
   return the last one.  */

static struct du_head *
get_element (struct du_head *list, int n)



+/* Merge the first two sequences of CURRENT_LENGTH nodes in the 
+   linked list pointed by START_NODE.  Update START_NODE to point 
+   to the merged nodes, and return a pointer to the last merged
+   node.   CMP is callback  for comparison.  */
+
+static struct du_head *
+merge(struct du_head **start_node, int current_length,
+      int (*cmp) (const struct du_head *, const struct du_head *))

Why CURRENT_LENGTH?  Just use LENGTH.

The first sentence is somewhat ambiguous as one might get the impression that 
the 2 sequences aren't already present in the list.

/* Merge the first 2 sub-lists of LENGTH nodes contained in the linked list
   pointed to by START_NODE.  Update...

Document what happens when the linked list doesn't contain enough elements.


+	  current_tail_node = 
+	    get_tail_du_head (current_sorted_node,
+			      current_length - left_count);

'=' on the next line


+  return current_tail_node->next_chain != NULL ? 
+    current_tail_node : NULL;

return (current_tail_node->next_chain ? current_tail_node : NULL);


+/* Non-recursive merge sort to du_head list.  */
+static void
+sort_du_head (struct du_head **head,
+	      int(*cmp)(const struct du_head *, const struct du_head *))

/* Sort the linked list pointed to by HEAD.  The algorithm is a non-recursive
   merge sort [that ...optional more detailed explanation...]


+  do
+    {
+      struct du_head **segment = head;
+      

Trailing spaces.

+      while ((last_tail = merge(segment, current_length, cmp)) 
+	     != NULL)

Why breaking the line?

+	{
+	  segment = &last_tail->next_chain;
+	}

Superfluous curly braces.


Are you sure that this revised implementation works as expected?  Won't it 
stop after the 1-length merge phase?


@@ -265,13 +405,13 @@ regrename_optimize (void)
 
 	  /* Now potential_regs is a reasonable approximation, let's
 	     have a closer look at each register still in there.  */
-	  for (new_reg = 0; new_reg < FIRST_PSEUDO_REGISTER; new_reg++)
+	  for (new_reg = FIRST_PSEUDO_REGISTER - 1; new_reg >= 0; new_reg--)

What is this hunk for?  It isn't mentioned in the ChangeLog.

-- 
Eric Botcazou

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [Ping] [patch 0/3] New macro PREFERRED_RENAME_CLASS
  2010-11-05  9:53 ` Eric Botcazou
@ 2010-11-05 16:08   ` Yao Qi
  2010-11-08 12:52     ` Eric Botcazou
  0 siblings, 1 reply; 26+ messages in thread
From: Yao Qi @ 2010-11-05 16:08 UTC (permalink / raw)
  To: Eric Botcazou; +Cc: gcc-patches

[-- Attachment #1: Type: text/plain, Size: 4671 bytes --]

On 11/05/2010 04:55 PM, Eric Botcazou wrote:
>> Patch 1: does everything except adding target hook
>> preferred_rename_class .  It is a separate improvement to register-rename.
>> http://gcc.gnu.org/ml/gcc-patches/2010-10/msg02197.html
>
> You didn't really inline the callback, did you?  If you want to keep the
> comparer function standalone, then remove the superfluous indirection.
>

Done.  Removed function pointer CMP, and use merge_sort_callback directly.

>
> @@ -51,6 +51,8 @@ struct du_head
>     struct du_head *next_chain;
>     /* The first and last elements of this chain.  */
>     struct du_chain *first, *last;
> +  /* The length of du_chain for non-debug insns.  */
> +  int length;
>     /* Describes the register being tracked.  */
>     unsigned regno, nregs;
>
> du_chain is a structure tag, so you need to elaborate.
>
>

Add one sentence in comment to explain it a little more, "The du_head 
linked list can be sorted by this, and register-rename can prefer 
register classes according to this order.".

>
> @@ -154,6 +156,141 @@ merge_overlapping_regs (HARD_REG_SET *pset, struct
> du_head *head)
>       }
>   }
>
> +/* Get the next LENGTH element in list from START.  If length of
> +   list from START is less than or equal to LENGTH, return the
> +   last element.  */
> +static struct du_head *
> +get_tail_du_head (struct du_head *start, int length)
> +{
> +  while (length--&&  start->next_chain != NULL)
> +      start = start->next_chain;
> +
> +  return start;
>
> Avoid TAIL and HEAD in the same function name.  GET_ELEMENT is clearer.  And
> LENGTH is a poor choice for the parameter name, use N instead.
>
> /* Return the Nth element in LIST.  If LIST contains less than N elements,
>     return the last one.  */
>
> static struct du_head *
> get_element (struct du_head *list, int n)
>
>

Fixed as you suggested.

>
> +/* Merge the first two sequences of CURRENT_LENGTH nodes in the
> +   linked list pointed by START_NODE.  Update START_NODE to point
> +   to the merged nodes, and return a pointer to the last merged
> +   node.   CMP is callback  for comparison.  */
> +
> +static struct du_head *
> +merge(struct du_head **start_node, int current_length,
> +      int (*cmp) (const struct du_head *, const struct du_head *))
>
> Why CURRENT_LENGTH?  Just use LENGTH.
>

Done.

> The first sentence is somewhat ambiguous as one might get the impression that
> the 2 sequences aren't already present in the list.
>
> /* Merge the first 2 sub-lists of LENGTH nodes contained in the linked list
>     pointed to by START_NODE.  Update...
>
> Document what happens when the linked list doesn't contain enough elements.
>
>

Done.

> +	  current_tail_node =
> +	    get_tail_du_head (current_sorted_node,
> +			      current_length - left_count);
>
> '=' on the next line
>
>

Done.

> +  return current_tail_node->next_chain != NULL ?
> +    current_tail_node : NULL;
>
> return (current_tail_node->next_chain ? current_tail_node : NULL);
>

Done.

>
> +/* Non-recursive merge sort to du_head list.  */
> +static void
> +sort_du_head (struct du_head **head,
> +	      int(*cmp)(const struct du_head *, const struct du_head *))
>
> /* Sort the linked list pointed to by HEAD.  The algorithm is a non-recursive
>     merge sort [that ...optional more detailed explanation...]
>
>

Done.  Add more detailed explanation in comments.

> +  do
> +    {
> +      struct du_head **segment = head;
> +
>
> Trailing spaces.
>

Which one is trailing space?

> +      while ((last_tail = merge(segment, current_length, cmp))
> +	     != NULL)
>
> Why breaking the line?
>

Fixed.

> +	{
> +	  segment =&last_tail->next_chain;
> +	}
>
> Superfluous curly braces.
>

Fixed.

>
> Are you sure that this revised implementation works as expected?  Won't it
> stop after the 1-length merge phase?
>

Ur, my first version is correct.  Revert it to my first version.

>
> @@ -265,13 +405,13 @@ regrename_optimize (void)
>
>   	  /* Now potential_regs is a reasonable approximation, let's
>   	     have a closer look at each register still in there.  */
> -	  for (new_reg = 0; new_reg<  FIRST_PSEUDO_REGISTER; new_reg++)
> +	  for (new_reg = FIRST_PSEUDO_REGISTER - 1; new_reg>= 0; new_reg--)
>
> What is this hunk for?  It isn't mentioned in the ChangeLog.
>

Yeah, we should explain it a little bit in comments.  I add this 
paragraph below in comments, and ChangeLog will be updated accordingly.
"Registers from high to low is iterated here, in order to 	     prefer 
low registers.  On some ports, low registers are 	     better than high 
registers in terms of final code size."

-- 
Yao Qi
CodeSourcery
yao@codesourcery.com
(650) 331-3385 x739

[-- Attachment #2: regrename_improve_1105.patch --]
[-- Type: text/x-patch, Size: 7438 bytes --]

gcc/
        * regrename.c (struct du_head): Add new element length.
        (sort_du_head, get_element, merge, merge_sort_callback):
        New functions of merge sort implementation to du_head list.
        (regrename_optimize): Sort du_head linked list by length.
	Iterate registers in a reversed order to prefer low registers.
        (create_new_chain):  Initialize length.
        (scan_rtx_reg): Increase length for non-debug insns.

diff --git a/gcc/regrename.c b/gcc/regrename.c
index 2535ab7..31745f8 100644
--- a/gcc/regrename.c
+++ b/gcc/regrename.c
@@ -51,6 +51,10 @@ struct du_head
   struct du_head *next_chain;
   /* The first and last elements of this chain.  */
   struct du_chain *first, *last;
+  /* The length of du_chain for non-debug insns.  The du_head linked
+     list can be sorted by this, and register-rename can prefer
+     register classes according to this order.  */
+  int length;
   /* Describes the register being tracked.  */
   unsigned regno, nregs;
 
@@ -154,6 +158,151 @@ merge_overlapping_regs (HARD_REG_SET *pset, struct du_head *head)
     }
 }
 
+/* Return the Nth element in LIST.  If LIST contains less than N
+   elements, return the last one.  */
+static struct du_head *
+get_element (struct du_head *list, int n)
+{
+  while (n-- && list->next_chain != NULL)
+    list = list->next_chain;
+
+  return list;
+}
+
+/* Comparison function of merge sort.  Return true if A is less than
+   B, otherwise return false.  */
+static inline int 
+merge_sort_callback(const struct du_head *a, 
+		    const struct du_head *b)
+{
+  return a->length < b->length;
+}
+
+/* Merge the first 2 sub-lists of LENGTH nodes contained in the
+   linked list pointed to by START_NODE.  Update START_NODE to point 
+   to the merged nodes, and return a pointer to the last merged
+   node.  Return NULL if START_NODE doesn't contain enough
+   elements, or this pass of merge is done.  */
+static struct du_head *
+merge(struct du_head **start_node, int length)
+{
+  int i, left_count, right_count;
+  struct du_head *left, *right;
+  /* Current node of sort result.  */
+  struct du_head *current_sorted_node;
+  /* Tail node of sort, used to connect with next piece of list.  */
+  struct du_head *current_tail_node;
+
+  if (*start_node == NULL)
+    return NULL;
+
+  left = right = *start_node;
+  right_count = left_count = 0;
+
+  /* Step RIGHT along the list by LENGTH places.  */
+  for (i = 0; i < length; i++)
+    {
+      right = right->next_chain;
+      if (right == NULL)
+	  return NULL;
+    }
+
+  /* Initialize current_sorted_node.  */
+  if (merge_sort_callback (left, right))
+    {
+      ++right_count;
+      current_sorted_node = right;
+      *start_node = right;
+      right = right->next_chain;
+    }
+  else
+    {
+      ++left_count;
+      current_sorted_node = left;
+      left = left->next_chain;
+    }
+
+  while (1)
+    {
+      /* Choose LEFT or RIGHT to take the next element from.  If 
+	 either is empty, choose from the other one.  */
+      if (left_count == length || left == NULL)
+	{
+	  current_sorted_node->next_chain = right;
+	  current_tail_node 
+	    = get_element (current_sorted_node, 
+			   length - right_count);
+
+	  break;
+	}
+      else if (right_count == length || right == NULL)
+	{
+	  /* Save the head node of next piece of linked list.  */
+	  struct du_head *tmp = current_sorted_node->next_chain;
+
+	  current_sorted_node->next_chain = left;
+	  current_tail_node
+	    = get_element (current_sorted_node,
+			   length - left_count);
+	  /* Connect sorted list to next piece of list.  */
+	  current_tail_node->next_chain = tmp;
+	  break;
+	}
+      else
+	{
+	  /* Normal merge operations.  If both LEFT and RIGHT are 
+	     non-empty, compare the first element of each and choose 
+	     the lower one.  */
+	  if (merge_sort_callback (left, right))
+	    {
+	      right_count++;
+	      current_sorted_node->next_chain = right;
+	      right = right->next_chain;
+	    }
+	  else
+	    {
+	      left_count++;
+	      current_sorted_node->next_chain = left;
+	      left = left->next_chain;
+	    }
+	  current_sorted_node = current_sorted_node->next_chain;
+	}
+    }
+  /* Returns NULL if this pass of merge is done.  */
+  return (current_tail_node->next_chain ? current_tail_node : NULL);
+}
+
+/* Sort the linked list pointed to by HEAD.  The algorithm is a
+   non-recursive merge sort to linked list.  */
+static void
+sort_du_head (struct du_head **head)
+{
+  int current_length = 1;
+  struct du_head *last_tail;
+
+  /* In each pass, lists of size current_length is merged to 
+     lists of size 2xcurrent_length (Initially current_length 
+     is 1).  */
+  while (1)
+    {
+      last_tail = merge(head, current_length);
+      if (last_tail != NULL)
+	{
+	  do
+	    {
+	      last_tail = merge (&last_tail->next_chain, 
+				 current_length);
+	    }
+	  while (last_tail != NULL);
+	  current_length *= 2;
+	}
+      else
+	break;
+    }
+}
+
 /* Perform register renaming on the current function.  */
 
 static unsigned int
@@ -195,6 +344,8 @@ regrename_optimize (void)
       if (dump_file)
 	dump_def_use_chain (all_chains);
 
+      sort_du_head (&all_chains);
+
       CLEAR_HARD_REG_SET (unavailable);
       /* Don't clobber traceback for noreturn functions.  */
       if (frame_pointer_needed)
@@ -251,6 +402,7 @@ regrename_optimize (void)
 	      if (DEBUG_INSN_P (tmp->insn))
 		continue;
 	      n_uses++;
 	      IOR_COMPL_HARD_REG_SET (this_unavailable,
 				      reg_class_contents[tmp->cl]);
 	    }
@@ -264,14 +416,17 @@ regrename_optimize (void)
 	  merge_overlapping_regs (&this_unavailable, this_head);
 
 	  /* Now potential_regs is a reasonable approximation, let's
-	     have a closer look at each register still in there.  */
-	  for (new_reg = 0; new_reg < FIRST_PSEUDO_REGISTER; new_reg++)
+	     have a closer look at each register still in there.
+	     Registers from high to low is iterated here, in order to
+	     prefer low registers.  On some ports, low registers are
+	     better than high registers in terms of final code size.  */
+	  for (new_reg = FIRST_PSEUDO_REGISTER - 1; new_reg >= 0; new_reg--)
 	    {
 	      enum machine_mode mode = GET_MODE (*this_head->first->loc);
 	      int nregs = hard_regno_nregs[new_reg][mode];
 
 	      for (i = nregs - 1; i >= 0; --i)
-	        if (TEST_HARD_REG_BIT (this_unavailable, new_reg + i)
+		if (TEST_HARD_REG_BIT (this_unavailable, new_reg + i)
 		    || fixed_regs[new_reg + i]
 		    || global_regs[new_reg + i]
 		    /* Can't use regs which aren't saved by the prologue.  */
@@ -527,6 +682,7 @@ create_new_chain (unsigned this_regno, unsigned this_nregs, rtx *loc,
   head->need_caller_save_reg = 0;
   head->cannot_rename = 0;
   head->terminated = 0;
+  head->length = 0;
 
   VEC_safe_push (du_head_p, heap, id_to_chain, head);
   head->id = current_id++;
@@ -572,6 +728,8 @@ create_new_chain (unsigned this_regno, unsigned this_nregs, rtx *loc,
   this_du->loc = loc;
   this_du->insn = insn;
   this_du->cl = cl;
+
+  head->length = 1;
 }
 
 static void
@@ -661,6 +819,8 @@ scan_rtx_reg (rtx insn, rtx *loc, enum reg_class cl, enum scan_actions action,
 		head->last->next_use = this_du;
 	      head->last = this_du;
 
+	      if (!DEBUG_INSN_P (insn))
+		head->length++;
 	    }
 	  /* Avoid adding the same location in a DEBUG_INSN multiple times,
 	     which could happen with non-exact overlap.  */

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [Ping] [patch 0/3] New macro PREFERRED_RENAME_CLASS
  2010-11-05 16:08   ` Yao Qi
@ 2010-11-08 12:52     ` Eric Botcazou
  2010-11-08 15:11       ` Yao Qi
  0 siblings, 1 reply; 26+ messages in thread
From: Eric Botcazou @ 2010-11-08 12:52 UTC (permalink / raw)
  To: Yao Qi; +Cc: gcc-patches

> Done.  Removed function pointer CMP, and use merge_sort_callback directly.

+/* Comparison function of merge sort.  Return true if A is less than
+   B, otherwise return false.  */
+static inline int 
+merge_sort_callback(const struct du_head *a, 
+		    const struct du_head *b)
+{
+  return a->length < b->length;
+}

It isn't really a callback anymore. :-)

There are trailing spaces after the "int" and the "*a,".  There are more of 
them throughout the patch.  Good editors have options to make them visible.

> Add one sentence in comment to explain it a little more, "The du_head
> linked list can be sorted by this, and register-rename can prefer
> register classes according to this order.".

Thanks.  Still "the length of du_chain" doesn't make much sense.  So I'd write
"The number of elements of this chain, excluding those corresponding to
 references of the register in debug insns."

> Fixed as you suggested.

Thanks.  Note that we generally have a blank line between the block comment of 
a function and the function itself.

+  /* Returns NULL if this pass of merge is done.  */
+  return (current_tail_node->next_chain ? current_tail_node : NULL);

No 's' at the end of "Return".

> Ur, my first version is correct.  Revert it to my first version.

Looks better indeed. :-)  No need to break a line if it doesn't overflow.

> Yeah, we should explain it a little bit in comments.  I add this
> paragraph below in comments, and ChangeLog will be updated accordingly.
> "Registers from high to low is iterated here, in order to 	     prefer
> low registers.  On some ports, low registers are 	     better than high
> registers in terms of final code size."

This can be different on other ports so you cannot change it unilaterally.

Why do you need this for the ARM?  Won't the hook already achieve this?

On the other hand, the hook seems overly aggressive: IIUC, on the ARM, it will 
prevent non-LO_REGS from being renaming registers for GENERAL_REGS.  Do we 
really want not to rename if all LO_REGS are already taken, instead of using 
an available non-LO_REGS reg?

Wouldn't it be better to add a more fine-grained hook, for example 

DEFHOOK
(preferred_renaming_class,
 "",
 reg_class_t,
 (unsigned int, reg_class_t),
 hook_regclasst_unsignedint_regclasst_null)

that would return the Nth preferred renaming (sub)class when passed an integer 
N (0 being the most preferred) and a class of registers?  

-- 
Eric Botcazou

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [Ping] [patch 0/3] New macro PREFERRED_RENAME_CLASS
  2010-11-08 12:52     ` Eric Botcazou
@ 2010-11-08 15:11       ` Yao Qi
  2010-11-09  9:26         ` Eric Botcazou
  0 siblings, 1 reply; 26+ messages in thread
From: Yao Qi @ 2010-11-08 15:11 UTC (permalink / raw)
  To: Eric Botcazou; +Cc: gcc-patches

On 11/08/2010 08:34 PM, Eric Botcazou wrote:
>> Done.  Removed function pointer CMP, and use merge_sort_callback directly.
> 
> +/* Comparison function of merge sort.  Return true if A is less than
> +   B, otherwise return false.  */
> +static inline int 
> +merge_sort_callback(const struct du_head *a, 
> +		    const struct du_head *b)
> +{
> +  return a->length < b->length;
> +}
> 
> It isn't really a callback anymore. :-)
> 
> There are trailing spaces after the "int" and the "*a,".  There are more of 
> them throughout the patch.  Good editors have options to make them visible.
> 

I noticed it when I installed whitespace.el in my emacs. :)  Thanks for
your suggestions.  I'll include my fixes as you pointed out in my next
patch.

>> Yeah, we should explain it a little bit in comments.  I add this
>> paragraph below in comments, and ChangeLog will be updated accordingly.
>> "Registers from high to low is iterated here, in order to 	     prefer
>> low registers.  On some ports, low registers are 	     better than high
>> registers in terms of final code size."
> 
> This can be different on other ports so you cannot change it unilaterally.
> 

Currently, we iterate registers from low to high, and resulting
"preferring" high registers in each iteration.  Generally speaking, low
registers are "better", at least not worse, than high register, in terms
of code size or other metrics.  Why can't we reverse the order of
iteration?  I don't see any ports do something specific to this
iteration order.  I run regression test on ARM, X86, X86_64, and don't
see any regressions if iteration order is reversed.  So I assume that it
is not harmful to reverse the iteration order to all ports, and useful
to some ports, not only to ARM.

> Why do you need this for the ARM?  Won't the hook already achieve this?
> 

The goal of this piece of work is to assign low registers to mostly used
registers, so that code size can be reduced.  There is similar feature
on X86_64, pointed out by HJ.  In order to achieve this, we need 1) a
target hook to prefer low register class, 2) iterate registers from high
to low.

> On the other hand, the hook seems overly aggressive: IIUC, on the ARM, it will 
> prevent non-LO_REGS from being renaming registers for GENERAL_REGS.  Do we 
> really want not to rename if all LO_REGS are already taken, instead of using 
> an available non-LO_REGS reg?

Yes, the situation you described is correct, but this situation is *not*
popular, because we've sorted du_head linked list, and assign mostly
used registers to low registers.  I can't give you a quantitative
measurements, but code size of some programs is reduced.

> Wouldn't it be better to add a more fine-grained hook, for example 
> 
> DEFHOOK
> (preferred_renaming_class,
>  "",
>  reg_class_t,
>  (unsigned int, reg_class_t),
>  hook_regclasst_unsignedint_regclasst_null)
> 
> that would return the Nth preferred renaming (sub)class when passed an integer 
> N (0 being the most preferred) and a class of registers?  

I thought of target hook like this before, but this can't solve the
issue of iteration order.  If different ports need different iteration
order, we have to provide a target hook for this.

-- 
Yao Qi
CodeSourcery
yao@codesourcery.com
(650) 331-3385 x739

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [Ping] [patch 0/3] New macro PREFERRED_RENAME_CLASS
  2010-11-08 15:11       ` Yao Qi
@ 2010-11-09  9:26         ` Eric Botcazou
  2010-11-09 15:33           ` Yao Qi
  0 siblings, 1 reply; 26+ messages in thread
From: Eric Botcazou @ 2010-11-09  9:26 UTC (permalink / raw)
  To: Yao Qi; +Cc: gcc-patches

> Currently, we iterate registers from low to high, and resulting
> "preferring" high registers in each iteration.  Generally speaking, low
> registers are "better", at least not worse, than high register, in terms
> of code size or other metrics.  Why can't we reverse the order of
> iteration?

Generally we avoid changing things if we cannot prove that there is a benefit 
in doing so.  The compiler supports many architectures and you never really 
know how a change behaves with each of them in particular.  Why would "low" 
registers be universally better than "high" registers?

> I don't see any ports do something specific to this iteration order.  I run
> regression test on ARM, X86, X86_64, and don't see any regressions if
> iteration order is reversed.  So I assume that it is not harmful to reverse
> the iteration order to all ports, and useful to some ports, not only to ARM.

Testing on a couple of architectures isn't really sufficient to conclude here 
in my opinion, so I wouldn't change this, unless specifically requested by 
the presence of the hook.

> The goal of this piece of work is to assign low registers to mostly used
> registers, so that code size can be reduced.  There is similar feature
> on X86_64, pointed out by HJ.  In order to achieve this, we need 1) a
> target hook to prefer low register class, 2) iterate registers from high
> to low.

OK, but AFAICS the target hook as currently written isn't a "preference" thing 
but an "exclusion" thing, i.e. all the non-LO_REGS will be excluded and not 
only disparaged for GENERAL_REGS.  So, once non-LO_REGS are excluded, why 
changing the iteration order?  Does the iteration order within the LO_REGS 
class matter as well for the ARM?

> Yes, the situation you described is correct, but this situation is *not*
> popular, because we've sorted du_head linked list, and assign mostly
> used registers to low registers.  I can't give you a quantitative
> measurements, but code size of some programs is reduced.

OK, on the ARM, but what about the 20 other architectures that may want to 
paramerize the pass too?

> I thought of target hook like this before, but this can't solve the
> issue of iteration order.  If different ports need different iteration
> order, we have to provide a target hook for this.

This can solve the iteration order like so: you compute a "representative" 
class for the renaming chain, and then you iterate over N, call the hook, and 
iterate over the contents of the class it returns.

-- 
Eric Botcazou

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [Ping] [patch 0/3] New macro PREFERRED_RENAME_CLASS
  2010-11-09  9:26         ` Eric Botcazou
@ 2010-11-09 15:33           ` Yao Qi
  2010-11-11 17:40             ` Eric Botcazou
  0 siblings, 1 reply; 26+ messages in thread
From: Yao Qi @ 2010-11-09 15:33 UTC (permalink / raw)
  To: Eric Botcazou; +Cc: gcc-patches

On 11/09/2010 05:16 PM, Eric Botcazou wrote:
>> Currently, we iterate registers from low to high, and resulting
>> "preferring" high registers in each iteration.  Generally speaking, low
>> registers are "better", at least not worse, than high register, in terms
>> of code size or other metrics.  Why can't we reverse the order of
>> iteration?
> 
> Generally we avoid changing things if we cannot prove that there is a benefit 
> in doing so.  The compiler supports many architectures and you never really 
> know how a change behaves with each of them in particular.  Why would "low" 
> registers be universally better than "high" registers?
> 

Eric,
I can't prove that statement.

>> I don't see any ports do something specific to this iteration order.  I run
>> regression test on ARM, X86, X86_64, and don't see any regressions if
>> iteration order is reversed.  So I assume that it is not harmful to reverse
>> the iteration order to all ports, and useful to some ports, not only to ARM.
> 
> Testing on a couple of architectures isn't really sufficient to conclude here 
> in my opinion, so I wouldn't change this, unless specifically requested by 
> the presence of the hook.
> 

OK, I agree on this.

>> The goal of this piece of work is to assign low registers to mostly used
>> registers, so that code size can be reduced.  There is similar feature
>> on X86_64, pointed out by HJ.  In order to achieve this, we need 1) a
>> target hook to prefer low register class, 2) iterate registers from high
>> to low.
> 
> OK, but AFAICS the target hook as currently written isn't a "preference" thing 
> but an "exclusion" thing, i.e. all the non-LO_REGS will be excluded and not 
> only disparaged for GENERAL_REGS.  So, once non-LO_REGS are excluded, why 
> changing the iteration order?  Does the iteration order within the LO_REGS 
> class matter as well for the ARM?
> 

No, order doesn't matter within the LO_REGS class.

>> Yes, the situation you described is correct, but this situation is *not*
>> popular, because we've sorted du_head linked list, and assign mostly
>> used registers to low registers.  I can't give you a quantitative
>> measurements, but code size of some programs is reduced.
> 
> OK, on the ARM, but what about the 20 other architectures that may want to 
> paramerize the pass too?
> 

That is a good question, and I have no answer to it. :)

>> I thought of target hook like this before, but this can't solve the
>> issue of iteration order.  If different ports need different iteration
>> order, we have to provide a target hook for this.
> 
> This can solve the iteration order like so: you compute a "representative" 
> class for the renaming chain, and then you iterate over N, call the hook, and 
> iterate over the contents of the class it returns.
> 

I consider this patch again today, and here is a design like this,

  1.  Don't change iteration order,
  2.  Define new hook preferred_rename_register_p,
  DEFHOOK
  (preferred_renaming_register_p,
  "True if hardware register is what target prefers to rename",
  bool,
  (int),
  preferred_renaming_register_false)

  3.  Call this hook like this,
for (new_reg = 0; new_reg < FIRST_PSEUDO_REGISTER; new_reg++)
{
  ...
  if (! tmp)
    {
      if (tick[best_new_reg] > tick[new_reg])
        {
          best_new_reg = new_reg;
          best_nregs = nregs;
	
+	  if (best_new_reg != reg /* Have better choice.  */
+             /* If this register is what target prefers to rename,
break the loop,  and not skip this register.  */
+	      && targetm.preferred_rename_register_p(best_new_reg))
+	        break;
	 }
    }
}

  4.  On ARM, preferred_renaming_register_p returns true if target is
thumb2 and register's class is LO_REGS.  Other ports can define their
own preference here.

This design has some advantages against last one,
  1.  More target-independent,
  2.  We prefer register rather than exclude register.

What do you think of this design?

-- 
Yao (齐尧)

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [Ping] [patch 0/3] New macro PREFERRED_RENAME_CLASS
  2010-11-09 15:33           ` Yao Qi
@ 2010-11-11 17:40             ` Eric Botcazou
  2010-11-11 18:01               ` Eric Botcazou
  2010-12-01 16:48               ` Yao Qi
  0 siblings, 2 replies; 26+ messages in thread
From: Eric Botcazou @ 2010-11-11 17:40 UTC (permalink / raw)
  To: Yao Qi; +Cc: gcc-patches

> I consider this patch again today, and here is a design like this,
>
>   1.  Don't change iteration order,
>   2.  Define new hook preferred_rename_register_p,
>   DEFHOOK
>   (preferred_renaming_register_p,
>   "True if hardware register is what target prefers to rename",
>   bool,
>   (int),
>   preferred_renaming_register_false)
>
>   3.  Call this hook like this,
> for (new_reg = 0; new_reg < FIRST_PSEUDO_REGISTER; new_reg++)
> {
>   ...
>   if (! tmp)
>     {
>       if (tick[best_new_reg] > tick[new_reg])
>         {
>           best_new_reg = new_reg;
>           best_nregs = nregs;
>
> +	  if (best_new_reg != reg /* Have better choice.  */
> +             /* If this register is what target prefers to rename,
> break the loop,  and not skip this register.  */
> +	      && targetm.preferred_rename_register_p(best_new_reg))
> +	        break;
> 	 }
>     }
> }
>
>   4.  On ARM, preferred_renaming_register_p returns true if target is
> thumb2 and register's class is LO_REGS.  Other ports can define their
> own preference here.
>
> This design has some advantages against last one,
>   1.  More target-independent,
>   2.  We prefer register rather than exclude register.
>
> What do you think of this design?

It's a progress, but the first approach of using register classes instead of 
individual registers seemed to be the right granularity here.

For each element in the chain, we have the class.  Can we compute a superset 
of all the classes in the chain?  If we cannot or if this is ALL_REGS, then 
we iterate as before.  If we can, we call the hook on the result and we first 
iterate on the contents of this result (a class); if we find a new reg, we 
stop, otherwise we iterate on the remaining registers.

-- 
Eric Botcazou

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [Ping] [patch 0/3] New macro PREFERRED_RENAME_CLASS
  2010-11-11 17:40             ` Eric Botcazou
@ 2010-11-11 18:01               ` Eric Botcazou
  2010-12-01 16:48               ` Yao Qi
  1 sibling, 0 replies; 26+ messages in thread
From: Eric Botcazou @ 2010-11-11 18:01 UTC (permalink / raw)
  To: Yao Qi; +Cc: gcc-patches

> For each element in the chain, we have the class.  Can we compute a
> superset of all the classes in the chain?  If we cannot or if this is
> ALL_REGS, then we iterate as before.  If we can, we call the hook on the
> result and we first iterate on the contents of this result (a class); if we
> find a new reg, we stop, otherwise we iterate on the remaining registers.

"the remaining registers" are the registers not in the class.

-- 
Eric Botcazou

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [Ping] [patch 0/3] New macro PREFERRED_RENAME_CLASS
  2010-11-04  5:31 [Ping] [patch 0/3] New macro PREFERRED_RENAME_CLASS Yao Qi
  2010-11-05  9:53 ` Eric Botcazou
@ 2010-11-20 12:05 ` Eric Botcazou
  2010-11-23 12:06   ` Yao Qi
  1 sibling, 1 reply; 26+ messages in thread
From: Eric Botcazou @ 2010-11-20 12:05 UTC (permalink / raw)
  To: Yao Qi; +Cc: gcc-patches

> I sent three patches to add some logic in register-rename pass to prefer
> for different register classes.  After some reviewers' review, three
> patches are updated and refactored like this,
>
> Patch 1: does everything except adding target hook
> preferred_rename_class .  It is a separate improvement to register-rename.
> http://gcc.gnu.org/ml/gcc-patches/2010-10/msg02197.html
>
> Patch 2: add target hook with doc, and use this hook in regrename.c
> http://gcc.gnu.org/ml/gcc-patches/2010-11/msg00029.html
>
> Patch 3: it is arm specific patch.
> http://gcc.gnu.org/ml/gcc-patches/2010-10/msg01785.html
>
> Middle-End maintainers,
> could you review patch 1 and patch 2?

Do you plan to follow up on this?  I think it's a valuable improvement, worth 
having in 4.6 and we are almost there implementation-wise.

-- 
Eric Botcazou

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [Ping] [patch 0/3] New macro PREFERRED_RENAME_CLASS
  2010-11-20 12:05 ` Eric Botcazou
@ 2010-11-23 12:06   ` Yao Qi
  0 siblings, 0 replies; 26+ messages in thread
From: Yao Qi @ 2010-11-23 12:06 UTC (permalink / raw)
  To: Eric Botcazou; +Cc: gcc-patches

On 11/20/2010 06:17 PM, Eric Botcazou wrote:
>> I sent three patches to add some logic in register-rename pass to prefer
>> for different register classes.  After some reviewers' review, three
>> patches are updated and refactored like this,
>>
>> Patch 1: does everything except adding target hook
>> preferred_rename_class .  It is a separate improvement to register-rename.
>> http://gcc.gnu.org/ml/gcc-patches/2010-10/msg02197.html
>>
>> Patch 2: add target hook with doc, and use this hook in regrename.c
>> http://gcc.gnu.org/ml/gcc-patches/2010-11/msg00029.html
>>
>> Patch 3: it is arm specific patch.
>> http://gcc.gnu.org/ml/gcc-patches/2010-10/msg01785.html
>>
>> Middle-End maintainers,
>> could you review patch 1 and patch 2?
> 
> Do you plan to follow up on this?  I think it's a valuable improvement, worth 
> having in 4.6 and we are almost there implementation-wise.
> 

Yes, I am still working on this, nearly done.  Still need some time to
test it on ARM.  Sorry for the delayed reply.

-- 
Yao (齐尧)

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [Ping] [patch 0/3] New macro PREFERRED_RENAME_CLASS
  2010-11-11 17:40             ` Eric Botcazou
  2010-11-11 18:01               ` Eric Botcazou
@ 2010-12-01 16:48               ` Yao Qi
  2010-12-03 10:58                 ` Eric Botcazou
  1 sibling, 1 reply; 26+ messages in thread
From: Yao Qi @ 2010-12-01 16:48 UTC (permalink / raw)
  To: Eric Botcazou; +Cc: gcc-patches

[-- Attachment #1: Type: text/plain, Size: 1842 bytes --]

On 11/12/2010 01:34 AM, Eric Botcazou wrote:
> 
> It's a progress, but the first approach of using register classes instead of 
> individual registers seemed to be the right granularity here.
> 
> For each element in the chain, we have the class.  Can we compute a superset 
> of all the classes in the chain?  If we cannot or if this is ALL_REGS, then 
> we iterate as before.  If we can, we call the hook on the result and we first 
> iterate on the contents of this result (a class); if we find a new reg, we 
> stop, otherwise we iterate on the remaining registers.
> 

In this patch, most of part is same as the previous one, except three
changes,

1. compute superunion set of all the classes in the chain,

  reg_class_superunion_in_chain
    =
reg_class_superunion[(int)reg_class_superunion_in_chain][(int)tmp->cl];

2. compute the preferred class of reg_class_superunion_in_chain by hook,

   preferred_class
     = targetm.preferred_rename_class(reg_class_superunion_in_chain);

3. Iterate registers first in preferred_class, and stop once better
register is found.  Then, iterate the rest of registers as usual.

Tested on mainline r167293 x86_64-linux with option {, -O2
-frename-registers} respectively.  No regression.

Macro PREFERRED_RENAME_CLASS is defined in backend.  For example, on
ARM, it can be,

#define PREFERRED_RENAME_CLASS(CLASS)          \
  (TARGET_THUMB2 ? ((CLASS) == GENERAL_REGS    \
   ? LO_REGS : (NO_REGS))                                \
  : (NO_REGS))

Given this definition of macro, code size of bash-3.2 is reduced by 0.2%
roughly, with option "-mthumb -march=armv7 -O2 -frename-registers".  The
result and code is posted here to prove this middle-end patch is useful.
 I'll submit another patch for ARM part changes, once this middle-end
part is approved.

OK for mainline?

-- 
Yao (齐尧)

[-- Attachment #2: regrename_improve_all_1117.patch --]
[-- Type: text/x-patch, Size: 16409 bytes --]

gcc/
        * regrename.c (struct du_head): Add new element length.
        (sort_du_head, get_element, merge, merge_sort_comparison):
        New functions of merge sort implementation to du_head list.
        (regrename_optimize): Sort du_head linked list by length.
	Move some code to ...
	(check_new_reg_p): here.  New function.
        (create_new_chain):  Initialize length.
        (scan_rtx_reg): Increase length for non-debug insns.
        * target.def: New hook preferred_rename_class.
        * targhook.c (default_preferred_rename_class): New.
        * targhook.h: Declare it.
        * doc/tm.texi.in: New hook TARGET_PREFERRED_RENAME_CLASS.
	* doc/tm.texi: Regenerate.

diff --git a/gcc/doc/tm.texi b/gcc/doc/tm.texi
index 5446501..7105c72 100644
--- a/gcc/doc/tm.texi
+++ b/gcc/doc/tm.texi
@@ -2506,6 +2506,10 @@ looking for one that is valid, and will reload one or both registers
 only if neither labeling works.
 @end defmac

+@deftypefn {Target Hook} reg_class_t TARGET_PREFERRED_RENAME_CLASS (reg_class_t @var{rclass})
+A target hook that places additional preference on the register class to use when it is necessary to rename a register in class @var{class} to another class, or perhaps @var{NO_REGS}, if no prefered register class is found or macro @code{PREFERRED_RENAME_CLASS} is not define. Sometimes returning a more restrictive class makes better code.  For example, on ARM, thumb-2 instructions using @code{LO_REGS} may be smaller than instructions using @code{GENERIC_REGS}.  By returning @code{LO_REGS} from @code{PREFERRED_RENAME_CLASS}, code size can be reduced.
+@end deftypefn
+
 @deftypefn {Target Hook} reg_class_t TARGET_PREFERRED_RELOAD_CLASS (rtx @var{x}, reg_class_t @var{rclass})
 A target hook that places additional restrictions on the register class
 to use when it is necessary to copy value @var{x} into a register in class
diff --git a/gcc/doc/tm.texi.in b/gcc/doc/tm.texi.in
index 4b21c92..e9bf4a8 100644
--- a/gcc/doc/tm.texi.in
+++ b/gcc/doc/tm.texi.in
@@ -2496,6 +2496,8 @@ looking for one that is valid, and will reload one or both registers
 only if neither labeling works.
 @end defmac
 
+@hook TARGET_PREFERRED_RENAME_CLASS
+
 @hook TARGET_PREFERRED_RELOAD_CLASS
 A target hook that places additional restrictions on the register class
 to use when it is necessary to copy value @var{x} into a register in class
diff --git a/gcc/regrename.c b/gcc/regrename.c
index 2535ab7..f085e78 100644
--- a/gcc/regrename.c
+++ b/gcc/regrename.c
@@ -38,6 +38,7 @@
 #include "timevar.h"
 #include "tree-pass.h"
 #include "df.h"
+#include "target.h"
 
 #if HOST_BITS_PER_WIDE_INT <= MAX_RECOG_OPERANDS
 #error "Use a different bitmap implementation for untracked_operands."
@@ -51,6 +52,11 @@ struct du_head
   struct du_head *next_chain;
   /* The first and last elements of this chain.  */
   struct du_chain *first, *last;
+  /* The number of elements of this chain, excluding those corresponding
+     to references of the register in debug insns.  The du_head linked
+     list can be sorted by this, and register-rename can prefer
+     register classes according to this order.  */
+  int length;
   /* Describes the register being tracked.  */
   unsigned regno, nregs;
 
@@ -154,6 +160,199 @@ merge_overlapping_regs (HARD_REG_SET *pset, struct du_head *head)
     }
 }
 
+/* Return the Nth element in LIST.  If LIST contains less than N
+   elements, return the last one.  */
+static struct du_head *
+get_element (struct du_head *list, int n)
+{
+  while (n-- && list->next_chain != NULL)
+    list = list->next_chain;
+
+  return list;
+}
+
+/* Comparison function of merge sort.  Return true if A is less than
+   B, otherwise return false.  */
+static inline int
+merge_sort_comparison(const struct du_head *a,
+		    const struct du_head *b)
+{
+  return a->length < b->length;
+}
+
+/* Merge the first 2 sub-lists of LENGTH nodes contained in the
+   linked list pointed to by START_NODE.  Update START_NODE to point
+   to the merged nodes, and return a pointer to the last merged
+   node.  Return NULL if START_NODE doesn't contain enough
+   elements, or this pass of merge is done.  */
+
+static struct du_head *
+merge(struct du_head **start_node, int length)
+{
+  int i, left_count, right_count;
+  struct du_head *left, *right;
+  /* Current node of sort result.  */
+  struct du_head *current_sorted_node;
+  /* Tail node of sort, used to connect with next piece of list.  */
+  struct du_head *current_tail_node;
+
+  if (*start_node == NULL)
+    return NULL;
+
+  left = right = *start_node;
+  right_count = left_count = 0;
+
+  /* Step RIGHT along the list by LENGTH places.  */
+  for (i = 0; i < length; i++)
+    {
+      right = right->next_chain;
+      if (right == NULL)
+	{
+	  return NULL;
+	}
+    }
+
+  /* Initialize current_sorted_node.  */
+  if (merge_sort_comparison (left, right))
+    {
+      ++right_count;
+      current_sorted_node = right;
+      *start_node = right;
+      right = right->next_chain;
+    }
+  else
+    {
+      ++left_count;
+      current_sorted_node = left;
+      left = left->next_chain;
+    }
+
+  while (1)
+    {
+      /* Choose LEFT or RIGHT to take the next element from.  If
+	 either is empty, choose from the other one.  */
+      if (left_count == length || left == NULL)
+	{
+	  current_sorted_node->next_chain = right;
+	  current_tail_node = get_element (current_sorted_node,
+					   length - right_count);
+
+	  break;
+	}
+      else if (right_count == length || right == NULL)
+	{
+	  /* Save the head node of next piece of linked list.  */
+	  struct du_head *tmp = current_sorted_node->next_chain;
+
+	  current_sorted_node->next_chain = left;
+	  current_tail_node
+	    = get_element (current_sorted_node,
+			   length - left_count);
+	  /* Connect sorted list to next piece of list.  */
+	  current_tail_node->next_chain = tmp;
+	  break;
+	}
+      else
+	{
+	  /* Normal merge operations.  If both LEFT and RIGHT are
+	     non-empty, compare the first element of each and choose
+	     the lower one.  */
+	  if (merge_sort_comparison (left, right))
+	    {
+	      right_count++;
+	      current_sorted_node->next_chain = right;
+	      right = right->next_chain;
+	    }
+	  else
+	    {
+	      left_count++;
+	      current_sorted_node->next_chain = left;
+	      left = left->next_chain;
+	    }
+	  current_sorted_node = current_sorted_node->next_chain;
+	}
+    }
+  /* Return NULL if this pass of merge is done.  */
+  return (current_tail_node->next_chain ? current_tail_node : NULL);
+}
+
+/* Sort the linked list pointed to by HEAD.  The algorithm is a
+   non-recursive merge sort to linked list.  */
+
+static void
+sort_du_head (struct du_head **head)
+{
+  int current_length = 1;
+  struct du_head *last_tail;
+
+  /* In each pass, lists of size current_length is merged to
+     lists of size 2xcurrent_length (Initially current_length
+     is 1).  */
+  while (1)
+    {
+      last_tail = merge(head, current_length);
+      if (last_tail != NULL)
+	{
+	  do
+	    last_tail = merge (&last_tail->next_chain, current_length);
+	  while (last_tail != NULL);
+
+	  current_length *= 2;
+	}
+      else
+	break;
+    }
+}
+
+/* Check if NEW_REG can be the candidate register to rename for
+   REG in THIS_HEAD chain.  THIS_UNAVAILABLE is a set of unavailable hard
+   registers.  */
+
+static bool
+check_new_reg_p (int reg, int new_reg,  struct du_head *this_head,
+		 HARD_REG_SET this_unavailable)
+{
+  enum machine_mode mode = GET_MODE (*this_head->first->loc);
+  int nregs = hard_regno_nregs[new_reg][mode];
+  int i;
+  struct du_chain *tmp;
+
+  for (i = nregs - 1; i >= 0; --i)
+    if (TEST_HARD_REG_BIT (this_unavailable, new_reg + i)
+	|| fixed_regs[new_reg + i]
+	|| global_regs[new_reg + i]
+	/* Can't use regs which aren't saved by the prologue.  */
+	|| (! df_regs_ever_live_p (new_reg + i)
+	    && ! call_used_regs[new_reg + i])
+#ifdef LEAF_REGISTERS
+	/* We can't use a non-leaf register if we're in a
+	   leaf function.  */
+	|| (current_function_is_leaf
+	    && !LEAF_REGISTERS[new_reg + i])
+#endif
+#ifdef HARD_REGNO_RENAME_OK
+	|| ! HARD_REGNO_RENAME_OK (reg + i, new_reg + i)
+#endif
+	)
+      break;
+  if (i >= 0)
+    return false;
+
+  /* See whether it accepts all modes that occur in
+     definition and uses.  */
+  for (tmp = this_head->first; tmp; tmp = tmp->next_use)
+    if ((! HARD_REGNO_MODE_OK (new_reg, GET_MODE (*tmp->loc))
+	 && ! DEBUG_INSN_P (tmp->insn))
+	|| (this_head->need_caller_save_reg
+	    && ! (HARD_REGNO_CALL_PART_CLOBBERED
+		  (reg, GET_MODE (*tmp->loc)))
+	    && (HARD_REGNO_CALL_PART_CLOBBERED
+		(new_reg, GET_MODE (*tmp->loc)))))
+      break;
+
+  return true;
+}
+
 /* Perform register renaming on the current function.  */
 
 static unsigned int
@@ -195,6 +394,8 @@ regrename_optimize (void)
       if (dump_file)
 	dump_def_use_chain (all_chains);
 
+      sort_du_head (&all_chains);
+
       CLEAR_HARD_REG_SET (unavailable);
       /* Don't clobber traceback for noreturn functions.  */
       if (frame_pointer_needed)
@@ -214,6 +415,8 @@ regrename_optimize (void)
 	  HARD_REG_SET this_unavailable;
 	  int reg = this_head->regno;
 	  int i;
+	  enum reg_class reg_class_superunion_in_chain = NO_REGS;
+	  enum reg_class preferred_class;
 
 	  all_chains = this_head->next_chain;
 
@@ -243,16 +446,23 @@ regrename_optimize (void)
 
 	  COPY_HARD_REG_SET (this_unavailable, unavailable);
 
-	  /* Count number of uses, and narrow the set of registers we can
-	     use for renaming.  */
+	  /* Iterate elements in chain in order to:
+	     1. Count number of uses, and narrow the set of registers we can
+	     use for renaming.
+	     2. Compute the superunion of register classes in this chain.  */
 	  n_uses = 0;
+	  reg_class_superunion_in_chain = NO_REGS;
 	  for (tmp = this_head->first; tmp; tmp = tmp->next_use)
 	    {
 	      if (DEBUG_INSN_P (tmp->insn))
 		continue;
 	      n_uses++;
+
 	      IOR_COMPL_HARD_REG_SET (this_unavailable,
 				      reg_class_contents[tmp->cl]);
+
+	      reg_class_superunion_in_chain
+		= reg_class_superunion[(int)reg_class_superunion_in_chain][(int)tmp->cl];
 	    }
 
 	  if (n_uses < 2)
@@ -262,56 +472,55 @@ regrename_optimize (void)
 	    IOR_HARD_REG_SET (this_unavailable, call_used_reg_set);
 
 	  merge_overlapping_regs (&this_unavailable, this_head);
-
+	  /* Compute preferred rename class of super union of all the classes
+	     on the chain.  */
+	  preferred_class
+	    = targetm.preferred_rename_class(reg_class_superunion_in_chain);
+
+	  /* The register iteration order here is "preferred-register-first".
+	     Firstly(i == 0), we iterate registers belong to class
+	     PREFERRED_CLASS, if we find a new register, we stop immeidately.
+	     Otherwise, we iterate once more (i == 1) registers, which
+	     doesn't belong class PREFERRED_CLASS.
+	     If preferred class is not found by target hook, PREFERRED_CLASS
+	     is NO_REGS, and registers are iterated in an ascending order
+	     without any preference.  */
+	  for (i = 0; i <2; i++)
+	    {
+	      bool prefer[] = {true, false};
+	      bool found = false;
 	  /* Now potential_regs is a reasonable approximation, let's
 	     have a closer look at each register still in there.  */
 	  for (new_reg = 0; new_reg < FIRST_PSEUDO_REGISTER; new_reg++)
 	    {
-	      enum machine_mode mode = GET_MODE (*this_head->first->loc);
-	      int nregs = hard_regno_nregs[new_reg][mode];
-
-	      for (i = nregs - 1; i >= 0; --i)
-	        if (TEST_HARD_REG_BIT (this_unavailable, new_reg + i)
-		    || fixed_regs[new_reg + i]
-		    || global_regs[new_reg + i]
-		    /* Can't use regs which aren't saved by the prologue.  */
-		    || (! df_regs_ever_live_p (new_reg + i)
-			&& ! call_used_regs[new_reg + i])
-#ifdef LEAF_REGISTERS
-		    /* We can't use a non-leaf register if we're in a
-		       leaf function.  */
-		    || (current_function_is_leaf
-			&& !LEAF_REGISTERS[new_reg + i])
-#endif
-#ifdef HARD_REGNO_RENAME_OK
-		    || ! HARD_REGNO_RENAME_OK (reg + i, new_reg + i)
-#endif
-		    )
-		  break;
-	      if (i >= 0)
+		  /* Iterate registers first in prefered class.  */
+		  if (prefer[i]
+		      != TEST_HARD_REG_BIT (reg_class_contents[preferred_class],
+					    new_reg))
 		continue;
 
-	      /* See whether it accepts all modes that occur in
-		 definition and uses.  */
-	      for (tmp = this_head->first; tmp; tmp = tmp->next_use)
-		if ((! HARD_REGNO_MODE_OK (new_reg, GET_MODE (*tmp->loc))
-		     && ! DEBUG_INSN_P (tmp->insn))
-		    || (this_head->need_caller_save_reg
-			&& ! (HARD_REGNO_CALL_PART_CLOBBERED
-			      (reg, GET_MODE (*tmp->loc)))
-			&& (HARD_REGNO_CALL_PART_CLOBBERED
-			    (new_reg, GET_MODE (*tmp->loc)))))
-		  break;
-	      if (! tmp)
+		  if (check_new_reg_p (reg, new_reg, this_head,
+				       this_unavailable))
 		{
 		  if (tick[best_new_reg] > tick[new_reg])
 		    {
+			  enum machine_mode mode
+			    = GET_MODE (*this_head->first->loc);
 		      best_new_reg = new_reg;
-		      best_nregs = nregs;
+			  best_nregs = hard_regno_nregs[new_reg][mode];
+			  /* If we find a new reg in our preferred class,
+			     stop immediately.  */
+			  if (best_new_reg != reg && prefer[i])
+			    {
+			      found = true;
+			      break;
 		    }
 		}
 	    }
-
+		}
+	      if (found)
+		break;
+	    }
 	  if (dump_file)
 	    {
 	      fprintf (dump_file, "Register %s in insn %d",
@@ -527,6 +736,7 @@ create_new_chain (unsigned this_regno, unsigned this_nregs, rtx *loc,
   head->need_caller_save_reg = 0;
   head->cannot_rename = 0;
   head->terminated = 0;
+  head->length = 0;
 
   VEC_safe_push (du_head_p, heap, id_to_chain, head);
   head->id = current_id++;
@@ -572,6 +782,8 @@ create_new_chain (unsigned this_regno, unsigned this_nregs, rtx *loc,
   this_du->loc = loc;
   this_du->insn = insn;
   this_du->cl = cl;
+
+  head->length = 1;
 }
 
 static void
@@ -661,6 +873,8 @@ scan_rtx_reg (rtx insn, rtx *loc, enum reg_class cl, enum scan_actions action,
 		head->last->next_use = this_du;
 	      head->last = this_du;
 
+	      if (!DEBUG_INSN_P (insn))
+		head->length++;
 	    }
 	  /* Avoid adding the same location in a DEBUG_INSN multiple times,
 	     which could happen with non-exact overlap.  */
diff --git a/gcc/target.def b/gcc/target.def
index 66006ae..0f9a9f9 100644
--- a/gcc/target.def
+++ b/gcc/target.def
@@ -2204,6 +2204,21 @@ DEFHOOK
  bool, (reg_class_t rclass),
  default_class_likely_spilled_p)
 
+DEFHOOK
+(preferred_rename_class,
+ "A target hook that places additional preference on the register\
+ class to use when it is necessary to rename a register in class\
+ @var{class} to another class, or perhaps @var{NO_REGS}, if no\
+ prefered register class is found or macro @code{PREFERRED_RENAME_CLASS}\
+ is not define.\
+ Sometimes returning a more restrictive class makes better code.  For\
+ example, on ARM, thumb-2 instructions using @code{LO_REGS} may be\
+ smaller than instructions using @code{GENERIC_REGS}.  By returning\
+ @code{LO_REGS} from @code{PREFERRED_RENAME_CLASS}, code size can\
+ be reduced.",
+ reg_class_t, (reg_class_t rclass),
+ default_preferred_rename_class)
+
 /* This target hook allows the backend to perform additional
    processing while initializing for variable expansion.  */
 DEFHOOK
diff --git a/gcc/targhooks.c b/gcc/targhooks.c
index 3647436..19fc29a 100644
--- a/gcc/targhooks.c
+++ b/gcc/targhooks.c
@@ -1270,6 +1270,16 @@ default_preferred_output_reload_class (rtx x ATTRIBUTE_UNUSED,
 #endif
 }
 
+reg_class_t
+default_preferred_rename_class (reg_class_t rclass)
+{
+#ifdef PREFERRED_RENAME_CLASS
+  return PREFERRED_RENAME_CLASS ((enum reg_class)rclass);
+#else
+  return NO_REGS;
+#endif
+}
+
 /* The default implementation of TARGET_CLASS_LIKELY_SPILLED_P.  */
 
 bool
diff --git a/gcc/targhooks.h b/gcc/targhooks.h
index eeefe05..6fe8350 100644
--- a/gcc/targhooks.h
+++ b/gcc/targhooks.h
@@ -158,6 +158,7 @@ extern int default_register_move_cost (enum machine_mode, reg_class_t,
 extern bool default_profile_before_prologue (void);
 extern reg_class_t default_preferred_reload_class (rtx, reg_class_t);
 extern reg_class_t default_preferred_output_reload_class (rtx, reg_class_t);
+extern reg_class_t default_preferred_rename_class (reg_class_t rclass);
 extern bool default_class_likely_spilled_p (reg_class_t);
 
 extern enum unwind_info_type default_debug_unwind_info (void);

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [Ping] [patch 0/3] New macro PREFERRED_RENAME_CLASS
  2010-12-01 16:48               ` Yao Qi
@ 2010-12-03 10:58                 ` Eric Botcazou
  2010-12-03 14:08                   ` Yao Qi
  2010-12-04 16:06                   ` [Ping] [patch 0/3] New macro PREFERRED_RENAME_CLASS Daniel Jacobowitz
  0 siblings, 2 replies; 26+ messages in thread
From: Eric Botcazou @ 2010-12-03 10:58 UTC (permalink / raw)
  To: Yao Qi; +Cc: gcc-patches

> In this patch, most of part is same as the previous one, except three
> changes,
>
> 1. compute superunion set of all the classes in the chain,
>
>   reg_class_superunion_in_chain
>     =
> reg_class_superunion[(int)reg_class_superunion_in_chain][(int)tmp->cl];
>
> 2. compute the preferred class of reg_class_superunion_in_chain by hook,
>
>    preferred_class
>      = targetm.preferred_rename_class(reg_class_superunion_in_chain);
>
> 3. Iterate registers first in preferred_class, and stop once better
> register is found.  Then, iterate the rest of registers as usual.

Almost OK, but:

+	  enum reg_class reg_class_superunion_in_chain = NO_REGS;

The name is too long, leading to

+	      reg_class_superunion_in_chain
+		= reg_class_superunion[(int)reg_class_superunion_in_chain][(int)tmp->cl];

Just call it "superunion_class".


+	  /* The register iteration order here is "preferred-register-first".
+	     Firstly(i == 0), we iterate registers belong to class
+	     PREFERRED_CLASS, if we find a new register, we stop immeidately.

"... we iterate over registers that belong to PREFERRED_CLASS; if we find a 
new register, we stop immediately."

+	     Otherwise, we iterate once more (i == 1) registers, which
+	     doesn't belong class PREFERRED_CLASS.

"...over registers that don't belong to PREFERRED_CLASS".

+	     If preferred class is not found by target hook, PREFERRED_CLASS
+	     is NO_REGS, and registers are iterated in an ascending order
+	     without any preference.  */

"If PREFERRED_CLASS is NO_REGS, we iterate over all registers in ascending 
order without any preference".

But the loop is still run twice, isn't it?  Can we skip the first run if 
PREFERRED_CLASS is NO_REGS, for example by starting at 1 instead of 0?

The usual convention for this kind of scheme (mutiple runs) in the RTL 
optimizers is to use PASS as iteration variable:

  for (pass = 0; pass < 2; pass++)

and just test (pass == 0) or (pass == 1), i.e. don't use PREFER at all:

  /* In the first pass, iterate over registers first in preferred class.  */
  if (pass == 0
      && !TEST_HARD_REG_BIT (reg_class_contents[preferred_class], new_reg))
    continue;

[...]

  /* If we find a new reg in our preferred class, stop immediately.  */
  if (pass == 0 && best_new_reg != reg)
    {
      found = true;
      break;
    }


As for the hook itself: drop the PREFERRED_RENAME_CLASS macro entirely, i.e. 
the default hook is just:

reg_class_t
default_preferred_rename_class (reg_class_t rclass)
{
  return NO_REGS;
}

and back-ends must implement the function (or use the default).


Finally, I have a question for native speakers: does preferred_rename_class 
sound good or would preferred_renaming_class be better?

-- 
Eric Botcazou

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [Ping] [patch 0/3] New macro PREFERRED_RENAME_CLASS
  2010-12-03 10:58                 ` Eric Botcazou
@ 2010-12-03 14:08                   ` Yao Qi
  2010-12-06 18:27                     ` Eric Botcazou
  2010-12-13 19:43                     ` ARM bootstrap failure (Re: [Ping] [patch 0/3] New macro PREFERRED_RENAME_CLASS) Ulrich Weigand
  2010-12-04 16:06                   ` [Ping] [patch 0/3] New macro PREFERRED_RENAME_CLASS Daniel Jacobowitz
  1 sibling, 2 replies; 26+ messages in thread
From: Yao Qi @ 2010-12-03 14:08 UTC (permalink / raw)
  To: Eric Botcazou; +Cc: gcc-patches

[-- Attachment #1: Type: text/plain, Size: 2493 bytes --]

On 12/03/2010 06:57 PM, Eric Botcazou wrote:
> 
> Almost OK, but:
> 
> +	  enum reg_class reg_class_superunion_in_chain = NO_REGS;
> 
> The name is too long, leading to
> 
> +	      reg_class_superunion_in_chain
> +		= reg_class_superunion[(int)reg_class_superunion_in_chain][(int)tmp->cl];
> 
> Just call it "superunion_class".
> 

OK, fixed.

> 
> +	  /* The register iteration order here is "preferred-register-first".
> +	     Firstly(i == 0), we iterate registers belong to class
> +	     PREFERRED_CLASS, if we find a new register, we stop immeidately.
> 
> "... we iterate over registers that belong to PREFERRED_CLASS; if we find a 
> new register, we stop immediately."
> 

Fixed.

> +	     Otherwise, we iterate once more (i == 1) registers, which
> +	     doesn't belong class PREFERRED_CLASS.
> 
> "...over registers that don't belong to PREFERRED_CLASS".
> 
> +	     If preferred class is not found by target hook, PREFERRED_CLASS
> +	     is NO_REGS, and registers are iterated in an ascending order
> +	     without any preference.  */
> 
> "If PREFERRED_CLASS is NO_REGS, we iterate over all registers in ascending 
> order without any preference".
> 

Fixed.

> But the loop is still run twice, isn't it?  Can we skip the first run if 
> PREFERRED_CLASS is NO_REGS, for example by starting at 1 instead of 0?
> 

Yes, loop still runs twice. Fixed like this,

for (pass = (preferred_class == NO_REGS ? 1 : 0); pass < 2; pass++)

> The usual convention for this kind of scheme (mutiple runs) in the RTL 
> optimizers is to use PASS as iteration variable:
> 
>   for (pass = 0; pass < 2; pass++)
> 
> and just test (pass == 0) or (pass == 1), i.e. don't use PREFER at all:
> 
>   /* In the first pass, iterate over registers first in preferred class.  */
>   if (pass == 0
>       && !TEST_HARD_REG_BIT (reg_class_contents[preferred_class], new_reg))
>     continue;
> 
> [...]
> 
>   /* If we find a new reg in our preferred class, stop immediately.  */
>   if (pass == 0 && best_new_reg != reg)
>     {
>       found = true;
>       break;
>     }
> 

PASS is a good variable name here! :)  Fixed.

> As for the hook itself: drop the PREFERRED_RENAME_CLASS macro entirely, i.e. 
> the default hook is just:
> 
> reg_class_t
> default_preferred_rename_class (reg_class_t rclass)
> {
>   return NO_REGS;
> }
> 
> and back-ends must implement the function (or use the default).
> 

Seems I didn't understand targethook fully before.  You are right.  Done.

-- 
Yao (齐尧)

[-- Attachment #2: regrename_improve_all_1203.patch --]
[-- Type: text/x-patch, Size: 15400 bytes --]

gcc/
        * regrename.c (struct du_head): Add new element length.
        (sort_du_head, get_element, merge, merge_sort_comparison):
        New functions of merge sort implementation to du_head list.
        (regrename_optimize): Sort du_head linked list by length.
	Iterate registers in a preferred-register-first order.
        Move some code to ... 
        (check_new_reg_p): here.  New function.
        (create_new_chain):  Initialize length.
        (scan_rtx_reg): Increase length for non-debug insns.
        * target.def: New hook preferred_rename_class.
        * targhook.c (default_preferred_rename_class): New.
        * targhook.h: Declare it. 
        * doc/tm.texi.in: New hook TARGET_PREFERRED_RENAME_CLASS.
        * doc/tm.texi: Regenerate.


diff --git a/gcc/doc/tm.texi.in b/gcc/doc/tm.texi.in
index 4b21c92..e9bf4a8 100644
--- a/gcc/doc/tm.texi.in
+++ b/gcc/doc/tm.texi.in
@@ -2496,6 +2496,8 @@ looking for one that is valid, and will reload one or both registers
 only if neither labeling works.
 @end defmac
 
+@hook TARGET_PREFERRED_RENAME_CLASS
+
 @hook TARGET_PREFERRED_RELOAD_CLASS
 A target hook that places additional restrictions on the register class
 to use when it is necessary to copy value @var{x} into a register in class
diff --git a/gcc/regrename.c b/gcc/regrename.c
index 2535ab7..adbcde5 100644
--- a/gcc/regrename.c
+++ b/gcc/regrename.c
@@ -38,6 +38,7 @@
 #include "timevar.h"
 #include "tree-pass.h"
 #include "df.h"
+#include "target.h"
 
 #if HOST_BITS_PER_WIDE_INT <= MAX_RECOG_OPERANDS
 #error "Use a different bitmap implementation for untracked_operands."
@@ -51,6 +52,11 @@ struct du_head
   struct du_head *next_chain;
   /* The first and last elements of this chain.  */
   struct du_chain *first, *last;
+  /* The number of elements of this chain, excluding those corresponding
+     to references of the register in debug insns.  The du_head linked
+     list can be sorted by this, and register-rename can prefer
+     register classes according to this order.  */
+  int length;
   /* Describes the register being tracked.  */
   unsigned regno, nregs;
 
@@ -154,6 +160,199 @@ merge_overlapping_regs (HARD_REG_SET *pset, struct du_head *head)
     }
 }
 
+/* Return the Nth element in LIST.  If LIST contains less than N
+   elements, return the last one.  */
+static struct du_head *
+get_element (struct du_head *list, int n)
+{
+  while (n-- && list->next_chain != NULL)
+    list = list->next_chain;
+
+  return list;
+}
+
+/* Comparison function of merge sort.  Return true if A is less than
+   B, otherwise return false.  */
+static inline int
+merge_sort_comparison(const struct du_head *a,
+		    const struct du_head *b)
+{
+  return a->length < b->length;
+}
+
+/* Merge the first 2 sub-lists of LENGTH nodes contained in the
+   linked list pointed to by START_NODE.  Update START_NODE to point
+   to the merged nodes, and return a pointer to the last merged
+   node.  Return NULL if START_NODE doesn't contain enough
+   elements, or this pass of merge is done.  */
+
+static struct du_head *
+merge(struct du_head **start_node, int length)
+{
+  int i, left_count, right_count;
+  struct du_head *left, *right;
+  /* Current node of sort result.  */
+  struct du_head *current_sorted_node;
+  /* Tail node of sort, used to connect with next piece of list.  */
+  struct du_head *current_tail_node;
+
+  if (*start_node == NULL)
+    return NULL;
+
+  left = right = *start_node;
+  right_count = left_count = 0;
+
+  /* Step RIGHT along the list by LENGTH places.  */
+  for (i = 0; i < length; i++)
+    {
+      right = right->next_chain;
+      if (right == NULL)
+	{
+	  return NULL;
+	}
+    }
+
+  /* Initialize current_sorted_node.  */
+  if (merge_sort_comparison (left, right))
+    {
+      ++right_count;
+      current_sorted_node = right;
+      *start_node = right;
+      right = right->next_chain;
+    }
+  else
+    {
+      ++left_count;
+      current_sorted_node = left;
+      left = left->next_chain;
+    }
+
+  while (1)
+    {
+      /* Choose LEFT or RIGHT to take the next element from.  If
+	 either is empty, choose from the other one.  */
+      if (left_count == length || left == NULL)
+	{
+	  current_sorted_node->next_chain = right;
+	  current_tail_node = get_element (current_sorted_node,
+					   length - right_count);
+
+	  break;
+	}
+      else if (right_count == length || right == NULL)
+	{
+	  /* Save the head node of next piece of linked list.  */
+	  struct du_head *tmp = current_sorted_node->next_chain;
+
+	  current_sorted_node->next_chain = left;
+	  current_tail_node
+	    = get_element (current_sorted_node,
+			   length - left_count);
+	  /* Connect sorted list to next piece of list.  */
+	  current_tail_node->next_chain = tmp;
+	  break;
+	}
+      else
+	{
+	  /* Normal merge operations.  If both LEFT and RIGHT are
+	     non-empty, compare the first element of each and choose
+	     the lower one.  */
+	  if (merge_sort_comparison (left, right))
+	    {
+	      right_count++;
+	      current_sorted_node->next_chain = right;
+	      right = right->next_chain;
+	    }
+	  else
+	    {
+	      left_count++;
+	      current_sorted_node->next_chain = left;
+	      left = left->next_chain;
+	    }
+	  current_sorted_node = current_sorted_node->next_chain;
+	}
+    }
+  /* Return NULL if this pass of merge is done.  */
+  return (current_tail_node->next_chain ? current_tail_node : NULL);
+}
+
+/* Sort the linked list pointed to by HEAD.  The algorithm is a
+   non-recursive merge sort to linked list.  */
+
+static void
+sort_du_head (struct du_head **head)
+{
+  int current_length = 1;
+  struct du_head *last_tail;
+
+  /* In each pass, lists of size current_length is merged to
+     lists of size 2xcurrent_length (Initially current_length
+     is 1).  */
+  while (1)
+    {
+      last_tail = merge(head, current_length);
+      if (last_tail != NULL)
+	{
+	  do
+	    last_tail = merge (&last_tail->next_chain, current_length);
+	  while (last_tail != NULL);
+
+	  current_length *= 2;
+	}
+      else
+	break;
+    }
+}
+
+/* Check if NEW_REG can be the candidate register to rename for
+   REG in THIS_HEAD chain.  THIS_UNAVAILABLE is a set of unavailable hard
+   registers.  */
+
+static bool
+check_new_reg_p (int reg, int new_reg,  struct du_head *this_head,
+		 HARD_REG_SET this_unavailable)
+{
+  enum machine_mode mode = GET_MODE (*this_head->first->loc);
+  int nregs = hard_regno_nregs[new_reg][mode];
+  int i;
+  struct du_chain *tmp;
+
+  for (i = nregs - 1; i >= 0; --i)
+    if (TEST_HARD_REG_BIT (this_unavailable, new_reg + i)
+	|| fixed_regs[new_reg + i]
+	|| global_regs[new_reg + i]
+	/* Can't use regs which aren't saved by the prologue.  */
+	|| (! df_regs_ever_live_p (new_reg + i)
+	    && ! call_used_regs[new_reg + i])
+#ifdef LEAF_REGISTERS
+	/* We can't use a non-leaf register if we're in a
+	   leaf function.  */
+	|| (current_function_is_leaf
+	    && !LEAF_REGISTERS[new_reg + i])
+#endif
+#ifdef HARD_REGNO_RENAME_OK
+	|| ! HARD_REGNO_RENAME_OK (reg + i, new_reg + i)
+#endif
+	)
+      break;
+  if (i >= 0)
+    return false;
+
+  /* See whether it accepts all modes that occur in
+     definition and uses.  */
+  for (tmp = this_head->first; tmp; tmp = tmp->next_use)
+    if ((! HARD_REGNO_MODE_OK (new_reg, GET_MODE (*tmp->loc))
+	 && ! DEBUG_INSN_P (tmp->insn))
+	|| (this_head->need_caller_save_reg
+	    && ! (HARD_REGNO_CALL_PART_CLOBBERED
+		  (reg, GET_MODE (*tmp->loc)))
+	    && (HARD_REGNO_CALL_PART_CLOBBERED
+		(new_reg, GET_MODE (*tmp->loc)))))
+      break;
+
+  return true;
+}
+
 /* Perform register renaming on the current function.  */
 
 static unsigned int
@@ -195,6 +394,8 @@ regrename_optimize (void)
       if (dump_file)
 	dump_def_use_chain (all_chains);
 
+      sort_du_head (&all_chains);
+
       CLEAR_HARD_REG_SET (unavailable);
       /* Don't clobber traceback for noreturn functions.  */
       if (frame_pointer_needed)
@@ -213,7 +414,9 @@ regrename_optimize (void)
 	  struct du_chain *tmp;
 	  HARD_REG_SET this_unavailable;
 	  int reg = this_head->regno;
-	  int i;
+	  int pass;
+	  enum reg_class superunion_class = NO_REGS;
+	  enum reg_class preferred_class;
 
 	  all_chains = this_head->next_chain;
 
@@ -243,16 +446,23 @@ regrename_optimize (void)
 
 	  COPY_HARD_REG_SET (this_unavailable, unavailable);
 
-	  /* Count number of uses, and narrow the set of registers we can
-	     use for renaming.  */
+	  /* Iterate elements in chain in order to:
+	     1. Count number of uses, and narrow the set of registers we can
+	     use for renaming.
+	     2. Compute the superunion of register classes in this chain.  */
 	  n_uses = 0;
+	  superunion_class = NO_REGS;
 	  for (tmp = this_head->first; tmp; tmp = tmp->next_use)
 	    {
 	      if (DEBUG_INSN_P (tmp->insn))
 		continue;
 	      n_uses++;
+
 	      IOR_COMPL_HARD_REG_SET (this_unavailable,
 				      reg_class_contents[tmp->cl]);
+
+	      superunion_class
+		= reg_class_superunion[(int) superunion_class][(int) tmp->cl];
 	    }
 
 	  if (n_uses < 2)
@@ -262,56 +472,52 @@ regrename_optimize (void)
 	    IOR_HARD_REG_SET (this_unavailable, call_used_reg_set);
 
 	  merge_overlapping_regs (&this_unavailable, this_head);
-
-	  /* Now potential_regs is a reasonable approximation, let's
-	     have a closer look at each register still in there.  */
-	  for (new_reg = 0; new_reg < FIRST_PSEUDO_REGISTER; new_reg++)
+	  /* Compute preferred rename class of super union of all the classes
+	     on the chain.  */
+	  preferred_class = targetm.preferred_rename_class(superunion_class);
+
+	  /* The register iteration order here is "preferred-register-first".
+	     Firstly(pass == 0), we iterate registers belong to PREFERRED_CLASS,
+	     if we find a new register, we stop immeidately.
+	     Otherwise, we iterate over registers that don't belong to
+	     PREFERRED_CLASS.
+	     If PREFERRED_CLASS is NO_REGS, we iterate over all registers in
+	     ascending order without any preference.  */
+	  for (pass = (preferred_class == NO_REGS ? 1 : 0); pass < 2; pass++)
 	    {
-	      enum machine_mode mode = GET_MODE (*this_head->first->loc);
-	      int nregs = hard_regno_nregs[new_reg][mode];
-
-	      for (i = nregs - 1; i >= 0; --i)
-	        if (TEST_HARD_REG_BIT (this_unavailable, new_reg + i)
-		    || fixed_regs[new_reg + i]
-		    || global_regs[new_reg + i]
-		    /* Can't use regs which aren't saved by the prologue.  */
-		    || (! df_regs_ever_live_p (new_reg + i)
-			&& ! call_used_regs[new_reg + i])
-#ifdef LEAF_REGISTERS
-		    /* We can't use a non-leaf register if we're in a
-		       leaf function.  */
-		    || (current_function_is_leaf
-			&& !LEAF_REGISTERS[new_reg + i])
-#endif
-#ifdef HARD_REGNO_RENAME_OK
-		    || ! HARD_REGNO_RENAME_OK (reg + i, new_reg + i)
-#endif
-		    )
-		  break;
-	      if (i >= 0)
-		continue;
-
-	      /* See whether it accepts all modes that occur in
-		 definition and uses.  */
-	      for (tmp = this_head->first; tmp; tmp = tmp->next_use)
-		if ((! HARD_REGNO_MODE_OK (new_reg, GET_MODE (*tmp->loc))
-		     && ! DEBUG_INSN_P (tmp->insn))
-		    || (this_head->need_caller_save_reg
-			&& ! (HARD_REGNO_CALL_PART_CLOBBERED
-			      (reg, GET_MODE (*tmp->loc)))
-			&& (HARD_REGNO_CALL_PART_CLOBBERED
-			    (new_reg, GET_MODE (*tmp->loc)))))
-		  break;
-	      if (! tmp)
+	      bool found = false;
+	      /* Now potential_regs is a reasonable approximation, let's
+		 have a closer look at each register still in there.  */
+	      for (new_reg = 0; new_reg < FIRST_PSEUDO_REGISTER; new_reg++)
 		{
-		  if (tick[best_new_reg] > tick[new_reg])
+		  /* Iterate registers first in prefered class.  */
+		  if (pass == 0
+		      && !TEST_HARD_REG_BIT (reg_class_contents[preferred_class],
+					     new_reg))
+		    continue;
+
+		  if (check_new_reg_p (reg, new_reg, this_head,
+				       this_unavailable))
 		    {
-		      best_new_reg = new_reg;
-		      best_nregs = nregs;
+		      if (tick[best_new_reg] > tick[new_reg])
+			{
+			  enum machine_mode mode
+			    = GET_MODE (*this_head->first->loc);
+			  best_new_reg = new_reg;
+			  best_nregs = hard_regno_nregs[new_reg][mode];
+			  /* If we find a new reg in our preferred class,
+			     stop immediately.  */
+			  if (best_new_reg != reg && pass == 0)
+			    {
+			      found = true;
+			      break;
+			    }
+			}
 		    }
 		}
+	      if (found)
+		break;
 	    }
-
 	  if (dump_file)
 	    {
 	      fprintf (dump_file, "Register %s in insn %d",
@@ -527,6 +733,7 @@ create_new_chain (unsigned this_regno, unsigned this_nregs, rtx *loc,
   head->need_caller_save_reg = 0;
   head->cannot_rename = 0;
   head->terminated = 0;
+  head->length = 0;
 
   VEC_safe_push (du_head_p, heap, id_to_chain, head);
   head->id = current_id++;
@@ -572,6 +779,8 @@ create_new_chain (unsigned this_regno, unsigned this_nregs, rtx *loc,
   this_du->loc = loc;
   this_du->insn = insn;
   this_du->cl = cl;
+
+  head->length = 1;
 }
 
 static void
@@ -661,6 +870,8 @@ scan_rtx_reg (rtx insn, rtx *loc, enum reg_class cl, enum scan_actions action,
 		head->last->next_use = this_du;
 	      head->last = this_du;
 
+	      if (!DEBUG_INSN_P (insn))
+		head->length++;
 	    }
 	  /* Avoid adding the same location in a DEBUG_INSN multiple times,
 	     which could happen with non-exact overlap.  */
diff --git a/gcc/target.def b/gcc/target.def
index 66006ae..ca4f225 100644
--- a/gcc/target.def
+++ b/gcc/target.def
@@ -2204,6 +2204,21 @@ DEFHOOK
  bool, (reg_class_t rclass),
  default_class_likely_spilled_p)
 
+DEFHOOK
+(preferred_rename_class,
+ "A target hook that places additional preference on the register\
+ class to use when it is necessary to rename a register in class\
+ @var{class} to another class, or perhaps @var{NO_REGS}, if no\
+ prefered register class is found or hook @code{preferred_rename_class}\
+ is not implemented.\
+ Sometimes returning a more restrictive class makes better code.  For\
+ example, on ARM, thumb-2 instructions using @code{LO_REGS} may be\
+ smaller than instructions using @code{GENERIC_REGS}.  By returning\
+ @code{LO_REGS} from @code{preferred_rename_class}, code size can\
+ be reduced.",
+ reg_class_t, (reg_class_t rclass),
+ default_preferred_rename_class)
+
 /* This target hook allows the backend to perform additional
    processing while initializing for variable expansion.  */
 DEFHOOK
diff --git a/gcc/targhooks.c b/gcc/targhooks.c
index 3647436..11d12d2 100644
--- a/gcc/targhooks.c
+++ b/gcc/targhooks.c
@@ -1270,6 +1270,12 @@ default_preferred_output_reload_class (rtx x ATTRIBUTE_UNUSED,
 #endif
 }
 
+reg_class_t
+default_preferred_rename_class (reg_class_t rclass)
+{
+  return NO_REGS;
+}
+
 /* The default implementation of TARGET_CLASS_LIKELY_SPILLED_P.  */
 
 bool
diff --git a/gcc/targhooks.h b/gcc/targhooks.h
index eeefe05..6fe8350 100644
--- a/gcc/targhooks.h
+++ b/gcc/targhooks.h
@@ -158,6 +158,7 @@ extern int default_register_move_cost (enum machine_mode, reg_class_t,
 extern bool default_profile_before_prologue (void);
 extern reg_class_t default_preferred_reload_class (rtx, reg_class_t);
 extern reg_class_t default_preferred_output_reload_class (rtx, reg_class_t);
+extern reg_class_t default_preferred_rename_class (reg_class_t rclass);
 extern bool default_class_likely_spilled_p (reg_class_t);
 
 extern enum unwind_info_type default_debug_unwind_info (void);

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [Ping] [patch 0/3] New macro PREFERRED_RENAME_CLASS
  2010-12-03 10:58                 ` Eric Botcazou
  2010-12-03 14:08                   ` Yao Qi
@ 2010-12-04 16:06                   ` Daniel Jacobowitz
  2010-12-06 11:11                     ` Eric Botcazou
  1 sibling, 1 reply; 26+ messages in thread
From: Daniel Jacobowitz @ 2010-12-04 16:06 UTC (permalink / raw)
  To: Eric Botcazou; +Cc: Yao Qi, gcc-patches

On Fri, Dec 03, 2010 at 11:57:20AM +0100, Eric Botcazou wrote:
> Finally, I have a question for native speakers: does preferred_rename_class 
> sound good or would preferred_renaming_class be better?

I'd go for preferred_rename_class, personally.

-- 
Daniel Jacobowitz
CodeSourcery

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [Ping] [patch 0/3] New macro PREFERRED_RENAME_CLASS
  2010-12-04 16:06                   ` [Ping] [patch 0/3] New macro PREFERRED_RENAME_CLASS Daniel Jacobowitz
@ 2010-12-06 11:11                     ` Eric Botcazou
  0 siblings, 0 replies; 26+ messages in thread
From: Eric Botcazou @ 2010-12-06 11:11 UTC (permalink / raw)
  To: Daniel Jacobowitz; +Cc: Yao Qi, gcc-patches

> I'd go for preferred_rename_class, personally.

OK, thanks for your feedback.

-- 
Eric Botcazou

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [Ping] [patch 0/3] New macro PREFERRED_RENAME_CLASS
  2010-12-03 14:08                   ` Yao Qi
@ 2010-12-06 18:27                     ` Eric Botcazou
  2010-12-06 20:34                       ` Joseph S. Myers
  2010-12-07 12:35                       ` Yao Qi
  2010-12-13 19:43                     ` ARM bootstrap failure (Re: [Ping] [patch 0/3] New macro PREFERRED_RENAME_CLASS) Ulrich Weigand
  1 sibling, 2 replies; 26+ messages in thread
From: Eric Botcazou @ 2010-12-06 18:27 UTC (permalink / raw)
  To: Yao Qi; +Cc: gcc-patches

> Seems I didn't understand targethook fully before.  You are right.  Done.

Thanks.  The patch is OK for mainline modulo the following last nits:

--- a/gcc/doc/tm.texi.in
+++ b/gcc/doc/tm.texi.in
@@ -2496,6 +2496,8 @@ looking for one that is valid, and will reload one or 
both registers
 only if neither labeling works.
 @end defmac
 
+@hook TARGET_PREFERRED_RENAME_CLASS
+
 @hook TARGET_PREFERRED_RELOAD_CLASS
 A target hook that places additional restrictions on the register class
 to use when it is necessary to copy value @var{x} into a register in class

It looks like most of the hooks have some text in tm.texi.in so I'd suggest 
writing a single sentence in target.def and move the rest of the text (with 
the example) to tm.texi.in.


diff --git a/gcc/regrename.c b/gcc/regrename.c
index 2535ab7..adbcde5 100644
--- a/gcc/regrename.c
+++ b/gcc/regrename.c
@@ -38,6 +38,7 @@
 #include "timevar.h"
 #include "tree-pass.h"
 #include "df.h"
+#include "target.h"

You also need to add $(TARGET_H) to the regrename.o rule in Makefile.in.


+  for (i = nregs - 1; i >= 0; --i)
+    if (TEST_HARD_REG_BIT (this_unavailable, new_reg + i)
+	|| fixed_regs[new_reg + i]
+	|| global_regs[new_reg + i]
+	/* Can't use regs which aren't saved by the prologue.  */
+	|| (! df_regs_ever_live_p (new_reg + i)
+	    && ! call_used_regs[new_reg + i])
+#ifdef LEAF_REGISTERS
+	/* We can't use a non-leaf register if we're in a
+	   leaf function.  */
+	|| (current_function_is_leaf
+	    && !LEAF_REGISTERS[new_reg + i])
+#endif
+#ifdef HARD_REGNO_RENAME_OK
+	|| ! HARD_REGNO_RENAME_OK (reg + i, new_reg + i)
+#endif
+	)
+      break;
+  if (i >= 0)
+    return false;

Just use "return false" instead of "break" and remove the "if (i >= 0)" test.


+  /* See whether it accepts all modes that occur in
+     definition and uses.  */
+  for (tmp = this_head->first; tmp; tmp = tmp->next_use)
+    if ((! HARD_REGNO_MODE_OK (new_reg, GET_MODE (*tmp->loc))
+	 && ! DEBUG_INSN_P (tmp->insn))
+	|| (this_head->need_caller_save_reg
+	    && ! (HARD_REGNO_CALL_PART_CLOBBERED
+		  (reg, GET_MODE (*tmp->loc)))
+	    && (HARD_REGNO_CALL_PART_CLOBBERED
+		(new_reg, GET_MODE (*tmp->loc)))))
+      break;

Likewise, you want "return false" here instead of "break".



+	  /* The register iteration order here is "preferred-register-first".
+	     Firstly(pass == 0), we iterate registers belong to PREFERRED_CLASS,
+	     if we find a new register, we stop immeidately.

immediately

+		  /* Iterate registers first in prefered class.  */

/* First iterate over registers in our preferred class.  */

 
+reg_class_t
+default_preferred_rename_class (reg_class_t rclass)
+{
+  return NO_REGS;
+}
+

You'll probably need ATTRIBUTE_UNUSED to be able to bootstrap that.  Also add 
the standard comment:

/* The default implementation of TARGET_PREFERRED_RENAME_CLASS.  */

in front of it.


No need to re-submit: make the above changes, check that you still get the 
desired effect with the hook for the ARM, and bootstrap/regtest on a native 
platform.  Thanks for your patience.

-- 
Eric Botcazou

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [Ping] [patch 0/3] New macro PREFERRED_RENAME_CLASS
  2010-12-06 18:27                     ` Eric Botcazou
@ 2010-12-06 20:34                       ` Joseph S. Myers
  2010-12-06 20:35                         ` Eric Botcazou
  2010-12-07 12:35                       ` Yao Qi
  1 sibling, 1 reply; 26+ messages in thread
From: Joseph S. Myers @ 2010-12-06 20:34 UTC (permalink / raw)
  To: Eric Botcazou; +Cc: Yao Qi, gcc-patches

On Mon, 6 Dec 2010, Eric Botcazou wrote:

> --- a/gcc/doc/tm.texi.in
> +++ b/gcc/doc/tm.texi.in
> @@ -2496,6 +2496,8 @@ looking for one that is valid, and will reload one or 
> both registers
>  only if neither labeling works.
>  @end defmac
>  
> +@hook TARGET_PREFERRED_RENAME_CLASS
> +
>  @hook TARGET_PREFERRED_RELOAD_CLASS
>  A target hook that places additional restrictions on the register class
>  to use when it is necessary to copy value @var{x} into a register in class
> 
> It looks like most of the hooks have some text in tm.texi.in so I'd suggest 
> writing a single sentence in target.def and move the rest of the text (with 
> the example) to tm.texi.in.

No!  Never do that for a new hook.  That should only ever be done if the 
hook documentation is based on pre-existing GFDL-only text, while we are 
waiting for the FSF Board of Directors to resolve the problems with not 
being able to move text from one place to another.  target.def is always 
the preferred place for hook documentation where legally possible.

-- 
Joseph S. Myers
joseph@codesourcery.com

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [Ping] [patch 0/3] New macro PREFERRED_RENAME_CLASS
  2010-12-06 20:34                       ` Joseph S. Myers
@ 2010-12-06 20:35                         ` Eric Botcazou
  2010-12-06 20:39                           ` Joseph S. Myers
  0 siblings, 1 reply; 26+ messages in thread
From: Eric Botcazou @ 2010-12-06 20:35 UTC (permalink / raw)
  To: Joseph S. Myers; +Cc: Yao Qi, gcc-patches

> No!  Never do that for a new hook.  That should only ever be done if the
> hook documentation is based on pre-existing GFDL-only text, while we are
> waiting for the FSF Board of Directors to resolve the problems with not
> being able to move text from one place to another.  target.def is always
> the preferred place for hook documentation where legally possible.

OK, thanks for the heads up.  Is that written down somewhere?

-- 
Eric Botcazou

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [Ping] [patch 0/3] New macro PREFERRED_RENAME_CLASS
  2010-12-06 20:35                         ` Eric Botcazou
@ 2010-12-06 20:39                           ` Joseph S. Myers
  2010-12-06 22:02                             ` Eric Botcazou
  0 siblings, 1 reply; 26+ messages in thread
From: Joseph S. Myers @ 2010-12-06 20:39 UTC (permalink / raw)
  To: Eric Botcazou; +Cc: Yao Qi, gcc-patches

On Mon, 6 Dec 2010, Eric Botcazou wrote:

> > No!  Never do that for a new hook.  That should only ever be done if the
> > hook documentation is based on pre-existing GFDL-only text, while we are
> > waiting for the FSF Board of Directors to resolve the problems with not
> > being able to move text from one place to another.  target.def is always
> > the preferred place for hook documentation where legally possible.
> 
> OK, thanks for the heads up.  Is that written down somewhere?

There's a long comment near the top of target.def discussing where 
documentation should go in different cases.

-- 
Joseph S. Myers
joseph@codesourcery.com

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [Ping] [patch 0/3] New macro PREFERRED_RENAME_CLASS
  2010-12-06 20:39                           ` Joseph S. Myers
@ 2010-12-06 22:02                             ` Eric Botcazou
  0 siblings, 0 replies; 26+ messages in thread
From: Eric Botcazou @ 2010-12-06 22:02 UTC (permalink / raw)
  To: Joseph S. Myers; +Cc: Yao Qi, gcc-patches

> There's a long comment near the top of target.def discussing where
> documentation should go in different cases.

Thanks.  "near" is the important word here, I had looked at the top of this 
file, but the long comment doesn't even start in the first page!

-- 
Eric Botcazou

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [Ping] [patch 0/3] New macro PREFERRED_RENAME_CLASS
  2010-12-06 18:27                     ` Eric Botcazou
  2010-12-06 20:34                       ` Joseph S. Myers
@ 2010-12-07 12:35                       ` Yao Qi
  2010-12-13 13:10                         ` Eric Botcazou
  1 sibling, 1 reply; 26+ messages in thread
From: Yao Qi @ 2010-12-07 12:35 UTC (permalink / raw)
  To: Eric Botcazou; +Cc: gcc-patches

On 12/07/2010 02:25 AM, Eric Botcazou wrote:
>> Seems I didn't understand targethook fully before.  You are right.  Done.
> 
> Thanks.  The patch is OK for mainline modulo the following last nits:
> 
> No need to re-submit: make the above changes, check that you still get the 
> desired effect with the hook for the ARM, and bootstrap/regtest on a native 
> platform.  Thanks for your patience.
> 

Bootstrap and regression tested on x86_64-linux.  Committed.  Thanks a
lot for your patient review.

-- 
Yao (齐尧)

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [Ping] [patch 0/3] New macro PREFERRED_RENAME_CLASS
  2010-12-07 12:35                       ` Yao Qi
@ 2010-12-13 13:10                         ` Eric Botcazou
  2011-01-02 17:43                           ` Eric Botcazou
  0 siblings, 1 reply; 26+ messages in thread
From: Eric Botcazou @ 2010-12-13 13:10 UTC (permalink / raw)
  To: Yao Qi; +Cc: gcc-patches

[-- Attachment #1: Type: text/plain, Size: 889 bytes --]

> Bootstrap and regression tested on x86_64-linux.  Committed.  Thanks a
> lot for your patient review.

You're welcome.  However, I've pondered some more on this and I now think that 
we went a little too far.  The problem is the sorting of DU chains.  Although 
this isn't explicit in the pass (lack of general comment at the top of the 
file among other things), it uses a round-robin algorithm to rename the 
registers (the 'tick' business); sorting the chains damages the algorithm.

So I think that we cannot sort the chains, at least not so abruptly.  The 
attached patch removes the sort, but in exchange forces the renaming of 
registers that don't belong to the preferred class to registers that do, even 
though they were used recently.  It also adds the missing general comment.

Could you give it a whirl on your favorite set of tests and see how it fares?

-- 
Eric Botcazou

[-- Attachment #2: regrename.diff --]
[-- Type: text/x-diff, Size: 11968 bytes --]

Index: regrename.c
===================================================================
--- regrename.c	(revision 167721)
+++ regrename.c	(working copy)
@@ -40,10 +40,33 @@
 #include "df.h"
 #include "target.h"
 
+/* This file implements the RTL register renaming pass of the compiler.  It is
+   a semi-local pass whose goal is to maximize the usage of the register file
+   of the processor by substituting registers for others in the solution given
+   by the register allocator.  The algorithm is as follows:
+
+     1. Local def/use chains are built: within each basic block, chains are
+	opened and closed; if a chain isn't closed at the end of the block,
+	it is dropped.
+
+     2. For each chain, the set of possible renaming registers is computed.
+	This takes into account the renaming of previously processed chains.
+	Optionally, a preferred class is computed for the renaming register.
+
+     3. The best renaming register is computed for the chain in the above set,
+	using a round-robin allocation.  If a preferred class exists, then the
+	round-robin allocation is done within the class first, if possible.
+	The round-robin allocation of renaming registers itself is global.
+
+     4. If a renaming register has been found, it is substituted in the chain.
+
+  Targets can parameterize the pass by specifying a preferred class for the
+  renaming register for a given (super)class of registers to be renamed.  */
+
 #if HOST_BITS_PER_WIDE_INT <= MAX_RECOG_OPERANDS
 #error "Use a different bitmap implementation for untracked_operands."
 #endif
-   
+
 /* We keep linked lists of DU_HEAD structures, each of which describes
    a chain of occurrences of a reg.  */
 struct du_head
@@ -52,11 +75,6 @@ struct du_head
   struct du_head *next_chain;
   /* The first and last elements of this chain.  */
   struct du_chain *first, *last;
-  /* The number of elements of this chain, excluding those corresponding
-     to references of the register in debug insns.  The du_head linked
-     list can be sorted by this, and register-rename can prefer
-     register classes according to this order.  */
-  int length;
   /* Describes the register being tracked.  */
   unsigned regno, nregs;
 
@@ -160,150 +178,6 @@ merge_overlapping_regs (HARD_REG_SET *ps
     }
 }
 
-/* Return the Nth element in LIST.  If LIST contains less than N
-   elements, return the last one.  */
-static struct du_head *
-get_element (struct du_head *list, int n)
-{
-  while (n-- && list->next_chain != NULL)
-    list = list->next_chain;
-
-  return list;
-}
-
-/* Comparison function of merge sort.  Return true if A is less than
-   B, otherwise return false.  */
-static inline int
-merge_sort_comparison(const struct du_head *a,
-		    const struct du_head *b)
-{
-  return a->length < b->length;
-}
-
-/* Merge the first 2 sub-lists of LENGTH nodes contained in the
-   linked list pointed to by START_NODE.  Update START_NODE to point
-   to the merged nodes, and return a pointer to the last merged
-   node.  Return NULL if START_NODE doesn't contain enough
-   elements, or this pass of merge is done.  */
-
-static struct du_head *
-merge(struct du_head **start_node, int length)
-{
-  int i, left_count, right_count;
-  struct du_head *left, *right;
-  /* Current node of sort result.  */
-  struct du_head *current_sorted_node;
-  /* Tail node of sort, used to connect with next piece of list.  */
-  struct du_head *current_tail_node;
-
-  if (*start_node == NULL)
-    return NULL;
-
-  left = right = *start_node;
-  right_count = left_count = 0;
-
-  /* Step RIGHT along the list by LENGTH places.  */
-  for (i = 0; i < length; i++)
-    {
-      right = right->next_chain;
-      if (right == NULL)
-	{
-	  return NULL;
-	}
-    }
-
-  /* Initialize current_sorted_node.  */
-  if (merge_sort_comparison (left, right))
-    {
-      ++right_count;
-      current_sorted_node = right;
-      *start_node = right;
-      right = right->next_chain;
-    }
-  else
-    {
-      ++left_count;
-      current_sorted_node = left;
-      left = left->next_chain;
-    }
-
-  while (1)
-    {
-      /* Choose LEFT or RIGHT to take the next element from.  If
-	 either is empty, choose from the other one.  */
-      if (left_count == length || left == NULL)
-	{
-	  current_sorted_node->next_chain = right;
-	  current_tail_node = get_element (current_sorted_node,
-					   length - right_count);
-
-	  break;
-	}
-      else if (right_count == length || right == NULL)
-	{
-	  /* Save the head node of next piece of linked list.  */
-	  struct du_head *tmp = current_sorted_node->next_chain;
-
-	  current_sorted_node->next_chain = left;
-	  current_tail_node
-	    = get_element (current_sorted_node,
-			   length - left_count);
-	  /* Connect sorted list to next piece of list.  */
-	  current_tail_node->next_chain = tmp;
-	  break;
-	}
-      else
-	{
-	  /* Normal merge operations.  If both LEFT and RIGHT are
-	     non-empty, compare the first element of each and choose
-	     the lower one.  */
-	  if (merge_sort_comparison (left, right))
-	    {
-	      right_count++;
-	      current_sorted_node->next_chain = right;
-	      right = right->next_chain;
-	    }
-	  else
-	    {
-	      left_count++;
-	      current_sorted_node->next_chain = left;
-	      left = left->next_chain;
-	    }
-	  current_sorted_node = current_sorted_node->next_chain;
-	}
-    }
-  /* Return NULL if this pass of merge is done.  */
-  return (current_tail_node->next_chain ? current_tail_node : NULL);
-}
-
-/* Sort the linked list pointed to by HEAD.  The algorithm is a
-   non-recursive merge sort to linked list.  */
-
-static void
-sort_du_head (struct du_head **head)
-{
-  int current_length = 1;
-  struct du_head *last_tail;
-
-  /* In each pass, lists of size current_length is merged to
-     lists of size 2xcurrent_length (Initially current_length
-     is 1).  */
-  while (1)
-    {
-      last_tail = merge(head, current_length);
-      if (last_tail != NULL)
-	{
-	  do
-	    last_tail = merge (&last_tail->next_chain, current_length);
-	  while (last_tail != NULL);
-
-	  current_length *= 2;
-	}
-      else
-	break;
-    }
-}
-
 /* Check if NEW_REG can be the candidate register to rename for
    REG in THIS_HEAD chain.  THIS_UNAVAILABLE is a set of unavailable hard
    registers.  */
@@ -392,8 +266,6 @@ regrename_optimize (void)
       if (dump_file)
 	dump_def_use_chain (all_chains);
 
-      sort_du_head (&all_chains);
-
       CLEAR_HARD_REG_SET (unavailable);
       /* Don't clobber traceback for noreturn functions.  */
       if (frame_pointer_needed)
@@ -413,8 +285,9 @@ regrename_optimize (void)
 	  HARD_REG_SET this_unavailable;
 	  int reg = this_head->regno;
 	  int pass;
-	  enum reg_class superunion_class = NO_REGS;
+	  enum reg_class super_class = NO_REGS;
 	  enum reg_class preferred_class;
+	  bool has_preferred_class;
 
 	  all_chains = this_head->next_chain;
 
@@ -444,79 +317,78 @@ regrename_optimize (void)
 
 	  COPY_HARD_REG_SET (this_unavailable, unavailable);
 
-	  /* Iterate elements in chain in order to:
+	  /* Iterate over elements in the chain in order to:
 	     1. Count number of uses, and narrow the set of registers we can
-	     use for renaming.
+		use for renaming.
 	     2. Compute the superunion of register classes in this chain.  */
 	  n_uses = 0;
-	  superunion_class = NO_REGS;
+	  super_class = NO_REGS;
 	  for (tmp = this_head->first; tmp; tmp = tmp->next_use)
 	    {
 	      if (DEBUG_INSN_P (tmp->insn))
 		continue;
 	      n_uses++;
-
 	      IOR_COMPL_HARD_REG_SET (this_unavailable,
 				      reg_class_contents[tmp->cl]);
-
-	      superunion_class
-		= reg_class_superunion[(int) superunion_class][(int) tmp->cl];
+	      super_class
+		= reg_class_superunion[(int) super_class][(int) tmp->cl];
 	    }
 
 	  if (n_uses < 2)
 	    continue;
 
+	  /* Further narrow the set of registers we can use for renaming.
+	     If the chain needs a call-saved register, mark the call-used
+	     registers as unavailable.  */
 	  if (this_head->need_caller_save_reg)
 	    IOR_HARD_REG_SET (this_unavailable, call_used_reg_set);
 
+	  /* And mark registers that overlap its lifetime as unavailable.  */
 	  merge_overlapping_regs (&this_unavailable, this_head);
+
 	  /* Compute preferred rename class of super union of all the classes
-	     on the chain.  */
+	     in the chain.  */
 	  preferred_class
-	    = (enum reg_class) targetm.preferred_rename_class(superunion_class);
+	    = (enum reg_class) targetm.preferred_rename_class (super_class);
 
-	  /* The register iteration order here is "preferred-register-first".
-	     Firstly(pass == 0), we iterate registers belong to PREFERRED_CLASS,
-	     if we find a new register, we stop immeidately.
-	     Otherwise, we iterate over registers that don't belong to
-	     PREFERRED_CLASS.
+	  /* If PREFERRED_CLASS is not NO_REGS, we iterate in the first pass
+	     over registers that belong to PREFERRED_CLASS and try to find the
+	     best register within the class.  If that failed, we iterate in
+	     the second pass over registers that don't belong to the class.
 	     If PREFERRED_CLASS is NO_REGS, we iterate over all registers in
 	     ascending order without any preference.  */
-	  for (pass = (preferred_class == NO_REGS ? 1 : 0); pass < 2; pass++)
+	  has_preferred_class = (preferred_class != NO_REGS);
+	  for (pass = (has_preferred_class ? 0 : 1); pass < 2; pass++)
 	    {
-	      bool found = false;
-	      /* Now potential_regs is a reasonable approximation, let's
-		 have a closer look at each register still in there.  */
 	      for (new_reg = 0; new_reg < FIRST_PSEUDO_REGISTER; new_reg++)
 		{
-		  /* Iterate registers first in prefered class.  */
-		  if (pass == 0
-		      && !TEST_HARD_REG_BIT (reg_class_contents[preferred_class],
-					     new_reg))
+		  if (has_preferred_class
+		      && (pass == 0)
+			 != TEST_HARD_REG_BIT
+			    (reg_class_contents[preferred_class], new_reg))
 		    continue;
 
+		  /* In the first pass, we force the renaming of registers that
+		     don't belong to PREFERRED_CLASS to registers that do, even
+		     though the latters were used not very long ago.  */
 		  if (check_new_reg_p (reg, new_reg, this_head,
-				       this_unavailable))
+				       this_unavailable)
+		      && ((pass == 0
+			   && !TEST_HARD_REG_BIT
+			       (reg_class_contents[preferred_class],
+			        best_new_reg))
+			  || tick[best_new_reg] > tick[new_reg]))
 		    {
-		      if (tick[best_new_reg] > tick[new_reg])
-			{
-			  enum machine_mode mode
-			    = GET_MODE (*this_head->first->loc);
-			  best_new_reg = new_reg;
-			  best_nregs = hard_regno_nregs[new_reg][mode];
-			  /* If we find a new reg in our preferred class,
-			     stop immediately.  */
-			  if (best_new_reg != reg && pass == 0)
-			    {
-			      found = true;
-			      break;
-			    }
-			}
+		      enum machine_mode mode
+			= GET_MODE (*this_head->first->loc);
+		      best_new_reg = new_reg;
+		      best_nregs = hard_regno_nregs[new_reg][mode];
 		    }
 		}
-	      if (found)
+	      if (pass == 0 && best_new_reg != reg)
 		break;
 	    }
+
 	  if (dump_file)
 	    {
 	      fprintf (dump_file, "Register %s in insn %d",
@@ -732,7 +604,6 @@ create_new_chain (unsigned this_regno, u
   head->need_caller_save_reg = 0;
   head->cannot_rename = 0;
   head->terminated = 0;
-  head->length = 0;
 
   VEC_safe_push (du_head_p, heap, id_to_chain, head);
   head->id = current_id++;
@@ -778,8 +649,6 @@ create_new_chain (unsigned this_regno, u
   this_du->loc = loc;
   this_du->insn = insn;
   this_du->cl = cl;
-
-  head->length = 1;
 }
 
 static void
@@ -868,9 +737,6 @@ scan_rtx_reg (rtx insn, rtx *loc, enum r
 	      else
 		head->last->next_use = this_du;
 	      head->last = this_du;
-
-	      if (!DEBUG_INSN_P (insn))
-		head->length++;
 	    }
 	  /* Avoid adding the same location in a DEBUG_INSN multiple times,
 	     which could happen with non-exact overlap.  */

^ permalink raw reply	[flat|nested] 26+ messages in thread

* ARM bootstrap failure (Re: [Ping] [patch 0/3] New macro PREFERRED_RENAME_CLASS)
  2010-12-03 14:08                   ` Yao Qi
  2010-12-06 18:27                     ` Eric Botcazou
@ 2010-12-13 19:43                     ` Ulrich Weigand
  2010-12-13 20:46                       ` Ulrich Weigand
  1 sibling, 1 reply; 26+ messages in thread
From: Ulrich Weigand @ 2010-12-13 19:43 UTC (permalink / raw)
  To: Yao Qi; +Cc: Eric Botcazou, gcc-patches

Yao Qi wrote:

> +static bool
> +check_new_reg_p (int reg, int new_reg,  struct du_head *this_head,
> +		 HARD_REG_SET this_unavailable)

This function causes a native bootstrap failure on ARM in stage 2:

../../gcc-head/gcc/regrename.c: In function "check_new_reg_p":
../../gcc-head/gcc/regrename.c:312:22: error: unused parameter "reg" [-Werror=unused-parameter]
cc1: all warnings being treated as errors

This is because "reg" is only used as input to target macros
(HARD_REGNO_RENAME_OK, HARD_REGNO_CALL_PART_CLOBBERED), and
the ARM versions of those don't actually use that argument.

I guess this needs an ATTRIBUTE_UNUSED ...

Bye,
Ulrich

-- 
  Dr. Ulrich Weigand
  GNU Toolchain for Linux on System z and Cell BE
  Ulrich.Weigand@de.ibm.com

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: ARM bootstrap failure (Re: [Ping] [patch 0/3] New macro PREFERRED_RENAME_CLASS)
  2010-12-13 19:43                     ` ARM bootstrap failure (Re: [Ping] [patch 0/3] New macro PREFERRED_RENAME_CLASS) Ulrich Weigand
@ 2010-12-13 20:46                       ` Ulrich Weigand
  0 siblings, 0 replies; 26+ messages in thread
From: Ulrich Weigand @ 2010-12-13 20:46 UTC (permalink / raw)
  To: Ulrich Weigand; +Cc: Yao Qi, Eric Botcazou, gcc-patches

I wrote:
> Yao Qi wrote:
> 
> > +static bool
> > +check_new_reg_p (int reg, int new_reg,  struct du_head *this_head,
> > +		 HARD_REG_SET this_unavailable)
> 
> This function causes a native bootstrap failure on ARM in stage 2:
> 
> ../../gcc-head/gcc/regrename.c: In function "check_new_reg_p":
> ../../gcc-head/gcc/regrename.c:312:22: error: unused parameter "reg" [-Werror=unused-parameter]
> cc1: all warnings being treated as errors
> 
> This is because "reg" is only used as input to target macros
> (HARD_REGNO_RENAME_OK, HARD_REGNO_CALL_PART_CLOBBERED), and
> the ARM versions of those don't actually use that argument.
> 
> I guess this needs an ATTRIBUTE_UNUSED ...

Oops, this was already fixed by Jakub here:
http://gcc.gnu.org/ml/gcc-patches/2010-12/msg00675.html

Sorry for the noise.

Bye,
Ulrich

-- 
  Dr. Ulrich Weigand
  GNU Toolchain for Linux on System z and Cell BE
  Ulrich.Weigand@de.ibm.com

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [Ping] [patch 0/3] New macro PREFERRED_RENAME_CLASS
  2010-12-13 13:10                         ` Eric Botcazou
@ 2011-01-02 17:43                           ` Eric Botcazou
  0 siblings, 0 replies; 26+ messages in thread
From: Eric Botcazou @ 2011-01-02 17:43 UTC (permalink / raw)
  To: gcc-patches; +Cc: Yao Qi

> So I think that we cannot sort the chains, at least not so abruptly.  The
> attached patch removes the sort, but in exchange forces the renaming of
> registers that don't belong to the preferred class to registers that do,
> even though they were used recently.  It also adds the missing general
> comment.
>
> Could you give it a whirl on your favorite set of tests and see how it
> fares?

I looked at the assembly generated for the ARM on a (limited) set of testcases 
and this seems to still have the desired effect.

Tested on x86_64-suse-linux, applied on the mainline.


2011-01-02  Eric Botcazou  <ebotcazou@adacore.com>

	* regrename.c: Add general comment describing the pass.
	(struct du_head): Remove 'length' field.
	(get_element, merge_sort_comparison, merge, sort_du_head): Remove.
	(regrename_optimize): Do not sort chains.  Rework comments, add others.
	Force renaming to the preferred class (if any) in the first pass and do
	not consider registers that belong to it in the second pass.
	(create_new_chain): Do not set 'length' field.
	(scan_rtx_reg): Likewise.


-- 
Eric Botcazou

^ permalink raw reply	[flat|nested] 26+ messages in thread

end of thread, other threads:[~2011-01-02 17:29 UTC | newest]

Thread overview: 26+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2010-11-04  5:31 [Ping] [patch 0/3] New macro PREFERRED_RENAME_CLASS Yao Qi
2010-11-05  9:53 ` Eric Botcazou
2010-11-05 16:08   ` Yao Qi
2010-11-08 12:52     ` Eric Botcazou
2010-11-08 15:11       ` Yao Qi
2010-11-09  9:26         ` Eric Botcazou
2010-11-09 15:33           ` Yao Qi
2010-11-11 17:40             ` Eric Botcazou
2010-11-11 18:01               ` Eric Botcazou
2010-12-01 16:48               ` Yao Qi
2010-12-03 10:58                 ` Eric Botcazou
2010-12-03 14:08                   ` Yao Qi
2010-12-06 18:27                     ` Eric Botcazou
2010-12-06 20:34                       ` Joseph S. Myers
2010-12-06 20:35                         ` Eric Botcazou
2010-12-06 20:39                           ` Joseph S. Myers
2010-12-06 22:02                             ` Eric Botcazou
2010-12-07 12:35                       ` Yao Qi
2010-12-13 13:10                         ` Eric Botcazou
2011-01-02 17:43                           ` Eric Botcazou
2010-12-13 19:43                     ` ARM bootstrap failure (Re: [Ping] [patch 0/3] New macro PREFERRED_RENAME_CLASS) Ulrich Weigand
2010-12-13 20:46                       ` Ulrich Weigand
2010-12-04 16:06                   ` [Ping] [patch 0/3] New macro PREFERRED_RENAME_CLASS Daniel Jacobowitz
2010-12-06 11:11                     ` Eric Botcazou
2010-11-20 12:05 ` Eric Botcazou
2010-11-23 12:06   ` Yao Qi

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).