public inbox for gcc-patches@gcc.gnu.org
* RFA: patch for PR37397
@ 2008-11-10 18:43 Vladimir Makarov
  2008-11-12 17:23 ` Jeff Law
  0 siblings, 1 reply; 4+ messages in thread
From: Vladimir Makarov @ 2008-11-10 18:43 UTC (permalink / raw)
  To: gcc-patches; +Cc: Jeffrey Law, Kenneth Zadeck

[-- Attachment #1: Type: text/plain, Size: 1101 bytes --]

This patch improves SPEC benchmark results.  I saw stable improvements
on x86/x86_64 and ppc.  The patch implements a small trick mentioned in
the classic article by Chaitin et al. ("Register Allocation and
Spilling via Graph Coloring").  There is no point in spilling a pseudo
in whose live range nothing dies: the spill will not make other
allocnos colorable, and additional reloads for the corresponding pseudo
will be generated in the reload pass for each insn in which it occurs.
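
To make the idea concrete, here is a minimal, self-contained sketch of
the live-range part of the test (hypothetical names and types, not the
IRA data structures; the full heuristic in the patch also requires that
the pseudo occur in at least one insn or address where it cannot be
replaced by memory):

/* A pseudo in whose live range no other pseudo dies is a poor spill
   candidate: spilling it cannot lower register pressure for its
   neighbors, it only adds loads/stores around its own uses.  */

#include <stdbool.h>
#include <stdio.h>

struct live_range { int start, finish; };

/* Return true if no point in DEATHS (program points where some other
   pseudo dies) falls strictly inside R.  */
static bool
bad_spill_candidate_p (struct live_range r, const int *deaths, int n_deaths)
{
  for (int i = 0; i < n_deaths; i++)
    if (deaths[i] > r.start && deaths[i] < r.finish)
      return false;  /* Something dies inside the range.  */
  return true;       /* Nothing dies here; spilling only adds reloads.  */
}

int
main (void)
{
  int deaths[] = { 4, 9 };           /* Other pseudos die at points 4 and 9.  */
  struct live_range a = { 2, 7 };    /* Contains a death: normal candidate.  */
  struct live_range b = { 10, 15 };  /* Contains no death: bad to spill.  */

  printf ("a: %s\n", bad_spill_candidate_p (a, deaths, 2) ? "bad spill" : "ok");
  printf ("b: %s\n", bad_spill_candidate_p (b, deaths, 2) ? "bad spill" : "ok");
  return 0;
}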

Is it ok to commit?

2008-11-10  Vladimir Makarov  <vmakarov@redhat.com>

    PR rtl-optimization/37397
    * ira-int.h (struct ira_allocno): New member bad_spill_p.
    (ALLOCNO_BAD_SPILL_P): New macro.

    * ira-color.c (push_allocnos_to_stack): Check ALLOCNO_BAD_SPILL_P.

    * ira-build.c (ira_create_allocno): Initialize
    ALLOCNO_BAD_SPILL_P.
    (create_cap_allocno, propagate_allocno_info,
    remove_unnecessary_allocnos): Set up or update
    ALLOCNO_BAD_SPILL_P.
    (update_bad_spill_attribute): New function.
    (ira_build): Call it.

    * ira-costs.c (record_reg_classes): Set up ALLOCNO_BAD_SPILL_P.
   


[-- Attachment #2: pr37397.patch --]
[-- Type: text/x-patch, Size: 8777 bytes --]

Index: ira-int.h
===================================================================
--- ira-int.h	(revision 141708)
+++ ira-int.h	(working copy)
@@ -351,6 +351,10 @@ struct ira_allocno
      region and all its subregions recursively.  */
   unsigned int no_stack_reg_p : 1, total_no_stack_reg_p : 1;
 #endif
+  /* TRUE value means that it makes no sense to spill the allocno
+     during coloring because the spill will result in additional
+     reloads in the reload pass.  */
+  unsigned int bad_spill_p : 1;
   /* TRUE value means that the allocno was not removed yet from the
      conflicting graph during colouring.  */
   unsigned int in_graph_p : 1;
@@ -435,6 +439,7 @@ struct ira_allocno
 #define ALLOCNO_NO_STACK_REG_P(A) ((A)->no_stack_reg_p)
 #define ALLOCNO_TOTAL_NO_STACK_REG_P(A) ((A)->total_no_stack_reg_p)
 #endif
+#define ALLOCNO_BAD_SPILL_P(A) ((A)->bad_spill_p)
 #define ALLOCNO_IN_GRAPH_P(A) ((A)->in_graph_p)
 #define ALLOCNO_ASSIGNED_P(A) ((A)->assigned_p)
 #define ALLOCNO_MAY_BE_SPILLED_P(A) ((A)->may_be_spilled_p)
Index: ira-color.c
===================================================================
--- ira-color.c	(revision 141708)
+++ ira-color.c	(working copy)
@@ -1195,7 +1195,10 @@ push_allocnos_to_stack (void)
 			  * ira_reg_class_nregs[ALLOCNO_COVER_CLASS
 						(i_allocno)]
 			  [ALLOCNO_MODE (i_allocno)] + 1));
-		  if (allocno == NULL || allocno_pri > i_allocno_pri
+		  if (allocno == NULL
+		      || (! ALLOCNO_BAD_SPILL_P (i_allocno)
+			  && ALLOCNO_BAD_SPILL_P (allocno))
+		      || allocno_pri > i_allocno_pri
 		      || (allocno_pri == i_allocno_pri
 			  && (allocno_cost > i_allocno_cost
 			      || (allocno_cost == i_allocno_cost 
Index: ira-build.c
===================================================================
--- ira-build.c	(revision 141708)
+++ ira-build.c	(working copy)
@@ -456,6 +456,7 @@ ira_create_allocno (int regno, bool cap_
   ALLOCNO_SOMEWHERE_RENAMED_P (a) = false;
   ALLOCNO_CHILD_RENAMED_P (a) = false;
   ALLOCNO_DONT_REASSIGN_P (a) = false;
+  ALLOCNO_BAD_SPILL_P (a) = false;
   ALLOCNO_IN_GRAPH_P (a) = false;
   ALLOCNO_ASSIGNED_P (a) = false;
   ALLOCNO_MAY_BE_SPILLED_P (a) = false;
@@ -775,6 +776,7 @@ create_cap_allocno (ira_allocno_t a)
   ira_allocate_and_copy_costs
     (&ALLOCNO_CONFLICT_HARD_REG_COSTS (cap), cover_class,
      ALLOCNO_CONFLICT_HARD_REG_COSTS (a));
+  ALLOCNO_BAD_SPILL_P (cap) = ALLOCNO_BAD_SPILL_P (a);
   ALLOCNO_NREFS (cap) = ALLOCNO_NREFS (a);
   ALLOCNO_FREQ (cap) = ALLOCNO_FREQ (a);
   ALLOCNO_CALL_FREQ (cap) = ALLOCNO_CALL_FREQ (a);
@@ -1484,6 +1486,8 @@ propagate_allocno_info (void)
 	  && bitmap_bit_p (ALLOCNO_LOOP_TREE_NODE (a)->border_allocnos,
 			   ALLOCNO_NUM (a)))
 	{
+	  if (! ALLOCNO_BAD_SPILL_P (a))
+	    ALLOCNO_BAD_SPILL_P (parent_a) = false;
 	  ALLOCNO_NREFS (parent_a) += ALLOCNO_NREFS (a);
 	  ALLOCNO_FREQ (parent_a) += ALLOCNO_FREQ (a);
 	  ALLOCNO_CALL_FREQ (parent_a) += ALLOCNO_CALL_FREQ (a);
@@ -1771,6 +1775,8 @@ remove_unnecessary_allocnos (void)
 		  += ALLOCNO_CALLS_CROSSED_NUM (a);
 		ALLOCNO_EXCESS_PRESSURE_POINTS_NUM (parent_a)
 		  += ALLOCNO_EXCESS_PRESSURE_POINTS_NUM (a);
+		if (! ALLOCNO_BAD_SPILL_P (a))
+		  ALLOCNO_BAD_SPILL_P (parent_a) = false;
 #ifdef STACK_REGS
 		if (ALLOCNO_TOTAL_NO_STACK_REG_P (a))
 		  ALLOCNO_TOTAL_NO_STACK_REG_P (parent_a) = true;
@@ -1819,6 +1825,69 @@ remove_unnecessary_regions (void)
 
 \f
 
+/* At this point, a true value of the allocno attribute bad_spill_p
+   means that there is an insn in which the allocno occurs and where
+   it can not be used as memory.  The function updates the attribute:
+   afterwards it can be true only for allocnos which can not be used
+   as memory in an insn and in whose live ranges no other allocnos
+   die.  Spilling such allocnos will not improve the code because it
+   will not make other allocnos colorable, and additional reloads for
+   the corresponding pseudo will be generated in the reload pass for
+   each insn in which it occurs.
+
+   This is a trick mentioned in the classic article of Chaitin et al.
+   which is frequently omitted in other implementations of RA based
+   on graph coloring.  */
+static void
+update_bad_spill_attribute (void)
+{
+  int i;
+  ira_allocno_t a;
+  ira_allocno_iterator ai;
+  allocno_live_range_t r;
+  enum reg_class cover_class;
+  bitmap_head dead_points[N_REG_CLASSES];
+
+  for (i = 0; i < ira_reg_class_cover_size; i++)
+    {
+      cover_class = ira_reg_class_cover[i];
+      bitmap_initialize (&dead_points[cover_class], &reg_obstack);
+    }
+  FOR_EACH_ALLOCNO (a, ai)
+    {
+      cover_class = ALLOCNO_COVER_CLASS (a);
+      if (cover_class == NO_REGS)
+	continue;
+      for (r = ALLOCNO_LIVE_RANGES (a); r != NULL; r = r->next)
+	bitmap_set_bit (&dead_points[cover_class], r->finish);
+    }
+  FOR_EACH_ALLOCNO (a, ai)
+    {
+      cover_class = ALLOCNO_COVER_CLASS (a);
+      if (cover_class == NO_REGS)
+	continue;
+      if (! ALLOCNO_BAD_SPILL_P (a))
+	continue;
+      for (r = ALLOCNO_LIVE_RANGES (a); r != NULL; r = r->next)
+	{
+	  for (i = r->start + 1; i < r->finish; i++)
+	    if (bitmap_bit_p (&dead_points[cover_class], i))
+	      break;
+	  if (i < r->finish)
+	    break;
+	}
+      if (r != NULL)
+	ALLOCNO_BAD_SPILL_P (a) = false;
+    }
+  for (i = 0; i < ira_reg_class_cover_size; i++)
+    {
+      cover_class = ira_reg_class_cover[i];
+      bitmap_clear (&dead_points[cover_class]);
+    }
+}
+
+\f
+
 /* Set up minimal and maximal live range points for allocnos.  */
 static void
 setup_min_max_allocno_live_range_point (void)
@@ -2432,6 +2501,7 @@ ira_build (bool loops_p)
   ira_create_allocno_live_ranges ();
   remove_unnecessary_regions ();
   ira_compress_allocno_live_ranges ();
+  update_bad_spill_attribute ();
   loops_p = more_one_region_p ();
   if (loops_p)
     {
Index: ira-costs.c
===================================================================
--- ira-costs.c	(revision 141708)
+++ ira-costs.c	(working copy)
@@ -187,6 +187,10 @@ record_reg_classes (int n_alts, int n_op
   int alt;
   int i, j, k;
   rtx set;
+  int insn_allows_mem[MAX_RECOG_OPERANDS];
+
+  for (i = 0; i < n_ops; i++)
+    insn_allows_mem[i] = 0;
 
   /* Process each alternative, each time minimizing an operand's cost
      with the cost for each operand in that alternative.  */
@@ -236,6 +240,8 @@ record_reg_classes (int n_alts, int n_op
 	      j = p[0] - '0';
 	      classes[i] = classes[j];
 	      allows_mem[i] = allows_mem[j];
+	      if (allows_mem[i])
+		insn_allows_mem[i] = 1;
 
 	      if (! REG_P (op) || REGNO (op) < FIRST_PSEUDO_REGISTER)
 		{
@@ -302,6 +308,7 @@ record_reg_classes (int n_alts, int n_op
 		       + (recog_data.operand_type[i] != OP_OUT
 			  ? ira_memory_move_cost[mode][classes[i]][1] : 0)
 		       - allows_mem[i]) * frequency;
+
 		  /* If we have assigned a class to this allocno in our
 		     first pass, add a cost to this alternative
 		     corresponding to what we would add if this allocno
@@ -380,7 +387,7 @@ record_reg_classes (int n_alts, int n_op
 		  /* It doesn't seem worth distinguishing between
 		     offsettable and non-offsettable addresses
 		     here.  */
-		  allows_mem[i] = 1;
+		  insn_allows_mem[i] = allows_mem[i] = 1;
 		  if (MEM_P (op))
 		    win = 1;
 		  break;
@@ -456,7 +463,7 @@ record_reg_classes (int n_alts, int n_op
 		      || (CONSTANT_P (op)
 			  && (! flag_pic || LEGITIMATE_PIC_OPERAND_P (op))))
 		    win = 1;
-		  allows_mem[i] = 1;
+		  insn_allows_mem[i] = allows_mem[i] = 1;
 		case 'r':
 		  classes[i] = ira_reg_class_union[classes[i]][GENERAL_REGS];
 		  break;
@@ -472,7 +479,7 @@ record_reg_classes (int n_alts, int n_op
 		  if (EXTRA_MEMORY_CONSTRAINT (c, p))
 		    {
 		      /* Every MEM can be reloaded to fit.  */
-		      allows_mem[i] = 1;
+		      insn_allows_mem[i] = allows_mem[i] = 1;
 		      if (MEM_P (op))
 			win = 1;
 		    }
@@ -625,6 +632,18 @@ record_reg_classes (int n_alts, int n_op
 	  }
     }
 
+  for (i = 0; i < n_ops; i++)
+    {
+      ira_allocno_t a;
+      rtx op = ops[i];
+
+      if (! REG_P (op) || REGNO (op) < FIRST_PSEUDO_REGISTER)
+	continue;
+      a = ira_curr_regno_allocno_map [REGNO (op)];
+      if (! ALLOCNO_BAD_SPILL_P (a) && insn_allows_mem[i] == 0)
+	ALLOCNO_BAD_SPILL_P (a) = true;
+    }
+
   /* If this insn is a single set copying operand 1 to operand 0 and
      one operand is an allocno with the other a hard reg or an allocno
      that prefers a hard register that is in its own register class
@@ -867,6 +886,7 @@ record_address_regs (enum machine_mode m
 	if (REGNO (x) < FIRST_PSEUDO_REGISTER)
 	  break;
 
+	ALLOCNO_BAD_SPILL_P (ira_curr_regno_allocno_map[REGNO (x)]) = true;
 	pp = COSTS_OF_ALLOCNO (allocno_costs,
 			       ALLOCNO_NUM (ira_curr_regno_allocno_map
 					    [REGNO (x)]));
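
For readers skimming the diff, the dead-points scan performed by
update_bad_spill_attribute boils down to the following simplified,
self-contained sketch (hypothetical types and names; the real code
keeps one dead-point bitmap per cover class and works on the IRA
allocno and live-range structures):

#include <stdbool.h>
#include <string.h>

/* Assumes all program points are < MAX_POINT.  */
#define MAX_POINT 1024

struct range { int start, finish; struct range *next; };
struct allocno { bool bad_spill_p; struct range *ranges; };

static void
update_bad_spill (struct allocno *allocnos, int n)
{
  static bool dead_point[MAX_POINT];
  memset (dead_point, 0, sizeof dead_point);

  /* Phase 1: the finish of every live range is a point where some
     allocno dies.  */
  for (int i = 0; i < n; i++)
    for (struct range *r = allocnos[i].ranges; r != NULL; r = r->next)
      dead_point[r->finish] = true;

  /* Phase 2: keep bad_spill_p only when no other allocno dies
     strictly inside any of the allocno's own live ranges.  */
  for (int i = 0; i < n; i++)
    {
      if (! allocnos[i].bad_spill_p)
        continue;
      bool found = false;
      for (struct range *r = allocnos[i].ranges; r != NULL && ! found;
           r = r->next)
        for (int p = r->start + 1; p < r->finish && ! found; p++)
          if (dead_point[p])
            found = true;
      if (found)
        allocnos[i].bad_spill_p = false;
    }
}

As in the patch, the finish point of a live range is taken as the
death point, and the open interval (start, finish) is scanned so that
an allocno's own death does not clear its flag.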


* Re: RFA: patch for PR37397
  2008-11-10 18:43 RFA: patch for PR37397 Vladimir Makarov
@ 2008-11-12 17:23 ` Jeff Law
  2008-11-17 15:53   ` H.J. Lu
  0 siblings, 1 reply; 4+ messages in thread
From: Jeff Law @ 2008-11-12 17:23 UTC (permalink / raw)
  To: Vladimir Makarov; +Cc: gcc-patches, Kenneth Zadeck

Vladimir Makarov wrote:
> This patch improves SPEC benchmark results.  I saw stable improvements
> on x86/x86_64 and ppc.  The patch implements a small trick mentioned in
> the classic article by Chaitin et al. ("Register Allocation and
> Spilling via Graph Coloring").  There is no point in spilling a pseudo
> in whose live range nothing dies: the spill will not make other
> allocnos colorable, and additional reloads for the corresponding pseudo
> will be generated in the reload pass for each insn in which it occurs.
>
> Is it ok to commit?
OK.  Good find.

jeff


* Re: RFA: patch for PR37397
  2008-11-12 17:23 ` Jeff Law
@ 2008-11-17 15:53   ` H.J. Lu
  2008-11-18 22:15     ` Vladimir Makarov
  0 siblings, 1 reply; 4+ messages in thread
From: H.J. Lu @ 2008-11-17 15:53 UTC (permalink / raw)
  To: Jeff Law; +Cc: Vladimir Makarov, gcc-patches, Kenneth Zadeck

On Wed, Nov 12, 2008 at 9:17 AM, Jeff Law <law@redhat.com> wrote:
> Vladimir Makarov wrote:
>>
>> This patch improves SPEC benchmark results.  I saw stable improvements
>> on x86/x86_64 and ppc.  The patch implements a small trick mentioned in
>> the classic article by Chaitin et al. ("Register Allocation and
>> Spilling via Graph Coloring").  There is no point in spilling a pseudo
>> in whose live range nothing dies: the spill will not make other
>> allocnos colorable, and additional reloads for the corresponding pseudo
>> will be generated in the reload pass for each insn in which it occurs.
>>
>> Is it ok to commit?
>
> OK.  Good find.
>

Hi Vladimir,

This caused a 30% slowdown on 454.calculix in SPEC CPU 2006
with -O2 -ffast-math on Linux/Intel64.


-- 
H.J.


* Re: RFA: patch for PR37397
  2008-11-17 15:53   ` H.J. Lu
@ 2008-11-18 22:15     ` Vladimir Makarov
  0 siblings, 0 replies; 4+ messages in thread
From: Vladimir Makarov @ 2008-11-18 22:15 UTC (permalink / raw)
  To: H.J. Lu; +Cc: Jeff Law, gcc-patches, Kenneth Zadeck

H.J. Lu wrote:
> On Wed, Nov 12, 2008 at 9:17 AM, Jeff Law <law@redhat.com> wrote:
>   
>> Vladimir Makarov wrote:
>>     
>>> This patch improves SPEC benchmark results.  I saw stable improvements
>>> on x86/x86_64 and ppc.  The patch implements a small trick mentioned in
>>> the classic article by Chaitin et al. ("Register Allocation and
>>> Spilling via Graph Coloring").  There is no point in spilling a pseudo
>>> in whose live range nothing dies: the spill will not make other
>>> allocnos colorable, and additional reloads for the corresponding pseudo
>>> will be generated in the reload pass for each insn in which it occurs.
>>>
>>> Is it ok to commit?
>>>       
>> OK.  Good find.
>>
>>     
>
> Hi Vladimir,
>
> This caused a 30% slowdown on 454.calculix in SPEC CPU 2006
> with -O2 -ffast-math on Linux/Intel64.
>
>
>   
H.J., thanks for reporting this.  I'll submit a patch today to fix it.

