* RFA: patch for PR37397
From: Vladimir Makarov @ 2008-11-10 18:43 UTC (permalink / raw)
To: gcc-patches; +Cc: Jeffrey Law, Kenneth Zadeck
[-- Attachment #1: Type: text/plain, Size: 1101 bytes --]
This patch improves SPEC benchmark results. I saw stable improvements
on x86/x86_64 and ppc. The patch implements a small trick mentioned in
the classic article by Chaitin et al. ("Register Allocation and
Spilling via Graph Coloring"). There is no point in spilling a pseudo
in whose live range nothing dies, because the spill will not make
other allocnos colorable, and the reload pass will generate additional
reloads for the corresponding pseudo in each insn where it occurs.
Is it ok to commit?
2008-11-10 Vladimir Makarov <vmakarov@redhat.com>
PR rtl-optimization/37397
* ira-int.h (struct ira_allocno): New member bad_spill_p.
(ALLOCNO_BAD_SPILL_P): New macro.
* ira-color.c (push_allocnos_to_stack): Check ALLOCNO_BAD_SPILL_P.
* ira-build.c (ira_create_allocno): Initialize
ALLOCNO_BAD_SPILL_P.
(create_cap_allocno, propagate_allocno_info,
remove_unnecessary_allocnos): Set up or update
ALLOCNO_BAD_SPILL_P.
(update_bad_spill_attribute): New function.
(ira_build): Call it.
* ira-costs.c (record_reg_classes): Set up ALLOCNO_BAD_SPILL_P.
[-- Attachment #2: pr37397.patch --]
[-- Type: text/x-patch, Size: 8777 bytes --]
Index: ira-int.h
===================================================================
--- ira-int.h (revision 141708)
+++ ira-int.h (working copy)
@@ -351,6 +351,10 @@ struct ira_allocno
region and all its subregions recursively. */
unsigned int no_stack_reg_p : 1, total_no_stack_reg_p : 1;
#endif
+ /* TRUE value means that there is no point in spilling the allocno
+ during coloring, because the spill will only result in additional
+ reloads in the reload pass. */
+ unsigned int bad_spill_p : 1;
/* TRUE value means that the allocno was not removed yet from the
conflicting graph during colouring. */
unsigned int in_graph_p : 1;
@@ -435,6 +439,7 @@ struct ira_allocno
#define ALLOCNO_NO_STACK_REG_P(A) ((A)->no_stack_reg_p)
#define ALLOCNO_TOTAL_NO_STACK_REG_P(A) ((A)->total_no_stack_reg_p)
#endif
+#define ALLOCNO_BAD_SPILL_P(A) ((A)->bad_spill_p)
#define ALLOCNO_IN_GRAPH_P(A) ((A)->in_graph_p)
#define ALLOCNO_ASSIGNED_P(A) ((A)->assigned_p)
#define ALLOCNO_MAY_BE_SPILLED_P(A) ((A)->may_be_spilled_p)
Index: ira-color.c
===================================================================
--- ira-color.c (revision 141708)
+++ ira-color.c (working copy)
@@ -1195,7 +1195,10 @@ push_allocnos_to_stack (void)
* ira_reg_class_nregs[ALLOCNO_COVER_CLASS
(i_allocno)]
[ALLOCNO_MODE (i_allocno)] + 1));
- if (allocno == NULL || allocno_pri > i_allocno_pri
+ if (allocno == NULL
+ || (! ALLOCNO_BAD_SPILL_P (i_allocno)
+ && ALLOCNO_BAD_SPILL_P (allocno))
+ || allocno_pri > i_allocno_pri
|| (allocno_pri == i_allocno_pri
&& (allocno_cost > i_allocno_cost
|| (allocno_cost == i_allocno_cost
Index: ira-build.c
===================================================================
--- ira-build.c (revision 141708)
+++ ira-build.c (working copy)
@@ -456,6 +456,7 @@ ira_create_allocno (int regno, bool cap_
ALLOCNO_SOMEWHERE_RENAMED_P (a) = false;
ALLOCNO_CHILD_RENAMED_P (a) = false;
ALLOCNO_DONT_REASSIGN_P (a) = false;
+ ALLOCNO_BAD_SPILL_P (a) = false;
ALLOCNO_IN_GRAPH_P (a) = false;
ALLOCNO_ASSIGNED_P (a) = false;
ALLOCNO_MAY_BE_SPILLED_P (a) = false;
@@ -775,6 +776,7 @@ create_cap_allocno (ira_allocno_t a)
ira_allocate_and_copy_costs
(&ALLOCNO_CONFLICT_HARD_REG_COSTS (cap), cover_class,
ALLOCNO_CONFLICT_HARD_REG_COSTS (a));
+ ALLOCNO_BAD_SPILL_P (cap) = ALLOCNO_BAD_SPILL_P (a);
ALLOCNO_NREFS (cap) = ALLOCNO_NREFS (a);
ALLOCNO_FREQ (cap) = ALLOCNO_FREQ (a);
ALLOCNO_CALL_FREQ (cap) = ALLOCNO_CALL_FREQ (a);
@@ -1484,6 +1486,8 @@ propagate_allocno_info (void)
&& bitmap_bit_p (ALLOCNO_LOOP_TREE_NODE (a)->border_allocnos,
ALLOCNO_NUM (a)))
{
+ if (! ALLOCNO_BAD_SPILL_P (a))
+ ALLOCNO_BAD_SPILL_P (parent_a) = false;
ALLOCNO_NREFS (parent_a) += ALLOCNO_NREFS (a);
ALLOCNO_FREQ (parent_a) += ALLOCNO_FREQ (a);
ALLOCNO_CALL_FREQ (parent_a) += ALLOCNO_CALL_FREQ (a);
@@ -1771,6 +1775,8 @@ remove_unnecessary_allocnos (void)
+= ALLOCNO_CALLS_CROSSED_NUM (a);
ALLOCNO_EXCESS_PRESSURE_POINTS_NUM (parent_a)
+= ALLOCNO_EXCESS_PRESSURE_POINTS_NUM (a);
+ if (! ALLOCNO_BAD_SPILL_P (a))
+ ALLOCNO_BAD_SPILL_P (parent_a) = false;
#ifdef STACK_REGS
if (ALLOCNO_TOTAL_NO_STACK_REG_P (a))
ALLOCNO_TOTAL_NO_STACK_REG_P (parent_a) = true;
@@ -1819,6 +1825,69 @@ remove_unnecessary_regions (void)
\f
+/* At this point, a true value of the allocno attribute bad_spill_p
+ means that there is an insn where the allocno occurs and cannot
+ be used as memory. The function updates the attribute so that it
+ can remain true only for allocnos which cannot be used as memory
+ in an insn and in whose live ranges other allocnos die. Spilling
+ an allocno with a true value will not improve the code, because
+ it will not make other allocnos colorable, and the reload pass
+ will generate additional reloads for the corresponding pseudo in
+ each insn where it occurs.
+
+ This is a trick mentioned in the classic article of Chaitin et al.
+ which is frequently omitted in other implementations of RA based
+ on graph coloring. */
+static void
+update_bad_spill_attribute (void)
+{
+ int i;
+ ira_allocno_t a;
+ ira_allocno_iterator ai;
+ allocno_live_range_t r;
+ enum reg_class cover_class;
+ bitmap_head dead_points[N_REG_CLASSES];
+
+ for (i = 0; i < ira_reg_class_cover_size; i++)
+ {
+ cover_class = ira_reg_class_cover[i];
+ bitmap_initialize (&dead_points[cover_class], &reg_obstack);
+ }
+ FOR_EACH_ALLOCNO (a, ai)
+ {
+ cover_class = ALLOCNO_COVER_CLASS (a);
+ if (cover_class == NO_REGS)
+ continue;
+ for (r = ALLOCNO_LIVE_RANGES (a); r != NULL; r = r->next)
+ bitmap_set_bit (&dead_points[cover_class], r->finish);
+ }
+ FOR_EACH_ALLOCNO (a, ai)
+ {
+ cover_class = ALLOCNO_COVER_CLASS (a);
+ if (cover_class == NO_REGS)
+ continue;
+ if (! ALLOCNO_BAD_SPILL_P (a))
+ continue;
+ for (r = ALLOCNO_LIVE_RANGES (a); r != NULL; r = r->next)
+ {
+ for (i = r->start + 1; i < r->finish; i++)
+ if (bitmap_bit_p (&dead_points[cover_class], i))
+ break;
+ if (i < r->finish)
+ break;
+ }
+ if (r != NULL)
+ ALLOCNO_BAD_SPILL_P (a) = false;
+ }
+ for (i = 0; i < ira_reg_class_cover_size; i++)
+ {
+ cover_class = ira_reg_class_cover[i];
+ bitmap_clear (&dead_points[cover_class]);
+ }
+}
+
+\f
+
/* Set up minimal and maximal live range points for allocnos. */
static void
setup_min_max_allocno_live_range_point (void)
@@ -2432,6 +2501,7 @@ ira_build (bool loops_p)
ira_create_allocno_live_ranges ();
remove_unnecessary_regions ();
ira_compress_allocno_live_ranges ();
+ update_bad_spill_attribute ();
loops_p = more_one_region_p ();
if (loops_p)
{
Index: ira-costs.c
===================================================================
--- ira-costs.c (revision 141708)
+++ ira-costs.c (working copy)
@@ -187,6 +187,10 @@ record_reg_classes (int n_alts, int n_op
int alt;
int i, j, k;
rtx set;
+ int insn_allows_mem[MAX_RECOG_OPERANDS];
+
+ for (i = 0; i < n_ops; i++)
+ insn_allows_mem[i] = 0;
/* Process each alternative, each time minimizing an operand's cost
with the cost for each operand in that alternative. */
@@ -236,6 +240,8 @@ record_reg_classes (int n_alts, int n_op
j = p[0] - '0';
classes[i] = classes[j];
allows_mem[i] = allows_mem[j];
+ if (allows_mem[i])
+ insn_allows_mem[i] = 1;
if (! REG_P (op) || REGNO (op) < FIRST_PSEUDO_REGISTER)
{
@@ -302,6 +308,7 @@ record_reg_classes (int n_alts, int n_op
+ (recog_data.operand_type[i] != OP_OUT
? ira_memory_move_cost[mode][classes[i]][1] : 0)
- allows_mem[i]) * frequency;
+
/* If we have assigned a class to this allocno in our
first pass, add a cost to this alternative
corresponding to what we would add if this allocno
@@ -380,7 +387,7 @@ record_reg_classes (int n_alts, int n_op
/* It doesn't seem worth distinguishing between
offsettable and non-offsettable addresses
here. */
- allows_mem[i] = 1;
+ insn_allows_mem[i] = allows_mem[i] = 1;
if (MEM_P (op))
win = 1;
break;
@@ -456,7 +463,7 @@ record_reg_classes (int n_alts, int n_op
|| (CONSTANT_P (op)
&& (! flag_pic || LEGITIMATE_PIC_OPERAND_P (op))))
win = 1;
- allows_mem[i] = 1;
+ insn_allows_mem[i] = allows_mem[i] = 1;
case 'r':
classes[i] = ira_reg_class_union[classes[i]][GENERAL_REGS];
break;
@@ -472,7 +479,7 @@ record_reg_classes (int n_alts, int n_op
if (EXTRA_MEMORY_CONSTRAINT (c, p))
{
/* Every MEM can be reloaded to fit. */
- allows_mem[i] = 1;
+ insn_allows_mem[i] = allows_mem[i] = 1;
if (MEM_P (op))
win = 1;
}
@@ -625,6 +632,18 @@ record_reg_classes (int n_alts, int n_op
}
}
+ for (i = 0; i < n_ops; i++)
+ {
+ ira_allocno_t a;
+ rtx op = ops[i];
+
+ if (! REG_P (op) || REGNO (op) < FIRST_PSEUDO_REGISTER)
+ continue;
+ a = ira_curr_regno_allocno_map [REGNO (op)];
+ if (! ALLOCNO_BAD_SPILL_P (a) && insn_allows_mem[i] == 0)
+ ALLOCNO_BAD_SPILL_P (a) = true;
+ }
+
/* If this insn is a single set copying operand 1 to operand 0 and
one operand is an allocno with the other a hard reg or an allocno
that prefers a hard register that is in its own register class
@@ -867,6 +886,7 @@ record_address_regs (enum machine_mode m
if (REGNO (x) < FIRST_PSEUDO_REGISTER)
break;
+ ALLOCNO_BAD_SPILL_P (ira_curr_regno_allocno_map[REGNO (x)]) = true;
pp = COSTS_OF_ALLOCNO (allocno_costs,
ALLOCNO_NUM (ira_curr_regno_allocno_map
[REGNO (x)]));
* Re: RFA: patch for PR37397
From: Jeff Law @ 2008-11-12 17:23 UTC (permalink / raw)
To: Vladimir Makarov; +Cc: gcc-patches, Kenneth Zadeck
Vladimir Makarov wrote:
> This patch improves SPEC benchmarks. I saw stable improvements on
> x86/x86_64 and ppc. This patch implements a small trick mentioned in
> one classical article by Chaitin etc (Register allocation and spilling
> via graph coloring). There is no sense to spill pseudo in whose live
> range nothing is dying because the spill will not make other allocnos
> colorable and additional reloads for the corresponding pseudo will be
> generated in reload pass for each insn it occurs.
>
> Is it ok to commit?
OK. Good find.
jeff
* Re: RFA: patch for PR37397
From: H.J. Lu @ 2008-11-17 15:53 UTC (permalink / raw)
To: Jeff Law; +Cc: Vladimir Makarov, gcc-patches, Kenneth Zadeck
On Wed, Nov 12, 2008 at 9:17 AM, Jeff Law <law@redhat.com> wrote:
> Vladimir Makarov wrote:
>>
>> This patch improves SPEC benchmarks. I saw stable improvements on
>> x86/x86_64 and ppc. This patch implements a small trick mentioned in one
>> classical article by Chaitin etc (Register allocation and spilling via graph
>> coloring). There is no sense to spill pseudo in whose live range nothing is
>> dying because the spill will not make other allocnos colorable and
>> additional reloads for the corresponding pseudo will be generated in reload
>> pass for each insn it occurs.
>>
>> Is it ok to commit?
>
> OK. Good find.
>
Hi Vladimir,
This caused a 30% slowdown on 454.calculix in SPEC CPU 2006
with -O2 -ffast-math on Linux/Intel64.
--
H.J.
* Re: RFA: patch for PR37397
From: Vladimir Makarov @ 2008-11-18 22:15 UTC (permalink / raw)
To: H.J. Lu; +Cc: Jeff Law, gcc-patches, Kenneth Zadeck
H.J. Lu wrote:
> On Wed, Nov 12, 2008 at 9:17 AM, Jeff Law <law@redhat.com> wrote:
>
>> Vladimir Makarov wrote:
>>
>>> This patch improves SPEC benchmarks. I saw stable improvements on
>>> x86/x86_64 and ppc. This patch implements a small trick mentioned in one
>>> classical article by Chaitin etc (Register allocation and spilling via graph
>>> coloring). There is no sense to spill pseudo in whose live range nothing is
>>> dying because the spill will not make other allocnos colorable and
>>> additional reloads for the corresponding pseudo will be generated in reload
>>> pass for each insn it occurs.
>>>
>>> Is it ok to commit?
>>>
>> OK. Good find.
>>
>>
>
> Hi Vladimir,
>
> This caused 30% slowdown on 454.calculix in SPEC CPU 2006
> with -O2 -ffast-math on Linux/Intel64.
>
>
>
H.J., thanks for reporting this. I'll submit a patch today to fix it.