public inbox for gcc@gcc.gnu.org
 help / color / mirror / Atom feed
* Bug in ra-colorize.c:merge_moves?
@ 2003-02-25  9:00 Christian Ehrhardt
  2003-02-25 16:59 ` Michael Matz
  0 siblings, 1 reply; 25+ messages in thread
From: Christian Ehrhardt @ 2003-02-25  9:00 UTC (permalink / raw)
  To: gcc


Hi,

this is merge_moves from ra-colorize.c which has somewhat broken list
handling IMHO. Look at the line marked with XXX, this changes ml->next
but the for-loop at the place marked with YYY will later on try to move
to the next element that previously followed ml by setting ml to ml->next.

void
merge_moves (u, v)
     struct web *u, *v;
{
  regset seen;
  struct move_list *ml;

  seen = BITMAP_XMALLOC ();
  for (ml = u->moves; ml; ml = ml->next)
    bitmap_set_bit (seen, INSN_UID (ml->move->insn));
  for (ml = v->moves; ml; ml = ml->next /* YYY */ )
    {
      if (! bitmap_bit_p (seen, INSN_UID (ml->move->insn)))
        {
          ml->next = u->moves;  /* XXX */
          u->moves = ml;
        }
    }
  BITMAP_XFREE (seen);
  v->moves = NULL;
}

    regards  Christian

-- 
THAT'S ALL FOLKS!

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: Bug in ra-colorize.c:merge_moves?
  2003-02-25  9:00 Bug in ra-colorize.c:merge_moves? Christian Ehrhardt
@ 2003-02-25 16:59 ` Michael Matz
  2003-02-27 17:07   ` Christian Ehrhardt
  0 siblings, 1 reply; 25+ messages in thread
From: Michael Matz @ 2003-02-25 16:59 UTC (permalink / raw)
  To: Christian Ehrhardt; +Cc: gcc, gcc-patches

Hi,

On Tue, 25 Feb 2003, Christian Ehrhardt wrote:

> this is merge_moves from ra-colorize.c which has somewhat broken list
> handling IMHO.

Hmm, I must have been half asleep, when I wrote that.  Although note, that
with the default setting of flags noone will notice that error, as
web->moves is empty with optimistic coalescing.  I'll commit the below
patch, once bootstrapping/regtesting on i686-linux is done (regalloc
branch and HEAD).  I'm unclear if I need permission for 3.3.


Ciao,
Michael.
-- 
        * ra-colorize.c (merge_moves): Fix list handling.

Index: ra-colorize.c
===================================================================
RCS file: /cvs/gcc/gcc/gcc/ra-colorize.c,v
retrieving revision 1.1.2.9
diff -u -p -r1.1.2.9 ra-colorize.c
--- ra-colorize.c	21 Feb 2003 00:45:10 -0000	1.1.2.9
+++ ra-colorize.c	25 Feb 2003 16:45:16 -0000
@@ -550,13 +550,14 @@ merge_moves (u, v)
      struct web *u, *v;
 {
   regset seen;
-  struct move_list *ml;
+  struct move_list *ml, *ml_next;

   seen = BITMAP_XMALLOC ();
   for (ml = u->moves; ml; ml = ml->next)
     bitmap_set_bit (seen, INSN_UID (ml->move->insn));
-  for (ml = v->moves; ml; ml = ml->next)
+  for (ml = v->moves; ml; ml = ml_next)
     {
+      ml_next = ml->next;
       if (! bitmap_bit_p (seen, INSN_UID (ml->move->insn)))
         {
 	  ml->next = u->moves;

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: Bug in ra-colorize.c:merge_moves?
  2003-02-25 16:59 ` Michael Matz
@ 2003-02-27 17:07   ` Christian Ehrhardt
  2003-02-27 17:34     ` Michael Matz
  2003-03-03 20:24     ` Daniel Berlin
  0 siblings, 2 replies; 25+ messages in thread
From: Christian Ehrhardt @ 2003-02-27 17:07 UTC (permalink / raw)
  To: Michael Matz; +Cc: gcc, gcc-patches

On Tue, Feb 25, 2003 at 05:57:15PM +0100, Michael Matz wrote:
> > this is merge_moves from ra-colorize.c which has somewhat broken list
> > handling IMHO.
> 
> Hmm, I must have been half asleep, when I wrote that.  Although note, that
> with the default setting of flags noone will notice that error, as
> web->moves is empty with optimistic coalescing.  I'll commit the below
> patch, once bootstrapping/regtesting on i686-linux is done (regalloc
> branch and HEAD).  I'm unclear if I need permission for 3.3.

Thanks for the patch. Another mostly unrelated issue:
The behaviour of the register allocator with the following testcase
(and others) is IMHO suboptimal:

extern void byte_store_op2 (int op, unsigned char *loc, int arg1, int arg2);
static void
byte_insert_op2 (op, loc, arg1, arg2, end)
    int op;
    unsigned char *loc;
    int arg1, arg2;
    unsigned char *end;
{
  register unsigned char *pfrom = end;
  register unsigned char *pto = end + 1 + 2 * 2;
  while (pfrom != loc)
    *--pto = *--pfrom;
  byte_store_op2 (op, loc, arg1, arg2);
}

With -O2 this testcase gives 7 non-precolored webs that all conflict with
each other, i.e. the conflict graph is a complete graph with 7 nodes.
With the current code it takes no less than 8 passes to get this graph
colored. The problem is that the allocator allows coalescing between
spill slots and non spill slots.
With the testcase above (compile with -O2) we decide that we have to
spill some web (Web 0) and insert proper spill code. However, the
first thing that the next pass does is to coalesce all the moves to
and from the newly inserted spill slot. As a consequence we are faced
with almost the same graph again.
This is a huge compile time performance problem. Adding the following
test to aggressive coalesce reduced compile time dramatically (> 80%
on my favourite test case, see below):

Index: ra-colorize.c
===================================================================
RCS file: /cvsroot/gcc/gcc/gcc/ra-colorize.c,v
retrieving revision 1.1.2.8
diff -u -r1.1.2.8 ra-colorize.c
--- ra-colorize.c       1 Feb 2003 17:27:08 -0000       1.1.2.8
+++ ra-colorize.c       27 Feb 2003 16:16:13 -0000
@@ -2683,7 +2683,8 @@
        if (s != t
            && t->type != PRECOLORED
            && !TEST_BIT (sup_igraph, s->id * num_webs + t->id)
-           && !TEST_BIT (sup_igraph, t->id * num_webs + s->id))
+           && !TEST_BIT (sup_igraph, t->id * num_webs + s->id)
+           && (SPILL_SLOT_P (s->regno) == SPILL_SLOT_P (t->regno)))
          {
            if ((s->type == PRECOLORED && ok (t, s))
                || s->type != PRECOLORED)

Here are timeings as reported by -ftime-report with and without the patch.
Testcase is a preprocessed regex.c from libiberty compiled with -O2:

                 vanilla            with patch      saved
local alloc      31.21              2.23            92.8%
TOTAL            34.16              5.19            84.8%

I realize that this patch may change the generated code. Do you
have some kind of testsuite that can be used to measure the run
time performance impact of this patch? For those (admittedly few)
test cases that I looked in detail, the same code was generated with
and without the patch.

Finally let me say a few words about my rational:
* We do want to coalesce normal moves. In this case SPILL_SLOT_P is
  zero for both webs involved.
* We do want to coalesce moves between different spill slots to avoid
  unnecessary mem-mem moves, in this case SPILL_SLOT_P is one for both
  webs involved.
* I really think we don't want to coalesce moves that result from spills.
  These moves normally move a non SPILL_SLOT_P register to or from a
  SPILL_SLOT_P register.
 
    regards   Christian

-- 
THAT'S ALL FOLKS!

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: Bug in ra-colorize.c:merge_moves?
  2003-02-27 17:07   ` Christian Ehrhardt
@ 2003-02-27 17:34     ` Michael Matz
  2003-02-27 20:37       ` Michael Matz
  2003-03-03 20:24     ` Daniel Berlin
  1 sibling, 1 reply; 25+ messages in thread
From: Michael Matz @ 2003-02-27 17:34 UTC (permalink / raw)
  To: Christian Ehrhardt; +Cc: gcc, gcc-patches

Hi,

On Thu, 27 Feb 2003, Christian Ehrhardt wrote:

> Thanks for the patch. Another mostly unrelated issue:
> The behaviour of the register allocator with the following testcase
> (and others) is IMHO suboptimal:
>
> [... description of problem, that ra coalesces spill pseudos and
>      non-spill pseudos ...]

Yes, a known problem.  On the current ra-branch this type of coalescing is
switched off (in a brutal way, i.e. those insns are not recognized as copy
insns anymore, which is a bit too pessimistic, only switching of actually
coalescing them is enough).

> This is a huge compile time performance problem. Adding the following
> test to aggressive coalesce reduced compile time dramatically (> 80%
> on my favourite test case, see below):
>
> Index: ra-colorize.c
> ===================================================================
> RCS file: /cvsroot/gcc/gcc/gcc/ra-colorize.c,v
> retrieving revision 1.1.2.8
> diff -u -r1.1.2.8 ra-colorize.c
> --- ra-colorize.c       1 Feb 2003 17:27:08 -0000       1.1.2.8
> +++ ra-colorize.c       27 Feb 2003 16:16:13 -0000
> @@ -2683,7 +2683,8 @@
>         if (s != t
>             && t->type != PRECOLORED
>             && !TEST_BIT (sup_igraph, s->id * num_webs + t->id)
> -           && !TEST_BIT (sup_igraph, t->id * num_webs + s->id))
> +           && !TEST_BIT (sup_igraph, t->id * num_webs + s->id)
> +           && (SPILL_SLOT_P (s->regno) == SPILL_SLOT_P (t->regno)))
>           {
>             if ((s->type == PRECOLORED && ok (t, s))
>                 || s->type != PRECOLORED)
>
> I realize that this patch may change the generated code. Do you
> have some kind of testsuite that can be used to measure the run
> time performance impact of this patch? For those (admittedly few)
> test cases that I looked in detail, the same code was generated with
> and without the patch.

No, when I'm really interested in quality of produced code I use SPEC
CPU2000, or look at some file where I noticed a problem.  But no specific
test-suite.

> Finally let me say a few words about my rational:

All points correct.  I guess, I will do something similar for the
new-regalloc-branch, and remove the non-detection of copy-insns involving
SPILL_SLOT_P's.


Ciao,
Michael.

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: Bug in ra-colorize.c:merge_moves?
  2003-02-27 17:34     ` Michael Matz
@ 2003-02-27 20:37       ` Michael Matz
  2003-02-28 12:50         ` Christian Ehrhardt
  0 siblings, 1 reply; 25+ messages in thread
From: Michael Matz @ 2003-02-27 20:37 UTC (permalink / raw)
  To: Christian Ehrhardt; +Cc: gcc, gcc-patches

Hi Christian,

On Thu, 27 Feb 2003, Michael Matz wrote:

> > Finally let me say a few words about my rational:
>
> All points correct.  I guess, I will do something similar for the
> new-regalloc-branch, and remove the non-detection of copy-insns involving
> SPILL_SLOT_P's.

Could you try the patch in
http://gcc.gnu.org/ml/gcc-patches/2003-02/msg02347.html to see if this
also fixes your performance issues?


Ciao,
Michael.

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: Bug in ra-colorize.c:merge_moves?
  2003-02-27 20:37       ` Michael Matz
@ 2003-02-28 12:50         ` Christian Ehrhardt
  2003-02-28 23:07           ` Michael Matz
  0 siblings, 1 reply; 25+ messages in thread
From: Christian Ehrhardt @ 2003-02-28 12:50 UTC (permalink / raw)
  To: Michael Matz; +Cc: gcc, gcc-patches

On Thu, Feb 27, 2003 at 09:32:27PM +0100, Michael Matz wrote:
> Hi Christian,
> 
> On Thu, 27 Feb 2003, Michael Matz wrote:
> 
> > > Finally let me say a few words about my rational:
> >
> > All points correct.  I guess, I will do something similar for the
> > new-regalloc-branch, and remove the non-detection of copy-insns involving
> > SPILL_SLOT_P's.
> 
> Could you try the patch in
> http://gcc.gnu.org/ml/gcc-patches/2003-02/msg02347.html to see if this
> also fixes your performance issues?

I ran a somewhat longer test, the test is to compile gcc with the
tested compiler as hostgcc. Additional CFLAGS are -O2 and -ftime-report.
The time-report outputs are then added up by a per Skript.

The current new-regalloc-branch (with the patch included) gave these
top numbers (times in milliseconds):

                                  user     sys    wall
                        parser:   48150    5100   52010
                           CSE:   80310     220   82220
                    global CSE:   81410   41150  123600
                   local alloc:  175740    1470  175030
                         TOTAL:  865730   63910  929210

This means that the performance issues of newra aren't solved yet but
here are the numbers that I got without the patch:

                                  user     sys    wall
                        parser:   49670    5040   54920
                           CSE:   79560     250   76490
                    global CSE:   81760   41950  123330
                   local alloc:  598860    3230  601360
                         TOTAL: 1288580   66380 1351710

So this is definitely a significant improvement wrt. compile time speed!
Thanks.

   regards  Christian

-- 
THAT'S ALL FOLKS!

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: Bug in ra-colorize.c:merge_moves?
  2003-02-28 12:50         ` Christian Ehrhardt
@ 2003-02-28 23:07           ` Michael Matz
  0 siblings, 0 replies; 25+ messages in thread
From: Michael Matz @ 2003-02-28 23:07 UTC (permalink / raw)
  To: Christian Ehrhardt; +Cc: gcc, gcc-patches

Hi,

On Fri, 28 Feb 2003, Christian Ehrhardt wrote:

> The current new-regalloc-branch (with the patch included) gave these
> top numbers (times in milliseconds):
>
>                    local alloc:  175740    1470  175030
>
> This means that the performance issues of newra aren't solved yet but
> here are the numbers that I got without the patch:
>
>                    local alloc:  598860    3230  601360
>
> So this is definitely a significant improvement wrt. compile time speed!

OK, cool enough, this is all what I wanted to know (that it is nearly
equivalent to your local patch regarding compiler time reduction).  That
the new allocator is still slower is known.


Ciao,
Michael.

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: Bug in ra-colorize.c:merge_moves?
  2003-02-27 17:07   ` Christian Ehrhardt
  2003-02-27 17:34     ` Michael Matz
@ 2003-03-03 20:24     ` Daniel Berlin
  2003-03-03 20:31       ` Michael Matz
  1 sibling, 1 reply; 25+ messages in thread
From: Daniel Berlin @ 2003-03-03 20:24 UTC (permalink / raw)
  To: Christian Ehrhardt; +Cc: Michael Matz, gcc, gcc-patches


On Thursday, February 27, 2003, at 11:39  AM, Christian Ehrhardt wrote:

> On Tue, Feb 25, 2003 at 05:57:15PM +0100, Michael Matz wrote:
>>> this is merge_moves from ra-colorize.c which has somewhat broken list
>>> handling IMHO.
>>
>> Hmm, I must have been half asleep, when I wrote that.  Although note, 
>> that
>> with the default setting of flags noone will notice that error, as
>> web->moves is empty with optimistic coalescing.  I'll commit the below
>> patch, once bootstrapping/regtesting on i686-linux is done (regalloc
>> branch and HEAD).  I'm unclear if I need permission for 3.3.
>
> Thanks for the patch. Another mostly unrelated issue:
> The behaviour of the register allocator with the following testcase
> (and others) is IMHO suboptimal:
>
> extern void byte_store_op2 (int op, unsigned char *loc, int arg1, int 
> arg2);
> static void
> byte_insert_op2 (op, loc, arg1, arg2, end)
>     int op;
>     unsigned char *loc;
>     int arg1, arg2;
>     unsigned char *end;
> {
>   register unsigned char *pfrom = end;
>   register unsigned char *pto = end + 1 + 2 * 2;
>   while (pfrom != loc)
>     *--pto = *--pfrom;
>   byte_store_op2 (op, loc, arg1, arg2);
> }
>
> With -O2 this testcase gives 7 non-precolored webs that all conflict 
> with
> each other, i.e. the conflict graph is a complete graph with 7 nodes.
> With the current code it takes no less than 8 passes to get this graph
> colored. The problem is that the allocator allows coalescing between
> spill slots and non spill slots.

new-regalloc-branch isn't very good compile-time wise.
For instance, we can't ever color in one pass anymore (it pretends 
something has changed, even if nothing has, so the minimum number of 
passes is 2).

x86 is also just not an easy architecture to color for.

I'm actively working on merging good parts to the head that don't 
increase compile time (but increase runtime performance), as well as 
modularizing the new-register allocator so that different architectures 
can do different things easily, for speed.

I've SPEC benchmarked every change[1] i'm moving from new-regalloc in 
order to make sure it improves performance

Thus, if you are concerned with improving compile time, you might want 
to just try to speed up things algorithmically, rather than small 
pieces at a time, since i can take care of that.
--Dan
[1]  actually, i spec'd every combination of every change i'm moving, 
to make sure they don't interact badly.  But only on the two spec tests 
that were most affected by bad choices in register allocation/spilling. 
  Otherwise, it would have taken days or weeks to do this.

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: Bug in ra-colorize.c:merge_moves?
  2003-03-03 20:24     ` Daniel Berlin
@ 2003-03-03 20:31       ` Michael Matz
  2003-03-03 20:47         ` Daniel Berlin
  0 siblings, 1 reply; 25+ messages in thread
From: Michael Matz @ 2003-03-03 20:31 UTC (permalink / raw)
  To: Daniel Berlin; +Cc: Christian Ehrhardt, gcc, gcc-patches

Hi,

On Mon, 3 Mar 2003, Daniel Berlin wrote:

> new-regalloc-branch isn't very good compile-time wise.
> For instance, we can't ever color in one pass anymore (it pretends
> something has changed, even if nothing has, so the minimum number of
> passes is 2).

Oh, really?  That should obviously not happen.  Do you have a testcase, so
I can fix that?


Ciao,
Michael.

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: Bug in ra-colorize.c:merge_moves?
  2003-03-03 20:31       ` Michael Matz
@ 2003-03-03 20:47         ` Daniel Berlin
  2003-03-03 21:46           ` Denis Chertykov
  0 siblings, 1 reply; 25+ messages in thread
From: Daniel Berlin @ 2003-03-03 20:47 UTC (permalink / raw)
  To: Michael Matz; +Cc: Christian Ehrhardt, gcc, gcc-patches


On Monday, March 3, 2003, at 02:59  PM, Michael Matz wrote:

> Hi,
>
> On Mon, 3 Mar 2003, Daniel Berlin wrote:
>
>> new-regalloc-branch isn't very good compile-time wise.
>> For instance, we can't ever color in one pass anymore (it pretends
>> something has changed, even if nothing has, so the minimum number of
>> passes is 2).
>
> Oh, really?  That should obviously not happen.  Do you have a 
> testcase, so
> I can fix that?

You probably don't need a testcase, since it's obvious from looking at 
the code itself.  You just probably didn't notice when it happened 
(it's a one line problem, hard to see).

In one_pass, we have:

  else if (ra_pass == 1)
     {
       if (death_insns_max_uid < get_max_uid ())
         {
           sbitmap_free (insns_with_deaths);
           insns_with_deaths = sbitmap_alloc (get_max_uid ());
           sbitmap_ones (insns_with_deaths);
           death_insns_max_uid = get_max_uid ();
         }
       last_changed_insns = ra_modified_insns;
       detect_web_parts_to_rebuild ();
       last_changed_insns = NULL;
       something_spilled = 1;
       last_max_uid = get_max_uid ();
     }

Note that it's setting something_spilled to 1 unconditionally on the 
first pass (which automatically means we will do another pass, since we 
return something spilled as the return value of one_pass, and the 
calling code is "changed = one_pass ...").
This code isn't in the HEAD's ra.c.  IIRC, it was added by Denis on 
08-Jan-03, in revision 1.1.2.65:
1.1.2.65     (denisc   08-Jan-03):       something_spilled = 1;


Evil, no?

>
>
> Ciao,
> Michael.
>

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: Bug in ra-colorize.c:merge_moves?
  2003-03-03 20:47         ` Daniel Berlin
@ 2003-03-03 21:46           ` Denis Chertykov
  2003-03-04  5:05             ` Daniel Berlin
  0 siblings, 1 reply; 25+ messages in thread
From: Denis Chertykov @ 2003-03-03 21:46 UTC (permalink / raw)
  To: Daniel Berlin; +Cc: Michael Matz, Christian Ehrhardt, gcc, gcc-patches

Daniel Berlin <dberlin@dberlin.org> writes:

> On Monday, March 3, 2003, at 02:59  PM, Michael Matz wrote:
> 
> > Hi,
> >
> > On Mon, 3 Mar 2003, Daniel Berlin wrote:
> >
> >> new-regalloc-branch isn't very good compile-time wise.
> >> For instance, we can't ever color in one pass anymore (it pretends
> >> something has changed, even if nothing has, so the minimum number of
> >> passes is 2).
> >
> > Oh, really?  That should obviously not happen.  Do you have a
> > testcase, so
> > I can fix that?
> 
> You probably don't need a testcase, since it's obvious from looking at
> the code itself.  You just probably didn't notice when it happened
> (it's a one line problem, hard to see).
> 
> In one_pass, we have:
> 
>   else if (ra_pass == 1)
>      {
>        if (death_insns_max_uid < get_max_uid ())
>          {
>            sbitmap_free (insns_with_deaths);
>            insns_with_deaths = sbitmap_alloc (get_max_uid ());
>            sbitmap_ones (insns_with_deaths);
>            death_insns_max_uid = get_max_uid ();
>          }
>        last_changed_insns = ra_modified_insns;
>        detect_web_parts_to_rebuild ();
>        last_changed_insns = NULL;
>        something_spilled = 1;
>        last_max_uid = get_max_uid ();
>      }
> 
> Note that it's setting something_spilled to 1 unconditionally on the
> first pass (which automatically means we will do another pass, since
> we return something spilled as the return value of one_pass, and the
> calling code is "changed = one_pass ...").
> This code isn't in the HEAD's ra.c.  IIRC, it was added by Denis on
> 08-Jan-03, in revision 1.1.2.65:
> 1.1.2.65     (denisc   08-Jan-03):       something_spilled = 1;

The previous condition is 'if (!WEBS (SPILLED))'.
Only if we have a webs to spill then we doing this.

Only if we want to spill a web which can't have sigle color.
(May be I'm wrong.)
I'm very interested in testcase. 

Denis.


^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: Bug in ra-colorize.c:merge_moves?
  2003-03-03 21:46           ` Denis Chertykov
@ 2003-03-04  5:05             ` Daniel Berlin
  2003-03-04 12:16               ` Michael Matz
  0 siblings, 1 reply; 25+ messages in thread
From: Daniel Berlin @ 2003-03-04  5:05 UTC (permalink / raw)
  To: Denis Chertykov; +Cc: Michael Matz, Christian Ehrhardt, gcc, gcc-patches


On Monday, March 3, 2003, at 04:12  PM, Denis Chertykov wrote:

> Daniel Berlin <dberlin@dberlin.org> writes:
>
>> On Monday, March 3, 2003, at 02:59  PM, Michael Matz wrote:
>>
>>> Hi,
>>>
>>> On Mon, 3 Mar 2003, Daniel Berlin wrote:
>>>
>>>> new-regalloc-branch isn't very good compile-time wise.
>>>> For instance, we can't ever color in one pass anymore (it pretends
>>>> something has changed, even if nothing has, so the minimum number of
>>>> passes is 2).
>>>
>>> Oh, really?  That should obviously not happen.  Do you have a
>>> testcase, so
>>> I can fix that?
>>
>> You probably don't need a testcase, since it's obvious from looking at
>> the code itself.  You just probably didn't notice when it happened
>> (it's a one line problem, hard to see).
>>
>> In one_pass, we have:
>>
>>   else if (ra_pass == 1)
>>      {
>>        if (death_insns_max_uid < get_max_uid ())
>>          {
>>            sbitmap_free (insns_with_deaths);
>>            insns_with_deaths = sbitmap_alloc (get_max_uid ());
>>            sbitmap_ones (insns_with_deaths);
>>            death_insns_max_uid = get_max_uid ();
>>          }
>>        last_changed_insns = ra_modified_insns;
>>        detect_web_parts_to_rebuild ();
>>        last_changed_insns = NULL;
>>        something_spilled = 1;
>>        last_max_uid = get_max_uid ();
>>      }
>>
>> Note that it's setting something_spilled to 1 unconditionally on the
>> first pass (which automatically means we will do another pass, since
>> we return something spilled as the return value of one_pass, and the
>> calling code is "changed = one_pass ...").
>> This code isn't in the HEAD's ra.c.  IIRC, it was added by Denis on
>> 08-Jan-03, in revision 1.1.2.65:
>> 1.1.2.65     (denisc   08-Jan-03):       something_spilled = 1;
>
> The previous condition is 'if (!WEBS (SPILLED))'.
> Only if we have a webs to spill then we doing this.
I actually pointed out the wrong line. Your stack slot stuff (the 
subst_to_stack() check + setting of something_spilled) makes us always 
do an extra pass if something got colored by an unusable color (even if 
nothing else spilled) which means we can never color in *2* passes 
(going by RA counts, which is 3 coloring rounds), not one (sorry about 
that), which we didn't use to do before.

 From params.c:

  spill cost of graph (initially) = 0
  spill cost of graph (after alias-breaking) = 0
  spill cost of graph (before spill-recolor) = 0
  spill cost of graph (after spill-recolor) = 0
Stack spill slots must be added.
Allocate stack spill slots for webs:
          62(89) insns:  134(u37) 139(d156)
          63(90) insns:  135(u35) 138(d157)
          64(88) insns:  133(u36) 136(u34) 137(d158)

....
RegAlloc Pass 3 (because of the subst_to_stack_p() call setting 
something_spilled).

Extra passes cost a lot of time (the time new-regalloc takes is 
directly related to how many passes it performed).  I've got a tree 
with everything but your stack slots changes merged (including 
pre-reload), and the number of coloring passes is about the same.  
However, with your stack slots changes , it takes 3-5x as many passes 
in those functions that spill (some functions went from 3 passes to 15 
passes).  It seems we have cases where after you assigned stack slots, 
we spill again the next pass, and then have another couple passes to 
remove those spills, repeat.

If it wasn't really that bad, i would have merged it into the changes 
i'm cleaning up for merging into the mainline.

I should also point out the code i pointed out in the previous message 
is still broken, if you think it will ever do something.
It will never trigger, afaict.
I change it to an abort in that case, and bootstrapped, and it doesn't 
trigger the abort (meaning it never falls into that case during a 
bootstrap).

What exactly are you trying to do with that code?
> Only if we want to spill a web which can't have sigle color.
> (May be I'm wrong.)
> I'm very interested in testcase.


> Denis.
>
>

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: Bug in ra-colorize.c:merge_moves?
  2003-03-04  5:05             ` Daniel Berlin
@ 2003-03-04 12:16               ` Michael Matz
  2003-03-04 15:42                 ` Daniel Berlin
  2003-03-04 20:32                 ` Denis Chertykov
  0 siblings, 2 replies; 25+ messages in thread
From: Michael Matz @ 2003-03-04 12:16 UTC (permalink / raw)
  To: Daniel Berlin; +Cc: Denis Chertykov, Christian Ehrhardt, gcc, gcc-patches

Hi,

On Mon, 3 Mar 2003, Daniel Berlin wrote:

> >> You probably don't need a testcase, since it's obvious from looking at
> >> the code itself.  You just probably didn't notice when it happened
> >> (it's a one line problem, hard to see).
> >>
> >> In one_pass, we have:
> >>
> >>   else if (ra_pass == 1)
> >>      {
> >>        if (death_insns_max_uid < get_max_uid ())
> >>          {
> >>            sbitmap_free (insns_with_deaths);
> >>            insns_with_deaths = sbitmap_alloc (get_max_uid ());
> >>            sbitmap_ones (insns_with_deaths);
> >>            death_insns_max_uid = get_max_uid ();
> >>          }
> >>        last_changed_insns = ra_modified_insns;
> >>        detect_web_parts_to_rebuild ();
> >>        last_changed_insns = NULL;
> >>        something_spilled = 1;

As Denis already pointed out, this if() branch is only run, if we have
WEBS(SPILLED), i.e. if in fact something needs to change.

> I actually pointed out the wrong line. Your stack slot stuff (the
> subst_to_stack() check + setting of something_spilled) makes us always
> do an extra pass if something got colored by an unusable color (even if
> nothing else spilled) which means we can never color in *2* passes
> (going by RA counts, which is 3 coloring rounds), not one (sorry about
> that), which we didn't use to do before.

Hmm.  It's like this:  webs representing stack slots (stack pseudos I call
them, or so), can be colored with an_unusable_color, which means
conceptually, that they indeed have to be placed onto the stack.  Denis
made changes which in fact put those webs onto the stack, where formerly
they were placed onto stack only as last pass, if there were _only_
"uncolored" stack webs.

> Extra passes cost a lot of time (the time new-regalloc takes is
> directly related to how many passes it performed).

Yes, this thing, that uncolored stack webs are forced to stack early is
only necessary one some machines (namely to early expose the stack access
addresses), while for instance on ppc and i686 not.

> I've got a tree
> with everything but your stack slots changes merged (including
> pre-reload), and the number of coloring passes is about the same.
> However, with your stack slots changes , it takes 3-5x as many passes
> in those functions that spill (some functions went from 3 passes to 15
> passes).

Ugh, that's much.  I wouldn't have expected this.  The only thing which
_should_ have changed is, that now sometimes there are stack-accesses
while coloring, where formerly there were simply only accesses to
SPILL_SLOT_P() pseudos.  I believe Denis also makes some local
optimizations while creating the code for the stack accesses early.

Are you sure, that you don't coalesce SPILL_SLOT_P pseudos with non-spill
slots?  This indeed makes the time go up fairly much without improving the
code.  The change for that was only committed a few days ago, without it,
one had needed to ignore moves with any SPILL_SLOT_P pseudo involved.

> I should also point out the code i pointed out in the previous message
> is still broken, if you think it will ever do something.
> It will never trigger, afaict.

It will, with pre-reload.  One sub function of build_i_graph() is
select_regclass(), which calls web_class(), which can put a web into
SPILLED.  This happens if one reference of a pseudo has conflicting
constraints with all other references.  In that case this reference is
"spilled" (for a lack of a better word, not to be confused with spilling
resulting from register pressure; it's similar to reloads just on
pseudos).  Such a web is then marked as changed, and Denis uses the
SPILLED list for this.

The integration with the allocator is rather ugly, I agree.  The use of
the SPILLED list for this is confusing.  What this code basically does is
something like:
  build-graph-including-webclasses
  if (pre-reloads needed && this-is-first-pass)
    go-to-next-round;
  else if (!pre-reloads needed)
    make-normal-coloring-attempt
  else
    abort

I.e. if there are pre-reloads, there is _no_ coloring attempt done (this
is so, because the code changed, and the interference graph doesn't
reflect this code anymore correctly, i.e. needs to be rebuilt).  But
pre-reloads are only expected in the first pass, if at all, therefore the
abort.

> I change it to an abort in that case, and bootstrapped, and it doesn't
> trigger the abort (meaning it never falls into that case during a
> bootstrap).

On your platform.  Take one with stranger constraints, and involving
sources which cast between funny types, and you somewhen will hit it.


Ciao,
Michael.

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: Bug in ra-colorize.c:merge_moves?
  2003-03-04 12:16               ` Michael Matz
@ 2003-03-04 15:42                 ` Daniel Berlin
  2003-03-04 20:32                 ` Denis Chertykov
  1 sibling, 0 replies; 25+ messages in thread
From: Daniel Berlin @ 2003-03-04 15:42 UTC (permalink / raw)
  To: Michael Matz; +Cc: Denis Chertykov, Christian Ehrhardt, gcc, gcc-patches


On Tuesday, March 4, 2003, at 06:55  AM, Michael Matz wrote:
>
>> I've got a tree
>> with everything but your stack slots changes merged (including
>> pre-reload), and the number of coloring passes is about the same.
>> However, with your stack slots changes , it takes 3-5x as many passes
>> in those functions that spill (some functions went from 3 passes to 15
>> passes).
>
> Ugh, that's much.  I wouldn't have expected this.  The only thing which
> _should_ have changed is, that now sometimes there are stack-accesses
> while coloring, where formerly there were simply only accesses to
> SPILL_SLOT_P() pseudos.  I believe Denis also makes some local
> optimizations while creating the code for the stack accesses early.
>
> Are you sure, that you don't coalesce SPILL_SLOT_P pseudos with 
> non-spill
> slots?

I specifically had the fix in my tree to make sure it wasn't the 
problem.

It seems we spill more webs again after allocating stack slots, for 
some reason.
Then we do two passes, can't color, assign stack slots, then spill 
again.
Repeat a few times.

I don't have time to track down *why* it happens.

It would have been nice if Denis had time tested the code before adding 
it, to be honest. It's nice to improve code, but when the allocation 
times go from 20 to 60 seconds, it's probably not worth it.

>> I should also point out the code i pointed out in the previous message
>> is still broken, if you think it will ever do something.
>> It will never trigger, afaict.
>
> It will, with pre-reload.  One sub function of build_i_graph() is
> select_regclass(), which calls web_class(), which can put a web into
> SPILLED.  This happens if one reference of a pseudo has conflicting
> constraints with all other references.  In that case this reference is
> "spilled" (for a lack of a better word, not to be confused with 
> spilling
> resulting from register pressure; it's similar to reloads just on
> pseudos).  Such a web is then marked as changed, and Denis uses the
> SPILLED list for this.
>
> The integration with the allocator is rather ugly, I agree.

This is the other reason i didn't consider merging it.
The new-register allocator is supposed to be clean, above all.
>
>> I change it to an abort in that case, and bootstrapped, and it doesn't
>> trigger the abort (meaning it never falls into that case during a
>> bootstrap).
>
> On your platform.  Take one with stranger constraints, and involving
> sources which cast between funny types, and you somewhen will hit it.
>
Err,  I tried this too (three very odd platforms). Never triggers.
You sure it's correct?

>
> Ciao,
> Michael.
>

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: Bug in ra-colorize.c:merge_moves?
  2003-03-04 12:16               ` Michael Matz
  2003-03-04 15:42                 ` Daniel Berlin
@ 2003-03-04 20:32                 ` Denis Chertykov
  2003-03-04 21:14                   ` Daniel Berlin
  1 sibling, 1 reply; 25+ messages in thread
From: Denis Chertykov @ 2003-03-04 20:32 UTC (permalink / raw)
  To: Michael Matz
  Cc: Daniel Berlin, Denis Chertykov, Christian Ehrhardt, gcc, gcc-patches

Michael Matz <matz@suse.de> writes:

> Hi,
> 
> On Mon, 3 Mar 2003, Daniel Berlin wrote:

First of all. Michael ! Thank you for your answer.

> > I've got a tree
> > with everything but your stack slots changes merged (including
> > pre-reload), and the number of coloring passes is about the same.
> > However, with your stack slots changes , it takes 3-5x as many
> > passes
> > in those functions that spill (some functions went from 3 passes to
> > 15
> > passes).
> 
> Ugh, that's much.  I wouldn't have expected this.  The only thing
> which
> _should_ have changed is, that now sometimes there are stack-accesses
> while coloring, where formerly there were simply only accesses to
> SPILL_SLOT_P() pseudos.  I believe Denis also makes some local
> optimizations while creating the code for the stack accesses early.

Yes it is.
IE: We have a web to spill:

i2  use web1
...
i3  use web1
...
i4  use web1

We generate spill insns:

i21 def web11 web1-pseudo-spill-slot
i2  use web11
...
i31 def web12 web1-pseudo-spill-slot
i3  use web12
...
i41 def web13 web1-pseudo-spill-slot
i4  use web13

My code substitute web1-pseudo-spill-slot to real stack slot and
trying to incorporate real stack slot to original insns (i2,i3,i4).

i2  use web1-real-stack-slot
...
i3  use web1-real-stack-slot
...
i4  use web1-real-stack-slot

So, number of webs reduced and possibility of graph colorization increased.

> > I should also point out the code i pointed out in the previous
> > message
> > is still broken, if you think it will ever do something.
> > It will never trigger, afaict.
> 
> It will, with pre-reload.  One sub function of build_i_graph() is
> select_regclass(), which calls web_class(), which can put a web into
> SPILLED.  This happens if one reference of a pseudo has conflicting
> constraints with all other references.  In that case this reference is
> "spilled" (for a lack of a better word, not to be confused with

I call it "splitting".

> spilling
> resulting from register pressure; it's similar to reloads just on
> pseudos).  Such a web is then marked as changed, and Denis uses the
> SPILLED list for this.
> 
> The integration with the allocator is rather ugly, I agree.  The use
> of
> the SPILLED list for this is confusing.  What this code basically does
> is

Why it's ugly ?
Are you think that creating of new list (ie SPLITTED) will be better ?

> bootstrap).

> On your platform.  Take one with stranger constraints, and involving
> sources which cast between funny types, and you somewhen will hit it.

The code which must support "splitting" before first colorization
phase was disabled because it's not complete.
See a comment "FIXME ..." in ra-build.c:web_class

Denis.

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: Bug in ra-colorize.c:merge_moves?
  2003-03-04 20:32                 ` Denis Chertykov
@ 2003-03-04 21:14                   ` Daniel Berlin
  2003-03-05 22:04                     ` Denis Chertykov
  0 siblings, 1 reply; 25+ messages in thread
From: Daniel Berlin @ 2003-03-04 21:14 UTC (permalink / raw)
  To: Denis Chertykov; +Cc: Michael Matz, Christian Ehrhardt, gcc, gcc-patches


On Tuesday, March 4, 2003, at 03:22  PM, Denis Chertykov wrote:

> Michael Matz <matz@suse.de> writes:
>
>> Hi,
>>
>> On Mon, 3 Mar 2003, Daniel Berlin wrote:
>
> First of all. Michael ! Thank you for your answer.
>
>>> I've got a tree
>>> with everything but your stack slots changes merged (including
>>> pre-reload), and the number of coloring passes is about the same.
>>> However, with your stack slots changes , it takes 3-5x as many
>>> passes
>>> in those functions that spill (some functions went from 3 passes to
>>> 15
>>> passes).
>>
>> Ugh, that's much.  I wouldn't have expected this.  The only thing
>> which
>> _should_ have changed is, that now sometimes there are stack-accesses
>> while coloring, where formerly there were simply only accesses to
>> SPILL_SLOT_P() pseudos.  I believe Denis also makes some local
>> optimizations while creating the code for the stack accesses early.
>
> Yes it is.
> IE: We have a web to spill:
>
> i2  use web1
> ...
> i3  use web1
> ...
> i4  use web1
>
> We generate spill insns:
>
> i21 def web11 web1-pseudo-spill-slot
> i2  use web11
> ...
> i31 def web12 web1-pseudo-spill-slot
> i3  use web12
> ...
> i41 def web13 web1-pseudo-spill-slot
> i4  use web13
>
> My code substitute web1-pseudo-spill-slot to real stack slot and
> trying to incorporate real stack slot to original insns (i2,i3,i4).
>
> i2  use web1-real-stack-slot
> ...
> i3  use web1-real-stack-slot
> ...
> i4  use web1-real-stack-slot
>
> So, number of webs reduced and possibility of graph colorization 
> increased.
>

Um, look again.
You've *decreased* the possibility of graph colorization in this 
example.
The colorability of a graph is *not* related to the number of webs, 
it's related to the number of conflicts.
web1 is now live *longer*, and thus, conflicts with more webs than it 
did before.

It now conflicts with everything in the ...'s, *decreasing* 
colorability.

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: Bug in ra-colorize.c:merge_moves?
  2003-03-04 21:14                   ` Daniel Berlin
@ 2003-03-05 22:04                     ` Denis Chertykov
  2003-03-06  1:21                       ` Daniel Berlin
  0 siblings, 1 reply; 25+ messages in thread
From: Denis Chertykov @ 2003-03-05 22:04 UTC (permalink / raw)
  To: Daniel Berlin
  Cc: Denis Chertykov, Michael Matz, Christian Ehrhardt, gcc, gcc-patches

Daniel Berlin <dberlin@dberlin.org> writes:

> On Tuesday, March 4, 2003, at 03:22  PM, Denis Chertykov wrote:
> 
> > Michael Matz <matz@suse.de> writes:
> >
> >> Hi,
> >>
> >> On Mon, 3 Mar 2003, Daniel Berlin wrote:
> >
> > First of all. Michael ! Thank you for your answer.
> >
> >>> I've got a tree
> >>> with everything but your stack slots changes merged (including
> >>> pre-reload), and the number of coloring passes is about the same.
> >>> However, with your stack slots changes , it takes 3-5x as many
> >>> passes
> >>> in those functions that spill (some functions went from 3 passes to
> >>> 15
> >>> passes).
> >>
> >> Ugh, that's much.  I wouldn't have expected this.  The only thing
> >> which
> >> _should_ have changed is, that now sometimes there are stack-accesses
> >> while coloring, where formerly there were simply only accesses to
> >> SPILL_SLOT_P() pseudos.  I believe Denis also makes some local
> >> optimizations while creating the code for the stack accesses early.
> >
> > Yes it is.
> > IE: We have a web to spill:
> >
> > i2  use web1
> > ...
> > i3  use web1
> > ...
> > i4  use web1
> >
> > We generate spill insns:
> >
> > i21 def web11 web1-pseudo-spill-slot
> > i2  use web11
> > ...
> > i31 def web12 web1-pseudo-spill-slot
> > i3  use web12
> > ...
> > i41 def web13 web1-pseudo-spill-slot
> > i4  use web13
> >
> > My code substitute web1-pseudo-spill-slot to real stack slot and
> > trying to incorporate real stack slot to original insns (i2,i3,i4).
> >
> > i2  use web1-real-stack-slot
> > ...
> > i3  use web1-real-stack-slot
> > ...
> > i4  use web1-real-stack-slot
> >
> > So, number of webs reduced and possibility of graph colorization
> > increased.
> >
> 
> Um, look again.
> You've *decreased* the possibility of graph colorization in this
> example.
> The colorability of a graph is *not* related to the number of webs,
> it's related to the number of conflicts.
> web1 is now live *longer*, and thus, conflicts with more webs than it
> did before.

Please point in which place.

> 
> It now conflicts with everything in the ...'s, *decreasing*
> colorability.

May be you misunderstand me ?
web1-real-stack-slot is a memory slot.
Please point in which place I have *decreased* the possibility of
graph colorization.

Denis.

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: Bug in ra-colorize.c:merge_moves?
  2003-03-05 22:04                     ` Denis Chertykov
@ 2003-03-06  1:21                       ` Daniel Berlin
  2003-03-06 11:39                         ` Michael Matz
  0 siblings, 1 reply; 25+ messages in thread
From: Daniel Berlin @ 2003-03-06  1:21 UTC (permalink / raw)
  To: Denis Chertykov; +Cc: Michael Matz, Christian Ehrhardt, gcc, gcc-patches

>>
>> Um, look again.
>> You've *decreased* the possibility of graph colorization in this
>> example.
>> The colorability of a graph is *not* related to the number of webs,
>> it's related to the number of conflicts.
>> web1 is now live *longer*, and thus, conflicts with more webs than it
>> did before.
>
> Please point in which place.

Sure.

>
>>
>> It now conflicts with everything in the ...'s, *decreasing*
>> colorability.
>
> May be you misunderstand me ?
> web1-real-stack-slot is a memory slot.
> Please point in which place I have *decreased* the possibility of
> graph colorization.

Yes, but unless you've made symbol_refs of some new symbol to represent 
the stack space you are using, they are still based off a register (in 
this case, the stack register).
This register now conflicts with the webin *all* of the  ...'s
Before, each new web would conflict with a single ...


To put it in symbolic terms, let's number the ...'s.

>>> i2  use web1-real-stack-slot
>>> ...[1]
>>> i3  use web1-real-stack-slot
>>> ...[2]
>>> i4  use web1-real-stack-slot
>>> ...[3]
>>>  you now have:

web1-real-stack-slot's base register conflicts with all webs in 1, 2, 
and 3.
Let's say each ... has 3 unique webs in it.
Your way now has 6-9 webs (depending on when the base register dies).

Before, we had:
>>> We generate spill insns:
>>>
>>> i21 def web11 web1-pseudo-spill-slot
>>> i2  use web11
>>> ...
>>> i31 def web12 web1-pseudo-spill-slot
>>> i3  use web12
>>> ...
>>> i41 def web13 web1-pseudo-spill-slot
>>> i4  use web13

Here, given the same parameters, we have:
0-3 webs conflicting with web11
0-3 webs conflicting with web12
0-3 webs conflicting with web13

The colorability of each of these is quite different.

In the first case, you can't color any of the webs with 
web1-real-stack-slot's base register. That means 6 -9 webs can't use 
that color
In the second case, web11, web12, and web13 can each be assigned a 
different color, and 0-3 webs conflicting with it can't use that color.

IE in the first case, if our stack register is hard reg 3, 6-9 webs 
can't use hard reg 3.
In the second case, if we choose different colors for each pseudo-spill 
slot, say hard regs 4, 5, and 6, 0-3 webs can't use reg 4, 0-3 webs 
can't use reg 5, and 0-3 webs can't use reg 6.

This is usually much more colorable, and in the worst case, as 
colorable.
In the case the registers all only had 3 open, 6-9 webs now can't be 
colored.
before, if all the registers only had 3 open, only 0-3 webs can't be 
colored.

--Dan

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: Bug in ra-colorize.c:merge_moves?
  2003-03-06  1:21                       ` Daniel Berlin
@ 2003-03-06 11:39                         ` Michael Matz
  2003-03-06 16:14                           ` Daniel Berlin
  2003-03-06 16:45                           ` Denis Chertykov
  0 siblings, 2 replies; 25+ messages in thread
From: Michael Matz @ 2003-03-06 11:39 UTC (permalink / raw)
  To: Daniel Berlin; +Cc: Denis Chertykov, Christian Ehrhardt, gcc, gcc-patches

Hi,

On Wed, 5 Mar 2003, Daniel Berlin wrote:

> Yes, but unless you've made symbol_refs of some new symbol to represent
> the stack space you are using, they are still based off a register (in
> this case, the stack register).

Which is fixed, and therefore not even a condidate for coloring.  Ergo
there's no conflict between webs and the stack pointer.  Even if it were a
candidate the webs would most probably already have conflicted with it, so
it doesn't really change the conflicts for the webs.

This is different on more constrained machines, where the offset has to be
loaded into an index register.  I'm not sure, how the code will currently
look like for those, but I suppose the offset will be loaded before each
reference.  So this one also can be colored differently for each ref, like
if there were stack pseudos instead of real mem slots.

The problem with stack pseudos on constrained machines is, that the
actual mem references to the stack slots they represent are only created
after coloring.  This means, that all those mems need to be already
non-conflicting with all the other webs used around them.  On
non-constrained machines that's trivial, simply using the fixed stack
pointer.  But on other machines creating such mems involves using index
registers, and this would destroy the coloring, which is, why Denis made
them created during the coloring process, so they are taken into account.
I.e. on some machines this is useless (and arguably leads to more
difficult coloring), but on others it's needed for correctness (Denis,
correct me if I'm wrong here).


Ciao,
Michael.

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: Bug in ra-colorize.c:merge_moves?
  2003-03-06 11:39                         ` Michael Matz
@ 2003-03-06 16:14                           ` Daniel Berlin
  2003-03-06 16:30                             ` Michael Matz
  2003-03-06 17:31                             ` Daniel Berlin
  2003-03-06 16:45                           ` Denis Chertykov
  1 sibling, 2 replies; 25+ messages in thread
From: Daniel Berlin @ 2003-03-06 16:14 UTC (permalink / raw)
  To: Michael Matz; +Cc: Denis Chertykov, Christian Ehrhardt, gcc, gcc-patches


On Thursday, March 6, 2003, at 06:08  AM, Michael Matz wrote:

> Hi,
>
> On Wed, 5 Mar 2003, Daniel Berlin wrote:
>
>> Yes, but unless you've made symbol_refs of some new symbol to 
>> represent
>> the stack space you are using, they are still based off a register (in
>> this case, the stack register).
>
> Which is fixed, and therefore not even a condidate for coloring.


But still causes conflicts, regardless, since the register is no longer 
available.

>  Ergo
> there's no conflict between webs and the stack pointer.

Of course there is, since they can't use that register as a color now, 
and could before (since the pseudo spill slots could get different 
colors).

>  Even if it were a
> candidate the webs would most probably already have conflicted with 
> it, so
> it doesn't really change the conflicts for the webs.
>
This is pure conjecture, i have stats that say otherwise (that they 
didn't conflict with it before, and do now).

Theory is nice, but the fact that it takes 15 passes to color something 
that used to take 3, the only difference is stack slots, and the number 
of conflicts on the newly spilled webs has gone up, tells me you are 
wrong.

All of this is academic anyway, since my solution is to not even 
consider merging the stack slot code until it doesn't cause 15 passes 
of coloring, *whatever* the reason.
Not that this stops you or anyone else, of course.

--Dan

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: Bug in ra-colorize.c:merge_moves?
  2003-03-06 16:14                           ` Daniel Berlin
@ 2003-03-06 16:30                             ` Michael Matz
  2003-03-06 17:31                             ` Daniel Berlin
  1 sibling, 0 replies; 25+ messages in thread
From: Michael Matz @ 2003-03-06 16:30 UTC (permalink / raw)
  To: Daniel Berlin; +Cc: Denis Chertykov, Christian Ehrhardt, gcc, gcc-patches

Hi,

On Thu, 6 Mar 2003, Daniel Berlin wrote:

> > Which is fixed, and therefore not even a condidate for coloring.
>
> But still causes conflicts, regardless, since the register is no longer
> available.
>
> > Ergo there's no conflict between webs and the stack pointer.
>
> Of course there is, since they can't use that register as a color now,
> and could before (since the pseudo spill slots could get different
> colors).

No, a fixed register could _never_ have been used by a pseudo.  fixed regs
are completely ignored while creating conflicts.  This means, that if the
created memory addresses for the stack slots only contain the stack
pointer (or other fixed hardregs) they do not result in more conflicts
than before.  I.e. generally adding insns anywhere which only reference
fixed registers doesn't change the conflict graph in any way.

> > Even if it were a
> > candidate the webs would most probably already have conflicted with
> > it, so
> > it doesn't really change the conflicts for the webs.
> >
> This is pure conjecture, i have stats that say otherwise (that they
> didn't conflict with it before, and do now).

This means, that your stack pointer is not fixed, and yes, then what you
say can happen.  Do you have any testcase showing this?  I believe you,
but I would like to take a look to get a feel for the whole situation.
Pretty please? ;)  This needs also investigation, because of the
correctness issues I mentioned last mail.  If after coloring we create
memory references, which use hardregs, which were candidates for coloring,
we can silently emit wrong code.

> Theory is nice, but the fact that it takes 15 passes to color something
> that used to take 3, the only difference is stack slots, and the number
> of conflicts on the newly spilled webs has gone up, tells me you are
> wrong.

Or that the reasons for the many passes are different from what you
currenly think ;-)  But without tests we can only trust you.


Ciao,
Michael.

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: Bug in ra-colorize.c:merge_moves?
  2003-03-06 11:39                         ` Michael Matz
  2003-03-06 16:14                           ` Daniel Berlin
@ 2003-03-06 16:45                           ` Denis Chertykov
  2003-03-06 18:01                             ` Daniel Berlin
  1 sibling, 1 reply; 25+ messages in thread
From: Denis Chertykov @ 2003-03-06 16:45 UTC (permalink / raw)
  To: Michael Matz
  Cc: Daniel Berlin, Denis Chertykov, Christian Ehrhardt, gcc, gcc-patches

Michael Matz <matz@suse.de> writes:

> Hi,
> 
> On Wed, 5 Mar 2003, Daniel Berlin wrote:
> 
> > Yes, but unless you've made symbol_refs of some new symbol to represent
> > the stack space you are using, they are still based off a register (in
> > this case, the stack register).
> 
> Which is fixed, and therefore not even a condidate for coloring.  Ergo
> there's no conflict between webs and the stack pointer.  Even if it were a
> candidate the webs would most probably already have conflicted with it, so
> it doesn't really change the conflicts for the webs.
> 
> This is different on more constrained machines, where the offset has to be
> loaded into an index register.  I'm not sure, how the code will currently
> look like for those, but I suppose the offset will be loaded before each
> reference.  So this one also can be colored differently for each ref, like
> if there were stack pseudos instead of real mem slots.
> 
> The problem with stack pseudos on constrained machines is, that the
> actual mem references to the stack slots they represent are only created
> after coloring.  This means, that all those mems need to be already
> non-conflicting with all the other webs used around them.  On
> non-constrained machines that's trivial, simply using the fixed stack
> pointer.  But on other machines creating such mems involves using index
> registers, and this would destroy the coloring, which is, why Denis made
> them created during the coloring process, so they are taken into account.
> I.e. on some machines this is useless (and arguably leads to more
> difficult coloring), but on others it's needed for correctness (Denis,
> correct me if I'm wrong here).

Again thanks for your answer.

My code must improve generated code quality and it's required for
constrained architectures.
Correction:
IE: We have a web to spill:

i2  use web1
...[1]
i3  use web1
...[2]
i4  use web1

Allocator generate spill insns:

i21 def web11 web1-pseudo-spill-slot
i2  use web11
...[1]
i31 def web12 web1-pseudo-spill-slot
i3  use web12
...[2]
i41 def web13 web1-pseudo-spill-slot
i4  use web13

After this stage we have an increased or decreased colorization
possibility. (Decreased if web1 before spill was overconstrained
ie. has CREG class on x86)

My code substitute web1-pseudo-spill-slot to real stack slot if and
only if colorization phase can't colorize web1-pseudo-spill-slot.
This pass similar to undo of previous spill phase combined with
substitution of web1-pseudo-spill-slot to web1-real-stack-slot.
After such substitution we will not have web1-pseudo-spill-slot,
web11, web12, web13.

i2  use web1-real-stack-slot
...
i3  use web1-real-stack-slot
...
i4  use web1-real-stack-slot

So, number of webs reduced (number of conflicts will be definitely
reduced) and possibility of graph colorization increased. I hope to
colorize full graph in next colorization phase.

I'm agree that this method increased a count of allocator passes, but
this method must improve code quality.

Denis.

PS: Especially for Daniel ! Look to ra-rewrite.c: actual_spill

-------------- fragment of ra-rewrite.c ----------------
actual_spill (spill_p)
     int spill_p ATTRIBUTE_UNUSED;
{
  int i;
  bitmap new_deaths;

  /* If we have a webs colored by an_unusable_color (ie we think that they are
     already in frame) we must put such webs to frame.  */
  if (/* !spill_p && */ subst_to_stack_p ())
#------^^^^^^^^^^^^^^^^  Just remove comments and you will have extra
#                        passes only at end of allocation process.
    /* If you uncomment the SPILL_P usage then you will have a calls to
       assign_stack_slots only at end of allocation process.
       See to the caller of actual_spill.  */
    {
-------------------------------------------------------

Daniel. Try to remove comments around `!spill_p &&' and compare generated code
quality for both cases (commented and not commented `!spill_p &&').

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: Bug in ra-colorize.c:merge_moves?
  2003-03-06 16:14                           ` Daniel Berlin
  2003-03-06 16:30                             ` Michael Matz
@ 2003-03-06 17:31                             ` Daniel Berlin
  1 sibling, 0 replies; 25+ messages in thread
From: Daniel Berlin @ 2003-03-06 17:31 UTC (permalink / raw)
  To: Daniel Berlin
  Cc: Michael Matz, Denis Chertykov, Christian Ehrhardt, gcc, gcc-patches


On Thursday, March 6, 2003, at 10:38  AM, Daniel Berlin wrote:

>
> On Thursday, March 6, 2003, at 06:08  AM, Michael Matz wrote:
>
>> Hi,
>>
>> On Wed, 5 Mar 2003, Daniel Berlin wrote:
>>
>>> Yes, but unless you've made symbol_refs of some new symbol to 
>>> represent
>>> the stack space you are using, they are still based off a register 
>>> (in
>>> this case, the stack register).
>>
>> Which is fixed, and therefore not even a condidate for coloring.
>
>
> But still causes conflicts, regardless, since the register is no 
> longer available.
>

Sorry, read what you wrote wrong, since you said "candidate for 
coloring" rather than "register available to coloring", which is what 
you really meant.
And of course, this depends on the architecture.

Maybe you can explain why the number of conflicts *does* increase when 
Denis's code is used, since we ignore conflicts between fixed regs and 
pseudos.
Somethings is broken.

Testcases that show the number of passes has increased are easy to come 
by.
Just pick a file.

If you want one that shows particularly bad results, try expr.c.
Before, the number of functions taking at least 4 passes was 0 and it's 
now 7.
Before the spill slot coalescing change it was 49 on the 
new-regalloc-branch, which is um, bad.
Before:
[dberlin@dberlin gcc]$ grep "Pass 4" expr.i.25.lreg |uniq -c
[dberlin@dberlin gcc]$
After:
[dberlin@dberlin gcc]$ grep "Pass 4" expr.i.23.lreg |uniq -c
      7 RegAlloc Pass 4

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: Bug in ra-colorize.c:merge_moves?
  2003-03-06 16:45                           ` Denis Chertykov
@ 2003-03-06 18:01                             ` Daniel Berlin
  2003-03-07 18:34                               ` Denis Chertykov
  0 siblings, 1 reply; 25+ messages in thread
From: Daniel Berlin @ 2003-03-06 18:01 UTC (permalink / raw)
  To: Denis Chertykov; +Cc: Michael Matz, Christian Ehrhardt, gcc, gcc-patches

>
> So, number of webs reduced (number of conflicts will be definitely
> reduced) and possibility of graph colorization increased. I hope to
> colorize full graph in next colorization phase.
>
> I'm agree that this method increased a count of allocator passes, but
> this method must improve code quality.
>
Sure, but that's not the problem. The problem is the code quality 
increase isn't measurable.
I couldn't measure a significant difference in spec, or other 
benchmarks (like skidmarks, for instance), on x86 or powerpc.
No measurable difference in benchmarks + increased time in allocation = 
bad.
I wouldn't complain so much if it increased code quality some 
measurable amount, but if it does, i can't find it.

If code quality was our only measure, we could do ILP based optimal 
register allocation and spilling.

> Denis.
>
> PS: Especially for Daniel ! Look to ra-rewrite.c: actual_spill
>
> -------------- fragment of ra-rewrite.c ----------------
> actual_spill (spill_p)
>      int spill_p ATTRIBUTE_UNUSED;
> {
>   int i;
>   bitmap new_deaths;
>
>   /* If we have a webs colored by an_unusable_color (ie we think that 
> they are
>      already in frame) we must put such webs to frame.  */
>   if (/* !spill_p && */ subst_to_stack_p ())
> #------^^^^^^^^^^^^^^^^  Just remove comments and you will have extra
> #                        passes only at end of allocation process.
>     /* If you uncomment the SPILL_P usage then you will have a calls to
>        assign_stack_slots only at end of allocation process.
>        See to the caller of actual_spill.  */
>     {
> -------------------------------------------------------
>
> Daniel. Try to remove comments around `!spill_p &&' and compare 
> generated code
> quality for both cases (commented and not commented `!spill_p &&').
>
I tried this once, actually, about a week ago, and it caused bootstrap 
failures.


--Dan

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: Bug in ra-colorize.c:merge_moves?
  2003-03-06 18:01                             ` Daniel Berlin
@ 2003-03-07 18:34                               ` Denis Chertykov
  0 siblings, 0 replies; 25+ messages in thread
From: Denis Chertykov @ 2003-03-07 18:34 UTC (permalink / raw)
  To: Daniel Berlin
  Cc: Denis Chertykov, Michael Matz, Christian Ehrhardt, gcc, gcc-patches

Daniel Berlin <dberlin@dberlin.org> writes:

> >
> > So, number of webs reduced (number of conflicts will be definitely
> > reduced) and possibility of graph colorization increased. I hope to
> > colorize full graph in next colorization phase.
> >
> > I'm agree that this method increased a count of allocator passes, but
> > this method must improve code quality.
> >
> Sure, but that's not the problem. The problem is the code quality
> increase isn't measurable.
> I couldn't measure a significant difference in spec, or other
> benchmarks (like skidmarks, for instance), on x86 or powerpc.
> No measurable difference in benchmarks + increased time in allocation
> = bad.
> I wouldn't complain so much if it increased code quality some
> measurable amount, but if it does, i can't find it.

I'm agree with you.
I have tested my patch before the following changes:
--- Fragment from Micael's mail.
Are you sure, that you don't coalesce SPILL_SLOT_P pseudos with non-spill
slots?  This indeed makes the time go up fairly much without improving the
code.  The change for that was only committed a few days ago, without it,
one had needed to ignore moves with any SPILL_SLOT_P pseudo involved.
-----

I need more time for analyzing. (1,2..3 days)

Denis.

^ permalink raw reply	[flat|nested] 25+ messages in thread

end of thread, other threads:[~2003-03-07 17:44 UTC | newest]

Thread overview: 25+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2003-02-25  9:00 Bug in ra-colorize.c:merge_moves? Christian Ehrhardt
2003-02-25 16:59 ` Michael Matz
2003-02-27 17:07   ` Christian Ehrhardt
2003-02-27 17:34     ` Michael Matz
2003-02-27 20:37       ` Michael Matz
2003-02-28 12:50         ` Christian Ehrhardt
2003-02-28 23:07           ` Michael Matz
2003-03-03 20:24     ` Daniel Berlin
2003-03-03 20:31       ` Michael Matz
2003-03-03 20:47         ` Daniel Berlin
2003-03-03 21:46           ` Denis Chertykov
2003-03-04  5:05             ` Daniel Berlin
2003-03-04 12:16               ` Michael Matz
2003-03-04 15:42                 ` Daniel Berlin
2003-03-04 20:32                 ` Denis Chertykov
2003-03-04 21:14                   ` Daniel Berlin
2003-03-05 22:04                     ` Denis Chertykov
2003-03-06  1:21                       ` Daniel Berlin
2003-03-06 11:39                         ` Michael Matz
2003-03-06 16:14                           ` Daniel Berlin
2003-03-06 16:30                             ` Michael Matz
2003-03-06 17:31                             ` Daniel Berlin
2003-03-06 16:45                           ` Denis Chertykov
2003-03-06 18:01                             ` Daniel Berlin
2003-03-07 18:34                               ` Denis Chertykov

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).