public inbox for gcc@gcc.gnu.org
 help / color / mirror / Atom feed
* Re: [tree-ssa] Translation out of SSA
       [not found] <1051194701.16445.2250.camel@p4>
@ 2003-04-25  1:23 ` Jan Vroonhof
  2003-04-25  5:25   ` law
  0 siblings, 1 reply; 4+ messages in thread
From: Jan Vroonhof @ 2003-04-25  1:23 UTC (permalink / raw)
  To: Andrew MacLeod; +Cc: gcc

Sorry to to but in, but if you could spare a few minutes to clear up a
few details for us lurkers trying to follow what is going on.

>  - conflict graph based coalesing of partitions.

Unless  I am very confused doesn't this mean this the comment in tree-ssa.c

/* The following procedures implement the out of SSA algorithm detailed in 
   the Morgan Book pages 176-186.  */

is not quite true (anymore)? The partition equivalence relation given
in the Morgan book seems much weaker (only coalescing partitions just
enough to avoid inserting copies on abnormal critical edges), whereas
the one you just implemented seems to be right on the other end of the
spectrum (i.e. coalesce unless we absolute cannot[1]).


Is it explained somewhere:
  1. Why you went for this particular option? Is it to minimise the
number of temporaries left over after unssa? I presume Morgan is just
relying on a coalescing register allocator to come just after.
  2. How the partitioning in your algorithm avoids inserting copies on
abnormal critical edges (at least to me it is non-obvious that the
algorithm guarantees the SSA temps of the same variable on both sides
of an abnormal edge end up in the same partition).
or am I just showing my lack of knowledge and this is all trivial?


> Then we'll remove the abort() permanently, and start allowing overlapping 
> live ranges, but not before I debug them :-). I know there is a bug in edge
> insertion I have to get to, and a couple of other things.

Are you talking about the following[2]

gcc -O3 -ftree-copyprop -c foo.c

for foo.c

void foo(int i)
{
  int b;
  b = 0;
  
  if (i)
    {
      b = i;
    }
  else
    {
      i = b;
    }
  foo(i);
  foo(b);
}

gives

Program received signal SIGSEGV, Segmentation fault.
bsi_insert_after (curr_bsi=0xbffff108, t=0x40344760, mode=BSI_NEW_STMT)
    at ../../gcc/tree-flow-inline.h:72
72      {
(gdb) where
#0  bsi_insert_after (curr_bsi=0xbffff108, t=0x40344760, mode=BSI_NEW_STMT)
    at ../../gcc/tree-flow-inline.h:72
#1  0x080bbd93 in bsi_commit_first_edge_insert (e=0x8716738, stmt=0x40344760)
    at ../../gcc/tree-cfg.c:3466
#2  0x080bd036 in bsi_commit_edge_inserts (update_annotations=0,
    new_blocks=0x0) at ../../gcc/tree-cfg.c:3598
#3  0x080cae2a in rewrite_out_of_ssa (fndecl=0x4036d700)

> 500 seconds, so thats a 2% hit. We might be able to decrease it, but
> we are doing a *heck* of a lot more work. Building the live ranges,
> a conflict graph, coalescing partitions, and running it through the
> real PHI node copy generator, which itself generates little
> interference graphs for copies on edges.

The Briggs, e.a. "Practical Improvements ..." paper that is the basis
for the SSA rewriting phase actually claims that you can solve the
latter problem (possible cycles in the PHI nodes within one block)
without actually constructing the graph itself. At first glance at
least they seem to be solving the same problem.

Thanks for your consideration,

Jan

Footnotes: 
[1]  The code even optimistically assumes it can map all the
partitions to their root variable and maps it to a different temporary
if it absolutely has to (quite different from the union-find proposed
by Morgan).

[2]  I actually have enabled the #if 0 bits in the to-SSA conversion
that do basic value numbering etc as well[3]. But I don't think it matters
in the above case.

[3]  They have actually bit-rotted a bit because they are #if'ed out.

-	  bsi_remove (si);
+	  bsi_remove (&si);

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [tree-ssa] Translation out of SSA
  2003-04-25  1:23 ` [tree-ssa] Translation out of SSA Jan Vroonhof
@ 2003-04-25  5:25   ` law
  2003-04-25  7:36     ` Andrew MacLeod
  0 siblings, 1 reply; 4+ messages in thread
From: law @ 2003-04-25  5:25 UTC (permalink / raw)
  To: Jan Vroonhof; +Cc: Andrew MacLeod, gcc

In message <20030424232723.GA4937@petteflet.ntlworld.com>, Jan Vroonhof writes:
 >Sorry to to but in, but if you could spare a few minutes to clear up a
 >few details for us lurkers trying to follow what is going on.
 >
 >>  - conflict graph based coalesing of partitions.
 >
 >Unless  I am very confused doesn't this mean this the comment in tree-ssa.c
 >
 >/* The following procedures implement the out of SSA algorithm detailed in 
 >   the Morgan Book pages 176-186.  */
 >
 >is not quite true (anymore)?
No, the comment is still correct.

 >The partition equivalence relation given
 >in the Morgan book seems much weaker (only coalescing partitions just
 >enough to avoid inserting copies on abnormal critical edges), whereas
 >the one you just implemented seems to be right on the other end of the
 >spectrum (i.e. coalesce unless we absolute cannot[1]).
See pages 312-316 in Morgan -- it discusses how to use a more aggressive
partitioning scheme.  It's unfortunate that he doesn't give you a
forward reference in the SSA chapter to the recommended way to 
partition.

While you can do an SSA->normal translation with the weaker partitioning,
you end up with a lot more copies than you really need.

You could leave this for a post-ssa pass to clean up, doing so is rather
wasteful (you generate a lot more nodes for later passes of the compiler
to process).    It's doubly wasteful in that the data structure you need
to do renaming & coalescing are the same as you need to do the SSA->Normal
translation.

 >  2. How the partitioning in your algorithm avoids inserting copies on
 >abnormal critical edges (at least to me it is non-obvious that the
 >algorithm guarantees the SSA temps of the same variable on both sides
 >of an abnormal edge end up in the same partition).
 >or am I just showing my lack of knowledge and this is all trivial?
I wouldn't be terribly surprised if this isn't handled yet.  It's
not particularly difficult to handle.


 >> Then we'll remove the abort() permanently, and start allowing overlapping 
 >> live ranges, but not before I debug them :-). I know there is a bug in edge
 >> insertion I have to get to, and a couple of other things.
 >
 >Are you talking about the following[2]
Unknown.  The most interesting thing I've run into in the insertion code
is it mucked things up because we had different blocks recorded for
the current statement and its container.  That caused all kinds of
problems.

 >The Briggs, e.a. "Practical Improvements ..." paper that is the basis
 >for the SSA rewriting phase actually claims that you can solve the
 >latter problem (possible cycles in the PHI nodes within one block)
 >without actually constructing the graph itself. At first glance at
 >least they seem to be solving the same problem.
Yes, the problems are very very closely related, if not identical.


 >[1]  The code even optimistically assumes it can map all the
 >partitions to their root variable and maps it to a different temporary
 >if it absolutely has to (quite different from the union-find proposed
 >by Morgan).
Nope, you just have to go elsewhere in Morgan to get his real partitioner :-)

Jeff

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [tree-ssa] Translation out of SSA
  2003-04-25  5:25   ` law
@ 2003-04-25  7:36     ` Andrew MacLeod
  2003-04-25  8:01       ` law
  0 siblings, 1 reply; 4+ messages in thread
From: Andrew MacLeod @ 2003-04-25  7:36 UTC (permalink / raw)
  To: Jeff Law; +Cc: jslists, gcc

On Thu, 2003-04-24 at 22:02, law@redhat.com wrote:
> In message <20030424232723.GA4937@petteflet.ntlworld.com>, Jan Vroonhof writes:

> 
>  >  2. How the partitioning in your algorithm avoids inserting copies on
>  >abnormal critical edges (at least to me it is non-obvious that the
>  >algorithm guarantees the SSA temps of the same variable on both sides
>  >of an abnormal edge end up in the same partition).
>  >or am I just showing my lack of knowledge and this is all trivial?
> I wouldn't be terribly surprised if this isn't handled yet.  It's
> not particularly difficult to handle.
> 
> 

Well, it doesn't right now, but it is trivial. You simply coalesce all
the things on abnormal edges first so that they aren't an issue later...
I just havent gotten to it yet. We haven't actually exercised anything
really interesting yet :-). Everything *ought* to always map to the root
variable still. Soon!

>  >> Then we'll remove the abort() permanently, and start allowing overlapping 
>  >> live ranges, but not before I debug them :-). I know there is a bug in edge
>  >> insertion I have to get to, and a couple of other things.
>  >
>  >Are you talking about the following[2]

Pretty much. I changed the insert_on_edge code from acting-on-demand to
queuing-it-up and doing it all at once. No one was using the code, and
this pass was going to be the initial client. I noticed the failure, but
there were other more serious matters failing earlier which required
attention first. I should shortly be back to using that code, and then
I'll get the bug out :-) I doubt its very serious.

> Unknown.  The most interesting thing I've run into in the insertion code
> is it mucked things up because we had different blocks recorded for
> the current statement and its container.  That caused all kinds of
> problems.

really? That would be bad. Who made the stmt and its container have
different blocks? Thats not cool. 


>
> 
>  >[1]  The code even optimistically assumes it can map all the
>  >partitions to their root variable and maps it to a different temporary
>  >if it absolutely has to (quite different from the union-find proposed
>  >by Morgan).
> Nope, you just have to go elsewhere in Morgan to get his real partitioner :-)

To be honest, I didn't even look at Morgans partitioner... I just wrote
one based on what I knew about coalescing... :-)  And yes, thats what it
does. Only maps it to a different temporary if it absolutely has to. The
plan will be to choose which ones get a different temporary carefully...
right now its pure first come-first serve. That'll be part and parcel
with the abnormal edge handling.

Andrew


^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [tree-ssa] Translation out of SSA
  2003-04-25  7:36     ` Andrew MacLeod
@ 2003-04-25  8:01       ` law
  0 siblings, 0 replies; 4+ messages in thread
From: law @ 2003-04-25  8:01 UTC (permalink / raw)
  To: Andrew MacLeod; +Cc: jslists, gcc

In message <1051239386.2396.84.camel@p4>, Andrew MacLeod writes:
 >> I wouldn't be terribly surprised if this isn't handled yet.  It's
 >> not particularly difficult to handle.
 >> 
 >> 
 >
 >Well, it doesn't right now, but it is trivial. You simply coalesce all
 >the things on abnormal edges first so that they aren't an issue later...
Yup.  I actually had a change which would identify the objects
where this was an issue, but I can't find it here right now.  It was
pretty trivial though.

 >> Unknown.  The most interesting thing I've run into in the insertion code
 >> is it mucked things up because we had different blocks recorded for
 >> the current statement and its container.  That caused all kinds of
 >> problems.
 >
 >really? That would be bad. Who made the stmt and its container have
 >different blocks? Thats not cool. 
Yea.  Then again, it may have been some of my code that mucked it up.
I've got two little passes here which do "interesting" things to the
tree structures.  Mostly they're collapsing BIND_EXPRs, removing
all the pesky empty_stmt_nodes jumps to the next instruction and the
like.  On one testcase I had this resulted in a 75% reduction in the
number of INSNs we create, more realistic tests indicate we should
expect a 1-2% reduction in INSNs from this initial hunk of work.

Anyway, it's entirely possible those changes mucked things up enough
to confuse the insertion routines when I had them running earlier in
the SSA path.

 >To be honest, I didn't even look at Morgans partitioner... I just wrote
 >one based on what I knew about coalescing... :-)  And yes, thats what it
 >does. Only maps it to a different temporary if it absolutely has to. The
 >plan will be to choose which ones get a different temporary carefully...
 >right now its pure first come-first serve. That'll be part and parcel
 >with the abnormal edge handling.
Yea, I thought of this possibility as well after I sent my message given
your background in partitioning and coalescing :-)

Regardless, Morgan's real partitioner has similar goals to yours.

Jeff

^ permalink raw reply	[flat|nested] 4+ messages in thread

end of thread, other threads:[~2003-04-25  3:11 UTC | newest]

Thread overview: 4+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
     [not found] <1051194701.16445.2250.camel@p4>
2003-04-25  1:23 ` [tree-ssa] Translation out of SSA Jan Vroonhof
2003-04-25  5:25   ` law
2003-04-25  7:36     ` Andrew MacLeod
2003-04-25  8:01       ` law

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).