[tree-ssa] CCP inefficiencies

public inbox for gcc@gcc.gnu.org
 help / color / mirror / Atom feed

* [tree-ssa] CCP inefficiencies
@ 2003-02-12 23:37 law
  2003-02-13 15:59 ` Diego Novillo
  0 siblings, 1 reply; 17+ messages in thread
From: law @ 2003-02-12 23:37 UTC (permalink / raw)
  To: dnovillo; +Cc: gcc

Wow.  Our CCP implementation is expensive.  Like absurdly so.

It looks like a lot of the problem is the over-eager copying of nodes
so that we can replace their operands and try to fold them.  This is
in itself rather expensive, but it also creates lots of garbage for
the collector to clean up.  Ugh.  

To give you an idea of how bad things are, if you stop compilation 
immediately after we tear down the SSA form, you'll find that CCP
is regularly taking 10-25% of the compilation time.

A quick prototype which avoids the silly copying of nodes and folding
of expressions which are unlikely to generate a constant drastically
improves this situation:

For insn-attrtab.i we have:

 garbage collection    :   1.79 (14%) usr   0.02 ( 4%) sys   1.82 (13%) wall
 tree CCP              :   3.32 (26%) usr   0.24 (44%) sys   3.60 (27%) wall

Now with my scheme we get:

 garbage collection    :   1.20 (12%) usr   0.01 ( 3%) sys   1.18 (12%) wall
 tree CCP              :   0.74 ( 8%) usr   0.01 ( 3%) sys   0.77 ( 8%) wall

Note the drastic improvement in the time spent inside CCP.  Also note the
significant improvement in time spent inside the garbage collector.

Or another example -- combine.i:

 garbage collection    :   0.65 (20%) usr   0.00 ( 0%) sys   0.64 (18%) wall
 tree CCP              :   0.59 (18%) usr   0.03 (20%) sys   0.71 (21%) wall

With my implementation:
 garbage collection    :   0.35 (14%) usr   0.00 ( 0%) sys   0.34 (13%) wall
 tree CCP              :   0.13 ( 5%) usr   0.00 ( 0%) sys   0.16 ( 6%) wall

Or yet another example, reload.i:

 garbage collection    :   0.30 (14%) usr   0.00 ( 0%) sys   0.31 (13%) wall
 tree CCP              :   0.48 (22%) usr   0.03 (18%) sys   0.50 (21%) wall

With my implementation:

 garbage collection    :   0.18 (10%) usr   0.00 ( 0%) sys   0.20 (11%) wall
 tree CCP              :   0.14 ( 8%) usr   0.00 ( 0%) sys   0.10 ( 5%) wall

Do you want another?  reload1.i:

 garbage collection    :   0.58 (22%) usr   0.00 ( 0%) sys   0.60 (21%) wall
 tree CCP              :   0.49 (18%) usr   0.01 ( 8%) sys   0.41 (14%) wall

With my implementation:
 garbage collection    :   0.20 (11%) usr   0.00 ( 0%) sys   0.20 (11%) wall
 tree CCP              :   0.10 ( 6%) usr   0.00 ( 0%) sys   0.10 ( 5%) wall

Do you sense a pattern here? :-)

Overall (and remember, the compiler is stopped once tree-ssa is finished)
we go from 139.11 seconds to 118.44 seconds.  Significantly fewer
major and minor page faults (which is an indicator that we're probably
sweeping through less memory in the collector).

Anyway, just wanted to give everyone a heads up to some upcoming changes...

Jeff

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [tree-ssa] CCP inefficiencies
  2003-02-12 23:37 [tree-ssa] CCP inefficiencies law
@ 2003-02-13 15:59 ` Diego Novillo
  2003-02-13 16:13   ` law
  0 siblings, 1 reply; 17+ messages in thread
From: Diego Novillo @ 2003-02-13 15:59 UTC (permalink / raw)
  To: Jeff Law; +Cc: gcc

On Wed, 2003-02-12 at 18:08, law@redhat.com wrote:

> It looks like a lot of the problem is the over-eager copying of nodes
> so that we can replace their operands and try to fold them.  This is
> in itself rather expensive, but it also creates lots of garbage for
> the collector to clean up.  Ugh.  
> 
Yeah.  Attribute that to Diego being a lazy bum(1).  I originally had
the idea of having some sort of undo buffer so that CCP could try the
replacement and fold, but copying the expression tree was so much easier
:)

In any case, if you implement an undo buffer for CCP, I'll be in your
debt.  Even better, implement an undo buffer for any kind of tree
rewrite we may want to use.


Diego.

(1) The original message had an alternate spelling with bum that started
with 'sh', but this is a public list.

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [tree-ssa] CCP inefficiencies
  2003-02-13 15:59 ` Diego Novillo
@ 2003-02-13 16:13   ` law
  2003-02-13 16:28     ` Diego Novillo
  2003-02-13 16:34     ` Daniel Berlin
  0 siblings, 2 replies; 17+ messages in thread
From: law @ 2003-02-13 16:13 UTC (permalink / raw)
  To: Diego Novillo; +Cc: gcc

In message <1045151865.10505.54.camel@frodo>, Diego Novillo writes:
 >On Wed, 2003-02-12 at 18:08, law@redhat.com wrote:
 >
 >> It looks like a lot of the problem is the over-eager copying of nodes
 >> so that we can replace their operands and try to fold them.  This is
 >> in itself rather expensive, but it also creates lots of garbage for
 >> the collector to clean up.  Ugh.  
 >> 
 >Yeah.  Attribute that to Diego being a lazy bum(1).  I originally had
 >the idea of having some sort of undo buffer so that CCP could try the
 >replacement and fold, but copying the expression tree was so much easier
 >:)
Or a non-destructive simplifier.  I'm not sure which is going to make
more sense, but it's pretty clear we need something better than just
copying the nodes.


Jeff

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [tree-ssa] CCP inefficiencies
  2003-02-13 16:13   ` law
@ 2003-02-13 16:28     ` Diego Novillo
  2003-02-13 16:40       ` law
  2003-02-13 16:34     ` Daniel Berlin
  1 sibling, 1 reply; 17+ messages in thread
From: Diego Novillo @ 2003-02-13 16:28 UTC (permalink / raw)
  To: Jeff Law; +Cc: gcc

On Thu, 2003-02-13 at 11:10, law@redhat.com wrote:

> Or a non-destructive simplifier.  I'm not sure which is going to make
> more sense, but it's pretty clear we need something better than just
> copying the nodes.
> 
CCP always needs to do destructive things because it calls fold() which
munges the input tree.  Alternately, we could 'fix' fold() to not be
destructive.


Diego.

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [tree-ssa] CCP inefficiencies
  2003-02-13 16:13   ` law
  2003-02-13 16:28     ` Diego Novillo
@ 2003-02-13 16:34     ` Daniel Berlin
  2003-02-14 16:17       ` law
  1 sibling, 1 reply; 17+ messages in thread
From: Daniel Berlin @ 2003-02-13 16:34 UTC (permalink / raw)
  To: law; +Cc: Diego Novillo, gcc


On Thursday, February 13, 2003, at 11:10  AM, law@redhat.com wrote:

> In message <1045151865.10505.54.camel@frodo>, Diego Novillo writes:
>> On Wed, 2003-02-12 at 18:08, law@redhat.com wrote:
>>
>>> It looks like a lot of the problem is the over-eager copying of nodes
>>> so that we can replace their operands and try to fold them.  This is
>>> in itself rather expensive, but it also creates lots of garbage for
>>> the collector to clean up.  Ugh.
>>>
>> Yeah.  Attribute that to Diego being a lazy bum(1).  I originally had
>> the idea of having some sort of undo buffer so that CCP could try the
>> replacement and fold, but copying the expression tree was so much 
>> easier
>> :)
> Or a non-destructive simplifier.  I'm not sure which is going to make
> more sense, but it's pretty clear we need something better than just
> copying the nodes.
>
Yeah, I was going to look at a non-destructive simplifier eventually, 
but i presume this is what you've done in your work.
We should always be able to determine if a node will become constant or 
has a good chance before ever copying it just by seeing if
1. we've replaced all variables with constants
2. we've replaced a unary operator on a variable with a unary operator 
on a constant
3. we've replaced a binary operator on two variables with a binary 
operator on two constants
etc

There's no need to copy the node to check if any of these cases hold.

> Jeff
>

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [tree-ssa] CCP inefficiencies
  2003-02-13 16:28     ` Diego Novillo
@ 2003-02-13 16:40       ` law
  2003-02-13 16:45         ` Diego Novillo
  0 siblings, 1 reply; 17+ messages in thread
From: law @ 2003-02-13 16:40 UTC (permalink / raw)
  To: Diego Novillo; +Cc: gcc

In message <1045153377.13193.98.camel@frodo>, Diego Novillo writes:
 >On Thu, 2003-02-13 at 11:10, law@redhat.com wrote:
 >
 >> Or a non-destructive simplifier.  I'm not sure which is going to make
 >> more sense, but it's pretty clear we need something better than just
 >> copying the nodes.
 >> 
 >CCP always needs to do destructive things because it calls fold() which
 >munges the input tree.  Alternately, we could 'fix' fold() to not be
 >destructive.
Rather than "fixing" fold, we introduce a new, simpler fold_nondestructive
or whatever.  The number of cases we care about are actually a small subset
of the cases fold currently handles.

Jeff



 >
 >
 >Diego.


^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [tree-ssa] CCP inefficiencies
  2003-02-13 16:40       ` law
@ 2003-02-13 16:45         ` Diego Novillo
  2003-02-14  8:32           ` Diego Novillo
  0 siblings, 1 reply; 17+ messages in thread
From: Diego Novillo @ 2003-02-13 16:45 UTC (permalink / raw)
  To: Jeff Law; +Cc: gcc

On Thu, 2003-02-13 at 11:38, law@redhat.com wrote:

> Rather than "fixing" fold, we introduce a new, simpler fold_nondestructive
> or whatever.  The number of cases we care about are actually a small subset
> of the cases fold currently handles.
> 
OK.  I guess that we only need something along the lines of what Dan
suggested: 'is this expression going to be folded into a constant?' 


Diego.

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [tree-ssa] CCP inefficiencies
  2003-02-13 16:45         ` Diego Novillo
@ 2003-02-14  8:32           ` Diego Novillo
  2003-02-14  8:49             ` Daniel Berlin
  2003-02-14 14:16             ` law
  0 siblings, 2 replies; 17+ messages in thread
From: Diego Novillo @ 2003-02-14  8:32 UTC (permalink / raw)
  To: Diego Novillo; +Cc: Jeff Law, gcc

On Thu, 2003-02-13 at 11:42, Diego Novillo wrote:
> On Thu, 2003-02-13 at 11:38, law@redhat.com wrote:
> 
> > Rather than "fixing" fold, we introduce a new, simpler fold_nondestructive
> > or whatever.  The number of cases we care about are actually a small subset
> > of the cases fold currently handles.
> > 
> OK.  I guess that we only need something along the lines of what Dan
> suggested: 'is this expression going to be folded into a constant?' 
> 
On second thought, what I said above is completely wrong.  Knowing that
the expression is constant is not enough, we *really* need to fold it to
compute and return its lattice value from tree-ssa-ccp.c:evaluate_stmt.

The problem is that during the CCP pass, an expression may alternate
between constant and varying values.  Take for instance this code:

a_1 = 5;
b_1 = 3;
a_2 = phi (a_1, a_3);
while (a < 10)
{
  c_1 = a_2 + b_1;
  if (c_1)
    d_1 = c_1 - 1;
  a_3 = a_2 + 1;
}

On the first iteration, we enter the assignment to c_1 and find that all
its operands are constant, a_2 is 5 (because the only executable
argument for the phi node is a_1), and b_1 is 3.  However, knowing that
c_1 is a constant doesn't help us.  We need to fold the expression so
that we can set c_1's lattice value to 8.

Suppose that we fold the original statement from 'c_1 = a_2 + b_1' to
'c_1 = 8'.  Now fast forward to the assignment 'a_3 = a_2 + 1'.  Again,
the same problem, knowing that a_3 is a constant doesn't help, you need
to fold it to find out that its value is 6.

And now, we are screwed.  When we iterate back to a_2's phi node, we
find out that it isn't really constant, because it evaluates to 
'a_2 = phi (5, 6)'.  We now need to go back to 'c_1 = 8' and restore the
original expression.

I guess what we could do is keep a deep copy of the original expression
in evaluate_stmt only if we determine that the call to fold() will
return a constant value.  But we must be able to fold and unfold
expressions at will.

Diego.

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [tree-ssa] CCP inefficiencies
  2003-02-14  8:32           ` Diego Novillo
@ 2003-02-14  8:49             ` Daniel Berlin
  2003-02-14 11:42               ` Daniel Berlin
  2003-02-14 11:48               ` Diego Novillo
  2003-02-14 14:16             ` law
  1 sibling, 2 replies; 17+ messages in thread
From: Daniel Berlin @ 2003-02-14  8:49 UTC (permalink / raw)
  To: Diego Novillo; +Cc: Jeff Law, gcc



On Fri, 14 Feb 2003, Diego Novillo wrote:

> On Thu, 2003-02-13 at 11:42, Diego Novillo wrote:
> > On Thu, 2003-02-13 at 11:38, law@redhat.com wrote:
> >
> > > Rather than "fixing" fold, we introduce a new, simpler fold_nondestructive
> > > or whatever.  The number of cases we care about are actually a small subset
> > > of the cases fold currently handles.
> > >
> > OK.  I guess that we only need something along the lines of what Dan
> > suggested: 'is this expression going to be folded into a constant?'
> >
> On second thought, what I said above is completely wrong.  Knowing that
> the expression is constant is not enough, we *really* need to fold it to
> compute and return its lattice value from tree-ssa-ccp.c:evaluate_stmt.

I was only talking about replacement, not about evaluation.
For replacement, an expression can't become constant unless you've
replaced at least one variable with a constant.
For evaluation, you can't do it without copying.

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [tree-ssa] CCP inefficiencies
  2003-02-14  8:49             ` Daniel Berlin
@ 2003-02-14 11:42               ` Daniel Berlin
  2003-02-14 11:48               ` Diego Novillo
  1 sibling, 0 replies; 17+ messages in thread
From: Daniel Berlin @ 2003-02-14 11:42 UTC (permalink / raw)
  To: Diego Novillo; +Cc: Jeff Law, gcc



On Fri, 14 Feb 2003, Daniel Berlin wrote:

>
>
> On Fri, 14 Feb 2003, Diego Novillo wrote:
>
> > On Thu, 2003-02-13 at 11:42, Diego Novillo wrote:
> > > On Thu, 2003-02-13 at 11:38, law@redhat.com wrote:
> > >
> > > > Rather than "fixing" fold, we introduce a new, simpler fold_nondestructive
> > > > or whatever.  The number of cases we care about are actually a small subset
> > > > of the cases fold currently handles.
> > > >
> > > OK.  I guess that we only need something along the lines of what Dan
> > > suggested: 'is this expression going to be folded into a constant?'
> > >
> > On second thought, what I said above is completely wrong.  Knowing that
> > the expression is constant is not enough, we *really* need to fold it to
> > compute and return its lattice value from tree-ssa-ccp.c:evaluate_stmt.
>
> I was only talking about replacement, not about evaluation.
> For replacement, an expression can't become constant unless you've
> replaced at least one variable with a constant.
> For evaluation, you can't do it without copying.

At least, i *think* i was only talking about replacement.
It's almost 4am, so if i wake up and write something completely different,
...

>

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [tree-ssa] CCP inefficiencies
  2003-02-14  8:49             ` Daniel Berlin
  2003-02-14 11:42               ` Daniel Berlin
@ 2003-02-14 11:48               ` Diego Novillo
  1 sibling, 0 replies; 17+ messages in thread
From: Diego Novillo @ 2003-02-14 11:48 UTC (permalink / raw)
  To: Daniel Berlin; +Cc: Jeff Law, gcc

On Fri, 2003-02-14 at 03:44, Daniel Berlin wrote:

> On Fri, 14 Feb 2003, Diego Novillo wrote:
> 
> > On second thought, what I said above is completely wrong.  Knowing that
> > the expression is constant is not enough, we *really* need to fold it to
> > compute and return its lattice value from tree-ssa-ccp.c:evaluate_stmt.
> 
> I was only talking about replacement, not about evaluation.
>
Right.  That's why I ammended my original comment :)

> For replacement, an expression can't become constant unless you've
> replaced at least one variable with a constant.
> For evaluation, you can't do it without copying.
>
Yes.


Diego.

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [tree-ssa] CCP inefficiencies
  2003-02-14  8:32           ` Diego Novillo
  2003-02-14  8:49             ` Daniel Berlin
@ 2003-02-14 14:16             ` law
  2003-02-14 14:43               ` Diego Novillo
  1 sibling, 1 reply; 17+ messages in thread
From: law @ 2003-02-14 14:16 UTC (permalink / raw)
  To: Diego Novillo; +Cc: gcc

In message <1045190219.13192.138.camel@frodo>, Diego Novillo writes:
 >On Thu, 2003-02-13 at 11:42, Diego Novillo wrote:
 >> On Thu, 2003-02-13 at 11:38, law@redhat.com wrote:
 >> 
 >> > Rather than "fixing" fold, we introduce a new, simpler fold_nondestructiv
 >e
 >> > or whatever.  The number of cases we care about are actually a small subs
 >et
 >> > of the cases fold currently handles.
 >> > 
 >> OK.  I guess that we only need something along the lines of what Dan
 >> suggested: 'is this expression going to be folded into a constant?' 
 >> 
 >On second thought, what I said above is completely wrong.  Knowing that
 >the expression is constant is not enough, we *really* need to fold it to
 >compute and return its lattice value from tree-ssa-ccp.c:evaluate_stmt.
 >
 >The problem is that during the CCP pass, an expression may alternate
 >between constant and varying values.  Take for instance this code:
A value can only change twice.  The valid state changes are:
UNDEF->CONSTANT
UNDEF->VARYING
CONSTANT->VARYING

Ping-ponging isn't allowed.  CONSTANT->CONSTANT is not allowed (which is
something your code gets wrong).

 >a_1 = 5;
 >b_1 = 3;
 >a_2 = phi (a_1, a_3);
 >while (a < 10)
 >{
 >  c_1 = a_2 + b_1;
 >  if (c_1)
 >    d_1 = c_1 - 1;
 >  a_3 = a_2 + 1;
 >}
 >
 >On the first iteration, we enter the assignment to c_1 and find that all
 >its operands are constant, a_2 is 5 (because the only executable
 >argument for the phi node is a_1), and b_1 is 3.  However, knowing that
 >c_1 is a constant doesn't help us.  We need to fold the expression so
 >that we can set c_1's lattice value to 8.
 >
 >Suppose that we fold the original statement from 'c_1 = a_2 + b_1' to
 >'c_1 = 8'.  Now fast forward to the assignment 'a_3 = a_2 + 1'.  Again,
 >the same problem, knowing that a_3 is a constant doesn't help, you need
 >to fold it to find out that its value is 6.
 >
 >And now, we are screwed.  When we iterate back to a_2's phi node, we
 >find out that it isn't really constant, because it evaluates to 
 >'a_2 = phi (5, 6)'.  We now need to go back to 'c_1 = 8' and restore the
 >original expression.
 >
 >I guess what we could do is keep a deep copy of the original expression
 >in evaluate_stmt only if we determine that the call to fold() will
 >return a constant value.  But we must be able to fold and unfold
 >expressions at will.
A non-destructive folder will handle this just fine.

jeff

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [tree-ssa] CCP inefficiencies
  2003-02-14 14:16             ` law
@ 2003-02-14 14:43               ` Diego Novillo
  2003-02-14 17:25                 ` law
  0 siblings, 1 reply; 17+ messages in thread
From: Diego Novillo @ 2003-02-14 14:43 UTC (permalink / raw)
  To: law; +Cc: gcc

On Fri, 14 Feb 2003, Jeff Law wrote:

> Ping-ponging isn't allowed.
>
Of course not.  Not only it's not allowed, it's impossible.  The
algorigthm is designed to only make state changes that move up
the lattice, you cannot transition down.  I didn't want to imply
ping-pong effects.  My point was that once folded, an expression
may need to be unfolded.

> CONSTANT->CONSTANT is not allowed (which is
> something your code gets wrong).
> 
C -> C is not a state change.  How can we get it wrong?  Are you
saying that there are cases where we transition sideways in the
lattice?  We can switch a variable from one constant value C1 to
different constant value C2?  Eek, that shouldn't.  Do you have a
testcase?


> A non-destructive folder will handle this just fine.
> 
Well, of course, that's the point :)  Do you think we should
re-implement fold()?  I think may_fold_p() should be good enough
for the time being.  But if you want to implement a
non-destructive folder, that's also fine with me.


Diego.

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [tree-ssa] CCP inefficiencies
  2003-02-13 16:34     ` Daniel Berlin
@ 2003-02-14 16:17       ` law
  0 siblings, 0 replies; 17+ messages in thread
From: law @ 2003-02-14 16:17 UTC (permalink / raw)
  To: Daniel Berlin; +Cc: Diego Novillo, gcc

In message <7C33061F-3F6F-11D7-B968-000393575BCC@dberlin.org>, Daniel Berlin wr
ites:
 >
 >On Thursday, February 13, 2003, at 11:10  AM, law@redhat.com wrote:
 >
 >> In message <1045151865.10505.54.camel@frodo>, Diego Novillo writes:
 >>> On Wed, 2003-02-12 at 18:08, law@redhat.com wrote:
 >>>
 >>>> It looks like a lot of the problem is the over-eager copying of nodes
 >>>> so that we can replace their operands and try to fold them.  This is
 >>>> in itself rather expensive, but it also creates lots of garbage for
 >>>> the collector to clean up.  Ugh.
 >>>>
 >>> Yeah.  Attribute that to Diego being a lazy bum(1).  I originally had
 >>> the idea of having some sort of undo buffer so that CCP could try the
 >>> replacement and fold, but copying the expression tree was so much 
 >>> easier
 >>> :)
 >> Or a non-destructive simplifier.  I'm not sure which is going to make
 >> more sense, but it's pretty clear we need something better than just
 >> copying the nodes.
 >>
 >Yeah, I was going to look at a non-destructive simplifier eventually, 
 >but i presume this is what you've done in your work.
 >We should always be able to determine if a node will become constant or 
 >has a good chance before ever copying it just by seeing if
 >1. we've replaced all variables with constants
 >2. we've replaced a unary operator on a variable with a unary operator 
 >on a constant
 >3. we've replaced a binary operator on two variables with a binary 
 >operator on two constants
 >etc
 >
 >There's no need to copy the node to check if any of these cases hold.
I originally started with similar criteria -- interestingly enough 
we want to go ahead with simplifications *anytime* we're able to replace
a variable with a constant.

This was somewhat of a surprise, but failing to do so was causing us to
miss simplifications of certain relationals.  For example, if you
had:

  (lt (ssa_name X) (ssa_name Y))


  Let's assume that X has the unsigned bit set, but that it varies.  Let's
  then assume that we know Y is zero.  Even though you don't know what
  values X might take, you know that the comparison is always false.

There were other examples, but that's the type I saw we were missing
most often when requiring all variables to be replaced by constants
before copying & simplifying.

jeff

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [tree-ssa] CCP inefficiencies
  2003-02-14 14:43               ` Diego Novillo
@ 2003-02-14 17:25                 ` law
  2003-02-14 21:32                   ` Richard Henderson
  0 siblings, 1 reply; 17+ messages in thread
From: law @ 2003-02-14 17:25 UTC (permalink / raw)
  To: Diego Novillo; +Cc: gcc

In message <20030214141504.GA2330@tornado.toronto.redhat.com>, Diego Novillo wr
ites:
 >On Fri, 14 Feb 2003, Jeff Law wrote:
 >
 >> Ping-ponging isn't allowed.
 >>
 >Of course not.  Not only it's not allowed, it's impossible.  The
 >algorigthm is designed to only make state changes that move up
 >the lattice, you cannot transition down.  I didn't want to imply
 >ping-pong effects.  My point was that once folded, an expression
 >may need to be unfolded.
Or we do non-destructive folds.  

 >> CONSTANT->CONSTANT is not allowed (which is
 >> something your code gets wrong).
 >> 
 >C -> C is not a state change.  How can we get it wrong?  Are you
 >saying that there are cases where we transition sideways in the
 >lattice?  We can switch a variable from one constant value C1 to
 >different constant value C2?  Eek, that shouldn't.  Do you have a
 >testcase?
I was pretty sure I saw sideways transitions, but I can't trigger
one right now.  Odd.  Regardless if you look at how set_lattice_value
works, it'll claim the value is still a constant even if it has
two different constant values.  It's easy enough to fix.

I also believe we should be checking for invalid transitions under 
ENABLE_CHECKING.  

Checks for invalid transitions into CONSTANT are easy.
(UNDEFINED->CONSTANT is OK, VARYING->CONSTANT is not)

It's easy to check CONSTANT->UNDEFINED, which is of course invalid.
Unfortunately, adding checks for VARYING->UNDEFINED is not possible
at this time due to our implementation of CCP.

Let's say we've just evaluated a node for the first time and the
evaluation returned UNDEFINED (say for the SSA_NAME destination of a
PHI node).  We go to set the lattice value to undefined (def_to_undefined).
The first thing that routine does is call get_value to get the object's
old value.  Since this is the first time we've accessed the value for this
SSA_NAME object, we get it's default value from get_default_value, which
is VARYING.  So it appears as if we're doing an UNDEFINED->VARYING
transition (which is of course invalid).   Oh well, it's not the end of
the world...

BTW, it's rather silly to call set_value if the lattice value hasn't
changed.

 >> A non-destructive folder will handle this just fine.
 >> 
 >Well, of course, that's the point :)  Do you think we should
 >re-implement fold()?  I think may_fold_p() should be good enough
 >for the time being.  But if you want to implement a
 >non-destructive folder, that's also fine with me.
Again, the number of cases we need to handle is drastically smaller
than what's dealt with in fold.  I really don't think it's going to 
be terribly difficult.

Jeff

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [tree-ssa] CCP inefficiencies
  2003-02-14 17:25                 ` law
@ 2003-02-14 21:32                   ` Richard Henderson
  2003-02-14 23:40                     ` law
  0 siblings, 1 reply; 17+ messages in thread
From: Richard Henderson @ 2003-02-14 21:32 UTC (permalink / raw)
  To: law; +Cc: Diego Novillo, gcc

On Fri, Feb 14, 2003 at 10:03:32AM -0700, law@redhat.com wrote:
> Again, the number of cases we need to handle is drastically smaller
> than what's dealt with in fold.  I really don't think it's going to 
> be terribly difficult.

Nevertheless, it's code duplication.

Any chance you could pull out the bits you need from the
existing fold, and then have fold call into them?  Probably
not a bad idea anyway, since the fold function itself is
too damn big....


r~

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [tree-ssa] CCP inefficiencies
  2003-02-14 21:32                   ` Richard Henderson
@ 2003-02-14 23:40                     ` law
  0 siblings, 0 replies; 17+ messages in thread
From: law @ 2003-02-14 23:40 UTC (permalink / raw)
  To: Richard Henderson; +Cc: Diego Novillo, gcc

In message <20030214212222.GB16166@redhat.com>, Richard Henderson writes:
 >On Fri, Feb 14, 2003 at 10:03:32AM -0700, law@redhat.com wrote:
 >> Again, the number of cases we need to handle is drastically smaller
 >> than what's dealt with in fold.  I really don't think it's going to 
 >> be terribly difficult.
 >
 >Nevertheless, it's code duplication.
 >
 >Any chance you could pull out the bits you need from the
 >existing fold, and then have fold call into them?  Probably
 >not a bad idea anyway, since the fold function itself is
 >too damn big....
Some of the code can easily be factored and shared (in fact, I had
already started down this path when I ran into some of CCP's sillyness
that I decided to fix before fixing fold :-)

If you look at fold, conceptually the code we're going to care about
is everything where "wins" is true.  But you'll also find that "wins"
code is a tiny tiny fraction of what happens inside fold.

A lot of what fold does is scramble nested expressions in the hopes
that it can be simplified -- we don't have that to deal with since
we're in GIMPLE form.

jeff


^ permalink raw reply	[flat|nested] 17+ messages in thread

end of thread, other threads:[~2003-02-14 23:14 UTC | newest]

Thread overview: 17+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2003-02-12 23:37 [tree-ssa] CCP inefficiencies law
2003-02-13 15:59 ` Diego Novillo
2003-02-13 16:13   ` law
2003-02-13 16:28     ` Diego Novillo
2003-02-13 16:40       ` law
2003-02-13 16:45         ` Diego Novillo
2003-02-14  8:32           ` Diego Novillo
2003-02-14  8:49             ` Daniel Berlin
2003-02-14 11:42               ` Daniel Berlin
2003-02-14 11:48               ` Diego Novillo
2003-02-14 14:16             ` law
2003-02-14 14:43               ` Diego Novillo
2003-02-14 17:25                 ` law
2003-02-14 21:32                   ` Richard Henderson
2003-02-14 23:40                     ` law
2003-02-13 16:34     ` Daniel Berlin
2003-02-14 16:17       ` law

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).