public inbox for gcc@gcc.gnu.org
 help / color / mirror / Atom feed
* stack height reduction in tree-SSA
@ 2004-09-01  6:10 Rakesh Kumar, Noida
  2004-09-01 10:23 ` Giovanni Bajo
  2004-09-01 17:57 ` Mike Stump
  0 siblings, 2 replies; 10+ messages in thread
From: Rakesh Kumar, Noida @ 2004-09-01  6:10 UTC (permalink / raw)
  To: gcc, rakeshku

Hi,

  Sharing stack space for objects with disjoint
lifetimes is a well known problem with GCC (especially
for the kernel people). Consider the code

  if (i)
    {
      char a[100];
      use (a);
    }
  else
    {
      char b[100];
      use (b);
    }

Apparantly, 'a' and 'b' can share same stack slot. I feel
that this problem could have been fixed in tree-SSA, but
exactly opposite is done. During 'lower' pass, record_vars
saves the variable declarations in a link list, which are
later expanded to RTL during RTL generation pass in expand_vars.
Below is the dump of 'lower' pass for the above test case.

foo (i)
{
  char b[100];
  char a[100];

  if (i != 0) goto <D1052>; else goto <D1053>;
  <D1052>:;
  use (&a);
  goto <D1054>;
  <D1053>:;
  use (&b);
  <D1054>:;
  return;
}

Is it that good to forget the scope of variable and allocate each
one a new stack slot? I'm unable to get any reasons why the stack
slots can't be shared. Please make me aware of the issues involved.

Thanks and Regards,
Rakesh Kumar

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: stack height reduction in tree-SSA
  2004-09-01  6:10 stack height reduction in tree-SSA Rakesh Kumar, Noida
@ 2004-09-01 10:23 ` Giovanni Bajo
  2004-09-01 10:40   ` Joseph S. Myers
  2004-09-01 17:57 ` Mike Stump
  1 sibling, 1 reply; 10+ messages in thread
From: Giovanni Bajo @ 2004-09-01 10:23 UTC (permalink / raw)
  To: Rakesh Kumar, Noida; +Cc: gcc

Rakesh Kumar, Noida wrote:

> I feel that this problem could have been fixed in tree-SSA, but
> exactly opposite is done. [...]
> Is it that good to forget the scope of variable and allocate each
> one a new stack slot? I'm unable to get any reasons why the stack
> slots can't be shared. Please make me aware of the issues involved.

Yes. tree-ssa behaves very bad wrt stack allocation. There is at least an open
regression filed by me about it (http://gcc.gnu.org/PR16987). The bug that
tracks this problem is http://gcc.gnu.org/PR9997. Notice that, from what I
heard, the correct solution would be to share stack slots using the live ranges
of the variables, rather than the mere lexical scopes. For instance:

void foo(void)
{
   int a, b;

   a = get();
   use_it(&a);

   b = get();
   use_it(&b);
}

could be optimized to use only one variable on the stack. I have the secret
hope that someone will eventually take care of this before 3.5 is release, but
it's just a hope. Surely, this bug alone regresses GCC badly for all embedded
systems.

Giovanni Bajo


^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: stack height reduction in tree-SSA
  2004-09-01 10:23 ` Giovanni Bajo
@ 2004-09-01 10:40   ` Joseph S. Myers
  2004-09-01 10:50     ` Giovanni Bajo
  2004-09-01 16:49     ` Jeffrey A Law
  0 siblings, 2 replies; 10+ messages in thread
From: Joseph S. Myers @ 2004-09-01 10:40 UTC (permalink / raw)
  To: Giovanni Bajo; +Cc: Rakesh Kumar, Noida, gcc

On Wed, 1 Sep 2004, Giovanni Bajo wrote:

> void foo(void)
> {
>    int a, b;
> 
>    a = get();
>    use_it(&a);
> 
>    b = get();
>    use_it(&b);
> }
> 
> could be optimized to use only one variable on the stack. I have the secret

In this example, a is still live during the second calls to get() and 
use_it(); the first call to use_it() could have saved a's address.

A variable's life won't exceed its scope (meaning the block in which it is 
declared, including the parts before its declaration) even if its address 
is taken.  If its address is not taken (or if the address is taken but 
only in such a way as not to escape) then you can ignore the scope and 
just observe when it is live.

-- 
Joseph S. Myers               http://www.srcf.ucam.org/~jsm28/gcc/
  http://www.srcf.ucam.org/~jsm28/gcc/#c90status - status of C90 for GCC 3.5
    jsm@polyomino.org.uk (personal mail)
    jsm28@gcc.gnu.org (Bugzilla assignments and CCs)

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: stack height reduction in tree-SSA
  2004-09-01 10:40   ` Joseph S. Myers
@ 2004-09-01 10:50     ` Giovanni Bajo
  2004-09-01 16:49     ` Jeffrey A Law
  1 sibling, 0 replies; 10+ messages in thread
From: Giovanni Bajo @ 2004-09-01 10:50 UTC (permalink / raw)
  To: Joseph S. Myers; +Cc: Rakesh Kumar, Noida, gcc

Joseph S. Myers wrote:
> On Wed, 1 Sep 2004, Giovanni Bajo wrote:
> 
>> void foo(void)
>> {
>>    int a, b;
>> 
>>    a = get();
>>    use_it(&a);
>> 
>>    b = get();
>>    use_it(&b);
>> }
>> 
>> could be optimized to use only one variable on the stack. I have the
>> secret 
> 
> In this example, a is still live during the second calls to get() and
> use_it(); the first call to use_it() could have saved a's address.

Right, sorry for the bogus example.

Giovanni Bajo


^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: stack height reduction in tree-SSA
  2004-09-01 10:40   ` Joseph S. Myers
  2004-09-01 10:50     ` Giovanni Bajo
@ 2004-09-01 16:49     ` Jeffrey A Law
  2004-09-01 17:18       ` Dale Johannesen
  1 sibling, 1 reply; 10+ messages in thread
From: Jeffrey A Law @ 2004-09-01 16:49 UTC (permalink / raw)
  To: Joseph S. Myers; +Cc: Giovanni Bajo, Rakesh Kumar, Noida, gcc

On Wed, 2004-09-01 at 04:39, Joseph S. Myers wrote:
> On Wed, 1 Sep 2004, Giovanni Bajo wrote:
> 
> > void foo(void)
> > {
> >    int a, b;
> > 
> >    a = get();
> >    use_it(&a);
> > 
> >    b = get();
> >    use_it(&b);
> > }
> > 
> > could be optimized to use only one variable on the stack. I have the secret
> 
> In this example, a is still live during the second calls to get() and 
> use_it(); the first call to use_it() could have saved a's address.
> 
> A variable's life won't exceed its scope (meaning the block in which it is 
> declared, including the parts before its declaration) even if its address 
> is taken.  If its address is not taken (or if the address is taken but 
> only in such a way as not to escape) then you can ignore the scope and 
> just observe when it is live.
The optimizers are not aware of scoping within a function.  So
consider something like

test ()
{
  int a;

  {
    int z = whatever
    a = c;
  }

  use a
}

Copy propagation will turn the "use a" into a "use c" -- it knows
no reason not to make that transformation because it does not 
(and should not) know about variable scoping.

We've discussed various means of trying to tackle the problem of
sharing stack slots to mitigate the stack space problems.  If
someone is interested in tackling this problem, I'd suggest
they ping Richard Henderson who seemed to have the best handle
on a potential solution when we last discussed it as a group
a few months ago.

jeff

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: stack height reduction in tree-SSA
  2004-09-01 16:49     ` Jeffrey A Law
@ 2004-09-01 17:18       ` Dale Johannesen
  2004-09-01 17:38         ` Jeffrey A Law
  2004-09-08 12:00         ` tm_gccmail
  0 siblings, 2 replies; 10+ messages in thread
From: Dale Johannesen @ 2004-09-01 17:18 UTC (permalink / raw)
  To: law
  Cc: Rakesh Kumar, Noida, Giovanni Bajo, gcc, Dale Johannesen,
	Joseph S. Myers

On Sep 1, 2004, at 9:49 AM, Jeffrey A Law wrote:
> We've discussed various means of trying to tackle the problem of
> sharing stack slots to mitigate the stack space problems.  If
> someone is interested in tackling this problem, I'd suggest
> they ping Richard Henderson who seemed to have the best handle
> on a potential solution when we last discussed it as a group
> a few months ago.

One thing to bear in mind is that it is *not* always beneficial to 
reduce
stack space as much as possible, because this inhibits scheduling.
For example (there is a similar case in SPEC):

int i; double a[100];
for (i=0; i<100; i+=4)
{
   a[i] = i*i*i;
   a[i+1] = (i+1)*(i+1)*(i+1);
   a[i+2] = (i+2)*(i+2)*(i+2);
   a[i+3] = (i+3)*(i+3)*(i+3);
}

Several targets require going through a stack slot to do int->double 
conversion.
If you use a single stack slot, which is possible, you're forced to do
store-load-store-load-store-load-store-load.  You may be better off 
using 4
separate slots, as the stores and loads can be scheduled better.
(I am sure this is the case on ppc).  So please do not solve the more
common problem in a way that makes it impossible to do the right thing
here.

Even sharing space for objects whose live ranges don't overlap can lose 
if it
interferes with the several optimizations that can increase live range,
I believe, e.g. invariant hoisting, interblock scheduling....tree-ssa 
seems to
be way too early to be doing this.  Right before reload is more like it 
IMO.


^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: stack height reduction in tree-SSA
  2004-09-01 17:18       ` Dale Johannesen
@ 2004-09-01 17:38         ` Jeffrey A Law
  2004-09-08 12:00         ` tm_gccmail
  1 sibling, 0 replies; 10+ messages in thread
From: Jeffrey A Law @ 2004-09-01 17:38 UTC (permalink / raw)
  To: Dale Johannesen; +Cc: Rakesh Kumar, Noida, Giovanni Bajo, gcc, Joseph S. Myers

On Wed, 2004-09-01 at 11:18, Dale Johannesen wrote:
> On Sep 1, 2004, at 9:49 AM, Jeffrey A Law wrote:
> > We've discussed various means of trying to tackle the problem of
> > sharing stack slots to mitigate the stack space problems.  If
> > someone is interested in tackling this problem, I'd suggest
> > they ping Richard Henderson who seemed to have the best handle
> > on a potential solution when we last discussed it as a group
> > a few months ago.
> 
> One thing to bear in mind is that it is *not* always beneficial to 
> reduce stack space as much as possible, because this inhibits
>  scheduling.
Certainly.  IMHO, the only time we ought to be looking to share
stack slots will be for "large" objects with disjoint lifetimes.

IIRC the case that comes up in the kernel boils down to something
like this:

func1 ()
{
  int x[100];
  whatever;
}

func2 ()
{
  func1 ();
  func1 ();
  func1 ();
}

Each instance of func1 that is inlined into func2 creates a new
400 byte stack object even though their lifetimes are disjoint.

The stack hotspot for converting an int<->double or for doing
FP<->int moves is something that we generally don't want to share
as it severely inhibits scheduling and can cause all kinds of
interesting stalls on some chips (the PA comes to mind). 


Jeff

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: stack height reduction in tree-SSA
  2004-09-01  6:10 stack height reduction in tree-SSA Rakesh Kumar, Noida
  2004-09-01 10:23 ` Giovanni Bajo
@ 2004-09-01 17:57 ` Mike Stump
  2004-09-01 20:06   ` Richard Henderson
  1 sibling, 1 reply; 10+ messages in thread
From: Mike Stump @ 2004-09-01 17:57 UTC (permalink / raw)
  To: Rakesh Kumar, Noida; +Cc: gcc, rakeshku

On Tuesday, August 31, 2004, at 10:41  PM, Rakesh Kumar, Noida wrote:
> Apparantly, 'a' and 'b' can share same stack slot. I feel
> that this problem could have been fixed in tree-SSA, but
> exactly opposite is done.

> I'm unable to get any reasons why the stack
> slots can't be shared.

Historically, gcc did better, then, aliasing was improved, but then it 
didn't track lifetimes of the independent variables and it would let 
the optimizer move code based upon the alias analysis and generate 
incorrect code.  To fix this, either, they had to mark them as can 
alias anything, or not reuse the space, they choose to not reuse the 
space...  and that decision is preserved in the top of the tree.

char buf[100] would be a bad testcase, try:

float f[20];
double d[10];

...

If we used a new alias set that was defined to conflict with the two 
original sets involved for each of the large variables, I think we 
could fix the original problem... not that others would not exist.  For 
small variables, I think not reusing may be best, though, that's really 
up to the scheduler in general.

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: stack height reduction in tree-SSA
  2004-09-01 17:57 ` Mike Stump
@ 2004-09-01 20:06   ` Richard Henderson
  0 siblings, 0 replies; 10+ messages in thread
From: Richard Henderson @ 2004-09-01 20:06 UTC (permalink / raw)
  To: Mike Stump; +Cc: Rakesh Kumar, Noida, gcc, rakeshku

On Wed, Sep 01, 2004 at 10:57:41AM -0700, Mike Stump wrote:
> float f[20];
> double d[10];
> 
> If we used a new alias set that was defined to conflict with the two 
> original sets involved for each of the large variables, I think we 
> could fix the original problem... not that others would not exist.

It turns out that we *can't* fix this without entirely turning off
alias sets.  You don't just have to worry about f[3] and d[6], but
also any random pointer dereference.

There's a test in the testsuite that triggers this.

> For small variables, I think not reusing may be best, though, that's
> really up to the scheduler in general.

Recall that only *addressable* small variables live on the stack.
I disbelieve that these matter that much.  I'm not going to do
anything to distinguish this case.


r~

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: stack height reduction in tree-SSA
  2004-09-01 17:18       ` Dale Johannesen
  2004-09-01 17:38         ` Jeffrey A Law
@ 2004-09-08 12:00         ` tm_gccmail
  1 sibling, 0 replies; 10+ messages in thread
From: tm_gccmail @ 2004-09-08 12:00 UTC (permalink / raw)
  To: Dale Johannesen
  Cc: law, Rakesh Kumar, Noida, Giovanni Bajo, gcc, Joseph S. Myers

On Wed, 1 Sep 2004, Dale Johannesen wrote:

> On Sep 1, 2004, at 9:49 AM, Jeffrey A Law wrote:
> > We've discussed various means of trying to tackle the problem of
> > sharing stack slots to mitigate the stack space problems.  If
> > someone is interested in tackling this problem, I'd suggest
> > they ping Richard Henderson who seemed to have the best handle
> > on a potential solution when we last discussed it as a group
> > a few months ago.
> 
> One thing to bear in mind is that it is *not* always beneficial to 
> reduce
> stack space as much as possible, because this inhibits scheduling.

Some CISCy targets don't have scheduling enabled, so on these targets it's
always beneficial.

Toshi


^ permalink raw reply	[flat|nested] 10+ messages in thread

end of thread, other threads:[~2004-09-08 12:00 UTC | newest]

Thread overview: 10+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2004-09-01  6:10 stack height reduction in tree-SSA Rakesh Kumar, Noida
2004-09-01 10:23 ` Giovanni Bajo
2004-09-01 10:40   ` Joseph S. Myers
2004-09-01 10:50     ` Giovanni Bajo
2004-09-01 16:49     ` Jeffrey A Law
2004-09-01 17:18       ` Dale Johannesen
2004-09-01 17:38         ` Jeffrey A Law
2004-09-08 12:00         ` tm_gccmail
2004-09-01 17:57 ` Mike Stump
2004-09-01 20:06   ` Richard Henderson

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).