public inbox for gcc@gcc.gnu.org
* stack slot reuse
@ 2010-05-21 19:44 Xinliang David Li
  2010-05-21 20:29 ` Richard Guenther
  2010-05-25 20:41 ` Easwaran Raman
  0 siblings, 2 replies; 12+ messages in thread
From: Xinliang David Li @ 2010-05-21 19:44 UTC
  To: Richard Guenther
  Cc: Steven Bosscher, Ian Lance Taylor, Vladimir Makarov, GCC Mailing List

On Fri, May 21, 2010 at 2:24 AM, Richard Guenther
<richard.guenther@gmail.com> wrote:
> On Thu, May 20, 2010 at 11:21 PM, Xinliang David Li <davidxl@google.com> wrote:
>> On Thu, May 20, 2010 at 2:18 PM, Steven Bosscher <stevenb.gcc@gmail.com> wrote:
>>> On Thu, May 20, 2010 at 11:14 PM, Xinliang David Li <davidxl@google.com> wrote:
>>>> stack variable overlay and stack slot assignments is here too.
>>>
>>> Yes, and for these I would like to add a separate timevar. Agree?
>>
>> Yes.  (By the way, we are rewriting this pass to eliminate the code
>> motion/aliasing problem -- but that is a different topic).
>
> Btw, we want to address the same problem by representing the
> points where (big) variables go out-of scope in the IL, also to
> help DSE.  The original idea was to simply drop in an aggregate
> assignment from an undefined value at the end of the scope
> during lowering, like
>
>  var = {undefined};
>

This looks like a very interesting approach.  Do you see any downsides
to it?  What is the problem with handling (nullifying) the dummy
statement in the expansion pass?

The approach we took is different: we move this overlay/packing
earlier (after ipa-inlining).  One of our other motivations is a
limitation in the current implementation that misses many overlay
opportunities (e.g. structs with union members cannot share slots),
but this is probably an independent issue.

Thanks,

David

> which we'd expand to nothing.  Of course shifting the problem to
> the RTL optimizers, so better expand to a similar RTL construct.
> But then are you addressing the similar problem on the RTL side?
>
> Richard.
>


* Re: stack slot reuse
  2010-05-21 19:44 stack slot reuse Xinliang David Li
@ 2010-05-21 20:29 ` Richard Guenther
  2010-05-22 20:29   ` Xinliang David Li
  2010-05-25 20:41 ` Easwaran Raman
  1 sibling, 1 reply; 12+ messages in thread
From: Richard Guenther @ 2010-05-21 20:29 UTC
  To: Xinliang David Li
  Cc: Steven Bosscher, Ian Lance Taylor, Vladimir Makarov, GCC Mailing List

On Fri, May 21, 2010 at 7:30 PM, Xinliang David Li <davidxl@google.com> wrote:
> On Fri, May 21, 2010 at 2:24 AM, Richard Guenther
> <richard.guenther@gmail.com> wrote:
>> On Thu, May 20, 2010 at 11:21 PM, Xinliang David Li <davidxl@google.com> wrote:
>>> On Thu, May 20, 2010 at 2:18 PM, Steven Bosscher <stevenb.gcc@gmail.com> wrote:
>>>> On Thu, May 20, 2010 at 11:14 PM, Xinliang David Li <davidxl@google.com> wrote:
>>>>> stack variable overlay and stack slot assignments is here too.
>>>>
>>>> Yes, and for these I would like to add a separate timevar. Agree?
>>>
>>> Yes.  (By the way, we are rewriting this pass to eliminate the code
>>> motion/aliasing problem -- but that is a different topic).
>>
>> Btw, we want to address the same problem by representing the
>> points where (big) variables go out-of scope in the IL, also to
>> help DSE.  The original idea was to simply drop in an aggregate
>> assignment from an undefined value at the end of the scope
>> during lowering, like
>>
>>  var = {undefined};
>>
>
> This looks like a very interesting approach.  Do you see any downside
> of this approach?  What is the problem of handling (nullifying) the
> dummy statement in expansion pass?

That is what I'd have done initially.  I could in theory see RTL
code motion optimizations move stuff in an invalid way after that
(but we try to avoid this by properly sharing TBAA compatible
slots only and fixing up points-to information as well).

So in the end it'll probably just work dropping the assignments
on the floor during expansion to RTL.

> The approach we took is different --- we move this overlay/packing
> earlier (after ipa-inlining). One of the other motivation for doing
> this is due to the limitation in current implementation that leaves
> out many overlaying opportunities (e.g. structs with union members can
> not share slots etc), but this is a probably independent issue.

Yes, one earlier idea would have unified stack slots at gimple lowering
time.  I'm not sure that after ipa-inlining is early enough (it
probably is, due to the lack of code motion optimizations).

With the extra assignments I was also hoping to help analysis phases
to note that for example in

  {
    int a[10];
    foo (a);
  }
  bar ();

a is not live over the call to bar as it can't validly escape out of
its scope.
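
A minimal sketch of where such a marker would sit, written as plain C
with the proposed assignment shown only as a comment. The
`a = {undefined}` form is the proposal from this thread, not existing
syntax; `foo`, `bar`, `sum` and `scoped_use` are made-up names for
illustration.

```c
#include <stddef.h>

/* Hypothetical helpers, for illustration only. */
static int sum;

static void foo(int *p)
{
    for (size_t i = 0; i < 10; i++) {
        p[i] = (int)i;
        sum += p[i];
    }
}

static void bar(void)
{
    sum += 1;
}

int scoped_use(void)
{
    sum = 0;
    {
        int a[10];
        foo(a);
        /* The proposed lowering would conceptually place
               a = {undefined};
           here, telling later passes that the contents of a are dead.  */
    }
    bar();   /* a is out of scope; a valid program cannot access it here */
    return sum;
}
```

With the marker in place, an analysis could conclude that nothing
stored into a needs to survive until the call to bar.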

Thanks,
Richard.

> Thanks,
>
> David
>
>> which we'd expand to nothing.  Of course shifting the problem to
>> the RTL optimizers, so better expand to a similar RTL construct.
>> But then are you addressing the similar problem on the RTL side?
>>
>> Richard.
>>
>


* Re: stack slot reuse
  2010-05-21 20:29 ` Richard Guenther
@ 2010-05-22 20:29   ` Xinliang David Li
  2010-05-22 20:57     ` Richard Guenther
  0 siblings, 1 reply; 12+ messages in thread
From: Xinliang David Li @ 2010-05-22 20:29 UTC
  To: Richard Guenther
  Cc: Steven Bosscher, Ian Lance Taylor, Vladimir Makarov, GCC Mailing List

On Fri, May 21, 2010 at 10:35 AM, Richard Guenther
<richard.guenther@gmail.com> wrote:
> On Fri, May 21, 2010 at 7:30 PM, Xinliang David Li <davidxl@google.com> wrote:
>> On Fri, May 21, 2010 at 2:24 AM, Richard Guenther
>> <richard.guenther@gmail.com> wrote:
>>> On Thu, May 20, 2010 at 11:21 PM, Xinliang David Li <davidxl@google.com> wrote:
>>>> On Thu, May 20, 2010 at 2:18 PM, Steven Bosscher <stevenb.gcc@gmail.com> wrote:
>>>>> On Thu, May 20, 2010 at 11:14 PM, Xinliang David Li <davidxl@google.com> wrote:
>>>>>> stack variable overlay and stack slot assignments is here too.
>>>>>
>>>>> Yes, and for these I would like to add a separate timevar. Agree?
>>>>
>>>> Yes.  (By the way, we are rewriting this pass to eliminate the code
>>>> motion/aliasing problem -- but that is a different topic).
>>>
>>> Btw, we want to address the same problem by representing the
>>> points where (big) variables go out-of scope in the IL, also to
>>> help DSE.  The original idea was to simply drop in an aggregate
>>> assignment from an undefined value at the end of the scope
>>> during lowering, like
>>>
>>>  var = {undefined};
>>>
>>
>> This looks like a very interesting approach.  Do you see any downside
>> of this approach?  What is the problem of handling (nullifying) the
>> dummy statement in expansion pass?
>
> That is what I'd have done initially.  I could in theory see RTL
> code motion optimizations move stuff in an invalid way after that
> (but we try to avoid this by properly sharing TBAA compatible
> slots only and fixing up points-to information as well).
>
> So in the end it'll probably just work dropping the assignments
> on the floor during expansion to RTL.
>
>> The approach we took is different --- we move this overlay/packing
>> earlier (after ipa-inlining). One of the other motivation for doing
>> this is due to the limitation in current implementation that leaves
>> out many overlaying opportunities (e.g. structs with union members can
>> not share slots etc), but this is a probably independent issue.
>
> Yes, one earlier idea would have unified stack slots at gimple lowering
> time.  I'm not sure that after ipa-inlining is early enough (probably
> it is due to the lack of code motion optimizations).

Yes -- doing it after inlining is important, as inlined calls are the
major source of stack sharing opportunities.

>
> With the extra assignments I was also hoping to help analysis phases
> to note that for example in
>
>  {
>    int a[10];
>    foo (a);
>   }
>   bar ();
>
> a is not live over the call to bar as it can't validly escape out of
> its scope.

Yes, this will help expose opportunities for dead store elimination,
because the dummy store is anticipated; such stores would otherwise be
missed by DCE due to the false use from bar. However, to make better
use of scope information for aliasing, flow-sensitive analysis is
needed. Consider the following case:

int *gp;
int foo(..)
{
  int local;

  local = 1;  // (1)
  *gp = ...    // (2)

  bar (&local); // (3)

   ..
}

(1) and (2) are not aliased, because the address of local has not yet
escaped at that point.  More generally, for loops: a local variable in
loop body scope does not live across iterations, so address escaping
does not 'propagate' via the backedge:


for (i = ...; i < ..; i++)
  {
    int local[100];

    for (j = ...)
      {
        local[j] = ... // (1)
        ... = *global_p; // (2)
      }
    bar (local);
  }

(1) and (2) are not aliased.
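
A compilable rendering of the loop case, as a sketch only. The bounds,
the `sink` variable, the body of `bar`, and the `loop_scope` wrapper
are assumptions added so that the fragment builds; they are not part
of the original example.

```c
int *global_p;            /* may point at anything the program exposed */
static int sink;

/* The address of the local array escapes only through this call. */
static void bar(int *p)
{
    sink += p[0];
}

int loop_scope(int outer, int *g)
{
    int total = 0;

    global_p = g;
    for (int i = 0; i < outer; i++) {
        int local[100];              /* a fresh object in every iteration */

        for (int j = 0; j < 100; j++) {
            local[j] = i + j;        /* (1) */
            total += *global_p;      /* (2) cannot refer to local: its
                                        address has not escaped yet in
                                        this iteration */
        }
        bar(local);                  /* the escape happens only here */
    }
    return total;
}
```

A flow-insensitive analysis sees the call to bar and treats local as
escaped everywhere in the function, including at (1) and (2).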


Thanks,

David


>
> Thanks,
> Richard.
>
>> Thanks,
>>
>> David
>>
>>> which we'd expand to nothing.  Of course shifting the problem to
>>> the RTL optimizers, so better expand to a similar RTL construct.
>>> But then are you addressing the similar problem on the RTL side?
>>>
>>> Richard.
>>>
>>
>


* Re: stack slot reuse
  2010-05-22 20:29   ` Xinliang David Li
@ 2010-05-22 20:57     ` Richard Guenther
  0 siblings, 0 replies; 12+ messages in thread
From: Richard Guenther @ 2010-05-22 20:57 UTC
  To: Xinliang David Li
  Cc: Steven Bosscher, Ian Lance Taylor, Vladimir Makarov, GCC Mailing List

On Fri, May 21, 2010 at 10:29 PM, Xinliang David Li <davidxl@google.com> wrote:
> On Fri, May 21, 2010 at 10:35 AM, Richard Guenther
> <richard.guenther@gmail.com> wrote:
>> On Fri, May 21, 2010 at 7:30 PM, Xinliang David Li <davidxl@google.com> wrote:
>>> On Fri, May 21, 2010 at 2:24 AM, Richard Guenther
>>> <richard.guenther@gmail.com> wrote:
>>>> On Thu, May 20, 2010 at 11:21 PM, Xinliang David Li <davidxl@google.com> wrote:
>>>>> On Thu, May 20, 2010 at 2:18 PM, Steven Bosscher <stevenb.gcc@gmail.com> wrote:
>>>>>> On Thu, May 20, 2010 at 11:14 PM, Xinliang David Li <davidxl@google.com> wrote:
>>>>>>> stack variable overlay and stack slot assignments is here too.
>>>>>>
>>>>>> Yes, and for these I would like to add a separate timevar. Agree?
>>>>>
>>>>> Yes.  (By the way, we are rewriting this pass to eliminate the code
>>>>> motion/aliasing problem -- but that is a different topic).
>>>>
>>>> Btw, we want to address the same problem by representing the
>>>> points where (big) variables go out-of scope in the IL, also to
>>>> help DSE.  The original idea was to simply drop in an aggregate
>>>> assignment from an undefined value at the end of the scope
>>>> during lowering, like
>>>>
>>>>  var = {undefined};
>>>>
>>>
>>> This looks like a very interesting approach.  Do you see any downside
>>> of this approach?  What is the problem of handling (nullifying) the
>>> dummy statement in expansion pass?
>>
>> That is what I'd have done initially.  I could in theory see RTL
>> code motion optimizations move stuff in an invalid way after that
>> (but we try to avoid this by properly sharing TBAA compatible
>> slots only and fixing up points-to information as well).
>>
>> So in the end it'll probably just work dropping the assignments
>> on the floor during expansion to RTL.
>>
>>> The approach we took is different --- we move this overlay/packing
>>> earlier (after ipa-inlining). One of the other motivation for doing
>>> this is due to the limitation in current implementation that leaves
>>> out many overlaying opportunities (e.g. structs with union members can
>>> not share slots etc), but this is a probably independent issue.
>>
>> Yes, one earlier idea would have unified stack slots at gimple lowering
>> time.  I'm not sure that after ipa-inlining is early enough (probably
>> it is due to the lack of code motion optimizations).
>
> Yes -- doing it after inlining is important as they are the major
> contributors of the stack sharing opportunities.
>
>>
>> With the extra assignments I was also hoping to help analysis phases
>> to note that for example in
>>
>>  {
>>    int a[10];
>>    foo (a);
>>   }
>>   bar ();
>>
>> a is not live over the call to bar as it can't validly escape out of
>> its scope.
>
> Yes, this will help exposing opportunities for dead store elimination
> due to the anticipation of the dummy store, which would otherwise
> missed by dce due to false use from bar. However, to make use of scope
> information better for aliasing,  flow sensitive analysis is needed --
> consider the following case:
>
> int *gp;
> int foo(..)
> {
>  int local;
>
>  local = 1;  // (1)
>  *gp = ...    // (2)
>
>  bar (&local); // (3)
>
>   ..
> }
>
> (1) and (2) is not aliased.  More generally for loop: local variable
> in loop body scope does not live across iterations, so address
> escaping does not 'propagate' via backedge:
>
>
> for ( i = ...; i  <..; i++)
>  {
>       int local[100];
>
>       for (j = ...)
>        {
>            local[j] = ... // (1)
>            ... = *global_p; // (2)
>        }
>      bar (local);
>  }
>
> (1) and (2) are not aliased.

Sure - but that's an orthogonal issue.  I merely hope for some DSE
opportunities.

Richard.


* Re: stack slot reuse
  2010-05-21 19:44 stack slot reuse Xinliang David Li
  2010-05-21 20:29 ` Richard Guenther
@ 2010-05-25 20:41 ` Easwaran Raman
  2010-05-26 10:57   ` Richard Guenther
  1 sibling, 1 reply; 12+ messages in thread
From: Easwaran Raman @ 2010-05-25 20:41 UTC
  To: Xinliang David Li
  Cc: Richard Guenther, Steven Bosscher, Ian Lance Taylor,
	Vladimir Makarov, GCC Mailing List

On Fri, May 21, 2010 at 10:30 AM, Xinliang David Li <davidxl@google.com> wrote:
>
> On Fri, May 21, 2010 at 2:24 AM, Richard Guenther
> <richard.guenther@gmail.com> wrote:
> > On Thu, May 20, 2010 at 11:21 PM, Xinliang David Li <davidxl@google.com> wrote:
> >> On Thu, May 20, 2010 at 2:18 PM, Steven Bosscher <stevenb.gcc@gmail.com> wrote:
> >>> On Thu, May 20, 2010 at 11:14 PM, Xinliang David Li <davidxl@google.com> wrote:
> >>>> stack variable overlay and stack slot assignments is here too.
> >>>
> >>> Yes, and for these I would like to add a separate timevar. Agree?
> >>
> >> Yes.  (By the way, we are rewriting this pass to eliminate the code
> >> motion/aliasing problem -- but that is a different topic).
> >
> > Btw, we want to address the same problem by representing the
> > points where (big) variables go out-of scope in the IL, also to
> > help DSE.  The original idea was to simply drop in an aggregate
> > assignment from an undefined value at the end of the scope
> > during lowering, like
> >
> >  var = {undefined};
> >
>

Is there something that prevents store sinking (or similar passes)
from moving this 'var = {undefined};' statement outside the scope? Or
should store sinking be taught to treat this as a barrier?

>
> This looks like a very interesting approach.  Do you see any downside
> of this approach?  What is the problem of handling (nullifying) the
> dummy statement in expansion pass?
>
> The approach we took is different --- we move this overlay/packing
> earlier (after ipa-inlining).

To elaborate further, we use the current stack-slot sharing heuristics
in cfgexpand.c to decide which variables can share stack slots,
synthesize union variables with those variables as fields, and replace
references to those variables with field references. We have an
initial implementation and are evaluating the performance impact of
making the sharing decisions early.
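
As a source-level illustration of the rewrite described above, here is
a hand-written sketch. The actual transformation works on the
compiler's intermediate representation; the names `before`, `after`,
`shared_slot`, `buf_a` and `buf_b` are hypothetical.

```c
#include <string.h>

/* Original form: two locals whose scopes do not overlap. */
int before(void)
{
    int result = 0;
    {
        char buf_a[64];
        memset(buf_a, 1, sizeof buf_a);
        result += buf_a[63];
    }
    {
        int buf_b[16];
        for (int i = 0; i < 16; i++)
            buf_b[i] = i;
        result += buf_b[15];
    }
    return result;
}

/* Hand-written equivalent of the rewrite: one synthesized union whose
   fields stand in for the original variables. */
union shared_slot {
    char buf_a[64];
    int  buf_b[16];
};

int after(void)
{
    int result = 0;
    union shared_slot slot;

    memset(slot.buf_a, 1, sizeof slot.buf_a);
    result += slot.buf_a[63];

    for (int i = 0; i < 16; i++)
        slot.buf_b[i] = i;
    result += slot.buf_b[15];
    return result;
}
```

Both functions compute the same value, but in the second one the
sharing decision is explicit in the code rather than left to stack
slot assignment at expansion time.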

Thanks,
Easwaran

>
> One of the other motivation for doing
> this is due to the limitation in current implementation that leaves
> out many overlaying opportunities (e.g. structs with union members can
> not share slots etc), but this is a probably independent issue.
>
>
> Thanks,
>
> David
>
> > which we'd expand to nothing.  Of course shifting the problem to
> > the RTL optimizers, so better expand to a similar RTL construct.
> > But then are you addressing the similar problem on the RTL side?
> >
> > Richard.
> >


* Re: stack slot reuse
  2010-05-25 20:41 ` Easwaran Raman
@ 2010-05-26 10:57   ` Richard Guenther
  2010-05-26 15:56     ` Xinliang David Li
  0 siblings, 1 reply; 12+ messages in thread
From: Richard Guenther @ 2010-05-26 10:57 UTC
  To: Easwaran Raman
  Cc: Xinliang David Li, Steven Bosscher, Ian Lance Taylor,
	Vladimir Makarov, GCC Mailing List

On Tue, May 25, 2010 at 10:02 PM, Easwaran Raman <eraman@google.com> wrote:
> On Fri, May 21, 2010 at 10:30 AM, Xinliang David Li <davidxl@google.com> wrote:
>>
>> On Fri, May 21, 2010 at 2:24 AM, Richard Guenther
>> <richard.guenther@gmail.com> wrote:
>> > On Thu, May 20, 2010 at 11:21 PM, Xinliang David Li <davidxl@google.com> wrote:
>> >> On Thu, May 20, 2010 at 2:18 PM, Steven Bosscher <stevenb.gcc@gmail.com> wrote:
>> >>> On Thu, May 20, 2010 at 11:14 PM, Xinliang David Li <davidxl@google.com> wrote:
>> >>>> stack variable overlay and stack slot assignments is here too.
>> >>>
>> >>> Yes, and for these I would like to add a separate timevar. Agree?
>> >>
>> >> Yes.  (By the way, we are rewriting this pass to eliminate the code
>> >> motion/aliasing problem -- but that is a different topic).
>> >
>> > Btw, we want to address the same problem by representing the
>> > points where (big) variables go out-of scope in the IL, also to
>> > help DSE.  The original idea was to simply drop in an aggregate
>> > assignment from an undefined value at the end of the scope
>> > during lowering, like
>> >
>> >  var = {undefined};
>> >
>>
>
> Is there something that prevents store sinking (or similar passes)
> from moving this 'var = {undefined};' statement outside the scope? Or
> should store sinking be taught to treat this as a barrier?

Not at the moment (if indeed that assignment looks like a regular one).
Passes should be taught that it's not worthwhile to sink a
no-op.  IIRC no pass currently would sink aggregate copies anyway.

>> This looks like a very interesting approach.  Do you see any downside
>> of this approach?  What is the problem of handling (nullifying) the
>> dummy statement in expansion pass?
>>
>> The approach we took is different --- we move this overlay/packing
>> earlier (after ipa-inlining).
>
> To elaborate further, we use the current stack-slot sharing heuristics
> in cfgexpand.c to decide what variables can share stack slots,
> synthesize union variables with those variables as fields and replace
> references to those variables with field references. We have an
> initial implementation and are evaluating the performance impact of
> making the sharing decisions early.

Note that using union variables will pessimize alias analysis
as we allow type-punning with unions.  How do you address
the issue of debug information?
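
A small sketch of the type-punning that GCC permits through unions,
which is why accesses to different union members have to be treated as
possibly touching the same storage. The names `pun`, `bits_via_union`
and `bits_via_memcpy` are made up; the example assumes float and
uint32_t have the same size.

```c
#include <stdint.h>
#include <string.h>

_Static_assert(sizeof(float) == sizeof(uint32_t),
               "sketch assumes a 32-bit float");

union pun {
    float    f;
    uint32_t u;
};

/* Store through one member, load through the other.  GCC documents
   this as allowed when the access goes through the union type. */
uint32_t bits_via_union(float x)
{
    union pun p;
    p.f = x;
    return p.u;
}

/* Reference version using memcpy, for comparison. */
uint32_t bits_via_memcpy(float x)
{
    uint32_t u;
    memcpy(&u, &x, sizeof u);
    return u;
}
```

Because the store to p.f and the load from p.u must be ordered, a
compiler cannot use the differing member types to disambiguate them,
and the same holds for fields of a synthesized union.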

Some time ago I had the very simple idea to merge identically
typed variables that do not have overlapping live ranges into
a single variable (avoiding the union issue).  That would not
catch all cases cfgexpand catches but may even re-use
common initializations.  Of course the debug information
issues would be the same.
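
A source-level sketch of that merging idea, with hypothetical names
(`before_merge`, `after_merge`, `w1`, `w2`, `w`). It assumes both
variables have the same type and initializer and are only read, which
is the case where a common initialization could be reused.

```c
/* Two identically typed locals with non-overlapping live ranges. */
int before_merge(int x)
{
    int r = 0;
    {
        int w1[4] = {1, 2, 3, 4};
        for (int i = 0; i < 4; i++)
            r += w1[i] * x;
    }
    {
        int w2[4] = {1, 2, 3, 4};
        for (int i = 0; i < 4; i++)
            r -= w2[i];
    }
    return r;
}

/* Merged by hand: one variable, one initialization, no union. */
int after_merge(int x)
{
    int r = 0;
    int w[4] = {1, 2, 3, 4};

    for (int i = 0; i < 4; i++)
        r += w[i] * x;
    for (int i = 0; i < 4; i++)
        r -= w[i];
    return r;
}
```

Since no union is involved, type-based alias analysis is unaffected,
at the cost of only covering variables of identical type.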

I think we want the clobbering stores anyway, for optimization
purposes, even if we do not end up using them for the
stack slot sharing problem.

Richard.


* Re: stack slot reuse
  2010-05-26 10:57   ` Richard Guenther
@ 2010-05-26 15:56     ` Xinliang David Li
  2010-05-26 16:11       ` Richard Guenther
  2010-05-26 17:32       ` Easwaran Raman
  0 siblings, 2 replies; 12+ messages in thread
From: Xinliang David Li @ 2010-05-26 15:56 UTC
  To: Richard Guenther
  Cc: Easwaran Raman, Steven Bosscher, Ian Lance Taylor,
	Vladimir Makarov, GCC Mailing List

On Wed, May 26, 2010 at 2:58 AM, Richard Guenther
<richard.guenther@gmail.com> wrote:
> On Tue, May 25, 2010 at 10:02 PM, Easwaran Raman <eraman@google.com> wrote:
>> On Fri, May 21, 2010 at 10:30 AM, Xinliang David Li <davidxl@google.com> wrote:
>>>
>>> On Fri, May 21, 2010 at 2:24 AM, Richard Guenther
>>> <richard.guenther@gmail.com> wrote:
>>> > On Thu, May 20, 2010 at 11:21 PM, Xinliang David Li <davidxl@google.com> wrote:
>>> >> On Thu, May 20, 2010 at 2:18 PM, Steven Bosscher <stevenb.gcc@gmail.com> wrote:
>>> >>> On Thu, May 20, 2010 at 11:14 PM, Xinliang David Li <davidxl@google.com> wrote:
>>> >>>> stack variable overlay and stack slot assignments is here too.
>>> >>>
>>> >>> Yes, and for these I would like to add a separate timevar. Agree?
>>> >>
>>> >> Yes.  (By the way, we are rewriting this pass to eliminate the code
>>> >> motion/aliasing problem -- but that is a different topic).
>>> >
>>> > Btw, we want to address the same problem by representing the
>>> > points where (big) variables go out-of scope in the IL, also to
>>> > help DSE.  The original idea was to simply drop in an aggregate
>>> > assignment from an undefined value at the end of the scope
>>> > during lowering, like
>>> >
>>> >  var = {undefined};
>>> >
>>>
>>
>> Is there something that prevents store sinking (or similar passes)
>> from moving this 'var = {undefined};' statement outside the scope? Or
>> should store sinking be taught to treat this as a barrier?
>
> Not at the moment (if indeed that assignment looks as a regular one).
> Passes should be taught that it's not worthwhile to sink a
> no-op.  IIRC no pass currently would sink aggregate copies anyway.

Other issues to consider: 1) how does it affect SRA decisions? 2) the
inline summary also needs to be taught not to include the size of those
fake instructions; 3) why only aggregates? Scalars that live on the
stack also need barriers if slot sharing picks them as candidates, etc.


>
>>> This looks like a very interesting approach.  Do you see any downside
>>> of this approach?  What is the problem of handling (nullifying) the
>>> dummy statement in expansion pass?
>>>
>>> The approach we took is different --- we move this overlay/packing
>>> earlier (after ipa-inlining).
>>
>> To elaborate further, we use the current stack-slot sharing heuristics
>> in cfgexpand.c to decide what variables can share stack slots,
>> synthesize union variables with those variables as fields and replace
>> references to those variables with field references. We have an
>> initial implementation and are evaluating the performance impact of
>> making the sharing decisions early.
>
> Note that using union variables will pessimize alias analysis
> as we allow type-punning with unions.  How do you address
> the issue of debug information?
>

Debug information is handled. Easwaran can fill in the details.


Thanks,

David

> Some time ago I had the very simple idea to merge identically
> typed variables that do not have overlapping life-ranges into
> a single variable (avoiding the union issue).  That would not
> catch all cases cfgexpand catches but may even re-use
> common initializations.  Of course the debug information
> issues would be the same.
>
> I think we want the clobbering stores anyway, for optimization
> purposes, even if we do not end up using them for the
> stack slot sharing problem.
>
> Richard.
>


* Re: stack slot reuse
  2010-05-26 15:56     ` Xinliang David Li
@ 2010-05-26 16:11       ` Richard Guenther
  2010-05-27 10:49         ` Richard Guenther
  2010-05-26 17:32       ` Easwaran Raman
  1 sibling, 1 reply; 12+ messages in thread
From: Richard Guenther @ 2010-05-26 16:11 UTC
  To: Xinliang David Li
  Cc: Easwaran Raman, Steven Bosscher, Ian Lance Taylor,
	Vladimir Makarov, GCC Mailing List

On Wed, May 26, 2010 at 5:42 PM, Xinliang David Li <davidxl@google.com> wrote:
> On Wed, May 26, 2010 at 2:58 AM, Richard Guenther
> <richard.guenther@gmail.com> wrote:
>> On Tue, May 25, 2010 at 10:02 PM, Easwaran Raman <eraman@google.com> wrote:
>>> On Fri, May 21, 2010 at 10:30 AM, Xinliang David Li <davidxl@google.com> wrote:
>>>>
>>>> On Fri, May 21, 2010 at 2:24 AM, Richard Guenther
>>>> <richard.guenther@gmail.com> wrote:
>>>> > On Thu, May 20, 2010 at 11:21 PM, Xinliang David Li <davidxl@google.com> wrote:
>>>> >> On Thu, May 20, 2010 at 2:18 PM, Steven Bosscher <stevenb.gcc@gmail.com> wrote:
>>>> >>> On Thu, May 20, 2010 at 11:14 PM, Xinliang David Li <davidxl@google.com> wrote:
>>>> >>>> stack variable overlay and stack slot assignments is here too.
>>>> >>>
>>>> >>> Yes, and for these I would like to add a separate timevar. Agree?
>>>> >>
>>>> >> Yes.  (By the way, we are rewriting this pass to eliminate the code
>>>> >> motion/aliasing problem -- but that is a different topic).
>>>> >
>>>> > Btw, we want to address the same problem by representing the
>>>> > points where (big) variables go out-of scope in the IL, also to
>>>> > help DSE.  The original idea was to simply drop in an aggregate
>>>> > assignment from an undefined value at the end of the scope
>>>> > during lowering, like
>>>> >
>>>> >  var = {undefined};
>>>> >
>>>>
>>>
>>> Is there something that prevents store sinking (or similar passes)
>>> from moving this 'var = {undefined};' statement outside the scope? Or
>>> should store sinking be taught to treat this as a barrier?
>>
>> Not at the moment (if indeed that assignment looks as a regular one).
>> Passes should be taught that it's not worthwhile to sink a
>> no-op.  IIRC no pass currently would sink aggregate copies anyway.
>
> Other issues to consider: 1) how does it affect SRA decisions?

It shouldn't.  But SRA needs to be adjusted for sure.

> 2) inline summary also needs to be taught to not include size of those
> fake instructions;

That's simple.  The inliner also needs to be taught to emit the
fake assignments into the caller.

> 3) why only aggregates? For scalars that live in
> stack, they also need barriers if slot sharing pick them as
> candidates, etc.

Sure.

Richard.


* Re: stack slot reuse
  2010-05-26 15:56     ` Xinliang David Li
  2010-05-26 16:11       ` Richard Guenther
@ 2010-05-26 17:32       ` Easwaran Raman
  1 sibling, 0 replies; 12+ messages in thread
From: Easwaran Raman @ 2010-05-26 17:32 UTC
  To: Xinliang David Li
  Cc: Richard Guenther, Steven Bosscher, Ian Lance Taylor,
	Vladimir Makarov, GCC Mailing List

On Wed, May 26, 2010 at 8:42 AM, Xinliang David Li <davidxl@google.com> wrote:
> On Wed, May 26, 2010 at 2:58 AM, Richard Guenther
> <richard.guenther@gmail.com> wrote:
>> On Tue, May 25, 2010 at 10:02 PM, Easwaran Raman <eraman@google.com> wrote:
>>> On Fri, May 21, 2010 at 10:30 AM, Xinliang David Li <davidxl@google.com> wrote:
>>>>
>>>> On Fri, May 21, 2010 at 2:24 AM, Richard Guenther
>>>> <richard.guenther@gmail.com> wrote:
>>>> > On Thu, May 20, 2010 at 11:21 PM, Xinliang David Li <davidxl@google.com> wrote:
>>>> >> On Thu, May 20, 2010 at 2:18 PM, Steven Bosscher <stevenb.gcc@gmail.com> wrote:
>>>> >>> On Thu, May 20, 2010 at 11:14 PM, Xinliang David Li <davidxl@google.com> wrote:
>>>> >>>> stack variable overlay and stack slot assignments is here too.
>>>> >>>
>>>> >>> Yes, and for these I would like to add a separate timevar. Agree?
>>>> >>
>>>> >> Yes.  (By the way, we are rewriting this pass to eliminate the code
>>>> >> motion/aliasing problem -- but that is a different topic).
>>>> >
>>>> > Btw, we want to address the same problem by representing the
>>>> > points where (big) variables go out-of scope in the IL, also to
>>>> > help DSE.  The original idea was to simply drop in an aggregate
>>>> > assignment from an undefined value at the end of the scope
>>>> > during lowering, like
>>>> >
>>>> >  var = {undefined};
>>>> >
>>>>
>>>
>>> Is there something that prevents store sinking (or similar passes)
>>> from moving this 'var = {undefined};' statement outside the scope? Or
>>> should store sinking be taught to treat this as a barrier?
>>
>> Not at the moment (if indeed that assignment looks as a regular one).
>> Passes should be taught that it's not worthwhile to sink a
>> no-op.  IIRC no pass currently would sink aggregate copies anyway.
>
> Other issues to consider: 1) how does it affect SRA decisions? 2)
> inline summary also needs to be taught to not include size of those
> fake instructions; 3) why only aggregates? For scalars that live in
> stack, they also need barriers if slot sharing pick them as
> candidates, etc.
>
>
>>
>>>> This looks like a very interesting approach.  Do you see any downside
>>>> of this approach?  What is the problem of handling (nullifying) the
>>>> dummy statement in expansion pass?
>>>>
>>>> The approach we took is different --- we move this overlay/packing
>>>> earlier (after ipa-inlining).
>>>
>>> To elaborate further, we use the current stack-slot sharing heuristics
>>> in cfgexpand.c to decide what variables can share stack slots,
>>> synthesize union variables with those variables as fields and replace
>>> references to those variables with field references. We have an
>>> initial implementation and are evaluating the performance impact of
>>> making the sharing decisions early.
>>
>> Note that using union variables will pessimize alias analysis
>> as we allow type-punning with unions.  How do you address
>> the issue of debug information?
>>
>
> Debug information is handled. Easwaran can fill in the details.

We retain the original VAR_DECLs even after synthesizing the union
variables and keep a map from each synthesized union variable to the
set of original variables it replaces. In cfgexpand.c, when RTL is
generated for the synthesized variable, we set that RTL on all of the
original variables using SET_DECL_RTL instead of setting it on the
synthesized union variable.


thanks,
Easwaran

>
> Thanks,
>
> David
>
>> Some time ago I had the very simple idea to merge identically
>> typed variables that do not have overlapping life-ranges into
>> a single variable (avoiding the union issue).  That would not
>> catch all cases cfgexpand catches but may even re-use
>> common initializations.  Of course the debug information
>> issues would be the same.
>>
>> I think we want the clobbering stores anyway, for optimization
>> purposes, even if we do not end up using them for the
>> stack slot sharing problem.
>>
>> Richard.
>>
>


* Re: stack slot reuse
  2010-05-26 16:11       ` Richard Guenther
@ 2010-05-27 10:49         ` Richard Guenther
  2010-05-27 11:43           ` Martin Jambor
  2010-05-27 18:36           ` Xinliang David Li
  0 siblings, 2 replies; 12+ messages in thread
From: Richard Guenther @ 2010-05-27 10:49 UTC
  To: Xinliang David Li
  Cc: Easwaran Raman, Steven Bosscher, Ian Lance Taylor,
	Vladimir Makarov, GCC Mailing List

On Wed, May 26, 2010 at 6:05 PM, Richard Guenther
<richard.guenther@gmail.com> wrote:
> On Wed, May 26, 2010 at 5:42 PM, Xinliang David Li <davidxl@google.com> wrote:
>> On Wed, May 26, 2010 at 2:58 AM, Richard Guenther
>> <richard.guenther@gmail.com> wrote:
>>> On Tue, May 25, 2010 at 10:02 PM, Easwaran Raman <eraman@google.com> wrote:
>>>> On Fri, May 21, 2010 at 10:30 AM, Xinliang David Li <davidxl@google.com> wrote:
>>>>>
>>>>> On Fri, May 21, 2010 at 2:24 AM, Richard Guenther
>>>>> <richard.guenther@gmail.com> wrote:
>>>>> > On Thu, May 20, 2010 at 11:21 PM, Xinliang David Li <davidxl@google.com> wrote:
>>>>> >> On Thu, May 20, 2010 at 2:18 PM, Steven Bosscher <stevenb.gcc@gmail.com> wrote:
>>>>> >>> On Thu, May 20, 2010 at 11:14 PM, Xinliang David Li <davidxl@google.com> wrote:
>>>>> >>>> stack variable overlay and stack slot assignments is here too.
>>>>> >>>
>>>>> >>> Yes, and for these I would like to add a separate timevar. Agree?
>>>>> >>
>>>>> >> Yes.  (By the way, we are rewriting this pass to eliminate the code
>>>>> >> motion/aliasing problem -- but that is a different topic).
>>>>> >
>>>>> > Btw, we want to address the same problem by representing the
>>>>> > points where (big) variables go out-of scope in the IL, also to
>>>>> > help DSE.  The original idea was to simply drop in an aggregate
>>>>> > assignment from an undefined value at the end of the scope
>>>>> > during lowering, like
>>>>> >
>>>>> >  var = {undefined};
>>>>> >
>>>>>
>>>>
>>>> Is there something that prevents store sinking (or similar passes)
>>>> from moving this 'var = {undefined};' statement outside the scope? Or
>>>> should store sinking be taught to treat this as a barrier?
>>>
>>> Not at the moment (if indeed that assignment looks as a regular one).
>>> Passes should be taught that it's not worthwhile to sink a
>>> no-op.  IIRC no pass currently would sink aggregate copies anyway.
>>
>> Other issues to consider: 1) how does it affect SRA decisions?
>
> It shouldn't.  But SRA needs to be adjusted for sure.

Btw, globbing shared vars into a union will certainly also affect SRA,
no?
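
To make the union question concrete, here is a minimal source-level
sketch of what "globbing shared vars into a union" could look like.
It is only an illustration under assumptions: the function names, the
array sizes and the hand-written union are hypothetical, and it does
not show what any GCC pass actually emits.

```c
#include <assert.h>

/* Hypothetical example; names and sizes are illustrative only. */

/* Two block-scoped arrays with disjoint lifetimes: candidates for sharing. */
int sum_separate(void)
{
    int total = 0;
    {
        int a[4] = {1, 2, 3, 4};
        for (int i = 0; i < 4; i++)
            total += a[i];
    }
    {
        int b[4] = {10, 20, 30, 40};
        for (int i = 0; i < 4; i++)
            total += b[i];
    }
    return total;
}

/* Same computation with the two arrays hand-merged into one union. */
int sum_shared(void)
{
    union { int a[4]; int b[4]; } slot;
    int total = 0;

    /* Both members have the same size, so the union needs one slot only. */
    assert(sizeof slot == sizeof slot.a);

    for (int i = 0; i < 4; i++)
        slot.a[i] = i + 1;            /* a = {1, 2, 3, 4} */
    for (int i = 0; i < 4; i++)
        total += slot.a[i];

    /* a is dead from here on; b reuses the same storage. */
    for (int i = 0; i < 4; i++)
        slot.b[i] = (i + 1) * 10;     /* b = {10, 20, 30, 40} */
    for (int i = 0; i < 4; i++)
        total += slot.b[i];

    return total;
}
```

Both functions compute the same value.  In the second form the later
passes see one aggregate with overlapping members instead of two
independent variables, which is the shape SRA would have to deal with.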

Richard.

>> 2) inline summary also needs to be taught to not include size of those
>> fake instructions;
>
> That's simple.  The inliner also needs to be taught to emit the
> fake assignments into the caller.
>
>> 3) why only aggregates? For scalars that live in
>> stack, they also need barriers if slot sharing pick them as
>> candidates, etc.
>
> Sure.
>
> Richard.
>


* Re: stack slot reuse
  2010-05-27 10:49         ` Richard Guenther
@ 2010-05-27 11:43           ` Martin Jambor
  2010-05-27 18:36           ` Xinliang David Li
  1 sibling, 0 replies; 12+ messages in thread
From: Martin Jambor @ 2010-05-27 11:43 UTC (permalink / raw)
  To: Richard Guenther
  Cc: Xinliang David Li, Easwaran Raman, Steven Bosscher,
	Ian Lance Taylor, Vladimir Makarov, GCC Mailing List

Hi,

I have not really paid much attention to this thread, but...

On Thu, May 27, 2010 at 11:38:09AM +0200, Richard Guenther wrote:
> On Wed, May 26, 2010 at 6:05 PM, Richard Guenther
> <richard.guenther@gmail.com> wrote:
> > On Wed, May 26, 2010 at 5:42 PM, Xinliang David Li <davidxl@google.com> wrote:
> >> On Wed, May 26, 2010 at 2:58 AM, Richard Guenther
> >> <richard.guenther@gmail.com> wrote:
> >>> On Tue, May 25, 2010 at 10:02 PM, Easwaran Raman <eraman@google.com> wrote:
> >>>> On Fri, May 21, 2010 at 10:30 AM, Xinliang David Li <davidxl@google.com> wrote:
> >>>>>
> >>>>> On Fri, May 21, 2010 at 2:24 AM, Richard Guenther
> >>>>> <richard.guenther@gmail.com> wrote:
> >>>>> > On Thu, May 20, 2010 at 11:21 PM, Xinliang David Li <davidxl@google.com> wrote:
> >>>>> >> On Thu, May 20, 2010 at 2:18 PM, Steven Bosscher <stevenb.gcc@gmail.com> wrote:
> >>>>> >>> On Thu, May 20, 2010 at 11:14 PM, Xinliang David Li <davidxl@google.com> wrote:
> >>>>> >>>> stack variable overlay and stack slot assignments is here too.
> >>>>> >>>
> >>>>> >>> Yes, and for these I would like to add a separate timevar. Agree?
> >>>>> >>
> >>>>> >> Yes.  (By the way, we are rewriting this pass to eliminate the code
> >>>>> >> motion/aliasing problem -- but that is a different topic).
> >>>>> >
> >>>>> > Btw, we want to address the same problem by representing the
> >>>>> > points where (big) variables go out-of scope in the IL, also to
> >>>>> > help DSE.  The original idea was to simply drop in an aggregate
> >>>>> > assignment from an undefined value at the end of the scope
> >>>>> > during lowering, like
> >>>>> >
> >>>>> >  var = {undefined};
> >>>>> >
> >>>>>
> >>>>
> >>>> Is there something that prevents store sinking (or similar passes)
> >>>> from moving this 'var = {undefined};' statement outside the scope? Or
> >>>> should store sinking be taught to treat this as a barrier?
> >>>
> >>> Not at the moment (if indeed that assignment looks as a regular one).
> >>> Passes should be taught that it's not worthwhile to sink a
> >>> no-op.  IIRC no pass currently would sink aggregate copies anyway.
> >>
> >> Other issues to consider: 1) how does it affect SRA decisions?
> >
> > It shouldn't.  But SRA needs to be adjusted for sure.
> 
> Btw, globbing shared vars into a union will certainly also affect SRA,
> no?

If the variables have different sizes, only the smallest ones might
get a scalar replacement.

If the smallest variables have different types, you'll get a lot of
VIEW_CONVERT_EXPRs for all but one (more-or-less randomly) chosen
type.

But I guess you were primarily discussing variables that need to
reside in memory so it may not be such a big concern, after all.
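
A rough source-level analogy of the two points above, as a sketch
only.  The union layout and all names are hypothetical, and the
memcpy-based reinterpretation merely stands in for the kind of type
punning a VIEW_CONVERT_EXPR expresses in the IL; it is not what SRA
generates.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical merged variable: two small members of equal size but
   different type, plus one larger member sharing the same storage. */
union shared_slot {
    unsigned int u;
    float f;
    char big[64];
};

static_assert(sizeof(float) == sizeof(unsigned int),
              "sketch assumes float and unsigned int have equal size");

/* Reinterpret the bits of one type as the other. */
static unsigned int as_bits(float f)
{
    unsigned int bits;
    memcpy(&bits, &f, sizeof bits);
    return bits;
}

static float as_float(unsigned int bits)
{
    float f;
    memcpy(&f, &bits, sizeof f);
    return f;
}

/* If the scalar replacement is typed unsigned int, every use of the
   float member has to convert on the way in and on the way out. */
float scale_via_unsigned(float x)
{
    unsigned int repl = as_bits(x);     /* replacement holds float bits */
    float v = as_float(repl) * 2.0f;    /* convert, compute */
    repl = as_bits(v);                  /* convert back for the store */
    return as_float(repl);
}
```

Only the small members (u and f here) fit the chosen scalar; the
64-byte member does not, which mirrors the size remark above.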

Martin


* Re: stack slot reuse
  2010-05-27 10:49         ` Richard Guenther
  2010-05-27 11:43           ` Martin Jambor
@ 2010-05-27 18:36           ` Xinliang David Li
  1 sibling, 0 replies; 12+ messages in thread
From: Xinliang David Li @ 2010-05-27 18:36 UTC (permalink / raw)
  To: Richard Guenther
  Cc: Easwaran Raman, Steven Bosscher, Ian Lance Taylor,
	Vladimir Makarov, GCC Mailing List

On Thu, May 27, 2010 at 2:38 AM, Richard Guenther
<richard.guenther@gmail.com> wrote:
> On Wed, May 26, 2010 at 6:05 PM, Richard Guenther
> <richard.guenther@gmail.com> wrote:
>> On Wed, May 26, 2010 at 5:42 PM, Xinliang David Li <davidxl@google.com> wrote:
>>> On Wed, May 26, 2010 at 2:58 AM, Richard Guenther
>>> <richard.guenther@gmail.com> wrote:
>>>> On Tue, May 25, 2010 at 10:02 PM, Easwaran Raman <eraman@google.com> wrote:
>>>>> On Fri, May 21, 2010 at 10:30 AM, Xinliang David Li <davidxl@google.com> wrote:
>>>>>>
>>>>>> On Fri, May 21, 2010 at 2:24 AM, Richard Guenther
>>>>>> <richard.guenther@gmail.com> wrote:
>>>>>> > On Thu, May 20, 2010 at 11:21 PM, Xinliang David Li <davidxl@google.com> wrote:
>>>>>> >> On Thu, May 20, 2010 at 2:18 PM, Steven Bosscher <stevenb.gcc@gmail.com> wrote:
>>>>>> >>> On Thu, May 20, 2010 at 11:14 PM, Xinliang David Li <davidxl@google.com> wrote:
>>>>>> >>>> stack variable overlay and stack slot assignments is here too.
>>>>>> >>>
>>>>>> >>> Yes, and for these I would like to add a separate timevar. Agree?
>>>>>> >>
>>>>>> >> Yes.  (By the way, we are rewriting this pass to eliminate the code
>>>>>> >> motion/aliasing problem -- but that is a different topic).
>>>>>> >
>>>>>> > Btw, we want to address the same problem by representing the
>>>>>> > points where (big) variables go out-of scope in the IL, also to
>>>>>> > help DSE.  The original idea was to simply drop in an aggregate
>>>>>> > assignment from an undefined value at the end of the scope
>>>>>> > during lowering, like
>>>>>> >
>>>>>> >  var = {undefined};
>>>>>> >
>>>>>>
>>>>>
>>>>> Is there something that prevents store sinking (or similar passes)
>>>>> from moving this 'var = {undefined};' statement outside the scope? Or
>>>>> should store sinking be taught to treat this as a barrier?
>>>>
>>>> Not at the moment (if indeed that assignment looks as a regular one).
>>>> Passes should be taught that it's not worthwhile to sink a
>>>> no-op.  IIRC no pass currently would sink aggregate copies anyway.
>>>
>>> Other issues to consider: 1) how does it affect SRA decisions?
>>
>> It shouldn't.  But SRA needs to be adjusted for sure.
>
> Btw, globbing shared vars into a union will certainly also affect SRA,
> no?
>

It certainly will affect it.

David


> Richard.
>
>>> 2) inline summary also needs to be taught to not include size of those
>>> fake instructions;
>>
>> That's simple.  The inliner also needs to be taught to emit the
>> fake assignments into the caller.
>>
>>> 3) why only aggregates? For scalars that live in
>>> stack, they also need barriers if slot sharing pick them as
>>> candidates, etc.
>>
>> Sure.
>>
>> Richard.
>>
>


end of thread

Thread overview: 12+ messages
2010-05-21 19:44 stack slot reuse Xinliang David Li
2010-05-21 20:29 ` Richard Guenther
2010-05-22 20:29   ` Xinliang David Li
2010-05-22 20:57     ` Richard Guenther
2010-05-25 20:41 ` Easwaran Raman
2010-05-26 10:57   ` Richard Guenther
2010-05-26 15:56     ` Xinliang David Li
2010-05-26 16:11       ` Richard Guenther
2010-05-27 10:49         ` Richard Guenther
2010-05-27 11:43           ` Martin Jambor
2010-05-27 18:36           ` Xinliang David Li
2010-05-26 17:32       ` Easwaran Raman
