Re: MEM flags and stack temps

public inbox for gcc@gcc.gnu.org

* Re: MEM flags and stack temps
@ 2000-11-16  3:25 Jan Hubicka
  0 siblings, 0 replies; 10+ messages in thread
From: Jan Hubicka @ 2000-11-16  3:25 UTC (permalink / raw)
  To: mark, law, kenner, gcc

> 
>   - Treat stack slots like registers, and allocate them in a 
>     "stack allocator".  In other words, have (MEM (STACK_SLOT x))
>     for a while, and then resolve them to hard slots late
>     in the game.
> 
I've done some experiments in this area.  My approach was to always use
ADDRESSOF to generate stack slots in an early pass, so I instantly got
the ability to eliminate dead memory allocations entirely, and the
ADDRESSOF pass can definitely be tweaked to be smarter about stack
frame allocations.

To get real benefits from such code we need a safe way to eliminate dead
stores.  Current gcc does this partly, because of Richard's patch, and
with my hack it seemed to work well (producing shorter code).

Before Kenner's patch, the hack was enlarging code size slightly, I believe
because of non-functional rtx-cost calculations.  After the patch, the
code was a slight win.

So perhaps ADDRESSOF is an easy way to go.  BTW, why does it require a pointer
to the tree representation of the type?  It would be easier if it just contained
the size field and an identifier, at least from the RTL point of view.

Honza

* Re: MEM flags and stack temps
@ 2000-11-16 14:25 Richard Kenner
  0 siblings, 0 replies; 10+ messages in thread
From: Richard Kenner @ 2000-11-16 14:25 UTC (permalink / raw)
  To: jbuck; +Cc: gcc

    This is possibly naive, since I'm not familiar enough with the
    relevant back end structures.  But what if the offset calculation
    showed up as its own RTL object, computing an address-register-like
    pseudo which is in turn "read" by the addressof(stack slot) construct?
    It would then be cse'd, and if all of the addressof's for that stack
    slot are removed, these offsets would then be dead, and eliminated by
    dce.

But you can't know if the offset is large enough.  Most of them won't be.

* Re: MEM flags and stack temps
  2000-11-16 14:04 Richard Kenner
@ 2000-11-16 14:24 ` Joe Buck
  0 siblings, 0 replies; 10+ messages in thread
From: Joe Buck @ 2000-11-16 14:24 UTC (permalink / raw)
  To: Richard Kenner; +Cc: gcc

>     You seem to be thinking of a completely different problem.  Is this an
>     issue with the ordering of the passes?
> 
> Yes.  The idea is that you want to keep stack slots as RTL objects as
> long as possible to do optimizations on *them*.  But when you actually find
> you *do* need them, you have to address them.  If the offset to address them
> is large enough, that offset calculation itself would need to be cse'd
> and moved out of loops.

This is possibly naive, since I'm not familiar enough with the relevant
back end structures.  But what if the offset calculation showed up as
its own RTL object, computing an address-register-like pseudo which is in
turn "read" by the addressof(stack slot) construct?  It would then be
cse'd, and if all of the addressof's for that stack slot are removed,
these offsets would then be dead, and eliminated by dce.

The idea is that they'd be cse'd and moved out of loops on the assumption
that they are needed, and if it turns out that they aren't then they
will be killed.

* Re: MEM flags and stack temps
@ 2000-11-16 14:04 Richard Kenner
  2000-11-16 14:24 ` Joe Buck
  0 siblings, 1 reply; 10+ messages in thread
From: Richard Kenner @ 2000-11-16 14:04 UTC (permalink / raw)
  To: jbuck; +Cc: gcc

    You seem to be thinking of a completely different problem.  Is this an
    issue with the ordering of the passes?

Yes.  The idea is that you want to keep stack slots as RTL objects as
long as possible to do optimizations on *them*.  But when you actually find
you *do* need them, you have to address them.  If the offset to address them
is large enough, that offset calculation itself would need to be cse'd
and moved out of loops.
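
A minimal sketch of the concern above, written as plain C rather than RTL.
It assumes a hypothetical target whose load instructions cannot encode a
displacement as large as SLOT_OFFSET, so reaching the slot takes a separate
address calculation.  The names FRAME_SIZE, SLOT_OFFSET, sum_recompute and
sum_hoisted are illustrative only and do not come from GCC.

```c
#include <assert.h>

/* Hypothetical frame layout: the slot sits too far from the frame base
   to be reached with a single short displacement. */
enum { FRAME_SIZE = 100000, SLOT_OFFSET = 70000 };

/* The slot address is formed inside the loop, as it would be if the
   offset were only introduced after the loop optimizer has run. */
long sum_recompute(const unsigned char *frame, int n)
{
    long total = 0;
    assert(n >= 0);
    for (int i = 0; i < n; i++) {
        const unsigned char *slot = frame + SLOT_OFFSET; /* every iteration */
        total += *slot;
    }
    return total;
}

/* The same loop with the loop-invariant address calculation moved out,
   which is only possible if the calculation is visible early enough. */
long sum_hoisted(const unsigned char *frame, int n)
{
    const unsigned char *slot = frame + SLOT_OFFSET;     /* computed once */
    long total = 0;
    assert(n >= 0);
    for (int i = 0; i < n; i++)
        total += *slot;
    return total;
}
```

Both functions return the same value; the difference is only where the
address arithmetic happens, which is the pass-ordering question raised here.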

* Re: MEM flags and stack temps
  2000-11-16  3:27 Richard Kenner
@ 2000-11-16 14:01 ` Joe Buck
  0 siblings, 0 replies; 10+ messages in thread
From: Joe Buck @ 2000-11-16 14:01 UTC (permalink / raw)
  To: Richard Kenner; +Cc: gcc

>     Yes!  With the second approach, it will then be straightforward to get
>     rid of the huge number of dead stores we get on C++ code (currently
>     once an object is assigned to the stack and we write it, the compiler
>     is incapable of seeing that the value is not needed).  All (ok, big
>     "all") we need to do is extend dead code elimination to work on stack
>     slots where addresses are not taken (currently the dead code
>     eliminator in dce.c believes that all writes to memory are necessary
>     under all conditions).
> 
> The problem is timing: you also want to be able to do CSE and loop
> optimizations on the addresses of temporaries that aren't eliminated if they
> are outside the addressable range of the machine.

Clearly I'm missing something.  In the cases I'm talking about that seem
to be the main cause of poor C++ performance, after inlining there are no
addresses taken at all: objects are passed by reference to an inline
function and then dereferenced, but if we have ADDRESSOF(STACK_SLOT x)
and then dereference it the ADDRESSOF just cancels out.  So stack slots
work like another type of register.  Maybe even using a term like stack
slot is misleading, as they work like pseudo registers.

If we have an object whose address is taken we have to worry about
aliasing so dce can't kill stores.

You seem to be thinking of a completely different problem.  Is this an
issue with the ordering of the passes?

* Re: MEM flags and stack temps
@ 2000-11-16  3:27 Richard Kenner
  2000-11-16 14:01 ` Joe Buck
  0 siblings, 1 reply; 10+ messages in thread
From: Richard Kenner @ 2000-11-16  3:27 UTC (permalink / raw)
  To: jbuck; +Cc: gcc

    Yes!  With the second approach, it will then be straightforward to get
    rid of the huge number of dead stores we get on C++ code (currently
    once an object is assigned to the stack and we write it, the compiler
    is incapable of seeing that the value is not needed).  All (ok, big
    "all") we need to do is extend dead code elimination to work on stack
    slots where addresses are not taken (currently the dead code
    eliminator in dce.c believes that all writes to memory are necessary
    under all conditions).

The problem is timing: you also want to be able to do CSE and loop
optimizations on the addresses of temporaries that aren't eliminated if they
are outside the addressable range of the machine.

* Re: MEM flags and stack temps
  2000-11-15  8:47   ` Mark Mitchell
@ 2000-11-15 17:00     ` Joe Buck
  0 siblings, 0 replies; 10+ messages in thread
From: Joe Buck @ 2000-11-15 17:00 UTC (permalink / raw)
  To: Mark Mitchell; +Cc: law, kenner, gcc

Mark Mitchell writes:

[ #1 deleted ]

#2:
>   - Treat stack slots like registers, and allocate them in a 
>     "stack allocator".  In other words, have (MEM (STACK_SLOT x))
>     for a while, and then resolve them to hard slots late
>     in the game.
> 
> I actually think both approaches are good ideas.  The second is
> appealing in that it helps blur the boundary between things that fit
> in a machine register, and things that don't.

Yes!  With the second approach, it will then be straightforward to get rid of
the huge number of dead stores we get on C++ code (currently once an
object is assigned to the stack and we write it, the compiler is incapable
of seeing that the value is not needed).  All (ok, big "all") we need to
do is extend dead code elimination to work on stack slots where addresses
are not taken (currently the dead code eliminator in dce.c believes that
all writes to memory are necessary under all conditions).
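
A minimal sketch of the kind of dead store described above, written in C
with a pointer parameter standing in for a C++ reference.  All names here
(struct point, get_x, with_temporary, after_dse) are hypothetical, and
after_dse shows by hand what the code could reduce to; it is not a claim
about what any particular GCC version produces.

```c
#include <assert.h>

struct point { int x; int y; };

/* Small accessor taking its object by pointer, like an inlined C++
   member function taking `this' or a reference. */
static inline int get_x(const struct point *p)
{
    return p->x;
}

int with_temporary(int v)
{
    struct point tmp;    /* an object assigned to a stack slot */
    tmp.x = v;
    tmp.y = v + 1;       /* never read again: a dead store */
    return get_x(&tmp);  /* once inlined, &tmp and the dereference cancel */
}

/* Hand-written result once tmp is treated like a pseudo register and
   the store to tmp.y is removed. */
int after_dse(int v)
{
    return v;
}

/* Both forms must compute the same value. */
void check_same(int v)
{
    assert(with_temporary(v) == after_dse(v));
}
```

After inlining, the address of tmp is never really needed, so nothing can
alias it and the store to tmp.y is removable in principle.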

* Re: MEM flags and stack temps
  2000-11-15  8:37 ` law
@ 2000-11-15  8:47   ` Mark Mitchell
  2000-11-15 17:00     ` Joe Buck
  0 siblings, 1 reply; 10+ messages in thread
From: Mark Mitchell @ 2000-11-15  8:47 UTC (permalink / raw)
  To: law; +Cc: kenner, gcc

>>>>> "law" == law  <law@redhat.com> writes:

    >> Any thoughts here?

    law> I don't think it's a major problem.  The only area where
    law> (IMHO) it causes problems is the ability to share space for
    law> large arrays in different (disjoint) scopes.

We went around on this a little bit once before.  I remember a couple
of reasonable approaches:

  - Move some of the sharing code into the front-ends/AST->RTL
    conversion machinery.
      
    In other words, just reuse the same DECLs in some cases.

  - Treat stack slots like registers, and allocate them in a 
    "stack allocator".  In other words, have (MEM (STACK_SLOT x))
    for a while, and then resolve them to hard slots late
    in the game.

I actually think both approaches are good ideas.  The second is
appealing in that it helps blur the boundary between things that fit
in a machine register, and things that don't.  In the long run, we
want to do CSE, etc., on structures too.  If I write:

  struct S a, b, c;

  a.x = 3;
  a.y = 4;
  b = a;
  c = b;

It would be great if the compiler knew that `c' was a struct whose `x'
field was `3' and whose `y' field was `4'.
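
The fragment above can be made into a complete example as follows.  This is
only a sketch: the definition of struct S (two int fields) and the function
names are assumptions, and sum_folded is written by hand to show what
tracking field values through the copies would allow.

```c
#include <assert.h>

/* Assumed layout; the message does not define struct S. */
struct S { int x; int y; };

/* The code as written in the message, followed by a use of c. */
int sum_via_copies(void)
{
    struct S a, b, c;

    a.x = 3;
    a.y = 4;
    b = a;
    c = b;
    return c.x + c.y;
}

/* What the function reduces to if the compiler knows c.x is 3 and
   c.y is 4 after the two structure copies. */
int sum_folded(void)
{
    return 3 + 4;
}

/* Both forms must agree. */
void check_equivalence(void)
{
    assert(sum_via_copies() == sum_folded());
}
```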

--
Mark Mitchell                   mark@codesourcery.com
CodeSourcery, LLC               http://www.codesourcery.com

* Re: MEM flags and stack temps
  2000-11-15  5:59 Richard Kenner
@ 2000-11-15  8:37 ` law
  2000-11-15  8:47   ` Mark Mitchell
  0 siblings, 1 reply; 10+ messages in thread
From: law @ 2000-11-15  8:37 UTC (permalink / raw)
  To: Richard Kenner; +Cc: gcc

  In message <10011151358.AA19694@vlsi1.ultra.nyu.edu> you write:
  > We have code to not reuse stack temps for a temporary of a different alias
  > set.  I think we also need to do the same thing to not reuse for different
  > values of volatile, unchanging, scalar, and in_struct.
Quite possibly.


  > But if we do that, we're essentially not sharing anymore.
For the most part, yes.


  > Any thoughts here?
I don't think it's a major problem.  The only area where (IMHO) it causes
problems is the ability to share space for large arrays in different
(disjoint) scopes.
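
A small sketch of the disjoint-scope case, assuming nothing about how any
GCC version actually lays out the frame.  The names N and use_two_scopes
are hypothetical.  The two arrays are never live at the same time, so one
block of stack space could hold both, but their element types differ, so
under type-based aliasing they would fall into different alias sets.

```c
#include <assert.h>

enum { N = 1024 };

long use_two_scopes(int n)
{
    long total = 0;

    assert(n >= 3 && n <= N);
    {
        int a[N];              /* live only inside this block */
        for (int i = 0; i < n; i++)
            a[i] = i;
        total += a[n - 1];     /* contributes n - 1 */
    }
    {
        double b[N];           /* lifetime disjoint from that of a */
        for (int i = 0; i < n; i++)
            b[i] = i * 0.5;
        total += (long)b[2];   /* b[2] is 1.0, contributes 1 */
    }
    return total;
}
```

If slots are never reused across differing alias sets or flags, a function
like this needs room for both arrays at once instead of the larger of the
two.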

jeff

* MEM flags and stack temps
@ 2000-11-15  5:59 Richard Kenner
  2000-11-15  8:37 ` law
  0 siblings, 1 reply; 10+ messages in thread
From: Richard Kenner @ 2000-11-15  5:59 UTC (permalink / raw)
  To: gcc

We have code to not reuse stack temps for a temporary of a different alias
set.  I think we also need to do the same thing to not reuse for different
values of volatile, unchanging, scalar, and in_struct.

But if we do that, we're essentially not sharing anymore.

Any thoughts here?
