[RFC] Dynamically aligning the stack

public inbox for gcc-patches@gcc.gnu.org

* [RFC] Dynamically aligning the stack
@ 2015-03-26 23:28 Steve Ellcey 
  2015-03-29 23:02 ` Mike Stump
  2015-04-14  5:18 ` Jeff Law
  0 siblings, 2 replies; 9+ messages in thread
From: Steve Ellcey @ 2015-03-26 23:28 UTC
  To: gcc-patches

I am looking at ways of dynamically realigning the runtime stack in GCC.
Currently it looks like x86 is the only architecture that supports this.

The issue that I am trying to address is MSA registers on MIPS.  The O32
MIPS ABI specifies an 8 byte aligned stack but MSA registers should be 16
byte aligned when spilled to memory.  I don't see any way to do this unless
we can force the stack to be 16 byte aligned.

Richard Henderson created a way of aligning local variables with an
alignment greater than the maximum stack alignment by using alloca
and creating an aligned pointer into that space but that doesn't help
when reload or LRA is spilling a register to memory.

My thought was to use alloca, not to create space, but to move the stack
pointer to an aligned address so that subsequent spills using the stack
pointer would be to aligned addresses.  As a test I created a simple gimple
pass that called __builtin_alloca at the beginning of each function just
to see if that was possible and it seemed to work fine.  This seemed to be
much cleaner than when I tried to modify the stack pointer in expand_prologue.
When I did it there I had issues with tests that use setjmp/longjmp and there
seems to be a lot of bookkeeping needed to track registers and offsets when
working at that level.  With this gimple pass that was all taken care of by
existing mechanisms.

Of course that test just did an alloca of 8 bytes; the actual code needs
to allocate a dynamic number of bytes depending on the current value of the
stack pointer.

	__builtin_alloca(stack_pointer MOD desired_alignment)

So my first question is: Is there a way to access/reference the stack pointer
in a gimple pass?  If so, how?

My second question is what do people think about this as a way to dynamically
align the stack?  It seems a lot simpler and more target independent than
what x86 is doing.

Steve Ellcey
sellcey@imgtec.com

A pass that inserts __builtin_alloca(8) at the front of all routines:

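/* Experimental pass body: insert a call to __builtin_alloca (8) at the
   start of the first real basic block of every function.  */
unsigned int
pass_realign_stack::execute (function *fun)
{
  basic_block bb;
  gimple g;
  tree size;
  gimple_stmt_iterator gsi;

  /* The entry block has a single successor: the first block with code.  */
  bb = single_succ (ENTRY_BLOCK_PTR_FOR_FN (fun));
  gsi = gsi_start_bb (bb);

  /* Build the call __builtin_alloca (8); its result is not used.  */
  size = build_int_cst (sizetype, 8);
  g = gimple_build_call (builtin_decl_explicit (BUILT_IN_ALLOCA), 1, size);

  /* Insert the call ahead of the first statement in the block.  */
  gsi_insert_before (&gsi, g, GSI_NEW_STMT);
  return 0;
}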

* Re: [RFC] Dynamically aligning the stack
  2015-03-26 23:28 [RFC] Dynamically aligning the stack Steve Ellcey 
@ 2015-03-29 23:02 ` Mike Stump
  2015-04-14  5:18 ` Jeff Law
  1 sibling, 0 replies; 9+ messages in thread
From: Mike Stump @ 2015-03-29 23:02 UTC
  To: Steve Ellcey; +Cc: gcc-patches

On Mar 26, 2015, at 4:28 PM, Steve Ellcey <sellcey@imgtec.com> wrote:
> I am looking at ways of dynamically realigning the runtime stack in GCC.

Ick, sorry to hear it.  The best approach is to just tell the powers that be that an abi roll is the best, cheapest and most reliable way to fix it.  [ ducks ]

* Re: [RFC] Dynamically aligning the stack
  2015-03-26 23:28 [RFC] Dynamically aligning the stack Steve Ellcey 
  2015-03-29 23:02 ` Mike Stump
@ 2015-04-14  5:18 ` Jeff Law
  2015-04-14 16:30   ` Steve Ellcey
  1 sibling, 1 reply; 9+ messages in thread
From: Jeff Law @ 2015-04-14  5:18 UTC
  To: Steve Ellcey, gcc-patches

On 03/26/2015 05:28 PM, Steve Ellcey  wrote:
> The issue that I am trying to address is MSA registers on MIPS.  The O32
> MIPS ABI specifies an 8 byte aligned stack but MSA registers should be 16
> byte aligned when spilled to memory.  I don't see any way to do this unless
> we can force the stack to be 16 byte aligned.
>
> Richard Henderson created a way of aligning local variables with an
> alignment greater than the maximum stack alignment by using alloca
> and creating an aligned pointer into that space but that doesn't help
> when reload or LRA is spilling a register to memory.
Right.

>
> My thought was to use alloca, not to create space, but to move the stack
> pointer to an aligned address so that subsequent spills using the stack
> pointer would be to aligned addresses.  As a test I created a simple gimple
> pass that called __builtin_alloca at the beginning of each function just
> to see if that was possible and it seemed to work fine.  This seemed to be
> much cleaner than when I tried to modify the stack pointer in expand_prologue.
> When I did it there I had issues with tests that use setjmp/longjmp and there
> seems to be a lot of bookkeeping needed to track registers and offsets when
> working at that level.  With this gimple pass that was all taken care of by
> existing mechanisms.
But I don't see how using alloca ensures that you're going to have an 
aligned spill slot.  It can get you an aligned stack pointer, but that 
doesn't ensure alignment of any particular spill slot IIRC.

>
> Of course that test just did an alloca of 8 bytes, the actual code needs
> to allocate a dynamic number of bytes depending on the current value of the
> stack pointer.
>
> 	__builtin_alloca(stack_pointer MOD desired_alignment)
>
> So my first question is: Is there a way to access/reference the stack pointer
> in a gimple pass?  If so, how?
Not that I'm aware of.  In general gimple isn't supposed to know about 
things at that level.  You might be able to argue that a stack pointer 
is conceptually generic enough to provide access to it at gimple, but 
it'd certainly require some discussion.

>
> My second question is what do people think about this as a way to dynamically
> align the stack?  It seems a lot simpler and more target independent than
> what x86 is doing.
My biggest worry is the large disconnect between where you're trying to 
solve the problem (gimple) and where the problematic bits are 
(LRA/reload).  That seems likely to be fragile in the long run.

jeff

* Re: [RFC] Dynamically aligning the stack
  2015-04-14  5:18 ` Jeff Law
@ 2015-04-14 16:30   ` Steve Ellcey
  2015-04-14 16:55     ` H.J. Lu
  2015-04-14 16:58     ` Jeff Law
  0 siblings, 2 replies; 9+ messages in thread
From: Steve Ellcey @ 2015-04-14 16:30 UTC
  To: Jeff Law; +Cc: gcc-patches

On Mon, 2015-04-13 at 23:18 -0600, Jeff Law wrote:

> But I don't see how using alloca ensures that you're going to have an 
> aligned spill slot.  It can get you an aligned stack pointer, but that 
> doesn't ensure alignment of any particular spill slot IIRC.

It doesn't.  I found a big hole in my idea because as soon as you do an alloca
then frame_pointer_needed is set to true and spills are done relative to
the frame pointer, not the stack pointer.  Thus having an aligned stack
pointer at that point doesn't help at all with the alignment of spills.

> > My second question is what do people think about this as a way to dynamically
> > align the stack?  It seems a lot simpler and more target independent than
> > what x86 is doing.
> My biggest worry is the large disconnect between where you're trying to 
> solve the problem (gimple) and where the problematic bits are 
> (LRA/reload).  That seems likely to be fragile in the long run.
> 
> jeff

Yes, I am trying to look at how the x86 does dynamic stack alignment but
it is difficult to untangle the generic concepts from the parts tied
specifically to the x86 calling convention.  No other platform appears
to do dynamic stack alignment.

Steve Ellcey
sellcey@imgtec.com

* Re: [RFC] Dynamically aligning the stack
  2015-04-14 16:30   ` Steve Ellcey
@ 2015-04-14 16:55     ` H.J. Lu
  2015-04-14 16:58     ` Jeff Law
  1 sibling, 0 replies; 9+ messages in thread
From: H.J. Lu @ 2015-04-14 16:55 UTC
  To: sellcey; +Cc: Jeff Law, GCC Patches

On Tue, Apr 14, 2015 at 9:30 AM, Steve Ellcey <sellcey@imgtec.com> wrote:
>
>> > My second question is what do people think about this as a way to dynamically
>> > align the stack?  It seems a lot simpler and more target independent than
>> > what x86 is doing.
>> My biggest worry is the large disconnect between where you're trying to
>> solve the problem (gimple) and where the problematic bits are
>> (LRA/reload).  That seems likely to be fragile in the long run.
>>
>> jeff
>
> Yes, I am trying to look at how the x86 does dynamic stack alignment but
> it is difficult to untangle the generic concepts from the parts tied
> specifically to the x86 calling convention.  No other platform appears
> to do dynamic stack alignment.
>

The infrastructure changes made to support dynamic stack
alignment on x86 should be useful to implement dynamic stack
alignment on other targets.  You can go back to GCC 4.4 to
see how the i386 backend was modified to support dynamic stack
alignment:

https://gcc.gnu.org/ml/gcc-patches/2008-07/msg00647.html
https://gcc.gnu.org/ml/gcc-patches/2008-07/msg00652.html

-- 
H.J.

* Re: [RFC] Dynamically aligning the stack
  2015-04-14 16:30   ` Steve Ellcey
  2015-04-14 16:55     ` H.J. Lu
@ 2015-04-14 16:58     ` Jeff Law
  2015-04-14 17:08       ` H.J. Lu
  1 sibling, 1 reply; 9+ messages in thread
From: Jeff Law @ 2015-04-14 16:58 UTC
  To: sellcey; +Cc: gcc-patches

On 04/14/2015 10:30 AM, Steve Ellcey wrote:
> On Mon, 2015-04-13 at 23:18 -0600, Jeff Law wrote:
>
>> But I don't see how using alloca ensures that you're going to have an
>> aligned spill slot.  It can get you an aligned stack pointer, but that
>> doesn't ensure alignment of any particular spill slot IIRC.
>
> It doesn't.  I found a big hole in my idea because as soon as you do an alloca
> then frame_pointer_needed is set to true and spills are done relative to
> the frame pointer, not the stack pointer.  Thus having an aligned stack
> pointer at that point doesn't help at all with the alignment of spills.
Right.  I almost mentioned something about the frame pointer here... 
But yes, once you do alloca, spills are going to be frame relative.

I don't recall if we can have different alignment requirements for the 
frame and stack.  Even if we can get an aligned frame, that's still not 
a guarantee of an aligned spill slot.

> Yes, I am trying to look at how the x86 does dynamic stack alignment but
> it is difficult to untangle the generic concepts from the parts tied
> specifically to the x86 calling convention.  No other platform appears
> to do dynamic stack alignment.
I'm not aware of any other target doing dynamic stack realignment.  x86 
has a long history of alignment issues (*), so folks have been thinking 
about how to do it there for a long time.

I'm a bit surprised this hasn't come up on other architectures given the 
proliferation of wider and wider vectors.  If you can see a path to 
generalization of what the x86 is doing so that other targets can use 
it, it'd definitely be a win.

Jeff

(*) I'm referring to the fact that doubles are natively 32 bit aligned, 
but you could get far better performance if you 64 bit align them. 
That's where preferred stack boundary came from, which was a preference, 
not a requirement.  Dynamic stack realignment and such came later.

* Re: [RFC] Dynamically aligning the stack
  2015-04-14 16:58     ` Jeff Law
@ 2015-04-14 17:08       ` H.J. Lu
  2015-04-21 16:53         ` Steve Ellcey
  0 siblings, 1 reply; 9+ messages in thread
From: H.J. Lu @ 2015-04-14 17:08 UTC
  To: Jeff Law; +Cc: sellcey, GCC Patches

On Tue, Apr 14, 2015 at 9:58 AM, Jeff Law <law@redhat.com> wrote:
> On 04/14/2015 10:30 AM, Steve Ellcey wrote:
>>
>> On Mon, 2015-04-13 at 23:18 -0600, Jeff Law wrote:
>>
>>> But I don't see how using alloca ensures that you're going to have an
>>> aligned spill slot.  It can get you an aligned stack pointer, but that
>>> doesn't ensure alignment of any particular spill slot IIRC.
>>
>>
>> It doesn't.  I found a big hole in my idea because as soon as you do an
>> alloca
>> then frame_pointer_needed is set to true and spills are done relative to
>> the frame pointer, not the stack pointer.  Thus having an aligned stack
>> pointer at that point doesn't help at all with the alignment of spills.
>
> Right.  I almost mentioned something about the frame pointer here... But
> yes, once you do alloca, spills are going to be frame relative.
>
> I don't recall if we can have different alignment requirements for the
> frame and stack.  Even if we can get an aligned frame, that's still not a
> guarantee of an aligned spill slot.
>
>> Yes, I am trying to look at how the x86 does dynamic stack alignment but
>> it is difficult to untangle the generic concepts from the parts tied
>> specifically to the x86 calling convention.  No other platform appears
>> to do dynamic stack alignment.
>
> I'm not aware of any other target doing dynamic stack realignment.  x86 has
> a long history of alignment issues (*), so folks have been thinking about
> how to do it there for a long time.
>
> I'm a bit surprised this hasn't come up on other architectures given the
> proliferation of wider and wider vectors.  If you can see a path to
> generalization of what the x86 is doing so that other targets can use it,
> it'd definitely be a win.

We have done just that in GCC 4.4 to implement dynamic stack
alignment on x86 :-).  Some of the x86 backend changes for dynamic
stack alignment are x86 psABI specific.  Others are historical,
like -mstackrealign, which was the old attempt at dynamic stack
alignment.

-- 
H.J.

* Re: [RFC] Dynamically aligning the stack
  2015-04-14 17:08       ` H.J. Lu
@ 2015-04-21 16:53         ` Steve Ellcey
  2015-04-21 17:57           ` H.J. Lu
  0 siblings, 1 reply; 9+ messages in thread
From: Steve Ellcey @ 2015-04-21 16:53 UTC
  To: H.J. Lu; +Cc: Jeff Law, GCC Patches

On Tue, 2015-04-14 at 10:08 -0700, H.J. Lu wrote:

> We have done just that in GCC 4.4 to implement dynamic stack
> alignment on x86 :-).  Some of the x86 backend changes for dynamic
> stack alignment are x86 psABI specific.  Others are historical,
> like -mstackrealign, which was the old attempt at dynamic stack
> alignment.

I am a bit confused about the history of stack alignment on x86.  So I
guess -mpreferred-stack-boundary=X came first and is not
obsolete/deprecated. But I thought -mstackrealign=X was the current
method of aligning the stack, but based on this comment and the patches
you pointed me at I guess this is also obsolete (or at least deprecated)
and that -mincoming-stack-boundary=X is the current option that should
be used.  But I am not sure how this option works.

Obviously it tells GCC what assumption to make about stack alignment at
the start of a function but how do you tell GCC what alignment you want
for the function?  Or does GCC figure that out for itself based on the
instructions and data types it sees in the function?

Steve Ellcey
sellcey@imgtec.com

* Re: [RFC] Dynamically aligning the stack
  2015-04-21 16:53         ` Steve Ellcey
@ 2015-04-21 17:57           ` H.J. Lu
  0 siblings, 0 replies; 9+ messages in thread
From: H.J. Lu @ 2015-04-21 17:57 UTC
  To: sellcey; +Cc: Jeff Law, GCC Patches

On Tue, Apr 21, 2015 at 9:52 AM, Steve Ellcey <sellcey@imgtec.com> wrote:
> On Tue, 2015-04-14 at 10:08 -0700, H.J. Lu wrote:
>
>> We have done just that in GCC 4.4 to implement dynamic stack
>> alignment on x86 :-).  Some of the x86 backend changes for dynamic
>> stack alignment are x86 psABI specific.  Others are historical,
>> like -mstackrealign, which was the old attempt at dynamic stack
>> alignment.
>
> I am a bit confused about the history of stack alignment on x86.  So I
> guess -mpreferred-stack-boundary=X came first and is not
> obsolete/deprecated. But I thought -mstackrealign=X was the current
> method of aligning the stack, but based on this comment and the patches
> you pointed me at I guess this is also obsolete (or at least deprecated)
> and that -mincoming-stack-boundary=X is the current option that should
> be used.  But I am not sure how this option works.

-mpreferred-stack-boundary=X and -mincoming-stack-boundary=X
set stack alignment.  -mstackrealign=X:

'-mstackrealign'
     Realign the stack at entry.  On the Intel x86, the '-mstackrealign'
     option generates an alternate prologue and epilogue that realigns
     the run-time stack if necessary.  This supports mixing legacy codes
     that keep 4-byte stack alignment with modern codes that keep
     16-byte stack alignment for SSE compatibility.  See also the
     attribute 'force_align_arg_pointer', applicable to individual
     functions.

-mstackrealign assumes 4-byte incoming stack alignment in 32-bit mode.
It isn't needed in most cases, since GCC has been generating 16-byte
outgoing stack alignment for ages.

> Obviously it tells GCC what assumption to make about stack alignment at
> the start of a function but how do you tell GCC what alignment you want
> for the function?  Or does GCC figure that out for itself based on the
> instructions and data types it sees in the function?
>

Please do

# git grep "stack_alignment_needed = "

to see how the middle end and the back end track stack alignment requirements.

-- 
H.J.
