* alloca attribute?
@ 2006-11-29 4:19 Perry Smith
2006-11-29 10:19 ` Andrew Haley
0 siblings, 1 reply; 8+ messages in thread
From: Perry Smith @ 2006-11-29 4:19 UTC (permalink / raw)
To: MSX to GCC
I wrote a class that switches the stack to a new area. This is for
the power PC.
In the text below, I'll use main, testit, and newStack. main is the
main program, testit is a function that main calls, and newStack is
the method that switches the stack to the new space. main calls
testit which calls s.newStack. (s is an instance of the class that
switches the stack).
The purpose of separating main and testit is so I can verify that
returning from testit works properly.
newStack gets the current value of r1 (the stack pointer) and copies
the last two stack frames (which would be the stack frame for testit
and newStack) to the top of some allocated memory. It alters r1(0)
(the previous stack value for newStack) in the new memory to point to
the address of testit's new stack frame. It sets r1 up to the base
of this new area and returns.
With g++ and no optimization, this works. When newStack returns, it
consumes its stack frame in the new memory leaving only testit's new
stack frame and r1 pointing to the base of the new stack frame for
testit. When testit returns, it loads r1 with r1(0) and returns.
This properly puts r1 back to main's stack frame.
If I compile with -O3, then at the return of testit, instead of loading
r1 with r1(0), the compiler just adds in the size of the stack frame
(and assumes that r1 has not been munged). I presume this is faster. I
know that xlc does the same thing. As a result, when we return back to
main, the stack pointer is off in the weeds somewhere.
I suspected that somehow, alloca gave the compiler a clue that it
could not do the add, it had to load r1 with r1(0).
So, I wrote a macro:
#define doNewStack(s) \
    do { \
        void *notUsed = alloca(1); \
        s.newStack(); \
    } while (0)
testit is changed to call doNewStack(s); where it used to call
s.newStack(); (The purpose of the do while is so it is a single C
statement.)
This does as I hoped. It flags the compiler and tells it that r1 has
been munged. As a result, it loads r1 from r1(0) for the return of
testit and does not do the add immediate.
I would like to do the same thing without using a macro: give
newStack an attribute that tells the compiler that r1 has been
munged. I looked and did not see any attribute that looks like it
applied but I thought I would ask and see.
The other danger I am worried about is if testit is inlined. It
seems like that would/could hose me up as well but, I'm not sure.
I wrote my class and put the implementation of newStack in a separate
file so it can not be inlined. I suppose a completely different
approach would be to move the implementation of newStack back into
the class definition and give it the inline attribute so it is always
inlined. Then change it to only copy one stack frame. It could do
the alloca too so that the compiler would load r1 with r1(0) rather
than doing the add. Hmmmm...
I get the feeling that I am reinventing the wheel here.
Any thoughts or help here would be great!
Thank you for your help,
Perry Smith ( pedz@easesoftware.com )
Ease Software, Inc. ( http://www.easesoftware.com )
Low cost SATA Disk Systems for IBMs p5, pSeries, and RS/6000 AIX systems
* Re: alloca attribute?
2006-11-29 4:19 alloca attribute? Perry Smith
@ 2006-11-29 10:19 ` Andrew Haley
2006-11-29 14:17 ` Perry Smith
0 siblings, 1 reply; 8+ messages in thread
From: Andrew Haley @ 2006-11-29 10:19 UTC (permalink / raw)
To: Perry Smith; +Cc: MSX to GCC
Perry Smith writes:
> I wrote a class that switches the stack to a new area. This is for
> the power PC.
>
> In the text below, I'll use main, testit, and newStack. main is the
> main program, testit is a function that main calls, and newStack is
> the method that switches the stack to the new space. main calls
> testit which calls s.newStack. (s is an instance of the class that
> switches the stack).
>
> The purpose of separating main and testit is so I can verify that
> returning from testit works properly.
>
> newStack gets the current value of r1 (the stack pointer) and copies
> the last two stack frames (which would be the stack frame for testit
> and newStack) to the top of some allocated memory. It alters r1(0)
> (the previous stack value for newStack) in the new memory to point to
> the address of testit's new stack frame. It sets r1 up to the base
> of this new area and returns.
OK, before we go any further. Did you write and test DWARF unwinder
information for newStack?
> With g++ and no optimization, this works. When newStack returns,
> it consumes its stack frame in the new memory leaving only testit's
> new stack frame and r1 pointing to the base of the new stack from
> for testit. When testit returns, it loads r1 with r1(0) and
> returns. This properly puts r1 back to main's stack frame.
>
> If I put -O3, then at the return of testit, instead of loading r1
> with r1(0), just adds in the size of the stack frame (and assumes
> that r1 has not been munged with). I presume this is faster. I
> know that xlc does the same thing. As a result, when we return
> back to main, the stack pointer is off in the weeds somewhere.
I get the feeling I'm not understanding something here. As long as
newStack is correct and handles all registers according to the ABI,
there shouldn't be any trouble.
Andrew.
* Re: alloca attribute?
2006-11-29 10:19 ` Andrew Haley
@ 2006-11-29 14:17 ` Perry Smith
2006-11-29 14:25 ` Andrew Haley
0 siblings, 1 reply; 8+ messages in thread
From: Perry Smith @ 2006-11-29 14:17 UTC (permalink / raw)
To: Andrew Haley; +Cc: MSX to GCC
On Nov 29, 2006, at 4:19 AM, Andrew Haley wrote:
> Perry Smith writes:
>> I wrote a class that switches the stack to a new area. This is for
>> the power PC.
>>
>> In the text below, I'll use main, testit, and newStack. main is the
>> main program, testit is a function that main calls, and newStack is
>> the method that switches the stack to the new space. main calls
>> testit which calls s.newStack. (s is an instance of the class that
>> switches the stack).
>>
>> The purpose of separating main and testit is so I can verify that
>> returning from testit works properly.
>>
>> newStack gets the current value of r1 (the stack pointer) and copies
>> the last two stack frames (which would be the stack frame for testit
>> and newStack) to the top of some allocated memory. It alters r1(0)
>> (the previous stack value for newStack) in the new memory to point to
>> the address of testit's new stack frame. It sets r1 up to the base
>> of this new area and returns.
>
> OK, before we go any further. Did you write and test DWARF unwinder
> information for newStack?
No, I did not. I thought it would be too big a task and I'm willing to
put a try/catch after the stack has been changed so the unwind does not
need to go through this. This may be naive. (See more below).
>> With g++ and no optimization, this works. When newStack returns,
>> it consumes its stack frame in the new memory leaving only testit's
>> new stack frame and r1 pointing to the base of the new stack from
>> for testit. When testit returns, it loads r1 with r1(0) and
>> returns. This properly puts r1 back to main's stack frame.
>>
>> If I put -O3, then at the return of testit, instead of loading r1
>> with r1(0), just adds in the size of the stack frame (and assumes
>> that r1 has not been munged with). I presume this is faster. I
>> know that xlc does the same thing. As a result, when we return
>> back to main, the stack pointer is off in the weeds somewhere.
>
> I get the feeling I'm not understanding something here. As long as
> newStack is correct and handles all registers according to the ABI,
> there shouldn't be any trouble.
I'm sure you understand this better than I do. Can you point me to the
info needed for the DWARF unwinder you mention above? That may educate
me more.
It still seems I am going to have problems so I will try and repeat the
symptom:
The code generated to return from a function can have two forms.
Logically, it is a three step process. Pick up the return address which
is located at r1(8) (assuming a 32 bit system) into a register (like
r0), replace r1 with r1(0) which moves r1 to point to the previous stack
frame, and then branch to what was at r1(8).
But in optimized code, the compiler does not load r1 with r1(0). It
assumes that r1 has not changed, and it knows the size of the stack
frame, it just adds the size of the stack frame to r1. This will be the
same address if r1 has not been changed.
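To make the difference between the two epilogue forms concrete, here is
a toy model in portable C++. It is only a sketch under assumptions:
memory is a plain array of words, "r1" is an index into it, and the
frame size and addresses (8, 40, 32) are made-up numbers, not anything
taken from real PowerPC code. The name simulateReturn is hypothetical.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// What r1 would become after testit's epilogue, computed both ways.
struct EpilogueResult {
    std::size_t viaBackChain;  // r1 = r1(0): follow the saved back chain
    std::size_t viaAdd;        // r1 = r1 + frame size: the optimized form
};

// Toy model only: memory is an array of words and r1 is an index into it.
EpilogueResult simulateReturn(bool relocateFrame) {
    const std::size_t kFrameSize = 8;                          // assumed size
    const std::size_t kMainFrame = 40;                         // main's frame
    const std::size_t kTestitFrame = kMainFrame - kFrameSize;  // 32
    const std::size_t kNewArea = 8;                            // "new stack"

    std::vector<std::size_t> mem(64, 0);
    mem[kTestitFrame] = kMainFrame;  // word 0 of the frame is the back chain
    std::size_t r1 = kTestitFrame;

    if (relocateFrame) {
        // Copy testit's frame into the new area, back chain included, and
        // point r1 at the copy (roughly what newStack is described as doing).
        std::copy(mem.begin() + kTestitFrame,
                  mem.begin() + kTestitFrame + kFrameSize,
                  mem.begin() + kNewArea);
        r1 = kNewArea;
    }

    return EpilogueResult{mem[r1], r1 + kFrameSize};
}
```

In this model both forms give 40 while the frame stays put. Once the
frame has been copied, following the back chain still gives 40, but
adding the frame size gives 16, which is not a frame at all.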
It seems like (but I may be wrong), even with the DWARF unwinder
information, the compiler will still produce the code that adds the size
of the stack frame to r1 to get r1 to point to the previous stack frame
instead of loading r1 with r1(0).
Am I making sense?
Perry Smith ( pedz@easesoftware.com )
Ease Software, Inc. ( http://www.easesoftware.com )
Low cost SATA Disk Systems for IBMs p5, pSeries, and RS/6000 AIX systems
* Re: alloca attribute?
2006-11-29 14:17 ` Perry Smith
@ 2006-11-29 14:25 ` Andrew Haley
2006-11-29 14:50 ` Perry Smith
0 siblings, 1 reply; 8+ messages in thread
From: Andrew Haley @ 2006-11-29 14:25 UTC (permalink / raw)
To: Perry Smith; +Cc: MSX to GCC
Perry Smith writes:
>
> On Nov 29, 2006, at 4:19 AM, Andrew Haley wrote:
>
> > Perry Smith writes:
> >> I wrote a class that switches the stack to a new area. This is for
> >> the power PC.
> >>
> >> In the text below, I'll use main, testit, and newStack. main is the
> >> main program, testit is a function that main calls, and newStack is
> >> the method that switches the stack to the new space. main calls
> >> testit which calls s.newStack. (s is an instance of the class that
> >> switches the stack).
> >>
> >> The purpose of separating main and testit is so I can verify that
> >> returning from testit works properly.
> >>
> >> newStack gets the current value of r1 (the stack pointer) and copies
> >> the last two stack frames (which would be the stack frame for testit
> >> and newStack) to the top of some allocated memory. It alters r1(0)
> >> (the previous stack value for newStack) in the new memory to point to
> >> the address of testit's new stack frame. It sets r1 up to the base
> >> of this new area and returns.
> >
> > OK, before we go any further. Did you write and test DWARF unwinder
> > information for newStack?
>
> No, I did not. I thought it would be too big a task and I'm
> willing to put a try/catch after the stack has been changed so the
> unwind does not need to go through this. This may be naive. (See
> more below).
You'll need unwinder data, sooner or later. But let's leave it for
later, it's not strictly relevant to what you need now.
> >> With g++ and no optimization, this works. When newStack returns,
> >> it consumes its stack frame in the new memory leaving only testit's
> >> new stack frame and r1 pointing to the base of the new stack from
> >> for testit. When testit returns, it loads r1 with r1(0) and
> >> returns. This properly puts r1 back to main's stack frame.
> >>
> >> If I put -O3, then at the return of testit, instead of loading r1
> >> with r1(0), just adds in the size of the stack frame (and assumes
> >> that r1 has not been munged with). I presume this is faster. I
> >> know that xlc does the same thing. As a result, when we return
> >> back to main, the stack pointer is off in the weeds somewhere.
> >
> > I get the feeling I'm not understanding something here. As long as
> > newStack is correct and handles all registers according to the ABI,
> > there shouldn't be any trouble.
>
> I'm sure you understand this better than I do. Can you point me to
> the info needed for the DWARF unwinder you mention above? That may
> educate me more.
>
> It still seems I am going to have problems so I will try and repeat
> the symptom:
>
> The code generated to return from a function can have two forms.
> Logically, it is a three step process. Pick up the return address
> which is located at r1(8) (assuming a 32 bit system) into a
> register (like r0), replace r1 with r1(0) which moves r1 to point
> to the previous stack frame, and then branch to what was at r1(8).
>
> But in optimized code, the compiler does not load r1 with r1(0).
> It assumes that r1 has not changed, and it knows the size of the
> stack frame, it just adds the size of the stack frame to r1. This
> will be the same address if r1 has not been changed.
Right.
> It seems like (but I may be wrong), even with the DWARF unwinder
> information, the compiler will still produce the code that adds the
> size of the stack from to r1 to get r1 to point to the previous
> stack frame instead of loading r1 with r1 (0).
Sure, but why would it matter? Your newStack routine should do
something like
newStack:
save caller registers somewhere
load new stack and frame pointer
call <foo> -- whatever it is you want to run on the new stack
restore caller registers, including stack and frame pointer
return
so it should not matter how the caller of newStack uses its stack
frame or what foo does. As long as you conform to the ABI you'll be
OK.
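For illustration only, the same shape can be tried out in user space
with the POSIX ucontext calls (getcontext, makecontext, swapcontext).
This is a swapped-in technique, not the PowerPC routine under
discussion: it assumes a system that still provides <ucontext.h> (glibc
on Linux does), and foo, runFooOnNewStack and the 64 KiB stack size are
hypothetical names and values chosen for the sketch.

```cpp
#include <ucontext.h>
#include <cstddef>
#include <cstdint>
#include <cstdlib>

static ucontext_t callerCtx;  // filled in by swapcontext; resumed after foo
static ucontext_t workerCtx;  // context that runs on the new stack
static char *newStackBase = nullptr;
static const std::size_t kNewStackSize = 64 * 1024;  // assumed size
static bool fooSawNewStack = false;

// Stand-in for <foo>: records whether its locals live in the new area.
static void foo() {
    int local = 0;
    std::uintptr_t addr = reinterpret_cast<std::uintptr_t>(&local);
    std::uintptr_t base = reinterpret_cast<std::uintptr_t>(newStackBase);
    fooSawNewStack = addr >= base && addr < base + kNewStackSize;
}

// Runs foo on a freshly allocated stack, then comes back to the caller's.
bool runFooOnNewStack() {
    newStackBase = static_cast<char *>(std::malloc(kNewStackSize));
    if (newStackBase == nullptr) return false;
    fooSawNewStack = false;

    if (getcontext(&workerCtx) != 0) {
        std::free(newStackBase);
        newStackBase = nullptr;
        return false;
    }
    workerCtx.uc_stack.ss_sp = newStackBase;   // load the new stack
    workerCtx.uc_stack.ss_size = kNewStackSize;
    workerCtx.uc_link = &callerCtx;            // where to go when foo returns
    makecontext(&workerCtx, foo, 0);

    // Saves the caller's registers (stack pointer included), switches to
    // the new stack, and returns here once foo has finished.
    bool switched = swapcontext(&callerCtx, &workerCtx) == 0;

    std::free(newStackBase);
    newStackBase = nullptr;
    return switched && fooSawNewStack;
}
```

The caller of runFooOnNewStack never sees a changed stack pointer, so
it does not matter which epilogue form the compiler picks for it.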
Andrew.
* Re: alloca attribute?
2006-11-29 14:25 ` Andrew Haley
@ 2006-11-29 14:50 ` Perry Smith
2006-11-29 15:05 ` Andrew Haley
0 siblings, 1 reply; 8+ messages in thread
From: Perry Smith @ 2006-11-29 14:50 UTC (permalink / raw)
To: Andrew Haley; +Cc: MSX to GCC
On Nov 29, 2006, at 8:24 AM, Andrew Haley wrote:
> Perry Smith writes:
>>
>> On Nov 29, 2006, at 4:19 AM, Andrew Haley wrote:
>>
>>> Perry Smith writes:
>> No, I did not. I thought it would be too big a task and I'm
>> willing to put a try/catch after the stack has been changed so the
>> unwind does not need to go through this. This may be naive. (See
>> more below).
>
> You'll need unwinder data, sooner or later. But let's leave it for
> later, it's not strictly relevant to what you need now.
O.k.
>>
>> But in optimized code, the compiler does not load r1 with r1(0).
>> It assumes that r1 has not changed, and it knows the size of the
>> stack frame, it just adds the size of the stack frame to r1. This
>> will be the same address if r1 has not been changed.
>
> Right.
>
>> It seems like (but I may be wrong), even with the DWARF unwinder
>> information, the compiler will still produce the code that adds the
>> size of the stack from to r1 to get r1 to point to the previous
>> stack frame instead of loading r1 with r1 (0).
>
> Sure, but why would it matter? Your newStack routine should do
> something like
>
> newStack:
> save caller registers somewhere
> load new stack and frame pointer
> call <foo> -- whatever it is you want to run on the new stack
> restore caller registers, including stack and frame pointer
> return
>
> so it should not metter how the caller of newStack uses its stack
> frame or what foo does. As long as you conform to the ABI you'll be
> OK.
Ahh... I think I see your confusion. newStack does not call <foo>. I
could do that and it would probably be safer and easier. But sometimes
newStack would need to call foo(a) and other times it would need to call
foo(a, b), etc. I thought about using varargs so that any foo that is
called must take a varargs argument. That approach, obviously, is
doable but imposes several restrictions.
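One way around the foo(a) versus foo(a, b) problem, sketched here under
assumptions, is to give the switcher a single fixed signature (a
function pointer plus a void * argument) and let a small template wrap
any call in a closure. The sketch needs C++11 lambdas. callOnNewStack
is a hypothetical stand-in that simply calls its argument; a real
version would switch stacks around the call. runOnNewStack is likewise
a made-up name.

```cpp
// Hypothetical stand-in for the real stack switcher. A real version would
// save registers, switch to the new stack, call entry(arg), switch back.
void callOnNewStack(void (*entry)(void *), void *arg) {
    entry(arg);
}

// Wraps any callable so the switcher only ever sees one signature.
template <typename F>
void runOnNewStack(F body) {
    // A non-capturing lambda converts to a plain function pointer; the
    // callable itself travels through the void * argument.
    callOnNewStack([](void *p) { (*static_cast<F *>(p))(); }, &body);
}
```

The caller then writes runOnNewStack([&] { foo(a); }) in one place and
runOnNewStack([&] { foo(a, b); }) in another, with no varargs involved.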
But then I hit upon this other idea (which may suck). newStack simply
mucks with the stack and returns back to testit (the routine that calls
newStack). I wish I could include some graphics but I'll try to
describe it with words.
I think of the stack frames as a linked list. Before newStack returns,
r1 points to the list. newStack's frame is in the new area and testit's
(the function that called newStack) is also in the new area. The links
from stack frame to stack frame have been set up so that the new frame
for newStack points to the new frame for testit. The new frame for
testit points back to the original stack frame of main.
After newStack completes, the linked list of stack frames is still as it
was before except that the top frame (testit's stack frame) is now at a
different address (in the new area). But it points back to the original
parent stack frame on the original stack. The key here is that testit is
running on a new stack frame that newStack created -- but, it doesn't
know that.
If it would help, I can draw some graphics and post a link to it.
Also... I apologize for frequently sending the same message twice. The
default for my Apple Mail program is to send MIME/HTML gunk. I forget
to convert it to plain text and the gcc list rejects it. I then have to
resend it after converting it to plain text. So, I think, sometimes
some people get two copies of the email.
Perry Smith ( pedz@easesoftware.com )
Ease Software, Inc. ( http://www.easesoftware.com )
Low cost SATA Disk Systems for IBMs p5, pSeries, and RS/6000 AIX systems
* Re: alloca attribute?
2006-11-29 14:50 ` Perry Smith
@ 2006-11-29 15:05 ` Andrew Haley
2006-11-29 15:30 ` Perry Smith
0 siblings, 1 reply; 8+ messages in thread
From: Andrew Haley @ 2006-11-29 15:05 UTC (permalink / raw)
To: Perry Smith; +Cc: MSX to GCC
Perry Smith writes:
> On Nov 29, 2006, at 8:24 AM, Andrew Haley wrote:
>
> > Perry Smith writes:
> >>
> >> On Nov 29, 2006, at 4:19 AM, Andrew Haley wrote:
> >>
> >>> Perry Smith writes:
> >> No, I did not. I thought it would be too big a task and I'm
> >> willing to put a try/catch after the stack has been changed so the
> >> unwind does not need to go through this. This may be naive. (See
> >> more below).
> >
> > You'll need unwinder data, sooner or later. But let's leave it for
> > later, it's not strictly relevant to what you need now.
>
> O.k.
>
> >>
> >> But in optimized code, the compiler does not load r1 with r1(0).
> >> It assumes that r1 has not changed, and it knows the size of the
> >> stack frame, it just adds the size of the stack frame to r1. This
> >> will be the same address if r1 has not been changed.
> >
> > Right.
> >
> >> It seems like (but I may be wrong), even with the DWARF unwinder
> >> information, the compiler will still produce the code that adds the
> >> size of the stack from to r1 to get r1 to point to the previous
> >> stack frame instead of loading r1 with r1 (0).
> >
> > Sure, but why would it matter? Your newStack routine should do
> > something like
> >
> > newStack:
> > save caller registers somewhere
> > load new stack and frame pointer
> > call <foo> -- whatever it is you want to run on the new stack
> > restore caller registers, including stack and frame pointer
> > return
> >
> > so it should not metter how the caller of newStack uses its stack
> > frame or what foo does. As long as you conform to the ABI you'll be
> > OK.
>
> Ahh... I think I see your confusion. newStack does not call <foo>.
> I could do that and it would probably be safer, easier. But
> sometimes newStack would need to call foo(a) and other times it
> would need to call foo(a, b), etc. I thought about using varargs
> so that any foo that is called must take a varargs argument. That
> approach, obviously, is doable but imposes several restriction.
>
> But then I hit upon this other idea (which may suck). newStack
> simply mucks with the stack and returns back to testit (the routine
> that calls newStack).
You're right. It sucks. :-)
> I wish I could include some graphics but I'll try to describe it
> with words.
No, you've explained it perfectly well, but trust me here: it's a
really bad idea. gcc can eliminate the stack pointer to the frame
pointer, as you have discovered. gcc can also save a pointer to a
local area on the stack, and that can be of indeterminate size.
So gcc might do this
    int a = 2;
    int *p = &a;   // a is allocated on the stack
    newStack();
    *p = 3;
    printf("%d\n", a);
And a, being on the new stack, would now be different from *p, which
still points to the old stack. It can't possibly work.
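The effect can be imitated in portable C++ without touching the real
stack. In this sketch a struct stands in for testit's frame and a
memcpy stands in for newStack relocating it. Frame and
valueSeenAfterMove are hypothetical names, and nothing here is what gcc
actually emits.

```cpp
#include <cstring>

// Stand-in for a stack frame holding the local variable a.
struct Frame {
    int a;
};

// Returns the value of a as seen from the relocated frame.
int valueSeenAfterMove() {
    Frame oldFrame = {2};   // int a = 2;
    int *p = &oldFrame.a;   // p = &a, taken before the move

    Frame newFrame;
    std::memcpy(&newFrame, &oldFrame, sizeof newFrame);  // "newStack()"

    *p = 3;             // still writes into the old copy
    return newFrame.a;  // the relocated frame still holds the old value
}
```

The function returns 2, not 3: the store through p lands in the old
frame, while code running on the relocated frame reads its own copy.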
Andrew.
* Re: alloca attribute?
2006-11-29 15:05 ` Andrew Haley
@ 2006-11-29 15:30 ` Perry Smith
2006-11-29 15:58 ` Andrew Haley
0 siblings, 1 reply; 8+ messages in thread
From: Perry Smith @ 2006-11-29 15:30 UTC (permalink / raw)
To: Andrew Haley; +Cc: MSX to GCC
On Nov 29, 2006, at 9:05 AM, Andrew Haley wrote:
> Perry Smith writes:
>> On Nov 29, 2006, at 8:24 AM, Andrew Haley wrote:
>>
>>> Perry Smith writes:
>> But then I hit upon this other idea (which may suck). newStack
>> simply mucks with the stack and returns back to testit (the routine
>> that calls newStack).
>
> You're right. It sucks. :-)
>
>> I wish I could include some graphics but I'll try to describe it
>> with words.
>
> No, you've explained it perfectly well, but trust me here: it's a
> really bad idea. gcc can eliminate the stack pointer to the frame
> pointer, as you have discovered. gcc can also save a pointer to a
> local area on the stack, and that can be of indeterminate size.
>
> So gcc might do this
>
> int a = 2;
> p = &a; // allocated on the stack
> newStack();
> *p = 3;
> print a;
>
> And a, being on the new stack, would now be different from *p, which
> still points to the old stack. It can't possibly work.
Ahh... I see. It took me a while to figure out what you were saying.
Shoot!
o.k. So if I go with the idea of newStack calling foo, will I need the
unwind stuff you mentioned before? I'm somewhat terrified of
mucking with that but I probably need to learn how it works anyhow.
Is this what I need to read and understand?
http://www.codesourcery.com/cxx-abi/abi-eh.html
Or... do you have any ideas of a completely different approach somehow?
This is going to suck so bad... all of my external entry points need to
move the stack. I thought, briefly, gee, that's just the five or so
driver entry points. But then there is the interrupt handler, iodone,
timer routines, all sorts of nasty things.
Thanks for your help...
Perry Smith ( pedz@easesoftware.com )
Ease Software, Inc. ( http://www.easesoftware.com )
Low cost SATA Disk Systems for IBMs p5, pSeries, and RS/6000 AIX systems
* Re: alloca attribute?
2006-11-29 15:30 ` Perry Smith
@ 2006-11-29 15:58 ` Andrew Haley
0 siblings, 0 replies; 8+ messages in thread
From: Andrew Haley @ 2006-11-29 15:58 UTC (permalink / raw)
To: Perry Smith; +Cc: MSX to GCC
Perry Smith writes:
> On Nov 29, 2006, at 9:05 AM, Andrew Haley wrote:
>
> > Perry Smith writes:
> >> On Nov 29, 2006, at 8:24 AM, Andrew Haley wrote:
> >>
> >>> Perry Smith writes:
> >> But then I hit upon this other idea (which may suck). newStack
> >> simply mucks with the stack and returns back to testit (the routine
> >> that calls newStack).
> >
> > You're right. It sucks. :-)
> >
> >> I wish I could include some graphics but I'll try to describe it
> >> with words.
> >
> > No, you've explained it perfectly well, but trust me here: it's a
> > really bad idea. gcc can eliminate the stack pointer to the frame
> > pointer, as you have discovered. gcc can also save a pointer to a
> > local area on the stack, and that can be of indeterminate size.
> >
> > So gcc might do this
> >
> > int a = 2;
> > p = &a; // allocated on the stack
> > newStack();
> > *p = 3;
> > print a;
> >
> > And a, being on the new stack, would now be different from *p, which
> > still points to the old stack. It can't possibly work.
>
> Ahh... I see. It took me a while to figure out what you were saying.
>
> Shoot!
>
> o.k. So if I go with the idea of newStack calling foo, will I need the
> unwind stuff you mentioned before? I'm somewhat terrified of
> mucking with that but I probably need to learn how it works anyhow.
If you want to switch stacks, and you want to throw exceptions past
the switch, you'll have to create unwinder data, regardless of how you
do the stack switching. Sorry. If you catch all exceptions on the
other side of the stack switch so that they don't propagate past the
switch, you won't need unwinder data.
> Is this what I need to read and understand?
> http://www.codesourcery.com/cxx-abi/abi-eh.html
>
> Or... do you have any ideas of a completely different approach somehow?
Well, I described how I thought it ought to work in a previous mail.
> This is going to suck so bad... all of my external entry points
> need to move the stack. I thought, briefly, gee, thats just the
> five or so driver entry points. But then there is the interrupt
> handler, iodone, timer routines, all sorts of nasty things.
Moving the stack isn't at all unusual. Operating systems do it all
the time. But you've got to do it right. Save the call-saved
registers, load fp and sp to point to the new area, and go.
Andrew.