* proper tail recursion for gcc
@ 2000-07-19 18:01 Mark Probst
2000-07-19 23:06 ` Geoff Keating
0 siblings, 1 reply; 19+ messages in thread
From: Mark Probst @ 2000-07-19 18:01 UTC (permalink / raw)
To: gcc
hi
i want to implement proper tail recursion for a much broader set of
cases than is currently implemented.
motivation: people using gnu c as a portable assembler suffer from the
fact that there is no way to make gcc do proper tail recursion. as a
result, some scheme compilers (bigloo, stalin and hobbit, which is
used by guile) are not standard compliant (scheme requires proper tail
recursion) and at least one prolog compiler (gnu prolog) has switched
from generating c code to assembler, making it non-portable (it is
currently only avaible for x86 and sparc). the prolog-like mercury
has to jump through some (non-portable) hoops to use gcc as an
efficient assembler. the currently available tail call optimization
in gcc does not go far enough, especially in that it does not permit
separate compilation (tail-callees must be static).
proposal: i want to implement a construct for tail calls. this could
either be in the form of a modifier for the return statement or a new
language construct, i.e.
return __tailcall__ f(a,b,c);
or
__tailcall__ f(a,b,c);
suggestions regarding the syntax are welcome. there are several
reasons for doing it this way. the alternatives would be:
* analysis within the compiler regarding which tail calls can be made
proper. the problem with this approach is that there might be a lot
of badly behaving code in existance that assumes that tail calls are
not proper. furthermore, there would have to be sharp and easily
satisfyable constraints on the c code of the caller or else there
would be no way to make sure that tail calls really are proper.
* declaration of tail-callees. using this approach it would still be
necessary to analyze the code in order to determine whether a given
tail call could be made proper. hence, all above mentioned problems
apply here also.
limitations:
* functions with variable arguments could not make proper tail calls.
i'd like to implement this feature for at least the alpha and x86
architectures.
do you have any suggestions? anything i should look out for?
bye
schani
--
Mark Probst
Student, Programmer
http://www.complang.tuwien.ac.at/~schani/
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: proper tail recursion for gcc
2000-07-19 18:01 proper tail recursion for gcc Mark Probst
@ 2000-07-19 23:06 ` Geoff Keating
2000-07-19 23:35 ` Zack Weinberg
2000-07-20 3:20 ` Mark Probst
0 siblings, 2 replies; 19+ messages in thread
From: Geoff Keating @ 2000-07-19 23:06 UTC (permalink / raw)
To: Mark Probst; +Cc: gcc
Mark Probst <schani@mips.complang.tuwien.ac.at> writes:
> i want to implement proper tail recursion for a much broader set of
> cases than is currently implemented.
...
I think you'll find this has already been done in the development gcc.
It's the 'sibling call' optimisation.
--
- Geoffrey Keating <geoffk@cygnus.com>
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: proper tail recursion for gcc
2000-07-19 23:06 ` Geoff Keating
@ 2000-07-19 23:35 ` Zack Weinberg
2000-07-20 2:40 ` Jakub Jelinek
` (2 more replies)
2000-07-20 3:20 ` Mark Probst
1 sibling, 3 replies; 19+ messages in thread
From: Zack Weinberg @ 2000-07-19 23:35 UTC (permalink / raw)
To: Geoff Keating; +Cc: Mark Probst, gcc
On Wed, Jul 19, 2000 at 11:06:08PM -0700, Geoff Keating wrote:
>
> Mark Probst <schani@mips.complang.tuwien.ac.at> writes:
>
> > i want to implement proper tail recursion for a much broader set of
> > cases than is currently implemented.
> ...
>
> I think you'll find this has already been done in the development gcc.
> It's the 'sibling call' optimisation.
It's not as complete as a Scheme compiler is required to implement.
The Scheme standard requires sibcall optimization for *every* function
call followed immediately by a return from the current function. We
only seem to do the optimization when the argument lists and return
types are identical.
Consider, for instance:
foo(a, b)
{
c = calculate_bar(a, b);
foo_with_bar(a, b, c);
}
foo_with_bar(a, b, c)
{
...
}
We don't sibcall optimize this, but we could and probably should.
[Perhaps this is the sort of thing that would be handled by the
"better tree-based sibcall optimizer" coming RSN?]
zw
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: proper tail recursion for gcc
2000-07-19 23:35 ` Zack Weinberg
@ 2000-07-20 2:40 ` Jakub Jelinek
2000-07-20 12:16 ` Zack Weinberg
2000-07-20 8:26 ` Mark Mitchell
2000-07-20 12:07 ` Richard Henderson
2 siblings, 1 reply; 19+ messages in thread
From: Jakub Jelinek @ 2000-07-20 2:40 UTC (permalink / raw)
To: Zack Weinberg; +Cc: Geoff Keating, Mark Probst, gcc
On Wed, Jul 19, 2000 at 11:33:08PM -0700, Zack Weinberg wrote:
> On Wed, Jul 19, 2000 at 11:06:08PM -0700, Geoff Keating wrote:
> >
> > Mark Probst <schani@mips.complang.tuwien.ac.at> writes:
> >
> > > i want to implement proper tail recursion for a much broader set of
> > > cases than is currently implemented.
> > ...
> >
> > I think you'll find this has already been done in the development gcc.
> > It's the 'sibling call' optimisation.
>
> It's not as complete as a Scheme compiler is required to implement.
> The Scheme standard requires sibcall optimization for *every* function
> call followed immediately by a return from the current function. We
There must be exceptions, like if you take an address of some local variable and pass
it to some other functions...
> only seem to do the optimization when the argument lists and return
> types are identical.
>
> Consider, for instance:
>
> foo(a, b)
> {
> c = calculate_bar(a, b);
> foo_with_bar(a, b, c);
> }
>
> foo_with_bar(a, b, c)
> {
> ...
> }
>
> We don't sibcall optimize this, but we could and probably should.
> [Perhaps this is the sort of thing that would be handled by the
> "better tree-based sibcall optimizer" coming RSN?]
I believe the problem here is not if we use tree-based or
expand_call/rtl-based sibcall optimizer, the issues are backend related.
Basically, if the incoming arguments stack area of the tail function is
larger than of tail callee, then you have to allocate some area immediately
below virtual-incoming-args-rtx to compensate it and in a machine specific
way copy machine dependent stuff down (or up, depending on stack growth),
e.g. on Intel the return address stored on the stack.
It is IMHO worth implementing, but under a separate -f switch not enabled by
default from -O options, because I don't think it will be a win for random C
code.
(BTW: The above example will be optimized on SPARC (if there are less than 6
word size arguments in both prototypes)).
Jakub
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: proper tail recursion for gcc
2000-07-19 23:06 ` Geoff Keating
2000-07-19 23:35 ` Zack Weinberg
@ 2000-07-20 3:20 ` Mark Probst
2000-07-20 7:12 ` Alexandre Oliva
2000-07-20 12:12 ` Richard Henderson
1 sibling, 2 replies; 19+ messages in thread
From: Mark Probst @ 2000-07-20 3:20 UTC (permalink / raw)
To: Geoff Keating; +Cc: gcc
On Wed, Jul 19, 2000 at 11:06:08PM -0700, Geoff Keating wrote:
> > i want to implement proper tail recursion for a much broader set of
> > cases than is currently implemented.
> ...
>
> I think you'll find this has already been done in the development gcc.
> It's the 'sibling call' optimisation.
i was referring to exactly this optimization. at least on the alpha,
for example, only calls to static functions are optimized, which is
not very helpful if you need separate compilation.
bye
schani
--
Mark Probst
Student, Programmer
http://www.complang.tuwien.ac.at/~schani/
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: proper tail recursion for gcc
2000-07-20 3:20 ` Mark Probst
@ 2000-07-20 7:12 ` Alexandre Oliva
2000-07-20 7:32 ` Theodore Papadopoulo
2000-07-20 12:12 ` Richard Henderson
1 sibling, 1 reply; 19+ messages in thread
From: Alexandre Oliva @ 2000-07-20 7:12 UTC (permalink / raw)
To: Mark Probst; +Cc: Geoff Keating, gcc
On Jul 20, 2000, Mark Probst <schani@mips.complang.tuwien.ac.at> wrote:
> On Wed, Jul 19, 2000 at 11:06:08PM -0700, Geoff Keating wrote:
>> I think you'll find this has already been done in the development gcc.
>> It's the 'sibling call' optimisation.
> only calls to static functions are optimized
Non-static functions may be overridden if the object file is used to
create a library and a program linked with the library defines the
same symbol.
--
Alexandre Oliva Enjoy Guarana', see http://www.ic.unicamp.br/~oliva/
Red Hat GCC Developer aoliva@{cygnus.com, redhat.com}
CS PhD student at IC-Unicamp oliva@{lsd.ic.unicamp.br, gnu.org}
Free Software Evangelist *Please* write to mailing lists, not to me
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: proper tail recursion for gcc
2000-07-20 7:12 ` Alexandre Oliva
@ 2000-07-20 7:32 ` Theodore Papadopoulo
0 siblings, 0 replies; 19+ messages in thread
From: Theodore Papadopoulo @ 2000-07-20 7:32 UTC (permalink / raw)
To: Alexandre Oliva; +Cc: Mark Probst, Geoff Keating, gcc
aoliva@redhat.com said:
>> only calls to static functions are optimized
> Non-static functions may be overridden if the object file is used to
> create a library and a program linked with the library defines the
> same symbol.
Then, may be an attribute never_overriden (or something similar) could
be of some help here ???
On the other hand, that's certainly dangerous....
--------------------------------------------------------------------
Theodore Papadopoulo
Email: Theodore.Papadopoulo@sophia.inria.fr Tel: (33) 04 92 38 76 01
--------------------------------------------------------------------
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: proper tail recursion for gcc
2000-07-19 23:35 ` Zack Weinberg
2000-07-20 2:40 ` Jakub Jelinek
@ 2000-07-20 8:26 ` Mark Mitchell
2000-07-20 10:02 ` Mark Probst
2000-07-20 12:07 ` Richard Henderson
2 siblings, 1 reply; 19+ messages in thread
From: Mark Mitchell @ 2000-07-20 8:26 UTC (permalink / raw)
To: zack; +Cc: geoffk, schani, gcc
>>>>> "Zack" == Zack Weinberg <zack@wolery.cumb.org> writes:
Zack> We don't sibcall optimize this, but we could and probably
Zack> should. [Perhaps this is the sort of thing that would be
Zack> handled by the "better tree-based sibcall optimizer" coming
Zack> RSN?]
Yup. Except that RSN isn't on anybody's schedule, so far as I know. :-(
--
Mark Mitchell mark@codesourcery.com
CodeSourcery, LLC http://www.codesourcery.com
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: proper tail recursion for gcc
2000-07-20 8:26 ` Mark Mitchell
@ 2000-07-20 10:02 ` Mark Probst
0 siblings, 0 replies; 19+ messages in thread
From: Mark Probst @ 2000-07-20 10:02 UTC (permalink / raw)
To: Mark Mitchell; +Cc: gcc
On Thu, Jul 20, 2000 at 08:24:59AM -0700, Mark Mitchell wrote:
> Zack> We don't sibcall optimize this, but we could and probably
> Zack> should. [Perhaps this is the sort of thing that would be
> Zack> handled by the "better tree-based sibcall optimizer" coming
> Zack> RSN?]
>
> Yup. Except that RSN isn't on anybody's schedule, so far as I know. :-(
could you mail me the plans/thoughts for the tree-based sibcall
optimized? i'd really like to do work in this direction. i also
learned that the sibcall optimizer does optimize non-static calls on
some platforms, like the sparc. on alpha and i386 it does not,
however. i'd like to look into this issue also, as these are my
primary development targets.
bye
schani
--
Mark Probst
Student, Programmer
http://www.complang.tuwien.ac.at/~schani/
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: proper tail recursion for gcc
2000-07-19 23:35 ` Zack Weinberg
2000-07-20 2:40 ` Jakub Jelinek
2000-07-20 8:26 ` Mark Mitchell
@ 2000-07-20 12:07 ` Richard Henderson
2000-07-20 14:35 ` Per Bothner
2 siblings, 1 reply; 19+ messages in thread
From: Richard Henderson @ 2000-07-20 12:07 UTC (permalink / raw)
To: Zack Weinberg; +Cc: Geoff Keating, Mark Probst, gcc
On Wed, Jul 19, 2000 at 11:33:08PM -0700, Zack Weinberg wrote:
> It's not as complete as a Scheme compiler is required to implement.
> The Scheme standard requires sibcall optimization for *every* function
> call followed immediately by a return from the current function.
Indeed. And such complete tail call optimizations are impossible
with a stack-based call frame, such as is used for C and so forth.
For Scheme one must put call frames in the heap and garbage collect them.
> We only seem to do the optimization when the argument lists and return
> types are identical.
>
> Consider, for instance:
>
> foo(a, b)
> {
> c = calculate_bar(a, b);
> foo_with_bar(a, b, c);
> }
>
> foo_with_bar(a, b, c)
> {
> ...
> }
>
> We don't sibcall optimize this, but we could and probably should.
No, we can't. Foo knows that it has room on the stack for
two arguments, since that's what it was given. Foo_with_bar
requires three arguments, so there's no enough room to
perform the tail call without changing the caller to pop more
from the stack frame.
I believe you will find that this example does get transformed
to a tail call on Sparc, since in this case all arguments will
be in registers.
> [Perhaps this is the sort of thing that would be handled by the
> "better tree-based sibcall optimizer" coming RSN?]
No, the tree-based sibcall optimizer just makes the analysis
easier and the code cleaner. It does not change the calling
convention dependant restrictions placed on the optimization.
r~
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: proper tail recursion for gcc
2000-07-20 3:20 ` Mark Probst
2000-07-20 7:12 ` Alexandre Oliva
@ 2000-07-20 12:12 ` Richard Henderson
2000-07-20 12:34 ` Mark Probst
1 sibling, 1 reply; 19+ messages in thread
From: Richard Henderson @ 2000-07-20 12:12 UTC (permalink / raw)
To: Mark Probst; +Cc: Geoff Keating, gcc
On Thu, Jul 20, 2000 at 11:21:51AM +0000, Mark Probst wrote:
> i was referring to exactly this optimization. at least on the alpha,
> for example, only calls to static functions are optimized, which is
> not very helpful if you need separate compilation.
This is required by the calling conventions.
Given foo calls bar calls baz, if baz function is in a different file
than bar, it might use a different $gp value. If baz uses a different
$gp value, then the $gp value on return to foo will be incorrect and
foo may crash.
There is no way around this, except foreknowledge that bar and baz
use the same $gp value. And that may only be guaranteed by putting
the functions in the same translation unit.
r~
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: proper tail recursion for gcc
2000-07-20 2:40 ` Jakub Jelinek
@ 2000-07-20 12:16 ` Zack Weinberg
0 siblings, 0 replies; 19+ messages in thread
From: Zack Weinberg @ 2000-07-20 12:16 UTC (permalink / raw)
To: Jakub Jelinek; +Cc: Geoff Keating, Mark Probst, gcc
On Thu, Jul 20, 2000 at 11:41:18AM +0200, Jakub Jelinek wrote:
> On Wed, Jul 19, 2000 at 11:33:08PM -0700, Zack Weinberg wrote:
> > On Wed, Jul 19, 2000 at 11:06:08PM -0700, Geoff Keating wrote:
> > >
> > > Mark Probst <schani@mips.complang.tuwien.ac.at> writes:
> > >
> > > > i want to implement proper tail recursion for a much broader set of
> > > > cases than is currently implemented.
> > > ...
> > >
> > > I think you'll find this has already been done in the development gcc.
> > > It's the 'sibling call' optimisation.
> >
> > It's not as complete as a Scheme compiler is required to implement.
> > The Scheme standard requires sibcall optimization for *every* function
> > call followed immediately by a return from the current function. We
>
> There must be exceptions, like if you take an address of some local
> variable and pass it to some other functions...
Scheme is a LISP dialect. It doesn't have any constructs that would
interfere with this optimization.
...
> > We don't sibcall optimize this, but we could and probably should.
> > [Perhaps this is the sort of thing that would be handled by the
> > "better tree-based sibcall optimizer" coming RSN?]
>
> I believe the problem here is not if we use tree-based or
> expand_call/rtl-based sibcall optimizer, the issues are backend
> related. Basically, if the incoming arguments stack area of the
> tail function is larger than of tail callee, then you have to
> allocate some area immediately below virtual-incoming-args-rtx to
> compensate it and in a machine specific way copy machine dependent
> stuff down (or up, depending on stack growth), e.g. on Intel the
> return address stored on the stack.
I do understand this.
In the specific example I gave, it would be possible to implement it
without machine-specific stack diddling, if we had a concept of
alternate entry points. In pseudo assembler:
foo:
move a(%sp), %r1 ; r1, r2 call-saved
move b(%sp), %r2
push %r2
push %r1
call calculate_bar ; result in %r0
add 8, %sp
jmp foo_tail
foo_with_bar:
move a(%sp), %r1
move b(%sp), %r2
move c(%sp), %r0
foo_tail:
...
But that would take serious interprocedural analysis, which we aren't
going to have anytime soon.
> It is IMHO worth implementing, but under a separate -f switch not
> enabled by default from -O options, because I don't think it will be
> a win for random C code.
This comes back to Mark's original idea, to have an annotation on
specific function calls indicating that they should be sibcall
optimized. I'd rather have that than another -f switch.
zw
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: proper tail recursion for gcc
2000-07-20 12:12 ` Richard Henderson
@ 2000-07-20 12:34 ` Mark Probst
2000-07-20 13:47 ` Richard Henderson
0 siblings, 1 reply; 19+ messages in thread
From: Mark Probst @ 2000-07-20 12:34 UTC (permalink / raw)
To: Richard Henderson; +Cc: gcc
On Thu, Jul 20, 2000 at 12:12:15PM -0700, Richard Henderson wrote:
> On Thu, Jul 20, 2000 at 11:21:51AM +0000, Mark Probst wrote:
> > i was referring to exactly this optimization. at least on the alpha,
> > for example, only calls to static functions are optimized, which is
> > not very helpful if you need separate compilation.
>
> This is required by the calling conventions.
>
> Given foo calls bar calls baz, if baz function is in a different file
> than bar, it might use a different $gp value. If baz uses a different
> $gp value, then the $gp value on return to foo will be incorrect and
> foo may crash.
this is true if foo and bar are in the same compilation unit, because
in this case foo does not reload $gp after calling bar. but if foo
knew that bar might do a tail call to a non-static function, it could
reload $gp and everything would be fine.
as i understand it, this knowledge is available (or could be made
available) to foo only if bar is defined before foo. is this correct?
is there a way around this? if there is not, we would have to assume
that bar makes a tail-call to a non-static function. we could let the
programmer explicitly declare that it does not, if the reloading of
$gp should be avoided.
bye
schani
--
Mark Probst
Student, Programmer
http://www.complang.tuwien.ac.at/~schani/
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: proper tail recursion for gcc
2000-07-20 12:34 ` Mark Probst
@ 2000-07-20 13:47 ` Richard Henderson
0 siblings, 0 replies; 19+ messages in thread
From: Richard Henderson @ 2000-07-20 13:47 UTC (permalink / raw)
To: Mark Probst; +Cc: gcc
On Thu, Jul 20, 2000 at 08:37:11PM +0000, Mark Probst wrote:
> ... but if foo knew that bar might do a tail call to a non-static
> function, it could reload $gp and everything would be fine.
Not unless you communicate that knowledge to the linker
as well, which attempts to remove gp loads that are not
necessary. Not necessary, that is, if you're following
the ABI rules.
r~
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: proper tail recursion for gcc
2000-07-20 12:07 ` Richard Henderson
@ 2000-07-20 14:35 ` Per Bothner
2000-07-20 14:48 ` Joern Rennecke
2000-07-21 3:51 ` Andreas Schwab
0 siblings, 2 replies; 19+ messages in thread
From: Per Bothner @ 2000-07-20 14:35 UTC (permalink / raw)
To: Richard Henderson; +Cc: gcc
Richard Henderson <rth@cygnus.com> writes:
> On Wed, Jul 19, 2000 at 11:33:08PM -0700, Zack Weinberg wrote:
> > It's not as complete as a Scheme compiler is required to implement.
> > The Scheme standard requires sibcall optimization for *every* function
> > call followed immediately by a return from the current function.
>
> Indeed. And such complete tail call optimizations are impossible
> with a stack-based call frame, such as is used for C and so forth.
This is false. General tail-call-elimination is quite compatible
with a stack-based call frame. It can even be made compatible with
normal calling conventions, though it is easier if you have a
calling convention where the called function (callee) pops the stack.
> For Scheme one must put call frames in the heap and garbage collect them.
That is sort-of true, but has nothing to do with tail-call-elimination.
Full lexical closures and call-with-current-continuation are most
noturally implemented as you described, but there are various tricks
and hybrid implementations that people do.
> > We only seem to do the optimization when the argument lists and return
> > types are identical.
> >
> > Consider, for instance:
> >
> > foo(a, b)
> > {
> > c = calculate_bar(a, b);
> > foo_with_bar(a, b, c);
> > }
> >
> > foo_with_bar(a, b, c)
> > {
> > ...
> > }
> >
> > We don't sibcall optimize this, but we could and probably should.
>
> No, we can't. Foo knows that it has room on the stack for
> two arguments, since that's what it was given. Foo_with_bar
> requires three arguments, so there's no enough room to
> perform the tail call without changing the caller to pop more
> from the stack frame.
It is possible though ugly. You can adjust the stack frame to
make room for the extra argument. I.e. you change:
a, b, return_pc, fp, locals ...
to:
a, b, c, return_pc
and then jump to foo_with_bar.
When foo_with_bar returns, it will return to foo's caller, which
will pop off b and c. a remains on the stack, but does no harm.
When foo's caller returns, a will get popped, just as if a had
been alocated with alloca.
This will fail if foo's caller is compiled to not use a frame pointer,
since the stack pointer will not be where it expects it.
--
--Per Bothner
per@bothner.com http://www.bothner.com/~per/
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: proper tail recursion for gcc
2000-07-20 14:35 ` Per Bothner
@ 2000-07-20 14:48 ` Joern Rennecke
2000-07-20 16:36 ` John Vickers
2000-07-21 3:51 ` Andreas Schwab
1 sibling, 1 reply; 19+ messages in thread
From: Joern Rennecke @ 2000-07-20 14:48 UTC (permalink / raw)
To: Per Bothner; +Cc: Richard Henderson, gcc
> When foo_with_bar returns, it will return to foo's caller, which
> will pop off b and c. a remains on the stack, but does no harm.
> When foo's caller returns, a will get popped, just as if a had
> been alocated with alloca.
>
> This will fail if foo's caller is compiled to not use a frame pointer,
> since the stack pointer will not be where it expects it.
And even if it has been compiled with a frame pointer, there can
be a stack overflow if the call is done in a loop with many iterations.
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: proper tail recursion for gcc
2000-07-20 14:48 ` Joern Rennecke
@ 2000-07-20 16:36 ` John Vickers
2000-07-20 21:42 ` Per Bothner
0 siblings, 1 reply; 19+ messages in thread
From: John Vickers @ 2000-07-20 16:36 UTC (permalink / raw)
To: Joern Rennecke; +Cc: Per Bothner, Richard Henderson, gcc
Joern Rennecke wrote:
>
> > When foo_with_bar returns, it will return to foo's caller, which
> > will pop off b and c. a remains on the stack, but does no harm.
> > When foo's caller returns, a will get popped, just as if a had
> > been alocated with alloca.
Every argument frame written to the stack for a tail call
has to start at the beginning of the argument frame of the caller:
i.e. each tail caller in effect pops it's own arguments before pushing
the tail callee's arguments.
Otherwise tail recursion between two functions with different sized
argument frames would use unbounded memory.
> >
> > This will fail if foo's caller is compiled to not use a frame pointer,
> > since the stack pointer will not be where it expects it.
>
> And even if it has been compiled with a frame pointer, there can
> be a stack overflow if the call is done in a loop with many iterations.
What is this "loop" construct of which you speak ?
Regards,
John.
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: proper tail recursion for gcc
2000-07-20 16:36 ` John Vickers
@ 2000-07-20 21:42 ` Per Bothner
0 siblings, 0 replies; 19+ messages in thread
From: Per Bothner @ 2000-07-20 21:42 UTC (permalink / raw)
To: John Vickers; +Cc: gcc
John Vickers <John.Vickers@pace.co.uk> writes:
> Every argument frame written to the stack for a tail call
> has to start at the beginning of the argument frame of the caller:
> i.e. each tail caller in effect pops it's own arguments before pushing
> the tail callee's arguments.
Except if you use the typical C calling conventions, it is the caller
that is responsible for pushing the arguemnts (due to need for
handling varargs). So if the called function also pops up the arguments,
you risk popping too much off, which could clobber the stack (if for
example signals are pushed on the same stack).
That is why a tail-call-elimination works so much better if you use
a calling convention where the callee pops the arguements.
> Otherwise tail recursion between two functions with different sized
> argument frames would use unbounded memory.
There is a risk of that. Of course the stack would grow much faster
if you just did the recursive calls, but you can't guarantee that
slower that a long loop won't overflow the stack.
However, if you wanted to compile a langauge like Scheme, you could
use a convention where all Scheme procedures take the same number
of arguments (from the compiler's point of view) by using a list
argument. If you restrict yourself to handling tail-call-elimination
where the caller and calle have the same parameter and return types
(which is all that is really needed for implementing Scheme) the
problem is a lot easier.
> > And even if it has been compiled with a frame pointer, there can
> > be a stack overflow if the call is done in a loop with many iterations.
>
> What is this "loop" construct of which you speak ?
In Scheme, there is no "loop" primitive. Instead, there are two standard
pieces of syntactic sugar that are defined in terms of tail recursion.
More generally, a loop is just a special case of tail recursion.
--
--Per Bothner
per@bothner.com http://www.bothner.com/~per/
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: proper tail recursion for gcc
2000-07-20 14:35 ` Per Bothner
2000-07-20 14:48 ` Joern Rennecke
@ 2000-07-21 3:51 ` Andreas Schwab
1 sibling, 0 replies; 19+ messages in thread
From: Andreas Schwab @ 2000-07-21 3:51 UTC (permalink / raw)
To: Per Bothner; +Cc: Richard Henderson, gcc
[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #1: Type: text/plain, Size: 1716 bytes --]
Per Bothner <per@bothner.com> writes:
|> Richard Henderson <rth@cygnus.com> writes:
|>
|> > On Wed, Jul 19, 2000 at 11:33:08PM -0700, Zack Weinberg wrote:
|> > > We only seem to do the optimization when the argument lists and return
|> > > types are identical.
|> > >
|> > > Consider, for instance:
|> > >
|> > > foo(a, b)
|> > > {
|> > > c = calculate_bar(a, b);
|> > > foo_with_bar(a, b, c);
|> > > }
|> > >
|> > > foo_with_bar(a, b, c)
|> > > {
|> > > ...
|> > > }
|> > >
|> > > We don't sibcall optimize this, but we could and probably should.
|> >
|> > No, we can't. Foo knows that it has room on the stack for
|> > two arguments, since that's what it was given. Foo_with_bar
|> > requires three arguments, so there's no enough room to
|> > perform the tail call without changing the caller to pop more
|> > from the stack frame.
|>
|> It is possible though ugly. You can adjust the stack frame to
|> make room for the extra argument. I.e. you change:
|> a, b, return_pc, fp, locals ...
|> to:
|> a, b, c, return_pc
|> and then jump to foo_with_bar.
|> When foo_with_bar returns, it will return to foo's caller, which
|> will pop off b and c. a remains on the stack, but does no harm.
|> When foo's caller returns, a will get popped, just as if a had
|> been alocated with alloca.
This means that the stack grows without bounds if foo's caller never
returns and calls foo in an infinite loop, e.g:
toplevel()
{
for (;;)
foo(a,b);
}
Andreas.
--
Andreas Schwab "And now for something
SuSE Labs completely different."
Andreas.Schwab@suse.de
SuSE GmbH, Schanzäckerstr. 10, D-90443 Nürnberg
^ permalink raw reply [flat|nested] 19+ messages in thread
end of thread, other threads:[~2000-07-21 3:51 UTC | newest]
Thread overview: 19+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2000-07-19 18:01 proper tail recursion for gcc Mark Probst
2000-07-19 23:06 ` Geoff Keating
2000-07-19 23:35 ` Zack Weinberg
2000-07-20 2:40 ` Jakub Jelinek
2000-07-20 12:16 ` Zack Weinberg
2000-07-20 8:26 ` Mark Mitchell
2000-07-20 10:02 ` Mark Probst
2000-07-20 12:07 ` Richard Henderson
2000-07-20 14:35 ` Per Bothner
2000-07-20 14:48 ` Joern Rennecke
2000-07-20 16:36 ` John Vickers
2000-07-20 21:42 ` Per Bothner
2000-07-21 3:51 ` Andreas Schwab
2000-07-20 3:20 ` Mark Probst
2000-07-20 7:12 ` Alexandre Oliva
2000-07-20 7:32 ` Theodore Papadopoulo
2000-07-20 12:12 ` Richard Henderson
2000-07-20 12:34 ` Mark Probst
2000-07-20 13:47 ` Richard Henderson
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).