proper tail recursion for gcc

public inbox for gcc@gcc.gnu.org
 help / color / mirror / Atom feed

* proper tail recursion for gcc
@ 2000-07-19 18:01 Mark Probst
  2000-07-19 23:06 ` Geoff Keating
  0 siblings, 1 reply; 19+ messages in thread
From: Mark Probst @ 2000-07-19 18:01 UTC (permalink / raw)
  To: gcc

hi

i want to implement proper tail recursion for a much broader set of
cases than is currently implemented.

motivation: people using gnu c as a portable assembler suffer from the
fact that there is no way to make gcc do proper tail recursion. as a
result, some scheme compilers (bigloo, stalin and hobbit, which is
used by guile) are not standard compliant (scheme requires proper tail
recursion) and at least one prolog compiler (gnu prolog) has switched
from generating c code to assembler, making it non-portable (it is
currently only avaible for x86 and sparc). the prolog-like mercury
has to jump through some (non-portable) hoops to use gcc as an
efficient assembler. the currently available tail call optimization
in gcc does not go far enough, especially in that it does not permit
separate compilation (tail-callees must be static).

proposal: i want to implement a construct for tail calls. this could
either be in the form of a modifier for the return statement or a new
language construct, i.e.

	return __tailcall__ f(a,b,c);

or

	__tailcall__ f(a,b,c);

suggestions regarding the syntax are welcome. there are several
reasons for doing it this way. the alternatives would be:

* analysis within the compiler regarding which tail calls can be made
  proper. the problem with this approach is that there might be a lot
  of badly behaving code in existance that assumes that tail calls are
  not proper. furthermore, there would have to be sharp and easily
  satisfyable constraints on the c code of the caller or else there
  would be no way to make sure that tail calls really are proper.

* declaration of tail-callees. using this approach it would still be
  necessary to analyze the code in order to determine whether a given
  tail call could be made proper. hence, all above mentioned problems
  apply here also.

limitations:

* functions with variable arguments could not make proper tail calls.

i'd like to implement this feature for at least the alpha and x86
architectures.

do you have any suggestions? anything i should look out for?

bye
schani

-- 
Mark Probst
Student, Programmer
http://www.complang.tuwien.ac.at/~schani/

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: proper tail recursion for gcc
  2000-07-19 18:01 proper tail recursion for gcc Mark Probst
@ 2000-07-19 23:06 ` Geoff Keating
  2000-07-19 23:35   ` Zack Weinberg
  2000-07-20  3:20   ` Mark Probst
  0 siblings, 2 replies; 19+ messages in thread
From: Geoff Keating @ 2000-07-19 23:06 UTC (permalink / raw)
  To: Mark Probst; +Cc: gcc

Mark Probst <schani@mips.complang.tuwien.ac.at> writes:

> i want to implement proper tail recursion for a much broader set of
> cases than is currently implemented.
...

I think you'll find this has already been done in the development gcc.
It's the 'sibling call' optimisation.

-- 
- Geoffrey Keating <geoffk@cygnus.com>

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: proper tail recursion for gcc
  2000-07-19 23:06 ` Geoff Keating
@ 2000-07-19 23:35   ` Zack Weinberg
  2000-07-20  2:40     ` Jakub Jelinek
                       ` (2 more replies)
  2000-07-20  3:20   ` Mark Probst
  1 sibling, 3 replies; 19+ messages in thread
From: Zack Weinberg @ 2000-07-19 23:35 UTC (permalink / raw)
  To: Geoff Keating; +Cc: Mark Probst, gcc

On Wed, Jul 19, 2000 at 11:06:08PM -0700, Geoff Keating wrote:
> 
> Mark Probst <schani@mips.complang.tuwien.ac.at> writes:
> 
> > i want to implement proper tail recursion for a much broader set of
> > cases than is currently implemented.
> ...
> 
> I think you'll find this has already been done in the development gcc.
> It's the 'sibling call' optimisation.

It's not as complete as a Scheme compiler is required to implement.
The Scheme standard requires sibcall optimization for *every* function
call followed immediately by a return from the current function.  We
only seem to do the optimization when the argument lists and return
types are identical.

Consider, for instance:

foo(a, b)
{
  c = calculate_bar(a, b);
  foo_with_bar(a, b, c);
}

foo_with_bar(a, b, c)
{
  ...
}

We don't sibcall optimize this, but we could and probably should.
[Perhaps this is the sort of thing that would be handled by the
"better tree-based sibcall optimizer" coming RSN?]

zw

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: proper tail recursion for gcc
  2000-07-19 23:35   ` Zack Weinberg
@ 2000-07-20  2:40     ` Jakub Jelinek
  2000-07-20 12:16       ` Zack Weinberg
  2000-07-20  8:26     ` Mark Mitchell
  2000-07-20 12:07     ` Richard Henderson
  2 siblings, 1 reply; 19+ messages in thread
From: Jakub Jelinek @ 2000-07-20  2:40 UTC (permalink / raw)
  To: Zack Weinberg; +Cc: Geoff Keating, Mark Probst, gcc

On Wed, Jul 19, 2000 at 11:33:08PM -0700, Zack Weinberg wrote:
> On Wed, Jul 19, 2000 at 11:06:08PM -0700, Geoff Keating wrote:
> > 
> > Mark Probst <schani@mips.complang.tuwien.ac.at> writes:
> > 
> > > i want to implement proper tail recursion for a much broader set of
> > > cases than is currently implemented.
> > ...
> > 
> > I think you'll find this has already been done in the development gcc.
> > It's the 'sibling call' optimisation.
> 
> It's not as complete as a Scheme compiler is required to implement.
> The Scheme standard requires sibcall optimization for *every* function
> call followed immediately by a return from the current function.  We

There must be exceptions, like if you take an address of some local variable and pass
it to some other functions...

> only seem to do the optimization when the argument lists and return
> types are identical.
> 
> Consider, for instance:
> 
> foo(a, b)
> {
>   c = calculate_bar(a, b);
>   foo_with_bar(a, b, c);
> }
> 
> foo_with_bar(a, b, c)
> {
>   ...
> }
> 
> We don't sibcall optimize this, but we could and probably should.
> [Perhaps this is the sort of thing that would be handled by the
> "better tree-based sibcall optimizer" coming RSN?]

I believe the problem here is not if we use tree-based or
expand_call/rtl-based sibcall optimizer, the issues are backend related.
Basically, if the incoming arguments stack area of the tail function is
larger than of tail callee, then you have to allocate some area immediately
below virtual-incoming-args-rtx to compensate it and in a machine specific
way copy machine dependent stuff down (or up, depending on stack growth),
e.g. on Intel the return address stored on the stack.
It is IMHO worth implementing, but under a separate -f switch not enabled by
default from -O options, because I don't think it will be a win for random C
code.
(BTW: The above example will be optimized on SPARC (if there are less than 6
word size arguments in both prototypes)).

	Jakub

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: proper tail recursion for gcc
  2000-07-19 23:06 ` Geoff Keating
  2000-07-19 23:35   ` Zack Weinberg
@ 2000-07-20  3:20   ` Mark Probst
  2000-07-20  7:12     ` Alexandre Oliva
  2000-07-20 12:12     ` Richard Henderson
  1 sibling, 2 replies; 19+ messages in thread
From: Mark Probst @ 2000-07-20  3:20 UTC (permalink / raw)
  To: Geoff Keating; +Cc: gcc

On Wed, Jul 19, 2000 at 11:06:08PM -0700, Geoff Keating wrote:
> > i want to implement proper tail recursion for a much broader set of
> > cases than is currently implemented.
> ...
> 
> I think you'll find this has already been done in the development gcc.
> It's the 'sibling call' optimisation.

i was referring to exactly this optimization. at least on the alpha,
for example, only calls to static functions are optimized, which is
not very helpful if you need separate compilation.

bye
schani

-- 
Mark Probst
Student, Programmer
http://www.complang.tuwien.ac.at/~schani/

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: proper tail recursion for gcc
  2000-07-20  3:20   ` Mark Probst
@ 2000-07-20  7:12     ` Alexandre Oliva
  2000-07-20  7:32       ` Theodore Papadopoulo
  2000-07-20 12:12     ` Richard Henderson
  1 sibling, 1 reply; 19+ messages in thread
From: Alexandre Oliva @ 2000-07-20  7:12 UTC (permalink / raw)
  To: Mark Probst; +Cc: Geoff Keating, gcc

On Jul 20, 2000, Mark Probst <schani@mips.complang.tuwien.ac.at> wrote:

> On Wed, Jul 19, 2000 at 11:06:08PM -0700, Geoff Keating wrote:

>> I think you'll find this has already been done in the development gcc.
>> It's the 'sibling call' optimisation.

> only calls to static functions are optimized

Non-static functions may be overridden if the object file is used to
create a library and a program linked with the library defines the
same symbol.

-- 
Alexandre Oliva   Enjoy Guarana', see http://www.ic.unicamp.br/~oliva/
Red Hat GCC Developer                  aoliva@{cygnus.com, redhat.com}
CS PhD student at IC-Unicamp        oliva@{lsd.ic.unicamp.br, gnu.org}
Free Software Evangelist    *Please* write to mailing lists, not to me

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: proper tail recursion for gcc
  2000-07-20  7:12     ` Alexandre Oliva
@ 2000-07-20  7:32       ` Theodore Papadopoulo
  0 siblings, 0 replies; 19+ messages in thread
From: Theodore Papadopoulo @ 2000-07-20  7:32 UTC (permalink / raw)
  To: Alexandre Oliva; +Cc: Mark Probst, Geoff Keating, gcc

aoliva@redhat.com said:
>> only calls to static functions are optimized
> Non-static functions may be overridden if the object file is used to
> create a library and a program linked with the library defines the
> same symbol. 

Then, may be an attribute never_overriden (or something similar) could
be of some help here ???
On the other hand, that's certainly dangerous....

--------------------------------------------------------------------
Theodore Papadopoulo
Email: Theodore.Papadopoulo@sophia.inria.fr Tel: (33) 04 92 38 76 01
 --------------------------------------------------------------------


^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: proper tail recursion for gcc
  2000-07-19 23:35   ` Zack Weinberg
  2000-07-20  2:40     ` Jakub Jelinek
@ 2000-07-20  8:26     ` Mark Mitchell
  2000-07-20 10:02       ` Mark Probst
  2000-07-20 12:07     ` Richard Henderson
  2 siblings, 1 reply; 19+ messages in thread
From: Mark Mitchell @ 2000-07-20  8:26 UTC (permalink / raw)
  To: zack; +Cc: geoffk, schani, gcc

>>>>> "Zack" == Zack Weinberg <zack@wolery.cumb.org> writes:

    Zack> We don't sibcall optimize this, but we could and probably
    Zack> should.  [Perhaps this is the sort of thing that would be
    Zack> handled by the "better tree-based sibcall optimizer" coming
    Zack> RSN?]

Yup.  Except that RSN isn't on anybody's schedule, so far as I know. :-(

--
Mark Mitchell                   mark@codesourcery.com
CodeSourcery, LLC               http://www.codesourcery.com

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: proper tail recursion for gcc
  2000-07-20  8:26     ` Mark Mitchell
@ 2000-07-20 10:02       ` Mark Probst
  0 siblings, 0 replies; 19+ messages in thread
From: Mark Probst @ 2000-07-20 10:02 UTC (permalink / raw)
  To: Mark Mitchell; +Cc: gcc

On Thu, Jul 20, 2000 at 08:24:59AM -0700, Mark Mitchell wrote:
>     Zack> We don't sibcall optimize this, but we could and probably
>     Zack> should.  [Perhaps this is the sort of thing that would be
>     Zack> handled by the "better tree-based sibcall optimizer" coming
>     Zack> RSN?]
> 
> Yup.  Except that RSN isn't on anybody's schedule, so far as I know. :-(

could you mail me the plans/thoughts for the tree-based sibcall
optimized? i'd really like to do work in this direction. i also
learned that the sibcall optimizer does optimize non-static calls on
some platforms, like the sparc. on alpha and i386 it does not,
however. i'd like to look into this issue also, as these are my
primary development targets.

bye
schani

-- 
Mark Probst
Student, Programmer
http://www.complang.tuwien.ac.at/~schani/

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: proper tail recursion for gcc
  2000-07-19 23:35   ` Zack Weinberg
  2000-07-20  2:40     ` Jakub Jelinek
  2000-07-20  8:26     ` Mark Mitchell
@ 2000-07-20 12:07     ` Richard Henderson
  2000-07-20 14:35       ` Per Bothner
  2 siblings, 1 reply; 19+ messages in thread
From: Richard Henderson @ 2000-07-20 12:07 UTC (permalink / raw)
  To: Zack Weinberg; +Cc: Geoff Keating, Mark Probst, gcc

On Wed, Jul 19, 2000 at 11:33:08PM -0700, Zack Weinberg wrote:
> It's not as complete as a Scheme compiler is required to implement.
> The Scheme standard requires sibcall optimization for *every* function
> call followed immediately by a return from the current function.

Indeed.  And such complete tail call optimizations are impossible
with a stack-based call frame, such as is used for C and so forth.
For Scheme one must put call frames in the heap and garbage collect them.

> We only seem to do the optimization when the argument lists and return
> types are identical.
> 
> Consider, for instance:
> 
> foo(a, b)
> {
>   c = calculate_bar(a, b);
>   foo_with_bar(a, b, c);
> }
> 
> foo_with_bar(a, b, c)
> {
>   ...
> }
> 
> We don't sibcall optimize this, but we could and probably should.

No, we can't.  Foo knows that it has room on the stack for
two arguments, since that's what it was given.  Foo_with_bar
requires three arguments, so there's no enough room to 
perform the tail call without changing the caller to pop more
from the stack frame.

I believe you will find that this example does get transformed 
to a tail call on Sparc, since in this case all arguments will 
be in registers.

> [Perhaps this is the sort of thing that would be handled by the
> "better tree-based sibcall optimizer" coming RSN?]

No, the tree-based sibcall optimizer just makes the analysis 
easier and the code cleaner.  It does not change the calling
convention dependant restrictions placed on the optimization.

r~

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: proper tail recursion for gcc
  2000-07-20  3:20   ` Mark Probst
  2000-07-20  7:12     ` Alexandre Oliva
@ 2000-07-20 12:12     ` Richard Henderson
  2000-07-20 12:34       ` Mark Probst
  1 sibling, 1 reply; 19+ messages in thread
From: Richard Henderson @ 2000-07-20 12:12 UTC (permalink / raw)
  To: Mark Probst; +Cc: Geoff Keating, gcc

On Thu, Jul 20, 2000 at 11:21:51AM +0000, Mark Probst wrote:
> i was referring to exactly this optimization. at least on the alpha,
> for example, only calls to static functions are optimized, which is
> not very helpful if you need separate compilation.

This is required by the calling conventions.

Given foo calls bar calls baz, if baz function is in a different file
than bar, it might use a different $gp value.  If baz uses a different
$gp value, then the $gp value on return to foo will be incorrect and
foo may crash.

There is no way around this, except foreknowledge that bar and baz
use the same $gp value.  And that may only be guaranteed by putting
the functions in the same translation unit.

r~

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: proper tail recursion for gcc
  2000-07-20  2:40     ` Jakub Jelinek
@ 2000-07-20 12:16       ` Zack Weinberg
  0 siblings, 0 replies; 19+ messages in thread
From: Zack Weinberg @ 2000-07-20 12:16 UTC (permalink / raw)
  To: Jakub Jelinek; +Cc: Geoff Keating, Mark Probst, gcc

On Thu, Jul 20, 2000 at 11:41:18AM +0200, Jakub Jelinek wrote:
> On Wed, Jul 19, 2000 at 11:33:08PM -0700, Zack Weinberg wrote:
> > On Wed, Jul 19, 2000 at 11:06:08PM -0700, Geoff Keating wrote:
> > > 
> > > Mark Probst <schani@mips.complang.tuwien.ac.at> writes:
> > > 
> > > > i want to implement proper tail recursion for a much broader set of
> > > > cases than is currently implemented.
> > > ...
> > > 
> > > I think you'll find this has already been done in the development gcc.
> > > It's the 'sibling call' optimisation.
> > 
> > It's not as complete as a Scheme compiler is required to implement.
> > The Scheme standard requires sibcall optimization for *every* function
> > call followed immediately by a return from the current function.  We
> 
> There must be exceptions, like if you take an address of some local
> variable and pass it to some other functions...

Scheme is a LISP dialect.  It doesn't have any constructs that would
interfere with this optimization.

...
> > We don't sibcall optimize this, but we could and probably should.
> > [Perhaps this is the sort of thing that would be handled by the
> > "better tree-based sibcall optimizer" coming RSN?]
> 
> I believe the problem here is not if we use tree-based or
> expand_call/rtl-based sibcall optimizer, the issues are backend
> related.  Basically, if the incoming arguments stack area of the
> tail function is larger than of tail callee, then you have to
> allocate some area immediately below virtual-incoming-args-rtx to
> compensate it and in a machine specific way copy machine dependent
> stuff down (or up, depending on stack growth), e.g. on Intel the
> return address stored on the stack.

I do understand this.

In the specific example I gave, it would be possible to implement it
without machine-specific stack diddling, if we had a concept of
alternate entry points.  In pseudo assembler:

foo:
	move a(%sp), %r1	; r1, r2 call-saved
	move b(%sp), %r2
	push %r2
	push %r1
	call calculate_bar	; result in %r0
	add  8, %sp
	jmp  foo_tail

foo_with_bar:
	move a(%sp), %r1
	move b(%sp), %r2
	move c(%sp), %r0
foo_tail:
	...

But that would take serious interprocedural analysis, which we aren't
going to have anytime soon.

> It is IMHO worth implementing, but under a separate -f switch not
> enabled by default from -O options, because I don't think it will be
> a win for random C code.

This comes back to Mark's original idea, to have an annotation on
specific function calls indicating that they should be sibcall
optimized.  I'd rather have that than another -f switch.

zw

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: proper tail recursion for gcc
  2000-07-20 12:12     ` Richard Henderson
@ 2000-07-20 12:34       ` Mark Probst
  2000-07-20 13:47         ` Richard Henderson
  0 siblings, 1 reply; 19+ messages in thread
From: Mark Probst @ 2000-07-20 12:34 UTC (permalink / raw)
  To: Richard Henderson; +Cc: gcc

On Thu, Jul 20, 2000 at 12:12:15PM -0700, Richard Henderson wrote:
> On Thu, Jul 20, 2000 at 11:21:51AM +0000, Mark Probst wrote:
> > i was referring to exactly this optimization. at least on the alpha,
> > for example, only calls to static functions are optimized, which is
> > not very helpful if you need separate compilation.
> 
> This is required by the calling conventions.
> 
> Given foo calls bar calls baz, if baz function is in a different file
> than bar, it might use a different $gp value.  If baz uses a different
> $gp value, then the $gp value on return to foo will be incorrect and
> foo may crash.

this is true if foo and bar are in the same compilation unit, because
in this case foo does not reload $gp after calling bar. but if foo
knew that bar might do a tail call to a non-static function, it could
reload $gp and everything would be fine.

as i understand it, this knowledge is available (or could be made
available) to foo only if bar is defined before foo. is this correct?
is there a way around this? if there is not, we would have to assume
that bar makes a tail-call to a non-static function. we could let the
programmer explicitly declare that it does not, if the reloading of
$gp should be avoided.

bye
schani

-- 
Mark Probst
Student, Programmer
http://www.complang.tuwien.ac.at/~schani/

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: proper tail recursion for gcc
  2000-07-20 12:34       ` Mark Probst
@ 2000-07-20 13:47         ` Richard Henderson
  0 siblings, 0 replies; 19+ messages in thread
From: Richard Henderson @ 2000-07-20 13:47 UTC (permalink / raw)
  To: Mark Probst; +Cc: gcc

On Thu, Jul 20, 2000 at 08:37:11PM +0000, Mark Probst wrote:
> ... but if foo knew that bar might do a tail call to a non-static
> function, it could reload $gp and everything would be fine.

Not unless you communicate that knowledge to the linker
as well, which attempts to remove gp loads that are not
necessary.  Not necessary, that is, if you're following
the ABI rules.


r~

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: proper tail recursion for gcc
  2000-07-20 12:07     ` Richard Henderson
@ 2000-07-20 14:35       ` Per Bothner
  2000-07-20 14:48         ` Joern Rennecke
  2000-07-21  3:51         ` Andreas Schwab
  0 siblings, 2 replies; 19+ messages in thread
From: Per Bothner @ 2000-07-20 14:35 UTC (permalink / raw)
  To: Richard Henderson; +Cc: gcc

Richard Henderson <rth@cygnus.com> writes:

> On Wed, Jul 19, 2000 at 11:33:08PM -0700, Zack Weinberg wrote:
> > It's not as complete as a Scheme compiler is required to implement.
> > The Scheme standard requires sibcall optimization for *every* function
> > call followed immediately by a return from the current function.
> 
> Indeed.  And such complete tail call optimizations are impossible
> with a stack-based call frame, such as is used for C and so forth.

This is false.  General tail-call-elimination is quite compatible
with a stack-based call frame.  It can even be made compatible with
normal calling conventions, though it is easier if you have a
calling convention where the called function (callee) pops the stack.

> For Scheme one must put call frames in the heap and garbage collect them.

That is sort-of true, but has nothing to do with tail-call-elimination.
Full lexical closures and call-with-current-continuation are most
noturally implemented as you described, but there are various tricks
and hybrid implementations that people do.

> > We only seem to do the optimization when the argument lists and return
> > types are identical.
> > 
> > Consider, for instance:
> > 
> > foo(a, b)
> > {
> >   c = calculate_bar(a, b);
> >   foo_with_bar(a, b, c);
> > }
> > 
> > foo_with_bar(a, b, c)
> > {
> >   ...
> > }
> > 
> > We don't sibcall optimize this, but we could and probably should.
> 
> No, we can't.  Foo knows that it has room on the stack for
> two arguments, since that's what it was given.  Foo_with_bar
> requires three arguments, so there's no enough room to 
> perform the tail call without changing the caller to pop more
> from the stack frame.

It is possible though ugly.  You can adjust the stack frame to
make room for the extra argument.  I.e. you change:
        a, b, return_pc, fp, locals ...
to:
        a, b, c, return_pc
and then jump to foo_with_bar.
When foo_with_bar returns, it will return to foo's caller, which
will pop off b and c.  a remains on the stack, but does no harm.
When foo's caller returns, a will get popped, just as if a had
been alocated with alloca.

This will fail if foo's caller is compiled to not use a frame pointer,
since the stack pointer will not be where it expects it.
-- 
	--Per Bothner
per@bothner.com   http://www.bothner.com/~per/

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: proper tail recursion for gcc
  2000-07-20 14:35       ` Per Bothner
@ 2000-07-20 14:48         ` Joern Rennecke
  2000-07-20 16:36           ` John Vickers
  2000-07-21  3:51         ` Andreas Schwab
  1 sibling, 1 reply; 19+ messages in thread
From: Joern Rennecke @ 2000-07-20 14:48 UTC (permalink / raw)
  To: Per Bothner; +Cc: Richard Henderson, gcc

> When foo_with_bar returns, it will return to foo's caller, which
> will pop off b and c.  a remains on the stack, but does no harm.
> When foo's caller returns, a will get popped, just as if a had
> been alocated with alloca.
> 
> This will fail if foo's caller is compiled to not use a frame pointer,
> since the stack pointer will not be where it expects it.

And even if it has been compiled with a frame pointer, there can
be a stack overflow if the call is done in a loop with many iterations.

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: proper tail recursion for gcc
  2000-07-20 14:48         ` Joern Rennecke
@ 2000-07-20 16:36           ` John Vickers
  2000-07-20 21:42             ` Per Bothner
  0 siblings, 1 reply; 19+ messages in thread
From: John Vickers @ 2000-07-20 16:36 UTC (permalink / raw)
  To: Joern Rennecke; +Cc: Per Bothner, Richard Henderson, gcc

Joern Rennecke wrote:
> 
> > When foo_with_bar returns, it will return to foo's caller, which
> > will pop off b and c.  a remains on the stack, but does no harm.
> > When foo's caller returns, a will get popped, just as if a had
> > been alocated with alloca.

Every argument frame written to the stack for a tail call
has to start at the beginning of the argument frame of the caller:
i.e. each tail caller in effect pops it's own arguments before pushing
the tail callee's arguments.

Otherwise tail recursion between two functions with different sized
argument frames would use unbounded memory.

> >
> > This will fail if foo's caller is compiled to not use a frame pointer,
> > since the stack pointer will not be where it expects it.
> 
> And even if it has been compiled with a frame pointer, there can
> be a stack overflow if the call is done in a loop with many iterations.

What is this "loop" construct of which you speak ?

Regards,

John.

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: proper tail recursion for gcc
  2000-07-20 16:36           ` John Vickers
@ 2000-07-20 21:42             ` Per Bothner
  0 siblings, 0 replies; 19+ messages in thread
From: Per Bothner @ 2000-07-20 21:42 UTC (permalink / raw)
  To: John Vickers; +Cc: gcc

John Vickers <John.Vickers@pace.co.uk> writes:

> Every argument frame written to the stack for a tail call
> has to start at the beginning of the argument frame of the caller:
> i.e. each tail caller in effect pops it's own arguments before pushing
> the tail callee's arguments.

Except if you use the typical C calling conventions, it is the caller
that is responsible for pushing the arguemnts (due to need for
handling varargs).  So if the called function also pops up the arguments,
you risk popping too much off, which could clobber the stack (if for
example signals are pushed on the same stack).

That is why a tail-call-elimination works so much better if you use
a calling convention where the callee pops the arguements.

> Otherwise tail recursion between two functions with different sized
> argument frames would use unbounded memory.

There is a risk of that.  Of course the stack would grow much faster
if you just did the recursive calls, but you can't guarantee that
slower that a long loop won't overflow the stack.

However, if you wanted to compile a langauge like Scheme, you could
use a convention where all Scheme procedures take the same number
of arguments (from the compiler's point of view) by using a list
argument.  If you restrict yourself to handling tail-call-elimination
where the caller and calle have the same parameter and return types
(which is all that is really needed for implementing Scheme) the
problem is a lot easier.

> > And even if it has been compiled with a frame pointer, there can
> > be a stack overflow if the call is done in a loop with many iterations.
> 
> What is this "loop" construct of which you speak ?

In Scheme, there is no "loop" primitive.  Instead, there are two standard
pieces of syntactic sugar that are defined in terms of tail recursion.
More generally, a loop is just a special case of tail recursion.
-- 
	--Per Bothner
per@bothner.com   http://www.bothner.com/~per/

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: proper tail recursion for gcc
  2000-07-20 14:35       ` Per Bothner
  2000-07-20 14:48         ` Joern Rennecke
@ 2000-07-21  3:51         ` Andreas Schwab
  1 sibling, 0 replies; 19+ messages in thread
From: Andreas Schwab @ 2000-07-21  3:51 UTC (permalink / raw)
  To: Per Bothner; +Cc: Richard Henderson, gcc

[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #1: Type: text/plain, Size: 1716 bytes --]

Per Bothner <per@bothner.com> writes:

|> Richard Henderson <rth@cygnus.com> writes:
|> 
|> > On Wed, Jul 19, 2000 at 11:33:08PM -0700, Zack Weinberg wrote:
|> > > We only seem to do the optimization when the argument lists and return
|> > > types are identical.
|> > > 
|> > > Consider, for instance:
|> > > 
|> > > foo(a, b)
|> > > {
|> > >   c = calculate_bar(a, b);
|> > >   foo_with_bar(a, b, c);
|> > > }
|> > > 
|> > > foo_with_bar(a, b, c)
|> > > {
|> > >   ...
|> > > }
|> > > 
|> > > We don't sibcall optimize this, but we could and probably should.
|> > 
|> > No, we can't.  Foo knows that it has room on the stack for
|> > two arguments, since that's what it was given.  Foo_with_bar
|> > requires three arguments, so there's no enough room to 
|> > perform the tail call without changing the caller to pop more
|> > from the stack frame.
|> 
|> It is possible though ugly.  You can adjust the stack frame to
|> make room for the extra argument.  I.e. you change:
|>         a, b, return_pc, fp, locals ...
|> to:
|>         a, b, c, return_pc
|> and then jump to foo_with_bar.
|> When foo_with_bar returns, it will return to foo's caller, which
|> will pop off b and c.  a remains on the stack, but does no harm.
|> When foo's caller returns, a will get popped, just as if a had
|> been alocated with alloca.

This means that the stack grows without bounds if foo's caller never
returns and calls foo in an infinite loop, e.g:

toplevel()
{
  for (;;)
    foo(a,b);
}

Andreas.

-- 
Andreas Schwab                                  "And now for something
SuSE Labs                                        completely different."
Andreas.Schwab@suse.de
SuSE GmbH, SchanzÃ¤ckerstr. 10, D-90443 NÃ¼rnberg

^ permalink raw reply	[flat|nested] 19+ messages in thread

end of thread, other threads:[~2000-07-21  3:51 UTC | newest]

Thread overview: 19+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2000-07-19 18:01 proper tail recursion for gcc Mark Probst
2000-07-19 23:06 ` Geoff Keating
2000-07-19 23:35   ` Zack Weinberg
2000-07-20  2:40     ` Jakub Jelinek
2000-07-20 12:16       ` Zack Weinberg
2000-07-20  8:26     ` Mark Mitchell
2000-07-20 10:02       ` Mark Probst
2000-07-20 12:07     ` Richard Henderson
2000-07-20 14:35       ` Per Bothner
2000-07-20 14:48         ` Joern Rennecke
2000-07-20 16:36           ` John Vickers
2000-07-20 21:42             ` Per Bothner
2000-07-21  3:51         ` Andreas Schwab
2000-07-20  3:20   ` Mark Probst
2000-07-20  7:12     ` Alexandre Oliva
2000-07-20  7:32       ` Theodore Papadopoulo
2000-07-20 12:12     ` Richard Henderson
2000-07-20 12:34       ` Mark Probst
2000-07-20 13:47         ` Richard Henderson

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).