public inbox for gcc@gcc.gnu.org
 help / color / mirror / Atom feed
* Re: forcing tail/sibling call optimization
@ 2000-11-26  8:14 Robert Dewar
  2000-11-26 13:43 ` Fergus Henderson
  2000-11-27  7:58 ` Jeffrey A Law
  0 siblings, 2 replies; 64+ messages in thread
From: Robert Dewar @ 2000-11-26  8:14 UTC (permalink / raw)
  To: dewar, fjh; +Cc: gcc

The formal definition seems quite reasonable here, but I must say
I dislike the syntax, and it is unnecessarily provocative, since
this really is not semantically a goto at all (e.g. the formal
denmotational semantics would have nothing to do with the semantics
of goto). Why not something like "tail return" or perhaps better 
simply a compiler directive.

The real point here is that the semantics is identical to return. What
the declaration (in whatever form it exists) is doing is to say that
the semanjtics of execution will be undefined if there are any subsequent
references to local variables of the caller. That's *all* that needs to
be said semantically.

In Ada, one would provide this capability by simply having a pragma
e.g. pragma Tail_Return; which has no effect on the semantics of the
program if it is correct.

^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: forcing tail/sibling call optimization
  2000-11-26  8:14 forcing tail/sibling call optimization Robert Dewar
@ 2000-11-26 13:43 ` Fergus Henderson
  2000-11-27  7:58 ` Jeffrey A Law
  1 sibling, 0 replies; 64+ messages in thread
From: Fergus Henderson @ 2000-11-26 13:43 UTC (permalink / raw)
  To: Robert Dewar; +Cc: gcc

On 26-Nov-2000, Robert Dewar <dewar@gnat.com> wrote:
> The formal definition seems quite reasonable here, but I must say
> I dislike the syntax [...]

OK, you've convinced me that reusing `goto' is a bad idea.

How about `return __tailcall <func> ( <args> ) ;'?

-- 
Fergus Henderson <fjh@cs.mu.oz.au>  |  "I have always known that the pursuit
                                    |  of excellence is a lethal habit"
WWW: < http://www.cs.mu.oz.au/~fjh >  |     -- the last words of T. S. Garp.

^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: forcing tail/sibling call optimization
  2000-11-26  8:14 forcing tail/sibling call optimization Robert Dewar
  2000-11-26 13:43 ` Fergus Henderson
@ 2000-11-27  7:58 ` Jeffrey A Law
  2000-11-27  8:05   ` David Edelsohn
                     ` (3 more replies)
  1 sibling, 4 replies; 64+ messages in thread
From: Jeffrey A Law @ 2000-11-27  7:58 UTC (permalink / raw)
  To: Robert Dewar; +Cc: fjh, gcc

  In message < 20001126161445.8878634D82@nile.gnat.com >you write:
  > The formal definition seems quite reasonable here, but I must say
  > I dislike the syntax, and it is unnecessarily provocative, since
  > this really is not semantically a goto at all (e.g. the formal
  > denmotational semantics would have nothing to do with the semantics
  > of goto). Why not something like "tail return" or perhaps better 
  > simply a compiler directive.
  > 
  > The real point here is that the semantics is identical to return. What
  > the declaration (in whatever form it exists) is doing is to say that
  > the semanjtics of execution will be undefined if there are any subsequent
  > references to local variables of the caller. That's *all* that needs to
  > be said semantically.
  > 
  > In Ada, one would provide this capability by simply having a pragma
  > e.g. pragma Tail_Return; which has no effect on the semantics of the
  > program if it is correct.
Instead of tagging the return itself, couldn't we use an attribute which
would (in effect) tell the optimizer that any tail call in the function
is safe to optimize?  Yes, we lose the ability to specify that a specific
call site is safe to optimize, but we don't have to invent yet another
GCC extension of dubious value.

jeff

^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: forcing tail/sibling call optimization
  2000-11-27  7:58 ` Jeffrey A Law
@ 2000-11-27  8:05   ` David Edelsohn
  2000-11-27  8:07   ` Andi Kleen
                     ` (2 subsequent siblings)
  3 siblings, 0 replies; 64+ messages in thread
From: David Edelsohn @ 2000-11-27  8:05 UTC (permalink / raw)
  To: law; +Cc: Robert Dewar, fjh, gcc

>>>>> Jeffrey A Law writes:

Jeff> Instead of tagging the return itself, couldn't we use an attribute which
Jeff> would (in effect) tell the optimizer that any tail call in the function
Jeff> is safe to optimize?  Yes, we lose the ability to specify that a specific
Jeff> call site is safe to optimize, but we don't have to invent yet another
Jeff> GCC extension of dubious value.

	I completely agree that this should be implemented using the GCC
Extension of function attributes and not a new call-site keyword.

David

^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: forcing tail/sibling call optimization
  2000-11-27  7:58 ` Jeffrey A Law
  2000-11-27  8:05   ` David Edelsohn
@ 2000-11-27  8:07   ` Andi Kleen
  2000-11-27  8:25     ` Jeffrey A Law
                       ` (2 more replies)
  2000-11-27 10:47   ` Mark Mitchell
  2000-11-27 23:39   ` Fergus Henderson
  3 siblings, 3 replies; 64+ messages in thread
From: Andi Kleen @ 2000-11-27  8:07 UTC (permalink / raw)
  To: law; +Cc: gcc

Jeffrey A Law <law@redhat.com> writes:
>   > program if it is correct.
> Instead of tagging the return itself, couldn't we use an attribute which
> would (in effect) tell the optimizer that any tail call in the function
> is safe to optimize?  Yes, we lose the ability to specify that a specific
> call site is safe to optimize, but we don't have to invent yet another
> GCC extension of dubious value.

Problem I see with that is diagnostics again: e.g. someone depends on a 
particular call being a tail call and not allocating new storage. With a 
"tail call statement" gcc could easily give a warning or even an error
when the compiler is not able to satisfy that requirement. With the function
wide attribute it would be much harder to enforce it. Every call could
possible be a tail call depending on rather complex, machine dependent rules.
How would you generate the "not a tail call" warning without getting lots
of false positives? With an explicit statement it is clear what to do. 


-Andi

^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: forcing tail/sibling call optimization
  2000-11-27  8:07   ` Andi Kleen
@ 2000-11-27  8:25     ` Jeffrey A Law
  2000-11-27  8:39       ` Andi Kleen
  2000-11-27  8:38     ` Bernd Schmidt
  2000-11-27 10:48     ` Mark Mitchell
  2 siblings, 1 reply; 64+ messages in thread
From: Jeffrey A Law @ 2000-11-27  8:25 UTC (permalink / raw)
  To: Andi Kleen; +Cc: gcc

  In message < oupn1el760c.fsf@pigdrop.muc.suse.de >you write:
  > Problem I see with that is diagnostics again: e.g. someone depends on a 
  > particular call being a tail call and not allocating new storage.
Therein lies the first problem -- programmer dependence on specific 
optimizations in the compiler.  That's a fundamental mistake.

If you _must_ have tail calls optimized to gotos, then write them that way
or use a language which guarantees tail calls will turn into gotos.  If
tail call optimizations are merely convenience, then write them as calls :-)

  >  With a 
  > "tail call statement" gcc could easily give a warning or even an error
  > when the compiler is not able to satisfy that requirement.
True.

  >  With the functio
  > wide attribute it would be much harder to enforce it. Every call could
  > possible be a tail call depending on rather complex, machine dependent rule
True.

  > How would you generate the "not a tail call" warning without getting lots
  > of false positives? With an explicit statement it is clear what to do. 
I don't think there really is a way.  Nor do I think we should spend a lot
of time worrying about it :-)

jeff

^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: forcing tail/sibling call optimization
  2000-11-27  8:07   ` Andi Kleen
  2000-11-27  8:25     ` Jeffrey A Law
@ 2000-11-27  8:38     ` Bernd Schmidt
  2000-11-27 11:26       ` Eric W. Biederman
  2000-11-27 10:48     ` Mark Mitchell
  2 siblings, 1 reply; 64+ messages in thread
From: Bernd Schmidt @ 2000-11-27  8:38 UTC (permalink / raw)
  To: Andi Kleen; +Cc: law, gcc

On 27 Nov 2000, Andi Kleen wrote:

> Jeffrey A Law <law@redhat.com> writes:
> >   > program if it is correct.
> > Instead of tagging the return itself, couldn't we use an attribute which
> > would (in effect) tell the optimizer that any tail call in the function
> > is safe to optimize?  Yes, we lose the ability to specify that a specific
> > call site is safe to optimize, but we don't have to invent yet another
> > GCC extension of dubious value.
> 
> Problem I see with that is diagnostics again: e.g. someone depends on a 
> particular call being a tail call and not allocating new storage. With a 
> "tail call statement" gcc could easily give a warning or even an error
> when the compiler is not able to satisfy that requirement. With the function
> wide attribute it would be much harder to enforce it. Every call could
> possible be a tail call depending on rather complex, machine dependent rules.
> How would you generate the "not a tail call" warning without getting lots
> of false positives? With an explicit statement it is clear what to do. 

Even an explicit statement doesn't help you because even if it doesn't warn
when compiling for one machine, it might still warn on another machine, or
with a different version of the compiler.

IMHO we shouldn't encourage users to rely on gcc being able to optimize
tail calls.  It's a nice optimization, but if a program relies on it for
correctness, that program is broken.


Bernd

^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: forcing tail/sibling call optimization
  2000-11-27  8:25     ` Jeffrey A Law
@ 2000-11-27  8:39       ` Andi Kleen
  2000-11-27  9:48         ` Jeffrey A Law
  2000-11-27 10:54         ` Mark Mitchell
  0 siblings, 2 replies; 64+ messages in thread
From: Andi Kleen @ 2000-11-27  8:39 UTC (permalink / raw)
  To: Jeffrey A Law; +Cc: gcc

On Mon, Nov 27, 2000 at 05:25:45PM +0100, Jeffrey A Law wrote:
> 
>   In message < oupn1el760c.fsf@pigdrop.muc.suse.de >you write:
>   > Problem I see with that is diagnostics again: e.g. someone depends on a 
>   > particular call being a tail call and not allocating new storage.
> Therein lies the first problem -- programmer dependence on specific 
> optimizations in the compiler.  That's a fundamental mistake.

Well, I think Fergus' whole point of the extension was to use gcc 
as a backend for language compilers that have this requirement. 

> 
> If you _must_ have tail calls optimized to gotos, then write them that way
> or use a language which guarantees tail calls will turn into gotos.  If
> tail call optimizations are merely convenience, then write them as calls :-)

That would require putting the whole program into a single function, which
given the current gcc optimizer's runtimes is surely to end with a very
slow compile process for anything >-O0.

-Andi

^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: forcing tail/sibling call optimization
  2000-11-27  8:39       ` Andi Kleen
@ 2000-11-27  9:48         ` Jeffrey A Law
  2000-11-27 11:21           ` Lars Brinkhoff
  2000-11-27 10:54         ` Mark Mitchell
  1 sibling, 1 reply; 64+ messages in thread
From: Jeffrey A Law @ 2000-11-27  9:48 UTC (permalink / raw)
  To: Andi Kleen; +Cc: gcc

  In message < 20001127174448.A2375@fred.local >you write:
  > On Mon, Nov 27, 2000 at 05:25:45PM +0100, Jeffrey A Law wrote:
  > > 
  > >   In message < oupn1el760c.fsf@pigdrop.muc.suse.de >you write:
  > >   > Problem I see with that is diagnostics again: e.g. someone depends on
  >  a 
  > >   > particular call being a tail call and not allocating new storage.
  > > Therein lies the first problem -- programmer dependence on specific 
  > > optimizations in the compiler.  That's a fundamental mistake.
  > 
  > Well, I think Fergus' whole point of the extension was to use gcc 
  > as a backend for language compilers that have this requirement. 
But I would claim that C is a terrible choice for the target language
of this translator because of this kind of issue.

I would think it would be better to actually write a true front-end for
the source language.


  > That would require putting the whole program into a single function, which
  > given the current gcc optimizer's runtimes is surely to end with a very
  > slow compile process for anything >-O0.
True.  But that's the price you pay when targetting a language (ie C) that
doesn't have the feature set (tail call opts) necessary to correctly 
implement the source language.

jeff

^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: forcing tail/sibling call optimization
  2000-11-27  7:58 ` Jeffrey A Law
  2000-11-27  8:05   ` David Edelsohn
  2000-11-27  8:07   ` Andi Kleen
@ 2000-11-27 10:47   ` Mark Mitchell
  2000-11-28 19:21     ` Fergus Henderson
  2000-11-27 23:39   ` Fergus Henderson
  3 siblings, 1 reply; 64+ messages in thread
From: Mark Mitchell @ 2000-11-27 10:47 UTC (permalink / raw)
  To: law; +Cc: dewar, fjh, gcc

>>>>> "Jeffrey" == Jeffrey A Law <law@redhat.com> writes:

    Jeffrey> Instead of tagging the return itself, couldn't we use an
    Jeffrey> attribute which would (in effect) tell the optimizer that
    Jeffrey> any tail call in the function is safe to optimize?  Yes,
    Jeffrey> we lose the ability to specify that a specific call site
    Jeffrey> is safe to optimize, but we don't have to invent yet
    Jeffrey> another GCC extension of dubious value.

Yes.  There is simply no need for an extension of this nature -- there
is no reason the compiler (perhaps with a few hints in the form of
attributes) can't do this optimization.  Introducing this extension
would carry on a bad tradition in GCC: inventing extensions, rather
than having the compiler perform optimizations.

The issue, I take it, is that if there are addressed locals in the
caller, then the compiler might be afraid to do a tail call that would
clobber them?  Since we're accepting the point of view that the
programmer is going to have to do something anyhow, it's worth
pointing out that the programmer can already make that clear to the
compiler: make the call after the addressed locals have gone out of
scope.  In other words, instead of:

  {   
    int i;
    f(&i);
    g(); /* I hope this will be a tail call! */ 
  }

do:
 
  {
    {
      int i;
      f(&i);
    }
    g();
  }

Yes, that's harder in some cases -- but it can always be done with a
goto to the call site, which can go in the outermost block of the
function.

--
Mark Mitchell                   mark@codesourcery.com
CodeSourcery, LLC               http://www.codesourcery.com

^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: forcing tail/sibling call optimization
  2000-11-27  8:07   ` Andi Kleen
  2000-11-27  8:25     ` Jeffrey A Law
  2000-11-27  8:38     ` Bernd Schmidt
@ 2000-11-27 10:48     ` Mark Mitchell
  2000-11-27 12:46       ` Harvey J. Stein
  2 siblings, 1 reply; 64+ messages in thread
From: Mark Mitchell @ 2000-11-27 10:48 UTC (permalink / raw)
  To: freitag; +Cc: law, gcc

>>>>> "Andi" == Andi Kleen <freitag@alancoxonachip.com> writes:

    Andi> Problem I see with that is diagnostics again: e.g. someone
    Andi> depends on a particular call being a tail call and not
    Andi> allocating new storage.

The ANSI/ISO C way of doing this would be a #pragma (or _Pragma) near
the call site.  In GCC, we could do that, or we could do something
like:

  __tailcall__ (xyz (foo));

or something.  But that magic should be only used for warnings -- it
should not be used to have the compiler generate different code.

--
Mark Mitchell                   mark@codesourcery.com
CodeSourcery, LLC               http://www.codesourcery.com

^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: forcing tail/sibling call optimization
  2000-11-27  8:39       ` Andi Kleen
  2000-11-27  9:48         ` Jeffrey A Law
@ 2000-11-27 10:54         ` Mark Mitchell
  1 sibling, 0 replies; 64+ messages in thread
From: Mark Mitchell @ 2000-11-27 10:54 UTC (permalink / raw)
  To: ak; +Cc: law, gcc

>>>>> "Andi" == Andi Kleen <ak@muc.de> writes:

    Andi> On Mon, Nov 27, 2000 at 05:25:45PM +0100, Jeffrey A Law
    Andi> wrote:
    >>  In message < oupn1el760c.fsf@pigdrop.muc.suse.de >you write: >
    >> Problem I see with that is diagnostics again: e.g. someone
    >> depends on a > particular call being a tail call and not
    >> allocating new storage.  Therein lies the first problem --
    >> programmer dependence on specific optimizations in the
    >> compiler.  That's a fundamental mistake.

    Andi> Well, I think Fergus' whole point of the extension was to
    Andi> use gcc as a backend for language compilers that have this
    Andi> requirement.

Yup.  Sadly, that's an approach that just can't work in general.  No
such compiler will ever be portable, and depending on the way GCC
compiles C here is a mistake.

We could do this *inside* the compiler -- i.e., but a
"make-me-a-tail-call-dang-it" bit on CALL_EXPRs.  It would be
appropriate for a language front-end for Scheme, say, to set that bit.

--
Mark Mitchell                   mark@codesourcery.com
CodeSourcery, LLC               http://www.codesourcery.com

^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: forcing tail/sibling call optimization
  2000-11-27  9:48         ` Jeffrey A Law
@ 2000-11-27 11:21           ` Lars Brinkhoff
  0 siblings, 0 replies; 64+ messages in thread
From: Lars Brinkhoff @ 2000-11-27 11:21 UTC (permalink / raw)
  To: law; +Cc: Andi Kleen, gcc

Jeffrey A Law <law@redhat.com> writes:
>   > Well, I think Fergus' whole point of the extension was to use
>   > gcc as a backend for language compilers that have this
>   > requirement.
> I would think it would be better to actually write a true front-end for
> the source language.

Or a front-end for C--.

>   > That would require putting the whole program into a single
>   > function, which given the current gcc optimizer's runtimes is
>   > surely to end with a very slow compile process for anything
>   > >-O0.

Been there, done that, ran out of virtual memory.

^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: forcing tail/sibling call optimization
  2000-11-27  8:38     ` Bernd Schmidt
@ 2000-11-27 11:26       ` Eric W. Biederman
  0 siblings, 0 replies; 64+ messages in thread
From: Eric W. Biederman @ 2000-11-27 11:26 UTC (permalink / raw)
  To: Bernd Schmidt; +Cc: Andi Kleen, law, gcc

Bernd Schmidt <bernds@redhat.com> writes:

> Even an explicit statement doesn't help you because even if it doesn't warn
> when compiling for one machine, it might still warn on another machine, or
> with a different version of the compiler.

It shouldn't warn. It should fail a tail call cannot be implemented correctly.
That is the whole point of the extension.

Allowing __tailcall to fail for now seems a reasonable compromise
so it is not required on every platform immediately.  Better still
would be to require it on every platform.

OTOH it is fine to let things like trying to tail call a varargs function
to be illegal.

> IMHO we shouldn't encourage users to rely on gcc being able to optimize
> tail calls.  It's a nice optimization, but if a program relies on it for
> correctness, that program is broken.

There are certain classes of programs that run just fine in C except
they occasionally get unnecessary stack overflows.    A guarantee
that they would not get a stack overflow enables these classes of programs
to be written in C.

Just as inline allows macros to be removed.  
return __tailcall func(); allows state machines to be implemented
with each state a separate function instead of the classic,
state = FOO;
switch (state) {
case A:
case B:
case FOO:
}

There are other examples as well. 

Eric

^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: forcing tail/sibling call optimization
  2000-11-27 10:48     ` Mark Mitchell
@ 2000-11-27 12:46       ` Harvey J. Stein
  2000-11-27 13:02         ` Travis Moulton
  0 siblings, 1 reply; 64+ messages in thread
From: Harvey J. Stein @ 2000-11-27 12:46 UTC (permalink / raw)
  To: Mark Mitchell; +Cc: freitag, law, gcc

Mark Mitchell <mark@codesourcery.com> writes:

 > >>>>> "Andi" == Andi Kleen <freitag@alancoxonachip.com> writes:
 > 
 >     Andi> Problem I see with that is diagnostics again: e.g. someone
 >     Andi> depends on a particular call being a tail call and not
 >     Andi> allocating new storage.
 > 
 > The ANSI/ISO C way of doing this would be a #pragma (or _Pragma) near
 > the call site.  In GCC, we could do that, or we could do something
 > like:
 > 
 >   __tailcall__ (xyz (foo));
 > 
 > or something.  But that magic should be only used for warnings -- it
 > should not be used to have the compiler generate different code.

Not only should it not be used for generating different code, it
*cannot* be used for generating different code.

The caller of xyz (call it f) would have to jump to xyz.  The caller
of f is going to pop f's args when f returns.  How is the caller going
to pop off the right amount of space after f's effectively changed the
amount of space that it's using?  If xyz uses the same amount of stack
space as f, then it's possible.  The nature of the specific
instructions used for manipulating the stack could make it possible.
But I don't see how it can work in general.

-- 
Harvey Stein
Bloomberg LP
hjstein@bfr.co.il

^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: forcing tail/sibling call optimization
  2000-11-27 12:46       ` Harvey J. Stein
@ 2000-11-27 13:02         ` Travis Moulton
  0 siblings, 0 replies; 64+ messages in thread
From: Travis Moulton @ 2000-11-27 13:02 UTC (permalink / raw)
  To: Harvey J. Stein, gcc

Harvey,
> The caller of xyz (call it f) would have to jump to xyz.  The caller
> of f is going to pop f's args when f returns.  How is the caller
> going to pop off the right amount of space after f's effectively
> changed the amount of space that it's using?  If xyz uses the same
> amount of stack space as f, then it's possible.  The nature of the
> specific instructions used for manipulating the stack could make it
> possible.  But I don't see how it can work in general.

The standard method when hand-coding asm is if f takes as many or more 
arguments than xyz, then f overwrites it's arguments.  f may leave 
some of its arguments untouched if xyz takes fewer args.  In the event 
that xyz takes more arguments than f, you're best to make a normal 
call not a tail call.  I assume the same is applicable to the gcc.

	-Travis Moulton



^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: forcing tail/sibling call optimization
  2000-11-27  7:58 ` Jeffrey A Law
                     ` (2 preceding siblings ...)
  2000-11-27 10:47   ` Mark Mitchell
@ 2000-11-27 23:39   ` Fergus Henderson
  3 siblings, 0 replies; 64+ messages in thread
From: Fergus Henderson @ 2000-11-27 23:39 UTC (permalink / raw)
  To: Jeffrey A Law; +Cc: Robert Dewar, gcc

On 27-Nov-2000, Jeffrey A Law <law@redhat.com> wrote:
> Instead of tagging the return itself, couldn't we use an attribute which
> would (in effect) tell the optimizer that any tail call in the function
> is safe to optimize?  Yes, we lose the ability to specify that a specific
> call site is safe to optimize, but we don't have to invent yet another
> GCC extension of dubious value.

I'd be happy to change the syntax to use "statement attributes"

	__attribute__((__tailcall__)) return func(args);

or "expression attributes"

	return __attribute__((__tailcall__)) func(args);

But putting the attribute on the caller function rather than on the
call is not a good idea, IMHO.  It's just not a good match for the
problem.  It would be harder for front-ends that target GNU C to
generate such an attribute, and the result would be less useful.
For languages that want guaranteed tail call optimization, it would
be much less useful, since putting the attribute on the caller function
means that if the caller contains any tail call which is not safely
optimizable then none of the tail calls will get optimized.

In the C code that the Mercury compiler generates, you can certainly
have functions that contain both tail calls that can safely be
optimized and tail calls that can't.

For example, Mercury code such as

	:- mode foo(in, in, out).
	foo(A, B, C) :-
		Z = quux(A),
		if Z > 0 then
			bar(A, B, C, _D)
		else
			baz(A, B, C).

	:- mode bar(in, in, out, out).
	:- mode baz(in, in, out).

will compile to something like

	foo(int a, int b, int *c) {
		int z = quux(a);
		if (z > 0) {
			int d;
			bar(a, b, c, &d);
		} else {
			baz(a, b, c);
		}
	}

and although it's safe to tailcall baz(), it's not safe to tailcall bar().

My guess is that the same is true of other compilers for function/logic 
languages which target C.

-- 
Fergus Henderson <fjh@cs.mu.oz.au>  |  "I have always known that the pursuit
                                    |  of excellence is a lethal habit"
WWW: < http://www.cs.mu.oz.au/~fjh >  |     -- the last words of T. S. Garp.

^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: forcing tail/sibling call optimization
  2000-11-27 10:47   ` Mark Mitchell
@ 2000-11-28 19:21     ` Fergus Henderson
  2000-11-29  2:09       ` Mark Mitchell
  0 siblings, 1 reply; 64+ messages in thread
From: Fergus Henderson @ 2000-11-28 19:21 UTC (permalink / raw)
  To: Mark Mitchell; +Cc: gcc

On 27-Nov-2000, Mark Mitchell <mark@codesourcery.com> wrote:
> The issue, I take it, is that if there are addressed locals in the
> caller, then the compiler might be afraid to do a tail call that would
> clobber them?

Right.

> Since we're accepting the point of view that the
> programmer is going to have to do something anyhow, it's worth
> pointing out that the programmer can already make that clear to the
> compiler: make the call after the addressed locals have gone out of
> scope.  In other words, instead of:
> 
>   {   
>     int i;
>     f(&i);
>     g(); /* I hope this will be a tail call! */ 
>   }
> 
> do:
>  
>   {
>     {
>       int i;
>       f(&i);
>     }
>     g();
>   }
> 
> Yes, that's harder in some cases -- but it can always be done with a
> goto to the call site, which can go in the outermost block of the
> function.

A goto to the call site is not sufficient, in general.
You also need to create copies of all of the parameters
to the tail call which are local variables.

You also need to make sure that you don't take the address of any of
the caller's function parameters; instead, if that would be required,
you must copy the function parameters into locals and take the address
of the locals.

For example, suppose that in your example above, instead of `g();' it
was `g(i);'.  And suppose this tail call is nested inside other code:

   void foo(int p)
   {   
     int j;
     f2(&p, &j);
     if (j > 0)
       {
	 int i;
	 f(&i);
	 g(i); /* I hope this will be a tail call! */ 
       }
     else
       {
         f3(j); /* Likewise here */
       }
   }

To give the C compiler a decent hint, we have to turn it
into something like this:

   void foo(int p)
   {   
     int outer_i;
     int outer_j;
     {
       int inner_p;
       int j;

       inner_p = p;
       f2(&inner_p, &j);
       if (j > 0)
         {
	   int i;
	   f(&i);
	   outer_i = i;
	   goto g_tailcall;
         }
       else
         {
           outer_j = j;
           goto f3_tailcall;
         }
     }
     return;

   g_tailcall:
     g(outer_i); /* Now I _really_ hope this will be a tail call! */
     return;

   f3_tailcall:
     f3(outer_j); /* Likewise here */
     return;
   }

This is a _lot_ uglier that just adding a simple tail call annotation 
at the point of the call.  And it would probably result in less efficient
code in some cases (e.g. if the parameters that need to be copied were
structs).  So I would still much rather have the annotation.
C-- and Microsoft's IL both provide much more direct support for this
than GNU C.

However, since this is automatically generated code, I suppose we
could grudgingly make do with that, if it worked.

The bad news is that gcc doesn't yet do tail recursion optimization
for code like the code above.  I think that is because of this code in
sibcall.c:

 |         /* We must be careful with stack slots which are live at
 |            potential optimization sites.
 |
 |            ?!? This test is overly conservative and will be replaced.  */
 |         if (frame_offset)
 |           goto failure;

-- 
Fergus Henderson <fjh@cs.mu.oz.au>  |  "I have always known that the pursuit
                                    |  of excellence is a lethal habit"
WWW: < http://www.cs.mu.oz.au/~fjh >  |     -- the last words of T. S. Garp.

^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: forcing tail/sibling call optimization
  2000-11-28 19:21     ` Fergus Henderson
@ 2000-11-29  2:09       ` Mark Mitchell
  2000-11-30 23:59         ` Fergus Henderson
  0 siblings, 1 reply; 64+ messages in thread
From: Mark Mitchell @ 2000-11-29  2:09 UTC (permalink / raw)
  To: fjh; +Cc: gcc

Fergus --

  You're definitely right -- the transformation you have to do to make
the code safe for tailcalls from the point of view of the C compiler
are more severe than I thought.  Addressing parameters is particularly
deadly.  On the other hand, as you say, we're talking about
machine-generated code anyhow -- so ugliness probably doesn't matter
as much as speed and correctness.

  I think we're going to have to agree to disagree here.  It's still
my position that we shouldn't introduce an extension for the tail-call
stuff, because I don't view the translation of functional languages
into C as fundamentally the right way to go.  It would be so much
better to write a Scheme front-end for GCC -- and I would totally
support, then, providing a bit on CALL_EXPRs to force tailcalls.
Presumably, the Scheme front-end would set this bit.

  I know that you, and Mike Stump, disagree with me -- and that's
cool.  We'll see what happens from here.

--
Mark Mitchell                   mark@codesourcery.com
CodeSourcery, LLC               http://www.codesourcery.com

^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: forcing tail/sibling call optimization
  2000-11-29  2:09       ` Mark Mitchell
@ 2000-11-30 23:59         ` Fergus Henderson
  2000-12-01 15:51           ` Joe Buck
  0 siblings, 1 reply; 64+ messages in thread
From: Fergus Henderson @ 2000-11-30 23:59 UTC (permalink / raw)
  To: Mark Mitchell; +Cc: gcc

On 29-Nov-2000, Mark Mitchell <mark@codesourcery.com> wrote:
> 
>   I think we're going to have to agree to disagree here.

Yes.

> It's still
> my position that we shouldn't introduce an extension for the tail-call
> stuff, because I don't view the translation of functional languages
> into C as fundamentally the right way to go.  It would be so much
> better to write a Scheme front-end for GCC -- and I would totally
> support, then, providing a bit on CALL_EXPRs to force tailcalls.
> Presumably, the Scheme front-end would set this bit.

Scheme would not be a good choice as a target language for source
languages like Mercury (or Haskell, ML, Clean, etc.,) since Scheme is
dynamically typed, whereas Mercury is statically typed.

C is certainly not ideal as a target language, but it does have some
big advantages, even in comparison to languages like C-- that were
explicitly designed as target languages.  (I can elaborate here if you
want details.)  So the use of C as a target language is not going to
go away anytime in the near future.  I think this use will continue
to be sufficiently important that it is worth providing extensions
to GNU C to support it better.

But as you said, we'll have to agree to disagree.

> We'll see what happens from here.

I didn't spot anything on gcc.gnu.org saying what procedure is used to
resolve these kinds of disagreements.  I guess eventually, if/when I
get around to implementing this, it will come down to a vote as to
whether or not the patch gets incorporated?

-- 
Fergus Henderson <fjh@cs.mu.oz.au>  |  "I have always known that the pursuit
                                    |  of excellence is a lethal habit"
WWW: < http://www.cs.mu.oz.au/~fjh >  |     -- the last words of T. S. Garp.

^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: forcing tail/sibling call optimization
  2000-11-30 23:59         ` Fergus Henderson
@ 2000-12-01 15:51           ` Joe Buck
  2001-01-03 12:24             ` Fergus Henderson
  0 siblings, 1 reply; 64+ messages in thread
From: Joe Buck @ 2000-12-01 15:51 UTC (permalink / raw)
  To: Fergus Henderson; +Cc: Mark Mitchell, gcc

Fergus Henderson writes:

> I didn't spot anything on gcc.gnu.org saying what procedure is used to
> resolve these kinds of disagreements.  I guess eventually, if/when I
> get around to implementing this, it will come down to a vote as to
> whether or not the patch gets incorporated?

Right now, Mark's under the gun to get 3.0 out, and is trying to limit
the job.  I expect that the SC will back him up on this (I certainly
will).

After 3.0 is out, we can then open the door to talking about extensions.
I suspect that a very clean patch that implements the desired
functionality will help your case a lot.

I've personally written many code generators that produce C, so certainly
my personal inclination is to support features that can make this task
easier.  But the developers now have a lot on their plate to get 3.0 out.
I suggest postponing this discussion until 3.0 has branched off.


^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: forcing tail/sibling call optimization
  2000-12-01 15:51           ` Joe Buck
@ 2001-01-03 12:24             ` Fergus Henderson
  2001-01-03 13:09               ` Richard Henderson
  0 siblings, 1 reply; 64+ messages in thread
From: Fergus Henderson @ 2001-01-03 12:24 UTC (permalink / raw)
  To: gcc, gcc-patches

On 01-Dec-2000, Joe Buck <jbuck@racerx.synopsys.com> wrote:
> 
> I suspect that a very clean patch that implements the desired
> functionality will help your case a lot.

OK, here's a start.

2000-01-04  Fergus Henderson  <fjh@cs.mu.oz.au>

	Allow front-ends to force tail call optimization.

	* tree.h (CALL_EXPR_FORCE_TAILCALL): New boolean field.
	* expr.c (expand_expr): use expand_call_1 rather than expand_call,
	  passing CALL_EXPR_FORCE_TAILCALL field.

	* calls.h, calls.c (expand_call_1): New.  Like expand_call,
	  but with additional boolean FORCE_TAILCALL parameter that it
	  copies to the CALL_PLACEHOLDER rtx.
	  (expand_call): call expand_call_1.

	* rtl.def (CALL_PLACEHOLDER): Add new boolean field force_tailcall.
	  integrate.c (copy_insn_list): Copy it.
	  sibcall.c (optimize_sibling_and_tail_recursive_call,
	  replace_call_placeholder) Use it.

This patch is not yet "very clean".  One issue is that the
patch adds another argument to expand_call().  But I was a bit
tentative about changing the front-end interface, so instead I added
a new function expand_call_1().  Would it be better to go ahead and
change the interface to expand_call()? 
(Alternatively, any suggestions for a better name for the new function?)

Another issue is that this patch adds calls to warning() in the
back-end, and the line numbers on those warnings don't come out
right (this might be a bug in my front-end, though -- I've haven't
made much effort to get the line number info right yet).

Another potential issue is that I had to use a whole new field in the
CALL_PLACEHOLDER rtx, rather than using a bit-field, since all
of the bit-field flags seem to be already in use.  But I don't
see any reasonable alternative to that.

I've haven't tested this much yet.

I also have some changes to add C syntax for accessing this
feature, but those are not finished yet.

Index: calls.c
===================================================================
RCS file: /cvs/gcc/gcc/gcc/calls.c,v
retrieving revision 1.172
diff -u -d -u -c -p -r1.172 calls.c
*** calls.c	2000/12/30 14:52:15	1.172
--- calls.c	2001/01/03 19:49:11
*************** expand_call (exp, target, ignore)
*** 2064,2069 ****
--- 2064,2088 ----
       rtx target;
       int ignore;
  {
+   return expand_call_1(exp, target, ignore, 0);
+ }
+ 
+ /* Generate all the code for a function call
+    and return an rtx for its value.
+    Store the value in TARGET (specified as an rtx) if convenient.
+    If the value is stored in TARGET then TARGET is returned.
+    If IGNORE is nonzero, then we ignore the value of the function call.
+    If FORCE_TAILCALL is nonzero, then we should try to do sibling call
+    or tail recursion optimization for this call, even if we can't prove
+    that the locals and parameters are no longer live at the call.  */
+ 
+ rtx
+ expand_call_1 (exp, target, ignore, force_tailcall)
+      tree exp;
+      rtx target;
+      int ignore;
+      int force_tailcall;
+ {
    /* Nonzero if we are currently expanding a call.  */
    static int currently_expanding_call = 0;
  
*************** expand_call (exp, target, ignore)
*** 3436,3445 ****
        emit_call_insn (gen_rtx_CALL_PLACEHOLDER (VOIDmode, normal_call_insns,
  						tail_call_insns,
  						tail_recursion_insns,
! 						tail_recursion_label));
      }
    else
!     emit_insns (normal_call_insns);
  
    currently_expanding_call--;
  
--- 3455,3472 ----
        emit_call_insn (gen_rtx_CALL_PLACEHOLDER (VOIDmode, normal_call_insns,
  						tail_call_insns,
  						tail_recursion_insns,
! 						tail_recursion_label,
! 						force_tailcall));
      }
    else
!     {
!       /* If the user told us to force last call optimization,
!          and we tried, but we couldn't, then issue a warning.  */
!       if (force_tailcall && flag_optimize_sibling_calls)
!         warning("cannot optimize tail call");
! 
!       emit_insns (normal_call_insns);
!     }
  
    currently_expanding_call--;
  
Index: expr.c
===================================================================
RCS file: /cvs/gcc/gcc/gcc/expr.c,v
retrieving revision 1.284
diff -u -d -u -c -p -r1.284 expr.c
*** expr.c	2000/12/31 01:53:54	1.284
--- expr.c	2001/01/03 19:49:15
*************** expand_expr (exp, target, tmode, modifie
*** 7245,7251 ****
  	    return expand_builtin (exp, target, subtarget, tmode, ignore);
  	}
  
!       return expand_call (exp, target, ignore);
  
      case NON_LVALUE_EXPR:
      case NOP_EXPR:
--- 7245,7252 ----
  	    return expand_builtin (exp, target, subtarget, tmode, ignore);
  	}
  
!       return expand_call_1 (exp, target, ignore,
!       			    CALL_EXPR_FORCE_TAILCALL (exp));
  
      case NON_LVALUE_EXPR:
      case NOP_EXPR:
Index: expr.h
===================================================================
RCS file: /cvs/gcc/gcc/gcc/expr.h,v
retrieving revision 1.73
diff -u -d -u -c -p -r1.73 expr.h
*** expr.h	2000/12/03 23:58:44	1.73
--- expr.h	2001/01/03 19:49:16
*************** extern rtx hard_function_value PARAMS ((
*** 1130,1135 ****
--- 1130,1136 ----
  extern rtx prepare_call_address	PARAMS ((rtx, tree, rtx *, int));
  
  extern rtx expand_call PARAMS ((tree, rtx, int));
+ extern rtx expand_call_1 PARAMS ((tree, rtx, int, int));
  
  extern rtx expand_shift PARAMS ((enum tree_code, enum machine_mode, rtx, tree,
  				 rtx, int));
Index: integrate.c
===================================================================
RCS file: /cvs/gcc/gcc/gcc/integrate.c,v
retrieving revision 1.123
diff -u -d -u -c -p -r1.123 integrate.c
*** integrate.c	2000/12/30 13:10:50	1.123
--- integrate.c	2001/01/03 19:49:17
*************** copy_insn_list (insns, map, static_chain
*** 1430,1435 ****
--- 1430,1436 ----
  	    {
  	      rtx sequence[3];
  	      rtx tail_label;
+ 	      int force_tailcall;
  
  	      for (i = 0; i < 3; i++)
  		{
*************** copy_insn_list (insns, map, static_chain
*** 1451,1461 ****
  	      tail_label = copy_rtx_and_substitute (XEXP (PATTERN (insn), 3),
  						    map, 0);
  
  	      copy = emit_call_insn (gen_rtx_CALL_PLACEHOLDER (VOIDmode,
  							       sequence[0],
  							       sequence[1],
  							       sequence[2],
! 							       tail_label));
  	      break;
  	    }
  
--- 1452,1465 ----
  	      tail_label = copy_rtx_and_substitute (XEXP (PATTERN (insn), 3),
  						    map, 0);
  
+ 	      force_tailcall = XINT (PATTERN (insn), 4);
+ 
  	      copy = emit_call_insn (gen_rtx_CALL_PLACEHOLDER (VOIDmode,
  							       sequence[0],
  							       sequence[1],
  							       sequence[2],
! 							       tail_label,
! 							       force_tailcall));
  	      break;
  	    }
  
Index: rtl.def
===================================================================
RCS file: /cvs/gcc/gcc/gcc/rtl.def,v
retrieving revision 1.43
diff -u -d -u -c -p -r1.43 rtl.def
*** rtl.def	2000/12/29 17:35:57	1.43
--- rtl.def	2001/01/03 19:49:18
*************** DEF_RTL_EXPR(CONSTANT_P_RTX, "constant_p
*** 919,938 ****
     Immediately after RTL generation, this placeholder will be replaced
     by the insns to perform the call, sibcall or tail recursion.
  
!    This RTX has 4 operands.  The first three are lists of instructions to
     perform the call as a normal call, sibling call and tail recursion
     respectively.  The latter two lists may be NULL, the first may never
     be NULL.
  
!    The last operand is the tail recursion CODE_LABEL, which may be NULL if no 
     potential tail recursive calls were found.
  
     The tail recursion label is needed so that we can clear LABEL_PRESERVE_P
     after we select a call method.
  
     This method of tail-call elimination is intended to be replaced by
     tree-based optimizations once front-end conversions are complete.  */
! DEF_RTL_EXPR(CALL_PLACEHOLDER, "call_placeholder", "uuuu", 'x')
  
  /* Describes a merge operation between two vector values.
     Operands 0 and 1 are the vectors to be merged, operand 2 is a bitmask
--- 919,944 ----
     Immediately after RTL generation, this placeholder will be replaced
     by the insns to perform the call, sibcall or tail recursion.
  
!    This RTX has 5 operands.  The first three are lists of instructions to
     perform the call as a normal call, sibling call and tail recursion
     respectively.  The latter two lists may be NULL, the first may never
     be NULL.
  
!    The fourth operand is the tail recursion CODE_LABEL, which may be NULL if no
     potential tail recursive calls were found.
  
     The tail recursion label is needed so that we can clear LABEL_PRESERVE_P
     after we select a call method.
  
+    The last operand is used by front-ends that want to force tail call
+    optimization.  If the last operand is nonzero, this indicates that the
+    call should be treated as a sibling call or tail recursive call, even
+    if it does not appear to be one.  The back-end will issue a warning if
+    it cannot cannot do so.
+ 
     This method of tail-call elimination is intended to be replaced by
     tree-based optimizations once front-end conversions are complete.  */
! DEF_RTL_EXPR(CALL_PLACEHOLDER, "call_placeholder", "uuuui", 'x')
  
  /* Describes a merge operation between two vector values.
     Operands 0 and 1 are the vectors to be merged, operand 2 is a bitmask
Index: sibcall.c
===================================================================
RCS file: /cvs/gcc/gcc/gcc/sibcall.c,v
retrieving revision 1.10
diff -u -d -u -c -p -r1.10 sibcall.c
*** sibcall.c	2000/10/24 11:25:50	1.10
--- sibcall.c	2001/01/03 19:49:18
*************** replace_call_placeholder (insn, use)
*** 415,421 ****
    else if (use == sibcall_use_sibcall)
      emit_insns_before (XEXP (PATTERN (insn), 1), insn);
    else if (use == sibcall_use_normal)
!     emit_insns_before (XEXP (PATTERN (insn), 0), insn);
    else
      abort();
  
--- 415,425 ----
    else if (use == sibcall_use_sibcall)
      emit_insns_before (XEXP (PATTERN (insn), 1), insn);
    else if (use == sibcall_use_normal)
!     {
!       if (XEXP (PATTERN (insn), 4))
!         warning("cannot optimize tail call");
!       emit_insns_before (XEXP (PATTERN (insn), 0), insn);
!     }
    else
      abort();
  
*************** optimize_sibling_and_tail_recursive_call
*** 533,560 ****
  	{
  	  int sibcall = (XEXP (PATTERN (insn), 1) != NULL_RTX);
  	  int tailrecursion = (XEXP (PATTERN (insn), 2) != NULL_RTX);
  	  basic_block succ_block, call_block;
  	  rtx temp, hardret, softret;
  
! 	  /* We must be careful with stack slots which are live at
! 	     potential optimization sites.
  
! 	     ?!? This test is overly conservative and will be replaced.  */
! 	  if (frame_offset)
! 	    goto failure;
  
! 	  /* Taking the address of a local variable is fatal to tail
! 	     recursion if the address is used by the recursive call.  */
! 	  if (current_function_uses_addressof)
! 	    goto failure;
  
! 	  /* alloca (until we have stack slot life analysis) inhibits
! 	     sibling call optimizations, but not tail recursion.
! 	     Similarly if we use varargs or stdarg since they implicitly
! 	     may take the address of an argument.  */
!  	  if (current_function_calls_alloca
! 	      || current_function_varargs || current_function_stdarg)
! 	    sibcall = 0;
  
  	  call_block = BLOCK_FOR_INSN (insn);
  
--- 537,578 ----
  	{
  	  int sibcall = (XEXP (PATTERN (insn), 1) != NULL_RTX);
  	  int tailrecursion = (XEXP (PATTERN (insn), 2) != NULL_RTX);
+ 	  int force_tailcall = (XINT (PATTERN (insn), 4));
  	  basic_block succ_block, call_block;
  	  rtx temp, hardret, softret;
  
! 	  if (force_tailcall)
! 	    {
! 	      /* The front-end has marked this call to indicate
! 	         that it should be treated as a tail call.
! 	         This means that the programmer or front-end has promised
! 		 that it's OK for the compiler to deallocate this function's
! 		 parameters and local variables before making this call.
! 		 So we can safely skip the checks for taking the
! 		 address of local variables.  */
! 	    }
! 	  else 
! 	    {
! 	      /* We must be careful with stack slots which are live at
! 	         potential optimization sites.
  
! 	         ?!? This test is overly conservative and will be replaced.  */
! 	      if (frame_offset)
! 	        goto failure;
  
! 	      /* Taking the address of a local variable is fatal to tail
! 	         recursion if the address is used by the recursive call.  */
! 	      if (current_function_uses_addressof)
! 	        goto failure;
  
! 	      /* alloca (until we have stack slot life analysis) inhibits
! 	         sibling call optimizations, but not tail recursion.
! 	         Similarly if we use varargs or stdarg since they implicitly
! 	         may take the address of an argument.  */
!  	      if (current_function_calls_alloca
! 	          || current_function_varargs || current_function_stdarg)
! 	        sibcall = 0;
! 	    }
  
  	  call_block = BLOCK_FOR_INSN (insn);
  
*************** optimize_sibling_and_tail_recursive_call
*** 614,619 ****
--- 632,639 ----
  	     execute after returning from the function call.  So this call
  	     can not be optimized.  */
  failure:
+ 	  if (force_tailcall)
+ 	    warning("cannot optimize tail call");
  	  sibcall = 0, tailrecursion = 0;
  success:
  
Index: tree.h
===================================================================
RCS file: /cvs/gcc/gcc/gcc/tree.h,v
retrieving revision 1.215
diff -u -d -u -c -p -r1.215 tree.h
*** tree.h	2000/12/30 13:10:51	1.215
--- tree.h	2001/01/03 19:49:20
*************** struct tree_common
*** 243,248 ****
--- 243,250 ----
             TREE_PARMLIST (C++)
         SAVE_EXPR_NOPLACEHOLDER in
  	   SAVE_EXPR
+        CALL_EXPR_FORCE_TAILCALL in
+            CALL_EXPR, METHOD_CALL_EXPR
  
     asm_written_flag:
  
*************** struct tree_vec
*** 782,787 ****
--- 784,797 ----
  
  /* In a CONSTRUCTOR node.  */
  #define CONSTRUCTOR_ELTS(NODE) TREE_OPERAND (NODE, 1)
+ 
+ /* In a CALL_EXPR or METHOD_CALL_EXPR, nonzero means that this call
+    should be treated as a tail recursive call or sibling call,
+    even if there are still local variables that appear to be live.
+    The back-end will issue a warning if it can't generate a call
+    for which this flag is nonzero as a tail recursive call or
+    sibling call.  */
+ #define CALL_EXPR_FORCE_TAILCALL(NODE) ((NODE)->common.unsigned_flag)
  
  /* In ordinary expression nodes.  */
  #define TREE_OPERAND(NODE, I) (EXPR_CHECK (NODE)->exp.operands[I])


For reference, here's the change to my front-end:

Index: mercury-gcc.c
===================================================================
RCS file: /home/pgrad/fjh/fs/hg/fjh_repository/mercury/mercury-gcc.c,v
retrieving revision 1.18
diff -u -d -u -c -p -r1.18 mercury-gcc.c
cvs diff: conflicting specifications of output style
*** mercury-gcc.c	2001/01/03 17:14:16	1.18
--- mercury-gcc.c	2001/01/03 20:17:09
*************** tree
*** 414,420 ****
  merc_build_call_expr (fnptr, args, force_tailcall_p)
       tree fnptr;
       tree args;
!      int force_tailcall_p ATTRIBUTE_UNUSED;
  {
    tree fnptrtype;
    tree fntype;
--- 414,420 ----
  merc_build_call_expr (fnptr, args, force_tailcall_p)
       tree fnptr;
       tree args;
!      int force_tailcall_p;
  {
    tree fnptrtype;
    tree fntype;
*************** merc_build_call_expr (fnptr, args, force
*** 426,431 ****
--- 426,432 ----
    rettype = TREE_TYPE (fntype);
    call = build (CALL_EXPR, rettype, fnptr, args, NULL_TREE);
    TREE_SIDE_EFFECTS (call) = 1;
+   CALL_EXPR_FORCE_TAILCALL (call) = force_tailcall_p;
    return fold (call);
  }
  
-- 
Fergus Henderson <fjh@cs.mu.oz.au>  |  "I have always known that the pursuit
                                    |  of excellence is a lethal habit"
WWW: < http://www.cs.mu.oz.au/~fjh >  |     -- the last words of T. S. Garp.

^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: forcing tail/sibling call optimization
  2001-01-03 12:24             ` Fergus Henderson
@ 2001-01-03 13:09               ` Richard Henderson
  2001-01-03 14:59                 ` Fergus Henderson
  0 siblings, 1 reply; 64+ messages in thread
From: Richard Henderson @ 2001-01-03 13:09 UTC (permalink / raw)
  To: Fergus Henderson; +Cc: gcc, gcc-patches

On Thu, Jan 04, 2001 at 07:24:14AM +1100, Fergus Henderson wrote:
> 	* tree.h (CALL_EXPR_FORCE_TAILCALL): New boolean field.

This part is fine.

> 	* expr.c (expand_expr): use expand_call_1 rather than expand_call,
> 	  passing CALL_EXPR_FORCE_TAILCALL field.

This should be completely unnecessary -- expand_call receives
the CALL_EXPR, from which it can read CALL_EXPR_FORCE_TAILCALL.

> 	* rtl.def (CALL_PLACEHOLDER): Add new boolean field force_tailcall.
> 	  integrate.c (copy_insn_list): Copy it.
> 	  sibcall.c (optimize_sibling_and_tail_recursive_call,
> 	  replace_call_placeholder) Use it.

I'm a bit confused by your intended usage here.  Are you
intending CALL_EXPR_FORCE_TAILCALL to indicate that it is
_possible_ to perform a tail call despite otherwise 
potential problems with the local stack frame?  Or are you
intending to indicate that this _will_ be a tail call.

The later usage seems more in tune with the name of the
flag.  In which case the change to CALL_PLACEHOLDER is 
not needed.  That is used to choose between alternate
code sequences at some later date.  If we _know_ that 
we are going to tail call, then we can simply generate
either the tail recursion or sibling call sequence 
directly and be done with it.

Hmm.  I do suppose that would require that inlining be
disabled, or that it happen at the tree level.  But I'm
not sure what sort of evil inlining would imply for 
such functional languagaes as you are implementing.


r~

^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: forcing tail/sibling call optimization
  2001-01-03 13:09               ` Richard Henderson
@ 2001-01-03 14:59                 ` Fergus Henderson
  2001-01-03 15:32                   ` Richard Henderson
  2002-09-14 23:35                   ` Fergus Henderson
  0 siblings, 2 replies; 64+ messages in thread
From: Fergus Henderson @ 2001-01-03 14:59 UTC (permalink / raw)
  To: Richard Henderson; +Cc: gcc, gcc-patches

On 03-Jan-2001, Richard Henderson <rth@redhat.com> wrote:
> On Thu, Jan 04, 2001 at 07:24:14AM +1100, Fergus Henderson wrote:
> > 	* tree.h (CALL_EXPR_FORCE_TAILCALL): New boolean field.
> 
> This part is fine.
> 
> > 	* expr.c (expand_expr): use expand_call_1 rather than expand_call,
> > 	  passing CALL_EXPR_FORCE_TAILCALL field.
> 
> This should be completely unnecessary -- expand_call receives
> the CALL_EXPR, from which it can read CALL_EXPR_FORCE_TAILCALL.

You're right.  My mistake.

> > 	* rtl.def (CALL_PLACEHOLDER): Add new boolean field force_tailcall.
> > 	  integrate.c (copy_insn_list): Copy it.
> > 	  sibcall.c (optimize_sibling_and_tail_recursive_call,
> > 	  replace_call_placeholder) Use it.
> 
> I'm a bit confused by your intended usage here.  Are you
> intending CALL_EXPR_FORCE_TAILCALL to indicate that it is
> _possible_ to perform a tail call despite otherwise 
> potential problems with the local stack frame?  Or are you
> intending to indicate that this _will_ be a tail call.

Hmm, good question ;-)

Just the former, in general.  The front-end author can't guess at what
machine-specific or implementation-specific reasons there might be for
not doing a tail call.  All the front-end can do is to let the
back-end know that the locals are not live at this call,
and (perhaps) that the call is in a tail position.
So in general it can only be a hint.

However, in the special case where the parameter types and return type
of the callee matches the parameter types and return type of the caller,
and it the functions are not varargs functions, then the back-end
really ought to respect the hint.  Likewise, if the planned
forthcoming special "tailcall" calling convention is used, the
back-end really ought to respect the hint.  So in those cases,
it should be more than just a hint.

But since the current implementation of the gcc back-end doesn't have
the special tail call convention yet, and since it doesn't always
optimize tail calls even when the parameter and return types match and
the function is not a varargs function, the two statements in the
previous paragraph had better remain just implementation advice
("should") rather than a formal requirement ("shall"), at least in the
short term.  

In the long term it would be nice if one or the other or both could
eventually be made formal requirements.  Then front-ends for languages
like ML that really *need* tail calls could use this.

> The later usage seems more in tune with the name of the
> flag.

Good point.  How about I rename it as "CALL_EXPR_USE_TAILCALL"?
That seems more neutral.

Alternatively, it could be "CALL_EXPR_HINT_TAILCALL"
or "CALL_EXPR_TAILCALL_HINT".  That might be more
descriptive for now.  But as discussed above, in the
long term I want this flag to be more than a hint.

I suppose we could have two different flags, one for the hint, and one
for the requirement.  But those tree flag bits seem to be in fairly
short supply.  Is it worth using two flags?

> In which case the change to CALL_PLACEHOLDER is 
> not needed.  That is used to choose between alternate
> code sequences at some later date.  If we _know_ that 
> we are going to tail call, then we can simply generate
> either the tail recursion or sibling call sequence 
> directly and be done with it.

You're right here too, the change to the CALL_PLACEHOLDER rtx
is not needed.  This is true even if the flag is just a hint,
so long as the hint carries the impliciation that the call is
in a tail position, not just that the locals aren't live.  I was
thinking that the architecture- and calling-convention-specific
reasons why sibling call optimization might fail were checked
in optimize_sibling_and_tail_recursive_call().  But I see now
that we already check all those reasons in expand_call(), and
optimize_sibling_and_tail_recursive_call is just checking the
tail-ness of the call and the liveness of the locals.

> Hmm.  I do suppose that would require that inlining be
> disabled, or that it happen at the tree level.

Well, we do inlining in the Mercury compiler front-end anyway,
at the "High Level Data Structure" (annotated predicate
calculus) level, so inlining in the back-end is not critical
for us.

> But I'm not sure what sort of evil inlining would imply for 
> such functional languagaes as you are implementing.

There shouldn't be any problem, so long as you do proper
tail call handling for the resulting function after inlining.
That is, when inlining a call to a function foo that contains tail
calls to bar1, bar2, etc., if the call to foo is itself a tail
call then the inlined calls to bar1, bar2, etc. should remain
tail calls.

If the call to foo is not a tail call, then it's OK to
inline it and make the calls to bar1, bar2, etc. ordinary
calls rather than tail calls.  That can only increase
stack usage by a constant factor, I think.

Anyway, I'll post a revised diff sometime in the moderately near future.
Thanks for the review.

-- 
Fergus Henderson <fjh@cs.mu.oz.au>  |  "I have always known that the pursuit
                                    |  of excellence is a lethal habit"
WWW: < http://www.cs.mu.oz.au/~fjh >  |     -- the last words of T. S. Garp.

^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: forcing tail/sibling call optimization
  2001-01-03 14:59                 ` Fergus Henderson
@ 2001-01-03 15:32                   ` Richard Henderson
  2001-01-03 15:53                     ` Fergus Henderson
  2002-09-14 23:35                   ` Fergus Henderson
  1 sibling, 1 reply; 64+ messages in thread
From: Richard Henderson @ 2001-01-03 15:32 UTC (permalink / raw)
  To: Fergus Henderson; +Cc: gcc, gcc-patches

On Thu, Jan 04, 2001 at 09:58:31AM +1100, Fergus Henderson wrote:
> Just the former, in general.  The front-end author can't guess at what
> machine-specific or implementation-specific reasons there might be for
> not doing a tail call.  All the front-end can do is to let the
> back-end know that the locals are not live at this call,
> and (perhaps) that the call is in a tail position.
> So in general it can only be a hint.

Mostly what I meant here is "Does CALL_EXPR_TAILCALL imply tail-ness"?
I.e. can you only get it in constructs like "return foo()" and semantic
equivalences thereof.

> Good point.  How about I rename it as "CALL_EXPR_USE_TAILCALL"?
> That seems more neutral.

Sure.  Or just CALL_EXPR_TAILCALL.

> I suppose we could have two different flags, one for the hint, and one
> for the requirement.  But those tree flag bits seem to be in fairly
> short supply.  Is it worth using two flags?

Maybe.  Mark Mitchell has always wanted us to do that optimization
at the tree level instead of after the fact in rtl.  Doing that would
mean verifying tail-ness, validating argument safety, and setting the 
hint bit.  If the target rejects the tail call conversion, no harm
done.  Alternately, if the front end didn't really care if the tail
call didn't happen, it could query the target before setting the bit.

Let's just go with the one bit for now and make it mean "require".

> But I see now that we already check all those reasons in expand_call(),
> and optimize_sibling_and_tail_recursive_call is just checking the
> tail-ness of the call and the liveness of the locals.

Yep.  Most of the target-specific reasons why conversion might not
be possible depends on the types involved, so we have to perform
that check before we throw the types away.

> If the call to foo is not a tail call, then it's OK to
> inline it and make the calls to bar1, bar2, etc. ordinary
> calls rather than tail calls.  That can only increase
> stack usage by a constant factor, I think.

True, but I'm thinking of implementation correctness here.

If we emit a tail call sequence into the rtl directly, then
that is _going_ to be a tail call now and forevermore.  Thus
we can't perform rtl inlining with that sequence installed.

So we either have to annotate CALL_PLACEHOLDER as you did,
or prevent rtl inlining on functions that have had tail calls
expanded already.

Long term we want to get rid of rtl inlining entirely and
perform this operation at the tree level.  This already 
happens in the C++ front end, but hasn't been made general.

So short-term I'd rather just disable rtl inlining if we
expand a CALL_EXPR with CALL_EXPR_TAILCALL set.



r~

^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: forcing tail/sibling call optimization
  2001-01-03 15:32                   ` Richard Henderson
@ 2001-01-03 15:53                     ` Fergus Henderson
  2001-01-03 16:11                       ` Richard Henderson
  0 siblings, 1 reply; 64+ messages in thread
From: Fergus Henderson @ 2001-01-03 15:53 UTC (permalink / raw)
  To: Richard Henderson; +Cc: gcc, gcc-patches

On 03-Jan-2001, Richard Henderson <rth@redhat.com> wrote:
> Mostly what I meant here is "Does CALL_EXPR_TAILCALL imply tail-ness"?

I think the answer to that should be yes.

> CALL_EXPR_TAILCALL.

OK.

> if the front end didn't really care if the tail
> call didn't happen, it could query the target before setting the bit.

How can it query the target?  The query needed is not just "does this
target support tail calls", it is "does this target support *this*
tail call", and the answer can depend on lots of stuff that the front
end doesn't want to know about.

> Let's just go with the one bit for now and make it mean "require".

I don't think that will work.  The "require" bit is pretty useless
until GCC supports tail calls consistently, because AFAICT the
front-end can't safely set it, at least not without encoding too much
knowledge of the target and of GCC back-end limitations.

I'm happy to go with just one bit for now, but to be useful I think it
has to just mean "suggest" (to be specific: it is a guarantee from the
front-end to the back-end about liveness of locals and tail-ness, but
there should be no guarantee from the back-end that it will actually
use a tail call).

> So short-term I'd rather just disable rtl inlining if we
> expand a CALL_EXPR with CALL_EXPR_TAILCALL set.

OK, shall do.

-- 
Fergus Henderson <fjh@cs.mu.oz.au>  |  "I have always known that the pursuit
                                    |  of excellence is a lethal habit"
WWW: < http://www.cs.mu.oz.au/~fjh >  |     -- the last words of T. S. Garp.

^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: forcing tail/sibling call optimization
  2001-01-03 15:53                     ` Fergus Henderson
@ 2001-01-03 16:11                       ` Richard Henderson
  2001-01-03 16:36                         ` Fergus Henderson
  0 siblings, 1 reply; 64+ messages in thread
From: Richard Henderson @ 2001-01-03 16:11 UTC (permalink / raw)
  To: Fergus Henderson; +Cc: gcc, gcc-patches

On Thu, Jan 04, 2001 at 10:53:15AM +1100, Fergus Henderson wrote:
> How can it query the target?  The query needed is not just "does this
> target support tail calls", it is "does this target support *this*
> tail call", and the answer can depend on lots of stuff that the front
> end doesn't want to know about.

See FUNCTION_OK_FOR_SIBCALL; cfun->decl must be valid during the
query so that the target can compare return types (see the x86
version for why this matters).

> I don't think that will work.  The "require" bit is pretty useless
> until GCC supports tail calls consistently, because AFAICT the
> front-end can't safely set it, at least not without encoding too much
> knowledge of the target and of GCC back-end limitations.

Well, that depends.  If the language requires the conversion,
then the front end should just set the bit.  For the most part,
I believe that most of the common targets do support tail calls
in some form, so I wouldn't let that worry you overmuch.

In any case, whether the bit means "suggest" and failure elicits
a warning or whether it means "require" and failure elicits an
error is a mere detail at present.


r~

^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: forcing tail/sibling call optimization
  2001-01-03 16:11                       ` Richard Henderson
@ 2001-01-03 16:36                         ` Fergus Henderson
  0 siblings, 0 replies; 64+ messages in thread
From: Fergus Henderson @ 2001-01-03 16:36 UTC (permalink / raw)
  To: Richard Henderson; +Cc: gcc, gcc-patches

On 03-Jan-2001, Richard Henderson <rth@redhat.com> wrote:
> On Thu, Jan 04, 2001 at 10:53:15AM +1100, Fergus Henderson wrote:
> > On 03-Jan-2001, Richard Henderson <rth@redhat.com> wrote:
> > > if the front end didn't really care if the tail
> > > call didn't happen, it could query the target before setting the bit.
> >
> > How can it query the target?  The query needed is not just "does this
> > target support tail calls", it is "does this target support *this*
> > tail call", and the answer can depend on lots of stuff that the front
> > end doesn't want to know about.
> 
> See FUNCTION_OK_FOR_SIBCALL; cfun->decl must be valid during the
> query so that the target can compare return types (see the x86
> version for why this matters).

But that's just one of the many ways in which sibling call optimization
can fail.  There's lots of other ways it can fail in expand_call()
[see below].  Note that quite a few of these ways are just limitations
of the current implementation, not inherent requirements.  It seems
like a bad idea to duplicate that logic in the language front-end. 

P.S.

For completeness here a list of some of the other ways it can
fail in expand_call(). 

 |      || args_size.var)
 ...
 | #ifdef HAVE_sibcall_epilogue
 |       !HAVE_sibcall_epilogue
 | #else
 |       1
 | #endif
 ...
 |       /* Doing sibling call optimization needs some work, since
 |          structure_value_addr can be allocated on the stack.
 |          It does not seem worth the effort since few optimizable
 |          sibling calls will return a structure.  */
 |       || structure_value_addr != NULL_RTX
 |       /* If the register holding the address is a callee saved
 |          register, then we lose.  We have no way to prevent that,
 |          so we only allow calls to named functions.  */
 |       /* ??? This could be done by having the insn constraints
 |          use a register class that is all call-clobbered.  Any
 |          reload insns generated to fix things up would appear
 |          before the sibcall_epilogue.  */
 |       || fndecl == NULL_TREE
 |       || (flags & (ECF_RETURNS_TWICE | ECF_LONGJMP))
 |       || TREE_THIS_VOLATILE (fndecl)
 |       || !FUNCTION_OK_FOR_SIBCALL (fndecl)
 |       /* If this function requires more stack slots than the current
 |          function, we cannot change it into a sibling call.  */
 |       || args_size.constant > current_function_args_size
 |       /* If the callee pops its own arguments, then it must pop exactly
 |          the same number of arguments as the current function.  */
 |       || RETURN_POPS_ARGS (fndecl, funtype, args_size.constant)
 |          != RETURN_POPS_ARGS (current_function_decl,
 |                               TREE_TYPE (current_function_decl),
 |                               current_function_args_size))
 ...
 |       /* Handle calls that return values in multiple non-contiguous locations.
 |          The Irix 6 ABI has examples of this.  */
 |       else if (GET_CODE (valreg) == PARALLEL)
 |         {
 ...
 |           /* We can not support sibling calls for this case.  */
 |           sibcall_failure = 1;


-- 
Fergus Henderson <fjh@cs.mu.oz.au>  |  "I have always known that the pursuit
                                    |  of excellence is a lethal habit"
WWW: < http://www.cs.mu.oz.au/~fjh >  |     -- the last words of T. S. Garp.

^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: forcing tail/sibling call optimization
  2001-01-03 14:59                 ` Fergus Henderson
  2001-01-03 15:32                   ` Richard Henderson
@ 2002-09-14 23:35                   ` Fergus Henderson
  2002-09-16  9:26                     ` Richard Henderson
  1 sibling, 1 reply; 64+ messages in thread
From: Fergus Henderson @ 2002-09-14 23:35 UTC (permalink / raw)
  To: Richard Henderson; +Cc: gcc, gcc-patches

[-- Attachment #1: Type: text/plain, Size: 2287 bytes --]

I've finally gotten around to having another go at my patches which
allow the front-end to indicate to the back-end when it is safe to do
a tail call.

Firstly, I have a question about coding style.
Which of the following two alternative code samples attached is preferred?
The first one is more concise, but the code checking the list of reasons
for sibcall failure is duplicated.  The second is longer, but the
code dealing with each reason for sibcall failure is in a single place.

Secondly, I have some comments and a question in response to Richard's review
comments.  The semantics I would like for the flag are these:

	If CALL_EXPR_TAILCALL flag is set, this indicates that the
	call is in a tail position, and the caller's stack frame is
	no longer live.  The back-end shall report an error if the
	call is not in a tail position.  The back-end should trust
	the front-end about the liveness of the caller's stack
	frame; any use of the caller's stack frame after such a
	call is undefined behaviour.  The back-end will report a
	diagnostic if the call cannot be optimized as a tail call.

These semantics require the back-end to report an error if the front-end
marks a call as a tail call but the call is not in a tail position, which
requires keeping the CALL_PLACEHOLDER.  I think the error check is useful.
So I'd rather keep the CALL_PLACEHOLDER.  Is that OK?

> On 03-Jan-2001, Richard Henderson <rth@redhat.com> wrote:
> > I'm a bit confused by your intended usage here.  Are you
> > intending CALL_EXPR_FORCE_TAILCALL to indicate that it is
> > _possible_ to perform a tail call despite otherwise 
> > potential problems with the local stack frame?  Or are you
> > intending to indicate that this _will_ be a tail call.
> > In which case the change to CALL_PLACEHOLDER is 
> > not needed.  That is used to choose between alternate
> > code sequences at some later date.  If we _know_ that 
> > we are going to tail call, then we can simply generate
> > either the tail recursion or sibling call sequence 
> > directly and be done with it.

-- 
Fergus Henderson <fjh@cs.mu.oz.au>  |  "I have always known that the pursuit
The University of Melbourne         |  of excellence is a lethal habit"
WWW: <http://www.cs.mu.oz.au/~fjh>  |     -- the last words of T. S. Garp.

[-- Attachment #2: sibcall-test1 --]
[-- Type: text/plain, Size: 1649 bytes --]

  /* Tail calls can make things harder to debug, and we're traditionally
     pushed these optimizations into -O2.  Don't try if we're already
     expanding a call, as that means we're an argument.  Don't try if
     there's cleanups, as we know there's code to follow the call.

     If rtx_equal_function_value_matters is false, that means we've
     finished with regular parsing.  Which means that some of the
     machinery we use to generate tail-calls is no longer in place.
     This is most often true of sjlj-exceptions, which we couldn't
     tail-call to anyway.  */

  if (currently_expanding_call++ != 0
      || !flag_optimize_sibling_calls
      || !rtx_equal_function_value_matters
      || any_pending_cleanups (1)
      || args_size.var)
    {
      try_tail_call = try_tail_recursion = 0;
      if (marked_as_tail_call)
        {
          if (currently_expanding_call != 1)
            error("call marked as tail call is not in tail position (it is an argument to another call)");
          else if (!flag_optimize_sibling_calls)
	    warning("can't optimize tail call: -foptimize-sibling-calls not set");
          else if (any_pending_cleanups)
            error("call marked as tail call is not in tail position (there are pending cleanups)");
          else if (!rtx_equal_function_value_matters)
            /* Should this be an internal_error rather than a warning? */
	    warning("can't optimize tail call emitted after regular parsing");
          else if (args_size.var)
	    warning("can't optimize tail call for variable-argument function");
	  else
	    abort();
	  warned_about_tail_call = true;
        }
    }


[-- Attachment #3: sibcall-test2 --]
[-- Type: text/plain, Size: 2094 bytes --]

  /* Don't try tail calling if we're already expanding a call, as that means
     we're an argument. */

  if (currently_expanding_call++ != 0)
    {
      if (marked_as_tail_call)
        {
          error("call marked as tail call is not in tail position (it is an argument to another call)");
	  warned_about_tail_call = true;
	}
      try_tail_call = try_tail_recursion = 0;
    }

  /* Tail calls can make things harder to debug, and we've traditionally
     pushed these optimizations into -O2.  */

  if (!flag_optimize_sibling_calls != 0)
    {
      if (marked_as_tail_call)
        {
	  inform("can't optimize tail call: -foptimize-sibling-calls not set");
	  warned_about_tail_call = true;
	}
      try_tail_call = try_tail_recursion = 0;
    }

  /* Don't try tail calling if there's cleanups, as we know there's code to
     follow the call. */

  if (any_pending_cleanups != 0)
    {
      if (marked_as_tail_call)
        {
          error("call marked as tail call is not in tail position (there are pending cleanups)");
	  warned_about_tail_call = true;
	}
      try_tail_call = try_tail_recursion = 0;
    }

  /* If rtx_equal_function_value_matters is false, that means we've
     finished with regular parsing.  Which means that some of the
     machinery we use to generate tail-calls is no longer in place.
     This is most often true of sjlj-exceptions, which we couldn't
     tail-call to anyway.  */

  if (!rtx_equal_function_value_matters)
    {
      if (marked_as_tail_call)
        {
          /* Should this be an internal_error rather than a warning? */
	  warning("can't optimize tail call emitted after regular parsing");
	  warned_about_tail_call = true;
	}
      try_tail_call = try_tail_recursion = 0;
    }

  /* Don't try tail calling for variable-argument functions.
     The current implementation won't handle those.  */

  if (args_size.var)
    {
      if (marked_as_tail_call)
        {
	  warning("can't optimize tail call for variable-argument function");
	  warned_about_tail_call = true;
	}
      try_tail_call = try_tail_recursion = 0;
    }


^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: forcing tail/sibling call optimization
  2002-09-14 23:35                   ` Fergus Henderson
@ 2002-09-16  9:26                     ` Richard Henderson
  0 siblings, 0 replies; 64+ messages in thread
From: Richard Henderson @ 2002-09-16  9:26 UTC (permalink / raw)
  To: Fergus Henderson; +Cc: gcc, gcc-patches

On Sun, Sep 15, 2002 at 04:30:41PM +1000, Fergus Henderson wrote:
> Firstly, I have a question about coding style.
> Which of the following two alternative code samples attached is preferred?
> The first one is more concise, but the code checking the list of reasons
> for sibcall failure is duplicated.  The second is longer, but the
> code dealing with each reason for sibcall failure is in a single place.

I prefer the second.  Alternately, something like
function_cannot_inline_p which returns a string with
the reason for failure.

> Secondly, I have some comments and a question in response to Richard's review
> comments.  The semantics I would like for the flag are these:
> 
> 	If CALL_EXPR_TAILCALL flag is set, this indicates that the
> 	call is in a tail position, and the caller's stack frame is
> 	no longer live.  The back-end shall report an error if the
> 	call is not in a tail position.  The back-end should trust
> 	the front-end about the liveness of the caller's stack
> 	frame; any use of the caller's stack frame after such a
> 	call is undefined behaviour.  The back-end will report a
> 	diagnostic if the call cannot be optimized as a tail call.
> 
> These semantics require the back-end to report an error if the front-end
> marks a call as a tail call but the call is not in a tail position, which
> requires keeping the CALL_PLACEHOLDER.  I think the error check is useful.
> So I'd rather keep the CALL_PLACEHOLDER.  Is that OK?

Yes, that's fine.


r~

^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: forcing tail/sibling call optimization
@ 2000-11-29  4:58 Robert Dewar
  0 siblings, 0 replies; 64+ messages in thread
From: Robert Dewar @ 2000-11-29  4:58 UTC (permalink / raw)
  To: fjh, mark; +Cc: gcc

<<  I think we're going to have to agree to disagree here.  It's still
my position that we shouldn't introduce an extension for the tail-call
stuff, because I don't view the translation of functional languages
into C as fundamentally the right way to go.  It would be so much
better to write a Scheme front-end for GCC -- and I would totally
support, then, providing a bit on CALL_EXPRs to force tailcalls.
Presumably, the Scheme front-end would set this bit.
>>

Mark, you might disagree, but the approach of translating into C in cases
like this is VERY well established, and is pretty much a standard 
technique. It is not so terrible to say, sorry, gcc cannot support your
needs, but it seems a bit gratuitious to say, sorry, gcc won't support
your needs, because we think you are doing the wrong thing.

So anyway, add me to the list of disagreers here, but as you say, 
no big deal that people disagree on this :-)

^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: forcing tail/sibling call optimization
  2000-11-27 14:42     ` Harvey J. Stein
@ 2000-11-27 16:07       ` Mark Probst
  0 siblings, 0 replies; 64+ messages in thread
From: Mark Probst @ 2000-11-27 16:07 UTC (permalink / raw)
  To: gcc, Harvey J. Stein

On Mon, Nov 27, 2000 at 03:33:03PM -0500, Harvey J. Stein wrote:
> Except that this extension already exists - functions can already have
> the __stdcall__ attribute, and gcc can be called with the -mrtd
> switch.  I think that's all the calling convention support that's
> necessary.

this is only available for i386 and it doesn't handle more cases than
the standard sibcall optimization, i.e. no variable argument
functions, no calls to functions taking more arguments (more argument
space) than the caller, no indirect function calls, ...

bye
schani

-- 
Mark Probst
Student, Programmer, Juggler
http://www.complang.tuwien.ac.at/~schani/

^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: forcing tail/sibling call optimization
@ 2000-11-27 15:33 Mike Stump
  0 siblings, 0 replies; 64+ messages in thread
From: Mike Stump @ 2000-11-27 15:33 UTC (permalink / raw)
  To: dewar, freitag, law; +Cc: gcc

> To: freitag@alancoxonachip.com, law@redhat.com
> Date: Mon, 27 Nov 2000 11:44:00 -0500 (EST)
> From: dewar@gnat.com (Robert Dewar)

> I think this misses the point

Further, most people are missing the point, it is _NOT_ an
optimization.  Just as a call produces code to call a routine, a
recursive tail call is a direction to produce a goto.

When you think of it this way, you realize, you don't need a warning
when it doesn't apply, because it _must_ apply.  If the abi needs to
be different, then it needs to be different.  If you have to have an
attribute, well, that is the price to pay.  If you have to have it
prototyped in all places, than that is the price to pay.  If varargs
doesn't work with it, well, than that is a documented limitation.

We need to see if what we can come up with is suitable for what people
need from C as an assembler.  I suspect they will just say fine, no
problem, tweak the generation slightly, and presto.

^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: forcing tail/sibling call optimization
  2000-11-27 10:22   ` Mark Probst
@ 2000-11-27 14:42     ` Harvey J. Stein
  2000-11-27 16:07       ` Mark Probst
  0 siblings, 1 reply; 64+ messages in thread
From: Harvey J. Stein @ 2000-11-27 14:42 UTC (permalink / raw)
  To: Mark Probst; +Cc: gcc, Jeffrey A Law

Mark Probst <schani@mips.complang.tuwien.ac.at> writes:

 > On Mon, Nov 27, 2000 at 10:45:14AM -0700, Jeffrey A Law wrote:
 > > I don't care if slows down or speeds up execution -- dependence on this kind
 > > of transformation in languages such as C is terribly bad.  C code which
 > > relies on this transformation is broken.
 > 
 > as has been pointed out, this kind of transformation cannot be
 > implemented as a simple optimization since it requires a different
 > calling convention. hence, it can only be implemented as a gnu
 > extension to the c language. as such, it is as legitimate as nested
 > functions or label pointers, imho.

Except that this extension already exists - functions can already have
the __stdcall__ attribute, and gcc can be called with the -mrtd
switch.  I think that's all the calling convention support that's
necessary.

The only thing new that's needed in the compiler is special treatment
of return(g(...)) when it occurs in a __stdcall__ function and g is a
__stdcall__ fcn too.  More specifically, if f is a __stdcall__ and g
is a __stdcall__, then when encountering return(g(...)) inside of f,
GCC should compile it to "pop f's stack & args, push g's args, jump to
g", instead of "push g's args, call g, pop f's stack, return what g
returned".

-- 
Harvey Stein
Bloomberg LP
hjstein@bfr.co.il

^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: forcing tail/sibling call optimization
  2000-11-27  9:44 ` Jeffrey A Law
  2000-11-27 10:22   ` Mark Probst
@ 2000-11-27 14:30   ` Harvey J. Stein
  1 sibling, 0 replies; 64+ messages in thread
From: Harvey J. Stein @ 2000-11-27 14:30 UTC (permalink / raw)
  To: law; +Cc: Robert Dewar, freitag, gcc

Jeffrey A Law <law@redhat.com> writes:

 > In message < 20001127164400.0929334D80@nile.gnat.com >you write:
 > > 
 > > I think this misses the point, this is not just an optimization,
 > > it is a fundamental functional capability, in other words, we
 > > would want the compiler to do this EVEN IF it slowed down
 > > execution.
 >
 > I don't care if slows down or speeds up execution -- dependence on
 > this kind of transformation in languages such as C is terribly bad.
 > C code which relies on this transformation is broken.

It would be a nice thing to have for Scheme implementations.  Lots of
Scheme implementations these days translate to C & then use the C
compiler.  Those that translate Scheme fcns to C fcns are the simplest
to build and understand (not to mention allowing reasonably clean use
of the debugger on the compiled code), but are not properly tail
recursive, as the Scheme standards require.  Those that are properly
tail recursive translate to one big ugly C function and implement
calling conventions themselves.  This is more work, isn't so great for
readability of the C code, strains the C compiler, and has problems
with C<->Scheme interoperability.

Trying to make C properly tail recursive is probably impossible.  In
C, the caller is responsible for pushing and popping arguments onto
and off of the stack, making tail recursion elimination a global
optimization problem instead of a local problem.

However, it becomes trivial with different calling conventions.  If
the caller was responsible for pushing arguments onto the stack and
the callee was responsible for popping the arguments, the callee would
be in a position to rewrite the stack whenever it's jumping to another
function instead of calling it.  That's all that's required for proper
tail recursion, namely rewriting the stack and jumping instead of
pushing the stack and calling whenever the function is going to return
what the callee returns.

This modified calling convention may also require telling the callee
how much stack space the caller used to call the callee, which
probably requires an additional argument or a register dedicated to
such.  This might only be needed, however, for variable argument
functions.

If one were to add support for this alternate calling convention (say
with a command line switch), then Scheme implementers (and anyone else
who needs proper tail recursion) would at least be able to compile
self contained Scheme programs by doing straight Scheme fcn -> C fcn
translation and wouldn't lose proper tail recursion.

If one were to add support for this directly into the language
supported by gcc (for example, with function attributes), then it
could be used to preserve proper tail recursion while still supporting 
standard calling conventions, thus enabling proper tail recursion as
well as utilization of all the standard libraries.

There're also some precedents for doing both of these.  For one thing,
you see it in MSDOS with the PASCAL vs C calling conventions.  For
that matter, GCC already has -mrtd as well as the __stdcall__
attribute, and aside from telling the callee the amount of stack space
the caller used, these are basically all that's needed wrt to calling
convention changes.

Of course, in addition to supporting this calling convention, GCC has
to take advantage of it.  To get proper tail recursion with these
calling conventions, all GCC would have to do is when it's compiling a
function (call it F) which uses this alternate calling convention, and
F calls a function (call it G) which also uses this alternate calling
convention, and F is merely calling G and returning whatever G
returns, then instead of pushing G's args, calling G, poping G's args
& returning what G returned, it should pop F's args, push G's args &
jump to G.  It can jump to G directly because G is going to pop its
own args.  It can rewrite the stack because in this calling
convention, F is responsible for cleaning up the stack, instead of the
caller of F being responsible.

In fact, for the implementers of properly tail recursive languages,
all that would be needed would be that GCC do the above for code of
the form return(g(....)).

Given that GCC is already so close to being able to do this, why not
just take the last step?

-- 
Harvey Stein
Bloomberg LP
hjstein@bfr.co.il

^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: forcing tail/sibling call optimization
  2000-11-27  9:39 Geert Bosch
@ 2000-11-27 12:06 ` Jeffrey A Law
  0 siblings, 0 replies; 64+ messages in thread
From: Jeffrey A Law @ 2000-11-27 12:06 UTC (permalink / raw)
  To: Geert Bosch; +Cc: gcc

  In message < 20001127173909.2975234D80@nile.gnat.com >you write:
  > In order to write a GCC front-end for any language with proper 
  > tail-recursion,  there should be a way in the tree to represent a tail call
Right.  We don't have that ability right now, but I'd support adding it for
those languages which need it.


  > If the GCC backend is not able to compile this on a certain architecture,
  > it abandons the compilation with an error.
Right.
jeff

^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: forcing tail/sibling call optimization
  2000-11-27  9:44 ` Jeffrey A Law
@ 2000-11-27 10:22   ` Mark Probst
  2000-11-27 14:42     ` Harvey J. Stein
  2000-11-27 14:30   ` Harvey J. Stein
  1 sibling, 1 reply; 64+ messages in thread
From: Mark Probst @ 2000-11-27 10:22 UTC (permalink / raw)
  To: gcc, Jeffrey A Law

On Mon, Nov 27, 2000 at 10:45:14AM -0700, Jeffrey A Law wrote:
> I don't care if slows down or speeds up execution -- dependence on this kind
> of transformation in languages such as C is terribly bad.  C code which
> relies on this transformation is broken.

as has been pointed out, this kind of transformation cannot be
implemented as a simple optimization since it requires a different
calling convention. hence, it can only be implemented as a gnu
extension to the c language. as such, it is as legitimate as nested
functions or label pointers, imho.

> About the only thing worse would be to sit down, examine the compiler's output
> to see when it makes the transformation, then twiddle the code to make it
> compiler friendly, then complain when the next rev of the compiler doesn't
> behave in the same manner :-)

no. the semantics of this transformation have to be clearly specified.

bye
schani

-- 
Mark Probst
Student, Programmer, Juggler
http://www.complang.tuwien.ac.at/~schani/

^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: forcing tail/sibling call optimization
  2000-11-27  9:14 ` Bernd Schmidt
@ 2000-11-27 10:09   ` Michael Matz
  0 siblings, 0 replies; 64+ messages in thread
From: Michael Matz @ 2000-11-27 10:09 UTC (permalink / raw)
  To: Bernd Schmidt; +Cc: Robert Dewar, freitag, gcc, law

Hi,

On Mon, 27 Nov 2000, Bernd Schmidt wrote:
> > 
> > That is certainly true in C, but it makes C useless as a target for the
> > kind of translation being considered here, where the "optimization" is
> > a fundamental requirement. The discussion on the above subject line,
> > is all about creating a variant language that WOULD have the required
> > semantic characteristics.
> 
> My point is that gcc may not be the best starting point for implementing
> such a variant language.

But still one of the best starting points there is (not that there are
many of them besides those labeled "Starting from scratch: how to
write an optimizing compiler") ;-)


Ciao,
Michael.

^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: forcing tail/sibling call optimization
@ 2000-11-27 10:04 Robert Dewar
  0 siblings, 0 replies; 64+ messages in thread
From: Robert Dewar @ 2000-11-27 10:04 UTC (permalink / raw)
  To: dewar, law; +Cc: freitag, gcc

<<In contrast, dependence on this kind of transformation in other languages
might be quite reasonable (and I believe there are languages which explicitly
require these transformations).  Typically those languages have conventions
which significantly ease these kinds of transformations.

We're apparently not communicating well.
>>

Definitely not!

We have a *functional* requirement in the source language to use O(1)
space in these cases.

We want to buidl a translator that translates into a language similar
to C (C-- was mentioned in this context).

GNU-C is *almost* usable as a target, but not quite, because of the
lack of control over sibling call "optimization".

Yes, of course it is terrible for a user program to depend on optimization,
but that has nothing at all to do with the situation here.

It seems quite reasonable to me to consider adding functionality to
GNU-C to support this requirement. Of course it would never be appropriate
for a programmer writing in C to use this feature directly in C code (well
almost never).

I don't see what the big deal is in implementing something useful here,
and it definitely does meet a need.

<<Optimizations can occur across several axis -- execution speed, code density,
stack space, etc etc etc.  This transformation is an optimization in stack
space and sometimes code speed.
>>

No, it is NOT an optimization in stack space, it is a functional feature
that is needed to ensure proper translation of the source program.

^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: forcing tail/sibling call optimization
  2000-11-27  8:44 Robert Dewar
@ 2000-11-27  9:44 ` Jeffrey A Law
  2000-11-27 10:22   ` Mark Probst
  2000-11-27 14:30   ` Harvey J. Stein
  0 siblings, 2 replies; 64+ messages in thread
From: Jeffrey A Law @ 2000-11-27  9:44 UTC (permalink / raw)
  To: Robert Dewar; +Cc: freitag, gcc

  In message < 20001127164400.0929334D80@nile.gnat.com >you write:
  > <<Therein lies the first problem -- programmer dependence on specific
  > optimizations in the compiler.  That's a fundamental mistake.
  > >>
  > 
  > I think this misses the point, this is not just an optimization, it is
  > a fundamental functional capability, in other words, we would want the
  > compiler to do this EVEN IF it slowed down execution.
I don't care if slows down or speeds up execution -- dependence on this kind
of transformation in languages such as C is terribly bad.  C code which
relies on this transformation is broken.

About the only thing worse would be to sit down, examine the compiler's output
to see when it makes the transformation, then twiddle the code to make it
compiler friendly, then complain when the next rev of the compiler doesn't
behave in the same manner :-)

In contrast, dependence on this kind of transformation in other languages
might be quite reasonable (and I believe there are languages which explicitly
require these transformations).  Typically those languages have conventions
which significantly ease these kinds of transformations.

We're apparently not communicating well.  


  > difference between an optimization that simply saves some time, and
  > a fundamental transformation that changes the computational nature
  > of the program. Changing the amount of storage used from O(who knows
  > what) to O(1) is not simply an "optimization".
Optimizations can occur across several axis -- execution speed, code density,
stack space, etc etc etc.  This transformation is an optimization in stack
space and sometimes code speed.

jeff

^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: forcing tail/sibling call optimization
@ 2000-11-27  9:39 Geert Bosch
  2000-11-27 12:06 ` Jeffrey A Law
  0 siblings, 1 reply; 64+ messages in thread
From: Geert Bosch @ 2000-11-27  9:39 UTC (permalink / raw)
  To: law; +Cc: gcc

On Mon, 27 Nov 2000 09:26:06 -0700, Jeffrey A Law wrote:

  If you _must_ have tail calls optimized to gotos, then write them that way
  or use a language which guarantees tail calls will turn into gotos.  If
  tail call optimizations are merely convenience, then write them as calls :-)

The original request by Fergus Henderson was:

<<This feature is important for compilers of high-level functional and
  logic programming languages that target C.  For such languages,
  recursion is the primary form of iteration, and it is important that
  sibling calls be optimized.  Often the high-level language compiler has
  more information about when it is safe to do tail calls than gcc has,
  but currently there is no way for the high-level language compiler to
  communicate that to gcc.

  The language C-- (see <www.cminusminus.org>), which is designed to be
  ideal as a target language, provides this feature.  I would like GNU C
  to provide it too.>>

I think that adding extentions to GNU C in order to support the (changing) 
C-- language is a bad idea.  On the other hand, extending GCC to allow 
front-ends to implement such a language seems like a worthwhile goal.  

In order to write a GCC front-end for any language with proper 
tail-recursion,  there should be a way in the tree to represent a tail call. 
If the GCC backend is not able to compile this on a certain architecture,
it abandons the compilation with an error.

  -Geert



^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: forcing tail/sibling call optimization
  2000-11-27  9:08 Robert Dewar
@ 2000-11-27  9:14 ` Bernd Schmidt
  2000-11-27 10:09   ` Michael Matz
  0 siblings, 1 reply; 64+ messages in thread
From: Bernd Schmidt @ 2000-11-27  9:14 UTC (permalink / raw)
  To: Robert Dewar; +Cc: freitag, gcc, law

On Mon, 27 Nov 2000, Robert Dewar wrote:

> <<IMHO we shouldn't encourage users to rely on gcc being able to optimize
> tail calls.  It's a nice optimization, but if a program relies on it for
> correctness, that program is broken.
> >>
> 
> That is certainly true in C, but it makes C useless as a target for the
> kind of translation being considered here, where the "optimization" is
> a fundamental requirement. The discussion on the above subject line,
> is all about creating a variant language that WOULD have the required
> semantic characteristics.

My point is that gcc may not be the best starting point for implementing
such a variant language.


Bernd

^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: forcing tail/sibling call optimization
@ 2000-11-27  9:08 Robert Dewar
  2000-11-27  9:14 ` Bernd Schmidt
  0 siblings, 1 reply; 64+ messages in thread
From: Robert Dewar @ 2000-11-27  9:08 UTC (permalink / raw)
  To: bernds, freitag; +Cc: gcc, law

<<IMHO we shouldn't encourage users to rely on gcc being able to optimize
tail calls.  It's a nice optimization, but if a program relies on it for
correctness, that program is broken.
>>

That is certainly true in C, but it makes C useless as a target for the
kind of translation being considered here, where the "optimization" is
a fundamental requirement. The discussion on the above subject line,
is all about creating a variant language that WOULD have the required
semantic characteristics.

^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: forcing tail/sibling call optimization
@ 2000-11-27  8:44 Robert Dewar
  2000-11-27  9:44 ` Jeffrey A Law
  0 siblings, 1 reply; 64+ messages in thread
From: Robert Dewar @ 2000-11-27  8:44 UTC (permalink / raw)
  To: freitag, law; +Cc: gcc

<<Therein lies the first problem -- programmer dependence on specific
optimizations in the compiler.  That's a fundamental mistake.
>>

I think this misses the point, this is not just an optimization, it is
a fundamental functional capability, in other words, we would want the
compiler to do this EVEN IF it slowed down execution. There is a big
difference between an optimization that simply saves some time, and
a fundamental transformation that changes the computational nature
of the program. Changing the amount of storage used from O(who knows
what) to O(1) is not simply an "optimization".

^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: forcing tail/sibling call optimization
  2000-11-26 15:46 Robert Dewar
  2000-11-26 16:21 ` Joseph S. Myers
  2000-11-26 18:08 ` Fergus Henderson
@ 2000-11-26 21:50 ` Jeffrey A Law
  2 siblings, 0 replies; 64+ messages in thread
From: Jeffrey A Law @ 2000-11-26 21:50 UTC (permalink / raw)
  To: Robert Dewar; +Cc: fjh, bernds, gcc

  In message < 20001126234619.A52E534D80@nile.gnat.com >you write:
  > Note that if you want to make a warning message a formal requirement,
  > then you must provide a formal definition of what a warning message is,
  > and I do not think you want to get into that sort of thing :-)
  > 
  > I think there is real merit in providing fairly formal specifications
  > for GNU C enhancements for three reasons.
  > 
  > 1. The attempt to produce a FFS can often reveal difficulties that are
  > not apparent in a less formal approach.
I couldn't agree more.  The rather informal approach has led to numerous
problems over the years - even for those extensions which are considered
"well documented".  

jeff

^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: forcing tail/sibling call optimization
@ 2000-11-26 18:09 Robert Dewar
  0 siblings, 0 replies; 64+ messages in thread
From: Robert Dewar @ 2000-11-26 18:09 UTC (permalink / raw)
  To: dewar, fjh; +Cc: bernds, gcc

<<I happen to think that it would be good for ANSI/ISO language standards
to include requirements for warnings.  However, that is a separate battle,
and for this proposal I'm happy to amend the formal spec so that the
warnings are merely implementation advice rather than requirements.
>>

It is an unwinnable battle I would say, since a formal specification
of what constitutes a warning msg is impossible. Note that not even
the Ada standard *requires* error messages as such (it is just to
hard to define what an error message is). The only formal requirement
in Ada is that certain errors be "be detected prior to run time".

It is a common misconception that making things implementation advice
rather than implementation requirements weakens the "requirement", but
in fact IA is often much stronger, since you can say things in IA
that you could never get away with in formal definitions.

^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: forcing tail/sibling call optimization
  2000-11-26 15:46 Robert Dewar
  2000-11-26 16:21 ` Joseph S. Myers
@ 2000-11-26 18:08 ` Fergus Henderson
  2000-11-26 21:50 ` Jeffrey A Law
  2 siblings, 0 replies; 64+ messages in thread
From: Fergus Henderson @ 2000-11-26 18:08 UTC (permalink / raw)
  To: Robert Dewar; +Cc: bernds, gcc

On 26-Nov-2000, Robert Dewar <dewar@gnat.com> wrote:
> Note that if you want to make a warning message a formal requirement,
> then you must provide a formal definition of what a warning message is,
> and I do not think you want to get into that sort of thing :-)

It's not hard to give a formal definition for warnings, at least to
the level of precision used in the C standard.  But that discussion is
probably not appropriate for this forum.

> I think there is real merit in providing fairly formal specifications
> for GNU C enhancements

I agree.

-- 
Fergus Henderson <fjh@cs.mu.oz.au>  |  "I have always known that the pursuit
                                    |  of excellence is a lethal habit"
WWW: < http://www.cs.mu.oz.au/~fjh >  |     -- the last words of T. S. Garp.

^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: forcing tail/sibling call optimization
  2000-11-26 15:27 Robert Dewar
@ 2000-11-26 17:56 ` Fergus Henderson
  0 siblings, 0 replies; 64+ messages in thread
From: Fergus Henderson @ 2000-11-26 17:56 UTC (permalink / raw)
  To: Robert Dewar; +Cc: bernds, gcc

On 26-Nov-2000, Robert Dewar <dewar@gnat.com> wrote:
> <<In ANSI/ISO C: no.  But the ANSI/ISO C standard is not a good model
> for how to write specifications.
> >>
> 
> I disagree, it is an EXCELLENT model for how to write *specifications*.
>
> <<In GNU C: yes.  For example, many parts of the GNU C manual say
> that if a certain option is enabled, the compiler will issue
> certain warnings.  If the GNU C manual is treated as a specification,
> these parts of the manual are definitely requirements on the compiler.
> >>
> 
> The GNU C manual is nowhere near a formal specification, it is fine
> for a compiler manual to talk about warnings, but it is not possible
> to "treat" the GNU C manual as a specification, it is just not 
> precise enough.

Sorry, I was not clear.  When I said that the ANSI/ISO C standard is
not a good model [see Footnote], I meant in comparison to e.g. the Ada
standard.  I certainly did not mean to imply that the GNU C manual was
a better model!

> So if you want to state what you are proposing as a formal specification,
> the "requirement" for a warning should be implementation advice. After all
> we expect that GNU C will follow implementation advice for features that
> we design :-)

I happen to think that it would be good for ANSI/ISO language standards
to include requirements for warnings.  However, that is a separate battle,
and for this proposal I'm happy to amend the formal spec so that the
warnings are merely implementation advice rather than requirements.

[Footnote]
The ANSI/ISO C standard is not a good model, because
(1) the notions of conformance that it defines are not useful
    ("conforming programs" is the universal set and
     "strictly conforming programs" can't produce any output).
(2) it is very imprecise in many parts
(3) it took the C committee a long time to accept the idea of
    incorporating "implementation advice" or "recommended practice"
    sections; the C89 standard didn't have any of those,
    and even in C99 there are still many parts of the standard
    which are phrased as normative but using deliberately poorly defined 
    terms, in some cases deliberately so because they didn't have
    any concept of "implementation advice" sections.

-- 
Fergus Henderson <fjh@cs.mu.oz.au>  |  "I have always known that the pursuit
                                    |  of excellence is a lethal habit"
WWW: < http://www.cs.mu.oz.au/~fjh >  |     -- the last words of T. S. Garp.

^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: forcing tail/sibling call optimization
  2000-11-26 11:34   ` Per Bothner
  2000-11-26 11:55     ` Mark Probst
@ 2000-11-26 17:40     ` Fergus Henderson
  1 sibling, 0 replies; 64+ messages in thread
From: Fergus Henderson @ 2000-11-26 17:40 UTC (permalink / raw)
  To: Per Bothner; +Cc: gcc

On 26-Nov-2000, Per Bothner <per@bothner.com> wrote:
> I'm concerned that on many machines the default calling convention
> may make tail-call optimization difficult.  To make tail-call
> optimization practical (or at least efficient) you may want to use
> a non-default calling convention.

For compiling functional/logic languages to C, it's important to
get tail-call optimization, even if the code isn't efficient.

Making it efficient is desirable, of course, but I would like some
way to tell the C compiler to generate a tail call, even if this
is going to be slower than an ordinary call, so that we can get
O(1) stack space usage.

(I understand that implementing that in all possible cases is not
easy, that's why my specification defined it as a hint, with a warning
if the hint can't be applied.  However, in the long term it would be
nice if all the places where that warning is issued could be fixed
to instead generate appropriate code.)

> Specifically, the traditional C calling convention has the caller
> pop the arguments from the stack after the function returns.  This
> is needed to handle varags.  Safe and efficient tail-call
> elimination assumes the callee pops the arguments.
>
> This means that syntax associated with the call site is not enough.
>
> You also (or instead) need some annotation when the called function
> is compiled.

Providing alternative calling conventions which make tail calls
more efficient is a good idea.  However, I think the `__tailcall'
extension is useful even without it.  I think the two are separate
features, and I would like to add `__tailcall' support to GNU C
without waiting for appropriate alternative calling conventions to be
implemented.

Is that OK?

-- 
Fergus Henderson <fjh@cs.mu.oz.au>  |  "I have always known that the pursuit
                                    |  of excellence is a lethal habit"
WWW: < http://www.cs.mu.oz.au/~fjh >  |     -- the last words of T. S. Garp.

^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: forcing tail/sibling call optimization
  2000-11-26 15:46 Robert Dewar
@ 2000-11-26 16:21 ` Joseph S. Myers
  2000-11-26 18:08 ` Fergus Henderson
  2000-11-26 21:50 ` Jeffrey A Law
  2 siblings, 0 replies; 64+ messages in thread
From: Joseph S. Myers @ 2000-11-26 16:21 UTC (permalink / raw)
  To: Robert Dewar; +Cc: fjh, bernds, gcc

On Sun, 26 Nov 2000, Robert Dewar wrote:

> I think there is real merit in providing fairly formal specifications
> for GNU C enhancements for three reasons.
>
> 1. The attempt to produce a FFS can often reveal difficulties that are
> not apparent in a less formal approach.
>
> 2. If other C compilers want to copy the feature, which I assume we
> welcome, then there is a clear guide.
>
> 3. If the ISO committee wants to copy the feature, which I assume
> we welcome, then there is a clear guide.

Indeed, it seems fairly impossible at present to work out what the
specification of most GNU C extensions is supposed to be.

A few months ago I noted problems with statement expressions.  (See c/772
for conclusions from that long discussion.)

Consider __attribute__, one of the most useful and widely used extensions.

The syntax is nowhere specified.  This was noted over three years ago
( http://bugs.debian.org/12253 - no documentation of where to put an
attribute on a function definition).

Attributes on labels seem not to be properly documented.  The docs for
-Wunused-label say you can put the "unused" attribute on a label, but the
syntax is nowhere stated.

Attributes do not work properly inside nested declarators - many of them
should apply to a function type but instead apply only to a function
declaration.  The grammar has them as parts of declarations instead of
parts of declarators, but function attributes clearly ought to be adding
alternatives to the productions "direct-declarator: direct-declarator (
parameter-type-list )" and "direct-declarator: direct-declarator (
identifier-list_opt )" (from C99).  Some attributes, if they can get
parsed, do get attached to the function type (e.g. noreturn), but others
don't; e.g. the format attributes work by adding to a list of format
functions stored by DECL_NAME and DECL_ASSEMBLER_NAME (for C++ mangled
names).

The rules for composite types and compatible types in the presence of
attributes are nowhere specified.  Different attributes need to work
differently here; those such as "const" and "format" that say more about
the type need to work differently from those such as "packed" that change
the type.  Some of this is implemented (e.g. the internal use of "const"
and "volatile" on function types is special cased to be different from
that on object types), but it isn't documented.

The grammar is also full of conflicts relating to attributes.  There used
to be a comment listing the conflicts that was removed because it was so
out of date, but their presence perhaps indicates lack of careful design.
Nowhere does the manual describe the conflicts and what the parsing rules
are supposed to be (ideally the intended rules were checked against the
parser resolution, but I strongly suspect that people have just increased
the %expect value to quiet Bison rather than checking carefully in each
case).

I strongly agree that GCC extensions should have formal specifications (in
the form of changes to the C99 or C++ grammar, as appropriate, plus
specifications of constraints, semantics, and undefined behaviour).

-- 
Joseph S. Myers
jsm28@cam.ac.uk

^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: forcing tail/sibling call optimization
@ 2000-11-26 15:46 Robert Dewar
  2000-11-26 16:21 ` Joseph S. Myers
                   ` (2 more replies)
  0 siblings, 3 replies; 64+ messages in thread
From: Robert Dewar @ 2000-11-26 15:46 UTC (permalink / raw)
  To: dewar, fjh; +Cc: bernds, gcc

Note that if you want to make a warning message a formal requirement,
then you must provide a formal definition of what a warning message is,
and I do not think you want to get into that sort of thing :-)

I think there is real merit in providing fairly formal specifications
for GNU C enhancements for three reasons.

1. The attempt to produce a FFS can often reveal difficulties that are
not apparent in a less formal approach.

2. If other C compilers want to copy the feature, which I assume we
welcome, then there is a clear guide.

3. If the ISO committee wants to copy the feature, which I assume
we welcome, then there is a clear guide.

^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: forcing tail/sibling call optimization
@ 2000-11-26 15:27 Robert Dewar
  2000-11-26 17:56 ` Fergus Henderson
  0 siblings, 1 reply; 64+ messages in thread
From: Robert Dewar @ 2000-11-26 15:27 UTC (permalink / raw)
  To: dewar, fjh; +Cc: bernds, gcc

<<In ANSI/ISO C: no.  But the ANSI/ISO C standard is not a good model
for how to write specifications.
>>

I disagree, it is an EXCELLENT model for how to write *specifications*.


<<In GNU C: yes.  For example, many parts of the GNU C manual say
that if a certain option is enabled, the compiler will issue
certain warnings.  If the GNU C manual is treated as a specification,
these parts of the manual are definitely requirements on the compiler.
>>

The GNU C manual is nowhere near a formal specification, it is fine
for a compiler manual to talk about warnings, but it is not possible
to "treat" the GNU C manual as a specification, it is just not 
precise enough.

So if you want to state what you are proposing as a formal specification,
the "requirement" for a warning should be implementation advice. After all
we expect that GNU C will follow implementation advice for features that
we design :-)

^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: forcing tail/sibling call optimization
  2000-11-26  8:21 Robert Dewar
@ 2000-11-26 13:51 ` Fergus Henderson
  0 siblings, 0 replies; 64+ messages in thread
From: Fergus Henderson @ 2000-11-26 13:51 UTC (permalink / raw)
  To: Robert Dewar; +Cc: bernds, gcc

On 26-Nov-2000, Robert Dewar <dewar@gnat.com> wrote:
> >       Implementations shall issue a warning if there are any tail call
> >       statements for which the implementation is unable to follow
> >       the implementation advice.
> 
> Are there precedents in either ANSI C or GNU C for requiring warnings.

In ANSI/ISO C: no.  But the ANSI/ISO C standard is not a good model
for how to write specifications.

In GNU C: yes.  For example, many parts of the GNU C manual say
that if a certain option is enabled, the compiler will issue
certain warnings.  If the GNU C manual is treated as a specification,
these parts of the manual are definitely requirements on the compiler.

-- 
Fergus Henderson <fjh@cs.mu.oz.au>  |  "I have always known that the pursuit
                                    |  of excellence is a lethal habit"
WWW: < http://www.cs.mu.oz.au/~fjh >  |     -- the last words of T. S. Garp.

^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: forcing tail/sibling call optimization
  2000-11-26 11:34   ` Per Bothner
@ 2000-11-26 11:55     ` Mark Probst
  2000-11-26 17:40     ` Fergus Henderson
  1 sibling, 0 replies; 64+ messages in thread
From: Mark Probst @ 2000-11-26 11:55 UTC (permalink / raw)
  To: gcc, Per Bothner, Fergus Henderson

On Sun, Nov 26, 2000 at 11:31:25AM -0800, Per Bothner wrote:
> I'm concerned that on many machines the default calling convention
> may make tail-call optimization difficult.  To make tail-call
> optimization practical (or at least efficient) you may want to use
> a non-default calling convention.
> 
> Specifically, the traditional C calling convention has the caller
> pop the arguments from the stack after the function returns.  This
> is needed to handle varags.  Safe and efficient tail-call
> elimination assumes the callee pops the arguments.

correct. in order to handle tail-call varargs as well, you must
additionally pass an implicit parameter containing the number of
vararg bytes or a pointer to the position in the stack where the last
argument is stored.

as i have already pointed out, i have implemented such a calling
convention in gcc. the implementation is still very crude, some small
details are not yet done, and i haven't done any larger tests, but it
works on alpha and i386 for simple cases.

> This means that syntax associated with the call site is not enough.
> You also (or instead) need some annotation when the called function
> is compiled.  Because of the need to handle unknown functions,
> that annotation needs to be an attribute of the function *type*.

also correct. i have used an __attribute__((tailcall)). both the
caller and the callee must have this attribute (if the callee is a
function pointer, its type must be attributed).

i would rather solve the last few remaining issues clean the
implementation up before i give it away but if anybody is really
interested, i'd be glad to mail the patch.

bye
schani

-- 
Mark Probst
Student, Programmer, Juggler
http://www.complang.tuwien.ac.at/~schani/

^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: forcing tail/sibling call optimization
  2000-11-26  7:44 ` Fergus Henderson
  2000-11-26  8:18   ` Bernd Schmidt
  2000-11-26  9:55   ` Andi Kleen
@ 2000-11-26 11:34   ` Per Bothner
  2000-11-26 11:55     ` Mark Probst
  2000-11-26 17:40     ` Fergus Henderson
  2 siblings, 2 replies; 64+ messages in thread
From: Per Bothner @ 2000-11-26 11:34 UTC (permalink / raw)
  To: Fergus Henderson; +Cc: gcc

I'm concerned that on many machines the default calling convention
may make tail-call optimization difficult.  To make tail-call
optimization practical (or at least efficient) you may want to use
a non-default calling convention.

Specifically, the traditional C calling convention has the caller
pop the arguments from the stack after the function returns.  This
is needed to handle varags.  Safe and efficient tail-call
elimination assumes the callee pops the arguments.

This means that syntax associated with the call site is not enough.
You also (or instead) need some annotation when the called function
is compiled.  Because of the need to handle unknown functions,
that annotation needs to be an attribute of the function *type*.

The special case where tail-calling a function that as the same
parameter types as the callee can however be handled relatively
easily using standard calling conventions.

There was some discussion on this topic not too long ago, I believe
on this list and this year.  People pointed out some of the problems.
-- 
	--Per Bothner
per@bothner.com   http://www.bothner.com/~per/

^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: forcing tail/sibling call optimization
@ 2000-11-26 10:59 Timothy J. Wood
  0 siblings, 0 replies; 64+ messages in thread
From: Timothy J. Wood @ 2000-11-26 10:59 UTC (permalink / raw)
  To: Robert Dewar; +Cc: fjh, gcc

On Sunday, November 26, 2000, at 08:14 AM, Robert Dewar wrote:
>The formal definition seems quite reasonable here, but I must say
>I dislike the syntax, and it is unnecessarily provocative, since
>this really is not semantically a goto at all (e.g. the formal
>denmotational semantics would have nothing to do with the semantics
>of goto). Why not something like "tail return" or perhaps better 
>simply a compiler directive.

  This would introduce a new reserved word, wouldn't it?  This would break existing source code that used 'tail' as a variable, class, etc.


  It seems like rather than telling the compiler to just blindly do a tail return even though a function takes a pointer that it could conceivably store, a 'better' solution would be to add a attribute that could be applied to arguments to specify that the function doesn't hold onto that pointer.  The compiler would then have enough information to decide when it could do a tail call and the compiler could potentially _enforce_ the restriction that the argument cannot be stored anywhere but in a location that is provably part of a stack local.

  The only thing this wouldn't do is to give a warning/error when the developer knows that a tail call must occur to prevent bugs (like stack overflow), but it seems like adding an attribute to the arguments would prevent errors due to someone providing a function that isn't a valid tail call target.

-tim

^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: forcing tail/sibling call optimization
  2000-11-26  7:44 ` Fergus Henderson
  2000-11-26  8:18   ` Bernd Schmidt
@ 2000-11-26  9:55   ` Andi Kleen
  2000-11-26 11:34   ` Per Bothner
  2 siblings, 0 replies; 64+ messages in thread
From: Andi Kleen @ 2000-11-26  9:55 UTC (permalink / raw)
  To: Fergus Henderson; +Cc: gcc

Fergus Henderson <fjh@cs.mu.oz.au> writes:
> 
> *******************************************************************************
> MATERIAL FOR LANGUAGE STANDARD
> *******************************************************************************
> 
> Syntax:
> 	Add a new statement production:
> 
> 		/* tail call statement */
> 		statement --> GOTO primary "(" exprlist ")" ";"

I would prefer not to use goto, but a new keyword here. Overloading makes for worse
syntax error recovery/reporting for the normal case when your extension is not
used. gcc's syntax error messages are already hard to grok for newbies, no need
to make it even harder. For example if I would typo goto label(); for some reason
in an ANSI-C program and that extension was enabled I would probably get a weird
error message that would be hard to make sense of it that new extension wasn't 
know. With a new __builtin_ prefixed keyword it would be less likely to get into
such an error situation. If you don't like the __ you can always define it to 
something else.


Except for that your proposal looks useful to me.


-Andi

^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: forcing tail/sibling call optimization
@ 2000-11-26  9:12 Geert Bosch
  0 siblings, 0 replies; 64+ messages in thread
From: Geert Bosch @ 2000-11-26  9:12 UTC (permalink / raw)
  To: Fergus Henderson, gcc

On Sun, 26 Nov 2000 22:55:56 +1100, Fergus Henderson wrote:

  the compiler won't be able to perform sibling call optimization, since
  it can't be sure that bar() doesn't save the addresses of x and y in
  some storage that is accessible to baz().  I would like some way to
  tell the compiler to go ahead and perform the sibling call
  optimization anyway.

Actually, a similar situation exist with "const" calls.
It is possible to mark a function with __attribute__ ((const)),
but not do this for individual calls. 

When passing arguments by reference, it is not possible to
mark a function const, as the compiler will assume the function
will return the same result if the address of the argument has
not changed, even when the value of the argument has.

The example below shows an instance of this problem. I wonder
what would be involved in allowing front-ends to specify what
calls should be considered const. In GNAT for example, this information
is readily available but it cannot be communicated to the backend
easily.

  -Geert

#include <stdio.h>
int calls = 0; /* Used for counting the actual nr of invocations of len */

__attribute__ ((const)) size_t len (const char *string)
   /* Calls to len should be considered considered to be const iff
    * the actual passed to string is const. So, in fact, it should
    * be possible to define the const attribute on a per call basis */
{  calls++; 
   return strlen (string);
}

int words (char *sentence)
{  size_t i;
   int spaces = 0;
   /* sentence is not constant in this context, so calls to len
    * should not be considered to be __attribute__ ((const)) calls */
   for (i = 0; i < len (sentence); i++)
   { if (sentence[i] == '.') sentence[i + 1] = 0;
     if (sentence[i] == ' ') spaces++;
   }
   return len (sentence) > 0 ? spaces + 1 : 0;
} 

int cwords (const char *sentence)
{  size_t i;
   int spaces = 0;
   /* sentence is constant in this context, so len (sentence) is in
    * fact constant */
   for (i = 0; i < len (sentence); i++)
     if (sentence [i] == ' ') spaces++;
   return len (sentence) > 0 ? spaces + 1 : 0;
}

int main ()
{  char s[] = "This is a sentence. This one is not counted.";

   calls = 0;
   printf ("\"%s\" has %i words\n", &s, cwords (s));
   printf ("%i calls to strlen\n", calls);

   calls = 0;
   /* This counts 8 words with -O, while it should count 4. */
   printf ("\"%s\" has %i words\n", &s, words (s) );
   printf ("%i calls to strlen\n", calls);

   return 0;
}



^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: forcing tail/sibling call optimization
@ 2000-11-26  8:21 Robert Dewar
  2000-11-26 13:51 ` Fergus Henderson
  0 siblings, 1 reply; 64+ messages in thread
From: Robert Dewar @ 2000-11-26  8:21 UTC (permalink / raw)
  To: bernds, fjh; +Cc: dewar, gcc

>       Implementations shall issue a warning if there are any tail call
>       statements for which the implementation is unable to follow
>       the implementation advice.

Are there precedents in either ANSI C or GNU C for requiring warnings.
I find the requirement dubious, it would require a formal definition
of what a warningf is. This sounds like it should be implementation
advice, not a normative requirement.

^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: forcing tail/sibling call optimization
  2000-11-26  7:44 ` Fergus Henderson
@ 2000-11-26  8:18   ` Bernd Schmidt
  2000-11-26  9:55   ` Andi Kleen
  2000-11-26 11:34   ` Per Bothner
  2 siblings, 0 replies; 64+ messages in thread
From: Bernd Schmidt @ 2000-11-26  8:18 UTC (permalink / raw)
  To: Fergus Henderson; +Cc: Robert Dewar, gcc

On Mon, 27 Nov 2000, Fergus Henderson wrote:

> Implementation Advice:
> 	Implementations should implement tail call statements in such
> 	a way that all storage for the invocation of the calling
> 	function is reclaimed by the time execution transfers to the
> 	callee function.
> 
> Implementation Requirements:
> 	Implementations shall issue a warning if there are any tail call
> 	statements for which the implementation is unable to follow
> 	the implementation advice.

Whether all storage can be reclaimed is potentially machine dependent.
This means that the warning will not be very useful; relying on it will
lead to nonportable code.


Bernd

^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: forcing tail/sibling call optimization
  2000-11-26  5:00 Robert Dewar
@ 2000-11-26  7:44 ` Fergus Henderson
  2000-11-26  8:18   ` Bernd Schmidt
                     ` (2 more replies)
  0 siblings, 3 replies; 64+ messages in thread
From: Fergus Henderson @ 2000-11-26  7:44 UTC (permalink / raw)
  To: Robert Dewar; +Cc: gcc

On 26-Nov-2000, Robert Dewar <dewar@gnat.com> wrote:
> How about two steps here. First the clear specification, then the decision
> as to whether this extension makes sense (I have trouble seeing what this
> clear specification would look like).

OK.  Here's a specification.  I've followed it with some less formal
explanatory material of the kind that might be suitable for the gcc
manual.

*******************************************************************************
MATERIAL FOR LANGUAGE STANDARD
*******************************************************************************

Syntax:
	Add a new statement production:

		/* tail call statement */
		statement --> GOTO primary "(" exprlist ")" ";"

Static semantics:
	The <primary> expression denotes the address of the callee function.
	It shall have a type which is convertable to function pointer type.
	The <exprlist> denotes the arguments which will be passed to
	the callee function.  They shall have types which are convertible
	to the callee's parameter types.  The callee's return type
	shall exactly match the return type of the containing function.
	[NOTE: Alternatively, that restriction could be relaxed slightly,
	e.g. to say that they shall match modulo `const', `volatile',
	and `restrict' qualifiers.]

	The compiler shall report an error if any of these requirements
	are violated.

Dynamic semantics:
	The dynamic semantics of a tail call statement statement are exactly
	the same as the semantics of the corresponding `return'
	statement, i.e. `return <primary> ( <exprlist> ) ;',
	except that the lifetime of the parameters and local variables
	of the calling function ends sooner.
	In particular, it ends after evaluation of the callee address
	(the <primary> expression) and the <exprlist> arguments,
	before execution transfers to the callee.

	Evaluation of a tail call statement thus proceeds in the following
	steps:
	1. The <primary> expression and the arguments in the <exprlist> are
	   evaluated.  (The order of evaluation is unspecified.)
	2. After they have been evaluated, the lifetime of the calling
	   function's parameters and local variables ends.
	3. Execution transfers to the callee function, whose address was
	   computed by evaluating the <primary> expression in step 1.
	   The parameters of that function are initialized with the
	   values for the arguments computed in step 1.
	4. When the callee function returns, execution will return to
	   the caller's caller.  The value returned from the callee
	   function (if any) will be returned to the caller's caller.

	If the program references the parameters or the local variables
	of the calling function after their lifetime has ended,
	the behaviour is undefined.

Implementation Advice:
	Implementations should implement tail call statements in such
	a way that all storage for the invocation of the calling
	function is reclaimed by the time execution transfers to the
	callee function.

Implementation Requirements:
	Implementations shall issue a warning if there are any tail call
	statements for which the implementation is unable to follow
	the implementation advice.

*******************************************************************************
MATERIAL FOR USER MANUAL
*******************************************************************************

Tail Calls
----------

When the very last thing that a function does is calling another
function (or calling itself recursively), the compiler can sometimes
perform "tail call optimization", whereby it deallocates the storage
used for the caller function's parameters and local variables
*before* calling the callee function.

However, this can only be done if the compiler can prove that
there will be no references to the caller's parameters and
local variables while executing the callee function.
If for instance you take the address of a local variable or
parameter and pass it to a (non-inline) extern function
before the last call, e.g.

	typedef struct { ... } S;
	extern void bar(S *, S *);
	extern void baz(S *);

	void foo(S x) {
		S y;
		bar(&x, &y);
		baz(y);
	}

then the compiler won't be able to perform tail call optimization
on the last call, since it can't be sure that the callee won't
access the caller's local variables or parameters.

GNU C provides an extension called "tail call statements" which lets
you help the compiler do tail call optimization in such circumstances.
A tail call statement has the syntax

	goto <expression> ;

The semantics are the same as

	return <expression> ;

except for the following differences:

	- <expression> must be a function-call expression
	- the return type of the callee must exactly match
	  the return type of the caller
	- the lifetime of the parameters and local variables
	  ends just before execution transfers to the callee
	  (but after the callee function's address and argument
	  values have evaluated)

Using a tail call statement is a strong hint to the compiler that it
should try to perform tail call optimization for this call.  Note
however that in the current implementation, there are some cases where
the compiler will still not be able to perform the optimization.  The
compiler will issue a warning if it can't do tail call optimization
for a tail call statement.

-- 
Fergus Henderson <fjh@cs.mu.oz.au>  |  "I have always known that the pursuit
                                    |  of excellence is a lethal habit"
WWW: < http://www.cs.mu.oz.au/~fjh >  |     -- the last words of T. S. Garp.

^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: forcing tail/sibling call optimization
  2000-11-26  3:56 Fergus Henderson
@ 2000-11-26  5:22 ` Mark Probst
  0 siblings, 0 replies; 64+ messages in thread
From: Mark Probst @ 2000-11-26  5:22 UTC (permalink / raw)
  To: gcc

On Sun, Nov 26, 2000 at 10:55:56PM +1100, Fergus Henderson wrote:
> The basic idea is to provide a way for the programmer to tell
> the C compiler that a particular call should be treated as
> a sibling call or tail call, even if the compiler can't prove
> that the storage for the locals is no longer live.
> 
> This feature is important for compilers of high-level functional and
> logic programming languages that target C.  For such languages,
> recursion is the primary form of iteration, and it is important that
> sibling calls be optimized.  Often the high-level language compiler
> has more information about when it is safe to do tail calls than gcc
> has, but currently there is no way for the high-level language
> compiler to communicate that to gcc.
> 
> The language C-- (see <www.cminusminus.org>), which is designed to be
> ideal as a target language, provides this feature.  I would like
> GNU C to provide it too.

i am already working on this. in fact, it already works for small test
programs on alpha and i386. there are still some small issues to
resolve and the implementation probably isn't very clean, but it's
clear that it can be done. i'd be happy to provide you with additional
information if you like.

bye
schani

-- 
Mark Probst
Student, Programmer, Juggler
http://www.complang.tuwien.ac.at/~schani/

^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: forcing tail/sibling call optimization
@ 2000-11-26  5:00 Robert Dewar
  2000-11-26  7:44 ` Fergus Henderson
  0 siblings, 1 reply; 64+ messages in thread
From: Robert Dewar @ 2000-11-26  5:00 UTC (permalink / raw)
  To: fjh, gcc

<<So, any comments?  If I were to provide a clean patch that implements
this feature, together with documentation giving a clear specification,
would you be happy to have this change included in future versions of
GNU C?
>>

How about two steps here. First the clear specification, then the decision
as to whether this extension makes sense (I have trouble seeing what this
clear specification would look like). I think it is very important that
extensions to GNU be very clearly defined.

^ permalink raw reply	[flat|nested] 64+ messages in thread

* forcing tail/sibling call optimization
@ 2000-11-26  3:56 Fergus Henderson
  2000-11-26  5:22 ` Mark Probst
  0 siblings, 1 reply; 64+ messages in thread
From: Fergus Henderson @ 2000-11-26  3:56 UTC (permalink / raw)
  To: gcc

Hi,

I would like to propose an extension to GNU C.  I'm willing to
implement it, but first I'd like to test the waters here to see
the reaction.

The basic idea is to provide a way for the programmer to tell
the C compiler that a particular call should be treated as
a sibling call or tail call, even if the compiler can't prove
that the storage for the locals is no longer live.

For example, for code such as

	extern void bar(int *, int *);
	extern void baz(int);

	void foo(int x) {
		int y;
		bar(&x, &y);
		baz(y);
	}

the compiler won't be able to perform sibling call optimization, since
it can't be sure that bar() doesn't save the addresses of x and y in
some storage that is accessible to baz().  I would like some way to
tell the compiler to go ahead and perform the sibling call
optimization anyway.

This feature is important for compilers of high-level functional and
logic programming languages that target C.  For such languages,
recursion is the primary form of iteration, and it is important that
sibling calls be optimized.  Often the high-level language compiler
has more information about when it is safe to do tail calls than gcc
has, but currently there is no way for the high-level language
compiler to communicate that to gcc.

The language C-- (see <www.cminusminus.org>), which is designed to be
ideal as a target language, provides this feature.  I would like
GNU C to provide it too.

So, any comments?  If I were to provide a clean patch that implements
this feature, together with documentation giving a clear specification,
would you be happy to have this change included in future versions of
GNU C?

I'm not sure what the best syntax for the extension would be.
One possibility is to insert a new keyword `__tailcall' before
such calls:

	void foo(int x) {
		int y;
		bar(&x, &y);
		__tailcall baz(y);
	}

Another alternative is to reuse the `goto' keyword

	void foo(int x) {
		int y;
		bar(&x, &y);
		goto baz(y);
	}

-- 
Fergus Henderson <fjh@cs.mu.oz.au>  |  "I have always known that the pursuit
                                    |  of excellence is a lethal habit"
WWW: < http://www.cs.mu.oz.au/~fjh >  |     -- the last words of T. S. Garp.

^ permalink raw reply	[flat|nested] 64+ messages in thread

end of thread, other threads:[~2002-09-16 16:19 UTC | newest]

Thread overview: 64+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2000-11-26  8:14 forcing tail/sibling call optimization Robert Dewar
2000-11-26 13:43 ` Fergus Henderson
2000-11-27  7:58 ` Jeffrey A Law
2000-11-27  8:05   ` David Edelsohn
2000-11-27  8:07   ` Andi Kleen
2000-11-27  8:25     ` Jeffrey A Law
2000-11-27  8:39       ` Andi Kleen
2000-11-27  9:48         ` Jeffrey A Law
2000-11-27 11:21           ` Lars Brinkhoff
2000-11-27 10:54         ` Mark Mitchell
2000-11-27  8:38     ` Bernd Schmidt
2000-11-27 11:26       ` Eric W. Biederman
2000-11-27 10:48     ` Mark Mitchell
2000-11-27 12:46       ` Harvey J. Stein
2000-11-27 13:02         ` Travis Moulton
2000-11-27 10:47   ` Mark Mitchell
2000-11-28 19:21     ` Fergus Henderson
2000-11-29  2:09       ` Mark Mitchell
2000-11-30 23:59         ` Fergus Henderson
2000-12-01 15:51           ` Joe Buck
2001-01-03 12:24             ` Fergus Henderson
2001-01-03 13:09               ` Richard Henderson
2001-01-03 14:59                 ` Fergus Henderson
2001-01-03 15:32                   ` Richard Henderson
2001-01-03 15:53                     ` Fergus Henderson
2001-01-03 16:11                       ` Richard Henderson
2001-01-03 16:36                         ` Fergus Henderson
2002-09-14 23:35                   ` Fergus Henderson
2002-09-16  9:26                     ` Richard Henderson
2000-11-27 23:39   ` Fergus Henderson
  -- strict thread matches above, loose matches on Subject: below --
2000-11-29  4:58 Robert Dewar
2000-11-27 15:33 Mike Stump
2000-11-27 10:04 Robert Dewar
2000-11-27  9:39 Geert Bosch
2000-11-27 12:06 ` Jeffrey A Law
2000-11-27  9:08 Robert Dewar
2000-11-27  9:14 ` Bernd Schmidt
2000-11-27 10:09   ` Michael Matz
2000-11-27  8:44 Robert Dewar
2000-11-27  9:44 ` Jeffrey A Law
2000-11-27 10:22   ` Mark Probst
2000-11-27 14:42     ` Harvey J. Stein
2000-11-27 16:07       ` Mark Probst
2000-11-27 14:30   ` Harvey J. Stein
2000-11-26 18:09 Robert Dewar
2000-11-26 15:46 Robert Dewar
2000-11-26 16:21 ` Joseph S. Myers
2000-11-26 18:08 ` Fergus Henderson
2000-11-26 21:50 ` Jeffrey A Law
2000-11-26 15:27 Robert Dewar
2000-11-26 17:56 ` Fergus Henderson
2000-11-26 10:59 Timothy J. Wood
2000-11-26  9:12 Geert Bosch
2000-11-26  8:21 Robert Dewar
2000-11-26 13:51 ` Fergus Henderson
2000-11-26  5:00 Robert Dewar
2000-11-26  7:44 ` Fergus Henderson
2000-11-26  8:18   ` Bernd Schmidt
2000-11-26  9:55   ` Andi Kleen
2000-11-26 11:34   ` Per Bothner
2000-11-26 11:55     ` Mark Probst
2000-11-26 17:40     ` Fergus Henderson
2000-11-26  3:56 Fergus Henderson
2000-11-26  5:22 ` Mark Probst

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).