public inbox for gcc-help@gcc.gnu.org
 help / color / mirror / Atom feed
* Tail call optimization on function with many arguments
@ 2015-08-18 17:30 Lionel Villard
  2015-08-18 20:58 ` Jeff Law
  0 siblings, 1 reply; 2+ messages in thread
From: Lionel Villard @ 2015-08-18 17:30 UTC (permalink / raw)
  To: gcc-help

Hi,
 
consider this small C program:
 
  #include <stdlib.h>
  int tailoptimized(int c1, int c2, int c3, int c4, int c5, int c6)
 {
    if (c1 > c5)
        {
            if (c4 < 5)
            {
                return c3 + c2;
            }
        }
        else if (c5 > 45)
                return c6;
        return c3 + c4 + c5 + c6;
  }
 
  int nottailoptimized(int c1, int c2, int c3, int c4, int c5, int c6, int 
c7)
  {
      if (c2 > c5)
      {
          if (c2 < 5)
         {
              return c3 + c2;
         }
      }
      else if (c1 > 45)
            return c6;
      return c1 + c4 + c5 + c6;
 }
 
  int call(int count)
  {
      if (count > 100)
    {
        return tailoptimized(count *2, count, count * 3, count, count, 
count);
    }
    return nottailoptimized(count, count* 2, count, count, count, count, 
count * 4);
  }
 
  int main (int argc, char* argv[])
  {
      call(atoi(argv[0]));
  }
 
I used this command to compile it:
 
    gcc -O1 -foptimize-sibling-calls -S -c test.c
 
and generated assembly code for the 'call' function is:
 
call:
.LFB17:
    .cfi_startproc
    cmpl    $100, %edi
    jle    .L12
    leal    (%rdi,%rdi), %eax
    leal    (%rax,%rdi), %edx
    movl    %edi, %r9d
    movl    %edi, %r8d
    movl    %edi, %ecx
    movl    %edi, %esi
    movl    %eax, %edi
    jmp    tailoptimized
.L12:
    subq    $8, %rsp
    .cfi_def_cfa_offset 16
    leal    (%rdi,%rdi), %esi
    leal    0(,%rdi,4), %eax
    movl    %eax, (%rsp)
    movl    %edi, %r9d
    movl    %edi, %r8d
    movl    %edi, %ecx
    movl    %edi, %edx
    call    nottailoptimized
    addq    $8, %rsp
    .cfi_def_cfa_offset 8
    ret
    .cfi_endproc
 
As you can see, when a function takes more than 6 arguments 
(nottailoptimized in this case), TCO is turned off.
 
What's the reason for this limitation (number of registers?)? Is there a 
flag/option that would allow more than 6 arguments?
 
Thanks!
Lionel

^ permalink raw reply	[flat|nested] 2+ messages in thread

* Re: Tail call optimization on function with many arguments
  2015-08-18 17:30 Tail call optimization on function with many arguments Lionel Villard
@ 2015-08-18 20:58 ` Jeff Law
  0 siblings, 0 replies; 2+ messages in thread
From: Jeff Law @ 2015-08-18 20:58 UTC (permalink / raw)
  To: Lionel Villard, gcc-help

On 08/18/2015 11:30 AM, Lionel Villard wrote:
> Hi,
>
> consider this small C program:
>
>    #include <stdlib.h>
>    int tailoptimized(int c1, int c2, int c3, int c4, int c5, int c6)
>   {
>      if (c1 > c5)
>          {
>              if (c4 < 5)
>              {
>                  return c3 + c2;
>              }
>          }
>          else if (c5 > 45)
>                  return c6;
>          return c3 + c4 + c5 + c6;
>    }
>
>    int nottailoptimized(int c1, int c2, int c3, int c4, int c5, int c6, int
> c7)
>    {
>        if (c2 > c5)
>        {
>            if (c2 < 5)
>           {
>                return c3 + c2;
>           }
>        }
>        else if (c1 > 45)
>              return c6;
>        return c1 + c4 + c5 + c6;
>   }
>
>    int call(int count)
>    {
>        if (count > 100)
>      {
>          return tailoptimized(count *2, count, count * 3, count, count,
> count);
>      }
>      return nottailoptimized(count, count* 2, count, count, count, count,
> count * 4);
>    }
>
>    int main (int argc, char* argv[])
>    {
>        call(atoi(argv[0]));
>    }
>
> I used this command to compile it:
>
>      gcc -O1 -foptimize-sibling-calls -S -c test.c
>
> and generated assembly code for the 'call' function is:
>
> call:
> .LFB17:
>      .cfi_startproc
>      cmpl    $100, %edi
>      jle    .L12
>      leal    (%rdi,%rdi), %eax
>      leal    (%rax,%rdi), %edx
>      movl    %edi, %r9d
>      movl    %edi, %r8d
>      movl    %edi, %ecx
>      movl    %edi, %esi
>      movl    %eax, %edi
>      jmp    tailoptimized
> .L12:
>      subq    $8, %rsp
>      .cfi_def_cfa_offset 16
>      leal    (%rdi,%rdi), %esi
>      leal    0(,%rdi,4), %eax
>      movl    %eax, (%rsp)
>      movl    %edi, %r9d
>      movl    %edi, %r8d
>      movl    %edi, %ecx
>      movl    %edi, %edx
>      call    nottailoptimized
>      addq    $8, %rsp
>      .cfi_def_cfa_offset 8
>      ret
>      .cfi_endproc
>
> As you can see, when a function takes more than 6 arguments
> (nottailoptimized in this case), TCO is turned off.
>
> What's the reason for this limitation (number of registers?)? Is there a
> flag/option that would allow more than 6 arguments?
There's no inherent limit on the number of arguments.  It's constrained 
more by the need to avoid stack allocation in the caller.

In your example "call" must allocate a stack slot for the last arugment 
to "nottailoptimized".  The x86_64 ABI mandates that "call" would also 
be responsible for deallocating that stack slot.  That obviously can't 
happen if the call to "nottailoptimized" was turned into a tail call.

"tailoptimized" doesn't suffer from this problem because it does not 
need any of its arguments passed in the stack.


In theory, the tail call optimization could be enhanced to create a 
clone of "nottailoptimized" which would de-allocate the stack for any 
arguments and the tail call optimization could jump to that special 
copy.  Nobody's implemented that.  It's not even clear if it'd be a good 
idea or not.

Jeff

^ permalink raw reply	[flat|nested] 2+ messages in thread

end of thread, other threads:[~2015-08-18 20:58 UTC | newest]

Thread overview: 2+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2015-08-18 17:30 Tail call optimization on function with many arguments Lionel Villard
2015-08-18 20:58 ` Jeff Law

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).