public inbox for gcc-bugs@sourceware.org
help / color / mirror / Atom feed
* [Bug tree-optimization/97181] New: Inlining of leaf case in recursion
@ 2020-09-23 14:26 tkoenig at gcc dot gnu.org
  2020-09-23 14:44 ` [Bug tree-optimization/97181] " tkoenig at gcc dot gnu.org
                   ` (2 more replies)
  0 siblings, 3 replies; 4+ messages in thread
From: tkoenig at gcc dot gnu.org @ 2020-09-23 14:26 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97181

            Bug ID: 97181
           Summary: Inlining of leaf case in recursion
           Product: gcc
           Version: unknown
            Status: UNCONFIRMED
          Severity: enhancement
          Priority: P3
         Component: tree-optimization
          Assignee: unassigned at gcc dot gnu.org
          Reporter: tkoenig at gcc dot gnu.org
  Target Milestone: ---

The following two programs are functionally equivalent:

$ cat v1.f90 
program main
  implicit none
  integer, parameter :: ip = selected_int_kind(15)
  integer :: l_max
  integer(kind=ip) :: n_level
  integer :: num
  num = 0
  read (*,*) l_max
  call btsum (0_ip, 0_ip, l_max)
  print *,num
contains
  recursive subroutine btsum (n, s, level)
    integer(kind=ip), value :: n, s
    integer, value :: level
    if (level /= 0) then
       call btsum (3*n,  s  , level-1)
       call btsum (3*n+1,s+1, level-1)
       call btsum (3*n+2,s+2, level-1)
       return
    end if
    if (popcnt (n) == s) then
       call do_something (n, s)
    end if
  end subroutine btsum
  subroutine do_something (n,s)
    integer (kind=ip), value :: n, s
    num = num + 1
  end subroutine do_something
end program main

program main
  implicit none
  integer, parameter :: ip = selected_int_kind(15)
  integer :: l_max
  integer(kind=ip) :: n_level
  integer :: num
  num = 0
  read (*,*) l_max
  call btsum (0_ip,0_ip,l_max)
  print *,num
contains
  recursive subroutine btsum (n, s, level)
    integer(kind=ip), value :: n, s
    integer, value :: level
    if (level > 1) then
       call btsum (3*n,  s  , level-1)
       call btsum (3*n+1,s+1, level-1)
       call btsum (3*n+2,s+2, level-1)
       return
    end if
    if (popcnt (3*n  ) == s  ) call do_something (3*n  , s)
    if (popcnt (3*n+1) == s+1) call do_something (3*n+1, s+1)
    if (popcnt (3*n+2) == s+2) call do_something (3*n+2, s+2)
  end subroutine btsum
  subroutine do_something (n,s)
    integer (kind=ip), value :: n, s
    num = num + 1
  end subroutine do_something
end program main

The only difference is that the last level of recursion in
btsum has been made explicit.

The difference in running time is quite significant:

$ gfortran -O3 -march=native v1.f90 
$ echo 19 | ./a.out
    68611936
$ time echo 19 | ./a.out
    68611936

real    0m2,611s
user    0m2,606s
sys     0m0,005s
$ gfortran -O3 -march=native v2.f90 
$ time echo 19 | ./a.out
    68611936

real    0m1,614s
user    0m1,609s
sys     0m0,005s

Using PGO makes the difference smaller, but it is still there:

$ gfortran -fprofile-arcs -O3 -march=native v1.f90 
$ echo 19 | ./a.out
    68611936
$ gfortran -fprofile-use -O3 -march=native v1.f90 
$ time echo 19 | ./a.out
    68611936

real    0m2,011s
user    0m2,011s
sys     0m0,000s

$ gfortran -fprofile-arcs -O3 -march=native v2.f90 
$ echo 19 | ./a.out
    68611936
$ gfortran -fprofile-use -O3 -march=native v2.f90 
$ time echo 19 | ./a.out
    68611936

real    0m1,386s
user    0m1,382s
sys     0m0,004s

^ permalink raw reply	[flat|nested] 4+ messages in thread

* [Bug tree-optimization/97181] Inlining of leaf case in recursion
  2020-09-23 14:26 [Bug tree-optimization/97181] New: Inlining of leaf case in recursion tkoenig at gcc dot gnu.org
@ 2020-09-23 14:44 ` tkoenig at gcc dot gnu.org
  2020-09-24  7:14 ` [Bug ipa/97181] " rguenth at gcc dot gnu.org
  2020-09-24 10:20 ` hubicka at ucw dot cz
  2 siblings, 0 replies; 4+ messages in thread
From: tkoenig at gcc dot gnu.org @ 2020-09-23 14:44 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97181

Thomas Koenig <tkoenig at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
            Version|unknown                     |11.0

--- Comment #1 from Thomas Koenig <tkoenig at gcc dot gnu.org> ---
The second program should be v2.f90 (missed it in the cut&paste).

^ permalink raw reply	[flat|nested] 4+ messages in thread

* [Bug ipa/97181] Inlining of leaf case in recursion
  2020-09-23 14:26 [Bug tree-optimization/97181] New: Inlining of leaf case in recursion tkoenig at gcc dot gnu.org
  2020-09-23 14:44 ` [Bug tree-optimization/97181] " tkoenig at gcc dot gnu.org
@ 2020-09-24  7:14 ` rguenth at gcc dot gnu.org
  2020-09-24 10:20 ` hubicka at ucw dot cz
  2 siblings, 0 replies; 4+ messages in thread
From: rguenth at gcc dot gnu.org @ 2020-09-24  7:14 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97181

Richard Biener <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Keywords|                            |missed-optimization
          Component|tree-optimization           |ipa
                 CC|                            |hubicka at gcc dot gnu.org,
                   |                            |jamborm at gcc dot gnu.org,
                   |                            |marxin at gcc dot gnu.org

--- Comment #2 from Richard Biener <rguenth at gcc dot gnu.org> ---
The exact transform doesn't look feasible - the analysis required would be
that 'level' decreases and recursion stops at level == 0 which means we
can clone btsum for level == 0 and change

    if (level /= 0) then
       call btsum (3*n,  s  , level-1)
       call btsum (3*n+1,s+1, level-1)
       call btsum (3*n+2,s+2, level-1)
       return
    end if

to

    if (level == 1) then
       call btsum.0 (3*n, s)
       call btsum.0 (3*n+1, s+1)
       call btsum.0 (3*n_2, s+2)
       return
    else if (level /= 0) then
       call btsum (3*n,  s  , level-1)
       call btsum (3*n+1,s+1, level-1)
       call btsum (3*n+2,s+2, level-1)
       return
    end if

and then inline btsum.0.  Notice how the possibility of level < 0 is left
untouched ... [I think there are no unsigned types in fortran]

That said, I don't think IPA-CP/VRP do this kind of "evolution analysis"
on parameters [in the recursion case]?

^ permalink raw reply	[flat|nested] 4+ messages in thread

* [Bug ipa/97181] Inlining of leaf case in recursion
  2020-09-23 14:26 [Bug tree-optimization/97181] New: Inlining of leaf case in recursion tkoenig at gcc dot gnu.org
  2020-09-23 14:44 ` [Bug tree-optimization/97181] " tkoenig at gcc dot gnu.org
  2020-09-24  7:14 ` [Bug ipa/97181] " rguenth at gcc dot gnu.org
@ 2020-09-24 10:20 ` hubicka at ucw dot cz
  2 siblings, 0 replies; 4+ messages in thread
From: hubicka at ucw dot cz @ 2020-09-24 10:20 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97181

--- Comment #3 from Jan Hubicka <hubicka at ucw dot cz> ---
> and then inline btsum.0.  Notice how the possibility of level < 0 is left
> untouched ... [I think there are no unsigned types in fortran]
> 
> That said, I don't think IPA-CP/VRP do this kind of "evolution analysis"
> on parameters [in the recursion case]?

ipa-cp does this kind of analysis for the recursive clonning (for
exchange). I would hope that it could handle case like this evnetually
too.

^ permalink raw reply	[flat|nested] 4+ messages in thread

end of thread, other threads:[~2020-09-24 10:20 UTC | newest]

Thread overview: 4+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-09-23 14:26 [Bug tree-optimization/97181] New: Inlining of leaf case in recursion tkoenig at gcc dot gnu.org
2020-09-23 14:44 ` [Bug tree-optimization/97181] " tkoenig at gcc dot gnu.org
2020-09-24  7:14 ` [Bug ipa/97181] " rguenth at gcc dot gnu.org
2020-09-24 10:20 ` hubicka at ucw dot cz

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).