public inbox for gcc-bugs@sourceware.org
* [Bug fortran/59345] New: _gfortran_internal_pack on compiler generated temps
@ 2013-11-29 14:31 Joost.VandeVondele at mat dot ethz.ch
  2013-12-22 21:00 ` [Bug fortran/59345] " dominiq at lps dot ens.fr
                   ` (2 more replies)
  0 siblings, 3 replies; 4+ messages in thread
From: Joost.VandeVondele at mat dot ethz.ch @ 2013-11-29 14:31 UTC
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59345

            Bug ID: 59345
           Summary: _gfortran_internal_pack on compiler generated temps
           Product: gcc
           Version: 4.9.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: fortran
          Assignee: unassigned at gcc dot gnu.org
          Reporter: Joost.VandeVondele at mat dot ethz.ch

There is a missed optimization on compiler generated temporaries. Basically:

SUBROUTINE S1(A)
 REAL :: A(3)
 CALL S2(-A)
END SUBROUTINE

leads to an optimized tree that contains calls to
_gfortran_internal_pack
_gfortran_internal_unpack
__builtin_free

which should not be needed, since compiler-generated temporaries are known to
be contiguous (in particular in this case, where the temporary is created on
the stack).

Fixing this would help to fully resolve PR38318.



* [Bug fortran/59345] _gfortran_internal_pack on compiler generated temps
From: dominiq at lps dot ens.fr @ 2013-12-22 21:00 UTC
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59345

Dominique d'Humieres <dominiq at lps dot ens.fr> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |NEW
   Last reconfirmed|                            |2013-12-22
     Ever confirmed|0                           |1

--- Comment #1 from Dominique d'Humieres <dominiq at lps dot ens.fr> ---
Confirmed at r206155.



* [Bug fortran/59345] _gfortran_internal_pack on compiler generated temps
From: Joost.VandeVondele at mat dot ethz.ch @ 2014-12-06 10:05 UTC
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=59345

Joost VandeVondele <Joost.VandeVondele at mat dot ethz.ch> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
   Last reconfirmed|2013-12-22 00:00:00         |2014-12-6
                 CC|                            |Joost.VandeVondele at mat dot ethz
                   |                            |.ch
      Known to fail|                            |4.9.2, 5.0

--- Comment #2 from Joost VandeVondele <Joost.VandeVondele at mat dot ethz.ch> ---
Still happens with trunk.

In the microbenchmark below, packing appears to cause a more than 3-fold
overhead (1.82 s vs. 0.49 s). This is similar to what happens when an
assumed-shape dummy argument is passed on as the actual argument, although in
that case the overhead can be avoided with the CONTIGUOUS attribute. Could the
solution be as simple as somehow giving compiler-generated temporaries the
'contiguous' attribute?

> gfortran -Ofast -fno-inline t.f90
> ./a.out
 with packing:   1.8157229999999998       sec.
 without packing:  0.49092599999999997      sec. 
 assumed shape, no contiguous :   1.9047100000000006      sec. 
 assumed shape, contiguous :  0.46692899999999948      sec. 
 total calls to foo:   400000000 expected   200000000

> cat t.f90
MODULE M
 INTEGER, SAVE :: count=0
CONTAINS
 SUBROUTINE S1(A,foo)
  REAL :: A(3)
  CALL foo(-A)
 END SUBROUTINE

 SUBROUTINE S2(A,foo)
  REAL :: A(3)
  REAL :: B(3)
  B=-A 
  CALL foo(B)
 END SUBROUTINE

 SUBROUTINE S3(A,B,foo)
  REAL :: A(3)
  REAL :: B(:)
  B=-A 
  CALL foo(B)
 END SUBROUTINE

 SUBROUTINE S4(A,B,foo)
  REAL :: A(3)
  REAL, CONTIGUOUS :: B(:)
  B=-A 
  CALL foo(B)
 END SUBROUTINE

 SUBROUTINE foo(A)
  REAL :: A(3)
  count=count+1
 END SUBROUTINE
END MODULE

PROGRAM TEST
   USE M
   IMPLICIT NONE
   REAL :: A(3),B(3)
   INTEGER :: i
   REAL*8 :: t1,t2,t3,t4,t5,t6,t7,t8
   INTEGER :: N
   A=0
   N=100000000

   CALL CPU_TIME(t1)
   DO i=1,N
      CALL S1(A,foo)
   ENDDO
   CALL CPU_TIME(t2)

   CALL CPU_TIME(t3)
   DO i=1,N
      CALL S2(A,foo)
   ENDDO
   CALL CPU_TIME(t4)

   CALL CPU_TIME(t5)
   DO i=1,N
      CALL S3(A,B,foo)
   ENDDO
   CALL CPU_TIME(t6)

   CALL CPU_TIME(t7)
   DO i=1,N
      CALL S4(A,B,foo)
   ENDDO
   CALL CPU_TIME(t8)

   WRITE(6,*) "with packing:", t2-t1, " sec."
   WRITE(6,*) "without packing:", t4-t3, "sec. "
   WRITE(6,*) "assumed shape, no contiguous :", t6-t5, "sec. "
   WRITE(6,*) "assumed shape, contiguous :", t8-t7, "sec. "

   ! Each of S1..S4 calls foo N times, hence 4*N (the posted run printed 2*N).
   WRITE(6,*) "total calls to foo:", count, "expected", 4*N
END
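
For context, the pack/unpack calls implement copy-in/copy-out, which is
genuinely required when a non-contiguous section is passed to an
explicit-shape dummy. The following is a minimal sketch of that legitimate
case, relying only on standard Fortran semantics; the names copy_demo, fill
and demo are hypothetical and not part of the testcase above.

```fortran
MODULE copy_demo
 IMPLICIT NONE
CONTAINS
 SUBROUTINE fill(x)
  REAL :: x(3)          ! explicit-shape dummy: expects contiguous storage
  x = 1.0
 END SUBROUTINE
END MODULE

PROGRAM demo
 USE copy_demo
 IMPLICIT NONE
 REAL :: a(6)
 a = 0.0
 ! a(1:6:2) is a strided section, so a compiler typically passes a
 ! contiguous copy of it and writes the copy back after the call.
 CALL fill(a(1:6:2))
 ! Elements 1, 3 and 5 are now 1.0; elements 2, 4 and 6 are still 0.0.
 WRITE(6,*) a
END PROGRAM
```

In S1 above the actual argument is a fresh temporary rather than a strided
section, so neither the copy-in nor the copy-out should be necessary.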



* [Bug fortran/59345] _gfortran_internal_pack on compiler generated temps
From: Joost.VandeVondele at mat dot ethz.ch @ 2014-12-06 15:49 UTC
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=59345

--- Comment #3 from Joost VandeVondele <Joost.VandeVondele at mat dot ethz.ch> ---
I'm pasting here another testcase, since I think it is related. 

This works as it should (i.e. no pack/unpack): the actual argument is a
function result that is an allocatable array.

> cat tt.f90
SUBROUTINE S1(A)
 INTERFACE
   FUNCTION CONTIGUOUS_F1() RESULT(res)
    INTEGER, ALLOCATABLE :: res(:)
   END FUNCTION
 END INTERFACE
 CALL S2(CONTIGUOUS_F1())
END SUBROUTINE

This, however, generates a pack/unpack: the function result is an
explicit-shape array instead of an allocatable one.

> cat tt.f90
SUBROUTINE S1(A)
 INTERFACE
   FUNCTION CONTIGUOUS_F1() RESULT(res)
    INTEGER :: res(5)
   END FUNCTION
 END INTERFACE
 CALL S2(CONTIGUOUS_F1())
END SUBROUTINE

This also leads to a pack: the function again returns an allocatable array,
but is called via a procedure pointer.

> cat tt.f90
SUBROUTINE S1(A)
 INTERFACE
   FUNCTION CONTIGUOUS_F1() RESULT(res)
    INTEGER, ALLOCATABLE :: res(:)
   END FUNCTION
 END INTERFACE
 PROCEDURE(CONTIGUOUS_F1), POINTER :: A
 CALL S2(A())
END SUBROUTINE


In these cases, the issue seems to be that gfc_is_simply_contiguous returns
false, while maybe it should return true?

I think this is also why things go wrong with the testcase in comment #0:
the actual argument -A is an EXPR_OP, but its result might be simply
contiguous nevertheless.
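
To illustrate the distinction at stake, here is a minimal sketch using the
Fortran 2008 IS_CONTIGUOUS intrinsic. It assumes a compiler that implements
this intrinsic, which may not hold for the gfortran versions discussed in this
report, and the program name contig_demo is hypothetical. Note that
IS_CONTIGUOUS is a run-time query, whereas gfc_is_simply_contiguous is a
compile-time analysis inside the front end; the sketch only shows which
objects are contiguous, not how the compiler decides it.

```fortran
PROGRAM contig_demo
 IMPLICIT NONE
 REAL, TARGET :: a(6)
 REAL, POINTER :: p(:)
 a = 0.0
 ! Unit-stride section: the elements are consecutive in memory.
 p => a(1:3)
 WRITE(6,*) "a(1:3)   contiguous: ", IS_CONTIGUOUS(p)
 ! Strided section: every second element, hence not contiguous.
 p => a(1:6:2)
 WRITE(6,*) "a(1:6:2) contiguous: ", IS_CONTIGUOUS(p)
END PROGRAM
```

The first query should report true and the second false. A
compiler-generated temporary such as the result of -A belongs to the first
category, which is why no pack should be needed for it.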

