public inbox for gcc-bugs@sourceware.org
* [Bug fortran/31016] New: Use __buildin_memcpy and __memcpy for array assignment
@ 2007-03-01 18:55 burnus at gcc dot gnu dot org
2007-03-01 19:16 ` [Bug fortran/31016] " burnus at gcc dot gnu dot org
` (4 more replies)
0 siblings, 5 replies; 6+ messages in thread
From: burnus at gcc dot gnu dot org @ 2007-03-01 18:55 UTC (permalink / raw)
To: gcc-bugs
For the most common array assignments where the size is known at compile time,
we already use __builtin_memcpy, but the following cases were missed:
subroutine bar(a)
implicit none
real :: a(*),b(12)
b = a(1:12)
end subroutine
subroutine bar(a,b)
implicit none
real :: a(*),b(*)
a(1:12) = b(2:13)
end subroutine
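As a rough C sketch of what the second case could be lowered to: the function name copy_section is made up for illustration, and it assumes that a and b do not overlap, which Fortran lets the compiler assume for distinct dummy arguments. It is not what gfortran actually emits.

```c
#include <string.h>

/* Hypothetical C analogue of "a(1:12) = b(2:13)".
   The element count (12) is a compile-time constant.
   Assumption: a and b do not overlap. */
void copy_section(float *a, const float *b)
{
    /* Fortran a(1) is C a[0]; Fortran b(2) is C b[1]. */
    memcpy(a, b + 1, 12 * sizeof(float));
}
```

Because the byte count is constant, a compiler is free to expand such a call inline rather than calling into libc.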
And __builtin_memset can be used for:
subroutine bar(a)
implicit none
real :: a(*),b(12)
a(1:12) = 12
end subroutine
For the following example, the __builtin_* functions cannot be used as the
size is not known at compile time, but the memory should be contiguous and thus
__memcpy can be used:
subroutine bar(a,n)
implicit none
integer :: n
real :: a(n), b(n)
a = b
end subroutine
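To illustrate the idea in C: for two contiguous, non-overlapping arrays of n elements, the element-wise loop and a single copy with a run-time byte count give the same result. The function names below are hypothetical, and the sketch uses the standard memcpy rather than the __memcpy mentioned above.

```c
#include <stddef.h>
#include <string.h>

/* Element-wise version of "a = b" for real :: a(n), b(n). */
void assign_loop(float *a, const float *b, size_t n)
{
    for (size_t i = 0; i < n; i++)
        a[i] = b[i];
}

/* Same assignment as one block copy; the size is only known at run time.
   Assumption: a and b are contiguous and do not overlap. */
void assign_memcpy(float *a, const float *b, size_t n)
{
    memcpy(a, b, n * sizeof(float));
}
```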
For the following case, one could use memset, but I'm not sure whether it will
on average be faster than a normal do loop. (Overhead of a function call
versus the optimization of memset using e.g. copy-on-write.)
subroutine bar(a,n)
implicit none
integer :: n
real :: a(n)
a = 5
end subroutine
--
Summary: Use __buildin_memcpy and __memcpy for array assignment
Product: gcc
Version: 4.3.0
Status: UNCONFIRMED
Keywords: missed-optimization
Severity: enhancement
Priority: P3
Component: fortran
AssignedTo: unassigned at gcc dot gnu dot org
ReportedBy: burnus at gcc dot gnu dot org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=31016
* [Bug fortran/31016] Use __buildin_memcpy and __memcpy for array assignment
From: burnus at gcc dot gnu dot org @ 2007-03-01 19:16 UTC (permalink / raw)
To: gcc-bugs
------- Comment #1 from burnus at gcc dot gnu dot org 2007-03-01 19:15 -------
subroutine bar(a,b,n)
implicit none
integer :: n
real :: a(n,n), b(n,n)
a = b
end subroutine
For that example, the overhead is even more obvious. One needs to run
only:
for (int i = 0; i < n*n; i++)
a[i] = b[i];
However, gfortran generates two loops and a whole stack of temporary variables.
Analogously for
subroutine bar(a,n)
implicit none
integer :: n
real :: a(n,n)
a = 12
end subroutine
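A small C sketch of the point being made, with hypothetical function names: when the whole n-by-n array is contiguous, the two nested loops and the single flattened loop touch exactly the same elements. (C is row-major and Fortran column-major, but for a whole-array copy or fill that makes no difference.)

```c
#include <stddef.h>

/* Two nested loops over a contiguous n-by-n array. */
void copy_nested(float *a, const float *b, size_t n)
{
    for (size_t j = 0; j < n; j++)
        for (size_t i = 0; i < n; i++)
            a[j * n + i] = b[j * n + i];
}

/* The equivalent single loop over all n*n elements. */
void copy_flat(float *a, const float *b, size_t n)
{
    for (size_t k = 0; k < n * n; k++)
        a[k] = b[k];
}

/* Flattened version of the scalar assignment "a = 12". */
void fill_flat(float *a, size_t n, float value)
{
    for (size_t k = 0; k < n * n; k++)
        a[k] = value;
}
```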
* [Bug fortran/31016] Use __buildin_memcpy and __memcpy for array assignment
From: burnus at gcc dot gnu dot org @ 2007-03-01 19:27 UTC (permalink / raw)
To: gcc-bugs
------- Comment #2 from burnus at gcc dot gnu dot org 2007-03-01 19:26 -------
And another example for compile-time known sizes:
subroutine bar(a,n)
implicit none
integer :: n
real :: a(n),b(12)
a(1:12) = b
a(2:n) = b
! Here, n is unknown, but the assignment is only valid if b and a(2:n) have the same shape (i.e. n = 13)
end subroutine
* [Bug fortran/31016] Use __buildin_memcpy and __memcpy for array assignment
From: tkoenig at gcc dot gnu dot org @ 2007-03-01 19:35 UTC (permalink / raw)
To: gcc-bugs
------- Comment #3 from tkoenig at gcc dot gnu dot org 2007-03-01 19:34 -------
Confirmed.
--
tkoenig at gcc dot gnu dot org changed:
What |Removed |Added
----------------------------------------------------------------------------
Status|UNCONFIRMED |NEW
Ever Confirmed|0 |1
Last reconfirmed|0000-00-00 00:00:00 |2007-03-01 19:34:52
date| |
* [Bug fortran/31016] Use __buildin_memcpy and __memcpy for array assignment
From: jb at gcc dot gnu dot org @ 2007-04-29 10:46 UTC (permalink / raw)
To: gcc-bugs
------- Comment #4 from jb at gcc dot gnu dot org 2007-04-29 11:46 -------
I don't think you can use memset for populating real arrays except when the
value is 0.0; the bit patterns would be different, as memset takes an int
argument which is actually converted to unsigned char.
AFAIK the libc memset/cpy choose the algorithm depending on the size etc., so
you have to do a big block to make up for all the overhead. But what could be
done for small multidimensional arrays would be to "flatten" the nested loops
into the equivalent 1D loop? Perhaps this is something better done in the
middle end?
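The memset limitation can be seen in a few lines of C. This is only a sketch with made-up function names, and it assumes IEEE 754 single-precision floats, where the all-zero bit pattern is 0.0.

```c
#include <stddef.h>
#include <string.h>

/* memset writes the same byte value into every byte of the array. */
void fill_memset(float *a, size_t n, int byte)
{
    memset(a, byte, n * sizeof(float));
}

/* A plain loop stores the float value itself into each element. */
void fill_loop(float *a, size_t n, float value)
{
    for (size_t i = 0; i < n; i++)
        a[i] = value;
}
```

With byte 0 every element ends up as 0.0, but with byte 12 each float is built from four 0x0C bytes, which is not the value 12.0; only the loop version gives 12.0.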
--
jb at gcc dot gnu dot org changed:
What |Removed |Added
----------------------------------------------------------------------------
CC| |jb at gcc dot gnu dot org
* [Bug fortran/31016] Use __buildin_memcpy and __memcpy for array assignment
From: pinskia at gcc dot gnu dot org @ 2007-04-30 16:25 UTC (permalink / raw)
To: gcc-bugs
------- Comment #5 from pinskia at gcc dot gnu dot org 2007-04-30 17:25 -------
> AFAIK the libc memset/cpy choose the algorithm depending on the size etc., so
> you have to do a big block to make up for all the overhead. But what could be
> done for small multidimensional arrays would be to "flatten" the nested loops
> into the equivalent 1D loop? Perhaps this is something better done in the
> middle end?
Well, libc's memcpy/memset do optimize by size, but the compiler also optimizes
memcpy/memset when the size is constant, and also based on the alignment, so it
could optimize it down to two instructions instead of a couple (and on PPC,
with -maltivec, GCC can also optimize using VMX, which makes the instruction
count go down even more).
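A sketch of the kind of call meant here, using GCC/Clang builtins. The function name and the 16-byte alignment are assumptions made for illustration; what code the compiler actually generates depends on the target and flags, so no instruction count is claimed.

```c
/* Hypothetical fixed-size copy: constant byte count plus an alignment
   promise, both of which the compiler may use when expanding the copy.
   Assumption: both pointers really are 16-byte aligned and the arrays
   do not overlap; passing misaligned pointers is undefined behaviour. */
void copy12_aligned(float *dst, const float *src)
{
    float *d = __builtin_assume_aligned(dst, 16);
    const float *s = __builtin_assume_aligned(src, 16);

    __builtin_memcpy(d, s, 12 * sizeof(float));
}
```

Comparing the assembly from gcc -O2 -S with and without the alignment hint is one way to see how much of this the compiler uses on a given target.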