* [Bug fortran/41137] inefficient zeroing of an array
2009-08-21 6:15 [Bug fortran/41137] New: inefficient zeroing of an array jv244 at cam dot ac dot uk
@ 2009-08-21 7:02 ` jv244 at cam dot ac dot uk
2009-08-21 7:40 ` dfranke at gcc dot gnu dot org
` (12 subsequent siblings)
13 siblings, 0 replies; 21+ messages in thread
From: jv244 at cam dot ac dot uk @ 2009-08-21 7:02 UTC (permalink / raw)
To: gcc-bugs
------- Comment #1 from jv244 at cam dot ac dot uk 2009-08-21 07:02 -------
Just for reference, the difference in time between the two variants is truly
impressive: about a factor of 11 with gcc 4.4 and 8 with gcc 4.5. Given that a
code like CP2K sometimes spends about 5-10% of its time zeroing arrays, fixing
this would help significantly.
trunk:
> gfortran -O3 -march=native test.f90
> ./a.out
0.10000600
0.84405303
4.4 branch:
> gfortran -O3 -march=native test.f90
> ./a.out
0.10400600
1.1320710
test code:
SUBROUTINE S(a,n)
  INTEGER :: n
  REAL :: a(n,n,n,n)
  a(:,:,:,:)=0.0
END SUBROUTINE
SUBROUTINE S2(a)
  REAL :: a(10,10,10,10)
  a(:,:,:,:)=0.0
END SUBROUTINE
REAL :: a(10,10,10,10),t1,t2
INTEGER :: I,N
N=100000
CALL CPU_TIME(t1)
DO I=1,N
  CALL S2(a)
ENDDO
CALL CPU_TIME(t2)
write(6,*) t2-t1
CALL CPU_TIME(t1)
DO I=1,N
  CALL S(a,10)
ENDDO
CALL CPU_TIME(t2)
write(6,*) t2-t1
END
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=41137
* [Bug fortran/41137] inefficient zeroing of an array
From: dfranke at gcc dot gnu dot org @ 2009-08-21 7:40 UTC (permalink / raw)
To: gcc-bugs
------- Comment #2 from dfranke at gcc dot gnu dot org 2009-08-21 07:39 -------
I think PR31009 is similar.
--
dfranke at gcc dot gnu dot org changed:
What |Removed |Added
----------------------------------------------------------------------------
CC| |dfranke at gcc dot gnu dot
| |org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=41137
* [Bug fortran/41137] inefficient zeroing of an array
From: jv244 at cam dot ac dot uk @ 2009-08-21 8:29 UTC (permalink / raw)
To: gcc-bugs
------- Comment #3 from jv244 at cam dot ac dot uk 2009-08-21 08:29 -------
(In reply to comment #2)
> I think PR31009 is similar.
In fact, this is almost a dup of PR31016, since also here, I'm explicitly
talking about the case of known-to-be-contiguous arrays.
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=41137
* [Bug fortran/41137] inefficient zeroing of an array
From: jv244 at cam dot ac dot uk @ 2009-08-24 20:06 UTC (permalink / raw)
To: gcc-bugs
------- Comment #4 from jv244 at cam dot ac dot uk 2009-08-24 20:06 -------
I don't think this PR depends on PR40632, which just provides an F2008 mechanism
to signal that an assumed-shape array is contiguous (certainly a useful feature
in its own right). The cases discussed here are rather assumed-size and
explicit-shape arrays, which are always contiguous. As an added complication,
certain array sections of these arrays are also known to be contiguous at
compile time.
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=41137
* [Bug fortran/41137] inefficient zeroing of an array
From: tkoenig at gcc dot gnu dot org @ 2009-11-01 16:21 UTC (permalink / raw)
To: gcc-bugs
------- Comment #5 from tkoenig at gcc dot gnu dot org 2009-11-01 16:21 -------
A workaround (which should really be implemented within the compiler):
subroutine s(a,n)
  integer :: n
  real :: a(n*n*n*n)
  a = 0.0
end subroutine
This is legal Fortran, equivalent to your routine, and should be much faster.
Confirmed, BTW.
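A minimal sketch of how the workaround can be checked, assuming the rank-4
caller array from comment #1. The names flat_zero, zero_flat and check_flat are
hypothetical and only serve the illustration. Passing the rank-4 array to a
rank-1 explicit-shape dummy relies on sequence association, so the dummy sees
the same contiguous storage:

```fortran
! Sketch only: flat_zero, zero_flat and check_flat are illustrative names,
! not part of the PR.
module flat_zero
  implicit none
contains
  subroutine zero_flat(a, n)
    integer, intent(in) :: n
    real, intent(out) :: a(n*n*n*n)   ! rank-1 view of the same storage
    a = 0.0                           ! whole-array assignment
  end subroutine zero_flat
end module flat_zero

program check_flat
  use flat_zero
  implicit none
  real :: a(10,10,10,10)

  a = 1.0
  ! Sequence association: the contiguous rank-4 actual argument is
  ! associated with the rank-1 explicit-shape dummy.
  call zero_flat(a, 10)
  if (any(a /= 0.0)) error stop "array not fully zeroed"
  print *, "all elements zero"
end program check_flat
```

The check only confirms that both forms touch the same elements; it says
nothing about timing, which has to be measured as in comment #1.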
--
tkoenig at gcc dot gnu dot org changed:
What |Removed |Added
----------------------------------------------------------------------------
Status|UNCONFIRMED |NEW
Ever Confirmed|0 |1
Last reconfirmed|0000-00-00 00:00:00 |2009-11-01 16:21:21
date| |
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=41137
* [Bug fortran/41137] inefficient zeroing of an array
From: tkoenig at gcc dot gnu dot org @ 2009-11-01 17:36 UTC (permalink / raw)
To: gcc-bugs
--
tkoenig at gcc dot gnu dot org changed:
What |Removed |Added
----------------------------------------------------------------------------
Severity|normal |enhancement
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=41137
* [Bug fortran/41137] inefficient zeroing of an array
From: dfranke at gcc dot gnu dot org @ 2010-05-07 21:02 UTC (permalink / raw)
To: gcc-bugs
------- Comment #6 from dfranke at gcc dot gnu dot org 2010-05-07 21:01 -------
See also PR40598.
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=41137
* [Bug fortran/41137] inefficient zeroing of an array
From: burnus at gcc dot gnu dot org @ 2010-06-21 15:02 UTC (permalink / raw)
To: gcc-bugs
------- Comment #7 from burnus at gcc dot gnu dot org 2010-06-21 15:02 -------
(In reply to comment #1)
> Just for reference, the difference in time between the two variants is truly
> impressive. About a factor of 11 with gcc 4.4 and 8 with gcc 4.5.
I get for the example the following values, note especially the newly added
CONTIGUOUS result:
0.31601900 - assumed-shape
0.21601403 - assumed-shape CONTIGUOUS
0.21601295 - explicit size (n,n,...)
0.20801300 - explicit size (10,10,...)
0.21601403 - explicit size (10*10*...)
Ignoring some measuring noise, assumed-shape is 46% (-O0) to 25% (-O3) slower
than explicit size, but with the CONTIGUOUS attribute the performance is
regained. I cannot reproduce the factor of 10 results, however. What surprises
me a bit is that -flto -fwhole-program does not reduce the speed penalty of
assumed-shape arrays.
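For readers without access to the attachment, the two assumed-shape variants
compared above could look roughly like the following. This is a sketch under
assumptions, not the actual test case: the module and procedure names
(zero_variants, zero_shape, zero_contig) are hypothetical, and the program only
checks correctness, not timing.

```fortran
! Sketch only: names are illustrative and not taken from attachment 20966.
module zero_variants
  implicit none
contains
  ! Assumed-shape dummy: the compiler has to allow for arbitrary strides.
  subroutine zero_shape(a)
    real, intent(out) :: a(:,:,:,:)
    a(:,:,:,:) = 0.0
  end subroutine zero_shape

  ! Same dummy with the F2008 CONTIGUOUS attribute.
  subroutine zero_contig(a)
    real, contiguous, intent(out) :: a(:,:,:,:)
    a(:,:,:,:) = 0.0
  end subroutine zero_contig
end module zero_variants

program compare_variants
  use zero_variants
  implicit none
  real :: a(10,10,10,10)

  a = 1.0
  call zero_shape(a)
  if (any(a /= 0.0)) error stop "zero_shape left nonzero elements"

  a = 1.0
  call zero_contig(a)
  if (any(a /= 0.0)) error stop "zero_contig left nonzero elements"

  print *, "both variants zeroed the array"
end program compare_variants
```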
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=41137
* [Bug fortran/41137] inefficient zeroing of an array
From: burnus at gcc dot gnu dot org @ 2010-06-21 15:22 UTC (permalink / raw)
To: gcc-bugs
------- Comment #8 from burnus at gcc dot gnu dot org 2010-06-21 15:22 -------
(In reply to comment #7)
> I get for the example the following values, note especially the newly added
> CONTIGUOUS result:
For the test case, see attachment 20966 at PR 44612; I filed that PR because
GCC does not optimize away the loops, which only set but never read the value
of the variable. (Ifort does this optimization.) Additionally, if one prints
the variable, ifort is twice as fast. Out of curiosity: using NAG, the timing
is 0.6900000 vs. 1.2200000, i.e. the assumed-shape version is actually faster
[though its overall performance is poor].
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=41137
* [Bug fortran/41137] inefficient zeroing of an array
From: jv244 at cam dot ac dot uk @ 2010-06-21 15:49 UTC (permalink / raw)
To: gcc-bugs
------- Comment #9 from jv244 at cam dot ac dot uk 2010-06-21 15:49 -------
(In reply to comment #7)
> I cannot reproduce the factor of 10 results, however.
Here this still is the case (so might depend on the precise architecture):
/data03/vondele/gcc_trunk/build/libexec/gcc/x86_64-unknown-linux-gnu/4.6.0/f951
test.f90 -march=k8-sse3 -mcx16 -msahf --param l1-cache-size=64 --param
l1-cache-line-size=64 --param l2-cache-size=1024 -mtune=k8 -quiet -dumpbase
test.f90 -auxbase test -O3 -version -fintrinsic-modules-path
/data03/vondele/gcc_trunk/build/lib/gcc/x86_64-unknown-linux-gnu/4.6.0/finclude
-o /tmp/ccXsKXnD.s
> ./a.out
0.10800600
1.0520660
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=41137
* [Bug fortran/41137] inefficient zeroing of an array
From: burnus at gcc dot gnu dot org @ 2010-06-21 17:00 UTC (permalink / raw)
To: gcc-bugs
------- Comment #10 from burnus at gcc dot gnu dot org 2010-06-21 17:00 -------
(In reply to comment #9)
> (In reply to comment #7)
> > I cannot reproduce the factor of 10 results, however.
> Here this still is the case (so might depend on the precise architecture):
OK, I was using -fwhole-file out of habit, which is why the difference was so
small (at all optimization levels, including -O0). Otherwise, I also get the
same factor-of-10 difference. If one splits it into two files, one needs to use
"-O3 -flto" to get a fast program.
For comparison, using two files, ifort also shows a factor of 2 to 5 difference
(and is at -O0 ten times slower than gfortran; at -O2 it is twice as fast as
gfortran).
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=41137
* [Bug fortran/41137] inefficient zeroing of an array
From: jakub at gcc dot gnu dot org @ 2010-06-21 17:44 UTC (permalink / raw)
To: gcc-bugs
------- Comment #11 from jakub at gcc dot gnu dot org 2010-06-21 17:43 -------
What's the reason why gfc_trans_zero_assign insists that len is INTEGER_CST?
At least if it is contiguous (and not assumed size), why can't memset be used
even for non-constant sizes?
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=41137
* [Bug fortran/41137] inefficient zeroing of an array
From: burnus at gcc dot gnu dot org @ 2010-06-22 14:42 UTC (permalink / raw)
To: gcc-bugs
------- Comment #12 from burnus at gcc dot gnu dot org 2010-06-22 14:42 -------
(In reply to comment #11)
> What's the reason why gfc_trans_zero_assign insists that len is INTEGER_CST?
> At least if it is contiguous (and not assumed size), why can't memset be used
> even for non-constant sizes?
Suggested by Jakub:
- if (!len || TREE_CODE (len) != INTEGER_CST)
+ if (!len
+ || (TREE_CODE (len) != INTEGER_CST
+ && !gfc_is_simply_contiguous (expr, false)))
Though, one needs to be careful that one zeros the right spot (maybe already
taken care of):
a(5:) = 0
Additionally, one could do the same for arrays which are contiguous but have a
descriptor - for which one has to calculate the size manually (as "len" ==
NULL). At least after memset/memcpy middle-end fixes, the change should be
profitable.
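The "right spot" concern can be stated as a small self-checking program. This
is only a sketch of the expected language-level behaviour, with a hypothetical
program name (section_zero); it does not exercise gfc_trans_zero_assign
directly. Whatever code the compiler generates for the section assignment, the
first four elements must stay untouched:

```fortran
! Sketch only: section_zero is an illustrative name, not part of the PR.
program section_zero
  implicit none
  real :: a(10)

  a = 1.0
  a(5:) = 0.0   ! must zero elements 5..10 only

  if (any(a(1:4) /= 1.0)) error stop "elements before the section were modified"
  if (any(a(5:) /= 0.0)) error stop "section was not fully zeroed"
  print *, "section zeroed at the right offset"
end program section_zero
```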
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=41137
* [Bug fortran/41137] inefficient zeroing of an array
From: jakub at gcc dot gnu dot org @ 2010-06-22 15:25 UTC (permalink / raw)
To: gcc-bugs
------- Comment #13 from jakub at gcc dot gnu dot org 2010-06-22 15:25 -------
Well, a(5:)=0.0 doesn't satisfy copyable_array_p, so gfc_trans_zero_assign
isn't called at all.
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=41137
* [Bug fortran/41137] inefficient zeroing of an array
[not found] <bug-41137-4@http.gcc.gnu.org/bugzilla/>
@ 2013-03-29 9:47 ` Joost.VandeVondele at mat dot ethz.ch
2013-03-29 22:19 ` tkoenig at gcc dot gnu.org
` (4 subsequent siblings)
5 siblings, 0 replies; 21+ messages in thread
From: Joost.VandeVondele at mat dot ethz.ch @ 2013-03-29 9:47 UTC (permalink / raw)
To: gcc-bugs
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=41137
Joost VandeVondele <Joost.VandeVondele at mat dot ethz.ch> changed:
What |Removed |Added
----------------------------------------------------------------------------
Last reconfirmed|2009-11-01 16:21:21 |2013-03-29
CC| |Joost.VandeVondele at mat
| |dot ethz.ch
Blocks| |38654
--- Comment #14 from Joost VandeVondele <Joost.VandeVondele at mat dot ethz.ch> 2013-03-29 09:46:53 UTC ---
The code in comment #0 is actually a frontend optimization, PR38654.
It is noteworthy that the optimizers (ipa-cp plus others) do the right thing
for the tester in comment #1 at -O3 (but can't do this in the general case).
* [Bug fortran/41137] inefficient zeroing of an array
From: tkoenig at gcc dot gnu.org @ 2013-03-29 22:19 UTC (permalink / raw)
To: gcc-bugs
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=41137
--- Comment #15 from Thomas Koenig <tkoenig at gcc dot gnu.org> 2013-03-29 22:19:05 UTC ---
The patch from comment #12 causes a memory failure in the
following code:
module zero
  implicit none
contains
  subroutine foo(a)
    real, contiguous :: a(:,:)
    a(:,:) = 0
  end subroutine foo
end module zero
program main
  use zero
  implicit none
  real, dimension(5,5) :: a
  a = 1.
  call foo(a(1:5:2,1:5:2))
  write (*,'(5F12.5)') a
end program main
which is a bit strange.
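For reference, the behaviour a conforming compiler should show for this test
can be written as explicit checks. The following is a sketch with hypothetical
names (zero_check, main_check); it assumes the strided actual argument is
passed to the CONTIGUOUS dummy through a contiguous temporary (copy-in/copy-out),
so only the nine elements with odd indices end up zero:

```fortran
! Sketch only: zero_check and main_check are illustrative names.
module zero_check
  implicit none
contains
  subroutine foo(a)
    real, contiguous :: a(:,:)
    a(:,:) = 0
  end subroutine foo
end module zero_check

program main_check
  use zero_check
  implicit none
  real, dimension(5,5) :: a
  integer :: i, j

  a = 1.
  call foo(a(1:5:2,1:5:2))   ! non-contiguous section, 3x3 elements

  do j = 1, 5
    do i = 1, 5
      if (mod(i,2) == 1 .and. mod(j,2) == 1) then
        ! Element belongs to the strided section and must be zero.
        if (a(i,j) /= 0.) error stop "strided element not zeroed"
      else
        ! Element lies outside the section and must keep its value.
        if (a(i,j) /= 1.) error stop "element outside the section was modified"
      end if
    end do
  end do
  print *, "section zeroed, remaining elements untouched"
end program main_check
```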
* [Bug fortran/41137] inefficient zeroing of an array
From: burnus at gcc dot gnu.org @ 2013-03-29 22:39 UTC (permalink / raw)
To: gcc-bugs
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=41137
Tobias Burnus <burnus at gcc dot gnu.org> changed:
What |Removed |Added
----------------------------------------------------------------------------
CC| |burnus at gcc dot gnu.org
--- Comment #16 from Tobias Burnus <burnus at gcc dot gnu.org> 2013-03-29 22:38:58 UTC ---
Possibly an off-topic remark, or possibly hitting the nail on the head: Looking at
a(:,:,:,:)=0.0
and
a(5:) = 0.0
I wonder whether it couldn't be handled via RANGE_REF, e.g.
RANGE_REF(a,5,...) = { };
should work if I am not mistaken. Currently, we only do "a = 0.0" -> "a = {};".
See ARRAY_RANGE_REF in trans-expr.c's class_array_data_assign
(gfc_index_zero_node is the offset) for the usage; see also GCC internal manual
and Ada.
* [Bug fortran/41137] inefficient zeroing of an array
From: dominiq at lps dot ens.fr @ 2014-05-01 12:16 UTC (permalink / raw)
To: gcc-bugs
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=41137
Dominique d'Humieres <dominiq at lps dot ens.fr> changed:
What |Removed |Added
----------------------------------------------------------------------------
Known to work| |4.6.4, 4.7.3, 4.8.2, 4.9.0
Known to fail| |4.5.4
--- Comment #17 from Dominique d'Humieres <dominiq at lps dot ens.fr> ---
With -O3, I get the same timings for the test in comment 1 since gcc 4.6.4.
Could this PR be closed as FIXED or did I miss something in the discussion?
* [Bug fortran/41137] inefficient zeroing of an array
From: Joost.VandeVondele at mat dot ethz.ch @ 2014-05-01 12:35 UTC (permalink / raw)
To: gcc-bugs
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=41137
--- Comment #18 from Joost VandeVondele <Joost.VandeVondele at mat dot ethz.ch> ---
(In reply to Dominique d'Humieres from comment #17)
> With -O3, I get the same timings for the test in comment 1 since gcc 4.6.4.
> Could this PR be closed as FIXED or did I miss something in the discussion?
However, the difference remains if the subroutines are in separate files
(comment #14); in fact, with '-O3 -fno-ipa-cp -fno-inline' the timings remain
poor:
> ./a.out
0.156975999
0.655900002
I think the issue is that the frontend could/should generate better code for
this.
* [Bug fortran/41137] inefficient zeroing of an array
From: tkoenig at gcc dot gnu.org @ 2014-05-01 17:00 UTC (permalink / raw)
To: gcc-bugs
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=41137
--- Comment #19 from Thomas Koenig <tkoenig at gcc dot gnu.org> ---
Also see PR 55858.