* [Bug optimization/14741] New: missing transformations lead to poorly optimized code
From: jv244 at cam dot ac dot uk @ 2004-03-26 12:17 UTC (permalink / raw)
To: gcc-bugs
I think this testcase is worth a PR so that it might be solved one day.
As shown and discussed in
http://gcc.gnu.org/ml/gcc/2004-03/msg01457.html
and related messages, gfortran is about 10 times slower than ifc (or xlf) on the
code below, and no compiler options seem to change this (it does not appear to
be tree-ssa specific).
INTEGER, PARAMETER :: N=1024
REAL*8 :: A(N,N), B(N,N), C(N,N)
REAL*8 :: t1,t2
A=0.1D0
B=0.1D0
C=0.0D0
CALL cpu_time(t1)
CALL mult(A,B,C,N)
CALL cpu_time(t2)
write(6,*) t2-t1,C(1,1)
END

SUBROUTINE mult(A,B,C,N)
  REAL*8 :: A(N,N), B(N,N), C(N,N)
  INTEGER :: I,J,K,N
  DO J=1,N
    DO I=1,N
      DO K=1,N
        C(I,J)=C(I,J)+A(I,K)*B(K,J)
      ENDDO
    ENDDO
  ENDDO
END
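
To make the request concrete, here is a minimal C sketch of one transformation of this kind, loop interchange. It is an illustration only, not a statement of what ifc or xlf actually do. The arrays are stored column-major to mirror Fortran, and the names IDX, mult_jik and mult_jki are made up for this sketch.

```c
#include <assert.h>
#include <stddef.h>

/* Column-major index, mirroring Fortran's A(I,J) layout (0-based here). */
#define IDX(i, j, n) ((size_t)(j) * (size_t)(n) + (size_t)(i))

/* Same loop order as the test case: J, I, K.
   The inner K loop walks a[] with stride n. */
void mult_jik(const double *a, const double *b, double *c, int n)
{
    assert(n > 0);
    for (int j = 0; j < n; j++)
        for (int i = 0; i < n; i++)
            for (int k = 0; k < n; k++)
                c[IDX(i, j, n)] += a[IDX(i, k, n)] * b[IDX(k, j, n)];
}

/* Interchanged order: J, K, I.
   The inner I loop touches c[] and a[] with unit stride, and
   b[IDX(k, j, n)] is invariant in that loop. */
void mult_jki(const double *a, const double *b, double *c, int n)
{
    assert(n > 0);
    for (int j = 0; j < n; j++)
        for (int k = 0; k < n; k++) {
            const double bkj = b[IDX(k, j, n)];
            for (int i = 0; i < n; i++)
                c[IDX(i, j, n)] += a[IDX(i, k, n)] * bkj;
        }
}
```

Both functions add the products for a given C(I,J) in the same K order, so they produce identical results; only the memory access pattern of the innermost loop differs.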
--
Summary: missing transformations lead to poorly optimized code
Product: gcc
Version: tree-ssa
Status: UNCONFIRMED
Severity: normal
Priority: P2
Component: optimization
AssignedTo: unassigned at gcc dot gnu dot org
ReportedBy: jv244 at cam dot ac dot uk
CC: gcc-bugs at gcc dot gnu dot org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=14741
* [Bug optimization/14741] missing transformations lead to poorly optimized code
From: giovannibajo at libero dot it @ 2004-03-28 3:20 UTC (permalink / raw)
To: gcc-bugs
------- Additional Comments From giovannibajo at libero dot it 2004-03-28 03:20 -------
Confirmed by Scott Robert Ladd in the referenced message. Not a regression, so
I am not setting any milestone.
--
What |Removed |Added
----------------------------------------------------------------------------
Status|UNCONFIRMED |NEW
Ever Confirmed| |1
Keywords| |pessimizes-code
Known to fail| |tree-ssa
Last reconfirmed|0000-00-00 00:00:00 |2004-03-28 03:20:22
date| |
* [Bug rtl-optimization/14741] missing transformations lead to poorly optimized code
From: pinskia at gcc dot gnu dot org @ 2004-05-31 5:18 UTC (permalink / raw)
To: gcc-bugs
------- Additional Comments From pinskia at gcc dot gnu dot org 2004-05-30 04:44 -------
This is a dup of bug 14771
*** This bug has been marked as a duplicate of 14771 ***
--
What |Removed |Added
----------------------------------------------------------------------------
Status|NEW |RESOLVED
Resolution| |DUPLICATE
* [Bug rtl-optimization/14741] [gfortran] missing transformations lead to poorly optimized code
From: tobi at gcc dot gnu dot org @ 2004-05-31 9:26 UTC (permalink / raw)
To: gcc-bugs
------- Additional Comments From tobi at gcc dot gnu dot org 2004-05-30 10:53 -------
This depends on 14771, but it's not a dup, because AFAIK the backend doesn't do
this kind of transformation anyway. At least it didn't when this was last
discussed.
--
What |Removed |Added
----------------------------------------------------------------------------
BugsThisDependsOn| |14771
Status|RESOLVED |REOPENED
Resolution|DUPLICATE |
Summary|missing transformations lead|[gfortran] missing
|to poorly optimized code |transformations lead to
| |poorly optimized code
* [Bug rtl-optimization/14741] missing transformations lead to poorly optimized code
From: pinskia at gcc dot gnu dot org @ 2004-09-23 16:48 UTC (permalink / raw)
To: gcc-bugs
--
What |Removed |Added
----------------------------------------------------------------------------
Status|REOPENED |NEW
* [Bug rtl-optimization/14741] missing transformations lead to poorly optimized code
From: pinskia at gcc dot gnu dot org @ 2004-12-23 2:21 UTC (permalink / raw)
To: gcc-bugs
------- Additional Comments From pinskia at gcc dot gnu dot org 2004-12-23 02:21 -------
Hmm, I get this for the loop on ppc:
L42:
        fmr f12,f0
L33:
        lfd f13,0(r11)
        add r11,r11,r7
        lfd f0,0(r9)
        addi r9,r9,8
        fmadd f0,f13,f0,f12
        bdnz L42
The problem with the secondary bb is PR 19038, but other than that, this loop is as optimized as it
gets on ppc.
* [Bug tree-optimization/14741] missing transformations lead to poorly optimized code
From: rakdver at gcc dot gnu dot org @ 2005-01-18 11:35 UTC (permalink / raw)
To: gcc-bugs
------- Additional Comments From rakdver at gcc dot gnu dot org 2005-01-18 11:35 -------
The relevant part of the code looks like this:
do
  {
    k_2 = phi(...,k_1);
    k_1 = k_2 + 1;
  } while (k_2 != endvalue);
/* k_1 unused outside of the loop */
Ivopts decides that it makes more sense to perform the increment after the comparison:
while (1)
  {
    k_2 = phi(...,k_1);
    if (k_2 == endvalue)
      break;
    k_1 = k_2 + 1;
  }
Which is sort of true: the increment is executed one time fewer per execution of
the loop, and we also need one register fewer, since k_1 and k_2 can now be
coalesced.
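
The difference between the two loop shapes can be checked with a small C sketch. This is only an illustration of the counting argument, not GCC code: the function names are hypothetical, the phi node is modelled by a plain assignment, and start <= endvalue is assumed.

```c
#include <assert.h>

/* Shape 1: the increment runs before the exit test (do/while).
   Returns how many times the increment executed. */
int increments_before_test(int start, int endvalue)
{
    assert(start <= endvalue);
    int k_1 = start, k_2, count = 0;
    do {
        k_2 = k_1;          /* stands in for the phi node */
        k_1 = k_2 + 1;      /* executed on every iteration */
        count++;
    } while (k_2 != endvalue);
    return count;
}

/* Shape 2: exit test first, increment afterwards.
   k_1 and k_2 share one variable, as after coalescing. */
int increments_after_test(int start, int endvalue)
{
    assert(start <= endvalue);
    int k = start, count = 0;
    for (;;) {
        if (k == endvalue)
            break;
        k = k + 1;          /* skipped on the final trip */
        count++;
    }
    return count;
}
```

For the same bounds, the second shape always performs exactly one increment fewer than the first.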
* [Bug tree-optimization/14741] missing transformations lead to poorly optimized code
From: pinskia at gcc dot gnu dot org @ 2005-01-19 23:39 UTC (permalink / raw)
To: gcc-bugs
------- Additional Comments From pinskia at gcc dot gnu dot org 2005-01-19 23:38 -------
We now get:
L33:
lfd f13,0(r11)
add r11,r11,r8
lfd f0,0(r10)
addi r10,r10,8
fmadd f0,f13,f0,f12
fmr f12,f0
bdnz L33
Which is much better, thanks Zdenek.
The only problem left looks like a coalescing problem in out-of-ssa.
Before out-of-ssa:
# ivtmp.89_54 = PHI <ivtmp.89_31(2), ivtmp.89_33(3)>;
# lsm_tmp.85_52 = PHI <lsm_tmp.85_53(2), D.538_49(3)>;
# k_3 = PHI <1(2), k_7(3)>;
<L4>:;
D.529_38 = k_3 * stride.10_5;
D.530_39 = i_2 + D.529_38;
D.531_40 = offset.11_9 + D.530_39;
D.532_42 = (*a_41)[D.531_40];
b_30 = ivtmp.89_54;
D.536_47 = *b_30;
D.537_48 = D.532_42 * D.536_47;
D.538_49 = D.537_48 + lsm_tmp.85_52;
D.773_12 = (<unnamed type>) k_3;
D.774_8 = D.773_12 + 1;
k_7 = (int4) D.774_8;
ivtmp.89_33 = ivtmp.89_54 + 8B;
D.771_18 = (<unnamed type>) k_7;
D.772_17 = D.771_18 + 4294967295;
k_13 = (int4) D.772_17;
if (stride.10_5 == k_13) goto <L25>; else goto <L4>;
After:
<L4>:;
D.538 = (*a)[offset.11 + i + k * stride.10] * *ivtmp.89 + lsm_tmp.85;
k = (int4) ((<unnamed type>) k + 1);
ivtmp.89 = ivtmp.89 + 8B;
lsm_tmp.85 = D.538;
if (stride.10 == (int4) ((<unnamed type>) k + 4294967295)) goto <L25>; else goto <L4>;
We should have coalesced lsm_tmp.85 and D.538 together.
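
As a rough illustration of what coalescing would buy, here is the inner loop in two shapes, written as C. This is a sketch, not compiler output: the function and variable names are hypothetical, and the assumption is that the extra copy is what appears as the fmr in the ppc code above.

```c
#include <assert.h>
#include <stddef.h>

/* Shape of the "After" dump: the sum is formed in a separate temporary
   (d538) and copied back into the accumulator on every iteration. */
double dot_with_copy(const double *a, size_t stride, const double *b, size_t n)
{
    assert(stride >= 1);
    double lsm_tmp = 0.0;
    for (size_t k = 0; k < n; k++) {
        double d538 = a[k * stride] * b[k] + lsm_tmp;
        lsm_tmp = d538;     /* the copy that coalescing would remove */
    }
    return lsm_tmp;
}

/* Coalesced shape: temporary and accumulator are one variable. */
double dot_coalesced(const double *a, size_t stride, const double *b, size_t n)
{
    assert(stride >= 1);
    double acc = 0.0;
    for (size_t k = 0; k < n; k++)
        acc = a[k * stride] * b[k] + acc;
    return acc;
}
```

Both functions compute the same value; the second simply has no separate temporary left to copy.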
--
What |Removed |Added
----------------------------------------------------------------------------
Severity|normal |enhancement
* [Bug tree-optimization/14741] missing transformations lead to poorly optimized code
From: steven at gcc dot gnu dot org @ 2005-01-23 13:31 UTC (permalink / raw)
To: gcc-bugs
------- Additional Comments From steven at gcc dot gnu dot org 2005-01-23 13:31 -------
My patch for PR19464 will fix this.
--
What |Removed |Added
----------------------------------------------------------------------------
BugsThisDependsOn| |19464
AssignedTo|unassigned at gcc dot gnu |steven at gcc dot gnu dot
|dot org |org
Status|NEW |ASSIGNED
Last reconfirmed|2004-09-15 03:25:55 |2005-01-23 13:31:52
date| |
* [Bug tree-optimization/14741] missing transformations lead to poorly optimized code
From: steven at gcc dot gnu dot org @ 2005-01-23 19:43 UTC (permalink / raw)
To: gcc-bugs
--
Bug 14741 depends on bug 19464, which changed state.
Bug 19464 Summary: [3.3/3.4/4.0 Regression] gcse causes poor register allocation
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=19464
What |Old Value |New Value
----------------------------------------------------------------------------
Status|ASSIGNED |RESOLVED
Resolution| |FIXED
* [Bug tree-optimization/14741] missing transformations lead to poorly optimized code
From: steven at gcc dot gnu dot org @ 2005-01-23 20:00 UTC (permalink / raw)
To: gcc-bugs
------- Additional Comments From steven at gcc dot gnu dot org 2005-01-23 20:00 -------
Joost, could you try this with CVS head? We should do a lot better
now. Could you also show the code ifc produces for your test case?
Maybe they have some option enabled by default that we have disabled
by default (loop unrolling, vectorization, etc.).
* [Bug tree-optimization/14741] missing transformations lead to poorly optimized code
From: jv244 at cam dot ac dot uk @ 2005-01-28 16:00 UTC (permalink / raw)
To: gcc-bugs
------- Additional Comments From jv244 at cam dot ac dot uk 2005-01-28 15:59 -------
Hi Steven, I now (gcc version 4.0.0 20050128 (experimental)) get the following,
where the first number is the timing.
multgen/basic_mult> gfortran -O3 -ffast-math mult.f90
multgen/basic_mult> ./a.out
59.0300000000000 10.2400000000000
which is good as compared to
multgen/basic_mult> ifort -O3 mult.f90
multgen/basic_mult> ./a.out
64.8900000000000 10.2399999999998
but very bad (factor 20) as compared to
multgen/basic_mult> ifort -O3 -xN mult.f90
mult.f90(4) : (col. 0) remark: LOOP WAS VECTORIZED.
mult.f90(5) : (col. 0) remark: LOOP WAS VECTORIZED.
mult.f90(6) : (col. 0) remark: LOOP WAS VECTORIZED.
multgen/basic_mult> ./a.out
2.13000000000000 10.2399999999998
I'll try to attach the relevant asm for this.
* [Bug tree-optimization/14741] missing transformations lead to poorly optimized code
From: steven at gcc dot gnu dot org @ 2005-01-28 16:23 UTC (permalink / raw)
To: gcc-bugs
------- Additional Comments From steven at gcc dot gnu dot org 2005-01-28 16:23 -------
The -xN you added makes ifort specialize the code for Pentium 4. So far,
nobody has cared to make GCC produce good code for the good old Pentium 4,
so I would not be terribly surprised if we lose a lot just on the normal
code generation. Add to that the fact that -xN enables a lot of extra
optimizations in ifort that gcc does not have yet (vectorization is one
example in your case), and it is not a surprise at all that we are that much
slower.
I don't know if the mainline vectorizer is already smart enough to handle
the loop in your code. Probably it is not.
You could try "gfortran -O3 -mtune=pentium4 -ffast-math -mfpmath=sse
-ftree-loop-linear -ftree-vectorize yourcode.f90" and see if it helps.
* [Bug tree-optimization/14741] missing transformations lead to poorly optimized code
From: jv244 at cam dot ac dot uk @ 2005-01-28 16:31 UTC (permalink / raw)
To: gcc-bugs
------- Additional Comments From jv244 at cam dot ac dot uk 2005-01-28 16:31 -------
> You could try "gfortran -O3 -mtune=pentium4 -ffast-math -mfpmath=sse
> -ftree-loop-linear -ftree-vectorize yourcode.f90" and see if it helps.
Unhappily, seems to make things slower:
multgen/basic_mult> gfortran -O3 -mtune=pentium4 -ffast-math -mfpmath=sse
-ftree-loop-linear -ftree-vectorize mult.f90
mult.f90:0: warning: SSE instruction set disabled, using 387 arithmetics
multgen/basic_mult> ./a.out
60.3900000000000 10.2400000000000
whereas removing some of these flags helps a little
multgen/basic_mult> gfortran -O3 -mtune=pentium4 -ffast-math mult.f90
multgen/basic_mult> ./a.out
56.4700000000000 10.2400000000000
* Re: [Bug tree-optimization/14741] missing transformations lead to poorly optimized code
From: Daniel Berlin @ 2005-01-28 17:22 UTC (permalink / raw)
To: jv244 at cam dot ac dot uk; +Cc: gcc-bugs
On Fri, 28 Jan 2005, jv244 at cam dot ac dot uk wrote:
>
> ------- Additional Comments From jv244 at cam dot ac dot uk 2005-01-28 16:31 -------
>
>> You could try "gfortran -O3 -mtune=pentium4 -ffast-math -mfpmath=sse
>> -ftree-loop-linear -ftree-vectorize yourcode.f90" and see if it helps.
>
> Unhappily, seems to make things slower:
>
> multgen/basic_mult> gfortran -O3 -mtune=pentium4 -ffast-math -mfpmath=sse
> -ftree-loop-linear -ftree-vectorize mult.f90
> mult.f90:0: warning: SSE instruction set disabled, using 387 arithmetics
You'd need -msse2 or -msse (or is it -march=pentium4 that enables these?)
* [Bug tree-optimization/14741] missing transformations lead to poorly optimized code
From: jv244 at cam dot ac dot uk @ 2009-09-29 18:59 UTC (permalink / raw)
To: gcc-bugs
------- Comment #16 from jv244 at cam dot ac dot uk 2009-09-29 18:59 -------
Since graphite should be able to fix this PR, I tried it, without luck:
> gfortran -ffast-math -O3 -march=native -fgraphite -floop-interchange -floop-block test.f90
test.f90: In function MAIN__:
test.f90:1:0: sorry, unimplemented: loop blocking not implemented
test.f90:1:0: internal compiler error: in apply_poly_transforms, at graphite-poly.c:253
Please submit a full bug report,
with preprocessed source if appropriate.
See <http://gcc.gnu.org/bugs.html> for instructions.
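
For reference, the kind of loop blocking that -floop-block is meant to perform can be written by hand. The C sketch below is a hypothetical illustration, not graphite output: the arrays are column-major to mirror Fortran, and COLMAJ, mult_reference, mult_blocked and the block size bs are names invented here.

```c
#include <assert.h>
#include <stddef.h>

/* Column-major index, mirroring Fortran's A(I,J) layout (0-based here). */
#define COLMAJ(i, j, n) ((size_t)(j) * (size_t)(n) + (size_t)(i))

static int min_int(int x, int y) { return x < y ? x : y; }

/* Unblocked reference with the J, I, K order of the test case. */
void mult_reference(const double *a, const double *b, double *c, int n)
{
    assert(n > 0);
    for (int j = 0; j < n; j++)
        for (int i = 0; i < n; i++)
            for (int k = 0; k < n; k++)
                c[COLMAJ(i, j, n)] += a[COLMAJ(i, k, n)] * b[COLMAJ(k, j, n)];
}

/* Blocked version: work on bs-by-bs tiles so that the pieces of a, b and c
   in use stay in cache. Inside a tile the loops run in J, K, I order, which
   keeps the innermost accesses unit-stride. */
void mult_blocked(const double *a, const double *b, double *c, int n, int bs)
{
    assert(n > 0 && bs > 0);
    for (int jj = 0; jj < n; jj += bs)
        for (int kk = 0; kk < n; kk += bs)
            for (int ii = 0; ii < n; ii += bs) {
                const int jmax = min_int(jj + bs, n);
                const int kmax = min_int(kk + bs, n);
                const int imax = min_int(ii + bs, n);
                for (int j = jj; j < jmax; j++)
                    for (int k = kk; k < kmax; k++) {
                        const double bkj = b[COLMAJ(k, j, n)];
                        for (int i = ii; i < imax; i++)
                            c[COLMAJ(i, j, n)] += a[COLMAJ(i, k, n)] * bkj;
                    }
            }
}
```

The min_int clamps handle matrix sizes that are not a multiple of bs. A sensible value for bs depends on the cache sizes of the target, so it is left as a parameter.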
* [Bug tree-optimization/14741] missing transformations lead to poorly optimized code
From: jv244 at cam dot ac dot uk @ 2007-07-03 18:10 UTC (permalink / raw)
To: gcc-bugs
------- Comment #15 from jv244 at cam dot ac dot uk 2007-07-03 18:09 -------
Current gfortran trunk is still about a factor of 8 slower than ifort:
> gfortran -O3 -ffast-math -ftree-vectorize -march=native test.f90
> ./a.out
12.9808110000000 10.2399999999998
> ifort -xT -O2 test.f90
> ./a.out
1.62810200000000 10.2399999999998
(first number is the time)
* [Bug tree-optimization/14741] missing transformations lead to poorly optimized code
From: steven at gcc dot gnu dot org @ 2005-10-07 21:21 UTC (permalink / raw)
To: gcc-bugs
------- Comment #14 from steven at gcc dot gnu dot org 2005-10-07 21:21 -------
I don't have time to work on these (new job), so unassigning.
--
steven at gcc dot gnu dot org changed:
What |Removed |Added
----------------------------------------------------------------------------
AssignedTo|steven at gcc dot gnu dot |unassigned at gcc dot gnu
|org |dot org
Status|ASSIGNED |NEW