public inbox for gcc-bugs@sourceware.org
* [Bug target/40648] New: misaligned store vectorizer patch introduced 10% runtime regression on Polyhedron test_fpu
From: ubizjak at gmail dot com @ 2009-07-04 11:54 UTC (permalink / raw)
To: gcc-bugs
Hello!
The "[patch, vectorizer] misaligned store support" patch [1] resulted in more
than 10% longer execution time for Polyhedron test_fpu test on Core2.
The test is compiled with "-march=x86-64 -ffast-math -funroll-loops -O3",
results are:
time ./a.out
Benchmark running, hopefully as only ACTIVE task
Test1 - Gauss 2000 (101x101) inverts 2.5 sec Err= 0.000000000000006
Test2 - Crout 2000 (101x101) inverts 2.5 sec Err= 0.000000000000015
Test3 - Crout 2 (1001x1001) inverts 2.3 sec Err= 0.000000000000065
Test4 - Lapack 2 (1001x1001) inverts 2.4 sec Err= 0.000000000000250
total = 9.6 sec
real 0m9.864s
user 0m9.778s
sys 0m0.074s
with patch [1] included, vs.:
time ./a.out
Benchmark running, hopefully as only ACTIVE task
Test1 - Gauss 2000 (101x101) inverts 1.9 sec Err= 0.000000000000006
Test2 - Crout 2000 (101x101) inverts 2.5 sec Err= 0.000000000000015
Test3 - Crout 2 (1001x1001) inverts 2.3 sec Err= 0.000000000000065
Test4 - Lapack 2 (1001x1001) inverts 2.0 sec Err= 0.000000000000250
total = 8.6 sec
real 0m8.869s
user 0m8.788s
sys 0m0.068s
when patch [1] is reverted.
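As a quick check of the "more than 10%" figure, the relative slowdown can be computed from the totals and user times quoted above. This is only an illustrative sketch; the helper name `slowdown_percent` is made up for this note and is not part of the benchmark.

```c
#include <assert.h>

/* Hypothetical helper: relative slowdown of `slow` with respect to
   `fast`, expressed in percent. */
static double slowdown_percent(double slow, double fast)
{
    assert(fast > 0.0);
    return (slow - fast) / fast * 100.0;
}
```

Calling `slowdown_percent(9.6, 8.6)` on the benchmark totals gives roughly 11.6%, and `slowdown_percent(9.778, 8.788)` on the user times gives roughly 11.3%, which is consistent with the regression described in the report.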
The compiler is from today's SVN, "xgcc (GCC) 4.5.0 20090704 (experimental)
[trunk revision 149223]".
The effect of this patch can also be seen on [2], see test_fpu chart between
2009-06-05 and 2009-06-06.
[1] http://gcc.gnu.org/ml/gcc-patches/2009-06/msg00492.html
[2] http://gcc.opensuse.org/c++bench/polyhedron/polyhedron-summary.txt-2-0.html
--
Summary: misaligned store vectorizer patch introduced 10% runtime
regression on Polyhedron test_fpu
Product: gcc
Version: 4.5.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: target
AssignedTo: unassigned at gcc dot gnu dot org
ReportedBy: ubizjak at gmail dot com
GCC build triplet: x86_64-pc-linux-gnu
GCC host triplet: x86_64-pc-linux-gnu
GCC target triplet: x86_64-pc-linux-gnu
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=40648
From: rguenth at gcc dot gnu dot org @ 2009-07-04 12:06 UTC (permalink / raw)
To: gcc-bugs
------- Comment #1 from rguenth at gcc dot gnu dot org 2009-07-04 12:05 -------
Can you check numbers with vectorization disabled? I see the regression as
well on an AMD Fam 10 machine, which supposedly has unaligned moves as fast
as aligned moves (if the data turns out to be aligned). Which means the
data really is unaligned.
What is the difference in code generation? Can you create a testcase from
the hot loop(s)?
--
rguenth at gcc dot gnu dot org changed:
           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |rguenth at gcc dot gnu dot org
From: rguenth at gcc dot gnu dot org @ 2009-07-04 12:33 UTC (permalink / raw)
To: gcc-bugs
------- Comment #2 from rguenth at gcc dot gnu dot org 2009-07-04 12:33 -------
One loop is
     139  0.0046 :   DO l = 1 , K
     622  0.0208 :     IF ( B(l,j)/=ZERO ) THEN
                 :       temp = Alpha*B(l,j)
   21380  0.7146 :       DO i = 1 , M
  569348 19.0299 :         C(i,j) = C(i,j) + temp*A(i,l)
                 :       ENDDO
                 :     ENDIF
                 :   ENDDO
where C(i,j) and A(i,l) are all unaligned. As the number of iterations is
symbolic we use epilogue peeling. But instead we could have done
peeling to align the loads/stores from/to C.
SUBROUTINE DGEMM(M,N,K,Alpha,A,Lda,B,Ldb,Beta,C,Ldc)
  IMPLICIT NONE
  DOUBLE PRECISION , PARAMETER :: ONE = 1.0D+0 , ZERO = 0.0D+0
  DOUBLE PRECISION :: Alpha , Beta
  INTEGER :: K , Lda , Ldb , Ldc , M , N
  DOUBLE PRECISION , DIMENSION(Lda,*) :: A
  DOUBLE PRECISION , DIMENSION(Ldb,*) :: B
  DOUBLE PRECISION , DIMENSION(Ldc,*) :: C
  INTENT (IN) A , Alpha , B , Beta , K , Lda , Ldb , Ldc , M , N
  INTENT (INOUT) C
  INTEGER :: i , j , l
  DOUBLE PRECISION :: temp
  DO j = 1 , N
    IF ( Beta==ZERO ) THEN
      DO i = 1 , M
        C(i,j) = ZERO
      ENDDO
    ELSEIF ( Beta/=ONE ) THEN
      DO i = 1 , M
        C(i,j) = Beta*C(i,j)
      ENDDO
    ENDIF
    DO l = 1 , K
      IF ( B(l,j)/=ZERO ) THEN
        temp = Alpha*B(l,j)
        DO i = 1 , M
          C(i,j) = C(i,j) + temp*A(i,l)
        ENDDO
      ENDIF
    ENDDO
  ENDDO
END SUBROUTINE DGEMM
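To make the "peeling to align the loads/stores from/to C" idea concrete, here is a minimal C sketch of the hot inner loop with a scalar prologue that runs until the store target is 16-byte aligned. It is a hand-written illustration under stated assumptions, not what the vectorizer emits: the name `axpy_peeled` is hypothetical, the 16-byte figure assumes SSE2-sized vectors, and the two-element loop body only stands in for one vector operation.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper, not GCC code: c[i] += temp * a[i] for i in [0, m),
   laid out the way a loop peeled for alignment would be. */
static void axpy_peeled(double *c, const double *a, double temp, size_t m)
{
    size_t i = 0;

    assert(c != NULL && a != NULL);

    /* Prologue: scalar iterations until &c[i] is 16-byte aligned
       (assumed vector width). */
    while (i < m && ((uintptr_t)&c[i] % 16) != 0) {
        c[i] += temp * a[i];
        i++;
    }

    /* Main loop: two doubles per iteration, standing in for one 16-byte
       vector operation. Accesses to c are now aligned; loads from a may
       still be misaligned. */
    for (; i + 1 < m; i += 2) {
        c[i]     += temp * a[i];
        c[i + 1] += temp * a[i + 1];
    }

    /* Epilogue: at most one leftover element. */
    for (; i < m; i++)
        c[i] += temp * a[i];
}
```

Only one of the two arrays can be aligned this way in general, which is why the sketch aligns `c`: it is both loaded and stored, so aligning it removes the misaligned store.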
--
rguenth at gcc dot gnu dot org changed:
                What    |Removed                     |Added
----------------------------------------------------------------------------
                  Status|UNCONFIRMED                 |NEW
          Ever Confirmed|0                           |1
   Last reconfirmed date|0000-00-00 00:00:00         |2009-07-04 12:33:20
From: rguenth at gcc dot gnu dot org @ 2009-07-04 12:36 UTC (permalink / raw)
To: gcc-bugs
------- Comment #3 from rguenth at gcc dot gnu dot org 2009-07-04 12:36 -------
Tuned for Core2 I get for the innermost loop
.L19:
leal (%eax,%ebx), %edx
movsd (%eax,%ecx), %xmm1
movsd (%edx), %xmm7
movhpd 8(%eax,%ecx), %xmm1
movhpd 8(%edx), %xmm7
movapd %xmm1, %xmm0
incl %esi
mulpd %xmm3, %xmm0
addl $16, %eax
addpd %xmm7, %xmm0
cmpl %edi, %esi
movlpd %xmm0, (%edx)
movhpd %xmm0, 8(%edx)
jb .L19
which is slower than with vectorization disabled (which is what happened
before the patch?).
From: ubizjak at gmail dot com @ 2009-07-04 12:44 UTC (permalink / raw)
To: gcc-bugs
------- Comment #4 from ubizjak at gmail dot com 2009-07-04 12:43 -------
(In reply to comment #1)
> Can you check numbers with vectorization disabled? I see the regression as
> well on a AMD Fam 10 machine which supposedly has unaligned moves as fast
> as aligned moves (if the data turns out to be aligned). Which means the
> data really is unaligned.
Without the vectorisation (adding -fno-tree-vectorize), there is no difference:
time ./a.out
Benchmark running, hopefully as only ACTIVE task
Test1 - Gauss 2000 (101x101) inverts 2.2 sec Err= 0.000000000000006
Test2 - Crout 2000 (101x101) inverts 2.5 sec Err= 0.000000000000023
Test3 - Crout 2 (1001x1001) inverts 2.6 sec Err= 0.000000000000031
Test4 - Lapack 2 (1001x1001) inverts 2.2 sec Err= 0.000000000000250
total = 9.5 sec
real 0m9.760s
user 0m9.674s
sys 0m0.082s
> What is the difference in code generation? Can you create a testcase from
> the hot loop(s)?
The first thing to catch my eye is the missing CSE on memory references in the
_MAIN loop (added -march=barcelona to generate movupd insns):
.L14:
movupd equiv.0.1551(%rdx,%rax), %xmm8 #, tmp720
movupd a3.1557(%rdx,%rax), %xmm7 #, tmp722
leaq 16(%rdx), %r10 #, tmp789
movupd equiv.0.1551(%r10,%rax), %xmm6 #, tmp867
movupd a3.1557(%r10,%rax), %xmm5 #, tmp869
leaq 32(%rdx), %rcx #, ivtmp.845
subpd %xmm8, %xmm7 # tmp720, tmp722
movupd equiv.0.1551(%rcx,%rax), %xmm3 #, tmp873
movupd a3.1557(%rcx,%rax), %xmm4 #, tmp875
subpd %xmm6, %xmm5 # tmp867, tmp869
leaq 48(%rdx), %r9 #, ivtmp.845
leaq 64(%rdx), %r8 #, ivtmp.845
subpd %xmm3, %xmm4 # tmp873, tmp875
movupd a3.1557(%r9,%rax), %xmm15 #, tmp881
movupd equiv.0.1551(%r9,%rax), %xmm2 #, tmp879
movupd a3.1557(%r8,%rax), %xmm13 #, tmp887
movupd equiv.0.1551(%r8,%rax), %xmm14 #, tmp885
and in regressed case:
.L13:
movupd (%rdx,%rax), %xmm15 #* ivtmp.792, tmp962
movupd (%r12,%rax), %xmm14 #* ivtmp.792, tmp964
leaq 16(%rax), %rsi #, tmp1050
movupd (%rdx,%rsi), %xmm13 #, tmp1114
movupd (%r12,%rsi), %xmm12 #, tmp1116
leaq 32(%rax), %r15 #, ivtmp.792
subpd %xmm15, %xmm14 # tmp962, tmp964
movupd (%rdx,%r15), %xmm11 #* ivtmp.792, tmp1120
movupd (%r12,%r15), %xmm10 #* ivtmp.792, tmp1122
subpd %xmm13, %xmm12 # tmp1114, tmp1116
leaq 48(%rax), %rcx #, ivtmp.792
leaq 64(%rax), %r13 #, ivtmp.792
subpd %xmm11, %xmm10 # tmp1120, tmp1122
movupd (%r12,%rcx), %xmm8 #* ivtmp.792, tmp1128
movupd (%rdx,%rcx), %xmm9 #* ivtmp.792, tmp1126
movupd (%r12,%r13), %xmm6 #* ivtmp.792, tmp1134
movupd (%rdx,%r13), %xmm7 #* ivtmp.792, tmp1132
From: ubizjak at gmail dot com @ 2009-07-04 13:40 UTC (permalink / raw)
To: gcc-bugs
------- Comment #5 from ubizjak at gmail dot com 2009-07-04 13:40 -------
(In reply to comment #4)
> and in regressed case:
... in NON-regressed case. The regressed code is the first dump.
From: dominiq at lps dot ens dot fr @ 2009-07-04 14:02 UTC (permalink / raw)
To: gcc-bugs
------- Comment #6 from dominiq at lps dot ens dot fr 2009-07-04 14:02 -------
I have seen this problem also. From a crude profiling, it seems that the slow
routines are dgemm, as pointed out in comment #2, and gauss. This is a
regression with respect to 4.4.0, and it started between June 5 and 6.
On i686-apple-darwin9 with "-ffast-math -funroll-loops -O3" but without
specifying -march, the inner loop is unrolled 8 times with 4.4 and only 4 times
with trunk with a lot of extra code for the memory access (I am not fluent with
the *86 assembly). I have compared the outputs of -ftree-vectorizer-verbose=2,
but did not see anything suspicious.
From: eres at il dot ibm dot com @ 2009-07-05 8:12 UTC (permalink / raw)
To: gcc-bugs
------- Comment #7 from eres at il dot ibm dot com 2009-07-05 08:12 -------
Testing test_fpu on Power7 with the power7 branch shows no significant
difference between the version compiled with the misaligned store support patch
and without it. (using -mcpu=power7 -ffast-math -funroll-loops -O3)
The version with the misaligned store support patch is ~23% faster than the
version with -fno-tree-vectorize.
So it seems like this is a tuning issue for x86-64 and might be addressed in
the cost model.
From: rguenth at gcc dot gnu dot org @ 2009-07-07 15:48 UTC (permalink / raw)
To: gcc-bugs
------- Comment #8 from rguenth at gcc dot gnu dot org 2009-07-07 15:47 -------
The issue is likely the sequence
load upper half of cache line 1
load lower half of cache line 2
store upper half of cache line 1
store lower half of cache line 2 <---
load upper half of cache line 2 <---
load lower half of cache line 3
...
where the marked lines probably cause internal delays.
Not using unaligned stores for this kind of data dependence or peeling
for alignment will probably help here.
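The access pattern sketched above can be explored with a small counting routine: given a starting byte offset, it reports how many consecutive vector-sized accesses straddle a cache-line boundary. This is only an illustration of the hypothesis in this comment. The function name is made up, and the 16-byte vector and 64-byte line sizes used in the usage note are assumptions.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: count how many of `n` consecutive accesses of
   `vec_bytes` bytes each, starting at byte offset `start`, touch two
   different lines of `line_bytes` bytes. */
static size_t count_line_splits(size_t start, size_t n,
                                size_t vec_bytes, size_t line_bytes)
{
    size_t splits = 0;

    assert(vec_bytes > 0 && line_bytes >= vec_bytes);

    for (size_t k = 0; k < n; k++) {
        size_t first = start + k * vec_bytes;   /* first byte touched */
        size_t last  = first + vec_bytes - 1;   /* last byte touched  */
        if (first / line_bytes != last / line_bytes)
            splits++;
    }
    return splits;
}
```

For example, `count_line_splits(8, 8, 16, 64)` returns 2: with data that is 8 bytes off a 16-byte boundary, one access in four crosses a line. With `start` set to 0, no access is split.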
From: eres at il dot ibm dot com @ 2009-07-09 7:33 UTC (permalink / raw)
To: gcc-bugs
------- Comment #9 from eres at il dot ibm dot com 2009-07-09 07:32 -------
> Not using unaligned stores for this kind of data dependence or peeling
> for alignment will probably help here.
The decision of how to vectorize can be changed for x86 (or any other target).
Instead of first checking for misalignment support in the
vect_enhance_data_refs_alignment () function (tree-vect-data-refs.c),
peeling and versioning can be checked first...
(Please see also the comment at the beginning of this function.)
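The reordering suggested here can be pictured with a small decision function. This is a simplified sketch and not the code in tree-vect-data-refs.c: every name below (`align_strategy`, `align_options`, `choose_align_strategy`, the `misaligned_first` flag) is hypothetical, and the real function weighs many more factors.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical names throughout; this is not the GCC implementation. */
enum align_strategy {
    USE_MISALIGNED_ACCESS,
    PEEL_FOR_ALIGNMENT,
    VERSION_FOR_ALIGNMENT,
    DO_NOT_VECTORIZE
};

struct align_options {
    bool target_supports_misaligned;
    bool peeling_possible;
    bool versioning_possible;
};

/* misaligned_first == true models checking misalignment support first;
   false models trying peeling and versioning before falling back to
   misaligned accesses. */
static enum align_strategy
choose_align_strategy(const struct align_options *opt, bool misaligned_first)
{
    assert(opt != NULL);

    if (misaligned_first && opt->target_supports_misaligned)
        return USE_MISALIGNED_ACCESS;
    if (opt->peeling_possible)
        return PEEL_FOR_ALIGNMENT;
    if (opt->versioning_possible)
        return VERSION_FOR_ALIGNMENT;
    if (opt->target_supports_misaligned)
        return USE_MISALIGNED_ACCESS;
    return DO_NOT_VECTORIZE;
}
```

With all three options available, the misaligned-first order picks `USE_MISALIGNED_ACCESS`, while the peel-first order picks `PEEL_FOR_ALIGNMENT`; a target could select whichever order suits its cost of misaligned accesses.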
From: eres at il dot ibm dot com @ 2009-10-25 12:41 UTC (permalink / raw)
To: gcc-bugs
------- Comment #10 from eres at il dot ibm dot com 2009-10-25 12:41 -------
(In reply to comment #0)
> Hello!
> The "[patch, vectorizer] misaligned store support" patch [1] resulted in more
> than 10% longer execution time for Polyhedron test_fpu test on Core2.
> The test is compiled with "-march=x86-64 -ffast-math -funroll-loops -O3",
This patch wrongly changes the decision of when to peel the loop, which causes
the negative effect on performance.
I am preparing a patch to fix this.
From: ubizjak at gmail dot com @ 2009-10-28 10:33 UTC (permalink / raw)
To: gcc-bugs
------- Comment #11 from ubizjak at gmail dot com 2009-10-28 10:33 -------
Author: revitale
Date: Tue Oct 27 11:46:07 2009
New Revision: 153590
URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=153590
Log:
Fix PR40648 -- Fix misaligned store vectorizer patch
Modified:
trunk/gcc/ChangeLog
trunk/gcc/testsuite/ChangeLog
trunk/gcc/testsuite/gcc.dg/vect/costmodel/i386/costmodel-fast-math-vect-pr29925.c
trunk/gcc/testsuite/gcc.dg/vect/costmodel/i386/costmodel-vect-31.c
trunk/gcc/testsuite/gcc.dg/vect/costmodel/i386/costmodel-vect-33.c
trunk/gcc/testsuite/gcc.dg/vect/costmodel/x86_64/costmodel-fast-math-vect-pr29925.c
trunk/gcc/testsuite/gcc.dg/vect/costmodel/x86_64/costmodel-vect-31.c
trunk/gcc/testsuite/gcc.dg/vect/costmodel/x86_64/costmodel-vect-33.c
trunk/gcc/testsuite/gcc.dg/vect/no-section-anchors-vect-31.c
trunk/gcc/testsuite/gcc.dg/vect/no-section-anchors-vect-64.c
trunk/gcc/testsuite/gcc.dg/vect/no-section-anchors-vect-66.c
trunk/gcc/testsuite/gcc.dg/vect/no-section-anchors-vect-68.c
trunk/gcc/testsuite/gcc.dg/vect/no-section-anchors-vect-69.c
trunk/gcc/testsuite/gcc.dg/vect/slp-25.c
trunk/gcc/testsuite/gcc.dg/vect/vect-109.c
trunk/gcc/testsuite/gcc.dg/vect/vect-26.c
trunk/gcc/testsuite/gcc.dg/vect/vect-27.c
trunk/gcc/testsuite/gcc.dg/vect/vect-28.c
trunk/gcc/testsuite/gcc.dg/vect/vect-29.c
trunk/gcc/testsuite/gcc.dg/vect/vect-33.c
trunk/gcc/testsuite/gcc.dg/vect/vect-44.c
trunk/gcc/testsuite/gcc.dg/vect/vect-48.c
trunk/gcc/testsuite/gcc.dg/vect/vect-50.c
trunk/gcc/testsuite/gcc.dg/vect/vect-52.c
trunk/gcc/testsuite/gcc.dg/vect/vect-54.c
trunk/gcc/testsuite/gcc.dg/vect/vect-56.c
trunk/gcc/testsuite/gcc.dg/vect/vect-58.c
trunk/gcc/testsuite/gcc.dg/vect/vect-60.c
trunk/gcc/testsuite/gcc.dg/vect/vect-70.c
trunk/gcc/testsuite/gcc.dg/vect/vect-72.c
trunk/gcc/testsuite/gcc.dg/vect/vect-87.c
trunk/gcc/testsuite/gcc.dg/vect/vect-88.c
trunk/gcc/testsuite/gcc.dg/vect/vect-89.c
trunk/gcc/testsuite/gcc.dg/vect/vect-91.c
trunk/gcc/testsuite/gcc.dg/vect/vect-92.c
trunk/gcc/testsuite/gcc.dg/vect/vect-93.c
trunk/gcc/testsuite/gcc.dg/vect/vect-95.c
trunk/gcc/testsuite/gcc.dg/vect/vect-96.c
trunk/gcc/testsuite/gcc.dg/vect/vect-multitypes-1.c
trunk/gcc/testsuite/gcc.dg/vect/vect-multitypes-3.c
trunk/gcc/testsuite/gcc.dg/vect/vect-multitypes-4.c
trunk/gcc/testsuite/gcc.dg/vect/vect-multitypes-6.c
trunk/gcc/testsuite/gcc.target/powerpc/vsx-vectorize-2.c
trunk/gcc/testsuite/gcc.target/powerpc/vsx-vectorize-3.c
trunk/gcc/testsuite/gcc.target/powerpc/vsx-vectorize-4.c
trunk/gcc/testsuite/gcc.target/powerpc/vsx-vectorize-5.c
trunk/gcc/testsuite/gcc.target/powerpc/vsx-vectorize-6.c
trunk/gcc/testsuite/gcc.target/powerpc/vsx-vectorize-7.c
trunk/gcc/testsuite/gfortran.dg/vect/vect-2.f90
trunk/gcc/testsuite/gfortran.dg/vect/vect-3.f90
trunk/gcc/testsuite/gfortran.dg/vect/vect-4.f90
trunk/gcc/testsuite/gfortran.dg/vect/vect-5.f90
trunk/gcc/tree-vect-data-refs.c
From: ubizjak at gmail dot com @ 2009-10-28 10:37 UTC (permalink / raw)
To: gcc-bugs
------- Comment #12 from ubizjak at gmail dot com 2009-10-28 10:36 -------
The patch fixed the regression, see test_fpu chart [1] between
2009-10-27 and 2009-10-28.
[1] http://gcc.opensuse.org/c++bench/polyhedron/polyhedron-summary.txt-2-0.html
--
ubizjak at gmail dot com changed:
           What    |Removed                     |Added
----------------------------------------------------------------------------
                URL|                            |http://gcc.gnu.org/ml/gcc-patches/2009-10/msg01604.html
             Status|NEW                         |RESOLVED
         Resolution|                            |FIXED
   Target Milestone|---                         |4.5.0