public inbox for gcc-bugs@sourceware.org
* [Bug tree-optimization/35252] New: No vectorization for complex arrays
@ 2008-02-19 10:20 ubizjak at gmail dot com
2008-03-12 6:05 ` [Bug tree-optimization/35252] " victork at gcc dot gnu dot org
` (5 more replies)
0 siblings, 6 replies; 7+ messages in thread
From: ubizjak at gmail dot com @ 2008-02-19 10:20 UTC (permalink / raw)
To: gcc-bugs
This testcase produces suboptimal code:
_Complex float af[16], bf[16], cf[16];
_Complex double ad[16], bd[16], cd[16];

void testf(void)
{
  int i;

  for (i = 0; i < 16; i++)
    cf[i] = af[i] * bf[i];
}

void testd(void)
{
  int i;

  for (i = 0; i < 16; i++)
    cd[i] = ad[i] + bd[i];
}
gcc -O2 -ftree-vectorize -msse2:
testd:
xorl %eax, %eax
.p2align 4,,7
.p2align 3
.L7:
movsd ad+8(%eax), %xmm1
movsd ad(%eax), %xmm0
addsd bd+8(%eax), %xmm1
addsd bd(%eax), %xmm0
movsd %xmm1, cd+8(%eax)
movsd %xmm0, cd(%eax)
addl $16, %eax
cmpl $256, %eax
jne .L7
rep
ret
And with -ffast-math:
testf:
xorl %eax, %eax
.p2align 4,,7
.p2align 3
.L2:
movss bf(,%eax,8), %xmm2
movss bf+4(,%eax,8), %xmm3
movss af(,%eax,8), %xmm5
movss af+4(,%eax,8), %xmm4
movaps %xmm2, %xmm0
movaps %xmm3, %xmm1
mulss %xmm5, %xmm0
mulss %xmm4, %xmm1
mulss %xmm4, %xmm2
mulss %xmm5, %xmm3
subss %xmm1, %xmm0
addss %xmm3, %xmm2
movss %xmm0, cf(,%eax,8)
movss %xmm2, cf+4(,%eax,8)
addl $1, %eax
cmpl $16, %eax
jne .L2
rep
ret
Note that we can use the SSE3 addsubps insn in the latter case.
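To illustrate why addsubps fits complex multiplication, here is a minimal sketch in portable C that emulates the instruction's lane behaviour (subtract in even lanes, add in odd lanes) with scalar code. The helper names addsub4 and cmul2 are hypothetical and not part of GCC or of the testcase; the interleaved {re, im, re, im} layout is assumed to match how _Complex float arrays are stored.

```c
/* Scalar emulation of one SSE3 addsubps on a 4-float vector:
   even lanes compute a - b, odd lanes compute a + b.
   (Hypothetical helper, for illustration only.)  */
static void addsub4(const float a[4], const float b[4], float out[4])
{
  int k;

  for (k = 0; k < 4; k++)
    out[k] = (k & 1) ? a[k] + b[k] : a[k] - b[k];
}

/* Multiply two pairs of complex floats stored interleaved as
   {re0, im0, re1, im1}.  (Hypothetical helper.)  */
void cmul2(const float x[4], const float y[4], float out[4])
{
  float t1[4], t2[4];
  int k;

  for (k = 0; k < 4; k += 2)
    {
      /* t1 = {xr*yr, xr*yi}: real part of x times y.  */
      t1[k] = x[k] * y[k];
      t1[k + 1] = x[k] * y[k + 1];
      /* t2 = {xi*yi, xi*yr}: imaginary part of x times swapped y.  */
      t2[k] = x[k + 1] * y[k + 1];
      t2[k + 1] = x[k + 1] * y[k];
    }

  /* {xr*yr - xi*yi, xr*yi + xi*yr}: one addsub finishes both parts.  */
  addsub4(t1, t2, out);
}
```

With x = {1, 2, 3, 4} and y = {5, 6, 7, 8}, cmul2 computes (1+2i)(5+6i) and (3+4i)(7+8i). A real vector version would replace each inner step with a shuffle or multiply on a whole register; this sketch only shows the data flow.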
--
Summary: No vectorization for complex arrays
Product: gcc
Version: 4.4.0
Status: UNCONFIRMED
Severity: enhancement
Priority: P3
Component: tree-optimization
AssignedTo: unassigned at gcc dot gnu dot org
ReportedBy: ubizjak at gmail dot com
OtherBugsDependingOnThis: 31485
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=35252
* [Bug tree-optimization/35252] No vectorization for complex arrays
From: victork at gcc dot gnu dot org @ 2008-03-12 6:05 UTC (permalink / raw)
To: gcc-bugs
------- Comment #1 from victork at gcc dot gnu dot org 2008-03-12 06:05 -------
We don't recognize REALPART_EXPR and IMAGPART_EXPR in the vectorizer.
These should be recognized as load operations:
CR.39_21 = REALPART_EXPR <ad[i_17]>;
CI.40_22 = IMAGPART_EXPR <ad[i_17]>;
CR.41_23 = REALPART_EXPR <bd[i_17]>;
CI.42_24 = IMAGPART_EXPR <bd[i_17]>;
These should be recognized as store operations:
REALPART_EXPR <cd[i_17]> = CR.43_25;
IMAGPART_EXPR <cd[i_17]> = CI.44_26;
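At the source level, the statements above correspond roughly to the following hand-scalarized loop. This is only a sketch: it uses the GNU C __real__ and __imag__ operators as stand-ins for REALPART_EXPR and IMAGPART_EXPR, and the function name testd_scalarized is made up for illustration.

```c
#define N 16

_Complex double ad[N], bd[N], cd[N];

/* Hand-scalarized form of cd[i] = ad[i] + bd[i].
   __real__ / __imag__ are GNU C extensions.  */
void testd_scalarized(void)
{
  int i;

  for (i = 0; i < N; i++)
    {
      /* Four scalar loads, like the REALPART_EXPR / IMAGPART_EXPR reads.  */
      double ar = __real__ ad[i];
      double ai = __imag__ ad[i];
      double br = __real__ bd[i];
      double bi = __imag__ bd[i];

      /* Two scalar stores, like the REALPART_EXPR / IMAGPART_EXPR writes.  */
      __real__ cd[i] = ar + br;
      __imag__ cd[i] = ai + bi;
    }
}
```

The real and imaginary parts of one element are adjacent in memory, so each pair of part accesses covers one contiguous 16-byte chunk, which is what makes treating them as ordinary loads and stores attractive.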
--
victork at gcc dot gnu dot org changed:
           What    |Removed                     |Added
----------------------------------------------------------------------------
         AssignedTo|unassigned at gcc dot gnu   |victork at gcc dot gnu dot
                   |dot org                     |org
             Status|UNCONFIRMED                 |ASSIGNED
     Ever Confirmed|0                           |1
   Last reconfirmed|0000-00-00 00:00:00         |2008-03-12 06:05:04
               date|                            |
* [Bug tree-optimization/35252] No vectorization for complex arrays
From: victork at gcc dot gnu dot org @ 2008-07-27 21:46 UTC (permalink / raw)
To: gcc-bugs
------- Comment #2 from victork at gcc dot gnu dot org 2008-07-27 21:45 -------
Subject: Bug 35252
Author: victork
Date: Sun Jul 27 21:44:25 2008
New Revision: 138198
URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=138198
Log:
2008-07-27 Victor Kaplansky <victork@il.ibm.com>
PR tree-optimization/35252
* tree-vect-analyze.c (vect_build_slp_tree): Make IMAGPART_EXPR and
REALPART_EXPR to be considered as same load operation.
testsuite
PR tree-optimization/35252
* gcc.dg/vect/vect-complex-1.c, gcc.dg/vect/vect-complex-2.c,
gcc.dg/vect/fast-math-vect-complex-3.c,
gcc.dg/vect/vect-complex-4.c: New tests.
Added:
trunk/gcc/testsuite/gcc.dg/vect/fast-math-vect-complex-3.c
trunk/gcc/testsuite/gcc.dg/vect/vect-complex-1.c
trunk/gcc/testsuite/gcc.dg/vect/vect-complex-2.c
trunk/gcc/testsuite/gcc.dg/vect/vect-complex-4.c
Modified:
trunk/gcc/ChangeLog
trunk/gcc/testsuite/ChangeLog
trunk/gcc/tree-vect-analyze.c
* [Bug tree-optimization/35252] No vectorization for complex arrays
From: victork at gcc dot gnu dot org @ 2008-07-29 21:55 UTC (permalink / raw)
To: gcc-bugs
------- Comment #3 from victork at gcc dot gnu dot org 2008-07-29 21:54 -------
Revision 138198 fixes vectorization of complex addition. Vectorization of
complex multiplication works on PowerPC; on x86 it is a known issue, see
PR30211.
I'm closing this bug as a duplicate of PR30211.
*** This bug has been marked as a duplicate of 30211 ***
--
victork at gcc dot gnu dot org changed:
           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|ASSIGNED                    |RESOLVED
         Resolution|                            |DUPLICATE
* [Bug tree-optimization/35252] No vectorization for complex arrays
From: rguenth at gcc dot gnu dot org @ 2008-08-02 12:08 UTC (permalink / raw)
To: gcc-bugs
------- Comment #4 from rguenth at gcc dot gnu dot org 2008-08-02 12:06 -------
Subject: Bug 35252
Author: rguenth
Date: Sat Aug 2 12:05:47 2008
New Revision: 138553
URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=138553
Log:
2008-08-02 Richard Guenther <rguenther@suse.de>
PR target/35252
* config/i386/sse.md (SSEMODE4S, SSEMODE2D): New mode iterators.
(ssedoublesizemode): New mode attribute.
(sse_shufps): Call gen_sse_shufps_v4sf.
(sse_shufps_1): Macroize.
(sse2_shufpd): Call gen_sse_shufpd_v2df.
(sse2_shufpd_1): Macroize.
(vec_extract_odd, vec_extract_even): New expanders.
(vec_interleave_highv4sf, vec_interleave_lowv4sf,
vec_interleave_highv2df, vec_interleave_lowv2df): Likewise.
* i386.c (ix86_expand_vector_init_one_nonzero): Call
gen_sse_shufps_v4sf instead of gen_sse_shufps_1.
(ix86_expand_vector_set): Likewise.
(ix86_expand_reduc_v4sf): Likewise.
* lib/target-supports.exp (vect_extract_even_odd_wide) Add.
(vect_strided_wide): Likewise.
* gcc.dg/vect/fast-math-pr35982.c: Enable for
vect_extract_even_odd_wide.
* gcc.dg/vect/fast-math-vect-complex-3.c: Likewise.
* gcc.dg/vect/vect-1.c: Likewise.
* gcc.dg/vect/vect-107.c: Likewise.
* gcc.dg/vect/vect-98.c: Likewise.
* gcc.dg/vect/vect-strided-float.c: Likewise.
* gcc.dg/vect/slp-11.c: Enable for vect_strided_wide.
* gcc.dg/vect/slp-12a.c: Likewise.
* gcc.dg/vect/slp-12b.c: Likewise.
* gcc.dg/vect/slp-19.c: Likewise.
* gcc.dg/vect/slp-23.c: Likewise.
* gcc.dg/vect/slp-5.c: Likewise.
Modified:
trunk/gcc/ChangeLog
trunk/gcc/config/i386/i386.c
trunk/gcc/config/i386/sse.md
trunk/gcc/testsuite/ChangeLog
trunk/gcc/testsuite/gcc.dg/vect/fast-math-pr35982.c
trunk/gcc/testsuite/gcc.dg/vect/fast-math-vect-complex-3.c
trunk/gcc/testsuite/gcc.dg/vect/slp-11.c
trunk/gcc/testsuite/gcc.dg/vect/slp-12a.c
trunk/gcc/testsuite/gcc.dg/vect/slp-12b.c
trunk/gcc/testsuite/gcc.dg/vect/slp-19.c
trunk/gcc/testsuite/gcc.dg/vect/slp-23.c
trunk/gcc/testsuite/gcc.dg/vect/slp-5.c
trunk/gcc/testsuite/gcc.dg/vect/vect-1.c
trunk/gcc/testsuite/gcc.dg/vect/vect-107.c
trunk/gcc/testsuite/gcc.dg/vect/vect-98.c
trunk/gcc/testsuite/gcc.dg/vect/vect-strided-float.c
trunk/gcc/testsuite/lib/target-supports.exp
* [Bug tree-optimization/35252] No vectorization for complex arrays
From: ubizjak at gmail dot com @ 2008-08-04 11:45 UTC (permalink / raw)
To: gcc-bugs
------- Comment #5 from ubizjak at gmail dot com 2008-08-04 11:43 -------
Hm, the following testcase isn't vectorized because of the vect cost model
(-O2 -msse3 -ftree-vectorize -ffast-math) on an i686 target:
--cut here--
void testf(void)
{
  int i;

  for (i = 0; i < 16; i++)
    cf[i] = af[i] + bf[i];
}
--cut here--
Compilation reports:
pr30211.c:8: note: vectorization_factor = 2, niters = 16
pr30211.c:8: note: === vect_update_slp_costs_according_to_vf ===
pr30211.c:8: note: cost model: vector iteration cost = 16 is divisible by
scalar iteration cost = 8 by a factor greater than or equal to the
vectorization factor = 2 .
pr30211.c:8: note: not vectorized: vectorization not profitable.
pr30211.c:8: note: not vectorized: vector version will never be profitable.
However, without the cost model, the loop in this testcase compiles to:
.L2:
movaps bf(%eax), %xmm0
addps af(%eax), %xmm0
movaps %xmm0, cf(%eax)
addl $16, %eax
cmpl $128, %eax
jne .L2
which is IMO faster than the equivalent scalar version:
.L2:
movss bf+4(,%eax,8), %xmm1
addss af+4(,%eax,8), %xmm1
movss bf(,%eax,8), %xmm0
addss af(,%eax,8), %xmm0
movss %xmm0, cf(,%eax,8)
movss %xmm1, cf+4(,%eax,8)
addl $1, %eax
cmpl $16, %eax
jne .L2
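For reference, the rejection in the compilation report above can be read as a simple comparison. The sketch below is one plausible reading of that diagnostic, not GCC's actual implementation; the function name vector_never_profitable and its parameters are assumptions made for illustration.

```c
#include <stdbool.h>

/* Hypothetical reading of the "never be profitable" diagnostic:
   one vector iteration replaces VF scalar iterations, so if it costs
   at least VF times a scalar iteration there is nothing to gain.
   This is a sketch, not GCC's code.  */
bool vector_never_profitable(int vec_iter_cost, int scalar_iter_cost, int vf)
{
  return vec_iter_cost >= scalar_iter_cost * vf;
}
```

With the reported values (vector iteration cost 16, scalar iteration cost 8, vectorization factor 2) the comparison is 16 >= 16, so the loop is rejected. Any lower vector cost estimate, for example a purely illustrative 8, would pass this check.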
* [Bug tree-optimization/35252] No vectorization for complex arrays
From: victork at gcc dot gnu dot org @ 2008-08-05 8:18 UTC (permalink / raw)
To: gcc-bugs
------- Comment #6 from victork at gcc dot gnu dot org 2008-08-05 08:16 -------
> Hm, the following testcase isn't vectorized because of the vect cost model
> (-O2 -msse3 -ftree-vectorize -ffast-math) on an i686 target:
The problem is that we count some costs twice: once as vectorized by SLP and
once as non-SLP. I'm going to submit a patch to fix this.