[Bug tree-optimization/35252] New: No vectorization for complex arrays

public inbox for gcc-bugs@sourceware.org

* [Bug tree-optimization/35252]  New: No vectorization for complex arrays
@ 2008-02-19 10:20 ubizjak at gmail dot com
  2008-03-12  6:05 ` [Bug tree-optimization/35252] " victork at gcc dot gnu dot org
                   ` (5 more replies)
  0 siblings, 6 replies; 7+ messages in thread
From: ubizjak at gmail dot com @ 2008-02-19 10:20 UTC (permalink / raw)
  To: gcc-bugs

This testcase produces suboptimal code:

_Complex float af[16], bf[16], cf[16];
_Complex double ad[16], bd[16], cd[16];

void testf(void)
{
  int i;

  for (i = 0; i < 16; i++)
    cf[i] = af[i] * bf[i];
}

void testd(void)
{
  int i;

  for (i = 0; i < 16; i++)
    cd[i] = ad[i] + bd[i];
}

gcc -O2 -ftree-vectorize -msse2:

testd:
        xorl    %eax, %eax
        .p2align 4,,7
        .p2align 3
.L7:
        movsd   ad+8(%eax), %xmm1
        movsd   ad(%eax), %xmm0
        addsd   bd+8(%eax), %xmm1
        addsd   bd(%eax), %xmm0
        movsd   %xmm1, cd+8(%eax)
        movsd   %xmm0, cd(%eax)
        addl    $16, %eax
        cmpl    $256, %eax
        jne     .L7
        rep
        ret

And with -ffast-math:

testf:
        xorl    %eax, %eax
        .p2align 4,,7
        .p2align 3
.L2:
        movss   bf(,%eax,8), %xmm2
        movss   bf+4(,%eax,8), %xmm3
        movss   af(,%eax,8), %xmm5
        movss   af+4(,%eax,8), %xmm4
        movaps  %xmm2, %xmm0
        movaps  %xmm3, %xmm1
        mulss   %xmm5, %xmm0
        mulss   %xmm4, %xmm1
        mulss   %xmm4, %xmm2
        mulss   %xmm5, %xmm3
        subss   %xmm1, %xmm0
        addss   %xmm3, %xmm2
        movss   %xmm0, cf(,%eax,8)
        movss   %xmm2, cf+4(,%eax,8)
        addl    $1, %eax
        cmpl    $16, %eax
        jne     .L2
        rep
        ret

Note that we can use the SSE3 addsubps insn in the latter case.
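
To illustrate the addsubps remark: the instruction subtracts in the even
lanes and adds in the odd lanes, which matches the real/imaginary pattern of
a complex product. Below is a minimal portable sketch that emulates this data
flow on plain float[4] arrays. The helper names addsub4 and cmul2 are made up
for this illustration; they are not GCC or intrinsic names, and real code
would use vector registers instead of arrays.

```c
/* Emulates the lane behaviour of SSE3 addsubps on four floats:
   subtract in even lanes, add in odd lanes.  */
static void addsub4(const float x[4], const float y[4], float out[4])
{
  out[0] = x[0] - y[0];
  out[1] = x[1] + y[1];
  out[2] = x[2] - y[2];
  out[3] = x[3] + y[3];
}

/* Multiplies two pairs of complex floats stored interleaved as
   re, im, re, im.  Hypothetical helper for illustration only.  */
static void cmul2(const float a[4], const float b[4], float c[4])
{
  float t1[4], t2[4];
  int i;

  for (i = 0; i < 4; i += 2)
    {
      t1[i]     = a[i]     * b[i];      /* ar * br */
      t1[i + 1] = a[i + 1] * b[i];      /* ai * br */
      t2[i]     = a[i + 1] * b[i + 1];  /* ai * bi */
      t2[i + 1] = a[i]     * b[i + 1];  /* ar * bi */
    }

  /* Even lanes: ar*br - ai*bi (real part).
     Odd lanes:  ai*br + ar*bi (imaginary part).  */
  addsub4(t1, t2, c);
}
```

With a = {1, 2, 3, 4} and b = {5, 6, 7, 8}, i.e. (1+2i)(5+6i) and
(3+4i)(7+8i), cmul2 produces {-7, 16, -11, 52}.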


-- 
           Summary: No vectorization for complex arrays
           Product: gcc
           Version: 4.4.0
            Status: UNCONFIRMED
          Severity: enhancement
          Priority: P3
         Component: tree-optimization
        AssignedTo: unassigned at gcc dot gnu dot org
        ReportedBy: ubizjak at gmail dot com
OtherBugsDependingOnThis: 31485


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=35252



* [Bug tree-optimization/35252] No vectorization for complex arrays
  2008-02-19 10:20 [Bug tree-optimization/35252] New: No vectorization for complex arrays ubizjak at gmail dot com
@ 2008-03-12  6:05 ` victork at gcc dot gnu dot org
  2008-07-27 21:46 ` victork at gcc dot gnu dot org
                   ` (4 subsequent siblings)
  5 siblings, 0 replies; 7+ messages in thread
From: victork at gcc dot gnu dot org @ 2008-03-12  6:05 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #1 from victork at gcc dot gnu dot org  2008-03-12 06:05 -------
We don't recognize REALPART_EXPR and IMAGPART_EXPR in the vectorizer.

These should be recognized as load operations:
  CR.39_21 = REALPART_EXPR <ad[i_17]>;
  CI.40_22 = IMAGPART_EXPR <ad[i_17]>;
  CR.41_23 = REALPART_EXPR <bd[i_17]>;
  CI.42_24 = IMAGPART_EXPR <bd[i_17]>;

These should be recognized as store operations:
  REALPART_EXPR <cd[i_17]> = CR.43_25;
  IMAGPART_EXPR <cd[i_17]> = CI.44_26;
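
In source terms, the lowered form the vectorizer has to handle is roughly the
loop below. This is only a sketch written with the GNU C __real__ and
__imag__ extensions; the function name testd_lowered and the temporaries are
illustrative and do not come from the GIMPLE dump.

```c
_Complex double ad[16], bd[16], cd[16];

/* Component-wise form of cd[i] = ad[i] + bd[i].  */
void testd_lowered(void)
{
  int i;

  for (i = 0; i < 16; i++)
    {
      /* Four scalar loads, corresponding to REALPART_EXPR and
         IMAGPART_EXPR of ad[i] and bd[i].  */
      double ar = __real__ ad[i];
      double ai = __imag__ ad[i];
      double br = __real__ bd[i];
      double bi = __imag__ bd[i];

      /* Two scalar stores, corresponding to REALPART_EXPR and
         IMAGPART_EXPR of cd[i] on the left-hand side.  */
      __real__ cd[i] = ar + br;
      __imag__ cd[i] = ai + bi;
    }
}
```

Each real/imaginary pair touches adjacent memory, so the six accesses per
iteration form two grouped loads and one grouped store.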


-- 

victork at gcc dot gnu dot org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
         AssignedTo|unassigned at gcc dot gnu   |victork at gcc dot gnu dot
                   |dot org                     |org
             Status|UNCONFIRMED                 |ASSIGNED
     Ever Confirmed|0                           |1
   Last reconfirmed|0000-00-00 00:00:00         |2008-03-12 06:05:04
               date|                            |


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=35252



* [Bug tree-optimization/35252] No vectorization for complex arrays
  2008-02-19 10:20 [Bug tree-optimization/35252] New: No vectorization for complex arrays ubizjak at gmail dot com
  2008-03-12  6:05 ` [Bug tree-optimization/35252] " victork at gcc dot gnu dot org
@ 2008-07-27 21:46 ` victork at gcc dot gnu dot org
  2008-07-29 21:55 ` victork at gcc dot gnu dot org
                   ` (3 subsequent siblings)
  5 siblings, 0 replies; 7+ messages in thread
From: victork at gcc dot gnu dot org @ 2008-07-27 21:46 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #2 from victork at gcc dot gnu dot org  2008-07-27 21:45 -------
Subject: Bug 35252

Author: victork
Date: Sun Jul 27 21:44:25 2008
New Revision: 138198

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=138198
Log:
2008-07-27  Victor Kaplansky  <victork@il.ibm.com>

        PR tree-optimization/35252
        * tree-vect-analyze.c (vect_build_slp_tree): Make IMAGPART_EXPR and
        REALPART_EXPR to be considered as same load operation.

testsuite

        PR tree-optimization/35252
        * gcc.dg/vect/vect-complex-1.c, gcc.dg/vect/vect-complex-2.c,
        gcc.dg/vect/fast-math-vect-complex-3.c,
        gcc.dg/vect/vect-complex-4.c: New tests.


Added:
    trunk/gcc/testsuite/gcc.dg/vect/fast-math-vect-complex-3.c
    trunk/gcc/testsuite/gcc.dg/vect/vect-complex-1.c
    trunk/gcc/testsuite/gcc.dg/vect/vect-complex-2.c
    trunk/gcc/testsuite/gcc.dg/vect/vect-complex-4.c
Modified:
    trunk/gcc/ChangeLog
    trunk/gcc/testsuite/ChangeLog
    trunk/gcc/tree-vect-analyze.c


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=35252



* [Bug tree-optimization/35252] No vectorization for complex arrays
  2008-02-19 10:20 [Bug tree-optimization/35252] New: No vectorization for complex arrays ubizjak at gmail dot com
  2008-03-12  6:05 ` [Bug tree-optimization/35252] " victork at gcc dot gnu dot org
  2008-07-27 21:46 ` victork at gcc dot gnu dot org
@ 2008-07-29 21:55 ` victork at gcc dot gnu dot org
  2008-08-02 12:08 ` rguenth at gcc dot gnu dot org
                   ` (2 subsequent siblings)
  5 siblings, 0 replies; 7+ messages in thread
From: victork at gcc dot gnu dot org @ 2008-07-29 21:55 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #3 from victork at gcc dot gnu dot org  2008-07-29 21:54 -------
Revision 138198 fixes vectorization of complex addition. Vectorization of
complex multiplication works on PowerPC; on x86 it is a known issue, see
PR30211.

I'm closing this bugzilla as a duplicate of PR30211.

*** This bug has been marked as a duplicate of 30211 ***


-- 

victork at gcc dot gnu dot org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|ASSIGNED                    |RESOLVED
         Resolution|                            |DUPLICATE


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=35252



* [Bug tree-optimization/35252] No vectorization for complex arrays
  2008-02-19 10:20 [Bug tree-optimization/35252] New: No vectorization for complex arrays ubizjak at gmail dot com
                   ` (2 preceding siblings ...)
  2008-07-29 21:55 ` victork at gcc dot gnu dot org
@ 2008-08-02 12:08 ` rguenth at gcc dot gnu dot org
  2008-08-04 11:45 ` ubizjak at gmail dot com
  2008-08-05  8:18 ` victork at gcc dot gnu dot org
  5 siblings, 0 replies; 7+ messages in thread
From: rguenth at gcc dot gnu dot org @ 2008-08-02 12:08 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #4 from rguenth at gcc dot gnu dot org  2008-08-02 12:06 -------
Subject: Bug 35252

Author: rguenth
Date: Sat Aug  2 12:05:47 2008
New Revision: 138553

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=138553
Log:
2008-08-02  Richard Guenther  <rguenther@suse.de>

        PR target/35252
        * config/i386/sse.md (SSEMODE4S, SSEMODE2D): New mode iterators.
        (ssedoublesizemode): New mode attribute.
        (sse_shufps): Call gen_sse_shufps_v4sf.
        (sse_shufps_1): Macroize.
        (sse2_shufpd): Call gen_sse_shufpd_v2df.
        (sse2_shufpd_1): Macroize.
        (vec_extract_odd, vec_extract_even): New expanders.
        (vec_interleave_highv4sf, vec_interleave_lowv4sf,
        vec_interleave_highv2df, vec_interleave_lowv2df): Likewise.
        * i386.c (ix86_expand_vector_init_one_nonzero): Call
        gen_sse_shufps_v4sf instead of gen_sse_shufps_1.
        (ix86_expand_vector_set): Likewise.
        (ix86_expand_reduc_v4sf): Likewise.

        * lib/target-supports.exp (vect_extract_even_odd_wide): Add.
        (vect_strided_wide): Likewise.
        * gcc.dg/vect/fast-math-pr35982.c: Enable for
        vect_extract_even_odd_wide.
        * gcc.dg/vect/fast-math-vect-complex-3.c: Likewise.
        * gcc.dg/vect/vect-1.c: Likewise.
        * gcc.dg/vect/vect-107.c: Likewise.
        * gcc.dg/vect/vect-98.c: Likewise.
        * gcc.dg/vect/vect-strided-float.c: Likewise.
        * gcc.dg/vect/slp-11.c: Enable for vect_strided_wide.
        * gcc.dg/vect/slp-12a.c: Likewise.
        * gcc.dg/vect/slp-12b.c: Likewise.
        * gcc.dg/vect/slp-19.c: Likewise.
        * gcc.dg/vect/slp-23.c: Likewise.
        * gcc.dg/vect/slp-5.c: Likewise.

Modified:
    trunk/gcc/ChangeLog
    trunk/gcc/config/i386/i386.c
    trunk/gcc/config/i386/sse.md
    trunk/gcc/testsuite/ChangeLog
    trunk/gcc/testsuite/gcc.dg/vect/fast-math-pr35982.c
    trunk/gcc/testsuite/gcc.dg/vect/fast-math-vect-complex-3.c
    trunk/gcc/testsuite/gcc.dg/vect/slp-11.c
    trunk/gcc/testsuite/gcc.dg/vect/slp-12a.c
    trunk/gcc/testsuite/gcc.dg/vect/slp-12b.c
    trunk/gcc/testsuite/gcc.dg/vect/slp-19.c
    trunk/gcc/testsuite/gcc.dg/vect/slp-23.c
    trunk/gcc/testsuite/gcc.dg/vect/slp-5.c
    trunk/gcc/testsuite/gcc.dg/vect/vect-1.c
    trunk/gcc/testsuite/gcc.dg/vect/vect-107.c
    trunk/gcc/testsuite/gcc.dg/vect/vect-98.c
    trunk/gcc/testsuite/gcc.dg/vect/vect-strided-float.c
    trunk/gcc/testsuite/lib/target-supports.exp


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=35252



* [Bug tree-optimization/35252] No vectorization for complex arrays
  2008-02-19 10:20 [Bug tree-optimization/35252] New: No vectorization for complex arrays ubizjak at gmail dot com
                   ` (3 preceding siblings ...)
  2008-08-02 12:08 ` rguenth at gcc dot gnu dot org
@ 2008-08-04 11:45 ` ubizjak at gmail dot com
  2008-08-05  8:18 ` victork at gcc dot gnu dot org
  5 siblings, 0 replies; 7+ messages in thread
From: ubizjak at gmail dot com @ 2008-08-04 11:45 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #5 from ubizjak at gmail dot com  2008-08-04 11:43 -------
Hm, the following testcase doesn't vectorize due to the vect cost model
(-O2 -msse3 -ftree-vectorize -ffast-math) on an i686 target:

--cut here--
void testf(void)
{
  int i;

  for (i = 0; i < 16; i++)
    cf[i] = af[i] + bf[i];
}
--cut here--


Compilation reports:

pr30211.c:8: note: vectorization_factor = 2, niters = 16
pr30211.c:8: note: === vect_update_slp_costs_according_to_vf ===
pr30211.c:8: note: cost model: vector iteration cost = 16 is divisible by
scalar iteration cost = 8 by a factor greater than or equal to the
vectorization factor = 2 .
pr30211.c:8: note: not vectorized: vectorization not profitable.
pr30211.c:8: note: not vectorized: vector version will never be profitable.
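
The rejection reported above can be read as a single comparison. A simplified
sketch, assuming the check is exactly what the diagnostic text says; this is
not GCC's actual code, and the function name is hypothetical:

```c
/* Returns nonzero when one vector iteration costs at least as much as
   the vf scalar iterations it replaces, i.e. when the vector version
   can never win.  Simplified illustration, not the GCC implementation.  */
static int vector_never_profitable(int vector_iter_cost,
                                   int scalar_iter_cost,
                                   int vf)
{
  return vector_iter_cost >= scalar_iter_cost * vf;
}
```

With the values from the dump, 16 >= 8 * 2 holds, so the loop is rejected.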

However, without the cost model the loop in this testcase compiles to:

.L2:
        movaps  bf(%eax), %xmm0
        addps   af(%eax), %xmm0
        movaps  %xmm0, cf(%eax)
        addl    $16, %eax
        cmpl    $128, %eax
        jne     .L2

which is IMO faster than the equivalent scalar version:

.L2:
        movss   bf+4(,%eax,8), %xmm1
        addss   af+4(,%eax,8), %xmm1
        movss   bf(,%eax,8), %xmm0
        addss   af(,%eax,8), %xmm0
        movss   %xmm0, cf(,%eax,8)
        movss   %xmm1, cf+4(,%eax,8)
        addl    $1, %eax
        cmpl    $16, %eax
        jne     .L2
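
The vector loop is this short because complex addition needs no shuffling:
real and imaginary parts are added independently, so an array of n complex
floats can be processed as 2*n plain floats. A minimal sketch of that view,
relying on the C99 rule that a complex type has the same representation as
an array of two elements of the corresponding real type; the function name
cadd_as_floats is made up for this example:

```c
#include <stddef.h>

/* Adds n complex floats by treating each array as 2*n plain floats
   (real part first, then imaginary part, for every element).  */
static void cadd_as_floats(const _Complex float *a,
                           const _Complex float *b,
                           _Complex float *c, size_t n)
{
  const float *fa = (const float *) a;
  const float *fb = (const float *) b;
  float *fc = (float *) c;
  size_t i;

  /* One flat loop over all components; this is the shape that maps
     onto a single packed add per four floats.  */
  for (i = 0; i < 2 * n; i++)
    fc[i] = fa[i] + fb[i];
}
```

For the 16-element arrays of the testcase this gives 32 float additions,
which matches the 128-byte trip count of the vectorized loop at 16 bytes per
iteration.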


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=35252



* [Bug tree-optimization/35252] No vectorization for complex arrays
  2008-02-19 10:20 [Bug tree-optimization/35252] New: No vectorization for complex arrays ubizjak at gmail dot com
                   ` (4 preceding siblings ...)
  2008-08-04 11:45 ` ubizjak at gmail dot com
@ 2008-08-05  8:18 ` victork at gcc dot gnu dot org
  5 siblings, 0 replies; 7+ messages in thread
From: victork at gcc dot gnu dot org @ 2008-08-05  8:18 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #6 from victork at gcc dot gnu dot org  2008-08-05 08:16 -------
> Hm, the following testcase doesn't vectorize due to the vect cost model
> (-O2 -msse3 -ftree-vectorize -ffast-math) on an i686 target:

The problem is that we count some costs twice: once as being vectorized by
SLP and once as non-SLP. I'm going to submit a patch to fix this.


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=35252

