public inbox for gcc-bugs@sourceware.org
* [Bug tree-optimization/106010] New: Miss vectorization for complex type copy.
@ 2022-06-17  4:37 crazylht at gmail dot com
  2022-06-20 10:10 ` [Bug middle-end/106010] " rguenth at gcc dot gnu.org
                   ` (7 more replies)
  0 siblings, 8 replies; 9+ messages in thread
From: crazylht at gmail dot com @ 2022-06-17  4:37 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106010

            Bug ID: 106010
           Summary: Miss vectorization for complex type copy.
           Product: gcc
           Version: 13.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: tree-optimization
          Assignee: unassigned at gcc dot gnu.org
          Reporter: crazylht at gmail dot com
  Target Milestone: ---

This is from PR105923

void
foo (_Complex double *p, _Complex double* q)
{
    for (int i = 0; i != 100000; i++)
      p[i] = q[i];
}

gcc generates

foo(double _Complex*, double _Complex*):
        xor     eax, eax
.L2:
        vmovsd  xmm1, QWORD PTR [rsi+rax]
        vmovsd  xmm0, QWORD PTR [rsi+8+rax]
        vmovsd  QWORD PTR [rdi+rax], xmm1
        vmovsd  QWORD PTR [rdi+8+rax], xmm0
        add     rax, 16
        cmp     rax, 1600000
        jne     .L2
        ret

llvm generates:

foo(double _Complex*, double _Complex*):        # @foo(double _Complex*, double _Complex*)
        xor     eax, eax
.LBB0_1:                                # =>This Inner Loop Header: Depth=1
        movups  xmm0, xmmword ptr [rsi + rax]
        movups  xmmword ptr [rdi + rax], xmm0
        add     rax, 16
        cmp     rax, 1600000
        jne     .LBB0_1
        ret

The vectorizer fails because get_related_vectype_for_scalar_type fails for
complex types.
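
For illustration, the copy in the report can be written by hand in a lowered
form, with two independent double copies per element. This is only a sketch:
the names copy_complex and copy_complex_lowered are hypothetical, and
__real__ / __imag__ are GNU C extensions accepted by GCC and Clang.

```c
#include <stddef.h>

/* Original shape from the report: one _Complex double move per iteration.  */
void
copy_complex (_Complex double *p, const _Complex double *q, size_t n)
{
  for (size_t i = 0; i != n; i++)
    p[i] = q[i];
}

/* Hand-lowered shape: each complex move becomes two scalar double moves.
   __real__ and __imag__ are GNU extensions.  */
void
copy_complex_lowered (_Complex double *p, const _Complex double *q, size_t n)
{
  for (size_t i = 0; i != n; i++)
    {
      __real__ p[i] = __real__ q[i];
      __imag__ p[i] = __imag__ q[i];
    }
}
```

Both functions produce the same result; the second only makes the two
scalar accesses per element explicit.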


* [Bug middle-end/106010] Miss vectorization for complex type copy.
From: rguenth at gcc dot gnu.org @ 2022-06-20 10:10 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106010

Richard Biener <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |NEW
             Target|                            |x86_64-*-*
   Last reconfirmed|                            |2022-06-20
          Component|tree-optimization           |middle-end
     Ever confirmed|0                           |1

--- Comment #1 from Richard Biener <rguenth at gcc dot gnu.org> ---
We RTL expand

;; _5 = MEM[(complex double *)q_9(D) + ivtmp.12_14 * 1];

(insn 9 8 10 (set (reg:DF 82 [ _5 ])
        (mem:DF (plus:DI (reg/v/f:DI 86 [ q ])
                (reg:DI 84 [ ivtmp.12 ])) [1 MEM[(complex double *)q_9(D) + ivtmp.12_14 * 1]+0 S8 A64])) "t.c":5:15 -1
     (nil))

(insn 10 9 0 (set (reg:DF 83 [ _5+8 ])
        (mem:DF (plus:DI (plus:DI (reg/v/f:DI 86 [ q ])
                    (reg:DI 84 [ ivtmp.12 ]))
                (const_int 8 [0x8])) [1 MEM[(complex double *)q_9(D) + ivtmp.12_14 * 1]+8 S8 A64])) "t.c":5:15 -1
     (nil))

;; MEM[(complex double *)p_10(D) + ivtmp.12_14 * 1] = _5;

(insn 11 10 12 (set (mem:DF (plus:DI (reg/v/f:DI 85 [ p ])
                (reg:DI 84 [ ivtmp.12 ])) [1 MEM[(complex double *)p_10(D) + ivtmp.12_14 * 1]+0 S8 A64])
        (reg:DF 82 [ _5 ])) "t.c":5:12 -1
     (nil))

(insn 12 11 0 (set (mem:DF (plus:DI (plus:DI (reg/v/f:DI 85 [ p ])
                    (reg:DI 84 [ ivtmp.12 ]))
                (const_int 8 [0x8])) [1 MEM[(complex double *)p_10(D) + ivtmp.12_14 * 1]+8 S8 A64])
        (reg:DF 83 [ _5+8 ])) "t.c":5:12 -1
     (nil))

This is likely because we assign (concat:DC ...) to the pseudos instead of
using xmm regs.  So for the copy case that's a target issue IMHO.

One could argue that without move patterns for complex we should eventually
lower memory accesses like we lower arithmetic.  Note as soon as we do

    for (int i = 0; i != 100000; i++)
      p[i] = q[i] + 1.;

we do get the memory accesses lowered and the code vectorized.
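
The two loop shapes being compared can be put side by side as below. This is
a sketch with hypothetical function names (copy_only, copy_plus_one). One way
to see which loops get vectorized is GCC's -fopt-info-vec option together
with -O3; what it reports depends on the GCC version and target.

```c
/* Plain copy: the only statement is a complex-typed move, which was
   not lowered at the time of this report.  */
void
copy_only (_Complex double *p, const _Complex double *q, int n)
{
  for (int i = 0; i != n; i++)
    p[i] = q[i];
}

/* Copy plus arithmetic: the addition is lowered to operations on the
   real and imaginary parts, and the memory accesses are lowered with it.  */
void
copy_plus_one (_Complex double *p, const _Complex double *q, int n)
{
  for (int i = 0; i != n; i++)
    p[i] = q[i] + 1.;
}
```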

There are extra complications with _Complex arguments, where we do _not_
want to lower the loads (without further thought):

  foo (p[i]);

for

  foo (p[i] + 1.);

we get

  _6 = IMAGPART_EXPR <*_3>;
  _4 = REALPART_EXPR <*_3>;
  _5 = _4 + 1.0e+0;
  _7 = COMPLEX_EXPR <_5, _6>;
  bar (_7);

which is also similar to what we expand foo (p[i]) to.
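
Read as C, the GIMPLE above corresponds roughly to the following. This is a
hand-written sketch, not compiler output: bar is a hypothetical callee that
just returns its argument, call_plus_one is a hypothetical wrapper, and
__real__ / __imag__ are GNU C extensions.

```c
/* Hypothetical callee standing in for the call in the dump; it returns
   its argument so the result can be inspected.  */
static _Complex double
bar (_Complex double z)
{
  return z;
}

/* Mirrors the GIMPLE: load both parts, add 1.0 to the real part only,
   rebuild the complex value and pass it to the callee.  */
_Complex double
call_plus_one (const _Complex double *p, int i)
{
  double im = __imag__ p[i];   /* _6 = IMAGPART_EXPR <*_3>;  */
  double re = __real__ p[i];   /* _4 = REALPART_EXPR <*_3>;  */
  double sum = re + 1.0;       /* _5 = _4 + 1.0e+0;  */
  _Complex double z;           /* _7 = COMPLEX_EXPR <_5, _6>;  */
  __real__ z = sum;
  __imag__ z = im;
  return bar (z);              /* bar (_7);  */
}
```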


* [Bug middle-end/106010] Miss vectorization for complex type copy.
From: crazylht at gmail dot com @ 2022-06-23  6:44 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106010

--- Comment #2 from Hongtao.liu <crazylht at gmail dot com> ---
It seems the loop vectorizer assumes the unroll factor to be the number of
elements, similar to the group size for SLP.
I'm trying to handle this for the case where the scalar type is COMPLEX; not
sure if there are other issues. It looks to me like the complex type only
exists for movement (parameter passing, load, store); any real operations
will be lowered to operations on the imaginary and real parts.
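
The element-count point can be pictured by viewing n complex values as 2*n
doubles: C specifies that a complex type has the same representation and
alignment as an array of two elements of the corresponding real type. The
sketch below (hypothetical name copy_as_doubles) only illustrates that
layout, with each complex value acting as a group of two elements; it is not
how the vectorizer is implemented.

```c
#include <stddef.h>

void
copy_as_doubles (_Complex double *p, const _Complex double *q, size_t n)
{
  double *dst = (double *) p;
  const double *src = (const double *) q;

  /* Each complex element contributes a group of two doubles:
     index 2*i holds the real part, 2*i + 1 the imaginary part.  */
  for (size_t i = 0; i != 2 * n; i++)
    dst[i] = src[i];
}
```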


* [Bug middle-end/106010] Miss vectorization for complex type copy.
From: crazylht at gmail dot com @ 2022-06-23  6:45 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106010

--- Comment #3 from Hongtao.liu <crazylht at gmail dot com> ---
(In reply to Hongtao.liu from comment #2)
> It seems the loop vectorizer assumes the unroll factor to be the number of
> elements, similar to the group size for SLP.
> I'm trying to handle this for the case where the scalar type is COMPLEX; not
> sure if there are other issues. It looks to me like the complex type only
> exists for movement (parameter passing, load, store); any real operations
> will be lowered to operations on the imaginary and real parts.

Maybe add a new member bool complex_p in stmt_vec_info.


* [Bug middle-end/106010] Miss vectorization for complex type copy.
From: pinskia at gcc dot gnu.org @ 2022-07-11 17:32 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106010

Andrew Pinski <pinskia at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |zhongyunde at huawei dot com

--- Comment #4 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
*** Bug 106254 has been marked as a duplicate of this bug. ***


* [Bug middle-end/106010] Miss vectorization for complex type copy.
From: cvs-commit at gcc dot gnu.org @ 2022-07-20  8:07 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106010

--- Comment #5 from CVS Commits <cvs-commit at gcc dot gnu.org> ---
The master branch has been updated by hongtao Liu <liuhongt@gcc.gnu.org>:

https://gcc.gnu.org/g:f9d4c3b45c5ed5f45c8089c990dbd4e181929c3d

commit r13-1762-gf9d4c3b45c5ed5f45c8089c990dbd4e181929c3d
Author: liuhongt <hongtao.liu@intel.com>
Date:   Tue Jul 19 17:24:52 2022 +0800

    Lower complex type move to enable vectorization for complex type load&store.

    2022-07-20  Richard Biener  <richard.guenther@gmail.com>
                Hongtao Liu  <hongtao.liu@intel.com>

    gcc/ChangeLog:

            PR tree-optimization/106010
            * tree-complex.cc (init_dont_simulate_again): Lower complex
            type move.
            (expand_complex_move): Also expand COMPLEX_CST for rhs.

    gcc/testsuite/ChangeLog:

            * gcc.target/i386/pr106010-1a.c: New test.
            * gcc.target/i386/pr106010-1b.c: New test.
            * gcc.target/i386/pr106010-1c.c: New test.
            * gcc.target/i386/pr106010-2a.c: New test.
            * gcc.target/i386/pr106010-2b.c: New test.
            * gcc.target/i386/pr106010-2c.c: New test.
            * gcc.target/i386/pr106010-3a.c: New test.
            * gcc.target/i386/pr106010-3b.c: New test.
            * gcc.target/i386/pr106010-3c.c: New test.
            * gcc.target/i386/pr106010-4a.c: New test.
            * gcc.target/i386/pr106010-4b.c: New test.
            * gcc.target/i386/pr106010-4c.c: New test.
            * gcc.target/i386/pr106010-5a.c: New test.
            * gcc.target/i386/pr106010-5b.c: New test.
            * gcc.target/i386/pr106010-5c.c: New test.
            * gcc.target/i386/pr106010-6a.c: New test.
            * gcc.target/i386/pr106010-6b.c: New test.
            * gcc.target/i386/pr106010-6c.c: New test.
            * gcc.target/i386/pr106010-7a.c: New test.
            * gcc.target/i386/pr106010-7b.c: New test.
            * gcc.target/i386/pr106010-7c.c: New test.
            * gcc.target/i386/pr106010-8a.c: New test.
            * gcc.target/i386/pr106010-8b.c: New test.
            * gcc.target/i386/pr106010-8c.c: New test.
            * gcc.target/i386/pr106010-9a.c: New test.
            * gcc.target/i386/pr106010-9b.c: New test.
            * gcc.target/i386/pr106010-9c.c: New test.
            * gcc.target/i386/pr106010-9d.c: New test.
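
The COMPLEX_CST part of the ChangeLog concerns complex constants on the
right-hand side of a move. As a rough, hand-written picture of what lowering
such a store means (hypothetical names fill_const and fill_const_lowered;
the 2.0i suffix and __real__ / __imag__ are GNU C extensions):

```c
#include <stddef.h>

/* Store a complex constant as one complex-typed move.  */
void
fill_const (_Complex double *p, size_t n)
{
  for (size_t i = 0; i != n; i++)
    p[i] = 1.0 + 2.0i;   /* 2.0i is a GNU imaginary-constant extension */
}

/* Hand-lowered form: the constant is split into its two scalar parts,
   each stored separately.  */
void
fill_const_lowered (_Complex double *p, size_t n)
{
  for (size_t i = 0; i != n; i++)
    {
      __real__ p[i] = 1.0;
      __imag__ p[i] = 2.0;
    }
}
```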


* [Bug middle-end/106010] Miss vectorization for complex type copy.
From: crazylht at gmail dot com @ 2022-07-20  8:15 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106010

Hongtao.liu <crazylht at gmail dot com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
         Resolution|---                         |FIXED
             Status|NEW                         |RESOLVED

--- Comment #6 from Hongtao.liu <crazylht at gmail dot com> ---
Fixed in GCC 13. I'll revisit PR105923 for complex type libmvec.


* [Bug middle-end/106010] Miss vectorization for complex type copy.
From: cvs-commit at gcc dot gnu.org @ 2022-07-22  2:08 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106010

--- Comment #7 from CVS Commits <cvs-commit at gcc dot gnu.org> ---
The master branch has been updated by hongtao Liu <liuhongt@gcc.gnu.org>:

https://gcc.gnu.org/g:1cc0e9a46e79e7ccc7f904b951e369b2b2647567

commit r13-1791-g1cc0e9a46e79e7ccc7f904b951e369b2b2647567
Author: liuhongt <hongtao.liu@intel.com>
Date:   Fri Jul 22 09:54:52 2022 +0800

    Adjust testcase.

    r13-1762-gf9d4c3b45c5ed5f45c8089c990dbd4e181929c3d lowers complex type
    moves to scalars, but testcase pr23911 is supposed to scan for a
    __complex__ constant, which is then never available, so adjust the
    testcase to scan the IMAGPART/REALPART_EXPR constants separately.

    gcc/testsuite/ChangeLog

            PR tree-optimization/106010
            * gcc.dg/pr23911.c: Scan IMAGPART/REALPART_EXPR = ** instead
            of __complex__ since COMPLEX_CST is lowered to scalars.


* [Bug middle-end/106010] Miss vectorization for complex type copy.
From: linkw at gcc dot gnu.org @ 2022-07-22  3:23 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106010

Kewen Lin <linkw at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |seurer at gcc dot gnu.org

--- Comment #8 from Kewen Lin <linkw at gcc dot gnu.org> ---
*** Bug 106396 has been marked as a duplicate of this bug. ***

