public inbox for gcc-bugs@sourceware.org
* [Bug tree-optimization/106010] New: Miss vectorization for complex type copy.
From: crazylht at gmail dot com @ 2022-06-17 4:37 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106010
Bug ID: 106010
Summary: Miss vectorization for complex type copy.
Product: gcc
Version: 13.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: tree-optimization
Assignee: unassigned at gcc dot gnu.org
Reporter: crazylht at gmail dot com
Target Milestone: ---
This is from PR105923
void
foo (_Complex double *p, _Complex double* q)
{
  for (int i = 0; i != 100000; i++)
    p[i] = q[i];
}
gcc generates
foo(double _Complex*, double _Complex*):
        xor     eax, eax
.L2:
        vmovsd  xmm1, QWORD PTR [rsi+rax]
        vmovsd  xmm0, QWORD PTR [rsi+8+rax]
        vmovsd  QWORD PTR [rdi+rax], xmm1
        vmovsd  QWORD PTR [rdi+8+rax], xmm0
        add     rax, 16
        cmp     rax, 1600000
        jne     .L2
        ret
llvm generates:
foo(double _Complex*, double _Complex*):  # @foo(double _Complex*, double _Complex*)
        xor     eax, eax
.LBB0_1:                                  # =>This Inner Loop Header: Depth=1
        movups  xmm0, xmmword ptr [rsi + rax]
        movups  xmmword ptr [rdi + rax], xmm0
        add     rax, 16
        cmp     rax, 1600000
        jne     .LBB0_1
        ret
The vectorizer fails because get_related_vectype_for_scalar_type fails for
complex types.
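To try the loop from the report locally, it can be wrapped in a small self-checking file. This is only a sketch: the names copy_complex and check_copy_complex are illustrative and not from the report, and the compiler flags are an assumption since the report does not state them (something like `gcc -O3 -S` on x86_64 should show whether the loop body uses scalar or vector moves). The __real__ and __imag__ operators are GNU C extensions.

```c
#include <assert.h>

#define N 100000

/* Same loop as in the report: element-wise copy of complex doubles.  */
void
copy_complex (_Complex double *p, const _Complex double *q)
{
  for (int i = 0; i != N; i++)
    p[i] = q[i];
}

/* Illustrative self-check (not part of the report): fill the source,
   copy it, and verify both parts of every element.  */
int
check_copy_complex (void)
{
  static _Complex double src[N], dst[N];

  for (int i = 0; i != N; i++)
    {
      __real__ src[i] = i;
      __imag__ src[i] = -i;
    }

  copy_complex (dst, src);

  for (int i = 0; i != N; i++)
    assert (__real__ dst[i] == i && __imag__ dst[i] == -i);
  return 1;
}
```

The check only confirms that the copy is correct; whether it was vectorized has to be read off the generated assembly.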
* [Bug middle-end/106010] Miss vectorization for complex type copy.
From: rguenth at gcc dot gnu.org @ 2022-06-20 10:10 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106010
Richard Biener <rguenth at gcc dot gnu.org> changed:
           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |NEW
             Target|                            |x86_64-*-*
   Last reconfirmed|                            |2022-06-20
          Component|tree-optimization           |middle-end
     Ever confirmed|0                           |1
--- Comment #1 from Richard Biener <rguenth at gcc dot gnu.org> ---
We RTL expand
;; _5 = MEM[(complex double *)q_9(D) + ivtmp.12_14 * 1];
(insn 9 8 10 (set (reg:DF 82 [ _5 ])
(mem:DF (plus:DI (reg/v/f:DI 86 [ q ])
(reg:DI 84 [ ivtmp.12 ])) [1 MEM[(complex double *)q_9(D) +
ivtmp.12_14 * 1]+0 S8 A64])) "t.c":5:15 -1
(nil))
(insn 10 9 0 (set (reg:DF 83 [ _5+8 ])
(mem:DF (plus:DI (plus:DI (reg/v/f:DI 86 [ q ])
(reg:DI 84 [ ivtmp.12 ]))
(const_int 8 [0x8])) [1 MEM[(complex double *)q_9(D) +
ivtmp.12_14 * 1]+8 S8 A64])) "t.c":5:15 -1
(nil))
;; MEM[(complex double *)p_10(D) + ivtmp.12_14 * 1] = _5;
(insn 11 10 12 (set (mem:DF (plus:DI (reg/v/f:DI 85 [ p ])
(reg:DI 84 [ ivtmp.12 ])) [1 MEM[(complex double *)p_10(D) +
ivtmp.12_14 * 1]+0 S8 A64])
(reg:DF 82 [ _5 ])) "t.c":5:12 -1
(nil))
(insn 12 11 0 (set (mem:DF (plus:DI (plus:DI (reg/v/f:DI 85 [ p ])
(reg:DI 84 [ ivtmp.12 ]))
(const_int 8 [0x8])) [1 MEM[(complex double *)p_10(D) +
ivtmp.12_14 * 1]+8 S8 A64])
(reg:DF 83 [ _5+8 ])) "t.c":5:12 -1
(nil))
likely assigning (concat:CD ...) to the pseudos instead of using xmm regs.
So for the copy case that's a target issue IMHO.
One could argue that without move patterns for complex we should eventually
lower memory accesses like we lower arithmetic. Note as soon as we do
  for (int i = 0; i != 100000; i++)
    p[i] = q[i] + 1.;
we do get the memory accesses lowered and the code vectorized.
There are extra complications with _Complex arguments, where we do _not_
want to lower the loads (without further thought):
  foo (p[i]);
For
  foo (p[i] + 1.);
we get
_6 = IMAGPART_EXPR <*_3>;
_4 = REALPART_EXPR <*_3>;
_5 = _4 + 1.0e+0;
_7 = COMPLEX_EXPR <_5, _6>;
bar (_7);
which is also similar to what we expand foo (p[i]) to.
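The GIMPLE above can be mirrored at source level to see what the lowering does to a complex-plus-real addition. The following is a sketch, not GCC's implementation: add_one, add_one_lowered and check_add_one are made-up names, and __real__/__imag__ are GNU C extensions standing in for REALPART_EXPR/IMAGPART_EXPR.

```c
#include <assert.h>

/* Complex-typed form, as it would be written in the source.  */
_Complex double
add_one (_Complex double z)
{
  return z + 1.;
}

/* Hand-lowered form following the dump statement by statement.  */
_Complex double
add_one_lowered (_Complex double z)
{
  double im = __imag__ z;    /* _6 = IMAGPART_EXPR <*_3>;   */
  double re = __real__ z;    /* _4 = REALPART_EXPR <*_3>;   */
  double sum = re + 1.0;     /* _5 = _4 + 1.0e+0;           */
  _Complex double r;
  __real__ r = sum;          /* _7 = COMPLEX_EXPR <_5, _6>; */
  __imag__ r = im;
  return r;
}

/* Illustrative self-check: both forms agree on a sample value.  */
int
check_add_one (void)
{
  _Complex double z;
  __real__ z = 2.0;
  __imag__ z = 3.0;

  _Complex double a = add_one (z);
  _Complex double b = add_one_lowered (z);
  assert (__real__ a == 3.0 && __imag__ a == 3.0);
  assert (__real__ b == __real__ a && __imag__ b == __imag__ a);
  return 1;
}
```

Only the real part takes part in the addition; the imaginary part is passed through unchanged, which is why the arithmetic case ends up as plain scalar operations.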
* [Bug middle-end/106010] Miss vectorization for complex type copy.
From: crazylht at gmail dot com @ 2022-06-23 6:44 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106010
--- Comment #2 from Hongtao.liu <crazylht at gmail dot com> ---
It seems the loop vectorizer assumes the unroll factor to be the number of
elements, similar to the group size for SLP.
I'm trying to handle this for the case where the scalar type is COMPLEX; not
sure if there are other issues. It looks to me like the complex type only
exists for data movement (parameter passing, load, store); any real operators
will be lowered to operations on the imaginary and real parts.
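One way to picture the element-count point: a copy of n complex doubles moves the same bytes as a copy of 2*n doubles, because C specifies that a complex type has the same representation and alignment as a two-element array of the corresponding real type. The sketch below relies on that layout rule; copy_as_scalars and check_copy_as_scalars are hypothetical names, and this is a source-level analogy rather than what the vectorizer does internally.

```c
#include <assert.h>
#include <stddef.h>

/* Copy n complex values by viewing each one as two adjacent doubles
   (real part first, then imaginary part).  */
void
copy_as_scalars (_Complex double *p, const _Complex double *q, size_t n)
{
  double *dst = (double *) p;
  const double *src = (const double *) q;

  for (size_t i = 0; i != 2 * n; i++)
    dst[i] = src[i];
}

/* Illustrative self-check; __real__/__imag__ are GNU C extensions.  */
int
check_copy_as_scalars (void)
{
  _Complex double src[4], dst[4];

  for (int i = 0; i != 4; i++)
    {
      __real__ src[i] = i;
      __imag__ src[i] = -i;
    }

  copy_as_scalars (dst, src, 4);

  for (int i = 0; i != 4; i++)
    assert (__real__ dst[i] == i && __imag__ dst[i] == -i);
  return 1;
}
```

Written this way the loop has an ordinary double element type, so the number of scalar elements per iteration is what the vectorizer would normally expect.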
* [Bug middle-end/106010] Miss vectorization for complex type copy.
From: crazylht at gmail dot com @ 2022-06-23 6:45 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106010
--- Comment #3 from Hongtao.liu <crazylht at gmail dot com> ---
(In reply to Hongtao.liu from comment #2)
> It seems the loop vectorizer assumes the unroll factor to be the number of
> elements, similar to the group size for SLP.
> I'm trying to handle this for the case where the scalar type is COMPLEX; not
> sure if there are other issues. It looks to me like the complex type only
> exists for data movement (parameter passing, load, store); any real
> operators will be lowered to operations on the imaginary and real parts.
Maybe add a new member bool complex_p in stmt_vec_info.
* [Bug middle-end/106010] Miss vectorization for complex type copy.
From: pinskia at gcc dot gnu.org @ 2022-07-11 17:32 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106010
Andrew Pinski <pinskia at gcc dot gnu.org> changed:
           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |zhongyunde at huawei dot com
--- Comment #4 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
*** Bug 106254 has been marked as a duplicate of this bug. ***
* [Bug middle-end/106010] Miss vectorization for complex type copy.
From: cvs-commit at gcc dot gnu.org @ 2022-07-20 8:07 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106010
--- Comment #5 from CVS Commits <cvs-commit at gcc dot gnu.org> ---
The master branch has been updated by hongtao Liu <liuhongt@gcc.gnu.org>:
https://gcc.gnu.org/g:f9d4c3b45c5ed5f45c8089c990dbd4e181929c3d
commit r13-1762-gf9d4c3b45c5ed5f45c8089c990dbd4e181929c3d
Author: liuhongt <hongtao.liu@intel.com>
Date: Tue Jul 19 17:24:52 2022 +0800
Lower complex type move to enable vectorization for complex type
load&store.
2022-07-20 Richard Biener <richard.guenther@gmail.com>
Hongtao Liu <hongtao.liu@intel.com>
gcc/ChangeLog:
PR tree-optimization/106010
* tree-complex.cc (init_dont_simulate_again): Lower complex
type move.
(expand_complex_move): Also expand COMPLEX_CST for rhs.
gcc/testsuite/ChangeLog:
* gcc.target/i386/pr106010-1a.c: New test.
* gcc.target/i386/pr106010-1b.c: New test.
* gcc.target/i386/pr106010-1c.c: New test.
* gcc.target/i386/pr106010-2a.c: New test.
* gcc.target/i386/pr106010-2b.c: New test.
* gcc.target/i386/pr106010-2c.c: New test.
* gcc.target/i386/pr106010-3a.c: New test.
* gcc.target/i386/pr106010-3b.c: New test.
* gcc.target/i386/pr106010-3c.c: New test.
* gcc.target/i386/pr106010-4a.c: New test.
* gcc.target/i386/pr106010-4b.c: New test.
* gcc.target/i386/pr106010-4c.c: New test.
* gcc.target/i386/pr106010-5a.c: New test.
* gcc.target/i386/pr106010-5b.c: New test.
* gcc.target/i386/pr106010-5c.c: New test.
* gcc.target/i386/pr106010-6a.c: New test.
* gcc.target/i386/pr106010-6b.c: New test.
* gcc.target/i386/pr106010-6c.c: New test.
* gcc.target/i386/pr106010-7a.c: New test.
* gcc.target/i386/pr106010-7b.c: New test.
* gcc.target/i386/pr106010-7c.c: New test.
* gcc.target/i386/pr106010-8a.c: New test.
* gcc.target/i386/pr106010-8b.c: New test.
* gcc.target/i386/pr106010-8c.c: New test.
* gcc.target/i386/pr106010-9a.c: New test.
* gcc.target/i386/pr106010-9b.c: New test.
* gcc.target/i386/pr106010-9c.c: New test.
* gcc.target/i386/pr106010-9d.c: New test.
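As a rough source-level picture of what the commit message describes (complex moves and COMPLEX_CST right-hand sides being split into per-component operations), the two functions below should have the same effect. This is a sketch under assumptions, not the code in tree-complex.cc: the function names are invented for illustration, and __real__, __imag__ and the `2.0i` imaginary-constant suffix are GNU C extensions.

```c
#include <assert.h>

/* Complex-typed stores: one move and one constant store per iteration.  */
void
move_and_fill (_Complex double *p, const _Complex double *q,
               _Complex double *c, int n)
{
  for (int i = 0; i != n; i++)
    {
      p[i] = q[i];
      c[i] = 1.0 + 2.0i;
    }
}

/* The same stores written per component, which is roughly the shape
   the lowered statements take.  */
void
move_and_fill_lowered (_Complex double *p, const _Complex double *q,
                       _Complex double *c, int n)
{
  for (int i = 0; i != n; i++)
    {
      __real__ p[i] = __real__ q[i];
      __imag__ p[i] = __imag__ q[i];
      __real__ c[i] = 1.0;
      __imag__ c[i] = 2.0;
    }
}

/* Illustrative self-check: both versions produce identical arrays.  */
int
check_move_and_fill (void)
{
  _Complex double q[3], p1[3], c1[3], p2[3], c2[3];

  for (int i = 0; i != 3; i++)
    {
      __real__ q[i] = i + 1;
      __imag__ q[i] = 10 * i;
    }

  move_and_fill (p1, q, c1, 3);
  move_and_fill_lowered (p2, q, c2, 3);

  for (int i = 0; i != 3; i++)
    {
      assert (__real__ p1[i] == __real__ p2[i]
              && __imag__ p1[i] == __imag__ p2[i]);
      assert (__real__ c1[i] == 1.0 && __imag__ c1[i] == 2.0);
      assert (__real__ c2[i] == 1.0 && __imag__ c2[i] == 2.0);
    }
  return 1;
}
```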
* [Bug middle-end/106010] Miss vectorization for complex type copy.
From: crazylht at gmail dot com @ 2022-07-20 8:15 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106010
Hongtao.liu <crazylht at gmail dot com> changed:
           What    |Removed                     |Added
----------------------------------------------------------------------------
         Resolution|---                         |FIXED
             Status|NEW                         |RESOLVED
--- Comment #6 from Hongtao.liu <crazylht at gmail dot com> ---
Fixed in GCC 13. I'll revisit PR105923 for complex type libmvec.
* [Bug middle-end/106010] Miss vectorization for complex type copy.
From: cvs-commit at gcc dot gnu.org @ 2022-07-22 2:08 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106010
--- Comment #7 from CVS Commits <cvs-commit at gcc dot gnu.org> ---
The master branch has been updated by hongtao Liu <liuhongt@gcc.gnu.org>:
https://gcc.gnu.org/g:1cc0e9a46e79e7ccc7f904b951e369b2b2647567
commit r13-1791-g1cc0e9a46e79e7ccc7f904b951e369b2b2647567
Author: liuhongt <hongtao.liu@intel.com>
Date: Fri Jul 22 09:54:52 2022 +0800
Adjust testcase.
r13-1762-gf9d4c3b45c5ed5f45c8089c990dbd4e181929c3d lower complex type
move to scalars, but testcase pr23911 is supposed to scan __complex__
constant which is never available, so adjust testcase to scan
IMAGPART/REALPART_EXPR constants separately.
gcc/testsuite/ChangeLog
PR tree-optimization/106010
* gcc.dg/pr23911.c: Scan IMAGPART/REALPART_EXPR = ** instead
of __complex__ since COMPLEX_CST is lower to scalars.
* [Bug middle-end/106010] Miss vectorization for complex type copy.
From: linkw at gcc dot gnu.org @ 2022-07-22 3:23 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106010
Kewen Lin <linkw at gcc dot gnu.org> changed:
           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |seurer at gcc dot gnu.org
--- Comment #8 from Kewen Lin <linkw at gcc dot gnu.org> ---
*** Bug 106396 has been marked as a duplicate of this bug. ***