public inbox for gcc-bugs@sourceware.org
* [Bug tree-optimization/59650] New: Inefficient vector assignment code
@ 2013-12-31 14:50 freddie at witherden dot org
From: freddie at witherden dot org @ 2013-12-31 14:50 UTC (permalink / raw)
To: gcc-bugs
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59650
Bug ID: 59650
Summary: Inefficient vector assignment code
Product: gcc
Version: 4.8.2
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: tree-optimization
Assignee: unassigned at gcc dot gnu.org
Reporter: freddie at witherden dot org
Consider the following snippet:
typedef double v4d __attribute__((vector_size(32)));

v4d set1(double *v)
{
    v4d tmp = { v[0], v[1], v[2], v[3] };
    return tmp;
}

v4d set2(double *v)
{
    v4d tmp;
    tmp[0] = v[0];
    tmp[1] = v[1];
    tmp[2] = v[2];
    tmp[3] = v[3];
    return tmp;
}
If my understanding of the vector extensions is correct, both functions should
do the same thing. Compiling with GCC 4.8.2 at -O3 -march=native on a Sandy
Bridge system gives:
0000000000000000 <_Z4set1Pd>:
   0:   c5 fb 10 57 10          vmovsd 0x10(%rdi),%xmm2
   5:   c5 fb 10 1f             vmovsd (%rdi),%xmm3
   9:   c5 e9 16 47 18          vmovhpd 0x18(%rdi),%xmm2,%xmm0
   e:   c5 e1 16 4f 08          vmovhpd 0x8(%rdi),%xmm3,%xmm1
  13:   c4 e3 75 18 c0 01       vinsertf128 $0x1,%xmm0,%ymm1,%ymm0
  19:   c3                      retq
  1a:   66 0f 1f 44 00 00       nopw 0x0(%rax,%rax,1)

0000000000000020 <_Z4set2Pd>:
  20:   c5 fb 10 07             vmovsd (%rdi),%xmm0
  24:   c5 f9 28 c0             vmovapd %xmm0,%xmm0
  28:   c5 f9 28 c8             vmovapd %xmm0,%xmm1
  2c:   c5 f1 16 4f 08          vmovhpd 0x8(%rdi),%xmm1,%xmm1
  31:   c4 e3 7d 18 c1 00       vinsertf128 $0x0,%xmm1,%ymm0,%ymm0
  37:   c4 e3 7d 19 c1 01       vextractf128 $0x1,%ymm0,%xmm1
  3d:   c5 f1 12 4f 10          vmovlpd 0x10(%rdi),%xmm1,%xmm1
  42:   c4 e3 7d 18 c1 01       vinsertf128 $0x1,%xmm1,%ymm0,%ymm0
  48:   c4 e3 7d 19 c1 01       vextractf128 $0x1,%ymm0,%xmm1
  4e:   c5 f1 16 4f 18          vmovhpd 0x18(%rdi),%xmm1,%xmm1
  53:   c4 e3 7d 18 c1 01       vinsertf128 $0x1,%xmm1,%ymm0,%ymm0
  59:   c3                      retq
The generated code for the two functions differs. For set1, four moves are
issued, whereas I was expecting two 128-bit unaligned moves. The code for set2
also appears to be inefficient: it repeatedly extracts and reinserts a 128-bit
half of the register.
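As a way to check the assumption that both forms build the same vector, and to compare them against a plain 32-byte copy, something like the sketch below could be used. It is illustrative only: set_memcpy and agree_on are hypothetical names that do not appear in the report, and the functions write through a pointer instead of returning v4d by value so that the file also builds without AVX enabled. It makes no claim about which instructions a given GCC version emits for any of the three forms.

```c
#include <assert.h>
#include <string.h>

typedef double v4d __attribute__((vector_size(32)));

/* Brace-initialised construction, as in set1 from the report. */
static void set1(const double *v, v4d *out)
{
    v4d tmp = { v[0], v[1], v[2], v[3] };
    *out = tmp;
}

/* Lane-by-lane assignment, as in set2 from the report. */
static void set2(const double *v, v4d *out)
{
    v4d tmp;
    tmp[0] = v[0];
    tmp[1] = v[1];
    tmp[2] = v[2];
    tmp[3] = v[3];
    *out = tmp;
}

/* Hypothetical third form (not in the report): copy the 32 bytes directly.
   memcpy places no alignment requirement on the source pointer v. */
static void set_memcpy(const double *v, v4d *out)
{
    assert(sizeof(v4d) == 4 * sizeof(double));
    memcpy(out, v, sizeof *out);
}

/* Returns 1 when all three constructions reproduce v[0..3] lane by lane. */
static int agree_on(const double *v)
{
    v4d a, b, c;
    set1(v, &a);
    set2(v, &b);
    set_memcpy(v, &c);
    for (int i = 0; i < 4; i++) {
        if (a[i] != v[i] || b[i] != v[i] || c[i] != v[i])
            return 0;
    }
    return 1;
}
```

Compiling this with the options from the report and inspecting the disassembly of each function is one way to see whether the three forms converge on the same instruction sequence.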
* [Bug tree-optimization/59650] Inefficient vector assignment code
@ 2021-08-21 22:05 ` pinskia at gcc dot gnu.org
From: pinskia at gcc dot gnu.org @ 2021-08-21 22:05 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=59650
Andrew Pinski <pinskia at gcc dot gnu.org> changed:
           What    |Removed                     |Added
----------------------------------------------------------------------------
   Target Milestone|---                         |11.0
         Resolution|---                         |FIXED
             Status|UNCONFIRMED                 |RESOLVED
--- Comment #3 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
Fixed in GCC 11 with a few improvements.
-march=sandybridge
-O3:
        vmovupd (%rdi), %xmm1
        vinsertf128     $0x1, 16(%rdi), %ymm1, %ymm0
-O2:
        vmovupd 16(%rdi), %xmm1
        vmovupd (%rdi), %xmm0
        vinsertf128     $0x1, %xmm1, %ymm0, %ymm0
-mavx -O3:
        vmovupd (%rdi), %ymm0