public inbox for gcc-bugs@sourceware.org
* [Bug rtl-optimization/56511] New: memcpy misses chance to use AVX instructions
From: jyasskin at gcc dot gnu.org @ 2013-03-03 6:14 UTC
To: gcc-bugs
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=56511
Bug #: 56511
Summary: memcpy misses chance to use AVX instructions
Classification: Unclassified
Product: gcc
Version: 4.7.2
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: rtl-optimization
AssignedTo: unassigned@gcc.gnu.org
ReportedBy: jyasskin@gcc.gnu.org
When operating on sufficiently aligned storage, memcpy should be able to use
vector instructions.
$ cat test.c
#include <string.h>
typedef float vec __attribute__((vector_size(32)));
typedef struct S {
  vec v;
  char __attribute__((aligned(__alignof__(vec)))) c[sizeof(vec)];
} S;
void assign_vec(S* s, const vec* v) { s->v = *v; }
void memcpy_vec(S* s, const vec* v) { memcpy(&s->v, v, sizeof(vec)); }
void memcpy_char(S* s, const vec* v) { memcpy(s->c, v, sizeof(vec)); }
$ gcc -mavx -S test.c -O2 -Wall -o -
    .file "test.c"
    .text
    .p2align 4,,15
    .globl assign_vec
    .type assign_vec, @function
assign_vec:
.LFB12:
    .cfi_startproc
    vmovaps (%rsi), %ymm0
    vmovaps %ymm0, (%rdi)
    vzeroupper
    ret
    .cfi_endproc
.LFE12:
    .size assign_vec, .-assign_vec
    .p2align 4,,15
    .globl memcpy_vec
    .type memcpy_vec, @function
memcpy_vec:
.LFB13:
    .cfi_startproc
    movq (%rsi), %rax
    movq %rax, (%rdi)
    movq 8(%rsi), %rax
    movq %rax, 8(%rdi)
    movq 16(%rsi), %rax
    movq %rax, 16(%rdi)
    movq 24(%rsi), %rax
    movq %rax, 24(%rdi)
    ret
    .cfi_endproc
.LFE13:
    .size memcpy_vec, .-memcpy_vec
    .p2align 4,,15
    .globl memcpy_char
    .type memcpy_char, @function
memcpy_char:
.LFB14:
    .cfi_startproc
    movq (%rsi), %rdx
    movq %rdx, 32(%rdi)
    movq 8(%rsi), %rdx
    movq %rdx, 40(%rdi)
    movq 16(%rsi), %rdx
    movq %rdx, 48(%rdi)
    movq 24(%rsi), %rdx
    movq %rdx, 56(%rdi)
    ret
    .cfi_endproc
.LFE14:
    .size memcpy_char, .-memcpy_char
I don't have a gcc-4.8 around to test with, but I believe it's also missing
this optimization.
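The three functions in test.c above are meant to perform the same 32-byte copy, so the difference reported here is purely one of code generation. A minimal sketch of a harness that checks this equivalence follows; the helper copies_agree is hypothetical and not part of the original report, and the sketch assumes GCC's vector_size extension is available.

```c
#include <assert.h>
#include <string.h>

typedef float vec __attribute__((vector_size(32)));

typedef struct S {
  vec v;
  char __attribute__((aligned(__alignof__(vec)))) c[sizeof(vec)];
} S;

/* The three functions from the report, unchanged. */
void assign_vec(S* s, const vec* v) { s->v = *v; }
void memcpy_vec(S* s, const vec* v) { memcpy(&s->v, v, sizeof(vec)); }
void memcpy_char(S* s, const vec* v) { memcpy(s->c, v, sizeof(vec)); }

/* Hypothetical helper (not from the report): returns 1 if each function
   stored exactly the 32 bytes held in *v into its destination member. */
int copies_agree(const vec* v) {
  S a, b, c;
  memset(&a, 0, sizeof a);
  memset(&b, 0, sizeof b);
  memset(&c, 0, sizeof c);
  assign_vec(&a, v);
  memcpy_vec(&b, v);
  memcpy_char(&c, v);
  return memcmp(&a.v, v, sizeof(vec)) == 0 &&
         memcmp(&b.v, v, sizeof(vec)) == 0 &&
         memcmp(c.c, v, sizeof(vec)) == 0;
}
```

Calling copies_agree with the address of any initialized vec should return 1 regardless of which instructions the compiler selects.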
* [Bug rtl-optimization/56511] memcpy misses chance to use AVX instructions
From: jyasskin at gcc dot gnu.org @ 2013-03-03 6:20 UTC
To: gcc-bugs
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=56511
--- Comment #1 from Jeffrey Yasskin <jyasskin at gcc dot gnu.org> 2013-03-03 06:19:57 UTC ---
LLVM catches this optimization.
* [Bug target/56511] memcpy misses chance to use AVX instructions
From: pinskia at gcc dot gnu.org @ 2013-03-03 6:50 UTC
To: gcc-bugs
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=56511
Andrew Pinski <pinskia at gcc dot gnu.org> changed:
What             |Removed          |Added
----------------------------------------------------------------------------
Target           |                 |x86_64-*-*
Component        |rtl-optimization |target
--- Comment #2 from Andrew Pinski <pinskia at gcc dot gnu.org> 2013-03-03 06:50:09 UTC ---
This is a target-specific optimization. PPC already uses VMX/AltiVec for all of
these.
* [Bug target/56511] memcpy misses chance to use AVX instructions
From: izamyatin at gmail dot com @ 2013-03-04 20:10 UTC
To: gcc-bugs
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=56511
Igor Zamyatin <izamyatin at gmail dot com> changed:
What             |Removed          |Added
----------------------------------------------------------------------------
CC               |                 |izamyatin at gmail dot com
--- Comment #3 from Igor Zamyatin <izamyatin at gmail dot com> 2013-03-04 20:10:09 UTC ---
This seems to be the old issue with non-optimal expansion of memcpy, memset,
etc. on x86. See for instance http://gcc.gnu.org/bugzilla/show_bug.cgi?id=43052.
There were some attempts to improve that expansion (e.g.
http://gcc.gnu.org/ml/gcc-patches/2011-07/msg00853.html), but with no success
yet.
* [Bug target/56511] memcpy misses chance to use AVX instructions
From: izamyatin at gmail dot com @ 2013-03-07 8:53 UTC
To: gcc-bugs
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=56511
--- Comment #4 from Igor Zamyatin <izamyatin at gmail dot com> 2013-03-07 08:52:53 UTC ---
Doesn't the first argument of the memcpy called from memcpy_vec have unknown
(1-byte) alignment? If so, how did PPC manage to produce vector instructions?
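As background for the alignment question, the layout that the type declarations promise can be inspected directly. The sketch below is an illustration only: the helpers offset_of_c, is_vec_aligned and members_are_aligned are hypothetical names, and the code shows what the declarations guarantee, not what the compiler's memcpy expansion actually knows at that point.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef float vec __attribute__((vector_size(32)));

typedef struct S {
  vec v;
  char __attribute__((aligned(__alignof__(vec)))) c[sizeof(vec)];
} S;

/* Hypothetical helper: byte offset of member c within S. */
size_t offset_of_c(void) { return offsetof(S, c); }

/* Hypothetical helper: 1 if p is a multiple of the alignment declared
   for vec, 0 otherwise. */
int is_vec_aligned(const void* p) {
  return ((uintptr_t)p % __alignof__(vec)) == 0;
}

/* Hypothetical helper: checks both members of an S object on the stack.
   Only the addresses are examined; the contents are never read. */
int members_are_aligned(void) {
  S s;
  return is_vec_aligned(&s.v) && is_vec_aligned(s.c);
}
```

The offset of c equals sizeof(vec), i.e. 32, which matches the 32(%rdi) destination seen in the memcpy_char assembly in the original report.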
* [Bug target/56511] memcpy misses chance to use AVX instructions
From: hjl.tools at gmail dot com @ 2021-08-05 12:36 UTC
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=56511
H.J. Lu <hjl.tools at gmail dot com> changed:
What             |Removed          |Added
----------------------------------------------------------------------------
Target Milestone |---              |12.0
Status           |UNCONFIRMED      |RESOLVED
Resolution       |---              |DUPLICATE
--- Comment #5 from H.J. Lu <hjl.tools at gmail dot com> ---
Fixed for GCC 12:
[hjl@gnu-skl-2 gcc]$ cat x2.c
#include <string.h>
typedef float vec __attribute__((vector_size(32)));
typedef struct S {
  vec v;
  char __attribute__((aligned(__alignof__(vec)))) c[sizeof(vec)];
} S;
void assign_vec(S* s, const vec* v) { s->v = *v; }
void memcpy_vec(S* s, const vec* v) { memcpy(&s->v, v, sizeof(vec)); }
void memcpy_char(S* s, const vec* v) { memcpy(s->c, v, sizeof(vec)); }
[hjl@gnu-skl-2 gcc]$ ./xgcc -B./ -S -O3 -march=haswell x2.c
[hjl@gnu-skl-2 gcc]$ cat x2.s
    .file "x2.c"
    .text
    .p2align 4
    .globl assign_vec
    .type assign_vec, @function
assign_vec:
.LFB0:
    .cfi_startproc
    vmovaps (%rsi), %ymm0
    vmovaps %ymm0, (%rdi)
    vzeroupper
    ret
    .cfi_endproc
.LFE0:
    .size assign_vec, .-assign_vec
    .p2align 4
    .globl memcpy_vec
    .type memcpy_vec, @function
memcpy_vec:
.LFB1:
    .cfi_startproc
    vmovdqu (%rsi), %ymm15
    vmovdqu %ymm15, (%rdi)
    vzeroupper
    ret
    .cfi_endproc
.LFE1:
    .size memcpy_vec, .-memcpy_vec
    .p2align 4
    .globl memcpy_char
    .type memcpy_char, @function
memcpy_char:
.LFB2:
    .cfi_startproc
    vmovdqu (%rsi), %ymm15
    vmovdqu %ymm15, 32(%rdi)
    vzeroupper
    ret
    .cfi_endproc
.LFE2:
    .size memcpy_char, .-memcpy_char
    .ident "GCC: (GNU) 12.0.0 20210805 (experimental) [master revision f7aa81892eb:82bfff3e5fa:c16f21c7cf97ce48967e42d3b5d22ea169a9c2c8]"
    .section .note.GNU-stack,"",@progbits
[hjl@gnu-skl-2 gcc]$
*** This bug has been marked as a duplicate of bug 90773 ***
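In the GCC 12 output above, memcpy_vec and memcpy_char use vmovdqu, the unaligned move, while assign_vec uses vmovaps. One way to state an alignment assumption explicitly in source is GCC's __builtin_assume_aligned. The sketch below only illustrates that builtin: the helpers copy_vec_assume_aligned and roundtrip_ok are hypothetical, and this thread does not establish whether the hint changes the instructions GCC selects.

```c
#include <assert.h>
#include <string.h>

typedef float vec __attribute__((vector_size(32)));

/* Hypothetical helper (not from the thread): copies sizeof(vec) bytes after
   telling GCC that both pointers are aligned to __alignof__(vec).
   The caller must guarantee that alignment; if the promise is false,
   behavior is undefined. */
void copy_vec_assume_aligned(void* dst, const void* src) {
  void* d = __builtin_assume_aligned(dst, __alignof__(vec));
  const void* s = __builtin_assume_aligned(src, __alignof__(vec));
  memcpy(d, s, sizeof(vec));
}

/* Hypothetical helper: copies one vec into another and returns 1 if the
   bytes match afterwards. Both objects are vec-typed, so the alignment
   promise made above holds. */
int roundtrip_ok(void) {
  vec in = {1.0f, 2.0f, 3.0f, 4.0f, 5.0f, 6.0f, 7.0f, 8.0f};
  vec out = {0};
  copy_vec_assume_aligned(&out, &in);
  return memcmp(&in, &out, sizeof(vec)) == 0;
}
```

Comparing the output of -S with and without the builtin would show whether the hint has any effect for a given GCC version and target.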