public inbox for gcc-bugs@sourceware.org
* [Bug rtl-optimization/56511] New: memcpy misses chance to use AVX instructions
@ 2013-03-03  6:14 jyasskin at gcc dot gnu.org
  2013-03-03  6:20 ` [Bug rtl-optimization/56511] " jyasskin at gcc dot gnu.org
                   ` (4 more replies)
  0 siblings, 5 replies; 6+ messages in thread
From: jyasskin at gcc dot gnu.org @ 2013-03-03  6:14 UTC (permalink / raw)
  To: gcc-bugs


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=56511

             Bug #: 56511
           Summary: memcpy misses chance to use AVX instructions
    Classification: Unclassified
           Product: gcc
           Version: 4.7.2
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: rtl-optimization
        AssignedTo: unassigned@gcc.gnu.org
        ReportedBy: jyasskin@gcc.gnu.org


When operating on sufficiently aligned storage, memcpy should be able to use
vector instructions.

$ cat test.c
#include <string.h>

typedef float vec __attribute__((vector_size(32)));
typedef struct S {
  vec v;
  char __attribute__((aligned(__alignof__(vec)))) c[sizeof(vec)];
} S;
void assign_vec(S* s, const vec* v) { s->v = *v; }
void memcpy_vec(S* s, const vec* v) { memcpy(&s->v, v, sizeof(vec)); }
void memcpy_char(S* s, const vec* v) { memcpy(s->c, v, sizeof(vec)); }

$ gcc -mavx -S test.c -O2  -Wall -o - 
        .file   "test.c"
        .text
        .p2align 4,,15
        .globl  assign_vec
        .type   assign_vec, @function
assign_vec:
.LFB12:
        .cfi_startproc
        vmovaps (%rsi), %ymm0
        vmovaps %ymm0, (%rdi)
        vzeroupper
        ret
        .cfi_endproc
.LFE12:
        .size   assign_vec, .-assign_vec
        .p2align 4,,15
        .globl  memcpy_vec
        .type   memcpy_vec, @function
memcpy_vec:
.LFB13:
        .cfi_startproc
        movq    (%rsi), %rax
        movq    %rax, (%rdi)
        movq    8(%rsi), %rax
        movq    %rax, 8(%rdi)
        movq    16(%rsi), %rax
        movq    %rax, 16(%rdi)
        movq    24(%rsi), %rax
        movq    %rax, 24(%rdi)
        ret
        .cfi_endproc
.LFE13:
        .size   memcpy_vec, .-memcpy_vec
        .p2align 4,,15
        .globl  memcpy_char
        .type   memcpy_char, @function
memcpy_char:
.LFB14:
        .cfi_startproc
        movq    (%rsi), %rdx
        movq    %rdx, 32(%rdi)
        movq    8(%rsi), %rdx
        movq    %rdx, 40(%rdi)
        movq    16(%rsi), %rdx
        movq    %rdx, 48(%rdi)
        movq    24(%rsi), %rdx
        movq    %rdx, 56(%rdi)
        ret
        .cfi_endproc
.LFE14:
        .size   memcpy_char, .-memcpy_char


I don't have a gcc-4.8 around to test with, but I believe it's also missing
this optimization.
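
For anyone comparing compilers, here is a minimal harness around the three
functions from test.c. It is only a sketch: copies_agree() is a hypothetical
helper that is not part of the report, and it checks behaviour only (all three
functions copy the same 32 bytes), not which instructions get emitted.

```c
#include <assert.h>
#include <string.h>

typedef float vec __attribute__((vector_size(32)));
typedef struct S {
  vec v;
  char __attribute__((aligned(__alignof__(vec)))) c[sizeof(vec)];
} S;

/* The three functions from the report, unchanged. */
void assign_vec(S* s, const vec* v) { s->v = *v; }
void memcpy_vec(S* s, const vec* v) { memcpy(&s->v, v, sizeof(vec)); }
void memcpy_char(S* s, const vec* v) { memcpy(s->c, v, sizeof(vec)); }

/* Hypothetical helper (not in the report): returns 1 when all three
   functions copy exactly the bytes of the source vector. */
int copies_agree(void) {
  vec src = {1, 2, 3, 4, 5, 6, 7, 8};
  S a, b, c;

  /* Both members are sizeof(vec) bytes and aligned like vec. */
  assert(sizeof(S) == 2 * sizeof(vec));

  assign_vec(&a, &src);
  memcpy_vec(&b, &src);
  memcpy_char(&c, &src);

  return memcmp(&a.v, &src, sizeof(vec)) == 0 &&
         memcmp(&b.v, &src, sizeof(vec)) == 0 &&
         memcmp(c.c, &src, sizeof(vec)) == 0;
}
```

Calling copies_agree() from a small main() and compiling the same file with -S
under different compilers or flags gives a direct comparison of the generated
code for identical semantics.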



* [Bug rtl-optimization/56511] memcpy misses chance to use AVX instructions
  2013-03-03  6:14 [Bug rtl-optimization/56511] New: memcpy misses chance to use AVX instructions jyasskin at gcc dot gnu.org
@ 2013-03-03  6:20 ` jyasskin at gcc dot gnu.org
  2013-03-03  6:50 ` [Bug target/56511] " pinskia at gcc dot gnu.org
                   ` (3 subsequent siblings)
  4 siblings, 0 replies; 6+ messages in thread
From: jyasskin at gcc dot gnu.org @ 2013-03-03  6:20 UTC (permalink / raw)
  To: gcc-bugs


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=56511

--- Comment #1 from Jeffrey Yasskin <jyasskin at gcc dot gnu.org> 2013-03-03 06:19:57 UTC ---
LLVM catches this optimization.



* [Bug target/56511] memcpy misses chance to use AVX instructions
  2013-03-03  6:14 [Bug rtl-optimization/56511] New: memcpy misses chance to use AVX instructions jyasskin at gcc dot gnu.org
  2013-03-03  6:20 ` [Bug rtl-optimization/56511] " jyasskin at gcc dot gnu.org
@ 2013-03-03  6:50 ` pinskia at gcc dot gnu.org
  2013-03-04 20:10 ` izamyatin at gmail dot com
                   ` (2 subsequent siblings)
  4 siblings, 0 replies; 6+ messages in thread
From: pinskia at gcc dot gnu.org @ 2013-03-03  6:50 UTC (permalink / raw)
  To: gcc-bugs


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=56511

Andrew Pinski <pinskia at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Target|                            |x86_64-*-*
          Component|rtl-optimization            |target

--- Comment #2 from Andrew Pinski <pinskia at gcc dot gnu.org> 2013-03-03 06:50:09 UTC ---
This is a target-specific optimization.  PPC already uses VMX/AltiVec for all
of these.



* [Bug target/56511] memcpy misses chance to use AVX instructions
  2013-03-03  6:14 [Bug rtl-optimization/56511] New: memcpy misses chance to use AVX instructions jyasskin at gcc dot gnu.org
  2013-03-03  6:20 ` [Bug rtl-optimization/56511] " jyasskin at gcc dot gnu.org
  2013-03-03  6:50 ` [Bug target/56511] " pinskia at gcc dot gnu.org
@ 2013-03-04 20:10 ` izamyatin at gmail dot com
  2013-03-07  8:53 ` izamyatin at gmail dot com
  2021-08-05 12:36 ` hjl.tools at gmail dot com
  4 siblings, 0 replies; 6+ messages in thread
From: izamyatin at gmail dot com @ 2013-03-04 20:10 UTC (permalink / raw)
  To: gcc-bugs


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=56511

Igor Zamyatin <izamyatin at gmail dot com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |izamyatin at gmail dot com

--- Comment #3 from Igor Zamyatin <izamyatin at gmail dot com> 2013-03-04 20:10:09 UTC ---
This seems to be the old issue with non-optimal expansion of memcpy, memset,
etc. for x86. See, for instance,
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=43052.
There were some attempts to improve that expansion (e.g.
http://gcc.gnu.org/ml/gcc-patches/2011-07/msg00853.html), but with no success
yet.



* [Bug target/56511] memcpy misses chance to use AVX instructions
  2013-03-03  6:14 [Bug rtl-optimization/56511] New: memcpy misses chance to use AVX instructions jyasskin at gcc dot gnu.org
                   ` (2 preceding siblings ...)
  2013-03-04 20:10 ` izamyatin at gmail dot com
@ 2013-03-07  8:53 ` izamyatin at gmail dot com
  2021-08-05 12:36 ` hjl.tools at gmail dot com
  4 siblings, 0 replies; 6+ messages in thread
From: izamyatin at gmail dot com @ 2013-03-07  8:53 UTC (permalink / raw)
  To: gcc-bugs


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=56511

--- Comment #4 from Igor Zamyatin <izamyatin at gmail dot com> 2013-03-07 08:52:53 UTC ---
Doesn't the first argument of the memcpy called from memcpy_vec have unknown
(1-byte) alignment? If so, how did PPC manage to produce vector instructions?
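
For reference, the alignment that the types themselves declare can be checked
at compile time, and the alignment of a pointer can be stated explicitly with
__builtin_assume_aligned. The sketch below assumes a compiler that provides
that builtin and _Static_assert; memcpy_vec_hinted and hinted_copy_matches are
hypothetical names, and nothing in this thread confirms that the hint changes
the generated code on any particular GCC version.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

typedef float vec __attribute__((vector_size(32)));
typedef struct S {
  vec v;
  char __attribute__((aligned(__alignof__(vec)))) c[sizeof(vec)];
} S;

/* Alignment facts that follow from the type definitions alone. */
_Static_assert(__alignof__(S) == __alignof__(vec), "S is aligned like vec");
_Static_assert(offsetof(S, v) == 0, "v sits at the start of S");
_Static_assert(offsetof(S, c) % __alignof__(vec) == 0, "c is aligned like vec");

/* Hypothetical variant of memcpy_vec: tells the compiler that both
   pointers are aligned like vec before calling memcpy. */
void memcpy_vec_hinted(S* s, const vec* v) {
  void* dst = __builtin_assume_aligned(&s->v, __alignof__(vec));
  const void* src = __builtin_assume_aligned(v, __alignof__(vec));
  memcpy(dst, src, sizeof(vec));
}

/* Hypothetical helper: returns 1 when the hinted copy matches the source. */
int hinted_copy_matches(void) {
  vec src = {1, 2, 3, 4, 5, 6, 7, 8};
  S s;

  /* The hint must be true at run time, otherwise behaviour is undefined. */
  assert((uintptr_t)&s.v % __alignof__(vec) == 0);
  assert((uintptr_t)&src % __alignof__(vec) == 0);

  memcpy_vec_hinted(&s, &src);
  return memcmp(&s.v, &src, sizeof(vec)) == 0;
}
```

So the types do declare the alignment; the open question is whether the memcpy
expander sees it when it is given only a pointer.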



* [Bug target/56511] memcpy misses chance to use AVX instructions
  2013-03-03  6:14 [Bug rtl-optimization/56511] New: memcpy misses chance to use AVX instructions jyasskin at gcc dot gnu.org
                   ` (3 preceding siblings ...)
  2013-03-07  8:53 ` izamyatin at gmail dot com
@ 2021-08-05 12:36 ` hjl.tools at gmail dot com
  4 siblings, 0 replies; 6+ messages in thread
From: hjl.tools at gmail dot com @ 2021-08-05 12:36 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=56511

H.J. Lu <hjl.tools at gmail dot com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
   Target Milestone|---                         |12.0
             Status|UNCONFIRMED                 |RESOLVED
         Resolution|---                         |DUPLICATE

--- Comment #5 from H.J. Lu <hjl.tools at gmail dot com> ---
Fixed for GCC 12:

[hjl@gnu-skl-2 gcc]$ cat x2.c
#include <string.h>

typedef float vec __attribute__((vector_size(32)));
typedef struct S {
  vec v;
  char __attribute__((aligned(__alignof__(vec)))) c[sizeof(vec)];
} S;
void assign_vec(S* s, const vec* v) { s->v = *v; }
void memcpy_vec(S* s, const vec* v) { memcpy(&s->v, v, sizeof(vec)); }
void memcpy_char(S* s, const vec* v) { memcpy(s->c, v, sizeof(vec)); }
[hjl@gnu-skl-2 gcc]$ ./xgcc -B./ -S -O3 -march=haswell x2.c
[hjl@gnu-skl-2 gcc]$ cat x2.s
        .file   "x2.c"
        .text
        .p2align 4
        .globl  assign_vec
        .type   assign_vec, @function
assign_vec:
.LFB0:
        .cfi_startproc
        vmovaps (%rsi), %ymm0
        vmovaps %ymm0, (%rdi)
        vzeroupper
        ret
        .cfi_endproc
.LFE0:
        .size   assign_vec, .-assign_vec
        .p2align 4
        .globl  memcpy_vec
        .type   memcpy_vec, @function
memcpy_vec:
.LFB1:
        .cfi_startproc
        vmovdqu (%rsi), %ymm15
        vmovdqu %ymm15, (%rdi)
        vzeroupper
        ret
        .cfi_endproc
.LFE1:
        .size   memcpy_vec, .-memcpy_vec
        .p2align 4
        .globl  memcpy_char
        .type   memcpy_char, @function
memcpy_char:
.LFB2:
        .cfi_startproc
        vmovdqu (%rsi), %ymm15
        vmovdqu %ymm15, 32(%rdi)
        vzeroupper
        ret
        .cfi_endproc
.LFE2:
        .size   memcpy_char, .-memcpy_char
        .ident  "GCC: (GNU) 12.0.0 20210805 (experimental) [master revision f7aa81892eb:82bfff3e5fa:c16f21c7cf97ce48967e42d3b5d22ea169a9c2c8]"
        .section        .note.GNU-stack,"",@progbits
[hjl@gnu-skl-2 gcc]$

*** This bug has been marked as a duplicate of bug 90773 ***
