public inbox for gcc-bugs@sourceware.org
* [Bug tree-optimization/107837] New: Missed optimization: Using memcpy to load a struct unnecessary uses stack space
@ 2022-11-23 15:23 chfast at gmail dot com
  2022-11-23 18:06 ` [Bug tree-optimization/107837] [10/11/12/13 Regression] " pinskia at gcc dot gnu.org
                   ` (2 more replies)
  0 siblings, 3 replies; 4+ messages in thread
From: chfast at gmail dot com @ 2022-11-23 15:23 UTC
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107837

            Bug ID: 107837
           Summary: Missed optimization: Using memcpy to load a struct
                    unnecessary uses stack space
           Product: gcc
           Version: 13.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: tree-optimization
          Assignee: unassigned at gcc dot gnu.org
          Reporter: chfast at gmail dot com
  Target Milestone: ---

I have a simple struct wrapping an array uint64_t[4]. When I use memcpy() to
load it from a buffer of bytes and then perform some additional operations, a
temporary object is created on the stack.


struct uint256
{
    unsigned long v[4];
};

void load_bad(uint256* o, const char* src) noexcept
{
    uint256 x;
    __builtin_memcpy(&x, src, sizeof(x));
    uint256 y;
    y.v[0] = __builtin_bswap64(x.v[3]);
    y.v[1] = __builtin_bswap64(x.v[2]);
    y.v[2] = __builtin_bswap64(x.v[1]);
    y.v[3] = __builtin_bswap64(x.v[0]);
    *o = y;
}


load_bad(uint256*, char const*):
        movdqu  xmm0, XMMWORD PTR [rsi]
        movdqu  xmm1, XMMWORD PTR [rsi+16]
        movaps  XMMWORD PTR [rsp-40], xmm0
        mov     rdx, QWORD PTR [rsp-32]
        mov     rax, QWORD PTR [rsp-40]
        movaps  XMMWORD PTR [rsp-24], xmm1
        mov     rsi, QWORD PTR [rsp-16]
        mov     rcx, QWORD PTR [rsp-24]
        bswap   rdx
        bswap   rax
        mov     QWORD PTR [rdi+16], rdx
        bswap   rsi
        bswap   rcx
        mov     QWORD PTR [rdi], rsi
        mov     QWORD PTR [rdi+8], rcx
        mov     QWORD PTR [rdi+24], rax
        ret


The workaround is to read the source through a reinterpret_cast instead of
using memcpy().

https://godbolt.org/z/WevYch8nv
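
For reference, the reinterpret_cast workaround mentioned above might look
roughly like the sketch below. The function name load_cast is made up for
illustration, and the exact code behind the godbolt link may differ. Note that
the cast is only well-defined when src really points at a suitably aligned
uint256 object; for arbitrary byte storage the memcpy() form is the portable
one, which is why the missed optimization matters.

```cpp
struct uint256
{
    unsigned long v[4];
};

// Hypothetical variant of load_bad() that reads the source in place
// through a reinterpret_cast instead of copying it with memcpy().
// Assumption: src points at a properly aligned uint256 object;
// otherwise the access is undefined behaviour.
void load_cast(uint256* o, const char* src) noexcept
{
    const uint256* x = reinterpret_cast<const uint256*>(src);

    // Same word reversal and byte swap as in load_bad().
    uint256 y;
    y.v[0] = __builtin_bswap64(x->v[3]);
    y.v[1] = __builtin_bswap64(x->v[2]);
    y.v[2] = __builtin_bswap64(x->v[1]);
    y.v[3] = __builtin_bswap64(x->v[0]);
    *o = y;
}
```

Calling it with the address of an existing uint256 (converted to const char*)
keeps the access valid, since the bytes then do hold an object of that type.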


Thread overview: 4+ messages
2022-11-23 15:23 [Bug tree-optimization/107837] New: Missed optimization: Using memcpy to load a struct unnecessary uses stack space chfast at gmail dot com
2022-11-23 18:06 ` [Bug tree-optimization/107837] [10/11/12/13 Regression] " pinskia at gcc dot gnu.org
2022-12-02 10:58 ` [Bug tree-optimization/107837] [10/11/12/13 Regression] Missed optimization: Using memcpy to load a struct unnecessary uses stack space since r8-5200 jakub at gcc dot gnu.org
2023-07-07 10:44 ` [Bug tree-optimization/107837] [11/12/13/14 " rguenth at gcc dot gnu.org
