public inbox for gcc-bugs@sourceware.org
* [Bug target/102438] New: [x86-64] Failure to optimize out random extra store+load in vector code when memcpy is used
@ 2021-09-21 22:37 gabravier at gmail dot com
  2021-09-21 23:32 ` [Bug target/102438] [x86-64] Failure to optimize out spill in vector code when a cast " pinskia at gcc dot gnu.org
                   ` (3 more replies)
  0 siblings, 4 replies; 5+ messages in thread
From: gabravier at gmail dot com @ 2021-09-21 22:37 UTC
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102438

            Bug ID: 102438
           Summary: [x86-64] Failure to optimize out random extra
                    store+load in vector code when memcpy is used
           Product: gcc
           Version: 12.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: target
          Assignee: unassigned at gcc dot gnu.org
          Reporter: gabravier at gmail dot com
  Target Milestone: ---

#include <stddef.h>

typedef double simde_float64x1_t __attribute__((__vector_size__(8)));

simde_float64x1_t simde_vabs_f64(simde_float64x1_t a) {
    simde_float64x1_t r;
    r[0] = -a[0];
    return (simde_float64x1_t)r;
}

On AMD64 with -O3, GCC emits:

simde_vabs_f64(double __vector(1)):
        movsd   xmm0, QWORD PTR [rsp+8]
        xorpd   xmm0, XMMWORD PTR .LC0[rip]
        mov     rax, rdi
        movsd   QWORD PTR [rsp-24], xmm0
        mov     rdx, QWORD PTR [rsp-24]
        mov     QWORD PTR [rdi], rdx
        ret

If we instead just return `r` (without the cast), GCC emits:

simde_vabs_f64(double __vector(1)):
        movsd   xmm0, QWORD PTR [rsp+8]
        xorpd   xmm0, XMMWORD PTR .LC0[rip]
        mov     rax, rdi
        movsd   QWORD PTR [rdi], xmm0
        ret

It seems as though the presence of a cast (to the same type, no less) confuses
GCC into spilling the result into memory.
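
For comparing the two cases, here is a minimal sketch that puts both variants in
one translation unit. The names `neg_with_cast`, `neg_without_cast` and
`variants_agree` are made up for illustration and are not part of the original
testcase; the function bodies follow the code above. Building it with something
like `gcc -O3 -S -masm=intel` should let the two assembly listings be diffed
directly, though the exact output will depend on the GCC version.

```c
/* Uses the GCC vector extension, as in the testcase above. */
typedef double simde_float64x1_t __attribute__((__vector_size__(8)));

/* Variant from the report: the result goes through a cast to its own type. */
simde_float64x1_t neg_with_cast(simde_float64x1_t a) {
    simde_float64x1_t r;
    r[0] = -a[0];
    return (simde_float64x1_t)r;
}

/* Same body, but `r` is returned directly. */
simde_float64x1_t neg_without_cast(simde_float64x1_t a) {
    simde_float64x1_t r;
    r[0] = -a[0];
    return r;
}

/* Hypothetical helper: returns 1 if both variants compute the same value
   for x (x is assumed not to be NaN). */
int variants_agree(double x) {
    simde_float64x1_t v = { x };
    simde_float64x1_t with_cast = neg_with_cast(v);
    simde_float64x1_t without_cast = neg_without_cast(v);
    return with_cast[0] == without_cast[0];
}
```

Both functions return the same value, so any difference in the generated code
is purely a missed optimization and not a behavioural change.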

The GIMPLE optimized output is different for the two, so idk how much of this
is target-specific to x86, but I haven't been able to reproduce it anywhere
else, so ¯\_(ツ)_/¯.

PS: The same bug can also be reproduced with -m32.


Thread overview: 5+ messages
2021-09-21 22:37 [Bug target/102438] New: [x86-64] Failure to optimize out random extra store+load in vector code when memcpy is used gabravier at gmail dot com
2021-09-21 23:32 ` [Bug target/102438] [x86-64] Failure to optimize out spill in vector code when a cast " pinskia at gcc dot gnu.org
2021-09-21 23:33 ` pinskia at gcc dot gnu.org
2021-09-22  3:04 ` crazylht at gmail dot com
2021-09-22  3:08 ` crazylht at gmail dot com
