public inbox for gcc-bugs@sourceware.org
* [Bug rtl-optimization/114187] New: [14 regression] bizarre register dance on x86_64 for pass-by-value struct
@ 2024-03-01  9:16 matteo at mitalia dot net
  2024-03-01  9:25 ` [Bug target/114187] " pinskia at gcc dot gnu.org
                   ` (6 more replies)
  0 siblings, 7 replies; 8+ messages in thread
From: matteo at mitalia dot net @ 2024-03-01  9:16 UTC
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114187

            Bug ID: 114187
           Summary: [14 regression] bizarre register dance on x86_64 for
                    pass-by-value struct
           Product: gcc
           Version: 14.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: rtl-optimization
          Assignee: unassigned at gcc dot gnu.org
          Reporter: matteo at mitalia dot net
  Target Milestone: ---

Sample code (Godbolt link: https://godbolt.org/z/zf6e16Wcq ):

```
struct P2d {
    double x, y;
};

double sumxy(double x, double y) {
    return x + y;
}

double sumxy_p(P2d p) {
    return p.x + p.y;
}

double sumxy_p_ref(const P2d& p) {
    return p.x + p.y;
}
```

With g++ 13.2 -O3 this generates perfectly reasonable code:

```
sumxy(double, double):
        addsd   xmm0, xmm1
        ret
sumxy_p(P2d):
        addsd   xmm0, xmm1
        ret
sumxy_p_ref(P2d const&):
        movsd   xmm0, QWORD PTR [rdi]
        addsd   xmm0, QWORD PTR [rdi+8]
        ret
```

With g++ 14 (g++
(Compiler-Explorer-Build-gcc-b05f474c8f7768dad50a99a2d676660ee4db09c6-binutils-2.40)
14.0.1 20240301 (experimental)) we get instead:

```
sumxy(double, double):
        addsd   xmm0, xmm1
        ret
sumxy_p(P2d):
        movq    rax, xmm1
        movq    rdx, xmm0
        xchg    rdx, rax
        movq    xmm0, rax
        movq    xmm2, rdx
        addsd   xmm0, xmm2
        ret
sumxy_p_ref(P2d const&):
        movsd   xmm0, QWORD PTR [rdi]
        addsd   xmm0, QWORD PTR [rdi+8]
        ret
```

Notice the bizarre register dance for sumxy_p(P2d): p.x goes through xmm0 →
rdx → rax → xmm0, while p.y goes through xmm1 → rax → rdx → xmm2, and only then
are they summed. sumxy(double, double), which should be identical
register-wise, is unaffected.
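
To make the comparison concrete, here is a minimal sketch of a harness that keeps each function as a standalone body and checks that all three return the same value. Only P2d, sumxy, sumxy_p and sumxy_p_ref come from the report; all_agree is a hypothetical helper, and the noinline attribute is an assumption added so that the compiler does not fold the calls away.

```cpp
struct P2d {
    double x, y;
};

// noinline (a GCC/Clang attribute) is an addition for this sketch: it keeps
// each function as a separate out-of-line body so its code can be inspected.
__attribute__((noinline)) double sumxy(double x, double y) {
    return x + y;
}

__attribute__((noinline)) double sumxy_p(P2d p) {
    return p.x + p.y;
}

__attribute__((noinline)) double sumxy_p_ref(const P2d& p) {
    return p.x + p.y;
}

// Hypothetical helper (not in the report): true when the three variants
// produce the same sum for the given inputs.
bool all_agree(double x, double y) {
    P2d p{x, y};
    double expected = sumxy(x, y);
    return sumxy_p(p) == expected && sumxy_p_ref(p) == expected;
}
```

The results are expected to match on both compiler versions; the regression described here concerns only the instructions emitted for sumxy_p, not its return value.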

This exact same code (for gcc 13 and gcc 14 respectively) is generated at all
optimization levels I tested (-Og, -O1, -O2, -O3), except -O0 of course, so it
doesn't seem to depend on particular optimization passes enabled only at high
optimization levels. Also (as one would expect) it doesn't seem to depend on
the C++ frontend, as compiling this with plain gcc (adding a typedef for the
struct and changing the reference to a pointer) yields the exact same results.
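
The C variant is not shown in the report; a plausible reconstruction, assuming the typedef and the pointer parameter are the only changes, might look like the following. It is written so that it is accepted by both a C and a C++ compiler.

```cpp
/* Assumed reconstruction of the C variant; the reporter's exact code
   is not shown. Valid as both C and C++. */
typedef struct {
    double x, y;
} P2d;

double sumxy(double x, double y) {
    return x + y;
}

double sumxy_p(P2d p) {
    return p.x + p.y;
}

/* The C++ reference parameter becomes a pointer to const. */
double sumxy_p_ref(const P2d *p) {
    return p->x + p->y;
}
```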

Most importantly, it seems to be target-specific, as ARM64 builds don't
exhibit any particular problem, and produce pretty much the same (reasonable)
code on both 14.0 and 13.2:

```
sumxy(double, double):
        fadd    d0, d0, d1
        ret
sumxy_p(P2d):
        fadd    d0, d0, d1
        ret
sumxy_p_ref(P2d const&):
        ldp     d0, d31, [x0]
        fadd    d0, d0, d31
        ret
```

(gcc 13.2 generates slightly different code for sumxy_p_ref, but in a very
minor way)

Fiddling around with -march=nocona (which leaves the gcc 13.2 output
unaffected), I get a more compact but still absurd dance:

```
sumxy_p(P2d):
        movsd   QWORD PTR [rsp-8], xmm1
        mov     rdx, QWORD PTR [rsp-8]
        movq    xmm2, rdx
        addsd   xmm0, xmm2
        ret
```

Here p.x is left in xmm0 where it should be, but xmm1 goes through the stack
(!), then a GP register (rdx), and finally to xmm2. It feels like in general it
wants to launder xmm1 through a 64-bit GP register before summing it, a bit
like a light version of -ffloat-store.
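
In source terms, the extra moves behave roughly like a bit-preserving round trip of p.y through a 64-bit integer. The sketch below is only an illustration of that effect, not a description of what GCC does internally; launder_through_gpr and sumxy_p_laundered are hypothetical names.

```cpp
#include <cstdint>
#include <cstring>

struct P2d {
    double x, y;
};

static_assert(sizeof(double) == sizeof(std::uint64_t),
              "sketch assumes a 64-bit double");

// Hypothetical helper: copies the bits of a double into a 64-bit integer
// and back, mirroring an XMM -> GP -> XMM move pair. The value is unchanged.
double launder_through_gpr(double v) {
    std::uint64_t bits;
    std::memcpy(&bits, &v, sizeof bits);   // analogous to: movq rdx, xmm1
    double out;
    std::memcpy(&out, &bits, sizeof out);  // analogous to: movq xmm2, rdx
    return out;
}

// Same result as sumxy_p, with p.y routed through the integer round trip.
double sumxy_p_laundered(P2d p) {
    return p.x + launder_through_gpr(p.y);
}
```

Since the round trip leaves every bit untouched, it cannot change the sum, so the extra moves are pure overhead.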


Thread overview: 8+ messages
2024-03-01  9:16 [Bug rtl-optimization/114187] New: [14 regression] bizarre register dance on x86_64 for pass-by-value struct matteo at mitalia dot net
2024-03-01  9:25 ` [Bug target/114187] " pinskia at gcc dot gnu.org
2024-03-01  9:27 ` pinskia at gcc dot gnu.org
2024-03-01 12:53 ` [Bug target/114187] [14 regression] bizarre register dance on x86_64 for pass-by-value struct since r14-2526 jakub at gcc dot gnu.org
2024-03-01 16:08 ` roger at nextmovesoftware dot com
2024-03-01 19:15 ` roger at nextmovesoftware dot com
2024-03-04  0:51 ` cvs-commit at gcc dot gnu.org
2024-03-04 13:19 ` rguenth at gcc dot gnu.org
