public inbox for gcc-bugs@sourceware.org
* [Bug target/100077] New: x86: by-value floating point array in struct - xmm regs spilling to stack
@ 2021-04-14 10:17 michaeljclark at mac dot com
  2021-04-14 12:40 ` [Bug target/100077] " rguenth at gcc dot gnu.org
  2021-04-14 13:52 ` matz at gcc dot gnu.org
  0 siblings, 2 replies; 3+ messages in thread
From: michaeljclark at mac dot com @ 2021-04-14 10:17 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100077

            Bug ID: 100077
           Summary: x86: by-value floating point array in struct - xmm
                    regs spilling to stack
           Product: gcc
           Version: 10.3.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: target
          Assignee: unassigned at gcc dot gnu.org
          Reporter: michaeljclark at mac dot com
  Target Milestone: ---

Hi,

I'm compiling a vec3 cross product that takes its operands as structs
by value, using MSVC, Clang and GCC. GCC routes the operands through
memory on the stack. The operands are passed by value, so I can't use
restrict. The result is the same with -O2 and -Os. I vaguely remember
seeing this a couple of times, but I searched to see whether I had
already reported it and couldn't find a duplicate report.

Link showing all three compilers: https://godbolt.org/z/YWWfYxbM3

MSVC:  /O2 /fp:fast /arch:AVX2
Clang: -Os -mavx -x c
GCC: -Os -mavx -x c

--- BEGIN EXAMPLE ---

struct vec3a { float v[3]; };
typedef struct vec3a vec3a;

vec3a vec3f_cross_0(vec3a v1, vec3a v2)
{
    vec3a dest = {
        v1.v[1]*v2.v[2]-v1.v[2]*v2.v[1],
        v1.v[2]*v2.v[0]-v1.v[0]*v2.v[2],
        v1.v[0]*v2.v[1]-v1.v[1]*v2.v[0]
    };
    return dest;
}

struct vec3f { float x, y, z; };
typedef struct vec3f vec3f;

vec3f vec3f_cross_1(vec3f v1, vec3f v2)
{
    vec3f dest = {
        v1.y*v2.z-v1.z*v2.y,
        v1.z*v2.x-v1.x*v2.z,
        v1.x*v2.y-v1.y*v2.x
    };
    return dest;
}

void vec3f_cross_2(float dest[3], float v1[3], float v2[3])
{
    dest[0]=v1[1]*v2[2]-v1[2]*v2[1];
    dest[1]=v1[2]*v2[0]-v1[0]*v2[2];
    dest[2]=v1[0]*v2[1]-v1[1]*v2[0];
}

--- END EXAMPLE ---

* [Bug target/100077] x86: by-value floating point array in struct - xmm regs spilling to stack
From: rguenth at gcc dot gnu.org @ 2021-04-14 12:40 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100077

Richard Biener <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
   Last reconfirmed|                            |2021-04-14
             Target|                            |x86_64-*-*
                 CC|                            |matz at gcc dot gnu.org,
                   |                            |rguenth at gcc dot gnu.org
           Keywords|                            |missed-optimization
     Ever confirmed|0                           |1
             Status|UNCONFIRMED                 |NEW

--- Comment #1 from Richard Biener <rguenth at gcc dot gnu.org> ---
The issue is the way the ABI packs things (you are also comparing different
ABIs): on x86 Linux the struct is packed in 8-byte quantities, which means
v[0] and v[1] go in one xmm reg and v[2] goes in another.  This detail becomes
visible too late to elide the stack slot assignment.
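
As a rough illustration of the packing described above, the sketch below
computes which 8-byte chunk each element of vec3a falls into. The helper
names vec3a_elem_offset and vec3a_elem_eightbyte are hypothetical, and the
sketch assumes a 4-byte float. It only models the memory layout; it does not
query the compiler's actual register assignment.

```c
#include <stddef.h>

struct vec3a { float v[3]; };

/* Byte offset of element i within struct vec3a (hypothetical helper). */
static size_t vec3a_elem_offset(int i)
{
    return offsetof(struct vec3a, v) + (size_t)i * sizeof(float);
}

/* Index of the 8-byte chunk that contains element i (hypothetical helper).
 * Assumes sizeof(float) == 4, so two floats share one chunk. */
static int vec3a_elem_eightbyte(int i)
{
    return (int)(vec3a_elem_offset(i) / 8);
}
```

Under that assumption, v[0] and v[1] land in chunk 0 and v[2] lands in
chunk 1, which matches the split across two xmm regs described above.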

There are related, duplicate bug reports, but no easy fix is
forthcoming.

* [Bug target/100077] x86: by-value floating point array in struct - xmm regs spilling to stack
From: matz at gcc dot gnu.org @ 2021-04-14 13:52 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100077

--- Comment #2 from Michael Matz <matz at gcc dot gnu.org> ---
Yeah, solving this fully requires representing the parameter passing in a
better way, one that (a) can be used on the gimple side (where the code is
already generated assuming the vec3a params go into memory) and (b) survives
the gimple-to-RTL switch (or at least is used during that switch to find a
better expansion of the parameter into register loads, using shuffles in this
case).

No easy fix :-/

(Note: in normal programs such kernels should be inlined into whatever uses
the basic operations in loops, at which point this particular problem of
parameter passing artifacts simply goes away, so it's visible only in micro
tests.  It's still a problem of course)
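
A minimal sketch of the inlined case from the note above, assuming a
hypothetical caller named cross_many that applies the kernel over arrays.
The kernel body is the reporter's vec3f_cross_0, renamed vec3a_cross and
marked static inline here. Whether a given GCC version actually inlines it
and avoids the stack traffic is not verified in this sketch.

```c
#include <stddef.h>

struct vec3a { float v[3]; };
typedef struct vec3a vec3a;

/* Same arithmetic as vec3f_cross_0, declared static inline so the
 * compiler is free to inline it into callers in this translation unit. */
static inline vec3a vec3a_cross(vec3a v1, vec3a v2)
{
    vec3a dest = {{
        v1.v[1]*v2.v[2] - v1.v[2]*v2.v[1],
        v1.v[2]*v2.v[0] - v1.v[0]*v2.v[2],
        v1.v[0]*v2.v[1] - v1.v[1]*v2.v[0]
    }};
    return dest;
}

/* Hypothetical caller: computes dest[i] = a[i] x b[i] for n elements.
 * If the kernel is inlined, no by-value struct is passed across a call. */
void cross_many(vec3a *dest, const vec3a *a, const vec3a *b, size_t n)
{
    for (size_t i = 0; i < n; i++)
        dest[i] = vec3a_cross(a[i], b[i]);
}
```

Comparing the output of -O2 for cross_many against the standalone kernel
would show whether the parameter-passing artifact is gone in a given setup.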
