public inbox for gcc-bugs@sourceware.org
help / color / mirror / Atom feed
* [Bug c++/105260] New: Union with user-defined empty destructor leads to worse code-gen
@ 2022-04-13  9:39 m.cencora at gmail dot com
  2022-04-13 12:10 ` [Bug c++/105260] " rguenth at gcc dot gnu.org
                   ` (7 more replies)
  0 siblings, 8 replies; 9+ messages in thread
From: m.cencora at gmail dot com @ 2022-04-13  9:39 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105260

            Bug ID: 105260
           Summary: Union with user-defined empty destructor leads to
                    worse code-gen
           Product: gcc
           Version: 12.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: c++
          Assignee: unassigned at gcc dot gnu.org
          Reporter: m.cencora at gmail dot com
  Target Milestone: ---

Following code leads to unnecessary stack spill of deserialized Foo variable:
g++ -std=c++17 -O2

#include <new>

inline unsigned deserializeUInt(const unsigned char* &in)
{
    unsigned out;
    __builtin_memcpy(&out, in, sizeof(out));
    in += sizeof(out);
    out = __builtin_bswap32(out);
    return out;
}

struct Foo
{
    unsigned a;
    unsigned b;

    static Foo deserialize(const unsigned char* &in)
    {
        return Foo{
            deserializeUInt(in),
            deserializeUInt(in)
        };
    }
};

struct Result
{
    unsigned idx;
    union
    {
        unsigned a;
        const void* ptr;
    };
};

Result dummyFunc(Foo);

void deserializeAndInvoke(const unsigned char* it)
{
#ifndef WORKAROUND
    union NoDestroy
    {
        ~NoDestroy() {}
        Foo value;
    };

    NoDestroy un{ 
        Foo::deserialize(it)
    };
    auto& arg = un.value;

#elif WORKAROUND == 1
    union NoDestroy
    {
        Foo value;
    };
    NoDestroy un{ 
        Foo::deserialize(it)
    };
    auto& arg = un.value;

#elif WORKAROUND == 2  
    alignas(Foo) char rawStorage[sizeof(Foo)];

    auto& arg = *new (&rawStorage[0]) Foo{
        Foo::deserialize(it)
    };
#endif

    dummyFunc(arg);
}

deserializeAndInvoke(unsigned char const*):
        mov     edx, DWORD PTR [rdi]
        mov     eax, DWORD PTR [rdi+4]
        bswap   edx
        bswap   eax
        mov     DWORD PTR [rsp-16], edx
        mov     DWORD PTR [rsp-12], eax
        mov     rdi, QWORD PTR [rsp-16]
        jmp     dummyFunc(Foo)

The spill can be avoided, when we remove user-defined destructor of NoDestroy
union, or
when we construct Foo via placement new in raw buffer.

Generated code with either of the workarounds:
g++ -std=c++17 -O2 -DWORKAROUND=1
or
g++ -std=c++17 -O2 -DWORKAROUND=2

deserializeAndInvoke(unsigned char const*):
        mov     rax, rdi
        mov     edi, DWORD PTR [rdi]
        mov     eax, DWORD PTR [rax+4]
        bswap   edi
        mov     edi, edi
        bswap   eax
        sal     rax, 32
        or      rdi, rax
        jmp     dummyFunc(Foo)

^ permalink raw reply	[flat|nested] 9+ messages in thread

* [Bug c++/105260] Union with user-defined empty destructor leads to worse code-gen
  2022-04-13  9:39 [Bug c++/105260] New: Union with user-defined empty destructor leads to worse code-gen m.cencora at gmail dot com
@ 2022-04-13 12:10 ` rguenth at gcc dot gnu.org
  2022-04-13 12:13 ` rguenth at gcc dot gnu.org
                   ` (6 subsequent siblings)
  7 siblings, 0 replies; 9+ messages in thread
From: rguenth at gcc dot gnu.org @ 2022-04-13 12:10 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105260

--- Comment #1 from Richard Biener <rguenth at gcc dot gnu.org> ---
Well, it simply looks like the calling conventions for dummyFunc are different
with the ABI in effect depending on the PODness(?), and the code is as
optimized as it can be.

^ permalink raw reply	[flat|nested] 9+ messages in thread

* [Bug c++/105260] Union with user-defined empty destructor leads to worse code-gen
  2022-04-13  9:39 [Bug c++/105260] New: Union with user-defined empty destructor leads to worse code-gen m.cencora at gmail dot com
  2022-04-13 12:10 ` [Bug c++/105260] " rguenth at gcc dot gnu.org
@ 2022-04-13 12:13 ` rguenth at gcc dot gnu.org
  2022-04-13 12:15 ` rguenth at gcc dot gnu.org
                   ` (5 subsequent siblings)
  7 siblings, 0 replies; 9+ messages in thread
From: rguenth at gcc dot gnu.org @ 2022-04-13 12:13 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105260

Richard Biener <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Keywords|                            |ABI, wrong-code

--- Comment #2 from Richard Biener <rguenth at gcc dot gnu.org> ---
OTOH NoDestroy is not really used but affects 'Foo'?

^ permalink raw reply	[flat|nested] 9+ messages in thread

* [Bug c++/105260] Union with user-defined empty destructor leads to worse code-gen
  2022-04-13  9:39 [Bug c++/105260] New: Union with user-defined empty destructor leads to worse code-gen m.cencora at gmail dot com
  2022-04-13 12:10 ` [Bug c++/105260] " rguenth at gcc dot gnu.org
  2022-04-13 12:13 ` rguenth at gcc dot gnu.org
@ 2022-04-13 12:15 ` rguenth at gcc dot gnu.org
  2022-04-13 12:16 ` m.cencora at gmail dot com
                   ` (4 subsequent siblings)
  7 siblings, 0 replies; 9+ messages in thread
From: rguenth at gcc dot gnu.org @ 2022-04-13 12:15 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105260

--- Comment #3 from Richard Biener <rguenth at gcc dot gnu.org> ---
That said, 'auto& arg' might just hide the interesting bit but my C++ fu is too
weak to see how un.value would differ.

^ permalink raw reply	[flat|nested] 9+ messages in thread

* [Bug c++/105260] Union with user-defined empty destructor leads to worse code-gen
  2022-04-13  9:39 [Bug c++/105260] New: Union with user-defined empty destructor leads to worse code-gen m.cencora at gmail dot com
                   ` (2 preceding siblings ...)
  2022-04-13 12:15 ` rguenth at gcc dot gnu.org
@ 2022-04-13 12:16 ` m.cencora at gmail dot com
  2022-04-13 12:19 ` m.cencora at gmail dot com
                   ` (3 subsequent siblings)
  7 siblings, 0 replies; 9+ messages in thread
From: m.cencora at gmail dot com @ 2022-04-13 12:16 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105260

--- Comment #4 from m.cencora at gmail dot com ---
I dont think ABI is an issue here. The Foo variable is spilled into stack, and
then reloaded back into RDI register before invoking dummyFunc.
Also clang generates optimal code as can be seen here:
https://godbolt.org/z/K6jWY9h3b

^ permalink raw reply	[flat|nested] 9+ messages in thread

* [Bug c++/105260] Union with user-defined empty destructor leads to worse code-gen
  2022-04-13  9:39 [Bug c++/105260] New: Union with user-defined empty destructor leads to worse code-gen m.cencora at gmail dot com
                   ` (3 preceding siblings ...)
  2022-04-13 12:16 ` m.cencora at gmail dot com
@ 2022-04-13 12:19 ` m.cencora at gmail dot com
  2022-04-13 12:26 ` m.cencora at gmail dot com
                   ` (2 subsequent siblings)
  7 siblings, 0 replies; 9+ messages in thread
From: m.cencora at gmail dot com @ 2022-04-13 12:19 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105260

--- Comment #5 from m.cencora at gmail dot com ---
I've slighlty refactored the code, to remove the auto variables. This issue
remains

#include <new>

inline unsigned deserializeUInt(const unsigned char* &in)
{
    unsigned out;
    __builtin_memcpy(&out, in, sizeof(out));
    in += sizeof(out);
    out = __builtin_bswap32(out);
    return out;
}

struct Foo
{
    unsigned a;
    unsigned b;

    static Foo deserialize(const unsigned char* &in)
    {
        return Foo{
            deserializeUInt(in),
            deserializeUInt(in)
        };
    }
};

struct Result
{
    unsigned idx;
    union
    {
        unsigned a;
        const void* ptr;
    };
};

Result dummyFunc(Foo);

void deserializeAndInvoke(const unsigned char* it)
{
#ifndef WORKAROUND
    union NoDestroy
    {
        ~NoDestroy() {}
        Foo value;
    };

    NoDestroy un{ 
        Foo::deserialize(it)
    };
    dummyFunc(un.value);

#elif WORKAROUND == 1
    union NoDestroy
    {
        Foo value;
    };
    NoDestroy un{ 
        Foo::deserialize(it)
    };
    dummyFunc(un.value);

#elif WORKAROUND == 2  
    alignas(Foo) char rawStorage[sizeof(Foo)];

    Foo* foo = new (&rawStorage[0]) Foo{
        Foo::deserialize(it)
    };
    dummyFunc(*foo);
#endif

}

^ permalink raw reply	[flat|nested] 9+ messages in thread

* [Bug c++/105260] Union with user-defined empty destructor leads to worse code-gen
  2022-04-13  9:39 [Bug c++/105260] New: Union with user-defined empty destructor leads to worse code-gen m.cencora at gmail dot com
                   ` (4 preceding siblings ...)
  2022-04-13 12:19 ` m.cencora at gmail dot com
@ 2022-04-13 12:26 ` m.cencora at gmail dot com
  2022-04-13 13:12 ` jakub at gcc dot gnu.org
  2022-04-13 16:08 ` jamborm at gcc dot gnu.org
  7 siblings, 0 replies; 9+ messages in thread
From: m.cencora at gmail dot com @ 2022-04-13 12:26 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105260

--- Comment #6 from m.cencora at gmail dot com ---
Furthermore in all the scenarios the same function is called, with same
arguments, so the calling convention/ABI is same.

^ permalink raw reply	[flat|nested] 9+ messages in thread

* [Bug c++/105260] Union with user-defined empty destructor leads to worse code-gen
  2022-04-13  9:39 [Bug c++/105260] New: Union with user-defined empty destructor leads to worse code-gen m.cencora at gmail dot com
                   ` (5 preceding siblings ...)
  2022-04-13 12:26 ` m.cencora at gmail dot com
@ 2022-04-13 13:12 ` jakub at gcc dot gnu.org
  2022-04-13 16:08 ` jamborm at gcc dot gnu.org
  7 siblings, 0 replies; 9+ messages in thread
From: jakub at gcc dot gnu.org @ 2022-04-13 13:12 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105260

Jakub Jelinek <jakub at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |jakub at gcc dot gnu.org,
                   |                            |jamborm at gcc dot gnu.org
           Keywords|ABI, wrong-code             |

--- Comment #7 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
I don't see any ABI differences, nor wrong-code.
If you look at *.optimized dump differences, there are pretty much none, except
for one extra CLOBBER.
But there are 2 differences not visible in the dump.
One is that while neither un variable is TREE_ADDRESSABLE, the no WORKAROUND
one has TREE_ADDRESSABLE type (as it has non-trivial destructor).
The other difference that matters for this testcase is that TYPE_MODE of the
NoDestroy union type is BLKmode, while for WORKAROUND=1 case it is DImode.

This is both done in:
  /* If this type has a copy constructor or a destructor, force its
     mode to be BLKmode, and force its TREE_ADDRESSABLE bit to be
     nonzero.  This will cause it to be passed by invisible reference
     and prevent it from being returned in a register.  */
  if (type_has_nontrivial_copy_init (t)
      || TYPE_HAS_NONTRIVIAL_DESTRUCTOR (t))
    {
      tree variants;
      SET_DECL_MODE (TYPE_MAIN_DECL (t), BLKmode);
      for (variants = t; variants; variants = TYPE_NEXT_VARIANT (variants))
        {
          SET_TYPE_MODE (variants, BLKmode);
          TREE_ADDRESSABLE (variants) = 1;
        }
    }
Now, for the ABI passing the above is really required, such types are indeed
passed differently from ones that have trivialy copy ctor and dtor, but for
this testcase it is just an optimization decision (use_register_for_decl).
We do there:
  /* Only register-like things go in registers.  */
  if (DECL_MODE (decl) == BLKmode)
    return false;
Guess we'd need to recompute mode for non-TREE_ADDRESSABLE vars before
expansion.
Or find out why SRA doesn't optimize this (remove the useless union, replace
all the un.value occurrences with a var with Foo type.

^ permalink raw reply	[flat|nested] 9+ messages in thread

* [Bug c++/105260] Union with user-defined empty destructor leads to worse code-gen
  2022-04-13  9:39 [Bug c++/105260] New: Union with user-defined empty destructor leads to worse code-gen m.cencora at gmail dot com
                   ` (6 preceding siblings ...)
  2022-04-13 13:12 ` jakub at gcc dot gnu.org
@ 2022-04-13 16:08 ` jamborm at gcc dot gnu.org
  7 siblings, 0 replies; 9+ messages in thread
From: jamborm at gcc dot gnu.org @ 2022-04-13 16:08 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105260

--- Comment #8 from Martin Jambor <jamborm at gcc dot gnu.org> ---
(In reply to Jakub Jelinek from comment #7)
> Or find out why SRA doesn't optimize this (remove the useless union, replace
> all the un.value occurrences with a var with Foo type.

IIUC, it just isn't profitable, SRA sees:

  un.value = D.2545;
  dummyFunc (MEM[(const struct Foo &)&un]);
  un ={v} {CLOBBER(eol)};

and it just cannot avoid ultimately setting up the structure and pass
it to dummyFunc (IPA-SRA does not control that function and so cannot
change it).

^ permalink raw reply	[flat|nested] 9+ messages in thread

end of thread, other threads:[~2022-04-13 16:08 UTC | newest]

Thread overview: 9+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-04-13  9:39 [Bug c++/105260] New: Union with user-defined empty destructor leads to worse code-gen m.cencora at gmail dot com
2022-04-13 12:10 ` [Bug c++/105260] " rguenth at gcc dot gnu.org
2022-04-13 12:13 ` rguenth at gcc dot gnu.org
2022-04-13 12:15 ` rguenth at gcc dot gnu.org
2022-04-13 12:16 ` m.cencora at gmail dot com
2022-04-13 12:19 ` m.cencora at gmail dot com
2022-04-13 12:26 ` m.cencora at gmail dot com
2022-04-13 13:12 ` jakub at gcc dot gnu.org
2022-04-13 16:08 ` jamborm at gcc dot gnu.org

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).