From: "rguenth at gcc dot gnu.org" <gcc-bugzilla@gcc.gnu.org>
To: gcc-bugs@gcc.gnu.org
Subject: [Bug target/111166] gcc unnecessarily creates vector operations for packing 32 bit integers into struct (x86_64)
Date: Mon, 28 Aug 2023 12:37:22 +0000
Message-ID: <bug-111166-4-MLHBEuDckM@http.gcc.gnu.org/bugzilla/>
In-Reply-To: <bug-111166-4@http.gcc.gnu.org/bugzilla/>

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111166

Richard Biener <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
         Depends on|                            |101926

--- Comment #4 from Richard Biener <rguenth at gcc dot gnu.org> ---
Your benchmark confirms that the vectorized variant is slower.  On a 7900X
both the memory roundtrip and the GPR->XMM moves are responsible.  perf shows

       |    turn_into_struct():
     1 |      movd       %edi,%xmm1
     3 |      movd       %esi,%xmm4
     4 |      movd       %edx,%xmm0
    95 |      movd       %ecx,%xmm3
     6 |      punpckldq  %xmm4,%xmm1
     2 |      punpckldq  %xmm3,%xmm0
     1 |      movdqa     %xmm1,%xmm2
       |      punpcklqdq %xmm0,%xmm2
     5 |      movaps     %xmm2,-0x18(%rsp)
    63 |      mov        -0x18(%rsp),%rdi
    70 |      mov        -0x10(%rsp),%rsi
    47 |      jmp        400630 <do_smth_with_4_u32>

Note that the situation is difficult to rectify.  Ideally the vectorizer
would see that we need the result as two 64-bit register pieces, but it
doesn't; it only sees that we store into memory.
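
For readers without the original testcase at hand, the pattern can be
sketched roughly as follows.  This is a hypothetical reconstruction, not
the reporter's code: the function names are taken from the disassembly
above, while the struct name, its field names and the body of the callee
are assumptions made only so that the example compiles.

```c
#include <stdint.h>

/* Assumed layout: four 32-bit fields, 16 bytes in total. */
struct four_u32 {
    uint32_t a, b, c, d;
};

/* Stand-in callee (its body is hypothetical); it records its argument
   so the call has an observable effect. */
static struct four_u32 last_seen;

__attribute__((noinline))
void do_smth_with_4_u32(struct four_u32 s)
{
    last_seen = s;
}

/* The four field initializations are adjacent 32-bit stores into a
   stack temporary, the kind of store group that can be turned into a
   single 16-byte vector store.  The by-value call then needs the same
   16 bytes as two 64-bit integer registers under the x86-64 SysV ABI. */
void turn_into_struct(uint32_t a, uint32_t b, uint32_t c, uint32_t d)
{
    struct four_u32 s = { a, b, c, d };
    do_smth_with_4_u32(s);
}
```

Compiling something of this shape with optimization and inspecting the
assembly should show whether a given compiler version assembles the
struct in an XMM register and bounces it through the stack.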

I'll note the non-vectorized code is also far from optimal.  clang
produces the following, which is faster by more than the delta by which
the vectorized version is slower than the scalar GCC variant.

turn_into_struct:                       # @turn_into_struct
        .cfi_startproc
# %bb.0:
                                        # kill: def $ecx killed $ecx def $rcx
                                        # kill: def $esi killed $esi def $rsi
        shlq    $32, %rsi
        movl    %edi, %edi
        orq     %rsi, %rdi
        shlq    $32, %rcx
        movl    %edx, %esi
        orq     %rcx, %rsi
        jmp     do_smth_with_4_u32      # TAILCALL


Referenced Bugs:

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101926
[Bug 101926] [meta-bug] struct/complex/other argument passing and return should
be improved
