public inbox for gcc-bugs@sourceware.org
From: "peter at cordes dot ca" <gcc-bugzilla@gcc.gnu.org>
To: gcc-bugs@gcc.gnu.org
Subject: [Bug tree-optimization/102494] Failure to optimize vector reduction properly especially when using OpenMP
Date: Mon, 25 Oct 2021 21:44:09 +0000
Message-ID: <bug-102494-4-hdA7DQ6Jy2@http.gcc.gnu.org/bugzilla/>
In-Reply-To: <bug-102494-4@http.gcc.gnu.org/bugzilla/>

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102494

Peter Cordes <peter at cordes dot ca> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |peter at cordes dot ca

--- Comment #10 from Peter Cordes <peter at cordes dot ca> ---
Current trunk with -fopenmp is still not good: https://godbolt.org/z/b3jjhcvTa
It still does two separate sign extensions, then two 8-byte stores followed by
a wider 16-byte reload (a store-forwarding stall):

-O3 -march=skylake -fopenmp
simde_vaddlv_s8:
        push    rbp
        vpmovsxbw       xmm2, xmm0
        vpsrlq  xmm0, xmm0, 32
        mov     rbp, rsp
        vpmovsxbw       xmm3, xmm0
        and     rsp, -32
        vmovq   QWORD PTR [rsp-16], xmm2
        vmovq   QWORD PTR [rsp-8], xmm3
        vmovdqa xmm4, XMMWORD PTR [rsp-16]
   ... then asm using byte-shifts

That includes sequences like
   movdqa  xmm1, xmm0
   psrldq  xmm1, 4

instead of a single pshufd, which is an option because garbage in the high
elements can be ignored.

And with -fopenmp, ARM64 goes scalar.

----

Current trunk *without* -fopenmp produces decent asm
https://godbolt.org/z/h1KEKPTW9

For ARM64, GCC has been making good asm since 10.x (vs. scalar in 9.3):
simde_vaddlv_s8:
        sxtl    v0.8h, v0.8b
        addv    h0, v0.8h
        umov    w0, v0.h[0]
        ret

x86-64 gcc  -O3 -march=skylake
simde_vaddlv_s8:
        vpmovsxbw       xmm1, xmm0
        vpsrlq  xmm0, xmm0, 32
        vpmovsxbw       xmm0, xmm0
        vpaddw  xmm0, xmm1, xmm0
        vpsrlq  xmm1, xmm0, 32
        vpaddw  xmm0, xmm0, xmm1
        vpsrlq  xmm1, xmm0, 16
        vpaddw  xmm0, xmm0, xmm1
        vpextrw eax, xmm0, 0
        ret
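
To make the data flow of that sequence easier to follow, here is a rough
lane-by-lane model in portable C. It is only a sketch: the function names are
made up for illustration, the scalar reference is an assumption about what
simde_vaddlv_s8 computes (a widening sum of eight int8 lanes into an int16),
and only the low four 16-bit lanes are modelled because the upper lanes hold
values that never reach the result.

```c
#include <stdint.h>

/* Hypothetical scalar reference: widen each int8 lane to int16 and sum. */
int16_t hsum_s8_reference(const int8_t v[8])
{
    int16_t sum = 0;
    for (int i = 0; i < 8; i++)
        sum = (int16_t)(sum + v[i]);
    return sum;
}

/* Hypothetical model of the vpmovsxbw / vpsrlq / vpaddw steps.
 * Unsigned 16-bit lanes give the same wrap-around as VPADDW. */
int16_t hsum_s8_shift_add_model(const int8_t v[8])
{
    uint16_t w[4];

    /* vpmovsxbw on bytes 0..7, vpsrlq 32 + vpmovsxbw on bytes 4..7,
     * then vpaddw: lane i holds v[i] + v[i+4] for i = 0..3. */
    for (int i = 0; i < 4; i++)
        w[i] = (uint16_t)((uint16_t)(int16_t)v[i] + (uint16_t)(int16_t)v[i + 4]);

    /* vpsrlq 32 + vpaddw: lanes 2 and 3 are added onto lanes 0 and 1. */
    w[0] = (uint16_t)(w[0] + w[2]);
    w[1] = (uint16_t)(w[1] + w[3]);

    /* vpsrlq 16 + vpaddw: lane 1 is added onto lane 0. */
    w[0] = (uint16_t)(w[0] + w[1]);

    /* vpextrw eax, xmm0, 0: the result is lane 0. */
    return (int16_t)w[0];
}
```

The sum of eight int8 values always fits in int16 (range -1024 to 1016), so
the final conversion back to a signed value does not lose anything.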


That's pretty good, but  VMOVD eax, xmm0  would be more efficient than  VPEXTRW
when garbage in the upper bits of EAX is acceptable (as it is here, because the
16-bit result is just a return value).  VPEXTRW zero-extends into RAX, so it's
not directly helpful if we need to sign-extend to 32 or 64-bit for some reason;
we'd still need a scalar movsx.

Or with BMI2, go scalar before the last shift / VPADDW step, e.g.
  ...
  vmovd  eax, xmm0
  rorx   edx, eax, 16
  add    eax, edx
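
The scalar finish above can be modelled in C as follows. This is a sketch under
assumptions: rotr32 and finish_hsum_scalar are hypothetical names, and low32
stands for the 32 bits VMOVD would leave in EAX, i.e. the two remaining 16-bit
partial sums.

```c
#include <stdint.h>

/* Rotate a 32-bit value right by n bits (0 < n < 32); stands in for RORX. */
static uint32_t rotr32(uint32_t x, unsigned n)
{
    return (x >> n) | (x << (32 - n));
}

/* Hypothetical helper: low32 holds two 16-bit partial sums, low half and
 * high half. Returns their 16-bit sum. */
int16_t finish_hsum_scalar(uint32_t low32)
{
    uint32_t swapped = rotr32(low32, 16);  /* rorx edx, eax, 16 */
    uint32_t total = low32 + swapped;      /* add  eax, edx     */

    /* Only the low 16 bits are the result; the upper 16 bits are garbage
     * (the same sum plus a possible carry from the low half). */
    return (int16_t)(uint16_t)total;
}
```

Because RORX writes a separate destination register, the copy and the shift
happen in one instruction, and the final add needs no extra mov.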
