public inbox for gcc-bugs@sourceware.org
From: "cvs-commit at gcc dot gnu.org" <gcc-bugzilla@gcc.gnu.org>
To: gcc-bugs@gcc.gnu.org
Subject: [Bug middle-end/88873] missing vectorization for decomposed operations on a vector type
Date: Fri, 04 Aug 2023 15:24:11 +0000
Message-ID: <bug-88873-4-lg0wC8So76@http.gcc.gnu.org/bugzilla/>
In-Reply-To: <bug-88873-4@http.gcc.gnu.org/bugzilla/>

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88873

--- Comment #10 from CVS Commits <cvs-commit at gcc dot gnu.org> ---
The master branch has been updated by Roger Sayle <sayle@gcc.gnu.org>:

https://gcc.gnu.org/g:faa2202ee7fcf039b2016ce5766a2927526c5f78

commit r14-2997-gfaa2202ee7fcf039b2016ce5766a2927526c5f78
Author: Roger Sayle <roger@nextmovesoftware.com>
Date:   Fri Aug 4 16:23:38 2023 +0100

    i386: Split SUBREGs of SSE vector registers into vec_select insns.

    This patch is the final piece in the series to address the ABI issues
    affecting PR 88873.  The previous patches tackled inserting DFmode
    values into V2DFmode registers, by introducing insvti_{low,high}part
    patterns.  This patch improves the extraction of DFmode values from
    V2DFmode registers via TImode intermediates.

    I'd initially thought this would require new extvti_{low,high}part
    patterns to be defined, but all that's required is to recognize that
    the SUBREG idioms produced by combine are equivalent to (forms of)
    vec_select patterns.  The target-independent middle-end can't be sure
    that the appropriate vec_select instruction exists on the target,
    hence doesn't canonicalize a SUBREG of a vector mode as a vec_select,
    but the backend can provide a define_split stating where and when
    this is useful, for example, considering whether the operand is in
    memory, or whether !TARGET_SSE_MATH and the destination is i387.

    For pr88873.c, gcc -O2 -march=cascadelake currently generates:

    foo:    vpunpcklqdq     %xmm3, %xmm2, %xmm7
            vpunpcklqdq     %xmm1, %xmm0, %xmm6
            vpunpcklqdq     %xmm5, %xmm4, %xmm2
            vmovdqa %xmm7, -24(%rsp)
            vmovdqa %xmm6, %xmm1
            movq    -16(%rsp), %rax
            vpinsrq $1, %rax, %xmm7, %xmm4
            vmovapd %xmm4, %xmm6
            vfmadd132pd     %xmm1, %xmm2, %xmm6
            vmovapd %xmm6, -24(%rsp)
            vmovsd  -16(%rsp), %xmm1
            vmovsd  -24(%rsp), %xmm0
            ret

    With this patch, we now generate:

    foo:    vpunpcklqdq     %xmm1, %xmm0, %xmm6
            vpunpcklqdq     %xmm3, %xmm2, %xmm7
            vpunpcklqdq     %xmm5, %xmm4, %xmm2
            vmovdqa %xmm6, %xmm1
            vfmadd132pd     %xmm7, %xmm2, %xmm1
            vmovsd  %xmm1, %xmm1, %xmm0
            vunpckhpd       %xmm1, %xmm1, %xmm1
            ret

    The improvement is even more dramatic when compared to the original
    29 instructions shown in comment #8.  GCC 13, for example, required
    12 transfers to/from memory.

    2023-08-04  Roger Sayle  <roger@nextmovesoftware.com>

    gcc/ChangeLog
            * config/i386/sse.md (define_split): Convert highpart:DF extract
            from V2DFmode register into a sse2_storehpd instruction.
            (define_split): Likewise, convert lowpart:DF extract from V2DF
            register into a sse2_storelpd instruction.

    gcc/testsuite/ChangeLog
            * gcc.target/i386/pr88873.c: Tweak to check for improved code.
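
The extraction described in the commit message can be illustrated at the C level. This is only a sketch, not the contents of pr88873.c or of the patch: the names s_t, fma_pairs, low_via_int128 and friends are hypothetical, and it assumes a little-endian 64-bit target (such as x86-64) with GCC's vector extensions and __int128. The *_via_int128 helpers mimic pulling a double out of a 128-bit integer view of the vector (roughly the TImode intermediate), while the *_direct helpers index the vector, which corresponds to the vec_select form.

```c
#include <string.h>

/* Two doubles packed into one 16-byte vector (V2DFmode on x86-64). */
typedef double v2df __attribute__ ((vector_size (16)));

/* Hypothetical two-double struct, passed and returned in SSE registers
   under the x86-64 psABI.  */
typedef struct { double x, y; } s_t;

/* Low element via a 128-bit integer view of the vector.
   Assumes little-endian layout: element 0 is in the low 64 bits.  */
static double low_via_int128 (v2df v)
{
  unsigned __int128 t;
  unsigned long long bits;
  double d;
  memcpy (&t, &v, sizeof t);
  bits = (unsigned long long) t;
  memcpy (&d, &bits, sizeof d);
  return d;
}

/* High element via the same 128-bit integer view.  */
static double high_via_int128 (v2df v)
{
  unsigned __int128 t;
  unsigned long long bits;
  double d;
  memcpy (&t, &v, sizeof t);
  bits = (unsigned long long) (t >> 64);
  memcpy (&d, &bits, sizeof d);
  return d;
}

/* The same extractions expressed as element selects.  */
static double low_direct (v2df v)  { return v[0]; }
static double high_direct (v2df v) { return v[1]; }

/* Element-wise a*b+c on pairs of doubles, computed on a vector and
   then decomposed back into the struct's two fields.  */
s_t fma_pairs (s_t a, s_t b, s_t c)
{
  v2df va = { a.x, a.y };
  v2df vb = { b.x, b.y };
  v2df vc = { c.x, c.y };
  v2df vd = va * vb + vc;
  s_t r = { low_direct (vd), high_direct (vd) };
  return r;
}
```

Both routes return the same values; the difference the commit message is concerned with lies only in the instructions the backend emits for them.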
