[Bug middle-end/88873] missing vectorization for decomposed operations on a vector type

From: "rguenth at gcc dot gnu.org" <gcc-bugzilla@gcc.gnu.org>
To: gcc-bugs@gcc.gnu.org
Subject: [Bug middle-end/88873] missing vectorization for decomposed operations on a vector type
Date: Wed, 21 Jun 2023 13:33:37 +0000
Message-ID: <bug-88873-4-2SSmbH5lQl@http.gcc.gnu.org/bugzilla/>
In-Reply-To: <bug-88873-4@http.gcc.gnu.org/bugzilla/>

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88873

Richard Biener <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |linkw at gcc dot gnu.org,
                   |                            |rguenth at gcc dot gnu.org,
                   |                            |sayle at gcc dot gnu.org,
                   |                            |vmakarov at gcc dot gnu.org
          Component|tree-optimization           |middle-end
           Keywords|                            |ra

--- Comment #4 from Richard Biener <rguenth at gcc dot gnu.org> ---
So we "like" the result for bar:

v2df bar (v2df a, v2df b, v2df c)
{
  vector(2) double vect__4.19;
  vect__4.19_19 = .FMA (b_10(D), a_11(D), c_9(D)); [tail call]
  return vect__4.19_19;
}

but foo has the usual ABI issues:

struct s_t foo (struct s_t a, struct s_t b, struct s_t c)
{
  vector(2) double vect__4.13;
  vector(2) double vect__1.12;
  vector(2) double vect__3.9; 
  vector(2) double vect__2.6;
  struct s_t D.4355;
  vect__1.12_14 = MEM <vector(2) double> [(double *)&c];
  vect__2.6_12 = MEM <vector(2) double> [(double *)&b];
  vect__3.9_13 = MEM <vector(2) double> [(double *)&a];
  vect__4.13_15 = .FMA (vect__2.6_12, vect__3.9_13, vect__1.12_14);
  MEM <vector(2) double> [(double *)&D.4355] = vect__4.13_15;
  return D.4355;
}
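
For orientation, here is a minimal sketch of source that could plausibly
produce the two dumps above. It is not the test case from the bug report,
which this comment does not quote: the field names x and y, the function
bodies, and the use of a plain multiply-add (rather than an explicit fma
call) are assumptions. With FMA enabled and floating-point contraction
allowed, GCC may turn such expressions into .FMA.

```c
/* Hypothetical reconstruction, not the test case from the PR.  */
typedef double v2df __attribute__ ((vector_size (16)));

/* Assumed layout: two doubles, 16 bytes in total.  */
struct s_t { double x, y; };

/* Aggregate version.  On x86-64 each double of each argument is passed
   in its own SSE register, so the two halves must be recombined.  */
struct s_t
foo (struct s_t a, struct s_t b, struct s_t c)
{
  struct s_t r;
  r.x = a.x * b.x + c.x;
  r.y = a.y * b.y + c.y;
  return r;
}

/* Vector version.  Each argument already arrives as one 16-byte vector.  */
v2df
bar (v2df a, v2df b, v2df c)
{
  v2df r = { a[0] * b[0] + c[0], a[1] * b[1] + c[1] };
  return r;
}
```

Both functions compute the same two lanes; only the way the arguments
and the return value are typed differs, which is where the ABI comes in.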

where the argument passing / return value handling gets us

foo:
        vmovq   %xmm3, %rax
        vmovq   %xmm0, -24(%rsp)
        vpinsrq $1, %rax, %xmm2, %xmm7
        vmovq   %xmm5, %rax
        vmovq   %xmm1, -16(%rsp)
        vmovapd %xmm7, %xmm6
        vpinsrq $1, %rax, %xmm4, %xmm2
        vmovq   %xmm4, -40(%rsp)
        vfmadd132pd     -24(%rsp), %xmm2, %xmm6
        vmovq   %xmm5, -32(%rsp)
        vmovapd %xmm6, -56(%rsp)
        vmovsd  -48(%rsp), %xmm1
        vmovsd  -56(%rsp), %xmm0
        ret

That's very weird. We also seem to clean things up halfway but fail to
eliminate, for example, the useless vmovq   %xmm5, -32(%rsp) spill.

The IBM folks who want to use SRA-style analysis at RTL expansion time
might in the end deal with this as well.

We expand to

(insn 2 21 3 2 (set (reg:DF 91)
        (reg:DF 20 xmm0 [ a ])) "t2.c":8:1 -1
     (nil))
(insn 3 2 4 2 (set (reg:DF 92)
        (reg:DF 21 xmm1 [ a+8 ])) "t2.c":8:1 -1
     (nil))
(insn 4 3 5 2 (set (reg:TI 90)
        (const_int 0 [0])) "t2.c":8:1 -1
     (nil))
(insn 5 4 6 2 (set (subreg:DF (reg:TI 90) 0)
        (reg:DF 91)) "t2.c":8:1 -1
     (nil))
(insn 6 5 7 2 (set (subreg:DF (reg:TI 90) 8)
        (reg:DF 92)) "t2.c":8:1 -1
     (nil))

So we're using TImode pseudos because the aggregate has TImode, but the
accesses should tell us that V2DFmode would be a much better choice
(or alternatively V2DImode, in case float modes are too dangerous).
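
As a source-level analogy only, the sketch below views the 16-byte
aggregate as a vector of two doubles, which is roughly what the
MEM <vector(2) double> accesses in the vectorized GIMPLE express. The
struct layout and the helper names as_v2df, as_s_t and foo_vec are
hypothetical. This illustrates the choice of view; it is not a proposed
fix and makes no claim about the code GCC generates for it.

```c
#include <string.h>

typedef double v2df __attribute__ ((vector_size (16)));

/* Assumed layout: two doubles, same size as v2df.  */
struct s_t { double x, y; };

_Static_assert (sizeof (struct s_t) == sizeof (v2df),
                "aggregate and vector must have the same size");

/* Reinterpret the aggregate's 16 bytes as a vector (hypothetical helper).  */
static inline v2df
as_v2df (struct s_t s)
{
  v2df v;
  memcpy (&v, &s, sizeof v);
  return v;
}

/* Reinterpret a vector's 16 bytes as the aggregate (hypothetical helper).  */
static inline struct s_t
as_s_t (v2df v)
{
  struct s_t s;
  memcpy (&s, &v, sizeof s);
  return s;
}

/* Same lane-wise multiply-add as foo, written on the vector view.  */
struct s_t
foo_vec (struct s_t a, struct s_t b, struct s_t c)
{
  v2df r = as_v2df (a) * as_v2df (b) + as_v2df (c);
  return as_s_t (r);
}
```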

The actual single use is then

(insn 23 20 24 2 (set (reg:V2DF 85 [ vect__4.13 ])
        (fma:V2DF (subreg:V2DF (reg/v:TI 93 [ b ]) 0)
            (subreg:V2DF (reg/v:TI 89 [ a ]) 0)
            (subreg:V2DF (reg/v:TI 97 [ c ]) 0))) "t2.c":9:18 -1
     (nil))

And of course IRA/LRA are not able to deal with this situation nicely,
possibly because of the subreg sets of the TImode pseudo, which we
do not split (well, we can't).  We could possibly use STV to handle
some of this, though (?)
