public inbox for gcc-bugs@sourceware.org
From: "linkw at gcc dot gnu.org" <gcc-bugzilla@gcc.gnu.org>
To: gcc-bugs@gcc.gnu.org
Subject: [Bug target/96933] New: inefficient code for char/short vec CTOR
Date: Fri, 04 Sep 2020 09:31:00 +0000
Message-ID: <bug-96933-4@http.gcc.gnu.org/bugzilla/>

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96933

            Bug ID: 96933
           Summary: inefficient code for char/short vec CTOR
           Product: gcc
           Version: 11.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: target
          Assignee: unassigned at gcc dot gnu.org
          Reporter: linkw at gcc dot gnu.org
  Target Milestone: ---

While investigating the vectorization cost for vec_construct, I found that
the code generated for char/short vector construction is inefficient even
with DIRECT_MOVE support.

The test case looks like:

vector unsigned char test_char(unsigned char f1, unsigned char f2,
                               unsigned char f3, unsigned char f4,
                               unsigned char f5, unsigned char f6,
                               unsigned char f7, unsigned char f8,
                               unsigned char f9, unsigned char f10,
                               unsigned char f11, unsigned char f12,
                               unsigned char f13, unsigned char f14,
                               unsigned char f15, unsigned char f16) {

  vector unsigned char v = {f1, f2,  f3,  f4,  f5,  f6,  f7,  f8,
                            f9, f10, f11, f12, f13, f14, f15, f16};
  return v;
}

The code currently generated with -mcpu=power9 stores each byte to the stack
and then reloads the whole vector:

0000000000000000 <test_char>:
   0:   e8 ff a1 fb     std     r29,-24(r1)
   4:   f0 ff c1 fb     std     r30,-16(r1)
   8:   f8 ff e1 fb     std     r31,-8(r1)
   c:   60 00 a1 8b     lbz     r29,96(r1)
  10:   68 00 c1 8b     lbz     r30,104(r1)
  14:   70 00 e1 8b     lbz     r31,112(r1)
  18:   d1 ff 81 98     stb     r4,-47(r1)
  1c:   d2 ff a1 98     stb     r5,-46(r1)
  20:   78 00 81 89     lbz     r12,120(r1)
  24:   80 00 01 88     lbz     r0,128(r1)
  28:   88 00 61 89     lbz     r11,136(r1)
  2c:   90 00 81 88     lbz     r4,144(r1)
  30:   98 00 a1 88     lbz     r5,152(r1)
  34:   d0 ff 61 98     stb     r3,-48(r1)
  38:   d3 ff c1 98     stb     r6,-45(r1)
  3c:   d4 ff e1 98     stb     r7,-44(r1)
  40:   d8 ff a1 9b     stb     r29,-40(r1)
  44:   d5 ff 01 99     stb     r8,-43(r1)
  48:   d6 ff 21 99     stb     r9,-42(r1)
  4c:   d7 ff 41 99     stb     r10,-41(r1)
  50:   d9 ff c1 9b     stb     r30,-39(r1)
  54:   da ff e1 9b     stb     r31,-38(r1)
  58:   db ff 81 99     stb     r12,-37(r1)
  5c:   dc ff 01 98     stb     r0,-36(r1)
  60:   dd ff 61 99     stb     r11,-35(r1)
  64:   de ff 81 98     stb     r4,-34(r1)
  68:   df ff a1 98     stb     r5,-33(r1)
  6c:   e8 ff a1 eb     ld      r29,-24(r1)
  70:   f0 ff c1 eb     ld      r30,-16(r1)
  74:   f8 ff e1 eb     ld      r31,-8(r1)
  78:   d9 ff 41 f4     lxv     vs34,-48(r1)
  7c:   20 00 80 4e     blr

It could be more efficient using direct moves and vector merges, for example:

   0:   67 01 43 7c     mtvsrd  vs34,r3
   4:   68 00 61 80     lwz     r3,104(r1)
   8:   60 00 61 81     lwz     r11,96(r1)
   c:   67 01 64 7c     mtvsrd  vs35,r4
  10:   70 00 81 80     lwz     r4,112(r1)
  14:   67 01 03 7d     mtvsrd  vs40,r3
  18:   78 00 61 80     lwz     r3,120(r1)
  1c:   67 01 85 7c     mtvsrd  vs36,r5
  20:   67 01 a6 7c     mtvsrd  vs37,r6
  24:   67 01 07 7c     mtvsrd  vs32,r7
  28:   67 01 28 7c     mtvsrd  vs33,r8
  2c:   67 01 24 7d     mtvsrd  vs41,r4
  30:   80 00 81 80     lwz     r4,128(r1)
  34:   0c 10 43 10     vmrghb  v2,v3,v2
  38:   67 01 63 7c     mtvsrd  vs35,r3
  3c:   88 00 61 80     lwz     r3,136(r1)
  40:   67 01 eb 7c     mtvsrd  vs39,r11
  44:   0c 20 85 10     vmrghb  v4,v5,v4
  48:   67 01 a4 7c     mtvsrd  vs37,r4
  4c:   90 00 81 80     lwz     r4,144(r1)
  50:   0c 00 01 10     vmrghb  v0,v1,v0
  54:   67 01 23 7c     mtvsrd  vs33,r3
  58:   98 00 61 80     lwz     r3,152(r1)
  5c:   67 01 c9 7c     mtvsrd  vs38,r9
  60:   0c 38 e8 10     vmrghb  v7,v8,v7
  64:   67 01 04 7d     mtvsrd  vs40,r4
  68:   0c 48 63 10     vmrghb  v3,v3,v9
  6c:   67 01 23 7d     mtvsrd  vs41,r3
  70:   0c 28 a1 10     vmrghb  v5,v1,v5
  74:   67 01 2a 7c     mtvsrd  vs33,r10
  78:   0c 40 09 11     vmrghb  v8,v9,v8
  7c:   0c 30 21 10     vmrghb  v1,v1,v6
  80:   4c 11 44 10     vmrglh  v2,v4,v2
  84:   4c 39 63 10     vmrglh  v3,v3,v7
  88:   4c 29 88 10     vmrglh  v4,v8,v5
  8c:   4c 01 a1 10     vmrglh  v5,v1,v0
  90:   8c 19 64 10     vmrglw  v3,v4,v3
  94:   8c 11 45 10     vmrglw  v2,v5,v2
  98:   57 13 43 f0     xxmrgld vs34,vs35,vs34
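
One way to read the sequence above: each of the 16 scalars is moved into its
own vector register (the 16 mtvsrd), and neighbouring registers are then merged
pairwise at byte, halfword, word and doubleword granularity (8 vmrghb,
4 vmrglh, 2 vmrglw, 1 xxmrgld), i.e. 8 + 4 + 2 + 1 = 15 merges and no
store/load round trip through the stack. The portable C model below is only a
sketch of that data flow, not of the rs6000 implementation. The names
partial_vec, direct_move, merge and build16 are hypothetical, and lane order
and endianness are deliberately ignored: a "merge" here simply concatenates
the meaningful lanes of its two inputs.

```c
#include <assert.h>
#include <string.h>

#define LANES 16

/* Hypothetical model of a partially built vector register:
   only the first `len` bytes are meaningful.  */
typedef struct {
  unsigned char b[LANES];
  int len;
} partial_vec;

/* Models a direct move (GPR -> vector register): one scalar, one lane.  */
static partial_vec direct_move(unsigned char x) {
  partial_vec v;
  memset(v.b, 0, sizeof v.b);
  v.b[0] = x;
  v.len = 1;
  return v;
}

/* Models one merge instruction: the result holds the meaningful lanes
   of `lo` followed by those of `hi`.  Real merges interleave lanes and
   depend on endianness; that detail is left out on purpose.  */
static partial_vec merge(partial_vec lo, partial_vec hi) {
  partial_vec r;
  memset(r.b, 0, sizeof r.b);
  memcpy(r.b, lo.b, (size_t)lo.len);
  memcpy(r.b + lo.len, hi.b, (size_t)hi.len);
  r.len = lo.len + hi.len;
  return r;
}

/* Builds a 16-byte vector as a binary merge tree and reports how many
   merges were needed through *merges.  */
static partial_vec build16(const unsigned char in[LANES], int *merges) {
  partial_vec v[LANES];
  int n = LANES;

  *merges = 0;
  for (int i = 0; i < LANES; i++)
    v[i] = direct_move(in[i]);

  /* Each pass halves the number of live registers:
     byte merges, then halfword, word and doubleword merges.  */
  while (n > 1) {
    for (int i = 0; i < n / 2; i++) {
      v[i] = merge(v[2 * i], v[2 * i + 1]);
      ++*merges;
    }
    n /= 2;
  }

  assert(v[0].len == LANES);
  return v[0];
}
```

Calling build16 on the 16 arguments of test_char yields the elements in
argument order after 15 merges, which matches the merge count in the proposed
sequence.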
