public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed
From: "H.J. Lu" <hjl.tools@gmail.com>
To: gcc-patches@gcc.gnu.org
Cc: Uros Bizjak <ubizjak@gmail.com>, Jakub Jelinek <jakub@redhat.com>,
	Richard Sandiford <richard.sandiford@arm.com>,
	Richard Biener <richard.guenther@gmail.com>
Subject: [PATCH v3 0/2] x86: Convert CONST_WIDE_INT/CONST_VECTOR to broadcast
Date: Tue,  8 Jun 2021 10:59:16 -0700	[thread overview]
Message-ID: <20210608175918.61759-1-hjl.tools@gmail.com> (raw)

1. Update move expanders to convert the CONST_WIDE_INT and CONST_VECTO
operands to vector broadcast from an integer with AVX2.
2. Add ix86_gen_scratch_sse_rtx to return a scratch SSE register which
won't increase stack alignment requirement and blocks transformation by
the combine pass.
3. Update PR 87767 tests to expect integer broadcast instead of broadcast
from memory.
4. Update avx512f_cond_move.c to expect integer broadcast.
5. Update vec_duplicate to allow to fail so that backend can only allow
broadcasting an integer constant to a vector when broadcast instruction
is available.  This can be used by memset expander to avoid vec_duplicate
when loading from constant pool is more efficient.
6. Add vec_duplicate<mode> expander and enable vec_duplicate from a
non-standard SSE constant integer only if vector broadcast is available.

A small benchmark:

https://gitlab.com/x86-benchmarks/microbenchmark/-/tree/memset/broadcast

shows that broadcast is a little bit faster on Intel Core i7-8559U:

$ make
gcc -g -I. -O2   -c -o test.o test.c
gcc -g   -c -o memory.o memory.S
gcc -g   -c -o broadcast.o broadcast.S
gcc -g   -c -o vec_dup_sse2.o vec_dup_sse2.S
gcc -o test test.o memory.o broadcast.o vec_dup_sse2.o
./test
memory      : 147215
broadcast   : 121213
vec_dup_sse2: 171366
$

broadcast is also smaller:

$ size memory.o broadcast.o
   text	   data	    bss	    dec	    hex	filename
    132	      0	      0	    132	     84	memory.o
    122	      0	      0	    122	     7a	broadcast.o
$

H.J. Lu (2):
  x86: Convert CONST_WIDE_INT/CONST_VECTOR to broadcast
  x86: Add vec_duplicate<mode> expander

 gcc/config/i386/i386-expand.c                 | 213 +++++++++++++++++-
 gcc/config/i386/i386-protos.h                 |   3 +
 gcc/config/i386/i386.c                        |  31 +++
 gcc/config/i386/sse.md                        |  20 ++
 gcc/doc/md.texi                               |   2 -
 .../i386/avx512f-broadcast-pr87767-1.c        |   7 +-
 .../i386/avx512f-broadcast-pr87767-5.c        |   5 +-
 .../gcc.target/i386/avx512f_cond_move.c       |   4 +-
 .../i386/avx512vl-broadcast-pr87767-1.c       |  12 +-
 .../i386/avx512vl-broadcast-pr87767-5.c       |   9 +-
 gcc/testsuite/gcc.target/i386/pr100865-1.c    |  13 ++
 gcc/testsuite/gcc.target/i386/pr100865-10a.c  |  33 +++
 gcc/testsuite/gcc.target/i386/pr100865-10b.c  |   7 +
 gcc/testsuite/gcc.target/i386/pr100865-2.c    |  14 ++
 gcc/testsuite/gcc.target/i386/pr100865-3.c    |  15 ++
 gcc/testsuite/gcc.target/i386/pr100865-4a.c   |  16 ++
 gcc/testsuite/gcc.target/i386/pr100865-4b.c   |   9 +
 gcc/testsuite/gcc.target/i386/pr100865-5a.c   |  16 ++
 gcc/testsuite/gcc.target/i386/pr100865-5b.c   |   9 +
 gcc/testsuite/gcc.target/i386/pr100865-6a.c   |  16 ++
 gcc/testsuite/gcc.target/i386/pr100865-6b.c   |   9 +
 gcc/testsuite/gcc.target/i386/pr100865-7a.c   |  17 ++
 gcc/testsuite/gcc.target/i386/pr100865-7b.c   |   9 +
 gcc/testsuite/gcc.target/i386/pr100865-8a.c   |  24 ++
 gcc/testsuite/gcc.target/i386/pr100865-8b.c   |   7 +
 gcc/testsuite/gcc.target/i386/pr100865-9a.c   |  25 ++
 gcc/testsuite/gcc.target/i386/pr100865-9b.c   |   7 +
 27 files changed, 526 insertions(+), 26 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/i386/pr100865-1.c
 create mode 100644 gcc/testsuite/gcc.target/i386/pr100865-10a.c
 create mode 100644 gcc/testsuite/gcc.target/i386/pr100865-10b.c
 create mode 100644 gcc/testsuite/gcc.target/i386/pr100865-2.c
 create mode 100644 gcc/testsuite/gcc.target/i386/pr100865-3.c
 create mode 100644 gcc/testsuite/gcc.target/i386/pr100865-4a.c
 create mode 100644 gcc/testsuite/gcc.target/i386/pr100865-4b.c
 create mode 100644 gcc/testsuite/gcc.target/i386/pr100865-5a.c
 create mode 100644 gcc/testsuite/gcc.target/i386/pr100865-5b.c
 create mode 100644 gcc/testsuite/gcc.target/i386/pr100865-6a.c
 create mode 100644 gcc/testsuite/gcc.target/i386/pr100865-6b.c
 create mode 100644 gcc/testsuite/gcc.target/i386/pr100865-7a.c
 create mode 100644 gcc/testsuite/gcc.target/i386/pr100865-7b.c
 create mode 100644 gcc/testsuite/gcc.target/i386/pr100865-8a.c
 create mode 100644 gcc/testsuite/gcc.target/i386/pr100865-8b.c
 create mode 100644 gcc/testsuite/gcc.target/i386/pr100865-9a.c
 create mode 100644 gcc/testsuite/gcc.target/i386/pr100865-9b.c

-- 
2.31.1


             reply	other threads:[~2021-06-08 17:59 UTC|newest]

Thread overview: 7+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-06-08 17:59 H.J. Lu [this message]
2021-06-08 17:59 ` [PATCH v3 1/2] " H.J. Lu
2021-06-09  8:17   ` Hongtao Liu
2021-06-09 12:56     ` H.J. Lu
2021-06-08 17:59 ` [PATCH v3 2/2] x86: Add vec_duplicate<mode> expander H.J. Lu
2021-06-09  6:40   ` Uros Bizjak
2021-06-09 23:30     ` H.J. Lu

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20210608175918.61759-1-hjl.tools@gmail.com \
    --to=hjl.tools@gmail.com \
    --cc=gcc-patches@gcc.gnu.org \
    --cc=jakub@redhat.com \
    --cc=richard.guenther@gmail.com \
    --cc=richard.sandiford@arm.com \
    --cc=ubizjak@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).