public inbox for gcc-patches@gcc.gnu.org
* [PATCH 0/3] x86: Update memcpy/memset inline strategies
From: H.J. Lu @ 2021-03-22 13:16 UTC
  To: gcc-patches; +Cc: Jan Hubicka, Uros Bizjak, Hongtao Liu, Hongyu Wang

Simplify memcpy and memset inline strategies to avoid branches:

1. With MOVE_RATIO and CLEAR_RATIO == 17, GCC will use integer/vector
   load and store for up to 16 * 16 (256) bytes when the data size is
   fixed and known.
2. Inline only if data size is known to be <= 256.
   a. Use "rep movsb/stosb" with simple code sequence if the data size
      is a constant.
   b. Use loop if data size is not a constant.
3. Use memcpy/memset library function if data size is unknown or > 256.
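
As a rough illustration only (this is not code from the patches), the
three rules above can be read as a small decision function.  All names
here (copy_strategy, choose_strategy, INLINE_LIMIT, RATIO) are
hypothetical and are not GCC identifiers.  The sketch assumes that the
by-pieces expansion of item 1 is taken when the number of moves needed
is below the ratio of 17, and that "rep movsb/stosb" covers the
remaining constant sizes; the cover letter does not spell out how the
two split constant sizes, so treat that part as an assumption.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical names for illustration; none of these exist in GCC.  */
enum copy_strategy
{
  STRATEGY_BY_PIECES,	/* Item 1: integer/vector loads and stores.  */
  STRATEGY_REP_BYTE,	/* Item 2a: rep movsb / rep stosb.  */
  STRATEGY_LOOP,	/* Item 2b: inline loop.  */
  STRATEGY_LIBCALL	/* Item 3: call the memcpy/memset library function.  */
};

#define INLINE_LIMIT 256	/* 16 moves * 16 bytes per move.  */
#define RATIO 17		/* Stands in for MOVE_RATIO/CLEAR_RATIO.  */

/* SIZE_IS_CONSTANT: the size is a compile-time constant, held in
   SIZE_OR_BOUND.  BOUND_IS_KNOWN: the size is not a constant, but an
   upper bound is known and held in SIZE_OR_BOUND.  PIECE_COUNT: how
   many moves a by-pieces expansion would need (only meaningful for
   constant sizes; supplied by the caller in this sketch).  */
enum copy_strategy
choose_strategy (bool size_is_constant, bool bound_is_known,
		 size_t size_or_bound, unsigned int piece_count)
{
  /* Item 3: nothing is known about the size.  */
  if (!size_is_constant && !bound_is_known)
    return STRATEGY_LIBCALL;

  /* Item 3: the size or its bound exceeds the inline limit.  */
  if (size_or_bound > INLINE_LIMIT)
    return STRATEGY_LIBCALL;

  if (size_is_constant)
    /* Item 1 if few enough moves are needed (assumed), else item 2a.  */
    return piece_count < RATIO ? STRATEGY_BY_PIECES : STRATEGY_REP_BYTE;

  /* Item 2b: size is not a constant but is known to be <= 256.  */
  return STRATEGY_LOOP;
}
```

For example, a constant 256-byte copy needing 16 moves of 16 bytes
gives STRATEGY_BY_PIECES under these assumptions, while a size with no
known bound gives STRATEGY_LIBCALL.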

There is no significant performance impact on SPEC CPU 2017.  There
are visible performance improvements on EEMBC benchmarks, with one
regression.

H.J. Lu (3):
  x86: Update memcpy/memset inline strategies for Ice Lake
  x86: Update memcpy/memset inline strategies for Skylake family CPUs
  x86: Update memcpy/memset inline strategies for -mtune=generic

 gcc/config/i386/i386-expand.c                 |  11 +-
 gcc/config/i386/i386-options.c                |  12 +-
 gcc/config/i386/i386.h                        |   2 +
 gcc/config/i386/x86-tune-costs.h              | 185 ++++++++++++++++--
 gcc/config/i386/x86-tune.def                  |   6 +
 .../gcc.target/i386/memcpy-strategy-10.c      |  11 ++
 .../gcc.target/i386/memcpy-strategy-11.c      |  18 ++
 .../gcc.target/i386/memcpy-strategy-12.c      |   9 +
 .../gcc.target/i386/memcpy-strategy-13.c      |  11 ++
 .../gcc.target/i386/memcpy-strategy-5.c       |  11 ++
 .../gcc.target/i386/memcpy-strategy-6.c       |  18 ++
 .../gcc.target/i386/memcpy-strategy-7.c       |   9 +
 .../gcc.target/i386/memcpy-strategy-8.c       |  18 ++
 .../gcc.target/i386/memcpy-strategy-9.c       |   9 +
 .../gcc.target/i386/memset-strategy-10.c      |  11 ++
 .../gcc.target/i386/memset-strategy-11.c      |   9 +
 .../gcc.target/i386/memset-strategy-3.c       |  17 ++
 .../gcc.target/i386/memset-strategy-4.c       |  17 ++
 .../gcc.target/i386/memset-strategy-5.c       |  11 ++
 .../gcc.target/i386/memset-strategy-6.c       |   9 +
 .../gcc.target/i386/memset-strategy-7.c       |  11 ++
 .../gcc.target/i386/memset-strategy-8.c       |   9 +
 .../gcc.target/i386/memset-strategy-9.c       |  17 ++
 gcc/testsuite/gcc.target/i386/shrink_wrap_1.c |   2 +-
 gcc/testsuite/gcc.target/i386/sw-1.c          |   2 +-
 25 files changed, 413 insertions(+), 32 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/i386/memcpy-strategy-10.c
 create mode 100644 gcc/testsuite/gcc.target/i386/memcpy-strategy-11.c
 create mode 100644 gcc/testsuite/gcc.target/i386/memcpy-strategy-12.c
 create mode 100644 gcc/testsuite/gcc.target/i386/memcpy-strategy-13.c
 create mode 100644 gcc/testsuite/gcc.target/i386/memcpy-strategy-5.c
 create mode 100644 gcc/testsuite/gcc.target/i386/memcpy-strategy-6.c
 create mode 100644 gcc/testsuite/gcc.target/i386/memcpy-strategy-7.c
 create mode 100644 gcc/testsuite/gcc.target/i386/memcpy-strategy-8.c
 create mode 100644 gcc/testsuite/gcc.target/i386/memcpy-strategy-9.c
 create mode 100644 gcc/testsuite/gcc.target/i386/memset-strategy-10.c
 create mode 100644 gcc/testsuite/gcc.target/i386/memset-strategy-11.c
 create mode 100644 gcc/testsuite/gcc.target/i386/memset-strategy-3.c
 create mode 100644 gcc/testsuite/gcc.target/i386/memset-strategy-4.c
 create mode 100644 gcc/testsuite/gcc.target/i386/memset-strategy-5.c
 create mode 100644 gcc/testsuite/gcc.target/i386/memset-strategy-6.c
 create mode 100644 gcc/testsuite/gcc.target/i386/memset-strategy-7.c
 create mode 100644 gcc/testsuite/gcc.target/i386/memset-strategy-8.c
 create mode 100644 gcc/testsuite/gcc.target/i386/memset-strategy-9.c

-- 
2.30.2



Thread overview: 31+ messages
2021-03-22 13:16 [PATCH 0/3] x86: Update memcpy/memset inline strategies H.J. Lu
2021-03-22 13:16 ` [PATCH 1/3] x86: Update memcpy/memset inline strategies for Ice Lake H.J. Lu
2021-03-22 14:10   ` Jan Hubicka
2021-03-22 23:57     ` [PATCH v2 " H.J. Lu
2021-03-29 13:43       ` H.J. Lu
2021-03-31  6:59       ` Richard Biener
2021-03-31  8:05       ` Jan Hubicka
2021-03-31 13:09         ` H.J. Lu
2021-03-31 13:40           ` Jan Hubicka
2021-03-31 13:47             ` Jan Hubicka
2021-03-31 15:41               ` H.J. Lu
2021-03-31 17:43                 ` Jan Hubicka
2021-03-31 17:54                   ` H.J. Lu
2021-04-01  5:57                     ` Hongyu Wang
2021-03-22 13:16 ` [PATCH 2/3] x86: Update memcpy/memset inline strategies for Skylake family CPUs H.J. Lu
2021-04-05 13:45   ` H.J. Lu
2021-04-05 21:14     ` Jan Hubicka
2021-04-05 21:53       ` H.J. Lu
2021-04-06  9:09         ` Hongyu Wang
2021-04-06  9:51           ` Jan Hubicka
2021-04-06 12:34             ` H.J. Lu
2021-03-22 13:16 ` [PATCH 3/3] x86: Update memcpy/memset inline strategies for -mtune=generic H.J. Lu
2021-03-22 13:29   ` Richard Biener
2021-03-22 13:38     ` H.J. Lu
2021-03-23  2:41       ` Hongyu Wang
2021-03-23  8:19         ` Richard Biener
2021-08-22 15:28           ` PING [PATCH] " H.J. Lu
2021-09-08  3:01             ` PING^2 " H.J. Lu
2021-09-13 13:38               ` H.J. Lu
2021-09-20 17:06                 ` PING^3 " H.J. Lu
2021-10-01 15:24                   ` PING^4 " H.J. Lu
