public inbox for libc-alpha@sourceware.org
* [PATCH 0/3] memset zva optimization
@ 2017-11-09  5:13 Siddhesh Poyarekar
From: Siddhesh Poyarekar @ 2017-11-09  5:13 UTC (permalink / raw)
  To: libc-alpha; +Cc: Wilco.Dijkstra, szabolcs.nagy

This patchset updates the *-walk benchmarks to walk uniformly backwards and
then adds a multiarch implementation of memset for aarch64.

Based on feedback, I have reduced the change to a single separate memset
implementation for ZVA == 64.  It roughly doubles performance for sizes of
about 256-512 bytes and gives a net improvement for all sizes larger than
256 bytes, because zva no longer has to be read on every function call.  The
net gain shrinks as sizes increase, since the impact of the zva read is
minimal for larger sizes.
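
As a rough illustration of the idea (not the code in patch 3/3), the sketch
below decodes a DCZID_EL0-style value once and then picks a memset variant
from the result, so the per-call path never has to look at zva again.  All
names here (zva_size_from_dczid, select_memset, the *_sketch functions) are
hypothetical, and both variants simply forward to the C library memset so
that the example builds on any host; the real variants are assembly.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

typedef void *(*memset_fn) (void *, int, size_t);

/* Decode a DCZID_EL0-style value.  Bit 4 set means DC ZVA is prohibited;
   the low four bits hold log2 of the block size in 4-byte words.
   Returns the block size in bytes, or 0 if DC ZVA cannot be used.  */
static unsigned
zva_size_from_dczid (uint64_t dczid)
{
  if (dczid & 16)
    return 0;
  return 4u << (dczid & 15);
}

/* Stand-in for the generic variant (hypothetical name).  */
static void *
memset_generic_sketch (void *dst, int c, size_t n)
{
  return memset (dst, c, n);
}

/* Stand-in for the ZVA == 64 variant (hypothetical name).  */
static void *
memset_zva_64_sketch (void *dst, int c, size_t n)
{
  return memset (dst, c, n);
}

/* Done once, at selection time, instead of on every memset call.  */
static memset_fn
select_memset (unsigned zva_size)
{
  return zva_size == 64 ? memset_zva_64_sketch : memset_generic_sketch;
}
```

A caller would compute zva_size_from_dczid once at startup and keep the
function pointer returned by select_memset; any block size other than 64
falls back to the generic variant.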

Siddhesh Poyarekar (3):
  benchtests: Fix walking sizes and directions for *-walk benchmarks
  benchtests: Bump start size since smaller sizes are noisy
  aarch64: Hoist ZVA check out of the memset function

 benchtests/bench-memcpy-walk.c                 | 16 ++++-----
 benchtests/bench-memmove-walk.c                | 17 ++++-----
 benchtests/bench-memset-walk.c                 |  6 ++--
 sysdeps/aarch64/memset-reg.h                   | 30 ++++++++++++++++
 sysdeps/aarch64/memset.S                       | 27 +++++---------
 sysdeps/aarch64/multiarch/Makefile             |  2 +-
 sysdeps/aarch64/multiarch/ifunc-impl-list.c    |  3 ++
 sysdeps/aarch64/multiarch/init-arch.h          |  8 +++--
 sysdeps/aarch64/multiarch/memset.c             | 41 +++++++++++++++++++++
 sysdeps/aarch64/multiarch/memset_generic.S     | 27 ++++++++++++++
 sysdeps/aarch64/multiarch/memset_zva_64.S      | 49 ++++++++++++++++++++++++++
 sysdeps/aarch64/multiarch/rtld-memset.S        | 23 ++++++++++++
 sysdeps/unix/sysv/linux/aarch64/cpu-features.c | 10 ++++++
 sysdeps/unix/sysv/linux/aarch64/cpu-features.h |  1 +
 14 files changed, 214 insertions(+), 46 deletions(-)
 create mode 100644 sysdeps/aarch64/memset-reg.h
 create mode 100644 sysdeps/aarch64/multiarch/memset.c
 create mode 100644 sysdeps/aarch64/multiarch/memset_generic.S
 create mode 100644 sysdeps/aarch64/multiarch/memset_zva_64.S
 create mode 100644 sysdeps/aarch64/multiarch/rtld-memset.S

-- 
2.7.5


Thread overview: 12+ messages
2017-11-09  5:13 [PATCH 0/3] memset zva optimization Siddhesh Poyarekar
2017-11-09  5:14 ` [PATCH 2/3] benchtests: Bump start size since smaller sizes are noisy Siddhesh Poyarekar
2017-11-14  9:19   ` Siddhesh Poyarekar
2017-11-20 12:34   ` Siddhesh Poyarekar
2017-11-09  5:14 ` [PATCH 1/3] benchtests: Fix walking sizes and directions for *-walk benchmarks Siddhesh Poyarekar
2017-11-14  9:18   ` Siddhesh Poyarekar
2017-11-20 12:34   ` Siddhesh Poyarekar
2017-11-09  5:14 ` [PATCH 3/3] aarch64: Hoist ZVA check out of the memset function Siddhesh Poyarekar
2017-11-09  5:33   ` Andrew Pinski
2017-11-09  5:45     ` Siddhesh Poyarekar
2017-11-09  5:46       ` Andrew Pinski
2017-11-09  5:59         ` Siddhesh Poyarekar
