public inbox for libc-alpha@sourceware.org
* [PATCH 0/4] LoongArch: Add ifunc support for str{cpy, rchr},
@ 2023-09-08  9:33 dengjianbo
From: dengjianbo @ 2023-09-08  9:33 UTC (permalink / raw)
  To: libc-alpha
  Cc: adhemerval.zanella, xry111, caiyinyu, xuchenghua, huangpei, dengjianbo

This patch series adds multiple versions of strcpy, stpcpy, and strrchr,
implemented with basic LoongArch instructions, LSX instructions, and
LASX instructions. Although these implementations are slower in a few
cases, the overall performance gains are significant.

See:
https://github.com/jiadengx/glibc_test/blob/main/bench/strcpy_compare.out
https://github.com/jiadengx/glibc_test/blob/main/bench/stpcpy_compare.out
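
To illustrate the idea of shipping several implementations and picking
one per CPU, here is a minimal sketch in C. It uses an ordinary function
pointer rather than the real ifunc mechanism. The cpu_features_sketch
struct and every *_sketch name are hypothetical, and the preference
order (LASX, then LSX, then unaligned, then aligned) is an assumption,
not something stated in this cover letter.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical feature flags.  glibc obtains the real ones from its
   CPU feature data, which is not modeled here.  */
struct cpu_features_sketch
{
  int has_lasx;  /* 256-bit LASX vector unit present.  */
  int has_lsx;   /* 128-bit LSX vector unit present.  */
  int has_ual;   /* Unaligned memory access supported.  */
};

typedef char *(*strcpy_fn) (char *, const char *);

/* Stand-ins for the four assembly variants.  Each one forwards to the
   C library so that the sketch runs on any machine.  */
static char *strcpy_aligned_sketch (char *d, const char *s)
{ return strcpy (d, s); }
static char *strcpy_unaligned_sketch (char *d, const char *s)
{ return strcpy (d, s); }
static char *strcpy_lsx_sketch (char *d, const char *s)
{ return strcpy (d, s); }
static char *strcpy_lasx_sketch (char *d, const char *s)
{ return strcpy (d, s); }

/* Assumed preference order: widest vector unit first, then the
   unaligned scalar version, then the aligned scalar fallback.  */
static strcpy_fn
select_strcpy (const struct cpu_features_sketch *f)
{
  assert (f != NULL);
  if (f->has_lasx)
    return strcpy_lasx_sketch;
  if (f->has_lsx)
    return strcpy_lsx_sketch;
  if (f->has_ual)
    return strcpy_unaligned_sketch;
  return strcpy_aligned_sketch;
}
```

With a real ifunc the selection would run once, at symbol resolution
time, so callers pay no per-call dispatch cost; the function pointer
above only models the choice itself.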

Test results are compared with the generic strcpy and stpcpy, not with
the strlen + memcpy combination in the benchmark.
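
For readers unfamiliar with the strlen + memcpy formulation mentioned
above, a minimal sketch of stpcpy written that way follows. The function
name is hypothetical; this is an illustration only, not the glibc code
and not the baseline used for the numbers in this letter.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Copy SRC to DST, including the terminating NUL, and return a pointer
   to that NUL in DST.  The string is traversed twice: once by strlen
   and once by memcpy.  */
static char *
stpcpy_via_strlen_memcpy (char *dst, const char *src)
{
  assert (dst != NULL && src != NULL);
  size_t len = strlen (src);
  memcpy (dst, src, len + 1);
  return dst + len;
}
```

A strcpy in the same style would perform the same two calls and return
dst instead of dst + len.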

Generic strrchr is implemented as strlen + memrchr, so each strrchr
variant is compared with a generic_strrchr built from the matching
routines:

strrchr-lasx      generic_strrchr using strlen-lasx and memrchr-lasx
strrchr-lsx       generic_strrchr using strlen-lsx and memrchr-lsx
strrchr-aligned   generic_strrchr using strlen-aligned and memrchr-generic

https://github.com/jiadengx/glibc_test/blob/main/bench/strrchr_lasx_compare.out
https://github.com/jiadengx/glibc_test/blob/main/bench/strrchr_lsx_compare.out
https://github.com/jiadengx/glibc_test/blob/main/bench/strrchr_aligned_compare.out
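
The strlen + memrchr construction described above can be sketched as
follows. Because memrchr is a GNU extension, the sketch uses a portable
stand-in for it; memrchr_sketch and strrchr_sketch are hypothetical
names, and this is not the generic_strrchr code from the benchmark.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Portable stand-in for memrchr: scan the N bytes at S backwards and
   return a pointer to the last byte equal to C, or NULL.  */
static void *
memrchr_sketch (const void *s, int c, size_t n)
{
  const unsigned char *p = (const unsigned char *) s + n;
  while (n-- > 0)
    if (*--p == (unsigned char) c)
      return (void *) p;
  return NULL;
}

/* strrchr as strlen + memrchr.  The length includes the terminating
   NUL so that searching for '\0' finds the terminator, as strrchr
   requires.  */
static char *
strrchr_sketch (const char *s, int c)
{
  assert (s != NULL);
  return memrchr_sketch (s, c, strlen (s) + 1);
}
```

This shape is why the comparison pairs each strrchr variant with the
strlen and memrchr of the same family: the baseline's speed depends
directly on those two routines.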

In the data, a positive value in parentheses means that our
implementation took less time (a performance improvement); a negative
value means that it took more time (a performance regression). The
following table summarizes the performance compared with the generic
version in the glibc microbenchmark:

name              time reduction
strcpy-aligned    10%-45%
strcpy-unaligned  10%-49%; compared with the aligned version, the
                  unaligned version performs better when src and
                  dest cannot both be 8-byte aligned
strcpy-lsx        20%-80%
strcpy-lasx       15%-86%

stpcpy-lasx       10%-87%
stpcpy-lsx        10%-80%
stpcpy-aligned    5%-45%

strrchr-lasx      10%-50%
strrchr-lsx       0%-50%
strrchr-aligned   5%-50%
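
The note on strcpy-unaligned refers to src and dest that cannot both be
8-byte aligned. One reading, sketched below, is that the two addresses
differ in their low three bits, so advancing both pointers by the same
amount never aligns both at once. The helper name can_co_align is
hypothetical and does not appear in the patches.

```c
#include <assert.h>
#include <stdint.h>

/* Return 1 if DST and SRC are congruent modulo ALIGN, meaning both can
   be brought to an ALIGN-byte boundary by the same offset; return 0
   otherwise.  ALIGN must be a power of two.  */
static int
can_co_align (const void *dst, const void *src, uintptr_t align)
{
  assert (align != 0 && (align & (align - 1)) == 0);
  return (((uintptr_t) dst ^ (uintptr_t) src) & (align - 1)) == 0;
}
```

For example, two pointers 8 bytes apart can always be co-aligned to 8
bytes, while two pointers 3 bytes apart never can; the second case is
where the table reports the unaligned version doing better.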

dengjianbo (4):
  LoongArch: Add ifunc support for strcpy{aligned, unaligned, lsx, lasx}
  LoongArch: Add ifunc support for stpcpy{aligned, lsx, lasx}
  LoongArch: Add ifunc support for strrchr{aligned, lsx, lasx}
  LoongArch: Change to put magic number to .rodata section

 sysdeps/loongarch/lp64/multiarch/Makefile     |  10 +
 .../lp64/multiarch/ifunc-impl-list.c          |  25 +++
 .../loongarch/lp64/multiarch/ifunc-stpcpy.h   |  40 ++++
 .../loongarch/lp64/multiarch/ifunc-strrchr.h  |  41 ++++
 .../loongarch/lp64/multiarch/memmove-lsx.S    |  20 +-
 .../loongarch/lp64/multiarch/stpcpy-aligned.S | 191 ++++++++++++++++
 .../loongarch/lp64/multiarch/stpcpy-lasx.S    | 208 ++++++++++++++++++
 sysdeps/loongarch/lp64/multiarch/stpcpy-lsx.S | 206 +++++++++++++++++
 sysdeps/loongarch/lp64/multiarch/stpcpy.c     |  42 ++++
 .../loongarch/lp64/multiarch/strcpy-aligned.S | 185 ++++++++++++++++
 .../loongarch/lp64/multiarch/strcpy-lasx.S    | 208 ++++++++++++++++++
 sysdeps/loongarch/lp64/multiarch/strcpy-lsx.S | 197 +++++++++++++++++
 .../lp64/multiarch/strcpy-unaligned.S         | 131 +++++++++++
 sysdeps/loongarch/lp64/multiarch/strcpy.c     |  35 +++
 .../lp64/multiarch/strrchr-aligned.S          | 170 ++++++++++++++
 .../loongarch/lp64/multiarch/strrchr-lasx.S   | 176 +++++++++++++++
 .../loongarch/lp64/multiarch/strrchr-lsx.S    | 144 ++++++++++++
 sysdeps/loongarch/lp64/multiarch/strrchr.c    |  36 +++
 18 files changed, 2055 insertions(+), 10 deletions(-)
 create mode 100644 sysdeps/loongarch/lp64/multiarch/ifunc-stpcpy.h
 create mode 100644 sysdeps/loongarch/lp64/multiarch/ifunc-strrchr.h
 create mode 100644 sysdeps/loongarch/lp64/multiarch/stpcpy-aligned.S
 create mode 100644 sysdeps/loongarch/lp64/multiarch/stpcpy-lasx.S
 create mode 100644 sysdeps/loongarch/lp64/multiarch/stpcpy-lsx.S
 create mode 100644 sysdeps/loongarch/lp64/multiarch/stpcpy.c
 create mode 100644 sysdeps/loongarch/lp64/multiarch/strcpy-aligned.S
 create mode 100644 sysdeps/loongarch/lp64/multiarch/strcpy-lasx.S
 create mode 100644 sysdeps/loongarch/lp64/multiarch/strcpy-lsx.S
 create mode 100644 sysdeps/loongarch/lp64/multiarch/strcpy-unaligned.S
 create mode 100644 sysdeps/loongarch/lp64/multiarch/strcpy.c
 create mode 100644 sysdeps/loongarch/lp64/multiarch/strrchr-aligned.S
 create mode 100644 sysdeps/loongarch/lp64/multiarch/strrchr-lasx.S
 create mode 100644 sysdeps/loongarch/lp64/multiarch/strrchr-lsx.S
 create mode 100644 sysdeps/loongarch/lp64/multiarch/strrchr.c

-- 
2.40.0



Thread overview: 8+ messages
2023-09-08  9:33 [PATCH 0/4] LoongArch: Add ifunc support for str{cpy, rchr}, dengjianbo
2023-09-08  9:33 ` [PATCH 1/4] LoongArch: Add ifunc support for strcpy{aligned, unaligned, lsx, lasx} dengjianbo
2023-09-08 14:22   ` Xi Ruoyao
2023-09-11  9:53     ` dengjianbo
2023-09-13  7:47       ` dengjianbo
2023-09-08  9:33 ` [PATCH 2/4] LoongArch: Add ifunc support for stpcpy{aligned, " dengjianbo
2023-09-08  9:33 ` [PATCH 3/4] LoongArch: Add ifunc support for strrchr{aligned, " dengjianbo
2023-09-08  9:33 ` [PATCH 4/4] LoongArch: Change to put magic number to .rodata section dengjianbo
