public inbox for libc-alpha@sourceware.org
* [PATCH 0/2] LoongArch: Add ifunc support for strchr{nul},
@ 2023-08-15  9:43 dengjianbo
  2023-08-15  9:43 ` [PATCH 1/2] Loongarch: Add ifunc support for strchr{aligned, lsx, lasx} and strchrnul{aligned, lsx, lasx} dengjianbo
  2023-08-15  9:43 ` [PATCH 2/2] Loongarch: Add ifunc support for memcpy{aligned, unaligned, lsx, lasx} and memmove{aligned, unaligned, " dengjianbo
  0 siblings, 2 replies; 3+ messages in thread
From: dengjianbo @ 2023-08-15  9:43 UTC
  To: libc-alpha
  Cc: adhemerval.zanella, xry111, caiyinyu, xuchenghua, huangpei, dengjianbo

Although our implementations of strchr, strchrnul, memcpy and memmove
are slower than the generic versions in a few cases, the overall
performance gains are significant.

See:
https://github.com/jiadengx/glibc_test/blob/main/bench/strchr_compare.out
https://github.com/jiadengx/glibc_test/blob/main/bench/strchrnul_compare.out
https://github.com/jiadengx/glibc_test/blob/main/bench/memcpy_compare.out
https://github.com/jiadengx/glibc_test/blob/main/bench/memmove_compare.out

In the data, a positive value in parentheses means that our
implementation took less time (a performance improvement); a negative
value means that it took more time (a performance regression).
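
To make the sign convention concrete, here is a minimal sketch of how
such a percentage could be computed. The formula is an assumption
chosen to match the convention described above, not a copy of what
compare_strings.py does, and the function name and timings are
hypothetical.

```python
def runtime_reduction_pct(base_time: float, new_time: float) -> float:
    """Percentage of the baseline runtime saved by the new implementation.

    Positive: the new implementation took less time than the baseline.
    Negative: the new implementation took more time than the baseline.
    """
    if base_time <= 0:
        raise ValueError("base_time must be positive")
    return (base_time - new_time) / base_time * 100.0


# Hypothetical timings, not taken from the benchmark files.
faster = runtime_reduction_pct(base_time=20.0, new_time=5.0)   # 75.0
slower = runtime_reduction_pct(base_time=20.0, new_time=25.0)  # -25.0
```

With this definition, a value of 75 means the runtime dropped to a
quarter of the baseline, and -25 means it grew by a quarter.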

strchr-lasx       reduces the runtime by about 50%-83%
strchr-lsx        reduces the runtime by about 30%-67%
strchr-aligned    reduces the runtime by about 10%-20%

strchrnul-lasx    reduces the runtime by about 50%-83%
strchrnul-lsx     reduces the runtime by about 36%-65%
strchrnul-aligned reduces the runtime by about 6%-10%

memcpy-lasx       reduces the runtime by about 8%-76%
memcpy-lsx        reduces the runtime by about 8%-72%
memcpy-unaligned  reduces the runtime of unaligned data
                  copying by up to 40%
memcpy-aligned    reduces the runtime of unaligned data
                  copying by up to 25%

memmove-lasx      reduces the runtime by about 20%-73%
memmove-lsx       reduces the runtime by about 50%
memmove-unaligned reduces the runtime of unaligned data
                  moving by up to 40%
memmove-aligned   reduces the runtime of unaligned data
                  moving by up to 25%
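
A range such as "50%-83%" can be read as the smallest and largest
per-row reduction across the benchmark rows. The sketch below shows one
way to summarize that. It is only an illustration: the helper name and
the timing lists are made up, and it is an assumption that the ranges
above were derived this way.

```python
def reduction_range(base_timings, impl_timings):
    """Return (min, max) runtime reduction in percent across benchmark rows.

    Each position in the two lists is assumed to describe the same
    benchmark row (same length, position and alignment parameters).
    """
    if not base_timings or len(base_timings) != len(impl_timings):
        raise ValueError("timing lists must be non-empty and equally long")
    pcts = [(b - t) / b * 100.0 for b, t in zip(base_timings, impl_timings)]
    return min(pcts), max(pcts)


# Hypothetical per-row timings for a baseline and one optimized variant.
generic = [40.0, 80.0, 200.0]
variant = [20.0, 30.0, 50.0]
low, high = reduction_range(generic, variant)  # (50.0, 75.0)
```

Reporting both ends of the range keeps the worst case visible next to
the best case, which matters when a few rows regress.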

Comparison commands:
python benchtests/scripts/compare_strings.py \
  -i build/benchtests/bench-strchr.out \
  -f generic_strchr,__strchr_lasx,__strchr_lsx,__strchr_aligned \
  -a timings -a length,pos,alignment \
  -s build/benchtests/bench-strchr.out \
  -b generic_strchr > strchr_compare.out

python benchtests/scripts/compare_strings.py \
  -i build/benchtests/bench-strchrnul.out \
  -f generic_strchrnul,__strchrnul_lasx,__strchrnul_lsx,__strchrnul_aligned \
  -a timings -a length,pos,alignment \
  -s build/benchtests/bench-strchrnul.out \
  -b generic_strchrnul > strchrnul_compare.out

python benchtests/scripts/compare_strings.py \
  -i ./build/benchtests/bench-memcpy.out \
  -f generic_memcpy,__memcpy_lasx,__memcpy_lsx,__memcpy_unaligned,__memcpy_aligned \
  -a timings -a length,align1,align2,"dst > src" \
  -s ./build/benchtests/bench-memcpy.out \
  -b generic_memcpy > memcpy_compare.out

python benchtests/scripts/compare_strings.py \
  -i ./build/benchtests/bench-memmove.out \
  -f generic_memmove,__memmove_lasx,__memmove_lsx,__memmove_unaligned,__memmove_aligned \
  -a timings -a length,align1,align2 \
  -s ./build/benchtests/bench-memmove.out \
  -b generic_memmove > memmove_compare.out

dengjianbo (2):
  Loongarch: Add ifunc support for strchr{aligned, lsx, lasx} and
    strchrnul{aligned, lsx, lasx}
  Loongarch: Add ifunc support for memcpy{aligned, unaligned, lsx, lasx}
    and memmove{aligned, unaligned, lsx, lasx}

 sysdeps/loongarch/lp64/multiarch/Makefile     |  11 +
 .../lp64/multiarch/ifunc-impl-list.c          |  34 +
 sysdeps/loongarch/lp64/multiarch/ifunc-lasx.h |  45 +
 .../loongarch/lp64/multiarch/ifunc-strchr.h   |  41 +
 .../lp64/multiarch/ifunc-strchrnul.h          |  41 +
 .../loongarch/lp64/multiarch/memcpy-aligned.S | 783 ++++++++++++++++++
 .../loongarch/lp64/multiarch/memcpy-lasx.S    |  20 +
 sysdeps/loongarch/lp64/multiarch/memcpy-lsx.S |  20 +
 .../lp64/multiarch/memcpy-unaligned.S         | 247 ++++++
 sysdeps/loongarch/lp64/multiarch/memcpy.c     |  37 +
 .../lp64/multiarch/memmove-aligned.S          |  20 +
 .../loongarch/lp64/multiarch/memmove-lasx.S   | 287 +++++++
 .../loongarch/lp64/multiarch/memmove-lsx.S    | 534 ++++++++++++
 .../lp64/multiarch/memmove-unaligned.S        | 380 +++++++++
 sysdeps/loongarch/lp64/multiarch/memmove.c    |  38 +
 .../loongarch/lp64/multiarch/strchr-aligned.S |  99 +++
 .../loongarch/lp64/multiarch/strchr-lasx.S    |  91 ++
 sysdeps/loongarch/lp64/multiarch/strchr-lsx.S |  73 ++
 sysdeps/loongarch/lp64/multiarch/strchr.c     |  36 +
 .../lp64/multiarch/strchrnul-aligned.S        |  95 +++
 .../loongarch/lp64/multiarch/strchrnul-lasx.S |  22 +
 .../loongarch/lp64/multiarch/strchrnul-lsx.S  |  22 +
 sysdeps/loongarch/lp64/multiarch/strchrnul.c  |  39 +
 23 files changed, 3015 insertions(+)
 create mode 100644 sysdeps/loongarch/lp64/multiarch/ifunc-lasx.h
 create mode 100644 sysdeps/loongarch/lp64/multiarch/ifunc-strchr.h
 create mode 100644 sysdeps/loongarch/lp64/multiarch/ifunc-strchrnul.h
 create mode 100644 sysdeps/loongarch/lp64/multiarch/memcpy-aligned.S
 create mode 100644 sysdeps/loongarch/lp64/multiarch/memcpy-lasx.S
 create mode 100644 sysdeps/loongarch/lp64/multiarch/memcpy-lsx.S
 create mode 100644 sysdeps/loongarch/lp64/multiarch/memcpy-unaligned.S
 create mode 100644 sysdeps/loongarch/lp64/multiarch/memcpy.c
 create mode 100644 sysdeps/loongarch/lp64/multiarch/memmove-aligned.S
 create mode 100644 sysdeps/loongarch/lp64/multiarch/memmove-lasx.S
 create mode 100644 sysdeps/loongarch/lp64/multiarch/memmove-lsx.S
 create mode 100644 sysdeps/loongarch/lp64/multiarch/memmove-unaligned.S
 create mode 100644 sysdeps/loongarch/lp64/multiarch/memmove.c
 create mode 100644 sysdeps/loongarch/lp64/multiarch/strchr-aligned.S
 create mode 100644 sysdeps/loongarch/lp64/multiarch/strchr-lasx.S
 create mode 100644 sysdeps/loongarch/lp64/multiarch/strchr-lsx.S
 create mode 100644 sysdeps/loongarch/lp64/multiarch/strchr.c
 create mode 100644 sysdeps/loongarch/lp64/multiarch/strchrnul-aligned.S
 create mode 100644 sysdeps/loongarch/lp64/multiarch/strchrnul-lasx.S
 create mode 100644 sysdeps/loongarch/lp64/multiarch/strchrnul-lsx.S
 create mode 100644 sysdeps/loongarch/lp64/multiarch/strchrnul.c

-- 
2.40.0

