public inbox for libc-alpha@sourceware.org
* [PATCH 0/5] LoongArch: Multiarch string and memory copy routines for unaligned access
@ 2023-04-15 11:23 Xi Ruoyao
  2023-04-15 11:23 ` [PATCH 1/5] LoongArch: Add bits/hwcap.h for Linux Xi Ruoyao
                   ` (5 more replies)
  0 siblings, 6 replies; 7+ messages in thread
From: Xi Ruoyao @ 2023-04-15 11:23 UTC
  To: libc-alpha; +Cc: caiyinyu, Wang Xuerui, Adhemerval Zanella Netto, Xi Ruoyao

LoongArch CPUs may have hardware unaligned access support.  Among the
LoongArch CPUs released so far, those branded as Loongson-3 (for
desktops or servers) support unaligned access in hardware, but those
branded as Loongson-2 (for embedded or industrial applications) do not.

On Linux, unaligned access support is indicated by a HWCAP bit provided
by the kernel.  So we can multiarch stpcpy and memcpy with ifunc to take
advantage of CPUs with unaligned access support.
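
A minimal sketch of the selection step, assuming a resolver that picks
one of two variants from the HWCAP word.  The sketch_* names are
hypothetical stand-ins for illustration; the series itself uses the
libc_ifunc macros, not this plain function-pointer dispatch.

```c
#include <stddef.h>

/* Same value as HWCAP_LOONGARCH_UAL in patch 1.  */
#define SKETCH_HWCAP_UAL (1UL << 2)

typedef void *(*sketch_copy_fn) (void *, const void *, size_t);

/* Hypothetical stand-in for the generic variant: plain byte copy.  */
void *
sketch_copy_generic (void *dest, const void *src, size_t len)
{
  unsigned char *d = dest;
  const unsigned char *s = src;

  while (len-- != 0)
    *d++ = *s++;
  return dest;
}

/* Hypothetical stand-in for the unaligned-access variant.  A real one
   would copy word by word; here it only needs a distinct address.  */
void *
sketch_copy_ual (void *dest, const void *src, size_t len)
{
  return sketch_copy_generic (dest, src, len);
}

/* Choose the variant once, based on the HWCAP word.  */
sketch_copy_fn
sketch_select_copy (unsigned long hwcap)
{
  return (hwcap & SKETCH_HWCAP_UAL) ? sketch_copy_ual : sketch_copy_generic;
}
```

With ifunc the choice is made once at relocation time, so callers pay no
per-call branch for it.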

On a Loongson-3A5000HV CPU running at 2.5GHz, "make bench" shows that
these changes improve performance:

- https://www.linuxfromscratch.org/~xry111/loongarch-ual-bench/bench-stpcpy-summary.txt
- https://www.linuxfromscratch.org/~xry111/loongarch-ual-bench/bench-memcpy-summary.txt

Xi Ruoyao (5):
  LoongArch: Add bits/hwcap.h for Linux
  LoongArch: Add LOONGARCH_HAVE_UAL macro
  string: stpcpy.c: Only alias __stpcpy to stpcpy if STPCPY undefined
  LoongArch: Multiarch stpcpy for unaligned access
  LoongArch: Multiarch memcpy for unaligned access

 string/stpcpy.c                               |  3 ++
 sysdeps/loongarch/loongarch-features.h        | 26 ++++++++++
 sysdeps/loongarch/multiarch/Makefile          |  6 +++
 sysdeps/loongarch/multiarch/memcpy-generic.c  | 27 ++++++++++
 sysdeps/loongarch/multiarch/memcpy-ual.c      | 50 +++++++++++++++++++
 sysdeps/loongarch/multiarch/memcpy.c          | 39 +++++++++++++++
 sysdeps/loongarch/multiarch/stpcpy-generic.c  | 25 ++++++++++
 sysdeps/loongarch/multiarch/stpcpy-ual.c      | 43 ++++++++++++++++
 sysdeps/loongarch/multiarch/stpcpy.c          | 37 ++++++++++++++
 .../loongarch/multiarch/wordcopy-ual-inline.c | 31 ++++++++++++
 .../unix/sysv/linux/loongarch/bits/hwcap.h    | 37 ++++++++++++++
 .../sysv/linux/loongarch/loongarch-features.h | 30 +++++++++++
 sysdeps/unix/sysv/linux/loongarch/sysdep.h    |  1 +
 13 files changed, 355 insertions(+)
 create mode 100644 sysdeps/loongarch/loongarch-features.h
 create mode 100644 sysdeps/loongarch/multiarch/Makefile
 create mode 100644 sysdeps/loongarch/multiarch/memcpy-generic.c
 create mode 100644 sysdeps/loongarch/multiarch/memcpy-ual.c
 create mode 100644 sysdeps/loongarch/multiarch/memcpy.c
 create mode 100644 sysdeps/loongarch/multiarch/stpcpy-generic.c
 create mode 100644 sysdeps/loongarch/multiarch/stpcpy-ual.c
 create mode 100644 sysdeps/loongarch/multiarch/stpcpy.c
 create mode 100644 sysdeps/loongarch/multiarch/wordcopy-ual-inline.c
 create mode 100644 sysdeps/unix/sysv/linux/loongarch/bits/hwcap.h
 create mode 100644 sysdeps/unix/sysv/linux/loongarch/loongarch-features.h

-- 
2.39.2



* [PATCH 1/5] LoongArch: Add bits/hwcap.h for Linux
  2023-04-15 11:23 [PATCH 0/5] LoongArch: Multiarch string and memory copy routines for unaligned access Xi Ruoyao
@ 2023-04-15 11:23 ` Xi Ruoyao
  2023-04-15 11:23 ` [PATCH 2/5] LoongArch: Add LOONGARCH_HAVE_UAL macro Xi Ruoyao
                   ` (4 subsequent siblings)
  5 siblings, 0 replies; 7+ messages in thread
From: Xi Ruoyao @ 2023-04-15 11:23 UTC
  To: libc-alpha; +Cc: caiyinyu, Wang Xuerui, Adhemerval Zanella Netto, Xi Ruoyao

Add this file and include it from sysdep.h so we can use HWCAP bits in
Glibc and/or downstream apps.
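
For downstream applications, a sketch of how the bit might be queried
at run time.  It assumes getauxval from <sys/auxv.h> is available on
Linux; the sketch_* names are hypothetical, and the bit value is copied
from the header below.  On architectures other than LoongArch the same
bit position has a different meaning.

```c
#if defined (__linux__)
# include <sys/auxv.h>
#endif

/* Bit value copied from the bits/hwcap.h added by this patch.  */
#define SKETCH_HWCAP_LOONGARCH_UAL (1UL << 2)

/* Read the HWCAP word from the auxiliary vector.  Outside Linux,
   conservatively report that no capability bit is set.  */
unsigned long
sketch_read_hwcap (void)
{
#if defined (__linux__)
  return getauxval (AT_HWCAP);
#else
  return 0;
#endif
}

/* Nonzero if HWCAP advertises hardware unaligned access (LoongArch).  */
int
sketch_hwcap_has_ual (unsigned long hwcap)
{
  return (hwcap & SKETCH_HWCAP_LOONGARCH_UAL) != 0;
}
```

Usage would be sketch_hwcap_has_ual (sketch_read_hwcap ()) on a
LoongArch Linux system.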
---
 .../unix/sysv/linux/loongarch/bits/hwcap.h    | 37 +++++++++++++++++++
 sysdeps/unix/sysv/linux/loongarch/sysdep.h    |  1 +
 2 files changed, 38 insertions(+)
 create mode 100644 sysdeps/unix/sysv/linux/loongarch/bits/hwcap.h

diff --git a/sysdeps/unix/sysv/linux/loongarch/bits/hwcap.h b/sysdeps/unix/sysv/linux/loongarch/bits/hwcap.h
new file mode 100644
index 0000000000..50100fba61
--- /dev/null
+++ b/sysdeps/unix/sysv/linux/loongarch/bits/hwcap.h
@@ -0,0 +1,37 @@
+/* Defines for bits in AT_HWCAP.  LoongArch Linux version.
+   Copyright (C) 2023 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   <https://www.gnu.org/licenses/>.  */
+
+#if !defined(_SYS_AUXV_H) && !defined(_LINUX_LOONGARCH_SYSDEP_H)
+# error "Never include <bits/hwcap.h> directly; use <sys/auxv.h> instead."
+#endif
+
+/* The bit numbers must match those in the kernel's <asm/hwcap.h>.  */
+
+#define HWCAP_LOONGARCH_CPUCFG		(1 << 0)
+#define HWCAP_LOONGARCH_LAM		(1 << 1)
+#define HWCAP_LOONGARCH_UAL		(1 << 2)
+#define HWCAP_LOONGARCH_FPU		(1 << 3)
+#define HWCAP_LOONGARCH_LSX		(1 << 4)
+#define HWCAP_LOONGARCH_LASX		(1 << 5)
+#define HWCAP_LOONGARCH_CRC32		(1 << 6)
+#define HWCAP_LOONGARCH_COMPLEX		(1 << 7)
+#define HWCAP_LOONGARCH_CRYPTO		(1 << 8)
+#define HWCAP_LOONGARCH_LVZ		(1 << 9)
+#define HWCAP_LOONGARCH_LBT_X86		(1 << 10)
+#define HWCAP_LOONGARCH_LBT_ARM		(1 << 11)
+#define HWCAP_LOONGARCH_LBT_MIPS	(1 << 12)
diff --git a/sysdeps/unix/sysv/linux/loongarch/sysdep.h b/sysdeps/unix/sysv/linux/loongarch/sysdep.h
index 8a2d73ec8c..782ea1ccf9 100644
--- a/sysdeps/unix/sysv/linux/loongarch/sysdep.h
+++ b/sysdeps/unix/sysv/linux/loongarch/sysdep.h
@@ -22,6 +22,7 @@
 #include <sysdeps/unix/sysv/linux/sysdep.h>
 #include <sysdeps/unix/sysdep.h>
 #include <tls.h>
+#include <bits/hwcap.h>
 
 #ifdef __ASSEMBLER__
 
-- 
2.39.2



* [PATCH 2/5] LoongArch: Add LOONGARCH_HAVE_UAL macro
  2023-04-15 11:23 [PATCH 0/5] LoongArch: Multiarch string and memory copy routines for unaligned access Xi Ruoyao
  2023-04-15 11:23 ` [PATCH 1/5] LoongArch: Add bits/hwcap.h for Linux Xi Ruoyao
@ 2023-04-15 11:23 ` Xi Ruoyao
  2023-04-15 11:23 ` [PATCH 3/5] string: stpcpy.c: Only alias __stpcpy to stpcpy if STPCPY undefined Xi Ruoyao
                   ` (3 subsequent siblings)
  5 siblings, 0 replies; 7+ messages in thread
From: Xi Ruoyao @ 2023-04-15 11:23 UTC
  To: libc-alpha; +Cc: caiyinyu, Wang Xuerui, Adhemerval Zanella Netto, Xi Ruoyao

On Linux, we can determine if the CPU supports unaligned access via
HWCAP.  Otherwise, we conservatively assume the CPU does not support
unaligned access.

Maybe GCC should provide a built-in macro to tell whether
-m[no-]strict-align is in effect, but that won't happen for GCC 12 or 13.
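
The layering can be sketched in a single file: a platform layer may
define the macro in terms of a run-time value, and the generic layer
supplies 0 only if nothing was defined earlier.  All sketch_* and
SKETCH_* names are hypothetical; sketch_hwcap stands in for
GLRO (dl_hwcap), and SKETCH_ON_LINUX stands in for the sysdeps search
order that makes the Linux header come first.

```c
/* Hypothetical stand-in for GLRO (dl_hwcap).  */
unsigned long sketch_hwcap;

#define SKETCH_HWCAP_UAL (1UL << 2)

/* "Platform" layer.  Removing this define models a non-Linux port.  */
#define SKETCH_ON_LINUX 1

#if SKETCH_ON_LINUX
# define SKETCH_HAVE_UAL (sketch_hwcap & SKETCH_HWCAP_UAL)
#endif

/* Generic layer: conservative default if no platform layer defined it.  */
#ifndef SKETCH_HAVE_UAL
# define SKETCH_HAVE_UAL 0
#endif

/* Return 1 if the unaligned-access variant should be used, else 0.  */
int
sketch_use_ual (void)
{
  return SKETCH_HAVE_UAL ? 1 : 0;
}
```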
---
 sysdeps/loongarch/loongarch-features.h        | 26 ++++++++++++++++
 .../sysv/linux/loongarch/loongarch-features.h | 30 +++++++++++++++++++
 2 files changed, 56 insertions(+)
 create mode 100644 sysdeps/loongarch/loongarch-features.h
 create mode 100644 sysdeps/unix/sysv/linux/loongarch/loongarch-features.h

diff --git a/sysdeps/loongarch/loongarch-features.h b/sysdeps/loongarch/loongarch-features.h
new file mode 100644
index 0000000000..722b4b61dc
--- /dev/null
+++ b/sysdeps/loongarch/loongarch-features.h
@@ -0,0 +1,26 @@
+/* Macros to test for CPU features on LoongArch.
+   Copyright (C) 2023 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library.  If not, see
+   <https://www.gnu.org/licenses/>.  */
+
+#ifndef _LOONGARCH_LOONGARCH_FEATURES_H
+#define _LOONGARCH_LOONGARCH_FEATURES_H 1
+
+#ifndef LOONGARCH_HAVE_UAL
+# define LOONGARCH_HAVE_UAL	0
+#endif
+
+#endif  /* loongarch-features.h */
diff --git a/sysdeps/unix/sysv/linux/loongarch/loongarch-features.h b/sysdeps/unix/sysv/linux/loongarch/loongarch-features.h
new file mode 100644
index 0000000000..d4c18d3cfe
--- /dev/null
+++ b/sysdeps/unix/sysv/linux/loongarch/loongarch-features.h
@@ -0,0 +1,30 @@
+/* Macros to test for CPU features on LoongArch.  Linux version.
+   Copyright (C) 2023 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library.  If not, see
+   <https://www.gnu.org/licenses/>.  */
+
+#ifndef _LINUX_LOONGARCH_FEATURES_H
+#define _LINUX_LOONGARCH_FEATURES_H 1
+
+#ifndef __ASSEMBLER__
+# include <ldsodefs.h>
+
+# define LOONGARCH_HAVE_UAL	(GLRO (dl_hwcap) & HWCAP_LOONGARCH_UAL)
+#endif
+
+#include_next <loongarch-features.h>
+
+#endif  /* loongarch-features.h */
-- 
2.39.2



* [PATCH 3/5] string: stpcpy.c: Only alias __stpcpy to stpcpy if STPCPY undefined
  2023-04-15 11:23 [PATCH 0/5] LoongArch: Multiarch string and memory copy routines for unaligned access Xi Ruoyao
  2023-04-15 11:23 ` [PATCH 1/5] LoongArch: Add bits/hwcap.h for Linux Xi Ruoyao
  2023-04-15 11:23 ` [PATCH 2/5] LoongArch: Add LOONGARCH_HAVE_UAL macro Xi Ruoyao
@ 2023-04-15 11:23 ` Xi Ruoyao
  2023-04-15 11:23 ` [PATCH 4/5] LoongArch: Multiarch stpcpy for unaligned access Xi Ruoyao
                   ` (2 subsequent siblings)
  5 siblings, 0 replies; 7+ messages in thread
From: Xi Ruoyao @ 2023-04-15 11:23 UTC
  To: libc-alpha; +Cc: caiyinyu, Wang Xuerui, Adhemerval Zanella Netto, Xi Ruoyao

If STPCPY is defined, __stpcpy is most likely not provided by this file,
so creating the alias here does not make much sense.
---
 string/stpcpy.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/string/stpcpy.c b/string/stpcpy.c
index dd0fef12ef..80ddf81595 100644
--- a/string/stpcpy.c
+++ b/string/stpcpy.c
@@ -116,6 +116,9 @@ STPCPY (char *dest, const char *src)
 		  : stpcpy_unaligned_loop ((op_t*) dest,
 					   (const op_t *) (src - ofs) , ofs);
 }
+
+#ifndef STPCPY
 weak_alias (__stpcpy, stpcpy)
 libc_hidden_def (__stpcpy)
 libc_hidden_builtin_def (stpcpy)
+#endif
-- 
2.39.2



* [PATCH 4/5] LoongArch: Multiarch stpcpy for unaligned access
  2023-04-15 11:23 [PATCH 0/5] LoongArch: Multiarch string and memory copy routines for unaligned access Xi Ruoyao
                   ` (2 preceding siblings ...)
  2023-04-15 11:23 ` [PATCH 3/5] string: stpcpy.c: Only alias __stpcpy to stpcpy if STPCPY undefined Xi Ruoyao
@ 2023-04-15 11:23 ` Xi Ruoyao
  2023-04-15 11:23 ` [PATCH 5/5] LoongArch: Multiarch memcpy " Xi Ruoyao
  2023-04-18  3:01 ` [PATCH 0/5] LoongArch: Multiarch string and memory copy routines " caiyinyu
  5 siblings, 0 replies; 7+ messages in thread
From: Xi Ruoyao @ 2023-04-15 11:23 UTC
  To: libc-alpha; +Cc: caiyinyu, Wang Xuerui, Adhemerval Zanella Netto, Xi Ruoyao

When the CPU supports unaligned access, we can align the src pointer
first (to avoid a segmentation fault if a small source string is located
at the end of a page), then just copy the string pretending both src and
dest are aligned.
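
A self-contained sketch of the idea in portable C, not the code from
this patch: it uses a fixed 64-bit word and memcpy for the loads and
stores instead of op_t and stpcpy_aligned_loop, and all sketch_* names
are hypothetical.  It assumes that reading the whole aligned word
containing the terminator is safe, which holds at the page level
because an aligned word never crosses a page boundary.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Nonzero if any byte of X is zero.  */
static int
sketch_has_zero_byte (uint64_t x)
{
  return ((x - UINT64_C (0x0101010101010101)) & ~x
          & UINT64_C (0x8080808080808080)) != 0;
}

char *
sketch_stpcpy_ual (char *dest, const char *src)
{
  /* Copy byte by byte until SRC is word-aligned.  */
  size_t head = (-(uintptr_t) src) % sizeof (uint64_t);
  for (; head != 0; head--, dest++)
    {
      char c = *src++;
      *dest = c;
      if (c == '\0')
        return dest;
    }

  /* SRC is aligned now; DEST may not be, so the store may be unaligned.  */
  for (;;)
    {
      uint64_t w;
      memcpy (&w, src, sizeof w);
      if (sketch_has_zero_byte (w))
        break;
      memcpy (dest, &w, sizeof w);
      src += sizeof w;
      dest += sizeof w;
    }

  /* The terminator is within the current word; finish byte by byte.  */
  while ((*dest = *src++) != '\0')
    dest++;
  return dest;
}
```

Aligning src rather than dest is what keeps the word-sized loads from
running past the end of the mapped source.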
---
 sysdeps/loongarch/multiarch/Makefile         |  5 +++
 sysdeps/loongarch/multiarch/stpcpy-generic.c | 25 ++++++++++++
 sysdeps/loongarch/multiarch/stpcpy-ual.c     | 43 ++++++++++++++++++++
 sysdeps/loongarch/multiarch/stpcpy.c         | 37 +++++++++++++++++
 4 files changed, 110 insertions(+)
 create mode 100644 sysdeps/loongarch/multiarch/Makefile
 create mode 100644 sysdeps/loongarch/multiarch/stpcpy-generic.c
 create mode 100644 sysdeps/loongarch/multiarch/stpcpy-ual.c
 create mode 100644 sysdeps/loongarch/multiarch/stpcpy.c

diff --git a/sysdeps/loongarch/multiarch/Makefile b/sysdeps/loongarch/multiarch/Makefile
new file mode 100644
index 0000000000..958752bcbd
--- /dev/null
+++ b/sysdeps/loongarch/multiarch/Makefile
@@ -0,0 +1,5 @@
+ifeq ($(subdir),string)
+sysdep_routines += stpcpy-generic stpcpy-ual
+
+CFLAGS-stpcpy-ual.c += -mno-strict-align
+endif
diff --git a/sysdeps/loongarch/multiarch/stpcpy-generic.c b/sysdeps/loongarch/multiarch/stpcpy-generic.c
new file mode 100644
index 0000000000..487388372f
--- /dev/null
+++ b/sysdeps/loongarch/multiarch/stpcpy-generic.c
@@ -0,0 +1,25 @@
+/* Multiarch stpcpy for LoongArch.  Generic version.
+   Copyright (C) 2023 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   <https://www.gnu.org/licenses/>.  */
+
+#include <string.h>
+
+extern __typeof (stpcpy) __stpcpy_generic attribute_hidden;
+
+#define STPCPY __stpcpy_generic
+
+#include <string/stpcpy.c>
diff --git a/sysdeps/loongarch/multiarch/stpcpy-ual.c b/sysdeps/loongarch/multiarch/stpcpy-ual.c
new file mode 100644
index 0000000000..9cd13e7fef
--- /dev/null
+++ b/sysdeps/loongarch/multiarch/stpcpy-ual.c
@@ -0,0 +1,43 @@
+/* Multiarch stpcpy for LoongArch.  Unaligned access version.
+   Copyright (C) 2023 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   <https://www.gnu.org/licenses/>.  */
+
+#include <string.h>
+
+static __typeof (stpcpy) __stpcpy_unused __attribute__ ((unused));
+extern __typeof (stpcpy) __stpcpy_ual attribute_hidden;
+
+#define STPCPY __stpcpy_unused
+
+#include <string/stpcpy.c>
+
+char *
+__stpcpy_ual (char *dest, const char *src)
+{
+  /* Copy just a few bytes to make SRC aligned.  It's needed not to trigger
+     a segmentation fault when a short string is at the end of a segment.  */
+  size_t len = (-(uintptr_t) src) % OPSIZ;
+  for (; len != 0; len--, ++dest)
+    {
+      char c = *src++;
+      *dest = c;
+      if (c == '\0')
+	return dest;
+    }
+
+  return stpcpy_aligned_loop ((op_t *) dest, (const op_t *) src);
+}
diff --git a/sysdeps/loongarch/multiarch/stpcpy.c b/sysdeps/loongarch/multiarch/stpcpy.c
new file mode 100644
index 0000000000..58bfb1c89d
--- /dev/null
+++ b/sysdeps/loongarch/multiarch/stpcpy.c
@@ -0,0 +1,37 @@
+/* Multiple versions of stpcpy.  LoongArch version.
+   Copyright (C) 2023 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   <https://www.gnu.org/licenses/>.  */
+
+#if defined SHARED && IS_IN (libc)
+# define __NO_STRING_INLINES
+# define NO_MEMPCPY_STPCPY_REDIRECT
+# include <string.h>
+
+extern __typeof (__stpcpy) __stpcpy_generic attribute_hidden;
+extern __typeof (__stpcpy) __stpcpy_ual attribute_hidden;
+
+# include <loongarch-features.h>
+# define INIT_ARCH()
+
+libc_ifunc_hidden (__stpcpy, __stpcpy,
+		   LOONGARCH_HAVE_UAL ? __stpcpy_ual : __stpcpy_generic);
+weak_alias (__stpcpy, stpcpy)
+libc_hidden_def (__stpcpy)
+libc_hidden_def (stpcpy)
+#else
+# include <string/stpcpy.c>
+#endif
-- 
2.39.2



* [PATCH 5/5] LoongArch: Multiarch memcpy for unaligned access
  2023-04-15 11:23 [PATCH 0/5] LoongArch: Multiarch string and memory copy routines for unaligned access Xi Ruoyao
                   ` (3 preceding siblings ...)
  2023-04-15 11:23 ` [PATCH 4/5] LoongArch: Multiarch stpcpy for unaligned access Xi Ruoyao
@ 2023-04-15 11:23 ` Xi Ruoyao
  2023-04-18  3:01 ` [PATCH 0/5] LoongArch: Multiarch string and memory copy routines " caiyinyu
  5 siblings, 0 replies; 7+ messages in thread
From: Xi Ruoyao @ 2023-04-15 11:23 UTC
  To: libc-alpha; +Cc: caiyinyu, Wang Xuerui, Adhemerval Zanella Netto, Xi Ruoyao

When the CPU supports unaligned access, we can align the dest pointer
first (this performs better than relying solely on the hardware
unaligned access support), then just copy the memory area word by word,
pretending both src and dest are aligned.
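
A portable sketch of the same shape, not the code from this patch: it
uses a fixed 64-bit word and memcpy for the word moves instead of op_t
and _wordcopy_fwd_ual.  The sketch_* names and the threshold of 16 bytes
are assumptions made for illustration.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Assumed threshold below which word copying is not worth the setup.  */
#define SKETCH_THRESHOLD 16

void *
sketch_memcpy_ual (void *dest, const void *src, size_t len)
{
  unsigned char *d = dest;
  const unsigned char *s = src;

  if (len >= SKETCH_THRESHOLD)
    {
      /* Copy a few bytes so that D becomes word-aligned.  */
      size_t head = (-(uintptr_t) d) % sizeof (uint64_t);
      len -= head;
      for (; head != 0; head--)
        *d++ = *s++;

      /* Stores are aligned; loads from S may be unaligned.  */
      for (; len >= sizeof (uint64_t); len -= sizeof (uint64_t))
        {
          uint64_t w;
          memcpy (&w, s, sizeof w);
          memcpy (d, &w, sizeof w);
          s += sizeof w;
          d += sizeof w;
        }
    }

  /* Copy whatever is left byte by byte.  */
  while (len-- != 0)
    *d++ = *s++;

  return dest;
}
```

Only one of the two pointers can be aligned in general; the patch
aligns dest, so the remaining unaligned accesses are loads.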
---
 sysdeps/loongarch/multiarch/Makefile          |  3 +-
 sysdeps/loongarch/multiarch/memcpy-generic.c  | 27 ++++++++++
 sysdeps/loongarch/multiarch/memcpy-ual.c      | 50 +++++++++++++++++++
 sysdeps/loongarch/multiarch/memcpy.c          | 39 +++++++++++++++
 .../loongarch/multiarch/wordcopy-ual-inline.c | 31 ++++++++++++
 5 files changed, 149 insertions(+), 1 deletion(-)
 create mode 100644 sysdeps/loongarch/multiarch/memcpy-generic.c
 create mode 100644 sysdeps/loongarch/multiarch/memcpy-ual.c
 create mode 100644 sysdeps/loongarch/multiarch/memcpy.c
 create mode 100644 sysdeps/loongarch/multiarch/wordcopy-ual-inline.c

diff --git a/sysdeps/loongarch/multiarch/Makefile b/sysdeps/loongarch/multiarch/Makefile
index 958752bcbd..34e2f2a334 100644
--- a/sysdeps/loongarch/multiarch/Makefile
+++ b/sysdeps/loongarch/multiarch/Makefile
@@ -1,5 +1,6 @@
 ifeq ($(subdir),string)
-sysdep_routines += stpcpy-generic stpcpy-ual
+sysdep_routines += stpcpy-generic stpcpy-ual memcpy-generic memcpy-ual
 
 CFLAGS-stpcpy-ual.c += -mno-strict-align
+CFLAGS-memcpy-ual.c += -mno-strict-align
 endif
diff --git a/sysdeps/loongarch/multiarch/memcpy-generic.c b/sysdeps/loongarch/multiarch/memcpy-generic.c
new file mode 100644
index 0000000000..9374ced033
--- /dev/null
+++ b/sysdeps/loongarch/multiarch/memcpy-generic.c
@@ -0,0 +1,27 @@
+/* Multiarch memcpy for LoongArch.  Generic version.
+   Copyright (C) 2023 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   <https://www.gnu.org/licenses/>.  */
+
+#include <string.h>
+
+extern __typeof (memcpy) __memcpy_generic attribute_hidden;
+
+#define MEMCPY __memcpy_generic
+#undef libc_hidden_def
+#define libc_hidden_def(name)
+
+#include <string/memcpy.c>
diff --git a/sysdeps/loongarch/multiarch/memcpy-ual.c b/sysdeps/loongarch/multiarch/memcpy-ual.c
new file mode 100644
index 0000000000..e7cd8f253b
--- /dev/null
+++ b/sysdeps/loongarch/multiarch/memcpy-ual.c
@@ -0,0 +1,50 @@
+/* Multiarch memcpy for LoongArch.  Unaligned access version.
+   Copyright (C) 2023 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   <https://www.gnu.org/licenses/>.  */
+
+#include <string.h>
+
+extern __typeof (memcpy) __memcpy_ual attribute_hidden;
+
+#include "wordcopy-ual-inline.c"
+
+#define OPSIZ	(sizeof (op_t))
+
+void *
+__memcpy_ual (void *dest, const void *src, size_t len)
+{
+  unsigned long int dstp = (long int) dest;
+  unsigned long int srcp = (long int) src;
+
+  /* If there are not too few bytes to copy, use word copy.  */
+  if (len >= OP_T_THRES)
+    {
+      /* Copy just a few bytes to make DSTP aligned.  Not needed with
+         unaligned access support, but it improves the performance.  */
+      len -= (-dstp) % OPSIZ;
+      BYTE_COPY_FWD (dstp, srcp, (-dstp) % OPSIZ);
+
+      _wordcopy_fwd_ual (dstp, srcp, len / OPSIZ);
+      dstp += len & -OPSIZ;
+      srcp += len & -OPSIZ;
+      len %= OPSIZ;
+    }
+
+  BYTE_COPY_FWD (dstp, srcp, len);
+
+  return dest;
+}
diff --git a/sysdeps/loongarch/multiarch/memcpy.c b/sysdeps/loongarch/multiarch/memcpy.c
new file mode 100644
index 0000000000..6a3089f88c
--- /dev/null
+++ b/sysdeps/loongarch/multiarch/memcpy.c
@@ -0,0 +1,39 @@
+/* Multiple versions of memcpy.  LoongArch version.
+   Copyright (C) 2023 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   <https://www.gnu.org/licenses/>.  */
+
+#if defined SHARED && IS_IN (libc)
+# undef memcpy
+# define memcpy __redirect_memcpy
+# include <string.h>
+# undef memcpy
+
+extern __typeof (__redirect_memcpy) __libc_memcpy;
+
+extern __typeof (__redirect_memcpy) __memcpy_generic attribute_hidden;
+extern __typeof (__redirect_memcpy) __memcpy_ual attribute_hidden;
+
+# include <loongarch-features.h>
+# define INIT_ARCH()
+
+libc_ifunc (__libc_memcpy,
+	    LOONGARCH_HAVE_UAL ? __memcpy_ual : __memcpy_generic);
+strong_alias (__libc_memcpy, memcpy);
+libc_hidden_ver (__libc_memcpy, memcpy)
+#else
+# include <string/memcpy.c>
+#endif
diff --git a/sysdeps/loongarch/multiarch/wordcopy-ual-inline.c b/sysdeps/loongarch/multiarch/wordcopy-ual-inline.c
new file mode 100644
index 0000000000..a552aa6946
--- /dev/null
+++ b/sysdeps/loongarch/multiarch/wordcopy-ual-inline.c
@@ -0,0 +1,31 @@
+/* Reuse subroutine from string/wordcopy.c for LoongArch unaligned access.
+   Copyright (C) 2023 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   <https://www.gnu.org/licenses/>.  */
+
+#include <sys/cdefs.h>
+
+static __always_inline void _wordcopy_fwd_ual (long int, long int, size_t);
+static void _nouse_1 (long int, long int, size_t) __attribute__ ((unused));
+static void _nouse_2 (long int, long int, size_t) __attribute__ ((unused));
+static void _nouse_3 (long int, long int, size_t) __attribute__ ((unused));
+
+#define WORDCOPY_FWD_ALIGNED _wordcopy_fwd_ual
+#define WORDCOPY_BWD_ALIGNED _nouse_1
+#define WORDCOPY_FWD_DEST_ALIGNED _nouse_2
+#define WORDCOPY_BWD_DEST_ALIGNED _nouse_3
+
+#include <string/wordcopy.c>
-- 
2.39.2



* Re: [PATCH 0/5] LoongArch: Multiarch string and memory copy routines for unaligned access
  2023-04-15 11:23 [PATCH 0/5] LoongArch: Multiarch string and memory copy routines for unaligned access Xi Ruoyao
                   ` (4 preceding siblings ...)
  2023-04-15 11:23 ` [PATCH 5/5] LoongArch: Multiarch memcpy " Xi Ruoyao
@ 2023-04-18  3:01 ` caiyinyu
  5 siblings, 0 replies; 7+ messages in thread
From: caiyinyu @ 2023-04-18  3:01 UTC
  To: Xi Ruoyao, libc-alpha; +Cc: Wang Xuerui, Adhemerval Zanella Netto

We are preparing a series of patches that include ifunc support
(aligned/unaligned/vectorized assembly implementations) for str/mem
functions, tunable functionality, and a vectorized _dl_runtime_resolve.
However, we are not currently able to submit them to the upstream
community.  We may consider publishing them on GitHub in the future, as
with gcc and binutils.

We will temporarily keep your patches.

On 2023/4/15 at 7:23 PM, Xi Ruoyao wrote:
> LoongArch CPUs may have hardware unaligned access support.  For the
> launched LoongArch CPUs, those branded as Loongson-3 (for desktops or
> servers) have hardware unaligned access support, but those branded as
> Loongson-2 (for embedded or industrial applications) do not.
>
> On Linux, the unaligned access support is indicated by a HWCAP bit
> provided by the kernel.  So we can multiarch stpcpy and memcpy with
> ifunc to take the advantage on the CPUs with unaligned access support.
>
> On a Loongson-3A5000HV CPU running at 2.5GHz, "make bench" has shown
> these changes can really improve the performance:
>
> - https://www.linuxfromscratch.org/~xry111/loongarch-ual-bench/bench-stpcpy-summary.txt
> - https://www.linuxfromscratch.org/~xry111/loongarch-ual-bench/bench-memcpy-summary.txt
>
> Xi Ruoyao (5):
>    LoongArch: Add bits/hwcap.h for Linux
>    LoongArch: Add LOONGARCH_HAVE_UAL macro
>    string: stpcpy.c: Only alias __stpcpy to stpcpy if STPCPY undefined
>    LoongArch: Multiarch stpcpy for unaligned access
>    LoongArch: Multiarch memcpy for unaligned access
>
>   string/stpcpy.c                               |  3 ++
>   sysdeps/loongarch/loongarch-features.h        | 26 ++++++++++
>   sysdeps/loongarch/multiarch/Makefile          |  6 +++
>   sysdeps/loongarch/multiarch/memcpy-generic.c  | 27 ++++++++++
>   sysdeps/loongarch/multiarch/memcpy-ual.c      | 50 +++++++++++++++++++
>   sysdeps/loongarch/multiarch/memcpy.c          | 39 +++++++++++++++
>   sysdeps/loongarch/multiarch/stpcpy-generic.c  | 25 ++++++++++
>   sysdeps/loongarch/multiarch/stpcpy-ual.c      | 43 ++++++++++++++++
>   sysdeps/loongarch/multiarch/stpcpy.c          | 37 ++++++++++++++
>   .../loongarch/multiarch/wordcopy-ual-inline.c | 31 ++++++++++++
>   .../unix/sysv/linux/loongarch/bits/hwcap.h    | 37 ++++++++++++++
>   .../sysv/linux/loongarch/loongarch-features.h | 30 +++++++++++
>   sysdeps/unix/sysv/linux/loongarch/sysdep.h    |  1 +
>   13 files changed, 355 insertions(+)
>   create mode 100644 sysdeps/loongarch/loongarch-features.h
>   create mode 100644 sysdeps/loongarch/multiarch/Makefile
>   create mode 100644 sysdeps/loongarch/multiarch/memcpy-generic.c
>   create mode 100644 sysdeps/loongarch/multiarch/memcpy-ual.c
>   create mode 100644 sysdeps/loongarch/multiarch/memcpy.c
>   create mode 100644 sysdeps/loongarch/multiarch/stpcpy-generic.c
>   create mode 100644 sysdeps/loongarch/multiarch/stpcpy-ual.c
>   create mode 100644 sysdeps/loongarch/multiarch/stpcpy.c
>   create mode 100644 sysdeps/loongarch/multiarch/wordcopy-ual-inline.c
>   create mode 100644 sysdeps/unix/sysv/linux/loongarch/bits/hwcap.h
>   create mode 100644 sysdeps/unix/sysv/linux/loongarch/loongarch-features.h
>



