public inbox for libc-alpha@sourceware.org
 help / color / mirror / Atom feed
* [PATCH 0/7] Add ifunc support for existing Zbb optimizations
@ 2024-04-22  7:43 Christoph Müllner
  2024-04-22  7:43 ` [PATCH 1/7] RISC-V: Use .insn directive form for orc.b Christoph Müllner
                   ` (6 more replies)
  0 siblings, 7 replies; 20+ messages in thread
From: Christoph Müllner @ 2024-04-22  7:43 UTC (permalink / raw)
  To: libc-alpha, Adhemerval Zanella, Palmer Dabbelt, Darius Rad,
	Andrew Waterman, Philipp Tomsich, Evan Green, DJ Delorie,
	Vineet Gupta, Kito Cheng, Jeff Law
  Cc: Christoph Müllner

Glibc recently got hwprobe() support for RISC-V, which allows querying
avaiable extensions at runtime.  On top of that an optimized memcpy()
routine (for fast unaligned accesses) has been merged, which is built by
recompiling the generic C code with a different compiler flag.  An ifunc
resolver then detects which routine should be run using hwprobe().

This patchset follows this idea and recompiles the following functions
with the existing Zbb/orc.b optimization in riscv/string-fza.h enabled:
memchr, memrchr, strchrnul, strcmp, strlen, strncmp.

Recompilation is achieved by defining the `__riscv_zbb` macro, which
gates the use of orc.b.  As defining this macro on our own is not enough
to make GAS accept an 'orc.b' instruction, the first patch changes it
into the .insn directive form.

This series was tested by writing a simple test program to invoke the
libc routines (e.g. strcmp) and a modified QEMU that reports the
emulation of orc.b on stderr.  With that the QEMU can be used to test
if the optimized routines are executed (-cpu "rv64,zbb=[false,true]").
Further, this series was tested with SPEC CPU 2017 intrate with Zbb
enabled.

Christoph Müllner (7):
  RISC-V: Use .insn directive form for orc.b
  RISC-V: Add Zbb optimized memchr as ifunc
  RISC-V: Add Zbb optimized memrchr as ifunc
  RISC-V: Add Zbb optimized strchrnul as ifunc
  RISC-V: Add Zbb optimized strcmp as ifunc
  RISC-V: Add Zbb optimized strlen as ifunc
  RISC-V: Add Zbb optimized strncmp as ifunc

 sysdeps/riscv/multiarch/memchr-generic.c      | 26 ++++++++
 sysdeps/riscv/multiarch/memchr-zbb.c          | 30 +++++++++
 sysdeps/riscv/multiarch/memrchr-generic.c     | 26 ++++++++
 sysdeps/riscv/multiarch/memrchr-zbb.c         | 30 +++++++++
 sysdeps/riscv/multiarch/strchrnul-generic.c   | 26 ++++++++
 sysdeps/riscv/multiarch/strchrnul-zbb.c       | 30 +++++++++
 sysdeps/riscv/multiarch/strcmp-generic.c      | 26 ++++++++
 sysdeps/riscv/multiarch/strcmp-zbb.c          | 30 +++++++++
 sysdeps/riscv/multiarch/strlen-generic.c      | 26 ++++++++
 sysdeps/riscv/multiarch/strlen-zbb.c          | 30 +++++++++
 sysdeps/riscv/multiarch/strncmp-generic.c     | 26 ++++++++
 sysdeps/riscv/multiarch/strncmp-zbb.c         | 30 +++++++++
 sysdeps/riscv/string-fza.h                    |  3 +-
 .../unix/sysv/linux/riscv/multiarch/Makefile  | 18 ++++++
 .../linux/riscv/multiarch/ifunc-impl-list.c   | 51 ++++++++++++++--
 .../unix/sysv/linux/riscv/multiarch/memchr.c  | 57 +++++++++++++++++
 .../unix/sysv/linux/riscv/multiarch/memrchr.c | 61 +++++++++++++++++++
 .../sysv/linux/riscv/multiarch/strchrnul.c    | 61 +++++++++++++++++++
 .../unix/sysv/linux/riscv/multiarch/strcmp.c  | 57 +++++++++++++++++
 .../unix/sysv/linux/riscv/multiarch/strlen.c  | 57 +++++++++++++++++
 .../unix/sysv/linux/riscv/multiarch/strncmp.c | 57 +++++++++++++++++
 21 files changed, 752 insertions(+), 6 deletions(-)
 create mode 100644 sysdeps/riscv/multiarch/memchr-generic.c
 create mode 100644 sysdeps/riscv/multiarch/memchr-zbb.c
 create mode 100644 sysdeps/riscv/multiarch/memrchr-generic.c
 create mode 100644 sysdeps/riscv/multiarch/memrchr-zbb.c
 create mode 100644 sysdeps/riscv/multiarch/strchrnul-generic.c
 create mode 100644 sysdeps/riscv/multiarch/strchrnul-zbb.c
 create mode 100644 sysdeps/riscv/multiarch/strcmp-generic.c
 create mode 100644 sysdeps/riscv/multiarch/strcmp-zbb.c
 create mode 100644 sysdeps/riscv/multiarch/strlen-generic.c
 create mode 100644 sysdeps/riscv/multiarch/strlen-zbb.c
 create mode 100644 sysdeps/riscv/multiarch/strncmp-generic.c
 create mode 100644 sysdeps/riscv/multiarch/strncmp-zbb.c
 create mode 100644 sysdeps/unix/sysv/linux/riscv/multiarch/memchr.c
 create mode 100644 sysdeps/unix/sysv/linux/riscv/multiarch/memrchr.c
 create mode 100644 sysdeps/unix/sysv/linux/riscv/multiarch/strchrnul.c
 create mode 100644 sysdeps/unix/sysv/linux/riscv/multiarch/strcmp.c
 create mode 100644 sysdeps/unix/sysv/linux/riscv/multiarch/strlen.c
 create mode 100644 sysdeps/unix/sysv/linux/riscv/multiarch/strncmp.c

-- 
2.44.0


^ permalink raw reply	[flat|nested] 20+ messages in thread

* [PATCH 1/7] RISC-V: Use .insn directive form for orc.b
  2024-04-22  7:43 [PATCH 0/7] Add ifunc support for existing Zbb optimizations Christoph Müllner
@ 2024-04-22  7:43 ` Christoph Müllner
  2024-04-22  7:43 ` [PATCH 2/7] RISC-V: Add Zbb optimized memchr as ifunc Christoph Müllner
                   ` (5 subsequent siblings)
  6 siblings, 0 replies; 20+ messages in thread
From: Christoph Müllner @ 2024-04-22  7:43 UTC (permalink / raw)
  To: libc-alpha, Adhemerval Zanella, Palmer Dabbelt, Darius Rad,
	Andrew Waterman, Philipp Tomsich, Evan Green, DJ Delorie,
	Vineet Gupta, Kito Cheng, Jeff Law
  Cc: Christoph Müllner

The orc.b instruction is part of the Zbb extension, which was ratified in 2021.
GAS releases older than that won't know of the instruction.
Newer GAS releases accepted orc.b only if Zbb is enabled in the -march string.

This patch changes the inline asm for orc.b to the '.insn' directive form,
which allows (future changes) to build orc.b-optimized routines by defining
the Zbb feature test macro (__riscv_zbb).  Building optimized routines
this way is compatible with releases of GAS with and without Zbb support.

Signed-off-by: Christoph Müllner <christoph.muellner@vrull.eu>
---
 sysdeps/riscv/string-fza.h | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/sysdeps/riscv/string-fza.h b/sysdeps/riscv/string-fza.h
index ee5c31317f..47b1fd152f 100644
--- a/sysdeps/riscv/string-fza.h
+++ b/sysdeps/riscv/string-fza.h
@@ -36,7 +36,8 @@ find_zero_all (op_t x)
   asm ("th.tstnbz %0, %1" : "=r" (r) : "r" (x));
   return r;
 #else
-  asm ("orc.b %0, %1" : "=r" (r) : "r" (x));
+  /* orc.b using the .insn directive (supported by older Binutils releases).  */
+  asm (".insn i 0x13, 0x5, %0, %1, 0x287" : "=r" (r) : "r" (x));
   return ~r;
 #endif
 }
-- 
2.44.0


^ permalink raw reply	[flat|nested] 20+ messages in thread

* [PATCH 2/7] RISC-V: Add Zbb optimized memchr as ifunc
  2024-04-22  7:43 [PATCH 0/7] Add ifunc support for existing Zbb optimizations Christoph Müllner
  2024-04-22  7:43 ` [PATCH 1/7] RISC-V: Use .insn directive form for orc.b Christoph Müllner
@ 2024-04-22  7:43 ` Christoph Müllner
  2024-04-24 12:53   ` Adhemerval Zanella Netto
  2024-04-22  7:43 ` [PATCH 3/7] RISC-V: Add Zbb optimized memrchr " Christoph Müllner
                   ` (4 subsequent siblings)
  6 siblings, 1 reply; 20+ messages in thread
From: Christoph Müllner @ 2024-04-22  7:43 UTC (permalink / raw)
  To: libc-alpha, Adhemerval Zanella, Palmer Dabbelt, Darius Rad,
	Andrew Waterman, Philipp Tomsich, Evan Green, DJ Delorie,
	Vineet Gupta, Kito Cheng, Jeff Law
  Cc: Christoph Müllner

When building with Zbb enabled, memchr benefits from using orc.b in
find_zero_all().  This patch changes the build system such, that a
non-Zbb version as well as a Zbb version of this routine is built.
Further, a ifunc resolver is provided that selects the right routine
based on the outcome of extension probing via hwprobe().

Signed-off-by: Christoph Müllner <christoph.muellner@vrull.eu>
---
 sysdeps/riscv/multiarch/memchr-generic.c      | 26 +++++++++
 sysdeps/riscv/multiarch/memchr-zbb.c          | 30 ++++++++++
 .../unix/sysv/linux/riscv/multiarch/Makefile  |  3 +
 .../linux/riscv/multiarch/ifunc-impl-list.c   | 31 ++++++++--
 .../unix/sysv/linux/riscv/multiarch/memchr.c  | 57 +++++++++++++++++++
 5 files changed, 142 insertions(+), 5 deletions(-)
 create mode 100644 sysdeps/riscv/multiarch/memchr-generic.c
 create mode 100644 sysdeps/riscv/multiarch/memchr-zbb.c
 create mode 100644 sysdeps/unix/sysv/linux/riscv/multiarch/memchr.c

diff --git a/sysdeps/riscv/multiarch/memchr-generic.c b/sysdeps/riscv/multiarch/memchr-generic.c
new file mode 100644
index 0000000000..a96c36398b
--- /dev/null
+++ b/sysdeps/riscv/multiarch/memchr-generic.c
@@ -0,0 +1,26 @@
+/* Re-include the default memchr implementation.
+   Copyright (C) 2024 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   <https://www.gnu.org/licenses/>.  */
+
+#include <string.h>
+
+#if IS_IN(libc)
+# define MEMCHR __memchr_generic
+# undef libc_hidden_builtin_def
+# define libc_hidden_builtin_def(x)
+#endif
+#include <string/memchr.c>
diff --git a/sysdeps/riscv/multiarch/memchr-zbb.c b/sysdeps/riscv/multiarch/memchr-zbb.c
new file mode 100644
index 0000000000..bead0335ae
--- /dev/null
+++ b/sysdeps/riscv/multiarch/memchr-zbb.c
@@ -0,0 +1,30 @@
+/* Re-include the default memchr implementation for Zbb.
+   Copyright (C) 2024 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   <https://www.gnu.org/licenses/>.  */
+
+#include <string.h>
+
+#if IS_IN(libc)
+# define MEMCHR __memchr_zbb
+# undef libc_hidden_builtin_def
+# define libc_hidden_builtin_def(x)
+#endif
+/* Convince preprocessor to have Zbb instructions.  */
+#ifndef __riscv_zbb
+# define __riscv_zbb
+#endif
+#include <string/memchr.c>
diff --git a/sysdeps/unix/sysv/linux/riscv/multiarch/Makefile b/sysdeps/unix/sysv/linux/riscv/multiarch/Makefile
index fcef5659d4..5586d11c89 100644
--- a/sysdeps/unix/sysv/linux/riscv/multiarch/Makefile
+++ b/sysdeps/unix/sysv/linux/riscv/multiarch/Makefile
@@ -1,5 +1,8 @@
 ifeq ($(subdir),string)
 sysdep_routines += \
+  memchr \
+  memchr-generic \
+  memchr-zbb \
   memcpy \
   memcpy-generic \
   memcpy_noalignment \
diff --git a/sysdeps/unix/sysv/linux/riscv/multiarch/ifunc-impl-list.c b/sysdeps/unix/sysv/linux/riscv/multiarch/ifunc-impl-list.c
index 9f806d7a9e..7321144a32 100644
--- a/sysdeps/unix/sysv/linux/riscv/multiarch/ifunc-impl-list.c
+++ b/sysdeps/unix/sysv/linux/riscv/multiarch/ifunc-impl-list.c
@@ -20,19 +20,40 @@
 #include <string.h>
 #include <sys/hwprobe.h>
 
+#define ARRAY_SIZE(A) (sizeof (A) / sizeof ((A)[0]))
+
 size_t
 __libc_ifunc_impl_list (const char *name, struct libc_ifunc_impl *array,
 			size_t max)
 {
   size_t i = max;
+  struct riscv_hwprobe pairs[] = {
+    { .key = RISCV_HWPROBE_KEY_IMA_EXT_0 },
+    { .key = RISCV_HWPROBE_KEY_CPUPERF_0 },
+  };
 
+  bool has_zbb = false;
   bool fast_unaligned = false;
 
-  struct riscv_hwprobe pair = { .key = RISCV_HWPROBE_KEY_CPUPERF_0 };
-  if (__riscv_hwprobe (&pair, 1, 0, NULL, 0) == 0
-      && (pair.value & RISCV_HWPROBE_MISALIGNED_MASK)
-          == RISCV_HWPROBE_MISALIGNED_FAST)
-    fast_unaligned = true;
+  if (__riscv_hwprobe (pairs, ARRAY_SIZE (pairs), 0, NULL, 0) == 0)
+    {
+      struct riscv_hwprobe *pair;
+
+      /* RISCV_HWPROBE_KEY_IMA_EXT_0  */
+      pair = &pairs[0];
+      if (pair->value & RISCV_HWPROBE_EXT_ZBB)
+        has_zbb = true;
+
+      /* RISCV_HWPROBE_KEY_CPUPERF_0  */
+      pair = &pairs[1];
+      if ((pair->value & RISCV_HWPROBE_MISALIGNED_MASK)
+	   == RISCV_HWPROBE_MISALIGNED_FAST)
+        fast_unaligned = true;
+    }
+
+  IFUNC_IMPL (i, name, memchr,
+	      IFUNC_IMPL_ADD (array, i, memchr, has_zbb, __memchr_zbb)
+	      IFUNC_IMPL_ADD (array, i, memchr, 1, __memchr_generic))
 
   IFUNC_IMPL (i, name, memcpy,
 	      IFUNC_IMPL_ADD (array, i, memcpy, fast_unaligned,
diff --git a/sysdeps/unix/sysv/linux/riscv/multiarch/memchr.c b/sysdeps/unix/sysv/linux/riscv/multiarch/memchr.c
new file mode 100644
index 0000000000..bc076cbf24
--- /dev/null
+++ b/sysdeps/unix/sysv/linux/riscv/multiarch/memchr.c
@@ -0,0 +1,57 @@
+/* Multiple versions of memchr.
+   All versions must be listed in ifunc-impl-list.c.
+   Copyright (C) 2017-2024 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   <https://www.gnu.org/licenses/>.  */
+
+#if IS_IN (libc)
+/* Redefine memchr so that the compiler won't complain about the type
+   mismatch with the IFUNC selector in strong_alias, below.  */
+# undef memchr
+# define memchr __redirect_memchr
+# include <stdint.h>
+# include <string.h>
+# include <ifunc-init.h>
+# include <riscv-ifunc.h>
+# include <sys/hwprobe.h>
+
+extern __typeof (__redirect_memchr) __libc_memchr;
+
+extern __typeof (__redirect_memchr) __memchr_generic attribute_hidden;
+extern __typeof (__redirect_memchr) __memchr_zbb attribute_hidden;
+
+static inline __typeof (__redirect_memchr) *
+select_memchr_ifunc (uint64_t dl_hwcap, __riscv_hwprobe_t hwprobe_func)
+{
+  unsigned long long int v;
+  if (__riscv_hwprobe_one (hwprobe_func, RISCV_HWPROBE_KEY_IMA_EXT_0, &v) == 0
+      && (v & RISCV_HWPROBE_EXT_ZBB))
+    return __memchr_zbb;
+
+  return __memchr_generic;
+}
+
+riscv_libc_ifunc (__libc_memchr, select_memchr_ifunc);
+
+# undef memchr
+strong_alias (__libc_memchr, memchr);
+# ifdef SHARED
+__hidden_ver1 (memchr, __GI_memchr, __redirect_memchr)
+  __attribute__ ((visibility ("hidden"))) __attribute_copy__ (memchr);
+# endif
+#else
+# include <string/memchr.c>
+#endif
-- 
2.44.0


^ permalink raw reply	[flat|nested] 20+ messages in thread

* [PATCH 3/7] RISC-V: Add Zbb optimized memrchr as ifunc
  2024-04-22  7:43 [PATCH 0/7] Add ifunc support for existing Zbb optimizations Christoph Müllner
  2024-04-22  7:43 ` [PATCH 1/7] RISC-V: Use .insn directive form for orc.b Christoph Müllner
  2024-04-22  7:43 ` [PATCH 2/7] RISC-V: Add Zbb optimized memchr as ifunc Christoph Müllner
@ 2024-04-22  7:43 ` Christoph Müllner
  2024-04-22  7:44 ` [PATCH 4/7] RISC-V: Add Zbb optimized strchrnul " Christoph Müllner
                   ` (3 subsequent siblings)
  6 siblings, 0 replies; 20+ messages in thread
From: Christoph Müllner @ 2024-04-22  7:43 UTC (permalink / raw)
  To: libc-alpha, Adhemerval Zanella, Palmer Dabbelt, Darius Rad,
	Andrew Waterman, Philipp Tomsich, Evan Green, DJ Delorie,
	Vineet Gupta, Kito Cheng, Jeff Law
  Cc: Christoph Müllner

When building with Zbb enabled, memrchr benefits from using orc.b in
find_zero_all().  This patch changes the build system such, that a
non-Zbb version as well as a Zbb version of this routine is built.
Further, a ifunc resolver is provided that selects the right routine
based on the outcome of extension probing via hwprobe().

Signed-off-by: Christoph Müllner <christoph.muellner@vrull.eu>
---
 sysdeps/riscv/multiarch/memrchr-generic.c     | 26 ++++++++
 sysdeps/riscv/multiarch/memrchr-zbb.c         | 30 +++++++++
 .../unix/sysv/linux/riscv/multiarch/Makefile  |  3 +
 .../linux/riscv/multiarch/ifunc-impl-list.c   |  4 ++
 .../unix/sysv/linux/riscv/multiarch/memrchr.c | 61 +++++++++++++++++++
 5 files changed, 124 insertions(+)
 create mode 100644 sysdeps/riscv/multiarch/memrchr-generic.c
 create mode 100644 sysdeps/riscv/multiarch/memrchr-zbb.c
 create mode 100644 sysdeps/unix/sysv/linux/riscv/multiarch/memrchr.c

diff --git a/sysdeps/riscv/multiarch/memrchr-generic.c b/sysdeps/riscv/multiarch/memrchr-generic.c
new file mode 100644
index 0000000000..7c1286cb60
--- /dev/null
+++ b/sysdeps/riscv/multiarch/memrchr-generic.c
@@ -0,0 +1,26 @@
+/* Re-include the default memrchr implementation.
+   Copyright (C) 2024 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   <https://www.gnu.org/licenses/>.  */
+
+#include <string.h>
+
+#if IS_IN(libc)
+# define MEMRCHR __memrchr_generic
+# undef libc_hidden_builtin_def
+# define libc_hidden_builtin_def(x)
+#endif
+#include <string/memrchr.c>
diff --git a/sysdeps/riscv/multiarch/memrchr-zbb.c b/sysdeps/riscv/multiarch/memrchr-zbb.c
new file mode 100644
index 0000000000..c074c7607c
--- /dev/null
+++ b/sysdeps/riscv/multiarch/memrchr-zbb.c
@@ -0,0 +1,30 @@
+/* Re-include the default memrchr implementation for Zbb.
+   Copyright (C) 2024 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   <https://www.gnu.org/licenses/>.  */
+
+#include <string.h>
+
+#if IS_IN(libc)
+# define MEMRCHR __memrchr_zbb
+# undef libc_hidden_builtin_def
+# define libc_hidden_builtin_def(x)
+#endif
+/* Convince preprocessor to have Zbb instructions.  */
+#ifndef __riscv_zbb
+# define __riscv_zbb
+#endif
+#include <string/memrchr.c>
diff --git a/sysdeps/unix/sysv/linux/riscv/multiarch/Makefile b/sysdeps/unix/sysv/linux/riscv/multiarch/Makefile
index 5586d11c89..d36d0bd6dd 100644
--- a/sysdeps/unix/sysv/linux/riscv/multiarch/Makefile
+++ b/sysdeps/unix/sysv/linux/riscv/multiarch/Makefile
@@ -6,6 +6,9 @@ sysdep_routines += \
   memcpy \
   memcpy-generic \
   memcpy_noalignment \
+  memrchr \
+  memrchr-generic \
+  memrchr-zbb \
   # sysdep_routines
 
 CFLAGS-memcpy_noalignment.c += -mno-strict-align
diff --git a/sysdeps/unix/sysv/linux/riscv/multiarch/ifunc-impl-list.c b/sysdeps/unix/sysv/linux/riscv/multiarch/ifunc-impl-list.c
index 7321144a32..f4ec58b9d8 100644
--- a/sysdeps/unix/sysv/linux/riscv/multiarch/ifunc-impl-list.c
+++ b/sysdeps/unix/sysv/linux/riscv/multiarch/ifunc-impl-list.c
@@ -60,5 +60,9 @@ __libc_ifunc_impl_list (const char *name, struct libc_ifunc_impl *array,
 			      __memcpy_noalignment)
 	      IFUNC_IMPL_ADD (array, i, memcpy, 1, __memcpy_generic))
 
+  IFUNC_IMPL (i, name, memrchr,
+	      IFUNC_IMPL_ADD (array, i, memrchr, has_zbb, __memrchr_zbb)
+	      IFUNC_IMPL_ADD (array, i, memrchr, 1, __memrchr_generic))
+
   return 0;
 }
diff --git a/sysdeps/unix/sysv/linux/riscv/multiarch/memrchr.c b/sysdeps/unix/sysv/linux/riscv/multiarch/memrchr.c
new file mode 100644
index 0000000000..e19ac471c1
--- /dev/null
+++ b/sysdeps/unix/sysv/linux/riscv/multiarch/memrchr.c
@@ -0,0 +1,61 @@
+/* Multiple versions of memrchr.
+   All versions must be listed in ifunc-impl-list.c.
+   Copyright (C) 2017-2024 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   <https://www.gnu.org/licenses/>.  */
+
+#if IS_IN (libc)
+/* Redefine memrchr so that the compiler won't complain about the type
+   mismatch with the IFUNC selector in strong_alias, below.  */
+# undef memrchr
+# undef __memrchr
+# define memrchr __redirect_memrchr
+# define __memrchr __redirect___memrchr
+# include <stdint.h>
+# include <string.h>
+# include <ifunc-init.h>
+# include <riscv-ifunc.h>
+# include <sys/hwprobe.h>
+
+extern __typeof (__redirect_memrchr) __libc_memrchr;
+
+extern __typeof (__redirect_memrchr) __memrchr_generic attribute_hidden;
+extern __typeof (__redirect_memrchr) __memrchr_zbb attribute_hidden;
+
+static inline __typeof (__redirect_memrchr) *
+select_memrchr_ifunc (uint64_t dl_hwcap, __riscv_hwprobe_t hwprobe_func)
+{
+  unsigned long long int v;
+  if (__riscv_hwprobe_one (hwprobe_func, RISCV_HWPROBE_KEY_IMA_EXT_0, &v) == 0
+      && (v & RISCV_HWPROBE_EXT_ZBB))
+    return __memrchr_zbb;
+
+  return __memrchr_generic;
+}
+
+riscv_libc_ifunc (__libc_memrchr, select_memrchr_ifunc);
+
+# undef memrchr
+# undef __memrchr
+strong_alias (__libc_memrchr, __memrchr);
+weak_alias (__memrchr, memrchr)
+# ifdef SHARED
+__hidden_ver1 (memrchr, __GI___memrchr, __redirect_memrchr)
+  __attribute__ ((visibility ("hidden"))) __attribute_copy__ (memrchr);
+# endif
+#else
+# include <string/memrchr.c>
+#endif
-- 
2.44.0


^ permalink raw reply	[flat|nested] 20+ messages in thread

* [PATCH 4/7] RISC-V: Add Zbb optimized strchrnul as ifunc
  2024-04-22  7:43 [PATCH 0/7] Add ifunc support for existing Zbb optimizations Christoph Müllner
                   ` (2 preceding siblings ...)
  2024-04-22  7:43 ` [PATCH 3/7] RISC-V: Add Zbb optimized memrchr " Christoph Müllner
@ 2024-04-22  7:44 ` Christoph Müllner
  2024-04-22  7:44 ` [PATCH 5/7] RISC-V: Add Zbb optimized strcmp " Christoph Müllner
                   ` (2 subsequent siblings)
  6 siblings, 0 replies; 20+ messages in thread
From: Christoph Müllner @ 2024-04-22  7:44 UTC (permalink / raw)
  To: libc-alpha, Adhemerval Zanella, Palmer Dabbelt, Darius Rad,
	Andrew Waterman, Philipp Tomsich, Evan Green, DJ Delorie,
	Vineet Gupta, Kito Cheng, Jeff Law
  Cc: Christoph Müllner

When building with Zbb enabled, strchrnul benefits from using orc.b in
find_zero_all().  This patch changes the build system such, that a
non-Zbb version as well as a Zbb version of this routine is built.
Further, a ifunc resolver is provided that selects the right routine
based on the outcome of extension probing via hwprobe().

Signed-off-by: Christoph Müllner <christoph.muellner@vrull.eu>
---
 sysdeps/riscv/multiarch/strchrnul-generic.c   | 26 ++++++++
 sysdeps/riscv/multiarch/strchrnul-zbb.c       | 30 +++++++++
 .../unix/sysv/linux/riscv/multiarch/Makefile  |  3 +
 .../linux/riscv/multiarch/ifunc-impl-list.c   |  4 ++
 .../sysv/linux/riscv/multiarch/strchrnul.c    | 61 +++++++++++++++++++
 5 files changed, 124 insertions(+)
 create mode 100644 sysdeps/riscv/multiarch/strchrnul-generic.c
 create mode 100644 sysdeps/riscv/multiarch/strchrnul-zbb.c
 create mode 100644 sysdeps/unix/sysv/linux/riscv/multiarch/strchrnul.c

diff --git a/sysdeps/riscv/multiarch/strchrnul-generic.c b/sysdeps/riscv/multiarch/strchrnul-generic.c
new file mode 100644
index 0000000000..0b64273ebf
--- /dev/null
+++ b/sysdeps/riscv/multiarch/strchrnul-generic.c
@@ -0,0 +1,26 @@
+/* Re-include the default strchrnul implementation.
+   Copyright (C) 2024 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   <https://www.gnu.org/licenses/>.  */
+
+#include <string.h>
+
+#if IS_IN(libc)
+# define STRCHRNUL __strchrnul_generic
+# undef libc_hidden_builtin_def
+# define libc_hidden_builtin_def(x)
+#endif
+#include <string/strchrnul.c>
diff --git a/sysdeps/riscv/multiarch/strchrnul-zbb.c b/sysdeps/riscv/multiarch/strchrnul-zbb.c
new file mode 100644
index 0000000000..f12fba7709
--- /dev/null
+++ b/sysdeps/riscv/multiarch/strchrnul-zbb.c
@@ -0,0 +1,30 @@
+/* Re-include the default strchrnul implementation for Zbb.
+   Copyright (C) 2024 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   <https://www.gnu.org/licenses/>.  */
+
+#include <string.h>
+
+#if IS_IN(libc)
+# define STRCHRNUL __strchrnul_zbb
+# undef libc_hidden_builtin_def
+# define libc_hidden_builtin_def(x)
+#endif
+/* Convince preprocessor to have Zbb instructions.  */
+#ifndef __riscv_zbb
+# define __riscv_zbb
+#endif
+#include <string/strchrnul.c>
diff --git a/sysdeps/unix/sysv/linux/riscv/multiarch/Makefile b/sysdeps/unix/sysv/linux/riscv/multiarch/Makefile
index d36d0bd6dd..242c2527d2 100644
--- a/sysdeps/unix/sysv/linux/riscv/multiarch/Makefile
+++ b/sysdeps/unix/sysv/linux/riscv/multiarch/Makefile
@@ -9,6 +9,9 @@ sysdep_routines += \
   memrchr \
   memrchr-generic \
   memrchr-zbb \
+  strchrnul \
+  strchrnul-generic \
+  strchrnul-zbb \
   # sysdep_routines
 
 CFLAGS-memcpy_noalignment.c += -mno-strict-align
diff --git a/sysdeps/unix/sysv/linux/riscv/multiarch/ifunc-impl-list.c b/sysdeps/unix/sysv/linux/riscv/multiarch/ifunc-impl-list.c
index f4ec58b9d8..97cc821a56 100644
--- a/sysdeps/unix/sysv/linux/riscv/multiarch/ifunc-impl-list.c
+++ b/sysdeps/unix/sysv/linux/riscv/multiarch/ifunc-impl-list.c
@@ -64,5 +64,9 @@ __libc_ifunc_impl_list (const char *name, struct libc_ifunc_impl *array,
 	      IFUNC_IMPL_ADD (array, i, memrchr, has_zbb, __memrchr_zbb)
 	      IFUNC_IMPL_ADD (array, i, memrchr, 1, __memrchr_generic))
 
+  IFUNC_IMPL (i, name, strchrnul,
+	      IFUNC_IMPL_ADD (array, i, strchrnul, has_zbb, __strchrnul_zbb)
+	      IFUNC_IMPL_ADD (array, i, strchrnul, 1, __strchrnul_generic))
+
   return 0;
 }
diff --git a/sysdeps/unix/sysv/linux/riscv/multiarch/strchrnul.c b/sysdeps/unix/sysv/linux/riscv/multiarch/strchrnul.c
new file mode 100644
index 0000000000..c3f245678e
--- /dev/null
+++ b/sysdeps/unix/sysv/linux/riscv/multiarch/strchrnul.c
@@ -0,0 +1,61 @@
+/* Multiple versions of strchrnul.
+   All versions must be listed in ifunc-impl-list.c.
+   Copyright (C) 2017-2024 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   <https://www.gnu.org/licenses/>.  */
+
+#if IS_IN (libc)
+/* Redefine strchrnul so that the compiler won't complain about the type
+   mismatch with the IFUNC selector in strong_alias, below.  */
+# undef strchrnul
+# undef __strchrnul
+# define strchrnul __redirect_strchrnul
+# define __strchrnul __redirect_strchrnul
+# include <stdint.h>
+# include <string.h>
+# include <ifunc-init.h>
+# include <riscv-ifunc.h>
+# include <sys/hwprobe.h>
+
+extern __typeof (__redirect_strchrnul) __libc_strchrnul;
+
+extern __typeof (__redirect_strchrnul) __strchrnul_generic attribute_hidden;
+extern __typeof (__redirect_strchrnul) __strchrnul_zbb attribute_hidden;
+
+static inline __typeof (__redirect_strchrnul) *
+select_strchrnul_ifunc (uint64_t dl_hwcap, __riscv_hwprobe_t hwprobe_func)
+{
+  unsigned long long int v;
+  if (__riscv_hwprobe_one (hwprobe_func, RISCV_HWPROBE_KEY_IMA_EXT_0, &v) == 0
+      && (v & RISCV_HWPROBE_EXT_ZBB))
+    return __strchrnul_zbb;
+
+  return __strchrnul_generic;
+}
+
+riscv_libc_ifunc (__libc_strchrnul, select_strchrnul_ifunc);
+
+# undef strchrnul
+# undef __strchrnul
+strong_alias (__libc_strchrnul, __strchrnul);
+weak_alias (__strchrnul, strchrnul)
+# ifdef SHARED
+__hidden_ver1 (strchrnul, __GI___strchrnul, __redirect_strchrnul)
+  __attribute__ ((visibility ("hidden"))) __attribute_copy__ (strchrnul);
+# endif
+#else
+# include <string/strchrnul.c>
+#endif
-- 
2.44.0


^ permalink raw reply	[flat|nested] 20+ messages in thread

* [PATCH 5/7] RISC-V: Add Zbb optimized strcmp as ifunc
  2024-04-22  7:43 [PATCH 0/7] Add ifunc support for existing Zbb optimizations Christoph Müllner
                   ` (3 preceding siblings ...)
  2024-04-22  7:44 ` [PATCH 4/7] RISC-V: Add Zbb optimized strchrnul " Christoph Müllner
@ 2024-04-22  7:44 ` Christoph Müllner
  2024-04-22  7:44 ` [PATCH 6/7] RISC-V: Add Zbb optimized strlen " Christoph Müllner
  2024-04-22  7:44 ` [PATCH 7/7] RISC-V: Add Zbb optimized strncmp " Christoph Müllner
  6 siblings, 0 replies; 20+ messages in thread
From: Christoph Müllner @ 2024-04-22  7:44 UTC (permalink / raw)
  To: libc-alpha, Adhemerval Zanella, Palmer Dabbelt, Darius Rad,
	Andrew Waterman, Philipp Tomsich, Evan Green, DJ Delorie,
	Vineet Gupta, Kito Cheng, Jeff Law
  Cc: Christoph Müllner

When building with Zbb enabled, strcmp benefits from using orc.b in
find_zero_all().  This patch changes the build system such, that a
non-Zbb version as well as a Zbb version of this routine is built.
Further, a ifunc resolver is provided that selects the right routine
based on the outcome of extension probing via hwprobe().

Signed-off-by: Christoph Müllner <christoph.muellner@vrull.eu>
---
 sysdeps/riscv/multiarch/strcmp-generic.c      | 26 +++++++++
 sysdeps/riscv/multiarch/strcmp-zbb.c          | 30 ++++++++++
 .../unix/sysv/linux/riscv/multiarch/Makefile  |  3 +
 .../linux/riscv/multiarch/ifunc-impl-list.c   |  4 ++
 .../unix/sysv/linux/riscv/multiarch/strcmp.c  | 57 +++++++++++++++++++
 5 files changed, 120 insertions(+)
 create mode 100644 sysdeps/riscv/multiarch/strcmp-generic.c
 create mode 100644 sysdeps/riscv/multiarch/strcmp-zbb.c
 create mode 100644 sysdeps/unix/sysv/linux/riscv/multiarch/strcmp.c

diff --git a/sysdeps/riscv/multiarch/strcmp-generic.c b/sysdeps/riscv/multiarch/strcmp-generic.c
new file mode 100644
index 0000000000..5da954d426
--- /dev/null
+++ b/sysdeps/riscv/multiarch/strcmp-generic.c
@@ -0,0 +1,26 @@
+/* Re-include the default strcmp implementation.
+   Copyright (C) 2024 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   <https://www.gnu.org/licenses/>.  */
+
+#include <string.h>
+
+#if IS_IN(libc)
+# define STRCMP __strcmp_generic
+# undef libc_hidden_builtin_def
+# define libc_hidden_builtin_def(x)
+#endif
+#include <string/strcmp.c>
diff --git a/sysdeps/riscv/multiarch/strcmp-zbb.c b/sysdeps/riscv/multiarch/strcmp-zbb.c
new file mode 100644
index 0000000000..2ccde14f53
--- /dev/null
+++ b/sysdeps/riscv/multiarch/strcmp-zbb.c
@@ -0,0 +1,30 @@
+/* Re-include the default strcmp implementation for Zbb.
+   Copyright (C) 2024 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   <https://www.gnu.org/licenses/>.  */
+
+#include <string.h>
+
+#if IS_IN(libc)
+# define STRCMP __strcmp_zbb
+# undef libc_hidden_builtin_def
+# define libc_hidden_builtin_def(x)
+#endif
+/* Convince preprocessor to have Zbb instructions.  */
+#ifndef __riscv_zbb
+# define __riscv_zbb
+#endif
+#include <string/strcmp.c>
diff --git a/sysdeps/unix/sysv/linux/riscv/multiarch/Makefile b/sysdeps/unix/sysv/linux/riscv/multiarch/Makefile
index 242c2527d2..8aef9f1638 100644
--- a/sysdeps/unix/sysv/linux/riscv/multiarch/Makefile
+++ b/sysdeps/unix/sysv/linux/riscv/multiarch/Makefile
@@ -12,6 +12,9 @@ sysdep_routines += \
   strchrnul \
   strchrnul-generic \
   strchrnul-zbb \
+  strcmp \
+  strcmp-generic \
+  strcmp-zbb \
   # sysdep_routines
 
 CFLAGS-memcpy_noalignment.c += -mno-strict-align
diff --git a/sysdeps/unix/sysv/linux/riscv/multiarch/ifunc-impl-list.c b/sysdeps/unix/sysv/linux/riscv/multiarch/ifunc-impl-list.c
index 97cc821a56..f4df7a2b0b 100644
--- a/sysdeps/unix/sysv/linux/riscv/multiarch/ifunc-impl-list.c
+++ b/sysdeps/unix/sysv/linux/riscv/multiarch/ifunc-impl-list.c
@@ -68,5 +68,9 @@ __libc_ifunc_impl_list (const char *name, struct libc_ifunc_impl *array,
 	      IFUNC_IMPL_ADD (array, i, strchrnul, has_zbb, __strchrnul_zbb)
 	      IFUNC_IMPL_ADD (array, i, strchrnul, 1, __strchrnul_generic))
 
+  IFUNC_IMPL (i, name, strcmp,
+	      IFUNC_IMPL_ADD (array, i, strcmp, has_zbb, __strcmp_zbb)
+	      IFUNC_IMPL_ADD (array, i, strcmp, 1, __strcmp_generic))
+
   return 0;
 }
diff --git a/sysdeps/unix/sysv/linux/riscv/multiarch/strcmp.c b/sysdeps/unix/sysv/linux/riscv/multiarch/strcmp.c
new file mode 100644
index 0000000000..47b3277e77
--- /dev/null
+++ b/sysdeps/unix/sysv/linux/riscv/multiarch/strcmp.c
@@ -0,0 +1,57 @@
+/* Multiple versions of strcmp.
+   All versions must be listed in ifunc-impl-list.c.
+   Copyright (C) 2017-2024 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   <https://www.gnu.org/licenses/>.  */
+
+#if IS_IN (libc)
+/* Redefine strcmp so that the compiler won't complain about the type
+   mismatch with the IFUNC selector in strong_alias, below.  */
+# undef strcmp
+# define strcmp __redirect_strcmp
+# include <stdint.h>
+# include <string.h>
+# include <ifunc-init.h>
+# include <riscv-ifunc.h>
+# include <sys/hwprobe.h>
+
+extern __typeof (__redirect_strcmp) __libc_strcmp;
+
+extern __typeof (__redirect_strcmp) __strcmp_generic attribute_hidden;
+extern __typeof (__redirect_strcmp) __strcmp_zbb attribute_hidden;
+
+static inline __typeof (__redirect_strcmp) *
+select_strcmp_ifunc (uint64_t dl_hwcap, __riscv_hwprobe_t hwprobe_func)
+{
+  unsigned long long int v;
+  if (__riscv_hwprobe_one (hwprobe_func, RISCV_HWPROBE_KEY_IMA_EXT_0, &v) == 0
+      && (v & RISCV_HWPROBE_EXT_ZBB))
+    return __strcmp_zbb;
+
+  return __strcmp_generic;
+}
+
+riscv_libc_ifunc (__libc_strcmp, select_strcmp_ifunc);
+
+# undef strcmp
+strong_alias (__libc_strcmp, strcmp);
+# ifdef SHARED
+__hidden_ver1 (strcmp, __GI_strcmp, __redirect_strcmp)
+  __attribute__ ((visibility ("hidden"))) __attribute_copy__ (strcmp);
+# endif
+#else
+# include <string/strcmp.c>
+#endif
-- 
2.44.0


^ permalink raw reply	[flat|nested] 20+ messages in thread

* [PATCH 6/7] RISC-V: Add Zbb optimized strlen as ifunc
  2024-04-22  7:43 [PATCH 0/7] Add ifunc support for existing Zbb optimizations Christoph Müllner
                   ` (4 preceding siblings ...)
  2024-04-22  7:44 ` [PATCH 5/7] RISC-V: Add Zbb optimized strcmp " Christoph Müllner
@ 2024-04-22  7:44 ` Christoph Müllner
  2024-04-22  7:44 ` [PATCH 7/7] RISC-V: Add Zbb optimized strncmp " Christoph Müllner
  6 siblings, 0 replies; 20+ messages in thread
From: Christoph Müllner @ 2024-04-22  7:44 UTC (permalink / raw)
  To: libc-alpha, Adhemerval Zanella, Palmer Dabbelt, Darius Rad,
	Andrew Waterman, Philipp Tomsich, Evan Green, DJ Delorie,
	Vineet Gupta, Kito Cheng, Jeff Law
  Cc: Christoph Müllner

When building with Zbb enabled, strlen benefits from using orc.b in
find_zero_all().  This patch changes the build system such, that a
non-Zbb version as well as a Zbb version of this routine is built.
Further, a ifunc resolver is provided that selects the right routine
based on the outcome of extension probing via hwprobe().

Signed-off-by: Christoph Müllner <christoph.muellner@vrull.eu>
---
 sysdeps/riscv/multiarch/strlen-generic.c      | 26 +++++++++
 sysdeps/riscv/multiarch/strlen-zbb.c          | 30 ++++++++++
 .../unix/sysv/linux/riscv/multiarch/Makefile  |  3 +
 .../linux/riscv/multiarch/ifunc-impl-list.c   |  4 ++
 .../unix/sysv/linux/riscv/multiarch/strlen.c  | 57 +++++++++++++++++++
 5 files changed, 120 insertions(+)
 create mode 100644 sysdeps/riscv/multiarch/strlen-generic.c
 create mode 100644 sysdeps/riscv/multiarch/strlen-zbb.c
 create mode 100644 sysdeps/unix/sysv/linux/riscv/multiarch/strlen.c

diff --git a/sysdeps/riscv/multiarch/strlen-generic.c b/sysdeps/riscv/multiarch/strlen-generic.c
new file mode 100644
index 0000000000..e057409278
--- /dev/null
+++ b/sysdeps/riscv/multiarch/strlen-generic.c
@@ -0,0 +1,26 @@
+/* Re-include the default strlen implementation.
+   Copyright (C) 2024 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   <https://www.gnu.org/licenses/>.  */
+
+#include <string.h>
+
+#if IS_IN(libc)
+# define STRLEN __strlen_generic
+# undef libc_hidden_builtin_def
+# define libc_hidden_builtin_def(x)
+#endif
+#include <string/strlen.c>
diff --git a/sysdeps/riscv/multiarch/strlen-zbb.c b/sysdeps/riscv/multiarch/strlen-zbb.c
new file mode 100644
index 0000000000..81ef16bb4a
--- /dev/null
+++ b/sysdeps/riscv/multiarch/strlen-zbb.c
@@ -0,0 +1,30 @@
+/* Re-include the default strlen implementation for Zbb.
+   Copyright (C) 2024 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   <https://www.gnu.org/licenses/>.  */
+
+#include <string.h>
+
+#if IS_IN(libc)
+# define STRLEN __strlen_zbb
+# undef libc_hidden_builtin_def
+# define libc_hidden_builtin_def(x)
+#endif
+/* Convince preprocessor to have Zbb instructions.  */
+#ifndef __riscv_zbb
+# define __riscv_zbb
+#endif
+#include <string/strlen.c>
diff --git a/sysdeps/unix/sysv/linux/riscv/multiarch/Makefile b/sysdeps/unix/sysv/linux/riscv/multiarch/Makefile
index 8aef9f1638..daf5af9608 100644
--- a/sysdeps/unix/sysv/linux/riscv/multiarch/Makefile
+++ b/sysdeps/unix/sysv/linux/riscv/multiarch/Makefile
@@ -15,6 +15,9 @@ sysdep_routines += \
   strcmp \
   strcmp-generic \
   strcmp-zbb \
+  strlen \
+  strlen-generic \
+  strlen-zbb \
   # sysdep_routines
 
 CFLAGS-memcpy_noalignment.c += -mno-strict-align
diff --git a/sysdeps/unix/sysv/linux/riscv/multiarch/ifunc-impl-list.c b/sysdeps/unix/sysv/linux/riscv/multiarch/ifunc-impl-list.c
index f4df7a2b0b..f5f34818ed 100644
--- a/sysdeps/unix/sysv/linux/riscv/multiarch/ifunc-impl-list.c
+++ b/sysdeps/unix/sysv/linux/riscv/multiarch/ifunc-impl-list.c
@@ -72,5 +72,9 @@ __libc_ifunc_impl_list (const char *name, struct libc_ifunc_impl *array,
 	      IFUNC_IMPL_ADD (array, i, strcmp, has_zbb, __strcmp_zbb)
 	      IFUNC_IMPL_ADD (array, i, strcmp, 1, __strcmp_generic))
 
+  IFUNC_IMPL (i, name, strlen,
+	      IFUNC_IMPL_ADD (array, i, strlen, has_zbb, __strlen_zbb)
+	      IFUNC_IMPL_ADD (array, i, strlen, 1, __strlen_generic))
+
   return 0;
 }
diff --git a/sysdeps/unix/sysv/linux/riscv/multiarch/strlen.c b/sysdeps/unix/sysv/linux/riscv/multiarch/strlen.c
new file mode 100644
index 0000000000..6833d05bc8
--- /dev/null
+++ b/sysdeps/unix/sysv/linux/riscv/multiarch/strlen.c
@@ -0,0 +1,57 @@
+/* Multiple versions of strlen.
+   All versions must be listed in ifunc-impl-list.c.
+   Copyright (C) 2017-2024 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   <https://www.gnu.org/licenses/>.  */
+
+#if IS_IN (libc)
+/* Redefine strlen so that the compiler won't complain about the type
+   mismatch with the IFUNC selector in strong_alias, below.  */
+# undef strlen
+# define strlen __redirect_strlen
+# include <stdint.h>
+# include <string.h>
+# include <ifunc-init.h>
+# include <riscv-ifunc.h>
+# include <sys/hwprobe.h>
+
+extern __typeof (__redirect_strlen) __libc_strlen;
+
+extern __typeof (__redirect_strlen) __strlen_generic attribute_hidden;
+extern __typeof (__redirect_strlen) __strlen_zbb attribute_hidden;
+
+static inline __typeof (__redirect_strlen) *
+select_strlen_ifunc (uint64_t dl_hwcap, __riscv_hwprobe_t hwprobe_func)
+{
+  unsigned long long int v;
+  if (__riscv_hwprobe_one (hwprobe_func, RISCV_HWPROBE_KEY_IMA_EXT_0, &v) == 0
+      && (v & RISCV_HWPROBE_EXT_ZBB))
+    return __strlen_zbb;
+
+  return __strlen_generic;
+}
+
+riscv_libc_ifunc (__libc_strlen, select_strlen_ifunc);
+
+# undef strlen
+strong_alias (__libc_strlen, strlen);
+# ifdef SHARED
+__hidden_ver1 (strlen, __GI_strlen, __redirect_strlen)
+  __attribute__ ((visibility ("hidden"))) __attribute_copy__ (strlen);
+# endif
+#else
+# include <string/strlen.c>
+#endif
-- 
2.44.0


^ permalink raw reply	[flat|nested] 20+ messages in thread

* [PATCH 7/7] RISC-V: Add Zbb optimized strncmp as ifunc
  2024-04-22  7:43 [PATCH 0/7] Add ifunc support for existing Zbb optimizations Christoph Müllner
                   ` (5 preceding siblings ...)
  2024-04-22  7:44 ` [PATCH 6/7] RISC-V: Add Zbb optimized strlen " Christoph Müllner
@ 2024-04-22  7:44 ` Christoph Müllner
  6 siblings, 0 replies; 20+ messages in thread
From: Christoph Müllner @ 2024-04-22  7:44 UTC (permalink / raw)
  To: libc-alpha, Adhemerval Zanella, Palmer Dabbelt, Darius Rad,
	Andrew Waterman, Philipp Tomsich, Evan Green, DJ Delorie,
	Vineet Gupta, Kito Cheng, Jeff Law
  Cc: Christoph Müllner

When building with Zbb enabled, strncmp benefits from using orc.b in
find_zero_all().  This patch changes the build system such, that a
non-Zbb version as well as a Zbb version of this routine is built.
Further, a ifunc resolver is provided that selects the right routine
based on the outcome of extension probing via hwprobe().

Signed-off-by: Christoph Müllner <christoph.muellner@vrull.eu>
---
 sysdeps/riscv/multiarch/strncmp-generic.c     | 26 +++++++++
 sysdeps/riscv/multiarch/strncmp-zbb.c         | 30 ++++++++++
 .../unix/sysv/linux/riscv/multiarch/Makefile  |  3 +
 .../linux/riscv/multiarch/ifunc-impl-list.c   |  4 ++
 .../unix/sysv/linux/riscv/multiarch/strncmp.c | 57 +++++++++++++++++++
 5 files changed, 120 insertions(+)
 create mode 100644 sysdeps/riscv/multiarch/strncmp-generic.c
 create mode 100644 sysdeps/riscv/multiarch/strncmp-zbb.c
 create mode 100644 sysdeps/unix/sysv/linux/riscv/multiarch/strncmp.c

diff --git a/sysdeps/riscv/multiarch/strncmp-generic.c b/sysdeps/riscv/multiarch/strncmp-generic.c
new file mode 100644
index 0000000000..67eb89e62e
--- /dev/null
+++ b/sysdeps/riscv/multiarch/strncmp-generic.c
@@ -0,0 +1,26 @@
+/* Re-include the default strncmp implementation.
+   Copyright (C) 2024 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   <https://www.gnu.org/licenses/>.  */
+
+#include <string.h>
+
+#if IS_IN(libc)
+# define STRNCMP __strncmp_generic
+# undef libc_hidden_builtin_def
+# define libc_hidden_builtin_def(x)
+#endif
+#include <string/strncmp.c>
diff --git a/sysdeps/riscv/multiarch/strncmp-zbb.c b/sysdeps/riscv/multiarch/strncmp-zbb.c
new file mode 100644
index 0000000000..993b494d8b
--- /dev/null
+++ b/sysdeps/riscv/multiarch/strncmp-zbb.c
@@ -0,0 +1,30 @@
+/* Re-include the default strncmp implementation for Zbb.
+   Copyright (C) 2024 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   <https://www.gnu.org/licenses/>.  */
+
+#include <string.h>
+
+#if IS_IN(libc)
+# define STRNCMP __strncmp_zbb
+# undef libc_hidden_builtin_def
+# define libc_hidden_builtin_def(x)
+#endif
+/* Convince preprocessor to have Zbb instructions.  */
+#ifndef __riscv_zbb
+# define __riscv_zbb
+#endif
+#include <string/strncmp.c>
diff --git a/sysdeps/unix/sysv/linux/riscv/multiarch/Makefile b/sysdeps/unix/sysv/linux/riscv/multiarch/Makefile
index daf5af9608..4a34eab00c 100644
--- a/sysdeps/unix/sysv/linux/riscv/multiarch/Makefile
+++ b/sysdeps/unix/sysv/linux/riscv/multiarch/Makefile
@@ -18,6 +18,9 @@ sysdep_routines += \
   strlen \
   strlen-generic \
   strlen-zbb \
+  strncmp \
+  strncmp-generic \
+  strncmp-zbb \
   # sysdep_routines
 
 CFLAGS-memcpy_noalignment.c += -mno-strict-align
diff --git a/sysdeps/unix/sysv/linux/riscv/multiarch/ifunc-impl-list.c b/sysdeps/unix/sysv/linux/riscv/multiarch/ifunc-impl-list.c
index f5f34818ed..583c77934e 100644
--- a/sysdeps/unix/sysv/linux/riscv/multiarch/ifunc-impl-list.c
+++ b/sysdeps/unix/sysv/linux/riscv/multiarch/ifunc-impl-list.c
@@ -76,5 +76,9 @@ __libc_ifunc_impl_list (const char *name, struct libc_ifunc_impl *array,
 	      IFUNC_IMPL_ADD (array, i, strlen, has_zbb, __strlen_zbb)
 	      IFUNC_IMPL_ADD (array, i, strlen, 1, __strlen_generic))
 
+  IFUNC_IMPL (i, name, strncmp,
+	      IFUNC_IMPL_ADD (array, i, strncmp, has_zbb, __strncmp_zbb)
+	      IFUNC_IMPL_ADD (array, i, strncmp, 1, __strncmp_generic))
+
   return 0;
 }
diff --git a/sysdeps/unix/sysv/linux/riscv/multiarch/strncmp.c b/sysdeps/unix/sysv/linux/riscv/multiarch/strncmp.c
new file mode 100644
index 0000000000..9bc026681f
--- /dev/null
+++ b/sysdeps/unix/sysv/linux/riscv/multiarch/strncmp.c
@@ -0,0 +1,57 @@
+/* Multiple versions of strncmp.
+   All versions must be listed in ifunc-impl-list.c.
+   Copyright (C) 2017-2024 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   <https://www.gnu.org/licenses/>.  */
+
+#if IS_IN (libc)
+/* Redefine strncmp so that the compiler won't complain about the type
+   mismatch with the IFUNC selector in strong_alias, below.  */
+# undef strncmp
+# define strncmp __redirect_strncmp
+# include <stdint.h>
+# include <string.h>
+# include <ifunc-init.h>
+# include <riscv-ifunc.h>
+# include <sys/hwprobe.h>
+
+extern __typeof (__redirect_strncmp) __libc_strncmp;
+
+extern __typeof (__redirect_strncmp) __strncmp_generic attribute_hidden;
+extern __typeof (__redirect_strncmp) __strncmp_zbb attribute_hidden;
+
+static inline __typeof (__redirect_strncmp) *
+select_strncmp_ifunc (uint64_t dl_hwcap, __riscv_hwprobe_t hwprobe_func)
+{
+  unsigned long long int v;
+  if (__riscv_hwprobe_one (hwprobe_func, RISCV_HWPROBE_KEY_IMA_EXT_0, &v) == 0
+      && (v & RISCV_HWPROBE_EXT_ZBB))
+    return __strncmp_zbb;
+
+  return __strncmp_generic;
+}
+
+riscv_libc_ifunc (__libc_strncmp, select_strncmp_ifunc);
+
+# undef strncmp
+strong_alias (__libc_strncmp, strncmp);
+# ifdef SHARED
+__hidden_ver1 (strncmp, __GI_strncmp, __redirect_strncmp)
+  __attribute__ ((visibility ("hidden"))) __attribute_copy__ (strncmp);
+# endif
+#else
+# include <string/strncmp.c>
+#endif
-- 
2.44.0


^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH 2/7] RISC-V: Add Zbb optimized memchr as ifunc
  2024-04-22  7:43 ` [PATCH 2/7] RISC-V: Add Zbb optimized memchr as ifunc Christoph Müllner
@ 2024-04-24 12:53   ` Adhemerval Zanella Netto
  2024-04-24 13:16     ` Christoph Müllner
  0 siblings, 1 reply; 20+ messages in thread
From: Adhemerval Zanella Netto @ 2024-04-24 12:53 UTC (permalink / raw)
  To: Christoph Müllner, libc-alpha, Palmer Dabbelt, Darius Rad,
	Andrew Waterman, Philipp Tomsich, Evan Green, DJ Delorie,
	Vineet Gupta, Kito Cheng, Jeff Law



On 22/04/24 04:43, Christoph Müllner wrote:
> When building with Zbb enabled, memchr benefits from using orc.b in
> find_zero_all().  This patch changes the build system such, that a
> non-Zbb version as well as a Zbb version of this routine is built.
> Further, a ifunc resolver is provided that selects the right routine
> based on the outcome of extension probing via hwprobe().
> 
> Signed-off-by: Christoph Müllner <christoph.muellner@vrull.eu>
> ---
>  sysdeps/riscv/multiarch/memchr-generic.c      | 26 +++++++++
>  sysdeps/riscv/multiarch/memchr-zbb.c          | 30 ++++++++++
>  .../unix/sysv/linux/riscv/multiarch/Makefile  |  3 +
>  .../linux/riscv/multiarch/ifunc-impl-list.c   | 31 ++++++++--
>  .../unix/sysv/linux/riscv/multiarch/memchr.c  | 57 +++++++++++++++++++
>  5 files changed, 142 insertions(+), 5 deletions(-)
>  create mode 100644 sysdeps/riscv/multiarch/memchr-generic.c
>  create mode 100644 sysdeps/riscv/multiarch/memchr-zbb.c
>  create mode 100644 sysdeps/unix/sysv/linux/riscv/multiarch/memchr.c
> 
> diff --git a/sysdeps/riscv/multiarch/memchr-generic.c b/sysdeps/riscv/multiarch/memchr-generic.c
> new file mode 100644
> index 0000000000..a96c36398b
> --- /dev/null
> +++ b/sysdeps/riscv/multiarch/memchr-generic.c
> @@ -0,0 +1,26 @@
> +/* Re-include the default memchr implementation.
> +   Copyright (C) 2024 Free Software Foundation, Inc.
> +   This file is part of the GNU C Library.
> +
> +   The GNU C Library is free software; you can redistribute it and/or
> +   modify it under the terms of the GNU Lesser General Public
> +   License as published by the Free Software Foundation; either
> +   version 2.1 of the License, or (at your option) any later version.
> +
> +   The GNU C Library is distributed in the hope that it will be useful,
> +   but WITHOUT ANY WARRANTY; without even the implied warranty of
> +   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
> +   Lesser General Public License for more details.
> +
> +   You should have received a copy of the GNU Lesser General Public
> +   License along with the GNU C Library; if not, see
> +   <https://www.gnu.org/licenses/>.  */
> +
> +#include <string.h>
> +
> +#if IS_IN(libc)
> +# define MEMCHR __memchr_generic
> +# undef libc_hidden_builtin_def
> +# define libc_hidden_builtin_def(x)
> +#endif
> +#include <string/memchr.c>
> diff --git a/sysdeps/riscv/multiarch/memchr-zbb.c b/sysdeps/riscv/multiarch/memchr-zbb.c
> new file mode 100644
> index 0000000000..bead0335ae
> --- /dev/null
> +++ b/sysdeps/riscv/multiarch/memchr-zbb.c
> @@ -0,0 +1,30 @@
> +/* Re-include the default memchr implementation for Zbb.
> +   Copyright (C) 2024 Free Software Foundation, Inc.
> +   This file is part of the GNU C Library.
> +
> +   The GNU C Library is free software; you can redistribute it and/or
> +   modify it under the terms of the GNU Lesser General Public
> +   License as published by the Free Software Foundation; either
> +   version 2.1 of the License, or (at your option) any later version.
> +
> +   The GNU C Library is distributed in the hope that it will be useful,
> +   but WITHOUT ANY WARRANTY; without even the implied warranty of
> +   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
> +   Lesser General Public License for more details.
> +
> +   You should have received a copy of the GNU Lesser General Public
> +   License along with the GNU C Library; if not, see
> +   <https://www.gnu.org/licenses/>.  */
> +
> +#include <string.h>
> +
> +#if IS_IN(libc)
> +# define MEMCHR __memchr_zbb
> +# undef libc_hidden_builtin_def
> +# define libc_hidden_builtin_def(x)
> +#endif
> +/* Convince preprocessor to have Zbb instructions.  */
> +#ifndef __riscv_zbb
> +# define __riscv_zbb
> +#endif

Is there a way to specific the compiler to enable a extension, like aarch64
-march=arch{+[no]feature}? I think ideally this should be enabled as CFLAGS
instead of messing with compiler defined pre-processor.

> +#include <string/memchr.c>
> diff --git a/sysdeps/unix/sysv/linux/riscv/multiarch/Makefile b/sysdeps/unix/sysv/linux/riscv/multiarch/Makefile
> index fcef5659d4..5586d11c89 100644
> --- a/sysdeps/unix/sysv/linux/riscv/multiarch/Makefile
> +++ b/sysdeps/unix/sysv/linux/riscv/multiarch/Makefile
> @@ -1,5 +1,8 @@
>  ifeq ($(subdir),string)
>  sysdep_routines += \
> +  memchr \
> +  memchr-generic \
> +  memchr-zbb \
>    memcpy \
>    memcpy-generic \
>    memcpy_noalignment \
> diff --git a/sysdeps/unix/sysv/linux/riscv/multiarch/ifunc-impl-list.c b/sysdeps/unix/sysv/linux/riscv/multiarch/ifunc-impl-list.c
> index 9f806d7a9e..7321144a32 100644
> --- a/sysdeps/unix/sysv/linux/riscv/multiarch/ifunc-impl-list.c
> +++ b/sysdeps/unix/sysv/linux/riscv/multiarch/ifunc-impl-list.c
> @@ -20,19 +20,40 @@
>  #include <string.h>
>  #include <sys/hwprobe.h>
>  
> +#define ARRAY_SIZE(A) (sizeof (A) / sizeof ((A)[0]))
> +
>  size_t
>  __libc_ifunc_impl_list (const char *name, struct libc_ifunc_impl *array,
>  			size_t max)
>  {
>    size_t i = max;
> +  struct riscv_hwprobe pairs[] = {
> +    { .key = RISCV_HWPROBE_KEY_IMA_EXT_0 },
> +    { .key = RISCV_HWPROBE_KEY_CPUPERF_0 },
> +  };
>  
> +  bool has_zbb = false;
>    bool fast_unaligned = false;
>  
> -  struct riscv_hwprobe pair = { .key = RISCV_HWPROBE_KEY_CPUPERF_0 };
> -  if (__riscv_hwprobe (&pair, 1, 0, NULL, 0) == 0
> -      && (pair.value & RISCV_HWPROBE_MISALIGNED_MASK)
> -          == RISCV_HWPROBE_MISALIGNED_FAST)
> -    fast_unaligned = true;
> +  if (__riscv_hwprobe (pairs, ARRAY_SIZE (pairs), 0, NULL, 0) == 0)
> +    {
> +      struct riscv_hwprobe *pair;
> +
> +      /* RISCV_HWPROBE_KEY_IMA_EXT_0  */
> +      pair = &pairs[0];
> +      if (pair->value & RISCV_HWPROBE_EXT_ZBB)
> +        has_zbb = true;
> +
> +      /* RISCV_HWPROBE_KEY_CPUPERF_0  */
> +      pair = &pairs[1];
> +      if ((pair->value & RISCV_HWPROBE_MISALIGNED_MASK)
> +	   == RISCV_HWPROBE_MISALIGNED_FAST)
> +        fast_unaligned = true;
> +    }
> +
> +  IFUNC_IMPL (i, name, memchr,
> +	      IFUNC_IMPL_ADD (array, i, memchr, has_zbb, __memchr_zbb)
> +	      IFUNC_IMPL_ADD (array, i, memchr, 1, __memchr_generic))
>  
>    IFUNC_IMPL (i, name, memcpy,
>  	      IFUNC_IMPL_ADD (array, i, memcpy, fast_unaligned,
> diff --git a/sysdeps/unix/sysv/linux/riscv/multiarch/memchr.c b/sysdeps/unix/sysv/linux/riscv/multiarch/memchr.c
> new file mode 100644
> index 0000000000..bc076cbf24
> --- /dev/null
> +++ b/sysdeps/unix/sysv/linux/riscv/multiarch/memchr.c
> @@ -0,0 +1,57 @@
> +/* Multiple versions of memchr.
> +   All versions must be listed in ifunc-impl-list.c.
> +   Copyright (C) 2017-2024 Free Software Foundation, Inc.
> +   This file is part of the GNU C Library.
> +
> +   The GNU C Library is free software; you can redistribute it and/or
> +   modify it under the terms of the GNU Lesser General Public
> +   License as published by the Free Software Foundation; either
> +   version 2.1 of the License, or (at your option) any later version.
> +
> +   The GNU C Library is distributed in the hope that it will be useful,
> +   but WITHOUT ANY WARRANTY; without even the implied warranty of
> +   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
> +   Lesser General Public License for more details.
> +
> +   You should have received a copy of the GNU Lesser General Public
> +   License along with the GNU C Library; if not, see
> +   <https://www.gnu.org/licenses/>.  */
> +
> +#if IS_IN (libc)
> +/* Redefine memchr so that the compiler won't complain about the type
> +   mismatch with the IFUNC selector in strong_alias, below.  */
> +# undef memchr
> +# define memchr __redirect_memchr
> +# include <stdint.h>
> +# include <string.h>
> +# include <ifunc-init.h>
> +# include <riscv-ifunc.h>
> +# include <sys/hwprobe.h>
> +
> +extern __typeof (__redirect_memchr) __libc_memchr;
> +
> +extern __typeof (__redirect_memchr) __memchr_generic attribute_hidden;
> +extern __typeof (__redirect_memchr) __memchr_zbb attribute_hidden;
> +
> +static inline __typeof (__redirect_memchr) *
> +select_memchr_ifunc (uint64_t dl_hwcap, __riscv_hwprobe_t hwprobe_func)
> +{
> +  unsigned long long int v;
> +  if (__riscv_hwprobe_one (hwprobe_func, RISCV_HWPROBE_KEY_IMA_EXT_0, &v) == 0
> +      && (v & RISCV_HWPROBE_EXT_ZBB))
> +    return __memchr_zbb;
> +
> +  return __memchr_generic;
> +}
> +
> +riscv_libc_ifunc (__libc_memchr, select_memchr_ifunc);
> +
> +# undef memchr
> +strong_alias (__libc_memchr, memchr);
> +# ifdef SHARED
> +__hidden_ver1 (memchr, __GI_memchr, __redirect_memchr)
> +  __attribute__ ((visibility ("hidden"))) __attribute_copy__ (memchr);
> +# endif
> +#else
> +# include <string/memchr.c>
> +#endif

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH 2/7] RISC-V: Add Zbb optimized memchr as ifunc
  2024-04-24 12:53   ` Adhemerval Zanella Netto
@ 2024-04-24 13:16     ` Christoph Müllner
  2024-04-24 13:36       ` Adhemerval Zanella Netto
  0 siblings, 1 reply; 20+ messages in thread
From: Christoph Müllner @ 2024-04-24 13:16 UTC (permalink / raw)
  To: Adhemerval Zanella Netto
  Cc: libc-alpha, Palmer Dabbelt, Darius Rad, Andrew Waterman,
	Philipp Tomsich, Evan Green, DJ Delorie, Vineet Gupta,
	Kito Cheng, Jeff Law

On Wed, Apr 24, 2024 at 2:53 PM Adhemerval Zanella Netto
<adhemerval.zanella@linaro.org> wrote:
>
>
>
> On 22/04/24 04:43, Christoph Müllner wrote:
> > When building with Zbb enabled, memchr benefits from using orc.b in
> > find_zero_all().  This patch changes the build system such, that a
> > non-Zbb version as well as a Zbb version of this routine is built.
> > Further, a ifunc resolver is provided that selects the right routine
> > based on the outcome of extension probing via hwprobe().
> >
> > Signed-off-by: Christoph Müllner <christoph.muellner@vrull.eu>
> > ---
> >  sysdeps/riscv/multiarch/memchr-generic.c      | 26 +++++++++
> >  sysdeps/riscv/multiarch/memchr-zbb.c          | 30 ++++++++++
> >  .../unix/sysv/linux/riscv/multiarch/Makefile  |  3 +
> >  .../linux/riscv/multiarch/ifunc-impl-list.c   | 31 ++++++++--
> >  .../unix/sysv/linux/riscv/multiarch/memchr.c  | 57 +++++++++++++++++++
> >  5 files changed, 142 insertions(+), 5 deletions(-)
> >  create mode 100644 sysdeps/riscv/multiarch/memchr-generic.c
> >  create mode 100644 sysdeps/riscv/multiarch/memchr-zbb.c
> >  create mode 100644 sysdeps/unix/sysv/linux/riscv/multiarch/memchr.c
> >
> > diff --git a/sysdeps/riscv/multiarch/memchr-generic.c b/sysdeps/riscv/multiarch/memchr-generic.c
> > new file mode 100644
> > index 0000000000..a96c36398b
> > --- /dev/null
> > +++ b/sysdeps/riscv/multiarch/memchr-generic.c
> > @@ -0,0 +1,26 @@
> > +/* Re-include the default memchr implementation.
> > +   Copyright (C) 2024 Free Software Foundation, Inc.
> > +   This file is part of the GNU C Library.
> > +
> > +   The GNU C Library is free software; you can redistribute it and/or
> > +   modify it under the terms of the GNU Lesser General Public
> > +   License as published by the Free Software Foundation; either
> > +   version 2.1 of the License, or (at your option) any later version.
> > +
> > +   The GNU C Library is distributed in the hope that it will be useful,
> > +   but WITHOUT ANY WARRANTY; without even the implied warranty of
> > +   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
> > +   Lesser General Public License for more details.
> > +
> > +   You should have received a copy of the GNU Lesser General Public
> > +   License along with the GNU C Library; if not, see
> > +   <https://www.gnu.org/licenses/>.  */
> > +
> > +#include <string.h>
> > +
> > +#if IS_IN(libc)
> > +# define MEMCHR __memchr_generic
> > +# undef libc_hidden_builtin_def
> > +# define libc_hidden_builtin_def(x)
> > +#endif
> > +#include <string/memchr.c>
> > diff --git a/sysdeps/riscv/multiarch/memchr-zbb.c b/sysdeps/riscv/multiarch/memchr-zbb.c
> > new file mode 100644
> > index 0000000000..bead0335ae
> > --- /dev/null
> > +++ b/sysdeps/riscv/multiarch/memchr-zbb.c
> > @@ -0,0 +1,30 @@
> > +/* Re-include the default memchr implementation for Zbb.
> > +   Copyright (C) 2024 Free Software Foundation, Inc.
> > +   This file is part of the GNU C Library.
> > +
> > +   The GNU C Library is free software; you can redistribute it and/or
> > +   modify it under the terms of the GNU Lesser General Public
> > +   License as published by the Free Software Foundation; either
> > +   version 2.1 of the License, or (at your option) any later version.
> > +
> > +   The GNU C Library is distributed in the hope that it will be useful,
> > +   but WITHOUT ANY WARRANTY; without even the implied warranty of
> > +   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
> > +   Lesser General Public License for more details.
> > +
> > +   You should have received a copy of the GNU Lesser General Public
> > +   License along with the GNU C Library; if not, see
> > +   <https://www.gnu.org/licenses/>.  */
> > +
> > +#include <string.h>
> > +
> > +#if IS_IN(libc)
> > +# define MEMCHR __memchr_zbb
> > +# undef libc_hidden_builtin_def
> > +# define libc_hidden_builtin_def(x)
> > +#endif
> > +/* Convince preprocessor to have Zbb instructions.  */
> > +#ifndef __riscv_zbb
> > +# define __riscv_zbb
> > +#endif
>
> Is there a way to specific the compiler to enable a extension, like aarch64
> -march=arch{+[no]feature}? I think ideally this should be enabled as CFLAGS
> instead of messing with compiler defined pre-processor.

The tools expect a list of all extensions as parameter to the -march= option.
But there is no way to append extensions to an existing march string
on the command line.

And if we would add this feature today, it would take many years until we could
use it here, because we want to remain compatible with old tools.
Or we enable the optimization only when being built with new tools, but that
adds even more complexity and build/test configurations.

What we have is:
* Preprocessor (since forever): Extension test macros (__riscv_EXTENSION)
* Command line (since forever): -march=BASE_EXTENSIONLIST
* GAS (since Nov 21): .option arch, +EXTENSION (in combination with
option push/pop)
* GCC (since Nov 23): __attribute__((target("arch=+EXTENSION")))

I was not sure about using __riscv_zbb as well, but I considered it safe within
ifdef tests that ensure the macro won't be set twice.
If that's a concern, I could change to use something like this:
#define __riscv_force_zbb
#include <impl.c>
#undef __riscv_force_zbb
... and change string-fza.h like this:
#if defined(__riscv_zbb) || defined(__riscv_force_zbb)
// orc.b
#endif

BR
Christoph

> > +#include <string/memchr.c>
> > diff --git a/sysdeps/unix/sysv/linux/riscv/multiarch/Makefile b/sysdeps/unix/sysv/linux/riscv/multiarch/Makefile
> > index fcef5659d4..5586d11c89 100644
> > --- a/sysdeps/unix/sysv/linux/riscv/multiarch/Makefile
> > +++ b/sysdeps/unix/sysv/linux/riscv/multiarch/Makefile
> > @@ -1,5 +1,8 @@
> >  ifeq ($(subdir),string)
> >  sysdep_routines += \
> > +  memchr \
> > +  memchr-generic \
> > +  memchr-zbb \
> >    memcpy \
> >    memcpy-generic \
> >    memcpy_noalignment \
> > diff --git a/sysdeps/unix/sysv/linux/riscv/multiarch/ifunc-impl-list.c b/sysdeps/unix/sysv/linux/riscv/multiarch/ifunc-impl-list.c
> > index 9f806d7a9e..7321144a32 100644
> > --- a/sysdeps/unix/sysv/linux/riscv/multiarch/ifunc-impl-list.c
> > +++ b/sysdeps/unix/sysv/linux/riscv/multiarch/ifunc-impl-list.c
> > @@ -20,19 +20,40 @@
> >  #include <string.h>
> >  #include <sys/hwprobe.h>
> >
> > +#define ARRAY_SIZE(A) (sizeof (A) / sizeof ((A)[0]))
> > +
> >  size_t
> >  __libc_ifunc_impl_list (const char *name, struct libc_ifunc_impl *array,
> >                       size_t max)
> >  {
> >    size_t i = max;
> > +  struct riscv_hwprobe pairs[] = {
> > +    { .key = RISCV_HWPROBE_KEY_IMA_EXT_0 },
> > +    { .key = RISCV_HWPROBE_KEY_CPUPERF_0 },
> > +  };
> >
> > +  bool has_zbb = false;
> >    bool fast_unaligned = false;
> >
> > -  struct riscv_hwprobe pair = { .key = RISCV_HWPROBE_KEY_CPUPERF_0 };
> > -  if (__riscv_hwprobe (&pair, 1, 0, NULL, 0) == 0
> > -      && (pair.value & RISCV_HWPROBE_MISALIGNED_MASK)
> > -          == RISCV_HWPROBE_MISALIGNED_FAST)
> > -    fast_unaligned = true;
> > +  if (__riscv_hwprobe (pairs, ARRAY_SIZE (pairs), 0, NULL, 0) == 0)
> > +    {
> > +      struct riscv_hwprobe *pair;
> > +
> > +      /* RISCV_HWPROBE_KEY_IMA_EXT_0  */
> > +      pair = &pairs[0];
> > +      if (pair->value & RISCV_HWPROBE_EXT_ZBB)
> > +        has_zbb = true;
> > +
> > +      /* RISCV_HWPROBE_KEY_CPUPERF_0  */
> > +      pair = &pairs[1];
> > +      if ((pair->value & RISCV_HWPROBE_MISALIGNED_MASK)
> > +        == RISCV_HWPROBE_MISALIGNED_FAST)
> > +        fast_unaligned = true;
> > +    }
> > +
> > +  IFUNC_IMPL (i, name, memchr,
> > +           IFUNC_IMPL_ADD (array, i, memchr, has_zbb, __memchr_zbb)
> > +           IFUNC_IMPL_ADD (array, i, memchr, 1, __memchr_generic))
> >
> >    IFUNC_IMPL (i, name, memcpy,
> >             IFUNC_IMPL_ADD (array, i, memcpy, fast_unaligned,
> > diff --git a/sysdeps/unix/sysv/linux/riscv/multiarch/memchr.c b/sysdeps/unix/sysv/linux/riscv/multiarch/memchr.c
> > new file mode 100644
> > index 0000000000..bc076cbf24
> > --- /dev/null
> > +++ b/sysdeps/unix/sysv/linux/riscv/multiarch/memchr.c
> > @@ -0,0 +1,57 @@
> > +/* Multiple versions of memchr.
> > +   All versions must be listed in ifunc-impl-list.c.
> > +   Copyright (C) 2017-2024 Free Software Foundation, Inc.
> > +   This file is part of the GNU C Library.
> > +
> > +   The GNU C Library is free software; you can redistribute it and/or
> > +   modify it under the terms of the GNU Lesser General Public
> > +   License as published by the Free Software Foundation; either
> > +   version 2.1 of the License, or (at your option) any later version.
> > +
> > +   The GNU C Library is distributed in the hope that it will be useful,
> > +   but WITHOUT ANY WARRANTY; without even the implied warranty of
> > +   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
> > +   Lesser General Public License for more details.
> > +
> > +   You should have received a copy of the GNU Lesser General Public
> > +   License along with the GNU C Library; if not, see
> > +   <https://www.gnu.org/licenses/>.  */
> > +
> > +#if IS_IN (libc)
> > +/* Redefine memchr so that the compiler won't complain about the type
> > +   mismatch with the IFUNC selector in strong_alias, below.  */
> > +# undef memchr
> > +# define memchr __redirect_memchr
> > +# include <stdint.h>
> > +# include <string.h>
> > +# include <ifunc-init.h>
> > +# include <riscv-ifunc.h>
> > +# include <sys/hwprobe.h>
> > +
> > +extern __typeof (__redirect_memchr) __libc_memchr;
> > +
> > +extern __typeof (__redirect_memchr) __memchr_generic attribute_hidden;
> > +extern __typeof (__redirect_memchr) __memchr_zbb attribute_hidden;
> > +
> > +static inline __typeof (__redirect_memchr) *
> > +select_memchr_ifunc (uint64_t dl_hwcap, __riscv_hwprobe_t hwprobe_func)
> > +{
> > +  unsigned long long int v;
> > +  if (__riscv_hwprobe_one (hwprobe_func, RISCV_HWPROBE_KEY_IMA_EXT_0, &v) == 0
> > +      && (v & RISCV_HWPROBE_EXT_ZBB))
> > +    return __memchr_zbb;
> > +
> > +  return __memchr_generic;
> > +}
> > +
> > +riscv_libc_ifunc (__libc_memchr, select_memchr_ifunc);
> > +
> > +# undef memchr
> > +strong_alias (__libc_memchr, memchr);
> > +# ifdef SHARED
> > +__hidden_ver1 (memchr, __GI_memchr, __redirect_memchr)
> > +  __attribute__ ((visibility ("hidden"))) __attribute_copy__ (memchr);
> > +# endif
> > +#else
> > +# include <string/memchr.c>
> > +#endif

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH 2/7] RISC-V: Add Zbb optimized memchr as ifunc
  2024-04-24 13:16     ` Christoph Müllner
@ 2024-04-24 13:36       ` Adhemerval Zanella Netto
  2024-04-26 11:40         ` Christoph Müllner
  2024-04-30 15:13         ` Palmer Dabbelt
  0 siblings, 2 replies; 20+ messages in thread
From: Adhemerval Zanella Netto @ 2024-04-24 13:36 UTC (permalink / raw)
  To: Christoph Müllner
  Cc: libc-alpha, Palmer Dabbelt, Darius Rad, Andrew Waterman,
	Philipp Tomsich, Evan Green, DJ Delorie, Vineet Gupta,
	Kito Cheng, Jeff Law



On 24/04/24 10:16, Christoph Müllner wrote:
> On Wed, Apr 24, 2024 at 2:53 PM Adhemerval Zanella Netto
> <adhemerval.zanella@linaro.org> wrote:
>>
>>
>>
>> On 22/04/24 04:43, Christoph Müllner wrote:
>>> When building with Zbb enabled, memchr benefits from using orc.b in
>>> find_zero_all().  This patch changes the build system such, that a
>>> non-Zbb version as well as a Zbb version of this routine is built.
>>> Further, a ifunc resolver is provided that selects the right routine
>>> based on the outcome of extension probing via hwprobe().
>>>
>>> Signed-off-by: Christoph Müllner <christoph.muellner@vrull.eu>
>>> ---
>>>  sysdeps/riscv/multiarch/memchr-generic.c      | 26 +++++++++
>>>  sysdeps/riscv/multiarch/memchr-zbb.c          | 30 ++++++++++
>>>  .../unix/sysv/linux/riscv/multiarch/Makefile  |  3 +
>>>  .../linux/riscv/multiarch/ifunc-impl-list.c   | 31 ++++++++--
>>>  .../unix/sysv/linux/riscv/multiarch/memchr.c  | 57 +++++++++++++++++++
>>>  5 files changed, 142 insertions(+), 5 deletions(-)
>>>  create mode 100644 sysdeps/riscv/multiarch/memchr-generic.c
>>>  create mode 100644 sysdeps/riscv/multiarch/memchr-zbb.c
>>>  create mode 100644 sysdeps/unix/sysv/linux/riscv/multiarch/memchr.c
>>>
>>> diff --git a/sysdeps/riscv/multiarch/memchr-generic.c b/sysdeps/riscv/multiarch/memchr-generic.c
>>> new file mode 100644
>>> index 0000000000..a96c36398b
>>> --- /dev/null
>>> +++ b/sysdeps/riscv/multiarch/memchr-generic.c
>>> @@ -0,0 +1,26 @@
>>> +/* Re-include the default memchr implementation.
>>> +   Copyright (C) 2024 Free Software Foundation, Inc.
>>> +   This file is part of the GNU C Library.
>>> +
>>> +   The GNU C Library is free software; you can redistribute it and/or
>>> +   modify it under the terms of the GNU Lesser General Public
>>> +   License as published by the Free Software Foundation; either
>>> +   version 2.1 of the License, or (at your option) any later version.
>>> +
>>> +   The GNU C Library is distributed in the hope that it will be useful,
>>> +   but WITHOUT ANY WARRANTY; without even the implied warranty of
>>> +   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
>>> +   Lesser General Public License for more details.
>>> +
>>> +   You should have received a copy of the GNU Lesser General Public
>>> +   License along with the GNU C Library; if not, see
>>> +   <https://www.gnu.org/licenses/>.  */
>>> +
>>> +#include <string.h>
>>> +
>>> +#if IS_IN(libc)
>>> +# define MEMCHR __memchr_generic
>>> +# undef libc_hidden_builtin_def
>>> +# define libc_hidden_builtin_def(x)
>>> +#endif
>>> +#include <string/memchr.c>
>>> diff --git a/sysdeps/riscv/multiarch/memchr-zbb.c b/sysdeps/riscv/multiarch/memchr-zbb.c
>>> new file mode 100644
>>> index 0000000000..bead0335ae
>>> --- /dev/null
>>> +++ b/sysdeps/riscv/multiarch/memchr-zbb.c
>>> @@ -0,0 +1,30 @@
>>> +/* Re-include the default memchr implementation for Zbb.
>>> +   Copyright (C) 2024 Free Software Foundation, Inc.
>>> +   This file is part of the GNU C Library.
>>> +
>>> +   The GNU C Library is free software; you can redistribute it and/or
>>> +   modify it under the terms of the GNU Lesser General Public
>>> +   License as published by the Free Software Foundation; either
>>> +   version 2.1 of the License, or (at your option) any later version.
>>> +
>>> +   The GNU C Library is distributed in the hope that it will be useful,
>>> +   but WITHOUT ANY WARRANTY; without even the implied warranty of
>>> +   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
>>> +   Lesser General Public License for more details.
>>> +
>>> +   You should have received a copy of the GNU Lesser General Public
>>> +   License along with the GNU C Library; if not, see
>>> +   <https://www.gnu.org/licenses/>.  */
>>> +
>>> +#include <string.h>
>>> +
>>> +#if IS_IN(libc)
>>> +# define MEMCHR __memchr_zbb
>>> +# undef libc_hidden_builtin_def
>>> +# define libc_hidden_builtin_def(x)
>>> +#endif
>>> +/* Convince preprocessor to have Zbb instructions.  */
>>> +#ifndef __riscv_zbb
>>> +# define __riscv_zbb
>>> +#endif
>>
>> Is there a way to specific the compiler to enable a extension, like aarch64
>> -march=arch{+[no]feature}? I think ideally this should be enabled as CFLAGS
>> instead of messing with compiler defined pre-processor.
> 
> The tools expect a list of all extensions as parameter to the -march= option.
> But there is no way to append extensions to an existing march string
> on the command line.
> 
> And if we would add this feature today, it would take many years until we could
> use it here, because we want to remain compatible with old tools.
> Or we enable the optimization only when being built with new tools, but that
> adds even more complexity and build/test configurations.
> 
> What we have is:
> * Preprocessor (since forever): Extension test macros (__riscv_EXTENSION)
> * Command line (since forever): -march=BASE_EXTENSIONLIST
> * GAS (since Nov 21): .option arch, +EXTENSION (in combination with
> option push/pop)
> * GCC (since Nov 23): __attribute__((target("arch=+EXTENSION")))
> 
> I was not sure about using __riscv_zbb as well, but I considered it safe within
> ifdef tests that ensure the macro won't be set twice.
> If that's a concern, I could change to use something like this:
> #define __riscv_force_zbb
> #include <impl.c>
> #undef __riscv_force_zbb
> ... and change string-fza.h like this:
> #if defined(__riscv_zbb) || defined(__riscv_force_zbb)
> // orc.b
> #endif
> 
> BR
> Christoph

Another options would to parse the current march and add the extension if required,
something like:

abi=$(riscv64-linux-gnu-gcc -Q --help=target | grep march | cut -d '=' -f2 | xargs)
if [[ ! "$abi" =~ "_zbb" ]]
then
  abi="$abi"_zbb
fi

I don't have a strong preference, it is just that by not using the compiler flag
we won't be able to either use the builtin (__builtin_riscv_orc_b_32) and/or get
a possible better code generation from compiler.

> 
>>> +#include <string/memchr.c>
>>> diff --git a/sysdeps/unix/sysv/linux/riscv/multiarch/Makefile b/sysdeps/unix/sysv/linux/riscv/multiarch/Makefile
>>> index fcef5659d4..5586d11c89 100644
>>> --- a/sysdeps/unix/sysv/linux/riscv/multiarch/Makefile
>>> +++ b/sysdeps/unix/sysv/linux/riscv/multiarch/Makefile
>>> @@ -1,5 +1,8 @@
>>>  ifeq ($(subdir),string)
>>>  sysdep_routines += \
>>> +  memchr \
>>> +  memchr-generic \
>>> +  memchr-zbb \
>>>    memcpy \
>>>    memcpy-generic \
>>>    memcpy_noalignment \
>>> diff --git a/sysdeps/unix/sysv/linux/riscv/multiarch/ifunc-impl-list.c b/sysdeps/unix/sysv/linux/riscv/multiarch/ifunc-impl-list.c
>>> index 9f806d7a9e..7321144a32 100644
>>> --- a/sysdeps/unix/sysv/linux/riscv/multiarch/ifunc-impl-list.c
>>> +++ b/sysdeps/unix/sysv/linux/riscv/multiarch/ifunc-impl-list.c
>>> @@ -20,19 +20,40 @@
>>>  #include <string.h>
>>>  #include <sys/hwprobe.h>
>>>
>>> +#define ARRAY_SIZE(A) (sizeof (A) / sizeof ((A)[0]))
>>> +
>>>  size_t
>>>  __libc_ifunc_impl_list (const char *name, struct libc_ifunc_impl *array,
>>>                       size_t max)
>>>  {
>>>    size_t i = max;
>>> +  struct riscv_hwprobe pairs[] = {
>>> +    { .key = RISCV_HWPROBE_KEY_IMA_EXT_0 },
>>> +    { .key = RISCV_HWPROBE_KEY_CPUPERF_0 },
>>> +  };
>>>
>>> +  bool has_zbb = false;
>>>    bool fast_unaligned = false;
>>>
>>> -  struct riscv_hwprobe pair = { .key = RISCV_HWPROBE_KEY_CPUPERF_0 };
>>> -  if (__riscv_hwprobe (&pair, 1, 0, NULL, 0) == 0
>>> -      && (pair.value & RISCV_HWPROBE_MISALIGNED_MASK)
>>> -          == RISCV_HWPROBE_MISALIGNED_FAST)
>>> -    fast_unaligned = true;
>>> +  if (__riscv_hwprobe (pairs, ARRAY_SIZE (pairs), 0, NULL, 0) == 0)
>>> +    {
>>> +      struct riscv_hwprobe *pair;
>>> +
>>> +      /* RISCV_HWPROBE_KEY_IMA_EXT_0  */
>>> +      pair = &pairs[0];
>>> +      if (pair->value & RISCV_HWPROBE_EXT_ZBB)
>>> +        has_zbb = true;
>>> +
>>> +      /* RISCV_HWPROBE_KEY_CPUPERF_0  */
>>> +      pair = &pairs[1];
>>> +      if ((pair->value & RISCV_HWPROBE_MISALIGNED_MASK)
>>> +        == RISCV_HWPROBE_MISALIGNED_FAST)
>>> +        fast_unaligned = true;
>>> +    }
>>> +
>>> +  IFUNC_IMPL (i, name, memchr,
>>> +           IFUNC_IMPL_ADD (array, i, memchr, has_zbb, __memchr_zbb)
>>> +           IFUNC_IMPL_ADD (array, i, memchr, 1, __memchr_generic))
>>>
>>>    IFUNC_IMPL (i, name, memcpy,
>>>             IFUNC_IMPL_ADD (array, i, memcpy, fast_unaligned,
>>> diff --git a/sysdeps/unix/sysv/linux/riscv/multiarch/memchr.c b/sysdeps/unix/sysv/linux/riscv/multiarch/memchr.c
>>> new file mode 100644
>>> index 0000000000..bc076cbf24
>>> --- /dev/null
>>> +++ b/sysdeps/unix/sysv/linux/riscv/multiarch/memchr.c
>>> @@ -0,0 +1,57 @@
>>> +/* Multiple versions of memchr.
>>> +   All versions must be listed in ifunc-impl-list.c.
>>> +   Copyright (C) 2017-2024 Free Software Foundation, Inc.
>>> +   This file is part of the GNU C Library.
>>> +
>>> +   The GNU C Library is free software; you can redistribute it and/or
>>> +   modify it under the terms of the GNU Lesser General Public
>>> +   License as published by the Free Software Foundation; either
>>> +   version 2.1 of the License, or (at your option) any later version.
>>> +
>>> +   The GNU C Library is distributed in the hope that it will be useful,
>>> +   but WITHOUT ANY WARRANTY; without even the implied warranty of
>>> +   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
>>> +   Lesser General Public License for more details.
>>> +
>>> +   You should have received a copy of the GNU Lesser General Public
>>> +   License along with the GNU C Library; if not, see
>>> +   <https://www.gnu.org/licenses/>.  */
>>> +
>>> +#if IS_IN (libc)
>>> +/* Redefine memchr so that the compiler won't complain about the type
>>> +   mismatch with the IFUNC selector in strong_alias, below.  */
>>> +# undef memchr
>>> +# define memchr __redirect_memchr
>>> +# include <stdint.h>
>>> +# include <string.h>
>>> +# include <ifunc-init.h>
>>> +# include <riscv-ifunc.h>
>>> +# include <sys/hwprobe.h>
>>> +
>>> +extern __typeof (__redirect_memchr) __libc_memchr;
>>> +
>>> +extern __typeof (__redirect_memchr) __memchr_generic attribute_hidden;
>>> +extern __typeof (__redirect_memchr) __memchr_zbb attribute_hidden;
>>> +
>>> +static inline __typeof (__redirect_memchr) *
>>> +select_memchr_ifunc (uint64_t dl_hwcap, __riscv_hwprobe_t hwprobe_func)
>>> +{
>>> +  unsigned long long int v;
>>> +  if (__riscv_hwprobe_one (hwprobe_func, RISCV_HWPROBE_KEY_IMA_EXT_0, &v) == 0
>>> +      && (v & RISCV_HWPROBE_EXT_ZBB))
>>> +    return __memchr_zbb;
>>> +
>>> +  return __memchr_generic;
>>> +}
>>> +
>>> +riscv_libc_ifunc (__libc_memchr, select_memchr_ifunc);
>>> +
>>> +# undef memchr
>>> +strong_alias (__libc_memchr, memchr);
>>> +# ifdef SHARED
>>> +__hidden_ver1 (memchr, __GI_memchr, __redirect_memchr)
>>> +  __attribute__ ((visibility ("hidden"))) __attribute_copy__ (memchr);
>>> +# endif
>>> +#else
>>> +# include <string/memchr.c>
>>> +#endif

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH 2/7] RISC-V: Add Zbb optimized memchr as ifunc
  2024-04-24 13:36       ` Adhemerval Zanella Netto
@ 2024-04-26 11:40         ` Christoph Müllner
  2024-04-30 15:13         ` Palmer Dabbelt
  1 sibling, 0 replies; 20+ messages in thread
From: Christoph Müllner @ 2024-04-26 11:40 UTC (permalink / raw)
  To: Adhemerval Zanella Netto
  Cc: libc-alpha, Palmer Dabbelt, Darius Rad, Andrew Waterman,
	Philipp Tomsich, Evan Green, DJ Delorie, Vineet Gupta,
	Kito Cheng, Jeff Law

On Wed, Apr 24, 2024 at 3:36 PM Adhemerval Zanella Netto
<adhemerval.zanella@linaro.org> wrote:
>
>
>
> On 24/04/24 10:16, Christoph Müllner wrote:
> > On Wed, Apr 24, 2024 at 2:53 PM Adhemerval Zanella Netto
> > <adhemerval.zanella@linaro.org> wrote:
> >>
> >>
> >>
> >> On 22/04/24 04:43, Christoph Müllner wrote:
> >>> When building with Zbb enabled, memchr benefits from using orc.b in
> >>> find_zero_all().  This patch changes the build system such, that a
> >>> non-Zbb version as well as a Zbb version of this routine is built.
> >>> Further, a ifunc resolver is provided that selects the right routine
> >>> based on the outcome of extension probing via hwprobe().
> >>>
> >>> Signed-off-by: Christoph Müllner <christoph.muellner@vrull.eu>
> >>> ---
> >>>  sysdeps/riscv/multiarch/memchr-generic.c      | 26 +++++++++
> >>>  sysdeps/riscv/multiarch/memchr-zbb.c          | 30 ++++++++++
> >>>  .../unix/sysv/linux/riscv/multiarch/Makefile  |  3 +
> >>>  .../linux/riscv/multiarch/ifunc-impl-list.c   | 31 ++++++++--
> >>>  .../unix/sysv/linux/riscv/multiarch/memchr.c  | 57 +++++++++++++++++++
> >>>  5 files changed, 142 insertions(+), 5 deletions(-)
> >>>  create mode 100644 sysdeps/riscv/multiarch/memchr-generic.c
> >>>  create mode 100644 sysdeps/riscv/multiarch/memchr-zbb.c
> >>>  create mode 100644 sysdeps/unix/sysv/linux/riscv/multiarch/memchr.c
> >>>
> >>> diff --git a/sysdeps/riscv/multiarch/memchr-generic.c b/sysdeps/riscv/multiarch/memchr-generic.c
> >>> new file mode 100644
> >>> index 0000000000..a96c36398b
> >>> --- /dev/null
> >>> +++ b/sysdeps/riscv/multiarch/memchr-generic.c
> >>> @@ -0,0 +1,26 @@
> >>> +/* Re-include the default memchr implementation.
> >>> +   Copyright (C) 2024 Free Software Foundation, Inc.
> >>> +   This file is part of the GNU C Library.
> >>> +
> >>> +   The GNU C Library is free software; you can redistribute it and/or
> >>> +   modify it under the terms of the GNU Lesser General Public
> >>> +   License as published by the Free Software Foundation; either
> >>> +   version 2.1 of the License, or (at your option) any later version.
> >>> +
> >>> +   The GNU C Library is distributed in the hope that it will be useful,
> >>> +   but WITHOUT ANY WARRANTY; without even the implied warranty of
> >>> +   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
> >>> +   Lesser General Public License for more details.
> >>> +
> >>> +   You should have received a copy of the GNU Lesser General Public
> >>> +   License along with the GNU C Library; if not, see
> >>> +   <https://www.gnu.org/licenses/>.  */
> >>> +
> >>> +#include <string.h>
> >>> +
> >>> +#if IS_IN(libc)
> >>> +# define MEMCHR __memchr_generic
> >>> +# undef libc_hidden_builtin_def
> >>> +# define libc_hidden_builtin_def(x)
> >>> +#endif
> >>> +#include <string/memchr.c>
> >>> diff --git a/sysdeps/riscv/multiarch/memchr-zbb.c b/sysdeps/riscv/multiarch/memchr-zbb.c
> >>> new file mode 100644
> >>> index 0000000000..bead0335ae
> >>> --- /dev/null
> >>> +++ b/sysdeps/riscv/multiarch/memchr-zbb.c
> >>> @@ -0,0 +1,30 @@
> >>> +/* Re-include the default memchr implementation for Zbb.
> >>> +   Copyright (C) 2024 Free Software Foundation, Inc.
> >>> +   This file is part of the GNU C Library.
> >>> +
> >>> +   The GNU C Library is free software; you can redistribute it and/or
> >>> +   modify it under the terms of the GNU Lesser General Public
> >>> +   License as published by the Free Software Foundation; either
> >>> +   version 2.1 of the License, or (at your option) any later version.
> >>> +
> >>> +   The GNU C Library is distributed in the hope that it will be useful,
> >>> +   but WITHOUT ANY WARRANTY; without even the implied warranty of
> >>> +   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
> >>> +   Lesser General Public License for more details.
> >>> +
> >>> +   You should have received a copy of the GNU Lesser General Public
> >>> +   License along with the GNU C Library; if not, see
> >>> +   <https://www.gnu.org/licenses/>.  */
> >>> +
> >>> +#include <string.h>
> >>> +
> >>> +#if IS_IN(libc)
> >>> +# define MEMCHR __memchr_zbb
> >>> +# undef libc_hidden_builtin_def
> >>> +# define libc_hidden_builtin_def(x)
> >>> +#endif
> >>> +/* Convince preprocessor to have Zbb instructions.  */
> >>> +#ifndef __riscv_zbb
> >>> +# define __riscv_zbb
> >>> +#endif
> >>
> >> Is there a way to specific the compiler to enable a extension, like aarch64
> >> -march=arch{+[no]feature}? I think ideally this should be enabled as CFLAGS
> >> instead of messing with compiler defined pre-processor.
> >
> > The tools expect a list of all extensions as parameter to the -march= option.
> > But there is no way to append extensions to an existing march string
> > on the command line.
> >
> > And if we would add this feature today, it would take many years until we could
> > use it here, because we want to remain compatible with old tools.
> > Or we enable the optimization only when being built with new tools, but that
> > adds even more complexity and build/test configurations.
> >
> > What we have is:
> > * Preprocessor (since forever): Extension test macros (__riscv_EXTENSION)
> > * Command line (since forever): -march=BASE_EXTENSIONLIST
> > * GAS (since Nov 21): .option arch, +EXTENSION (in combination with
> > option push/pop)
> > * GCC (since Nov 23): __attribute__((target("arch=+EXTENSION")))
> >
> > I was not sure about using __riscv_zbb as well, but I considered it safe within
> > ifdef tests that ensure the macro won't be set twice.
> > If that's a concern, I could change to use something like this:
> > #define __riscv_force_zbb
> > #include <impl.c>
> > #undef __riscv_force_zbb
> > ... and change string-fza.h like this:
> > #if defined(__riscv_zbb) || defined(__riscv_force_zbb)
> > // orc.b
> > #endif
> >
> > BR
> > Christoph
>
> Another options would to parse the current march and add the extension if required,
> something like:
>
> abi=$(riscv64-linux-gnu-gcc -Q --help=target | grep march | cut -d '=' -f2 | xargs)
> if [[ ! "$abi" =~ "_zbb" ]]
> then
>   abi="$abi"_zbb
> fi

I tried to work something out, but this attempt also won't work reliably.
Until recently (Jan 5, 2024) GCC required that the extensions on the command
line were provided in canonical order. I.e., "_zbb" would probably have to be
inserted somewhere in the middle. Sorting the tokens of the march string seems
to be an overkill.

> I don't have a strong preference, it is just that by not using the compiler flag
> we won't be able to either use the builtin (__builtin_riscv_orc_b_32) and/or get
> a possible better code generation from compiler.

I agree that builtins are better.
At the same time, the solution in the patch is no worse than what we
already have
in glibc right now. Further, this builtin was added to GCC on Jan 15, 2024.
So, it can only be used if the build tools are recent enough.

>
> >
> >>> +#include <string/memchr.c>
> >>> diff --git a/sysdeps/unix/sysv/linux/riscv/multiarch/Makefile b/sysdeps/unix/sysv/linux/riscv/multiarch/Makefile
> >>> index fcef5659d4..5586d11c89 100644
> >>> --- a/sysdeps/unix/sysv/linux/riscv/multiarch/Makefile
> >>> +++ b/sysdeps/unix/sysv/linux/riscv/multiarch/Makefile
> >>> @@ -1,5 +1,8 @@
> >>>  ifeq ($(subdir),string)
> >>>  sysdep_routines += \
> >>> +  memchr \
> >>> +  memchr-generic \
> >>> +  memchr-zbb \
> >>>    memcpy \
> >>>    memcpy-generic \
> >>>    memcpy_noalignment \
> >>> diff --git a/sysdeps/unix/sysv/linux/riscv/multiarch/ifunc-impl-list.c b/sysdeps/unix/sysv/linux/riscv/multiarch/ifunc-impl-list.c
> >>> index 9f806d7a9e..7321144a32 100644
> >>> --- a/sysdeps/unix/sysv/linux/riscv/multiarch/ifunc-impl-list.c
> >>> +++ b/sysdeps/unix/sysv/linux/riscv/multiarch/ifunc-impl-list.c
> >>> @@ -20,19 +20,40 @@
> >>>  #include <string.h>
> >>>  #include <sys/hwprobe.h>
> >>>
> >>> +#define ARRAY_SIZE(A) (sizeof (A) / sizeof ((A)[0]))
> >>> +
> >>>  size_t
> >>>  __libc_ifunc_impl_list (const char *name, struct libc_ifunc_impl *array,
> >>>                       size_t max)
> >>>  {
> >>>    size_t i = max;
> >>> +  struct riscv_hwprobe pairs[] = {
> >>> +    { .key = RISCV_HWPROBE_KEY_IMA_EXT_0 },
> >>> +    { .key = RISCV_HWPROBE_KEY_CPUPERF_0 },
> >>> +  };
> >>>
> >>> +  bool has_zbb = false;
> >>>    bool fast_unaligned = false;
> >>>
> >>> -  struct riscv_hwprobe pair = { .key = RISCV_HWPROBE_KEY_CPUPERF_0 };
> >>> -  if (__riscv_hwprobe (&pair, 1, 0, NULL, 0) == 0
> >>> -      && (pair.value & RISCV_HWPROBE_MISALIGNED_MASK)
> >>> -          == RISCV_HWPROBE_MISALIGNED_FAST)
> >>> -    fast_unaligned = true;
> >>> +  if (__riscv_hwprobe (pairs, ARRAY_SIZE (pairs), 0, NULL, 0) == 0)
> >>> +    {
> >>> +      struct riscv_hwprobe *pair;
> >>> +
> >>> +      /* RISCV_HWPROBE_KEY_IMA_EXT_0  */
> >>> +      pair = &pairs[0];
> >>> +      if (pair->value & RISCV_HWPROBE_EXT_ZBB)
> >>> +        has_zbb = true;
> >>> +
> >>> +      /* RISCV_HWPROBE_KEY_CPUPERF_0  */
> >>> +      pair = &pairs[1];
> >>> +      if ((pair->value & RISCV_HWPROBE_MISALIGNED_MASK)
> >>> +        == RISCV_HWPROBE_MISALIGNED_FAST)
> >>> +        fast_unaligned = true;
> >>> +    }
> >>> +
> >>> +  IFUNC_IMPL (i, name, memchr,
> >>> +           IFUNC_IMPL_ADD (array, i, memchr, has_zbb, __memchr_zbb)
> >>> +           IFUNC_IMPL_ADD (array, i, memchr, 1, __memchr_generic))
> >>>
> >>>    IFUNC_IMPL (i, name, memcpy,
> >>>             IFUNC_IMPL_ADD (array, i, memcpy, fast_unaligned,
> >>> diff --git a/sysdeps/unix/sysv/linux/riscv/multiarch/memchr.c b/sysdeps/unix/sysv/linux/riscv/multiarch/memchr.c
> >>> new file mode 100644
> >>> index 0000000000..bc076cbf24
> >>> --- /dev/null
> >>> +++ b/sysdeps/unix/sysv/linux/riscv/multiarch/memchr.c
> >>> @@ -0,0 +1,57 @@
> >>> +/* Multiple versions of memchr.
> >>> +   All versions must be listed in ifunc-impl-list.c.
> >>> +   Copyright (C) 2017-2024 Free Software Foundation, Inc.
> >>> +   This file is part of the GNU C Library.
> >>> +
> >>> +   The GNU C Library is free software; you can redistribute it and/or
> >>> +   modify it under the terms of the GNU Lesser General Public
> >>> +   License as published by the Free Software Foundation; either
> >>> +   version 2.1 of the License, or (at your option) any later version.
> >>> +
> >>> +   The GNU C Library is distributed in the hope that it will be useful,
> >>> +   but WITHOUT ANY WARRANTY; without even the implied warranty of
> >>> +   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
> >>> +   Lesser General Public License for more details.
> >>> +
> >>> +   You should have received a copy of the GNU Lesser General Public
> >>> +   License along with the GNU C Library; if not, see
> >>> +   <https://www.gnu.org/licenses/>.  */
> >>> +
> >>> +#if IS_IN (libc)
> >>> +/* Redefine memchr so that the compiler won't complain about the type
> >>> +   mismatch with the IFUNC selector in strong_alias, below.  */
> >>> +# undef memchr
> >>> +# define memchr __redirect_memchr
> >>> +# include <stdint.h>
> >>> +# include <string.h>
> >>> +# include <ifunc-init.h>
> >>> +# include <riscv-ifunc.h>
> >>> +# include <sys/hwprobe.h>
> >>> +
> >>> +extern __typeof (__redirect_memchr) __libc_memchr;
> >>> +
> >>> +extern __typeof (__redirect_memchr) __memchr_generic attribute_hidden;
> >>> +extern __typeof (__redirect_memchr) __memchr_zbb attribute_hidden;
> >>> +
> >>> +static inline __typeof (__redirect_memchr) *
> >>> +select_memchr_ifunc (uint64_t dl_hwcap, __riscv_hwprobe_t hwprobe_func)
> >>> +{
> >>> +  unsigned long long int v;
> >>> +  if (__riscv_hwprobe_one (hwprobe_func, RISCV_HWPROBE_KEY_IMA_EXT_0, &v) == 0
> >>> +      && (v & RISCV_HWPROBE_EXT_ZBB))
> >>> +    return __memchr_zbb;
> >>> +
> >>> +  return __memchr_generic;
> >>> +}
> >>> +
> >>> +riscv_libc_ifunc (__libc_memchr, select_memchr_ifunc);
> >>> +
> >>> +# undef memchr
> >>> +strong_alias (__libc_memchr, memchr);
> >>> +# ifdef SHARED
> >>> +__hidden_ver1 (memchr, __GI_memchr, __redirect_memchr)
> >>> +  __attribute__ ((visibility ("hidden"))) __attribute_copy__ (memchr);
> >>> +# endif
> >>> +#else
> >>> +# include <string/memchr.c>
> >>> +#endif

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH 2/7] RISC-V: Add Zbb optimized memchr as ifunc
  2024-04-24 13:36       ` Adhemerval Zanella Netto
  2024-04-26 11:40         ` Christoph Müllner
@ 2024-04-30 15:13         ` Palmer Dabbelt
  2024-04-30 17:45           ` Adhemerval Zanella Netto
  1 sibling, 1 reply; 20+ messages in thread
From: Palmer Dabbelt @ 2024-04-30 15:13 UTC (permalink / raw)
  To: adhemerval.zanella
  Cc: christoph.muellner, libc-alpha, Darius Rad, Andrew Waterman,
	philipp.tomsich, Evan Green, DJ Delorie, Vineet Gupta,
	kito.cheng, jeffreyalaw

On Wed, 24 Apr 2024 06:36:43 PDT (-0700), adhemerval.zanella@linaro.org wrote:
>
>
> On 24/04/24 10:16, Christoph Müllner wrote:
>> On Wed, Apr 24, 2024 at 2:53 PM Adhemerval Zanella Netto
>> <adhemerval.zanella@linaro.org> wrote:
>>>
>>>
>>>
>>> On 22/04/24 04:43, Christoph Müllner wrote:
>>>> When building with Zbb enabled, memchr benefits from using orc.b in
>>>> find_zero_all().  This patch changes the build system such, that a
>>>> non-Zbb version as well as a Zbb version of this routine is built.
>>>> Further, a ifunc resolver is provided that selects the right routine
>>>> based on the outcome of extension probing via hwprobe().
>>>>
>>>> Signed-off-by: Christoph Müllner <christoph.muellner@vrull.eu>
>>>> ---
>>>>  sysdeps/riscv/multiarch/memchr-generic.c      | 26 +++++++++
>>>>  sysdeps/riscv/multiarch/memchr-zbb.c          | 30 ++++++++++
>>>>  .../unix/sysv/linux/riscv/multiarch/Makefile  |  3 +
>>>>  .../linux/riscv/multiarch/ifunc-impl-list.c   | 31 ++++++++--
>>>>  .../unix/sysv/linux/riscv/multiarch/memchr.c  | 57 +++++++++++++++++++
>>>>  5 files changed, 142 insertions(+), 5 deletions(-)
>>>>  create mode 100644 sysdeps/riscv/multiarch/memchr-generic.c
>>>>  create mode 100644 sysdeps/riscv/multiarch/memchr-zbb.c
>>>>  create mode 100644 sysdeps/unix/sysv/linux/riscv/multiarch/memchr.c
>>>>
>>>> diff --git a/sysdeps/riscv/multiarch/memchr-generic.c b/sysdeps/riscv/multiarch/memchr-generic.c
>>>> new file mode 100644
>>>> index 0000000000..a96c36398b
>>>> --- /dev/null
>>>> +++ b/sysdeps/riscv/multiarch/memchr-generic.c
>>>> @@ -0,0 +1,26 @@
>>>> +/* Re-include the default memchr implementation.
>>>> +   Copyright (C) 2024 Free Software Foundation, Inc.
>>>> +   This file is part of the GNU C Library.
>>>> +
>>>> +   The GNU C Library is free software; you can redistribute it and/or
>>>> +   modify it under the terms of the GNU Lesser General Public
>>>> +   License as published by the Free Software Foundation; either
>>>> +   version 2.1 of the License, or (at your option) any later version.
>>>> +
>>>> +   The GNU C Library is distributed in the hope that it will be useful,
>>>> +   but WITHOUT ANY WARRANTY; without even the implied warranty of
>>>> +   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
>>>> +   Lesser General Public License for more details.
>>>> +
>>>> +   You should have received a copy of the GNU Lesser General Public
>>>> +   License along with the GNU C Library; if not, see
>>>> +   <https://www.gnu.org/licenses/>.  */
>>>> +
>>>> +#include <string.h>
>>>> +
>>>> +#if IS_IN(libc)
>>>> +# define MEMCHR __memchr_generic
>>>> +# undef libc_hidden_builtin_def
>>>> +# define libc_hidden_builtin_def(x)
>>>> +#endif
>>>> +#include <string/memchr.c>
>>>> diff --git a/sysdeps/riscv/multiarch/memchr-zbb.c b/sysdeps/riscv/multiarch/memchr-zbb.c
>>>> new file mode 100644
>>>> index 0000000000..bead0335ae
>>>> --- /dev/null
>>>> +++ b/sysdeps/riscv/multiarch/memchr-zbb.c
>>>> @@ -0,0 +1,30 @@
>>>> +/* Re-include the default memchr implementation for Zbb.
>>>> +   Copyright (C) 2024 Free Software Foundation, Inc.
>>>> +   This file is part of the GNU C Library.
>>>> +
>>>> +   The GNU C Library is free software; you can redistribute it and/or
>>>> +   modify it under the terms of the GNU Lesser General Public
>>>> +   License as published by the Free Software Foundation; either
>>>> +   version 2.1 of the License, or (at your option) any later version.
>>>> +
>>>> +   The GNU C Library is distributed in the hope that it will be useful,
>>>> +   but WITHOUT ANY WARRANTY; without even the implied warranty of
>>>> +   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
>>>> +   Lesser General Public License for more details.
>>>> +
>>>> +   You should have received a copy of the GNU Lesser General Public
>>>> +   License along with the GNU C Library; if not, see
>>>> +   <https://www.gnu.org/licenses/>.  */
>>>> +
>>>> +#include <string.h>
>>>> +
>>>> +#if IS_IN(libc)
>>>> +# define MEMCHR __memchr_zbb
>>>> +# undef libc_hidden_builtin_def
>>>> +# define libc_hidden_builtin_def(x)
>>>> +#endif
>>>> +/* Convince preprocessor to have Zbb instructions.  */
>>>> +#ifndef __riscv_zbb
>>>> +# define __riscv_zbb
>>>> +#endif
>>>
>>> Is there a way to specific the compiler to enable a extension, like aarch64
>>> -march=arch{+[no]feature}? I think ideally this should be enabled as CFLAGS
>>> instead of messing with compiler defined pre-processor.
>>
>> The tools expect a list of all extensions as parameter to the -march= option.
>> But there is no way to append extensions to an existing march string
>> on the command line.
>>
>> And if we would add this feature today, it would take many years until we could
>> use it here, because we want to remain compatible with old tools.
>> Or we enable the optimization only when being built with new tools, but that
>> adds even more complexity and build/test configurations.
>>
>> What we have is:
>> * Preprocessor (since forever): Extension test macros (__riscv_EXTENSION)
>> * Command line (since forever): -march=BASE_EXTENSIONLIST
>> * GAS (since Nov 21): .option arch, +EXTENSION (in combination with
>> option push/pop)
>> * GCC (since Nov 23): __attribute__((target("arch=+EXTENSION")))
>>
>> I was not sure about using __riscv_zbb as well, but I considered it safe within
>> ifdef tests that ensure the macro won't be set twice.
>> If that's a concern, I could change to use something like this:
>> #define __riscv_force_zbb
>> #include <impl.c>
>> #undef __riscv_force_zbb
>> ... and change string-fza.h like this:
>> #if defined(__riscv_zbb) || defined(__riscv_force_zbb)
>> // orc.b
>> #endif
>>
>> BR
>> Christoph
>
> Another options would to parse the current march and add the extension if required,
> something like:
>
> abi=$(riscv64-linux-gnu-gcc -Q --help=target | grep march | cut -d '=' -f2 | xargs)
> if [[ ! "$abi" =~ "_zbb" ]]
> then
>   abi="$abi"_zbb
> fi

That alone likely won't do it, there's a bunch of ordering rules in the 
ISA string handling so we might get tripped up on them.  We've got a 
fairly relaxed version of the rules in GCC to try and match the various 
older rules, though, so it might be possible to make something similar 
work.

We should probably just add some sort of -march=+zbb type argument.  
IIRC Kito was going to do it at some point, not sure if he got around to 
it?

> I don't have a strong preference, it is just that by not using the compiler flag
> we won't be able to either use the builtin (__builtin_riscv_orc_b_32) and/or get
> a possible better code generation from compiler.

I think we'd likely get slightly better codgen from telling the compiler 
about the bitmanip extensions.  Maybe we want something like

    diff --git a/string/memchr.c b/string/memchr.c
    index 08b5c41667..1b62dce8d8 100644
    --- a/string/memchr.c
    +++ b/string/memchr.c
    @@ -29,15 +29,19 @@
     # define __memchr MEMCHR
     #endif
    
    +#ifndef __MEMCHR_CODEGEN_ATTRIBUTE
    +#define __MEMCHR_CODEGEN_ATTRIBUTE
    +#endif
    +
     static __always_inline const char *
    -sadd (uintptr_t x, uintptr_t y)
    +sadd (uintptr_t x, uintptr_t y) __MEMCHR_CODEGEN_ATTRIBUTE
     {
       return (const char *)(y > UINTPTR_MAX - x ? UINTPTR_MAX : x + y);
     }
    
     /* Search no more than N bytes of S for C.  */
     void *
    -__memchr (void const *s, int c_in, size_t n)
    +__memchr (void const *s, int c_in, size_t n) __MEMCHR_CODEGEN_ATTRIBUTE
     {
       if (__glibc_unlikely (n == 0))
         return NULL;

in the generic versions, so we can add a

    #define __MEMCHR_CODEGEN_ATTRIBUTE __attribuet__((target("+zbb")))

(or whatever the syntax is) to the Zbb-flavored versions of these 
routines?

It might also be worth just jumping to the fast-misaligned versions for 
these routines, too --the slow-misaligned stuff is there for 
compatibility with old stuff (though memchr aligns the pointer, so it 
doesn't matter so much here).

>>>> +#include <string/memchr.c>
>>>> diff --git a/sysdeps/unix/sysv/linux/riscv/multiarch/Makefile b/sysdeps/unix/sysv/linux/riscv/multiarch/Makefile
>>>> index fcef5659d4..5586d11c89 100644
>>>> --- a/sysdeps/unix/sysv/linux/riscv/multiarch/Makefile
>>>> +++ b/sysdeps/unix/sysv/linux/riscv/multiarch/Makefile
>>>> @@ -1,5 +1,8 @@
>>>>  ifeq ($(subdir),string)
>>>>  sysdep_routines += \
>>>> +  memchr \
>>>> +  memchr-generic \
>>>> +  memchr-zbb \
>>>>    memcpy \
>>>>    memcpy-generic \
>>>>    memcpy_noalignment \
>>>> diff --git a/sysdeps/unix/sysv/linux/riscv/multiarch/ifunc-impl-list.c b/sysdeps/unix/sysv/linux/riscv/multiarch/ifunc-impl-list.c
>>>> index 9f806d7a9e..7321144a32 100644
>>>> --- a/sysdeps/unix/sysv/linux/riscv/multiarch/ifunc-impl-list.c
>>>> +++ b/sysdeps/unix/sysv/linux/riscv/multiarch/ifunc-impl-list.c
>>>> @@ -20,19 +20,40 @@
>>>>  #include <string.h>
>>>>  #include <sys/hwprobe.h>
>>>>
>>>> +#define ARRAY_SIZE(A) (sizeof (A) / sizeof ((A)[0]))
>>>> +
>>>>  size_t
>>>>  __libc_ifunc_impl_list (const char *name, struct libc_ifunc_impl *array,
>>>>                       size_t max)
>>>>  {
>>>>    size_t i = max;
>>>> +  struct riscv_hwprobe pairs[] = {
>>>> +    { .key = RISCV_HWPROBE_KEY_IMA_EXT_0 },
>>>> +    { .key = RISCV_HWPROBE_KEY_CPUPERF_0 },
>>>> +  };
>>>>
>>>> +  bool has_zbb = false;
>>>>    bool fast_unaligned = false;
>>>>
>>>> -  struct riscv_hwprobe pair = { .key = RISCV_HWPROBE_KEY_CPUPERF_0 };
>>>> -  if (__riscv_hwprobe (&pair, 1, 0, NULL, 0) == 0
>>>> -      && (pair.value & RISCV_HWPROBE_MISALIGNED_MASK)
>>>> -          == RISCV_HWPROBE_MISALIGNED_FAST)
>>>> -    fast_unaligned = true;
>>>> +  if (__riscv_hwprobe (pairs, ARRAY_SIZE (pairs), 0, NULL, 0) == 0)
>>>> +    {
>>>> +      struct riscv_hwprobe *pair;
>>>> +
>>>> +      /* RISCV_HWPROBE_KEY_IMA_EXT_0  */
>>>> +      pair = &pairs[0];
>>>> +      if (pair->value & RISCV_HWPROBE_EXT_ZBB)
>>>> +        has_zbb = true;
>>>> +
>>>> +      /* RISCV_HWPROBE_KEY_CPUPERF_0  */
>>>> +      pair = &pairs[1];
>>>> +      if ((pair->value & RISCV_HWPROBE_MISALIGNED_MASK)
>>>> +        == RISCV_HWPROBE_MISALIGNED_FAST)
>>>> +        fast_unaligned = true;
>>>> +    }
>>>> +
>>>> +  IFUNC_IMPL (i, name, memchr,
>>>> +           IFUNC_IMPL_ADD (array, i, memchr, has_zbb, __memchr_zbb)
>>>> +           IFUNC_IMPL_ADD (array, i, memchr, 1, __memchr_generic))
>>>>
>>>>    IFUNC_IMPL (i, name, memcpy,
>>>>             IFUNC_IMPL_ADD (array, i, memcpy, fast_unaligned,
>>>> diff --git a/sysdeps/unix/sysv/linux/riscv/multiarch/memchr.c b/sysdeps/unix/sysv/linux/riscv/multiarch/memchr.c
>>>> new file mode 100644
>>>> index 0000000000..bc076cbf24
>>>> --- /dev/null
>>>> +++ b/sysdeps/unix/sysv/linux/riscv/multiarch/memchr.c
>>>> @@ -0,0 +1,57 @@
>>>> +/* Multiple versions of memchr.
>>>> +   All versions must be listed in ifunc-impl-list.c.
>>>> +   Copyright (C) 2017-2024 Free Software Foundation, Inc.
>>>> +   This file is part of the GNU C Library.
>>>> +
>>>> +   The GNU C Library is free software; you can redistribute it and/or
>>>> +   modify it under the terms of the GNU Lesser General Public
>>>> +   License as published by the Free Software Foundation; either
>>>> +   version 2.1 of the License, or (at your option) any later version.
>>>> +
>>>> +   The GNU C Library is distributed in the hope that it will be useful,
>>>> +   but WITHOUT ANY WARRANTY; without even the implied warranty of
>>>> +   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
>>>> +   Lesser General Public License for more details.
>>>> +
>>>> +   You should have received a copy of the GNU Lesser General Public
>>>> +   License along with the GNU C Library; if not, see
>>>> +   <https://www.gnu.org/licenses/>.  */
>>>> +
>>>> +#if IS_IN (libc)
>>>> +/* Redefine memchr so that the compiler won't complain about the type
>>>> +   mismatch with the IFUNC selector in strong_alias, below.  */
>>>> +# undef memchr
>>>> +# define memchr __redirect_memchr
>>>> +# include <stdint.h>
>>>> +# include <string.h>
>>>> +# include <ifunc-init.h>
>>>> +# include <riscv-ifunc.h>
>>>> +# include <sys/hwprobe.h>
>>>> +
>>>> +extern __typeof (__redirect_memchr) __libc_memchr;
>>>> +
>>>> +extern __typeof (__redirect_memchr) __memchr_generic attribute_hidden;
>>>> +extern __typeof (__redirect_memchr) __memchr_zbb attribute_hidden;
>>>> +
>>>> +static inline __typeof (__redirect_memchr) *
>>>> +select_memchr_ifunc (uint64_t dl_hwcap, __riscv_hwprobe_t hwprobe_func)
>>>> +{
>>>> +  unsigned long long int v;
>>>> +  if (__riscv_hwprobe_one (hwprobe_func, RISCV_HWPROBE_KEY_IMA_EXT_0, &v) == 0
>>>> +      && (v & RISCV_HWPROBE_EXT_ZBB))
>>>> +    return __memchr_zbb;
>>>> +
>>>> +  return __memchr_generic;
>>>> +}
>>>> +
>>>> +riscv_libc_ifunc (__libc_memchr, select_memchr_ifunc);
>>>> +
>>>> +# undef memchr
>>>> +strong_alias (__libc_memchr, memchr);
>>>> +# ifdef SHARED
>>>> +__hidden_ver1 (memchr, __GI_memchr, __redirect_memchr)
>>>> +  __attribute__ ((visibility ("hidden"))) __attribute_copy__ (memchr);
>>>> +# endif
>>>> +#else
>>>> +# include <string/memchr.c>
>>>> +#endif

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH 2/7] RISC-V: Add Zbb optimized memchr as ifunc
  2024-04-30 15:13         ` Palmer Dabbelt
@ 2024-04-30 17:45           ` Adhemerval Zanella Netto
  2024-04-30 17:54             ` Palmer Dabbelt
  0 siblings, 1 reply; 20+ messages in thread
From: Adhemerval Zanella Netto @ 2024-04-30 17:45 UTC (permalink / raw)
  To: Palmer Dabbelt
  Cc: christoph.muellner, libc-alpha, Darius Rad, Andrew Waterman,
	philipp.tomsich, Evan Green, DJ Delorie, Vineet Gupta,
	kito.cheng, jeffreyalaw



On 30/04/24 12:13, Palmer Dabbelt wrote:
> On Wed, 24 Apr 2024 06:36:43 PDT (-0700), adhemerval.zanella@linaro.org wrote:
>>
>>
>> On 24/04/24 10:16, Christoph Müllner wrote:
>>> On Wed, Apr 24, 2024 at 2:53 PM Adhemerval Zanella Netto
>>> <adhemerval.zanella@linaro.org> wrote:
>>>>
>>>>
>>>>
>>>> On 22/04/24 04:43, Christoph Müllner wrote:
>>>>> When building with Zbb enabled, memchr benefits from using orc.b in
>>>>> find_zero_all().  This patch changes the build system such, that a
>>>>> non-Zbb version as well as a Zbb version of this routine is built.
>>>>> Further, a ifunc resolver is provided that selects the right routine
>>>>> based on the outcome of extension probing via hwprobe().
>>>>>
>>>>> Signed-off-by: Christoph Müllner <christoph.muellner@vrull.eu>
>>>>> ---
>>>>>  sysdeps/riscv/multiarch/memchr-generic.c      | 26 +++++++++
>>>>>  sysdeps/riscv/multiarch/memchr-zbb.c          | 30 ++++++++++
>>>>>  .../unix/sysv/linux/riscv/multiarch/Makefile  |  3 +
>>>>>  .../linux/riscv/multiarch/ifunc-impl-list.c   | 31 ++++++++--
>>>>>  .../unix/sysv/linux/riscv/multiarch/memchr.c  | 57 +++++++++++++++++++
>>>>>  5 files changed, 142 insertions(+), 5 deletions(-)
>>>>>  create mode 100644 sysdeps/riscv/multiarch/memchr-generic.c
>>>>>  create mode 100644 sysdeps/riscv/multiarch/memchr-zbb.c
>>>>>  create mode 100644 sysdeps/unix/sysv/linux/riscv/multiarch/memchr.c
>>>>>
>>>>> diff --git a/sysdeps/riscv/multiarch/memchr-generic.c b/sysdeps/riscv/multiarch/memchr-generic.c
>>>>> new file mode 100644
>>>>> index 0000000000..a96c36398b
>>>>> --- /dev/null
>>>>> +++ b/sysdeps/riscv/multiarch/memchr-generic.c
>>>>> @@ -0,0 +1,26 @@
>>>>> +/* Re-include the default memchr implementation.
>>>>> +   Copyright (C) 2024 Free Software Foundation, Inc.
>>>>> +   This file is part of the GNU C Library.
>>>>> +
>>>>> +   The GNU C Library is free software; you can redistribute it and/or
>>>>> +   modify it under the terms of the GNU Lesser General Public
>>>>> +   License as published by the Free Software Foundation; either
>>>>> +   version 2.1 of the License, or (at your option) any later version.
>>>>> +
>>>>> +   The GNU C Library is distributed in the hope that it will be useful,
>>>>> +   but WITHOUT ANY WARRANTY; without even the implied warranty of
>>>>> +   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
>>>>> +   Lesser General Public License for more details.
>>>>> +
>>>>> +   You should have received a copy of the GNU Lesser General Public
>>>>> +   License along with the GNU C Library; if not, see
>>>>> +   <https://www.gnu.org/licenses/>.  */
>>>>> +
>>>>> +#include <string.h>
>>>>> +
>>>>> +#if IS_IN(libc)
>>>>> +# define MEMCHR __memchr_generic
>>>>> +# undef libc_hidden_builtin_def
>>>>> +# define libc_hidden_builtin_def(x)
>>>>> +#endif
>>>>> +#include <string/memchr.c>
>>>>> diff --git a/sysdeps/riscv/multiarch/memchr-zbb.c b/sysdeps/riscv/multiarch/memchr-zbb.c
>>>>> new file mode 100644
>>>>> index 0000000000..bead0335ae
>>>>> --- /dev/null
>>>>> +++ b/sysdeps/riscv/multiarch/memchr-zbb.c
>>>>> @@ -0,0 +1,30 @@
>>>>> +/* Re-include the default memchr implementation for Zbb.
>>>>> +   Copyright (C) 2024 Free Software Foundation, Inc.
>>>>> +   This file is part of the GNU C Library.
>>>>> +
>>>>> +   The GNU C Library is free software; you can redistribute it and/or
>>>>> +   modify it under the terms of the GNU Lesser General Public
>>>>> +   License as published by the Free Software Foundation; either
>>>>> +   version 2.1 of the License, or (at your option) any later version.
>>>>> +
>>>>> +   The GNU C Library is distributed in the hope that it will be useful,
>>>>> +   but WITHOUT ANY WARRANTY; without even the implied warranty of
>>>>> +   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
>>>>> +   Lesser General Public License for more details.
>>>>> +
>>>>> +   You should have received a copy of the GNU Lesser General Public
>>>>> +   License along with the GNU C Library; if not, see
>>>>> +   <https://www.gnu.org/licenses/>.  */
>>>>> +
>>>>> +#include <string.h>
>>>>> +
>>>>> +#if IS_IN(libc)
>>>>> +# define MEMCHR __memchr_zbb
>>>>> +# undef libc_hidden_builtin_def
>>>>> +# define libc_hidden_builtin_def(x)
>>>>> +#endif
>>>>> +/* Convince preprocessor to have Zbb instructions.  */
>>>>> +#ifndef __riscv_zbb
>>>>> +# define __riscv_zbb
>>>>> +#endif
>>>>
>>>> Is there a way to specific the compiler to enable a extension, like aarch64
>>>> -march=arch{+[no]feature}? I think ideally this should be enabled as CFLAGS
>>>> instead of messing with compiler defined pre-processor.
>>>
>>> The tools expect a list of all extensions as parameter to the -march= option.
>>> But there is no way to append extensions to an existing march string
>>> on the command line.
>>>
>>> And if we would add this feature today, it would take many years until we could
>>> use it here, because we want to remain compatible with old tools.
>>> Or we enable the optimization only when being built with new tools, but that
>>> adds even more complexity and build/test configurations.
>>>
>>> What we have is:
>>> * Preprocessor (since forever): Extension test macros (__riscv_EXTENSION)
>>> * Command line (since forever): -march=BASE_EXTENSIONLIST
>>> * GAS (since Nov 21): .option arch, +EXTENSION (in combination with
>>> option push/pop)
>>> * GCC (since Nov 23): __attribute__((target("arch=+EXTENSION")))
>>>
>>> I was not sure about using __riscv_zbb as well, but I considered it safe within
>>> ifdef tests that ensure the macro won't be set twice.
>>> If that's a concern, I could change to use something like this:
>>> #define __riscv_force_zbb
>>> #include <impl.c>
>>> #undef __riscv_force_zbb
>>> ... and change string-fza.h like this:
>>> #if defined(__riscv_zbb) || defined(__riscv_force_zbb)
>>> // orc.b
>>> #endif
>>>
>>> BR
>>> Christoph
>>
>> Another options would to parse the current march and add the extension if required,
>> something like:
>>
>> abi=$(riscv64-linux-gnu-gcc -Q --help=target | grep march | cut -d '=' -f2 | xargs)
>> if [[ ! "$abi" =~ "_zbb" ]]
>> then
>>   abi="$abi"_zbb
>> fi
> 
> That alone likely won't do it, there's a bunch of ordering rules in the ISA string handling so we might get tripped up on them.  We've got a fairly relaxed version of the rules in GCC to try and match the various older rules, though, so it might be possible to make something similar work.
> 
> We should probably just add some sort of -march=+zbb type argument.  IIRC Kito was going to do it at some point, not sure if he got around to it?

I am just pointing this out because I think the way RISCV extension selection
is currently implemented makes it awkward to provide ifunc implementation in
a agnostic way (specially now that RISCV has dozens of extensions) without
knowing the current target compiler is generating.

Some other ABI allows to either specify a ISA/chip reference (like powerpc
with -mcpu=powerX) or a ABI extension directly (like aarch64 with -march=+xxx).

> 
>> I don't have a strong preference, it is just that by not using the compiler flag
>> we won't be able to either use the builtin (__builtin_riscv_orc_b_32) and/or get
>> a possible better code generation from compiler.
> 
> I think we'd likely get slightly better codgen from telling the compiler about the bitmanip extensions.  Maybe we want something like
> 
>    diff --git a/string/memchr.c b/string/memchr.c
>    index 08b5c41667..1b62dce8d8 100644
>    --- a/string/memchr.c
>    +++ b/string/memchr.c
>    @@ -29,15 +29,19 @@
>     # define __memchr MEMCHR
>     #endif
>       +#ifndef __MEMCHR_CODEGEN_ATTRIBUTE
>    +#define __MEMCHR_CODEGEN_ATTRIBUTE
>    +#endif
>    +
>     static __always_inline const char *
>    -sadd (uintptr_t x, uintptr_t y)
>    +sadd (uintptr_t x, uintptr_t y) __MEMCHR_CODEGEN_ATTRIBUTE
>     {
>       return (const char *)(y > UINTPTR_MAX - x ? UINTPTR_MAX : x + y);
>     }
>        /* Search no more than N bytes of S for C.  */
>     void *
>    -__memchr (void const *s, int c_in, size_t n)
>    +__memchr (void const *s, int c_in, size_t n) __MEMCHR_CODEGEN_ATTRIBUTE
>     {
>       if (__glibc_unlikely (n == 0))
>         return NULL;
> 
> in the generic versions, so we can add a
> 
>    #define __MEMCHR_CODEGEN_ATTRIBUTE __attribuet__((target("+zbb")))
> 
> (or whatever the syntax is) to the Zbb-flavored versions of these routines?

Yeah, this might work and it is clear than messing with compiler-defined
macros.

> 
> It might also be worth just jumping to the fast-misaligned versions for these routines, too --the slow-misaligned stuff is there for compatibility with old stuff (though memchr aligns the pointer, so it doesn't matter so much here).

I was hopping that ABIs that would like to provide unaligned variants
for mem* routines to improve the generic code, but it seems that for some
it is easier to just add an assembly routine (as loongarch did).

For memchr, I think it should be easy to provide a unaligned version.
Something like (completely untested):

/* Search no more than N bytes of S for C.  */
void *
__memchr (void const *s, int c_in, size_t n)
{
  if (__glibc_unlikely (n == 0))
    return NULL;

#ifdef USE_MEMCHR_UNALIGNED
  /* Read the first word, but munge it so that bytes before the array
     will not match goal.  */
  const op_t *word_ptr = PTR_ALIGN_DOWN (s, sizeof (op_t));
  uintptr_t s_int = (uintptr_t) s;

  op_t word = *word_ptr;
  op_t repeated_c = repeat_bytes (c_in);
  /* Compute the address of the last byte taking in consideration possible
     overflow.  */
  const char *lbyte = sadd (s_int, n - 1);
  /* And also the address of the word containing the last byte. */
  const op_t *lword = (const op_t *) PTR_ALIGN_DOWN (lbyte, sizeof (op_t));

  find_t mask = shift_find (find_eq_all (word, repeated_c), s_int);
  if (mask != 0)
    {
      char *ret = (char *) s + index_first (mask);
      return (ret <= lbyte) ? ret : NULL;
    }
  if (word_ptr == lword)
    return NULL;
#endif

  word = *++word_ptr;
  while (word_ptr != lword)
    {
      if (has_eq (word, repeated_c))
        return (char *) word_ptr + index_first_eq (word, repeated_c);
      word = *++word_ptr;
    }

  if (has_eq (word, repeated_c))
    {
      /* We found a match, but it might be in a byte past the end of the
         array.  */
      char *ret = (char *) word_ptr + index_first_eq (word, repeated_c);
      if (ret <= lbyte)
        return ret;
    }
  return NULL;
}

> 
>>>>> +#include <string/memchr.c>
>>>>> diff --git a/sysdeps/unix/sysv/linux/riscv/multiarch/Makefile b/sysdeps/unix/sysv/linux/riscv/multiarch/Makefile
>>>>> index fcef5659d4..5586d11c89 100644
>>>>> --- a/sysdeps/unix/sysv/linux/riscv/multiarch/Makefile
>>>>> +++ b/sysdeps/unix/sysv/linux/riscv/multiarch/Makefile
>>>>> @@ -1,5 +1,8 @@
>>>>>  ifeq ($(subdir),string)
>>>>>  sysdep_routines += \
>>>>> +  memchr \
>>>>> +  memchr-generic \
>>>>> +  memchr-zbb \
>>>>>    memcpy \
>>>>>    memcpy-generic \
>>>>>    memcpy_noalignment \
>>>>> diff --git a/sysdeps/unix/sysv/linux/riscv/multiarch/ifunc-impl-list.c b/sysdeps/unix/sysv/linux/riscv/multiarch/ifunc-impl-list.c
>>>>> index 9f806d7a9e..7321144a32 100644
>>>>> --- a/sysdeps/unix/sysv/linux/riscv/multiarch/ifunc-impl-list.c
>>>>> +++ b/sysdeps/unix/sysv/linux/riscv/multiarch/ifunc-impl-list.c
>>>>> @@ -20,19 +20,40 @@
>>>>>  #include <string.h>
>>>>>  #include <sys/hwprobe.h>
>>>>>
>>>>> +#define ARRAY_SIZE(A) (sizeof (A) / sizeof ((A)[0]))
>>>>> +
>>>>>  size_t
>>>>>  __libc_ifunc_impl_list (const char *name, struct libc_ifunc_impl *array,
>>>>>                       size_t max)
>>>>>  {
>>>>>    size_t i = max;
>>>>> +  struct riscv_hwprobe pairs[] = {
>>>>> +    { .key = RISCV_HWPROBE_KEY_IMA_EXT_0 },
>>>>> +    { .key = RISCV_HWPROBE_KEY_CPUPERF_0 },
>>>>> +  };
>>>>>
>>>>> +  bool has_zbb = false;
>>>>>    bool fast_unaligned = false;
>>>>>
>>>>> -  struct riscv_hwprobe pair = { .key = RISCV_HWPROBE_KEY_CPUPERF_0 };
>>>>> -  if (__riscv_hwprobe (&pair, 1, 0, NULL, 0) == 0
>>>>> -      && (pair.value & RISCV_HWPROBE_MISALIGNED_MASK)
>>>>> -          == RISCV_HWPROBE_MISALIGNED_FAST)
>>>>> -    fast_unaligned = true;
>>>>> +  if (__riscv_hwprobe (pairs, ARRAY_SIZE (pairs), 0, NULL, 0) == 0)
>>>>> +    {
>>>>> +      struct riscv_hwprobe *pair;
>>>>> +
>>>>> +      /* RISCV_HWPROBE_KEY_IMA_EXT_0  */
>>>>> +      pair = &pairs[0];
>>>>> +      if (pair->value & RISCV_HWPROBE_EXT_ZBB)
>>>>> +        has_zbb = true;
>>>>> +
>>>>> +      /* RISCV_HWPROBE_KEY_CPUPERF_0  */
>>>>> +      pair = &pairs[1];
>>>>> +      if ((pair->value & RISCV_HWPROBE_MISALIGNED_MASK)
>>>>> +        == RISCV_HWPROBE_MISALIGNED_FAST)
>>>>> +        fast_unaligned = true;
>>>>> +    }
>>>>> +
>>>>> +  IFUNC_IMPL (i, name, memchr,
>>>>> +           IFUNC_IMPL_ADD (array, i, memchr, has_zbb, __memchr_zbb)
>>>>> +           IFUNC_IMPL_ADD (array, i, memchr, 1, __memchr_generic))
>>>>>
>>>>>    IFUNC_IMPL (i, name, memcpy,
>>>>>             IFUNC_IMPL_ADD (array, i, memcpy, fast_unaligned,
>>>>> diff --git a/sysdeps/unix/sysv/linux/riscv/multiarch/memchr.c b/sysdeps/unix/sysv/linux/riscv/multiarch/memchr.c
>>>>> new file mode 100644
>>>>> index 0000000000..bc076cbf24
>>>>> --- /dev/null
>>>>> +++ b/sysdeps/unix/sysv/linux/riscv/multiarch/memchr.c
>>>>> @@ -0,0 +1,57 @@
>>>>> +/* Multiple versions of memchr.
>>>>> +   All versions must be listed in ifunc-impl-list.c.
>>>>> +   Copyright (C) 2017-2024 Free Software Foundation, Inc.
>>>>> +   This file is part of the GNU C Library.
>>>>> +
>>>>> +   The GNU C Library is free software; you can redistribute it and/or
>>>>> +   modify it under the terms of the GNU Lesser General Public
>>>>> +   License as published by the Free Software Foundation; either
>>>>> +   version 2.1 of the License, or (at your option) any later version.
>>>>> +
>>>>> +   The GNU C Library is distributed in the hope that it will be useful,
>>>>> +   but WITHOUT ANY WARRANTY; without even the implied warranty of
>>>>> +   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
>>>>> +   Lesser General Public License for more details.
>>>>> +
>>>>> +   You should have received a copy of the GNU Lesser General Public
>>>>> +   License along with the GNU C Library; if not, see
>>>>> +   <https://www.gnu.org/licenses/>.  */
>>>>> +
>>>>> +#if IS_IN (libc)
>>>>> +/* Redefine memchr so that the compiler won't complain about the type
>>>>> +   mismatch with the IFUNC selector in strong_alias, below.  */
>>>>> +# undef memchr
>>>>> +# define memchr __redirect_memchr
>>>>> +# include <stdint.h>
>>>>> +# include <string.h>
>>>>> +# include <ifunc-init.h>
>>>>> +# include <riscv-ifunc.h>
>>>>> +# include <sys/hwprobe.h>
>>>>> +
>>>>> +extern __typeof (__redirect_memchr) __libc_memchr;
>>>>> +
>>>>> +extern __typeof (__redirect_memchr) __memchr_generic attribute_hidden;
>>>>> +extern __typeof (__redirect_memchr) __memchr_zbb attribute_hidden;
>>>>> +
>>>>> +static inline __typeof (__redirect_memchr) *
>>>>> +select_memchr_ifunc (uint64_t dl_hwcap, __riscv_hwprobe_t hwprobe_func)
>>>>> +{
>>>>> +  unsigned long long int v;
>>>>> +  if (__riscv_hwprobe_one (hwprobe_func, RISCV_HWPROBE_KEY_IMA_EXT_0, &v) == 0
>>>>> +      && (v & RISCV_HWPROBE_EXT_ZBB))
>>>>> +    return __memchr_zbb;
>>>>> +
>>>>> +  return __memchr_generic;
>>>>> +}
>>>>> +
>>>>> +riscv_libc_ifunc (__libc_memchr, select_memchr_ifunc);
>>>>> +
>>>>> +# undef memchr
>>>>> +strong_alias (__libc_memchr, memchr);
>>>>> +# ifdef SHARED
>>>>> +__hidden_ver1 (memchr, __GI_memchr, __redirect_memchr)
>>>>> +  __attribute__ ((visibility ("hidden"))) __attribute_copy__ (memchr);
>>>>> +# endif
>>>>> +#else
>>>>> +# include <string/memchr.c>
>>>>> +#endif

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH 2/7] RISC-V: Add Zbb optimized memchr as ifunc
  2024-04-30 17:45           ` Adhemerval Zanella Netto
@ 2024-04-30 17:54             ` Palmer Dabbelt
  2024-04-30 18:44               ` Vineet Gupta
  2024-05-06 13:20               ` Christoph Müllner
  0 siblings, 2 replies; 20+ messages in thread
From: Palmer Dabbelt @ 2024-04-30 17:54 UTC (permalink / raw)
  To: adhemerval.zanella
  Cc: christoph.muellner, libc-alpha, Darius Rad, Andrew Waterman,
	philipp.tomsich, Evan Green, DJ Delorie, Vineet Gupta,
	kito.cheng, jeffreyalaw

On Tue, 30 Apr 2024 10:45:12 PDT (-0700), adhemerval.zanella@linaro.org wrote:
>
>
> On 30/04/24 12:13, Palmer Dabbelt wrote:
>> On Wed, 24 Apr 2024 06:36:43 PDT (-0700), adhemerval.zanella@linaro.org wrote:
>>>
>>>
>>> On 24/04/24 10:16, Christoph Müllner wrote:
>>>> On Wed, Apr 24, 2024 at 2:53 PM Adhemerval Zanella Netto
>>>> <adhemerval.zanella@linaro.org> wrote:
>>>>>
>>>>>
>>>>>
>>>>> On 22/04/24 04:43, Christoph Müllner wrote:
>>>>>> When building with Zbb enabled, memchr benefits from using orc.b in
>>>>>> find_zero_all().  This patch changes the build system such, that a
>>>>>> non-Zbb version as well as a Zbb version of this routine is built.
>>>>>> Further, a ifunc resolver is provided that selects the right routine
>>>>>> based on the outcome of extension probing via hwprobe().
>>>>>>
>>>>>> Signed-off-by: Christoph Müllner <christoph.muellner@vrull.eu>
>>>>>> ---
>>>>>>  sysdeps/riscv/multiarch/memchr-generic.c      | 26 +++++++++
>>>>>>  sysdeps/riscv/multiarch/memchr-zbb.c          | 30 ++++++++++
>>>>>>  .../unix/sysv/linux/riscv/multiarch/Makefile  |  3 +
>>>>>>  .../linux/riscv/multiarch/ifunc-impl-list.c   | 31 ++++++++--
>>>>>>  .../unix/sysv/linux/riscv/multiarch/memchr.c  | 57 +++++++++++++++++++
>>>>>>  5 files changed, 142 insertions(+), 5 deletions(-)
>>>>>>  create mode 100644 sysdeps/riscv/multiarch/memchr-generic.c
>>>>>>  create mode 100644 sysdeps/riscv/multiarch/memchr-zbb.c
>>>>>>  create mode 100644 sysdeps/unix/sysv/linux/riscv/multiarch/memchr.c
>>>>>>
>>>>>> diff --git a/sysdeps/riscv/multiarch/memchr-generic.c b/sysdeps/riscv/multiarch/memchr-generic.c
>>>>>> new file mode 100644
>>>>>> index 0000000000..a96c36398b
>>>>>> --- /dev/null
>>>>>> +++ b/sysdeps/riscv/multiarch/memchr-generic.c
>>>>>> @@ -0,0 +1,26 @@
>>>>>> +/* Re-include the default memchr implementation.
>>>>>> +   Copyright (C) 2024 Free Software Foundation, Inc.
>>>>>> +   This file is part of the GNU C Library.
>>>>>> +
>>>>>> +   The GNU C Library is free software; you can redistribute it and/or
>>>>>> +   modify it under the terms of the GNU Lesser General Public
>>>>>> +   License as published by the Free Software Foundation; either
>>>>>> +   version 2.1 of the License, or (at your option) any later version.
>>>>>> +
>>>>>> +   The GNU C Library is distributed in the hope that it will be useful,
>>>>>> +   but WITHOUT ANY WARRANTY; without even the implied warranty of
>>>>>> +   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
>>>>>> +   Lesser General Public License for more details.
>>>>>> +
>>>>>> +   You should have received a copy of the GNU Lesser General Public
>>>>>> +   License along with the GNU C Library; if not, see
>>>>>> +   <https://www.gnu.org/licenses/>.  */
>>>>>> +
>>>>>> +#include <string.h>
>>>>>> +
>>>>>> +#if IS_IN(libc)
>>>>>> +# define MEMCHR __memchr_generic
>>>>>> +# undef libc_hidden_builtin_def
>>>>>> +# define libc_hidden_builtin_def(x)
>>>>>> +#endif
>>>>>> +#include <string/memchr.c>
>>>>>> diff --git a/sysdeps/riscv/multiarch/memchr-zbb.c b/sysdeps/riscv/multiarch/memchr-zbb.c
>>>>>> new file mode 100644
>>>>>> index 0000000000..bead0335ae
>>>>>> --- /dev/null
>>>>>> +++ b/sysdeps/riscv/multiarch/memchr-zbb.c
>>>>>> @@ -0,0 +1,30 @@
>>>>>> +/* Re-include the default memchr implementation for Zbb.
>>>>>> +   Copyright (C) 2024 Free Software Foundation, Inc.
>>>>>> +   This file is part of the GNU C Library.
>>>>>> +
>>>>>> +   The GNU C Library is free software; you can redistribute it and/or
>>>>>> +   modify it under the terms of the GNU Lesser General Public
>>>>>> +   License as published by the Free Software Foundation; either
>>>>>> +   version 2.1 of the License, or (at your option) any later version.
>>>>>> +
>>>>>> +   The GNU C Library is distributed in the hope that it will be useful,
>>>>>> +   but WITHOUT ANY WARRANTY; without even the implied warranty of
>>>>>> +   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
>>>>>> +   Lesser General Public License for more details.
>>>>>> +
>>>>>> +   You should have received a copy of the GNU Lesser General Public
>>>>>> +   License along with the GNU C Library; if not, see
>>>>>> +   <https://www.gnu.org/licenses/>.  */
>>>>>> +
>>>>>> +#include <string.h>
>>>>>> +
>>>>>> +#if IS_IN(libc)
>>>>>> +# define MEMCHR __memchr_zbb
>>>>>> +# undef libc_hidden_builtin_def
>>>>>> +# define libc_hidden_builtin_def(x)
>>>>>> +#endif
>>>>>> +/* Convince preprocessor to have Zbb instructions.  */
>>>>>> +#ifndef __riscv_zbb
>>>>>> +# define __riscv_zbb
>>>>>> +#endif
>>>>>
>>>>> Is there a way to specific the compiler to enable a extension, like aarch64
>>>>> -march=arch{+[no]feature}? I think ideally this should be enabled as CFLAGS
>>>>> instead of messing with compiler defined pre-processor.
>>>>
>>>> The tools expect a list of all extensions as parameter to the -march= option.
>>>> But there is no way to append extensions to an existing march string
>>>> on the command line.
>>>>
>>>> And if we would add this feature today, it would take many years until we could
>>>> use it here, because we want to remain compatible with old tools.
>>>> Or we enable the optimization only when being built with new tools, but that
>>>> adds even more complexity and build/test configurations.
>>>>
>>>> What we have is:
>>>> * Preprocessor (since forever): Extension test macros (__riscv_EXTENSION)
>>>> * Command line (since forever): -march=BASE_EXTENSIONLIST
>>>> * GAS (since Nov 21): .option arch, +EXTENSION (in combination with
>>>> option push/pop)
>>>> * GCC (since Nov 23): __attribute__((target("arch=+EXTENSION")))
>>>>
>>>> I was not sure about using __riscv_zbb as well, but I considered it safe within
>>>> ifdef tests that ensure the macro won't be set twice.
>>>> If that's a concern, I could change to use something like this:
>>>> #define __riscv_force_zbb
>>>> #include <impl.c>
>>>> #undef __riscv_force_zbb
>>>> ... and change string-fza.h like this:
>>>> #if defined(__riscv_zbb) || defined(__riscv_force_zbb)
>>>> // orc.b
>>>> #endif
>>>>
>>>> BR
>>>> Christoph
>>>
>>> Another options would to parse the current march and add the extension if required,
>>> something like:
>>>
>>> abi=$(riscv64-linux-gnu-gcc -Q --help=target | grep march | cut -d '=' -f2 | xargs)
>>> if [[ ! "$abi" =~ "_zbb" ]]
>>> then
>>>   abi="$abi"_zbb
>>> fi
>>
>> That alone likely won't do it, there's a bunch of ordering rules in the ISA string handling so we might get tripped up on them.  We've got a fairly relaxed version of the rules in GCC to try and match the various older rules, though, so it might be possible to make something similar work.
>>
>> We should probably just add some sort of -march=+zbb type argument.  IIRC Kito was going to do it at some point, not sure if he got around to it?
>
> I am just pointing this out because I think the way RISCV extension selection
> is currently implemented makes it awkward to provide ifunc implementation in
> a agnostic way (specially now that RISCV has dozens of extensions) without
> knowing the current target compiler is generating.
>
> Some other ABI allows to either specify a ISA/chip reference (like powerpc
> with -mcpu=powerX) or a ABI extension directly (like aarch64 with -march=+xxx).

We have -mcpu, but for RISC-V that's even more fragmented than -march 
(-mcpu essentially just rolls together -march and -mtune, and only 
allows a curated list).  The `-march=+`-type stuff has been a pretty 
common request, I think someone just needst oget around to actually 
implementing it.

>>> I don't have a strong preference, it is just that by not using the compiler flag
>>> we won't be able to either use the builtin (__builtin_riscv_orc_b_32) and/or get
>>> a possible better code generation from compiler.
>>
>> I think we'd likely get slightly better codgen from telling the compiler about the bitmanip extensions.  Maybe we want something like
>>
>>    diff --git a/string/memchr.c b/string/memchr.c
>>    index 08b5c41667..1b62dce8d8 100644
>>    --- a/string/memchr.c
>>    +++ b/string/memchr.c
>>    @@ -29,15 +29,19 @@
>>     # define __memchr MEMCHR
>>     #endif
>>       +#ifndef __MEMCHR_CODEGEN_ATTRIBUTE
>>    +#define __MEMCHR_CODEGEN_ATTRIBUTE
>>    +#endif
>>    +
>>     static __always_inline const char *
>>    -sadd (uintptr_t x, uintptr_t y)
>>    +sadd (uintptr_t x, uintptr_t y) __MEMCHR_CODEGEN_ATTRIBUTE
>>     {
>>       return (const char *)(y > UINTPTR_MAX - x ? UINTPTR_MAX : x + y);
>>     }
>>        /* Search no more than N bytes of S for C.  */
>>     void *
>>    -__memchr (void const *s, int c_in, size_t n)
>>    +__memchr (void const *s, int c_in, size_t n) __MEMCHR_CODEGEN_ATTRIBUTE
>>     {
>>       if (__glibc_unlikely (n == 0))
>>         return NULL;
>>
>> in the generic versions, so we can add a
>>
>>    #define __MEMCHR_CODEGEN_ATTRIBUTE __attribuet__((target("+zbb")))
>>
>> (or whatever the syntax is) to the Zbb-flavored versions of these routines?
>
> Yeah, this might work and it is clear than messing with compiler-defined
> macros.
>
>>
>> It might also be worth just jumping to the fast-misaligned versions for these routines, too --the slow-misaligned stuff is there for compatibility with old stuff (though memchr aligns the pointer, so it doesn't matter so much here).
>
> I was hopping that ABIs that would like to provide unaligned variants
> for mem* routines to improve the generic code, but it seems that for some
> it is easier to just add an assembly routine (as loongarch did).
>
> For memchr, I think it should be easy to provide a unaligned version.
> Something like (completely untested):
>
> /* Search no more than N bytes of S for C.  */
> void *
> __memchr (void const *s, int c_in, size_t n)
> {
>   if (__glibc_unlikely (n == 0))
>     return NULL;
>
> #ifdef USE_MEMCHR_UNALIGNED
>   /* Read the first word, but munge it so that bytes before the array
>      will not match goal.  */
>   const op_t *word_ptr = PTR_ALIGN_DOWN (s, sizeof (op_t));
>   uintptr_t s_int = (uintptr_t) s;
>
>   op_t word = *word_ptr;
>   op_t repeated_c = repeat_bytes (c_in);
>   /* Compute the address of the last byte taking in consideration possible
>      overflow.  */
>   const char *lbyte = sadd (s_int, n - 1);
>   /* And also the address of the word containing the last byte. */
>   const op_t *lword = (const op_t *) PTR_ALIGN_DOWN (lbyte, sizeof (op_t));
>
>   find_t mask = shift_find (find_eq_all (word, repeated_c), s_int);
>   if (mask != 0)
>     {
>       char *ret = (char *) s + index_first (mask);
>       return (ret <= lbyte) ? ret : NULL;
>     }
>   if (word_ptr == lword)
>     return NULL;
> #endif
>
>   word = *++word_ptr;
>   while (word_ptr != lword)
>     {
>       if (has_eq (word, repeated_c))
>         return (char *) word_ptr + index_first_eq (word, repeated_c);
>       word = *++word_ptr;
>     }
>
>   if (has_eq (word, repeated_c))
>     {
>       /* We found a match, but it might be in a byte past the end of the
>          array.  */
>       char *ret = (char *) word_ptr + index_first_eq (word, repeated_c);
>       if (ret <= lbyte)
>         return ret;
>     }
>   return NULL;
> }
>
>>
>>>>>> +#include <string/memchr.c>
>>>>>> diff --git a/sysdeps/unix/sysv/linux/riscv/multiarch/Makefile b/sysdeps/unix/sysv/linux/riscv/multiarch/Makefile
>>>>>> index fcef5659d4..5586d11c89 100644
>>>>>> --- a/sysdeps/unix/sysv/linux/riscv/multiarch/Makefile
>>>>>> +++ b/sysdeps/unix/sysv/linux/riscv/multiarch/Makefile
>>>>>> @@ -1,5 +1,8 @@
>>>>>>  ifeq ($(subdir),string)
>>>>>>  sysdep_routines += \
>>>>>> +  memchr \
>>>>>> +  memchr-generic \
>>>>>> +  memchr-zbb \
>>>>>>    memcpy \
>>>>>>    memcpy-generic \
>>>>>>    memcpy_noalignment \
>>>>>> diff --git a/sysdeps/unix/sysv/linux/riscv/multiarch/ifunc-impl-list.c b/sysdeps/unix/sysv/linux/riscv/multiarch/ifunc-impl-list.c
>>>>>> index 9f806d7a9e..7321144a32 100644
>>>>>> --- a/sysdeps/unix/sysv/linux/riscv/multiarch/ifunc-impl-list.c
>>>>>> +++ b/sysdeps/unix/sysv/linux/riscv/multiarch/ifunc-impl-list.c
>>>>>> @@ -20,19 +20,40 @@
>>>>>>  #include <string.h>
>>>>>>  #include <sys/hwprobe.h>
>>>>>>
>>>>>> +#define ARRAY_SIZE(A) (sizeof (A) / sizeof ((A)[0]))
>>>>>> +
>>>>>>  size_t
>>>>>>  __libc_ifunc_impl_list (const char *name, struct libc_ifunc_impl *array,
>>>>>>                       size_t max)
>>>>>>  {
>>>>>>    size_t i = max;
>>>>>> +  struct riscv_hwprobe pairs[] = {
>>>>>> +    { .key = RISCV_HWPROBE_KEY_IMA_EXT_0 },
>>>>>> +    { .key = RISCV_HWPROBE_KEY_CPUPERF_0 },
>>>>>> +  };
>>>>>>
>>>>>> +  bool has_zbb = false;
>>>>>>    bool fast_unaligned = false;
>>>>>>
>>>>>> -  struct riscv_hwprobe pair = { .key = RISCV_HWPROBE_KEY_CPUPERF_0 };
>>>>>> -  if (__riscv_hwprobe (&pair, 1, 0, NULL, 0) == 0
>>>>>> -      && (pair.value & RISCV_HWPROBE_MISALIGNED_MASK)
>>>>>> -          == RISCV_HWPROBE_MISALIGNED_FAST)
>>>>>> -    fast_unaligned = true;
>>>>>> +  if (__riscv_hwprobe (pairs, ARRAY_SIZE (pairs), 0, NULL, 0) == 0)
>>>>>> +    {
>>>>>> +      struct riscv_hwprobe *pair;
>>>>>> +
>>>>>> +      /* RISCV_HWPROBE_KEY_IMA_EXT_0  */
>>>>>> +      pair = &pairs[0];
>>>>>> +      if (pair->value & RISCV_HWPROBE_EXT_ZBB)
>>>>>> +        has_zbb = true;
>>>>>> +
>>>>>> +      /* RISCV_HWPROBE_KEY_CPUPERF_0  */
>>>>>> +      pair = &pairs[1];
>>>>>> +      if ((pair->value & RISCV_HWPROBE_MISALIGNED_MASK)
>>>>>> +        == RISCV_HWPROBE_MISALIGNED_FAST)
>>>>>> +        fast_unaligned = true;
>>>>>> +    }
>>>>>> +
>>>>>> +  IFUNC_IMPL (i, name, memchr,
>>>>>> +           IFUNC_IMPL_ADD (array, i, memchr, has_zbb, __memchr_zbb)
>>>>>> +           IFUNC_IMPL_ADD (array, i, memchr, 1, __memchr_generic))
>>>>>>
>>>>>>    IFUNC_IMPL (i, name, memcpy,
>>>>>>             IFUNC_IMPL_ADD (array, i, memcpy, fast_unaligned,
>>>>>> diff --git a/sysdeps/unix/sysv/linux/riscv/multiarch/memchr.c b/sysdeps/unix/sysv/linux/riscv/multiarch/memchr.c
>>>>>> new file mode 100644
>>>>>> index 0000000000..bc076cbf24
>>>>>> --- /dev/null
>>>>>> +++ b/sysdeps/unix/sysv/linux/riscv/multiarch/memchr.c
>>>>>> @@ -0,0 +1,57 @@
>>>>>> +/* Multiple versions of memchr.
>>>>>> +   All versions must be listed in ifunc-impl-list.c.
>>>>>> +   Copyright (C) 2017-2024 Free Software Foundation, Inc.
>>>>>> +   This file is part of the GNU C Library.
>>>>>> +
>>>>>> +   The GNU C Library is free software; you can redistribute it and/or
>>>>>> +   modify it under the terms of the GNU Lesser General Public
>>>>>> +   License as published by the Free Software Foundation; either
>>>>>> +   version 2.1 of the License, or (at your option) any later version.
>>>>>> +
>>>>>> +   The GNU C Library is distributed in the hope that it will be useful,
>>>>>> +   but WITHOUT ANY WARRANTY; without even the implied warranty of
>>>>>> +   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
>>>>>> +   Lesser General Public License for more details.
>>>>>> +
>>>>>> +   You should have received a copy of the GNU Lesser General Public
>>>>>> +   License along with the GNU C Library; if not, see
>>>>>> +   <https://www.gnu.org/licenses/>.  */
>>>>>> +
>>>>>> +#if IS_IN (libc)
>>>>>> +/* Redefine memchr so that the compiler won't complain about the type
>>>>>> +   mismatch with the IFUNC selector in strong_alias, below.  */
>>>>>> +# undef memchr
>>>>>> +# define memchr __redirect_memchr
>>>>>> +# include <stdint.h>
>>>>>> +# include <string.h>
>>>>>> +# include <ifunc-init.h>
>>>>>> +# include <riscv-ifunc.h>
>>>>>> +# include <sys/hwprobe.h>
>>>>>> +
>>>>>> +extern __typeof (__redirect_memchr) __libc_memchr;
>>>>>> +
>>>>>> +extern __typeof (__redirect_memchr) __memchr_generic attribute_hidden;
>>>>>> +extern __typeof (__redirect_memchr) __memchr_zbb attribute_hidden;
>>>>>> +
>>>>>> +static inline __typeof (__redirect_memchr) *
>>>>>> +select_memchr_ifunc (uint64_t dl_hwcap, __riscv_hwprobe_t hwprobe_func)
>>>>>> +{
>>>>>> +  unsigned long long int v;
>>>>>> +  if (__riscv_hwprobe_one (hwprobe_func, RISCV_HWPROBE_KEY_IMA_EXT_0, &v) == 0
>>>>>> +      && (v & RISCV_HWPROBE_EXT_ZBB))
>>>>>> +    return __memchr_zbb;
>>>>>> +
>>>>>> +  return __memchr_generic;
>>>>>> +}
>>>>>> +
>>>>>> +riscv_libc_ifunc (__libc_memchr, select_memchr_ifunc);
>>>>>> +
>>>>>> +# undef memchr
>>>>>> +strong_alias (__libc_memchr, memchr);
>>>>>> +# ifdef SHARED
>>>>>> +__hidden_ver1 (memchr, __GI_memchr, __redirect_memchr)
>>>>>> +  __attribute__ ((visibility ("hidden"))) __attribute_copy__ (memchr);
>>>>>> +# endif
>>>>>> +#else
>>>>>> +# include <string/memchr.c>
>>>>>> +#endif

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH 2/7] RISC-V: Add Zbb optimized memchr as ifunc
  2024-04-30 17:54             ` Palmer Dabbelt
@ 2024-04-30 18:44               ` Vineet Gupta
  2024-05-06 13:20               ` Christoph Müllner
  1 sibling, 0 replies; 20+ messages in thread
From: Vineet Gupta @ 2024-04-30 18:44 UTC (permalink / raw)
  To: Palmer Dabbelt, adhemerval.zanella
  Cc: christoph.muellner, libc-alpha, Darius Rad, Andrew Waterman,
	philipp.tomsich, Evan Green, DJ Delorie, kito.cheng, jeffreyalaw


On 4/30/24 10:54, Palmer Dabbelt wrote:
>>> That alone likely won't do it, there's a bunch of ordering rules in the ISA string handling so we might get tripped up on them.  We've got a fairly relaxed version of the rules in GCC to try and match the various older rules, though, so it might be possible to make something similar work.
>>>
>>> We should probably just add some sort of -march=+zbb type argument.  IIRC Kito was going to do it at some point, not sure if he got around to it?
>> I am just pointing this out because I think the way RISCV extension selection
>> is currently implemented makes it awkward to provide ifunc implementation in
>> a agnostic way (specially now that RISCV has dozens of extensions) without
>> knowing the current target compiler is generating.
>>
>> Some other ABI allows to either specify a ISA/chip reference (like powerpc
>> with -mcpu=powerX) or a ABI extension directly (like aarch64 with -march=+xxx).
> We have -mcpu, but for RISC-V that's even more fragmented than -march 
> (-mcpu essentially just rolls together -march and -mtune, and only 
> allows a curated list).  The `-march=+`-type stuff has been a pretty 
> common request, I think someone just needst oget around to actually 
> implementing it.

But the lack of this infra as of today, means downstream projects such
as glibc either need to wait for that to get merged and rely on a
specific gcc version or need to implement both ways of doing this.
Unfortunately that means as of today we have to use the non perfect
solutions.

-Vineet

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH 2/7] RISC-V: Add Zbb optimized memchr as ifunc
  2024-04-30 17:54             ` Palmer Dabbelt
  2024-04-30 18:44               ` Vineet Gupta
@ 2024-05-06 13:20               ` Christoph Müllner
  2024-05-06 13:32                 ` Kito Cheng
  1 sibling, 1 reply; 20+ messages in thread
From: Christoph Müllner @ 2024-05-06 13:20 UTC (permalink / raw)
  To: Palmer Dabbelt
  Cc: adhemerval.zanella, libc-alpha, Darius Rad, Andrew Waterman,
	philipp.tomsich, Evan Green, DJ Delorie, Vineet Gupta,
	kito.cheng, jeffreyalaw

On Tue, Apr 30, 2024 at 7:54 PM Palmer Dabbelt <palmer@dabbelt.com> wrote:
>
> On Tue, 30 Apr 2024 10:45:12 PDT (-0700), adhemerval.zanella@linaro.org wrote:
> >
> >
> > On 30/04/24 12:13, Palmer Dabbelt wrote:
> >> On Wed, 24 Apr 2024 06:36:43 PDT (-0700), adhemerval.zanella@linaro.org wrote:
> >>>
> >>>
> >>> On 24/04/24 10:16, Christoph Müllner wrote:
> >>>> On Wed, Apr 24, 2024 at 2:53 PM Adhemerval Zanella Netto
> >>>> <adhemerval.zanella@linaro.org> wrote:
> >>>>>
> >>>>>
> >>>>>
> >>>>> On 22/04/24 04:43, Christoph Müllner wrote:
> >>>>>> When building with Zbb enabled, memchr benefits from using orc.b in
> >>>>>> find_zero_all().  This patch changes the build system such, that a
> >>>>>> non-Zbb version as well as a Zbb version of this routine is built.
> >>>>>> Further, a ifunc resolver is provided that selects the right routine
> >>>>>> based on the outcome of extension probing via hwprobe().
> >>>>>>
> >>>>>> Signed-off-by: Christoph Müllner <christoph.muellner@vrull.eu>
> >>>>>> ---
> >>>>>>  sysdeps/riscv/multiarch/memchr-generic.c      | 26 +++++++++
> >>>>>>  sysdeps/riscv/multiarch/memchr-zbb.c          | 30 ++++++++++
> >>>>>>  .../unix/sysv/linux/riscv/multiarch/Makefile  |  3 +
> >>>>>>  .../linux/riscv/multiarch/ifunc-impl-list.c   | 31 ++++++++--
> >>>>>>  .../unix/sysv/linux/riscv/multiarch/memchr.c  | 57 +++++++++++++++++++
> >>>>>>  5 files changed, 142 insertions(+), 5 deletions(-)
> >>>>>>  create mode 100644 sysdeps/riscv/multiarch/memchr-generic.c
> >>>>>>  create mode 100644 sysdeps/riscv/multiarch/memchr-zbb.c
> >>>>>>  create mode 100644 sysdeps/unix/sysv/linux/riscv/multiarch/memchr.c
> >>>>>>
> >>>>>> diff --git a/sysdeps/riscv/multiarch/memchr-generic.c b/sysdeps/riscv/multiarch/memchr-generic.c
> >>>>>> new file mode 100644
> >>>>>> index 0000000000..a96c36398b
> >>>>>> --- /dev/null
> >>>>>> +++ b/sysdeps/riscv/multiarch/memchr-generic.c
> >>>>>> @@ -0,0 +1,26 @@
> >>>>>> +/* Re-include the default memchr implementation.
> >>>>>> +   Copyright (C) 2024 Free Software Foundation, Inc.
> >>>>>> +   This file is part of the GNU C Library.
> >>>>>> +
> >>>>>> +   The GNU C Library is free software; you can redistribute it and/or
> >>>>>> +   modify it under the terms of the GNU Lesser General Public
> >>>>>> +   License as published by the Free Software Foundation; either
> >>>>>> +   version 2.1 of the License, or (at your option) any later version.
> >>>>>> +
> >>>>>> +   The GNU C Library is distributed in the hope that it will be useful,
> >>>>>> +   but WITHOUT ANY WARRANTY; without even the implied warranty of
> >>>>>> +   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
> >>>>>> +   Lesser General Public License for more details.
> >>>>>> +
> >>>>>> +   You should have received a copy of the GNU Lesser General Public
> >>>>>> +   License along with the GNU C Library; if not, see
> >>>>>> +   <https://www.gnu.org/licenses/>.  */
> >>>>>> +
> >>>>>> +#include <string.h>
> >>>>>> +
> >>>>>> +#if IS_IN(libc)
> >>>>>> +# define MEMCHR __memchr_generic
> >>>>>> +# undef libc_hidden_builtin_def
> >>>>>> +# define libc_hidden_builtin_def(x)
> >>>>>> +#endif
> >>>>>> +#include <string/memchr.c>
> >>>>>> diff --git a/sysdeps/riscv/multiarch/memchr-zbb.c b/sysdeps/riscv/multiarch/memchr-zbb.c
> >>>>>> new file mode 100644
> >>>>>> index 0000000000..bead0335ae
> >>>>>> --- /dev/null
> >>>>>> +++ b/sysdeps/riscv/multiarch/memchr-zbb.c
> >>>>>> @@ -0,0 +1,30 @@
> >>>>>> +/* Re-include the default memchr implementation for Zbb.
> >>>>>> +   Copyright (C) 2024 Free Software Foundation, Inc.
> >>>>>> +   This file is part of the GNU C Library.
> >>>>>> +
> >>>>>> +   The GNU C Library is free software; you can redistribute it and/or
> >>>>>> +   modify it under the terms of the GNU Lesser General Public
> >>>>>> +   License as published by the Free Software Foundation; either
> >>>>>> +   version 2.1 of the License, or (at your option) any later version.
> >>>>>> +
> >>>>>> +   The GNU C Library is distributed in the hope that it will be useful,
> >>>>>> +   but WITHOUT ANY WARRANTY; without even the implied warranty of
> >>>>>> +   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
> >>>>>> +   Lesser General Public License for more details.
> >>>>>> +
> >>>>>> +   You should have received a copy of the GNU Lesser General Public
> >>>>>> +   License along with the GNU C Library; if not, see
> >>>>>> +   <https://www.gnu.org/licenses/>.  */
> >>>>>> +
> >>>>>> +#include <string.h>
> >>>>>> +
> >>>>>> +#if IS_IN(libc)
> >>>>>> +# define MEMCHR __memchr_zbb
> >>>>>> +# undef libc_hidden_builtin_def
> >>>>>> +# define libc_hidden_builtin_def(x)
> >>>>>> +#endif
> >>>>>> +/* Convince preprocessor to have Zbb instructions.  */
> >>>>>> +#ifndef __riscv_zbb
> >>>>>> +# define __riscv_zbb
> >>>>>> +#endif
> >>>>>
> >>>>> Is there a way to specific the compiler to enable a extension, like aarch64
> >>>>> -march=arch{+[no]feature}? I think ideally this should be enabled as CFLAGS
> >>>>> instead of messing with compiler defined pre-processor.
> >>>>
> >>>> The tools expect a list of all extensions as parameter to the -march= option.
> >>>> But there is no way to append extensions to an existing march string
> >>>> on the command line.
> >>>>
> >>>> And if we would add this feature today, it would take many years until we could
> >>>> use it here, because we want to remain compatible with old tools.
> >>>> Or we enable the optimization only when being built with new tools, but that
> >>>> adds even more complexity and build/test configurations.
> >>>>
> >>>> What we have is:
> >>>> * Preprocessor (since forever): Extension test macros (__riscv_EXTENSION)
> >>>> * Command line (since forever): -march=BASE_EXTENSIONLIST
> >>>> * GAS (since Nov 21): .option arch, +EXTENSION (in combination with
> >>>> option push/pop)
> >>>> * GCC (since Nov 23): __attribute__((target("arch=+EXTENSION")))
> >>>>
> >>>> I was not sure about using __riscv_zbb as well, but I considered it safe within
> >>>> ifdef tests that ensure the macro won't be set twice.
> >>>> If that's a concern, I could change to use something like this:
> >>>> #define __riscv_force_zbb
> >>>> #include <impl.c>
> >>>> #undef __riscv_force_zbb
> >>>> ... and change string-fza.h like this:
> >>>> #if defined(__riscv_zbb) || defined(__riscv_force_zbb)
> >>>> // orc.b
> >>>> #endif
> >>>>
> >>>> BR
> >>>> Christoph
> >>>
> >>> Another options would to parse the current march and add the extension if required,
> >>> something like:
> >>>
> >>> abi=$(riscv64-linux-gnu-gcc -Q --help=target | grep march | cut -d '=' -f2 | xargs)
> >>> if [[ ! "$abi" =~ "_zbb" ]]
> >>> then
> >>>   abi="$abi"_zbb
> >>> fi
> >>
> >> That alone likely won't do it, there's a bunch of ordering rules in the ISA string handling so we might get tripped up on them.  We've got a fairly relaxed version of the rules in GCC to try and match the various older rules, though, so it might be possible to make something similar work.
> >>
> >> We should probably just add some sort of -march=+zbb type argument.  IIRC Kito was going to do it at some point, not sure if he got around to it?
> >
> > I am just pointing this out because I think the way RISCV extension selection
> > is currently implemented makes it awkward to provide ifunc implementation in
> > a agnostic way (specially now that RISCV has dozens of extensions) without
> > knowing the current target compiler is generating.
> >
> > Some other ABI allows to either specify a ISA/chip reference (like powerpc
> > with -mcpu=powerX) or a ABI extension directly (like aarch64 with -march=+xxx).
>
> We have -mcpu, but for RISC-V that's even more fragmented than -march
> (-mcpu essentially just rolls together -march and -mtune, and only
> allows a curated list).  The `-march=+`-type stuff has been a pretty
> common request, I think someone just needst oget around to actually
> implementing it.

I agree that `-march=+` would be the best solution.
As we don't have it, I've created a PR for the riscv-toolchain-conventions:
  https://github.com/riscv-non-isa/riscv-toolchain-conventions/pull/47
This will have to depend on GCC 15+ (assuming there will be consensus
in the next 10 months about it).

Thanks for the proposal with the function-attributes.
I'm working on a patchset for that and will send it out after some testing.
This will have to depend on GCC 14+.

> >>> I don't have a strong preference, it is just that by not using the compiler flag
> >>> we won't be able to either use the builtin (__builtin_riscv_orc_b_32) and/or get
> >>> a possible better code generation from compiler.
> >>
> >> I think we'd likely get slightly better codgen from telling the compiler about the bitmanip extensions.  Maybe we want something like
> >>
> >>    diff --git a/string/memchr.c b/string/memchr.c
> >>    index 08b5c41667..1b62dce8d8 100644
> >>    --- a/string/memchr.c
> >>    +++ b/string/memchr.c
> >>    @@ -29,15 +29,19 @@
> >>     # define __memchr MEMCHR
> >>     #endif
> >>       +#ifndef __MEMCHR_CODEGEN_ATTRIBUTE
> >>    +#define __MEMCHR_CODEGEN_ATTRIBUTE
> >>    +#endif
> >>    +
> >>     static __always_inline const char *
> >>    -sadd (uintptr_t x, uintptr_t y)
> >>    +sadd (uintptr_t x, uintptr_t y) __MEMCHR_CODEGEN_ATTRIBUTE
> >>     {
> >>       return (const char *)(y > UINTPTR_MAX - x ? UINTPTR_MAX : x + y);
> >>     }
> >>        /* Search no more than N bytes of S for C.  */
> >>     void *
> >>    -__memchr (void const *s, int c_in, size_t n)
> >>    +__memchr (void const *s, int c_in, size_t n) __MEMCHR_CODEGEN_ATTRIBUTE
> >>     {
> >>       if (__glibc_unlikely (n == 0))
> >>         return NULL;
> >>
> >> in the generic versions, so we can add a
> >>
> >>    #define __MEMCHR_CODEGEN_ATTRIBUTE __attribuet__((target("+zbb")))
> >>
> >> (or whatever the syntax is) to the Zbb-flavored versions of these routines?
> >
> > Yeah, this might work and it is clear than messing with compiler-defined
> > macros.
> >
> >>
> >> It might also be worth just jumping to the fast-misaligned versions for these routines, too --the slow-misaligned stuff is there for compatibility with old stuff (though memchr aligns the pointer, so it doesn't matter so much here).
> >
> > I was hopping that ABIs that would like to provide unaligned variants
> > for mem* routines to improve the generic code, but it seems that for some
> > it is easier to just add an assembly routine (as loongarch did).
> >
> > For memchr, I think it should be easy to provide a unaligned version.
> > Something like (completely untested):
> >
> > /* Search no more than N bytes of S for C.  */
> > void *
> > __memchr (void const *s, int c_in, size_t n)
> > {
> >   if (__glibc_unlikely (n == 0))
> >     return NULL;
> >
> > #ifdef USE_MEMCHR_UNALIGNED
> >   /* Read the first word, but munge it so that bytes before the array
> >      will not match goal.  */
> >   const op_t *word_ptr = PTR_ALIGN_DOWN (s, sizeof (op_t));
> >   uintptr_t s_int = (uintptr_t) s;
> >
> >   op_t word = *word_ptr;
> >   op_t repeated_c = repeat_bytes (c_in);
> >   /* Compute the address of the last byte taking in consideration possible
> >      overflow.  */
> >   const char *lbyte = sadd (s_int, n - 1);
> >   /* And also the address of the word containing the last byte. */
> >   const op_t *lword = (const op_t *) PTR_ALIGN_DOWN (lbyte, sizeof (op_t));
> >
> >   find_t mask = shift_find (find_eq_all (word, repeated_c), s_int);
> >   if (mask != 0)
> >     {
> >       char *ret = (char *) s + index_first (mask);
> >       return (ret <= lbyte) ? ret : NULL;
> >     }
> >   if (word_ptr == lword)
> >     return NULL;
> > #endif
> >
> >   word = *++word_ptr;
> >   while (word_ptr != lword)
> >     {
> >       if (has_eq (word, repeated_c))
> >         return (char *) word_ptr + index_first_eq (word, repeated_c);
> >       word = *++word_ptr;
> >     }
> >
> >   if (has_eq (word, repeated_c))
> >     {
> >       /* We found a match, but it might be in a byte past the end of the
> >          array.  */
> >       char *ret = (char *) word_ptr + index_first_eq (word, repeated_c);
> >       if (ret <= lbyte)
> >         return ret;
> >     }
> >   return NULL;
> > }
> >
> >>
> >>>>>> +#include <string/memchr.c>
> >>>>>> diff --git a/sysdeps/unix/sysv/linux/riscv/multiarch/Makefile b/sysdeps/unix/sysv/linux/riscv/multiarch/Makefile
> >>>>>> index fcef5659d4..5586d11c89 100644
> >>>>>> --- a/sysdeps/unix/sysv/linux/riscv/multiarch/Makefile
> >>>>>> +++ b/sysdeps/unix/sysv/linux/riscv/multiarch/Makefile
> >>>>>> @@ -1,5 +1,8 @@
> >>>>>>  ifeq ($(subdir),string)
> >>>>>>  sysdep_routines += \
> >>>>>> +  memchr \
> >>>>>> +  memchr-generic \
> >>>>>> +  memchr-zbb \
> >>>>>>    memcpy \
> >>>>>>    memcpy-generic \
> >>>>>>    memcpy_noalignment \
> >>>>>> diff --git a/sysdeps/unix/sysv/linux/riscv/multiarch/ifunc-impl-list.c b/sysdeps/unix/sysv/linux/riscv/multiarch/ifunc-impl-list.c
> >>>>>> index 9f806d7a9e..7321144a32 100644
> >>>>>> --- a/sysdeps/unix/sysv/linux/riscv/multiarch/ifunc-impl-list.c
> >>>>>> +++ b/sysdeps/unix/sysv/linux/riscv/multiarch/ifunc-impl-list.c
> >>>>>> @@ -20,19 +20,40 @@
> >>>>>>  #include <string.h>
> >>>>>>  #include <sys/hwprobe.h>
> >>>>>>
> >>>>>> +#define ARRAY_SIZE(A) (sizeof (A) / sizeof ((A)[0]))
> >>>>>> +
> >>>>>>  size_t
> >>>>>>  __libc_ifunc_impl_list (const char *name, struct libc_ifunc_impl *array,
> >>>>>>                       size_t max)
> >>>>>>  {
> >>>>>>    size_t i = max;
> >>>>>> +  struct riscv_hwprobe pairs[] = {
> >>>>>> +    { .key = RISCV_HWPROBE_KEY_IMA_EXT_0 },
> >>>>>> +    { .key = RISCV_HWPROBE_KEY_CPUPERF_0 },
> >>>>>> +  };
> >>>>>>
> >>>>>> +  bool has_zbb = false;
> >>>>>>    bool fast_unaligned = false;
> >>>>>>
> >>>>>> -  struct riscv_hwprobe pair = { .key = RISCV_HWPROBE_KEY_CPUPERF_0 };
> >>>>>> -  if (__riscv_hwprobe (&pair, 1, 0, NULL, 0) == 0
> >>>>>> -      && (pair.value & RISCV_HWPROBE_MISALIGNED_MASK)
> >>>>>> -          == RISCV_HWPROBE_MISALIGNED_FAST)
> >>>>>> -    fast_unaligned = true;
> >>>>>> +  if (__riscv_hwprobe (pairs, ARRAY_SIZE (pairs), 0, NULL, 0) == 0)
> >>>>>> +    {
> >>>>>> +      struct riscv_hwprobe *pair;
> >>>>>> +
> >>>>>> +      /* RISCV_HWPROBE_KEY_IMA_EXT_0  */
> >>>>>> +      pair = &pairs[0];
> >>>>>> +      if (pair->value & RISCV_HWPROBE_EXT_ZBB)
> >>>>>> +        has_zbb = true;
> >>>>>> +
> >>>>>> +      /* RISCV_HWPROBE_KEY_CPUPERF_0  */
> >>>>>> +      pair = &pairs[1];
> >>>>>> +      if ((pair->value & RISCV_HWPROBE_MISALIGNED_MASK)
> >>>>>> +        == RISCV_HWPROBE_MISALIGNED_FAST)
> >>>>>> +        fast_unaligned = true;
> >>>>>> +    }
> >>>>>> +
> >>>>>> +  IFUNC_IMPL (i, name, memchr,
> >>>>>> +           IFUNC_IMPL_ADD (array, i, memchr, has_zbb, __memchr_zbb)
> >>>>>> +           IFUNC_IMPL_ADD (array, i, memchr, 1, __memchr_generic))
> >>>>>>
> >>>>>>    IFUNC_IMPL (i, name, memcpy,
> >>>>>>             IFUNC_IMPL_ADD (array, i, memcpy, fast_unaligned,
> >>>>>> diff --git a/sysdeps/unix/sysv/linux/riscv/multiarch/memchr.c b/sysdeps/unix/sysv/linux/riscv/multiarch/memchr.c
> >>>>>> new file mode 100644
> >>>>>> index 0000000000..bc076cbf24
> >>>>>> --- /dev/null
> >>>>>> +++ b/sysdeps/unix/sysv/linux/riscv/multiarch/memchr.c
> >>>>>> @@ -0,0 +1,57 @@
> >>>>>> +/* Multiple versions of memchr.
> >>>>>> +   All versions must be listed in ifunc-impl-list.c.
> >>>>>> +   Copyright (C) 2017-2024 Free Software Foundation, Inc.
> >>>>>> +   This file is part of the GNU C Library.
> >>>>>> +
> >>>>>> +   The GNU C Library is free software; you can redistribute it and/or
> >>>>>> +   modify it under the terms of the GNU Lesser General Public
> >>>>>> +   License as published by the Free Software Foundation; either
> >>>>>> +   version 2.1 of the License, or (at your option) any later version.
> >>>>>> +
> >>>>>> +   The GNU C Library is distributed in the hope that it will be useful,
> >>>>>> +   but WITHOUT ANY WARRANTY; without even the implied warranty of
> >>>>>> +   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
> >>>>>> +   Lesser General Public License for more details.
> >>>>>> +
> >>>>>> +   You should have received a copy of the GNU Lesser General Public
> >>>>>> +   License along with the GNU C Library; if not, see
> >>>>>> +   <https://www.gnu.org/licenses/>.  */
> >>>>>> +
> >>>>>> +#if IS_IN (libc)
> >>>>>> +/* Redefine memchr so that the compiler won't complain about the type
> >>>>>> +   mismatch with the IFUNC selector in strong_alias, below.  */
> >>>>>> +# undef memchr
> >>>>>> +# define memchr __redirect_memchr
> >>>>>> +# include <stdint.h>
> >>>>>> +# include <string.h>
> >>>>>> +# include <ifunc-init.h>
> >>>>>> +# include <riscv-ifunc.h>
> >>>>>> +# include <sys/hwprobe.h>
> >>>>>> +
> >>>>>> +extern __typeof (__redirect_memchr) __libc_memchr;
> >>>>>> +
> >>>>>> +extern __typeof (__redirect_memchr) __memchr_generic attribute_hidden;
> >>>>>> +extern __typeof (__redirect_memchr) __memchr_zbb attribute_hidden;
> >>>>>> +
> >>>>>> +static inline __typeof (__redirect_memchr) *
> >>>>>> +select_memchr_ifunc (uint64_t dl_hwcap, __riscv_hwprobe_t hwprobe_func)
> >>>>>> +{
> >>>>>> +  unsigned long long int v;
> >>>>>> +  if (__riscv_hwprobe_one (hwprobe_func, RISCV_HWPROBE_KEY_IMA_EXT_0, &v) == 0
> >>>>>> +      && (v & RISCV_HWPROBE_EXT_ZBB))
> >>>>>> +    return __memchr_zbb;
> >>>>>> +
> >>>>>> +  return __memchr_generic;
> >>>>>> +}
> >>>>>> +
> >>>>>> +riscv_libc_ifunc (__libc_memchr, select_memchr_ifunc);
> >>>>>> +
> >>>>>> +# undef memchr
> >>>>>> +strong_alias (__libc_memchr, memchr);
> >>>>>> +# ifdef SHARED
> >>>>>> +__hidden_ver1 (memchr, __GI_memchr, __redirect_memchr)
> >>>>>> +  __attribute__ ((visibility ("hidden"))) __attribute_copy__ (memchr);
> >>>>>> +# endif
> >>>>>> +#else
> >>>>>> +# include <string/memchr.c>
> >>>>>> +#endif

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH 2/7] RISC-V: Add Zbb optimized memchr as ifunc
  2024-05-06 13:20               ` Christoph Müllner
@ 2024-05-06 13:32                 ` Kito Cheng
  2024-05-06 13:46                   ` Christoph Müllner
  0 siblings, 1 reply; 20+ messages in thread
From: Kito Cheng @ 2024-05-06 13:32 UTC (permalink / raw)
  To: Christoph Müllner
  Cc: Palmer Dabbelt, adhemerval.zanella, libc-alpha, Darius Rad,
	Andrew Waterman, philipp.tomsich, Evan Green, DJ Delorie,
	Vineet Gupta, jeffreyalaw

`target` attribute is provided by GCC 14, and `-march=+ext` will at
least require GCC 15 (if we add),
also we relaxed the canonical order requirement for `-march=`, that
means we can relatively easy to manipulate the ISA
string by just concat that? so I don't really think we need  `-march=+ext`.

On Mon, May 6, 2024 at 9:20 PM Christoph Müllner
<christoph.muellner@vrull.eu> wrote:
>
> On Tue, Apr 30, 2024 at 7:54 PM Palmer Dabbelt <palmer@dabbelt.com> wrote:
> >
> > On Tue, 30 Apr 2024 10:45:12 PDT (-0700), adhemerval.zanella@linaro.org wrote:
> > >
> > >
> > > On 30/04/24 12:13, Palmer Dabbelt wrote:
> > >> On Wed, 24 Apr 2024 06:36:43 PDT (-0700), adhemerval.zanella@linaro.org wrote:
> > >>>
> > >>>
> > >>> On 24/04/24 10:16, Christoph Müllner wrote:
> > >>>> On Wed, Apr 24, 2024 at 2:53 PM Adhemerval Zanella Netto
> > >>>> <adhemerval.zanella@linaro.org> wrote:
> > >>>>>
> > >>>>>
> > >>>>>
> > >>>>> On 22/04/24 04:43, Christoph Müllner wrote:
> > >>>>>> When building with Zbb enabled, memchr benefits from using orc.b in
> > >>>>>> find_zero_all().  This patch changes the build system such, that a
> > >>>>>> non-Zbb version as well as a Zbb version of this routine is built.
> > >>>>>> Further, a ifunc resolver is provided that selects the right routine
> > >>>>>> based on the outcome of extension probing via hwprobe().
> > >>>>>>
> > >>>>>> Signed-off-by: Christoph Müllner <christoph.muellner@vrull.eu>
> > >>>>>> ---
> > >>>>>>  sysdeps/riscv/multiarch/memchr-generic.c      | 26 +++++++++
> > >>>>>>  sysdeps/riscv/multiarch/memchr-zbb.c          | 30 ++++++++++
> > >>>>>>  .../unix/sysv/linux/riscv/multiarch/Makefile  |  3 +
> > >>>>>>  .../linux/riscv/multiarch/ifunc-impl-list.c   | 31 ++++++++--
> > >>>>>>  .../unix/sysv/linux/riscv/multiarch/memchr.c  | 57 +++++++++++++++++++
> > >>>>>>  5 files changed, 142 insertions(+), 5 deletions(-)
> > >>>>>>  create mode 100644 sysdeps/riscv/multiarch/memchr-generic.c
> > >>>>>>  create mode 100644 sysdeps/riscv/multiarch/memchr-zbb.c
> > >>>>>>  create mode 100644 sysdeps/unix/sysv/linux/riscv/multiarch/memchr.c
> > >>>>>>
> > >>>>>> diff --git a/sysdeps/riscv/multiarch/memchr-generic.c b/sysdeps/riscv/multiarch/memchr-generic.c
> > >>>>>> new file mode 100644
> > >>>>>> index 0000000000..a96c36398b
> > >>>>>> --- /dev/null
> > >>>>>> +++ b/sysdeps/riscv/multiarch/memchr-generic.c
> > >>>>>> @@ -0,0 +1,26 @@
> > >>>>>> +/* Re-include the default memchr implementation.
> > >>>>>> +   Copyright (C) 2024 Free Software Foundation, Inc.
> > >>>>>> +   This file is part of the GNU C Library.
> > >>>>>> +
> > >>>>>> +   The GNU C Library is free software; you can redistribute it and/or
> > >>>>>> +   modify it under the terms of the GNU Lesser General Public
> > >>>>>> +   License as published by the Free Software Foundation; either
> > >>>>>> +   version 2.1 of the License, or (at your option) any later version.
> > >>>>>> +
> > >>>>>> +   The GNU C Library is distributed in the hope that it will be useful,
> > >>>>>> +   but WITHOUT ANY WARRANTY; without even the implied warranty of
> > >>>>>> +   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
> > >>>>>> +   Lesser General Public License for more details.
> > >>>>>> +
> > >>>>>> +   You should have received a copy of the GNU Lesser General Public
> > >>>>>> +   License along with the GNU C Library; if not, see
> > >>>>>> +   <https://www.gnu.org/licenses/>.  */
> > >>>>>> +
> > >>>>>> +#include <string.h>
> > >>>>>> +
> > >>>>>> +#if IS_IN(libc)
> > >>>>>> +# define MEMCHR __memchr_generic
> > >>>>>> +# undef libc_hidden_builtin_def
> > >>>>>> +# define libc_hidden_builtin_def(x)
> > >>>>>> +#endif
> > >>>>>> +#include <string/memchr.c>
> > >>>>>> diff --git a/sysdeps/riscv/multiarch/memchr-zbb.c b/sysdeps/riscv/multiarch/memchr-zbb.c
> > >>>>>> new file mode 100644
> > >>>>>> index 0000000000..bead0335ae
> > >>>>>> --- /dev/null
> > >>>>>> +++ b/sysdeps/riscv/multiarch/memchr-zbb.c
> > >>>>>> @@ -0,0 +1,30 @@
> > >>>>>> +/* Re-include the default memchr implementation for Zbb.
> > >>>>>> +   Copyright (C) 2024 Free Software Foundation, Inc.
> > >>>>>> +   This file is part of the GNU C Library.
> > >>>>>> +
> > >>>>>> +   The GNU C Library is free software; you can redistribute it and/or
> > >>>>>> +   modify it under the terms of the GNU Lesser General Public
> > >>>>>> +   License as published by the Free Software Foundation; either
> > >>>>>> +   version 2.1 of the License, or (at your option) any later version.
> > >>>>>> +
> > >>>>>> +   The GNU C Library is distributed in the hope that it will be useful,
> > >>>>>> +   but WITHOUT ANY WARRANTY; without even the implied warranty of
> > >>>>>> +   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
> > >>>>>> +   Lesser General Public License for more details.
> > >>>>>> +
> > >>>>>> +   You should have received a copy of the GNU Lesser General Public
> > >>>>>> +   License along with the GNU C Library; if not, see
> > >>>>>> +   <https://www.gnu.org/licenses/>.  */
> > >>>>>> +
> > >>>>>> +#include <string.h>
> > >>>>>> +
> > >>>>>> +#if IS_IN(libc)
> > >>>>>> +# define MEMCHR __memchr_zbb
> > >>>>>> +# undef libc_hidden_builtin_def
> > >>>>>> +# define libc_hidden_builtin_def(x)
> > >>>>>> +#endif
> > >>>>>> +/* Convince preprocessor to have Zbb instructions.  */
> > >>>>>> +#ifndef __riscv_zbb
> > >>>>>> +# define __riscv_zbb
> > >>>>>> +#endif
> > >>>>>
> > >>>>> Is there a way to specific the compiler to enable a extension, like aarch64
> > >>>>> -march=arch{+[no]feature}? I think ideally this should be enabled as CFLAGS
> > >>>>> instead of messing with compiler defined pre-processor.
> > >>>>
> > >>>> The tools expect a list of all extensions as parameter to the -march= option.
> > >>>> But there is no way to append extensions to an existing march string
> > >>>> on the command line.
> > >>>>
> > >>>> And if we would add this feature today, it would take many years until we could
> > >>>> use it here, because we want to remain compatible with old tools.
> > >>>> Or we enable the optimization only when being built with new tools, but that
> > >>>> adds even more complexity and build/test configurations.
> > >>>>
> > >>>> What we have is:
> > >>>> * Preprocessor (since forever): Extension test macros (__riscv_EXTENSION)
> > >>>> * Command line (since forever): -march=BASE_EXTENSIONLIST
> > >>>> * GAS (since Nov 21): .option arch, +EXTENSION (in combination with
> > >>>> option push/pop)
> > >>>> * GCC (since Nov 23): __attribute__((target("arch=+EXTENSION")))
> > >>>>
> > >>>> I was not sure about using __riscv_zbb as well, but I considered it safe within
> > >>>> ifdef tests that ensure the macro won't be set twice.
> > >>>> If that's a concern, I could change to use something like this:
> > >>>> #define __riscv_force_zbb
> > >>>> #include <impl.c>
> > >>>> #undef __riscv_force_zbb
> > >>>> ... and change string-fza.h like this:
> > >>>> #if defined(__riscv_zbb) || defined(__riscv_force_zbb)
> > >>>> // orc.b
> > >>>> #endif
> > >>>>
> > >>>> BR
> > >>>> Christoph
> > >>>
> > >>> Another options would to parse the current march and add the extension if required,
> > >>> something like:
> > >>>
> > >>> abi=$(riscv64-linux-gnu-gcc -Q --help=target | grep march | cut -d '=' -f2 | xargs)
> > >>> if [[ ! "$abi" =~ "_zbb" ]]
> > >>> then
> > >>>   abi="$abi"_zbb
> > >>> fi
> > >>
> > >> That alone likely won't do it, there's a bunch of ordering rules in the ISA string handling so we might get tripped up on them.  We've got a fairly relaxed version of the rules in GCC to try and match the various older rules, though, so it might be possible to make something similar work.
> > >>
> > >> We should probably just add some sort of -march=+zbb type argument.  IIRC Kito was going to do it at some point, not sure if he got around to it?
> > >
> > > I am just pointing this out because I think the way RISCV extension selection
> > > is currently implemented makes it awkward to provide ifunc implementation in
> > > a agnostic way (specially now that RISCV has dozens of extensions) without
> > > knowing the current target compiler is generating.
> > >
> > > Some other ABI allows to either specify a ISA/chip reference (like powerpc
> > > with -mcpu=powerX) or a ABI extension directly (like aarch64 with -march=+xxx).
> >
> > We have -mcpu, but for RISC-V that's even more fragmented than -march
> > (-mcpu essentially just rolls together -march and -mtune, and only
> > allows a curated list).  The `-march=+`-type stuff has been a pretty
> > common request, I think someone just needst oget around to actually
> > implementing it.
>
> I agree that `-march=+` would be the best solution.
> As we don't have it, I've created a PR for the riscv-toolchain-conventions:
>   https://github.com/riscv-non-isa/riscv-toolchain-conventions/pull/47
> This will have to depend on GCC 15+ (assuming there will be consensus
> in the next 10 months about it).
>
> Thanks for the proposal with the function-attributes.
> I'm working on a patchset for that and will send it out after some testing.
> This will have to depend on GCC 14+.
>
> > >>> I don't have a strong preference, it is just that by not using the compiler flag
> > >>> we won't be able to either use the builtin (__builtin_riscv_orc_b_32) and/or get
> > >>> a possible better code generation from compiler.
> > >>
> > >> I think we'd likely get slightly better codgen from telling the compiler about the bitmanip extensions.  Maybe we want something like
> > >>
> > >>    diff --git a/string/memchr.c b/string/memchr.c
> > >>    index 08b5c41667..1b62dce8d8 100644
> > >>    --- a/string/memchr.c
> > >>    +++ b/string/memchr.c
> > >>    @@ -29,15 +29,19 @@
> > >>     # define __memchr MEMCHR
> > >>     #endif
> > >>       +#ifndef __MEMCHR_CODEGEN_ATTRIBUTE
> > >>    +#define __MEMCHR_CODEGEN_ATTRIBUTE
> > >>    +#endif
> > >>    +
> > >>     static __always_inline const char *
> > >>    -sadd (uintptr_t x, uintptr_t y)
> > >>    +sadd (uintptr_t x, uintptr_t y) __MEMCHR_CODEGEN_ATTRIBUTE
> > >>     {
> > >>       return (const char *)(y > UINTPTR_MAX - x ? UINTPTR_MAX : x + y);
> > >>     }
> > >>        /* Search no more than N bytes of S for C.  */
> > >>     void *
> > >>    -__memchr (void const *s, int c_in, size_t n)
> > >>    +__memchr (void const *s, int c_in, size_t n) __MEMCHR_CODEGEN_ATTRIBUTE
> > >>     {
> > >>       if (__glibc_unlikely (n == 0))
> > >>         return NULL;
> > >>
> > >> in the generic versions, so we can add a
> > >>
> > >>    #define __MEMCHR_CODEGEN_ATTRIBUTE __attribuet__((target("+zbb")))
> > >>
> > >> (or whatever the syntax is) to the Zbb-flavored versions of these routines?
> > >
> > > Yeah, this might work and it is clear than messing with compiler-defined
> > > macros.
> > >
> > >>
> > >> It might also be worth just jumping to the fast-misaligned versions for these routines, too --the slow-misaligned stuff is there for compatibility with old stuff (though memchr aligns the pointer, so it doesn't matter so much here).
> > >
> > > I was hopping that ABIs that would like to provide unaligned variants
> > > for mem* routines to improve the generic code, but it seems that for some
> > > it is easier to just add an assembly routine (as loongarch did).
> > >
> > > For memchr, I think it should be easy to provide a unaligned version.
> > > Something like (completely untested):
> > >
> > > /* Search no more than N bytes of S for C.  */
> > > void *
> > > __memchr (void const *s, int c_in, size_t n)
> > > {
> > >   if (__glibc_unlikely (n == 0))
> > >     return NULL;
> > >
> > > #ifdef USE_MEMCHR_UNALIGNED
> > >   /* Read the first word, but munge it so that bytes before the array
> > >      will not match goal.  */
> > >   const op_t *word_ptr = PTR_ALIGN_DOWN (s, sizeof (op_t));
> > >   uintptr_t s_int = (uintptr_t) s;
> > >
> > >   op_t word = *word_ptr;
> > >   op_t repeated_c = repeat_bytes (c_in);
> > >   /* Compute the address of the last byte taking in consideration possible
> > >      overflow.  */
> > >   const char *lbyte = sadd (s_int, n - 1);
> > >   /* And also the address of the word containing the last byte. */
> > >   const op_t *lword = (const op_t *) PTR_ALIGN_DOWN (lbyte, sizeof (op_t));
> > >
> > >   find_t mask = shift_find (find_eq_all (word, repeated_c), s_int);
> > >   if (mask != 0)
> > >     {
> > >       char *ret = (char *) s + index_first (mask);
> > >       return (ret <= lbyte) ? ret : NULL;
> > >     }
> > >   if (word_ptr == lword)
> > >     return NULL;
> > > #endif
> > >
> > >   word = *++word_ptr;
> > >   while (word_ptr != lword)
> > >     {
> > >       if (has_eq (word, repeated_c))
> > >         return (char *) word_ptr + index_first_eq (word, repeated_c);
> > >       word = *++word_ptr;
> > >     }
> > >
> > >   if (has_eq (word, repeated_c))
> > >     {
> > >       /* We found a match, but it might be in a byte past the end of the
> > >          array.  */
> > >       char *ret = (char *) word_ptr + index_first_eq (word, repeated_c);
> > >       if (ret <= lbyte)
> > >         return ret;
> > >     }
> > >   return NULL;
> > > }
> > >
> > >>
> > >>>>>> +#include <string/memchr.c>
> > >>>>>> diff --git a/sysdeps/unix/sysv/linux/riscv/multiarch/Makefile b/sysdeps/unix/sysv/linux/riscv/multiarch/Makefile
> > >>>>>> index fcef5659d4..5586d11c89 100644
> > >>>>>> --- a/sysdeps/unix/sysv/linux/riscv/multiarch/Makefile
> > >>>>>> +++ b/sysdeps/unix/sysv/linux/riscv/multiarch/Makefile
> > >>>>>> @@ -1,5 +1,8 @@
> > >>>>>>  ifeq ($(subdir),string)
> > >>>>>>  sysdep_routines += \
> > >>>>>> +  memchr \
> > >>>>>> +  memchr-generic \
> > >>>>>> +  memchr-zbb \
> > >>>>>>    memcpy \
> > >>>>>>    memcpy-generic \
> > >>>>>>    memcpy_noalignment \
> > >>>>>> diff --git a/sysdeps/unix/sysv/linux/riscv/multiarch/ifunc-impl-list.c b/sysdeps/unix/sysv/linux/riscv/multiarch/ifunc-impl-list.c
> > >>>>>> index 9f806d7a9e..7321144a32 100644
> > >>>>>> --- a/sysdeps/unix/sysv/linux/riscv/multiarch/ifunc-impl-list.c
> > >>>>>> +++ b/sysdeps/unix/sysv/linux/riscv/multiarch/ifunc-impl-list.c
> > >>>>>> @@ -20,19 +20,40 @@
> > >>>>>>  #include <string.h>
> > >>>>>>  #include <sys/hwprobe.h>
> > >>>>>>
> > >>>>>> +#define ARRAY_SIZE(A) (sizeof (A) / sizeof ((A)[0]))
> > >>>>>> +
> > >>>>>>  size_t
> > >>>>>>  __libc_ifunc_impl_list (const char *name, struct libc_ifunc_impl *array,
> > >>>>>>                       size_t max)
> > >>>>>>  {
> > >>>>>>    size_t i = max;
> > >>>>>> +  struct riscv_hwprobe pairs[] = {
> > >>>>>> +    { .key = RISCV_HWPROBE_KEY_IMA_EXT_0 },
> > >>>>>> +    { .key = RISCV_HWPROBE_KEY_CPUPERF_0 },
> > >>>>>> +  };
> > >>>>>>
> > >>>>>> +  bool has_zbb = false;
> > >>>>>>    bool fast_unaligned = false;
> > >>>>>>
> > >>>>>> -  struct riscv_hwprobe pair = { .key = RISCV_HWPROBE_KEY_CPUPERF_0 };
> > >>>>>> -  if (__riscv_hwprobe (&pair, 1, 0, NULL, 0) == 0
> > >>>>>> -      && (pair.value & RISCV_HWPROBE_MISALIGNED_MASK)
> > >>>>>> -          == RISCV_HWPROBE_MISALIGNED_FAST)
> > >>>>>> -    fast_unaligned = true;
> > >>>>>> +  if (__riscv_hwprobe (pairs, ARRAY_SIZE (pairs), 0, NULL, 0) == 0)
> > >>>>>> +    {
> > >>>>>> +      struct riscv_hwprobe *pair;
> > >>>>>> +
> > >>>>>> +      /* RISCV_HWPROBE_KEY_IMA_EXT_0  */
> > >>>>>> +      pair = &pairs[0];
> > >>>>>> +      if (pair->value & RISCV_HWPROBE_EXT_ZBB)
> > >>>>>> +        has_zbb = true;
> > >>>>>> +
> > >>>>>> +      /* RISCV_HWPROBE_KEY_CPUPERF_0  */
> > >>>>>> +      pair = &pairs[1];
> > >>>>>> +      if ((pair->value & RISCV_HWPROBE_MISALIGNED_MASK)
> > >>>>>> +        == RISCV_HWPROBE_MISALIGNED_FAST)
> > >>>>>> +        fast_unaligned = true;
> > >>>>>> +    }
> > >>>>>> +
> > >>>>>> +  IFUNC_IMPL (i, name, memchr,
> > >>>>>> +           IFUNC_IMPL_ADD (array, i, memchr, has_zbb, __memchr_zbb)
> > >>>>>> +           IFUNC_IMPL_ADD (array, i, memchr, 1, __memchr_generic))
> > >>>>>>
> > >>>>>>    IFUNC_IMPL (i, name, memcpy,
> > >>>>>>             IFUNC_IMPL_ADD (array, i, memcpy, fast_unaligned,
> > >>>>>> diff --git a/sysdeps/unix/sysv/linux/riscv/multiarch/memchr.c b/sysdeps/unix/sysv/linux/riscv/multiarch/memchr.c
> > >>>>>> new file mode 100644
> > >>>>>> index 0000000000..bc076cbf24
> > >>>>>> --- /dev/null
> > >>>>>> +++ b/sysdeps/unix/sysv/linux/riscv/multiarch/memchr.c
> > >>>>>> @@ -0,0 +1,57 @@
> > >>>>>> +/* Multiple versions of memchr.
> > >>>>>> +   All versions must be listed in ifunc-impl-list.c.
> > >>>>>> +   Copyright (C) 2017-2024 Free Software Foundation, Inc.
> > >>>>>> +   This file is part of the GNU C Library.
> > >>>>>> +
> > >>>>>> +   The GNU C Library is free software; you can redistribute it and/or
> > >>>>>> +   modify it under the terms of the GNU Lesser General Public
> > >>>>>> +   License as published by the Free Software Foundation; either
> > >>>>>> +   version 2.1 of the License, or (at your option) any later version.
> > >>>>>> +
> > >>>>>> +   The GNU C Library is distributed in the hope that it will be useful,
> > >>>>>> +   but WITHOUT ANY WARRANTY; without even the implied warranty of
> > >>>>>> +   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
> > >>>>>> +   Lesser General Public License for more details.
> > >>>>>> +
> > >>>>>> +   You should have received a copy of the GNU Lesser General Public
> > >>>>>> +   License along with the GNU C Library; if not, see
> > >>>>>> +   <https://www.gnu.org/licenses/>.  */
> > >>>>>> +
> > >>>>>> +#if IS_IN (libc)
> > >>>>>> +/* Redefine memchr so that the compiler won't complain about the type
> > >>>>>> +   mismatch with the IFUNC selector in strong_alias, below.  */
> > >>>>>> +# undef memchr
> > >>>>>> +# define memchr __redirect_memchr
> > >>>>>> +# include <stdint.h>
> > >>>>>> +# include <string.h>
> > >>>>>> +# include <ifunc-init.h>
> > >>>>>> +# include <riscv-ifunc.h>
> > >>>>>> +# include <sys/hwprobe.h>
> > >>>>>> +
> > >>>>>> +extern __typeof (__redirect_memchr) __libc_memchr;
> > >>>>>> +
> > >>>>>> +extern __typeof (__redirect_memchr) __memchr_generic attribute_hidden;
> > >>>>>> +extern __typeof (__redirect_memchr) __memchr_zbb attribute_hidden;
> > >>>>>> +
> > >>>>>> +static inline __typeof (__redirect_memchr) *
> > >>>>>> +select_memchr_ifunc (uint64_t dl_hwcap, __riscv_hwprobe_t hwprobe_func)
> > >>>>>> +{
> > >>>>>> +  unsigned long long int v;
> > >>>>>> +  if (__riscv_hwprobe_one (hwprobe_func, RISCV_HWPROBE_KEY_IMA_EXT_0, &v) == 0
> > >>>>>> +      && (v & RISCV_HWPROBE_EXT_ZBB))
> > >>>>>> +    return __memchr_zbb;
> > >>>>>> +
> > >>>>>> +  return __memchr_generic;
> > >>>>>> +}
> > >>>>>> +
> > >>>>>> +riscv_libc_ifunc (__libc_memchr, select_memchr_ifunc);
> > >>>>>> +
> > >>>>>> +# undef memchr
> > >>>>>> +strong_alias (__libc_memchr, memchr);
> > >>>>>> +# ifdef SHARED
> > >>>>>> +__hidden_ver1 (memchr, __GI_memchr, __redirect_memchr)
> > >>>>>> +  __attribute__ ((visibility ("hidden"))) __attribute_copy__ (memchr);
> > >>>>>> +# endif
> > >>>>>> +#else
> > >>>>>> +# include <string/memchr.c>
> > >>>>>> +#endif

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH 2/7] RISC-V: Add Zbb optimized memchr as ifunc
  2024-05-06 13:32                 ` Kito Cheng
@ 2024-05-06 13:46                   ` Christoph Müllner
  2024-05-06 13:58                     ` Kito Cheng
  0 siblings, 1 reply; 20+ messages in thread
From: Christoph Müllner @ 2024-05-06 13:46 UTC (permalink / raw)
  To: Kito Cheng
  Cc: Palmer Dabbelt, adhemerval.zanella, libc-alpha, Darius Rad,
	Andrew Waterman, philipp.tomsich, Evan Green, DJ Delorie,
	Vineet Gupta, jeffreyalaw

On Mon, May 6, 2024 at 3:32 PM Kito Cheng <kito.cheng@sifive.com> wrote:
>
> `target` attribute is provided by GCC 14, and `-march=+ext` will at
> least require GCC 15 (if we add),
> also we relaxed the canonical order requirement for `-march=`, that
> means we can relatively easy to manipulate the ISA
> string by just concat that? so I don't really think we need  `-march=+ext`.

Then we require the Makefile author to:
1) parse if the current compiler flags include march (repeat for all
variables that may be used)
2a) if so, then amend to that
2b) if not, parse if there is a configured default ISA (via "gcc -v")
2ba) if so, then create -march accordingly
2bb) if not, then <fail to amend>

Bonus problem:
If there are multiple variables that might set -march=... this gets
even trickier because
the order how these variables are integrated in the compiler
invocation command is unknown.

Using "-march=+zfoo" seems simpler and cleaner.

>
> On Mon, May 6, 2024 at 9:20 PM Christoph Müllner
> <christoph.muellner@vrull.eu> wrote:
> >
> > On Tue, Apr 30, 2024 at 7:54 PM Palmer Dabbelt <palmer@dabbelt.com> wrote:
> > >
> > > On Tue, 30 Apr 2024 10:45:12 PDT (-0700), adhemerval.zanella@linaro.org wrote:
> > > >
> > > >
> > > > On 30/04/24 12:13, Palmer Dabbelt wrote:
> > > >> On Wed, 24 Apr 2024 06:36:43 PDT (-0700), adhemerval.zanella@linaro.org wrote:
> > > >>>
> > > >>>
> > > >>> On 24/04/24 10:16, Christoph Müllner wrote:
> > > >>>> On Wed, Apr 24, 2024 at 2:53 PM Adhemerval Zanella Netto
> > > >>>> <adhemerval.zanella@linaro.org> wrote:
> > > >>>>>
> > > >>>>>
> > > >>>>>
> > > >>>>> On 22/04/24 04:43, Christoph Müllner wrote:
> > > >>>>>> When building with Zbb enabled, memchr benefits from using orc.b in
> > > >>>>>> find_zero_all().  This patch changes the build system such, that a
> > > >>>>>> non-Zbb version as well as a Zbb version of this routine is built.
> > > >>>>>> Further, a ifunc resolver is provided that selects the right routine
> > > >>>>>> based on the outcome of extension probing via hwprobe().
> > > >>>>>>
> > > >>>>>> Signed-off-by: Christoph Müllner <christoph.muellner@vrull.eu>
> > > >>>>>> ---
> > > >>>>>>  sysdeps/riscv/multiarch/memchr-generic.c      | 26 +++++++++
> > > >>>>>>  sysdeps/riscv/multiarch/memchr-zbb.c          | 30 ++++++++++
> > > >>>>>>  .../unix/sysv/linux/riscv/multiarch/Makefile  |  3 +
> > > >>>>>>  .../linux/riscv/multiarch/ifunc-impl-list.c   | 31 ++++++++--
> > > >>>>>>  .../unix/sysv/linux/riscv/multiarch/memchr.c  | 57 +++++++++++++++++++
> > > >>>>>>  5 files changed, 142 insertions(+), 5 deletions(-)
> > > >>>>>>  create mode 100644 sysdeps/riscv/multiarch/memchr-generic.c
> > > >>>>>>  create mode 100644 sysdeps/riscv/multiarch/memchr-zbb.c
> > > >>>>>>  create mode 100644 sysdeps/unix/sysv/linux/riscv/multiarch/memchr.c
> > > >>>>>>
> > > >>>>>> diff --git a/sysdeps/riscv/multiarch/memchr-generic.c b/sysdeps/riscv/multiarch/memchr-generic.c
> > > >>>>>> new file mode 100644
> > > >>>>>> index 0000000000..a96c36398b
> > > >>>>>> --- /dev/null
> > > >>>>>> +++ b/sysdeps/riscv/multiarch/memchr-generic.c
> > > >>>>>> @@ -0,0 +1,26 @@
> > > >>>>>> +/* Re-include the default memchr implementation.
> > > >>>>>> +   Copyright (C) 2024 Free Software Foundation, Inc.
> > > >>>>>> +   This file is part of the GNU C Library.
> > > >>>>>> +
> > > >>>>>> +   The GNU C Library is free software; you can redistribute it and/or
> > > >>>>>> +   modify it under the terms of the GNU Lesser General Public
> > > >>>>>> +   License as published by the Free Software Foundation; either
> > > >>>>>> +   version 2.1 of the License, or (at your option) any later version.
> > > >>>>>> +
> > > >>>>>> +   The GNU C Library is distributed in the hope that it will be useful,
> > > >>>>>> +   but WITHOUT ANY WARRANTY; without even the implied warranty of
> > > >>>>>> +   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
> > > >>>>>> +   Lesser General Public License for more details.
> > > >>>>>> +
> > > >>>>>> +   You should have received a copy of the GNU Lesser General Public
> > > >>>>>> +   License along with the GNU C Library; if not, see
> > > >>>>>> +   <https://www.gnu.org/licenses/>.  */
> > > >>>>>> +
> > > >>>>>> +#include <string.h>
> > > >>>>>> +
> > > >>>>>> +#if IS_IN(libc)
> > > >>>>>> +# define MEMCHR __memchr_generic
> > > >>>>>> +# undef libc_hidden_builtin_def
> > > >>>>>> +# define libc_hidden_builtin_def(x)
> > > >>>>>> +#endif
> > > >>>>>> +#include <string/memchr.c>
> > > >>>>>> diff --git a/sysdeps/riscv/multiarch/memchr-zbb.c b/sysdeps/riscv/multiarch/memchr-zbb.c
> > > >>>>>> new file mode 100644
> > > >>>>>> index 0000000000..bead0335ae
> > > >>>>>> --- /dev/null
> > > >>>>>> +++ b/sysdeps/riscv/multiarch/memchr-zbb.c
> > > >>>>>> @@ -0,0 +1,30 @@
> > > >>>>>> +/* Re-include the default memchr implementation for Zbb.
> > > >>>>>> +   Copyright (C) 2024 Free Software Foundation, Inc.
> > > >>>>>> +   This file is part of the GNU C Library.
> > > >>>>>> +
> > > >>>>>> +   The GNU C Library is free software; you can redistribute it and/or
> > > >>>>>> +   modify it under the terms of the GNU Lesser General Public
> > > >>>>>> +   License as published by the Free Software Foundation; either
> > > >>>>>> +   version 2.1 of the License, or (at your option) any later version.
> > > >>>>>> +
> > > >>>>>> +   The GNU C Library is distributed in the hope that it will be useful,
> > > >>>>>> +   but WITHOUT ANY WARRANTY; without even the implied warranty of
> > > >>>>>> +   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
> > > >>>>>> +   Lesser General Public License for more details.
> > > >>>>>> +
> > > >>>>>> +   You should have received a copy of the GNU Lesser General Public
> > > >>>>>> +   License along with the GNU C Library; if not, see
> > > >>>>>> +   <https://www.gnu.org/licenses/>.  */
> > > >>>>>> +
> > > >>>>>> +#include <string.h>
> > > >>>>>> +
> > > >>>>>> +#if IS_IN(libc)
> > > >>>>>> +# define MEMCHR __memchr_zbb
> > > >>>>>> +# undef libc_hidden_builtin_def
> > > >>>>>> +# define libc_hidden_builtin_def(x)
> > > >>>>>> +#endif
> > > >>>>>> +/* Convince preprocessor to have Zbb instructions.  */
> > > >>>>>> +#ifndef __riscv_zbb
> > > >>>>>> +# define __riscv_zbb
> > > >>>>>> +#endif
> > > >>>>>
> > > >>>>> Is there a way to specific the compiler to enable a extension, like aarch64
> > > >>>>> -march=arch{+[no]feature}? I think ideally this should be enabled as CFLAGS
> > > >>>>> instead of messing with compiler defined pre-processor.
> > > >>>>
> > > >>>> The tools expect a list of all extensions as parameter to the -march= option.
> > > >>>> But there is no way to append extensions to an existing march string
> > > >>>> on the command line.
> > > >>>>
> > > >>>> And if we would add this feature today, it would take many years until we could
> > > >>>> use it here, because we want to remain compatible with old tools.
> > > >>>> Or we enable the optimization only when being built with new tools, but that
> > > >>>> adds even more complexity and build/test configurations.
> > > >>>>
> > > >>>> What we have is:
> > > >>>> * Preprocessor (since forever): Extension test macros (__riscv_EXTENSION)
> > > >>>> * Command line (since forever): -march=BASE_EXTENSIONLIST
> > > >>>> * GAS (since Nov 21): .option arch, +EXTENSION (in combination with
> > > >>>> option push/pop)
> > > >>>> * GCC (since Nov 23): __attribute__((target("arch=+EXTENSION")))
> > > >>>>
> > > >>>> I was not sure about using __riscv_zbb as well, but I considered it safe within
> > > >>>> ifdef tests that ensure the macro won't be set twice.
> > > >>>> If that's a concern, I could change to use something like this:
> > > >>>> #define __riscv_force_zbb
> > > >>>> #include <impl.c>
> > > >>>> #undef __riscv_force_zbb
> > > >>>> ... and change string-fza.h like this:
> > > >>>> #if defined(__riscv_zbb) || defined(__riscv_force_zbb)
> > > >>>> // orc.b
> > > >>>> #endif
> > > >>>>
> > > >>>> BR
> > > >>>> Christoph
> > > >>>
> > > >>> Another options would to parse the current march and add the extension if required,
> > > >>> something like:
> > > >>>
> > > >>> abi=$(riscv64-linux-gnu-gcc -Q --help=target | grep march | cut -d '=' -f2 | xargs)
> > > >>> if [[ ! "$abi" =~ "_zbb" ]]
> > > >>> then
> > > >>>   abi="$abi"_zbb
> > > >>> fi
> > > >>
> > > >> That alone likely won't do it, there's a bunch of ordering rules in the ISA string handling so we might get tripped up on them.  We've got a fairly relaxed version of the rules in GCC to try and match the various older rules, though, so it might be possible to make something similar work.
> > > >>
> > > >> We should probably just add some sort of -march=+zbb type argument.  IIRC Kito was going to do it at some point, not sure if he got around to it?
> > > >
> > > > I am just pointing this out because I think the way RISCV extension selection
> > > > is currently implemented makes it awkward to provide ifunc implementation in
> > > > a agnostic way (specially now that RISCV has dozens of extensions) without
> > > > knowing the current target compiler is generating.
> > > >
> > > > Some other ABI allows to either specify a ISA/chip reference (like powerpc
> > > > with -mcpu=powerX) or a ABI extension directly (like aarch64 with -march=+xxx).
> > >
> > > We have -mcpu, but for RISC-V that's even more fragmented than -march
> > > (-mcpu essentially just rolls together -march and -mtune, and only
> > > allows a curated list).  The `-march=+`-type stuff has been a pretty
> > > common request, I think someone just needst oget around to actually
> > > implementing it.
> >
> > I agree that `-march=+` would be the best solution.
> > As we don't have it, I've created a PR for the riscv-toolchain-conventions:
> >   https://github.com/riscv-non-isa/riscv-toolchain-conventions/pull/47
> > This will have to depend on GCC 15+ (assuming there will be consensus
> > in the next 10 months about it).
> >
> > Thanks for the proposal with the function-attributes.
> > I'm working on a patchset for that and will send it out after some testing.
> > This will have to depend on GCC 14+.
> >
> > > >>> I don't have a strong preference, it is just that by not using the compiler flag
> > > >>> we won't be able to either use the builtin (__builtin_riscv_orc_b_32) and/or get
> > > >>> a possible better code generation from compiler.
> > > >>
> > > >> I think we'd likely get slightly better codgen from telling the compiler about the bitmanip extensions.  Maybe we want something like
> > > >>
> > > >>    diff --git a/string/memchr.c b/string/memchr.c
> > > >>    index 08b5c41667..1b62dce8d8 100644
> > > >>    --- a/string/memchr.c
> > > >>    +++ b/string/memchr.c
> > > >>    @@ -29,15 +29,19 @@
> > > >>     # define __memchr MEMCHR
> > > >>     #endif
> > > >>       +#ifndef __MEMCHR_CODEGEN_ATTRIBUTE
> > > >>    +#define __MEMCHR_CODEGEN_ATTRIBUTE
> > > >>    +#endif
> > > >>    +
> > > >>     static __always_inline const char *
> > > >>    -sadd (uintptr_t x, uintptr_t y)
> > > >>    +sadd (uintptr_t x, uintptr_t y) __MEMCHR_CODEGEN_ATTRIBUTE
> > > >>     {
> > > >>       return (const char *)(y > UINTPTR_MAX - x ? UINTPTR_MAX : x + y);
> > > >>     }
> > > >>        /* Search no more than N bytes of S for C.  */
> > > >>     void *
> > > >>    -__memchr (void const *s, int c_in, size_t n)
> > > >>    +__memchr (void const *s, int c_in, size_t n) __MEMCHR_CODEGEN_ATTRIBUTE
> > > >>     {
> > > >>       if (__glibc_unlikely (n == 0))
> > > >>         return NULL;
> > > >>
> > > >> in the generic versions, so we can add a
> > > >>
> > > >>    #define __MEMCHR_CODEGEN_ATTRIBUTE __attribuet__((target("+zbb")))
> > > >>
> > > >> (or whatever the syntax is) to the Zbb-flavored versions of these routines?
> > > >
> > > > Yeah, this might work and it is clear than messing with compiler-defined
> > > > macros.
> > > >
> > > >>
> > > >> It might also be worth just jumping to the fast-misaligned versions for these routines, too --the slow-misaligned stuff is there for compatibility with old stuff (though memchr aligns the pointer, so it doesn't matter so much here).
> > > >
> > > > I was hopping that ABIs that would like to provide unaligned variants
> > > > for mem* routines to improve the generic code, but it seems that for some
> > > > it is easier to just add an assembly routine (as loongarch did).
> > > >
> > > > For memchr, I think it should be easy to provide a unaligned version.
> > > > Something like (completely untested):
> > > >
> > > > /* Search no more than N bytes of S for C.  */
> > > > void *
> > > > __memchr (void const *s, int c_in, size_t n)
> > > > {
> > > >   if (__glibc_unlikely (n == 0))
> > > >     return NULL;
> > > >
> > > > #ifdef USE_MEMCHR_UNALIGNED
> > > >   /* Read the first word, but munge it so that bytes before the array
> > > >      will not match goal.  */
> > > >   const op_t *word_ptr = PTR_ALIGN_DOWN (s, sizeof (op_t));
> > > >   uintptr_t s_int = (uintptr_t) s;
> > > >
> > > >   op_t word = *word_ptr;
> > > >   op_t repeated_c = repeat_bytes (c_in);
> > > >   /* Compute the address of the last byte taking in consideration possible
> > > >      overflow.  */
> > > >   const char *lbyte = sadd (s_int, n - 1);
> > > >   /* And also the address of the word containing the last byte. */
> > > >   const op_t *lword = (const op_t *) PTR_ALIGN_DOWN (lbyte, sizeof (op_t));
> > > >
> > > >   find_t mask = shift_find (find_eq_all (word, repeated_c), s_int);
> > > >   if (mask != 0)
> > > >     {
> > > >       char *ret = (char *) s + index_first (mask);
> > > >       return (ret <= lbyte) ? ret : NULL;
> > > >     }
> > > >   if (word_ptr == lword)
> > > >     return NULL;
> > > > #endif
> > > >
> > > >   word = *++word_ptr;
> > > >   while (word_ptr != lword)
> > > >     {
> > > >       if (has_eq (word, repeated_c))
> > > >         return (char *) word_ptr + index_first_eq (word, repeated_c);
> > > >       word = *++word_ptr;
> > > >     }
> > > >
> > > >   if (has_eq (word, repeated_c))
> > > >     {
> > > >       /* We found a match, but it might be in a byte past the end of the
> > > >          array.  */
> > > >       char *ret = (char *) word_ptr + index_first_eq (word, repeated_c);
> > > >       if (ret <= lbyte)
> > > >         return ret;
> > > >     }
> > > >   return NULL;
> > > > }
> > > >
> > > >>
> > > >>>>>> +#include <string/memchr.c>
> > > >>>>>> diff --git a/sysdeps/unix/sysv/linux/riscv/multiarch/Makefile b/sysdeps/unix/sysv/linux/riscv/multiarch/Makefile
> > > >>>>>> index fcef5659d4..5586d11c89 100644
> > > >>>>>> --- a/sysdeps/unix/sysv/linux/riscv/multiarch/Makefile
> > > >>>>>> +++ b/sysdeps/unix/sysv/linux/riscv/multiarch/Makefile
> > > >>>>>> @@ -1,5 +1,8 @@
> > > >>>>>>  ifeq ($(subdir),string)
> > > >>>>>>  sysdep_routines += \
> > > >>>>>> +  memchr \
> > > >>>>>> +  memchr-generic \
> > > >>>>>> +  memchr-zbb \
> > > >>>>>>    memcpy \
> > > >>>>>>    memcpy-generic \
> > > >>>>>>    memcpy_noalignment \
> > > >>>>>> diff --git a/sysdeps/unix/sysv/linux/riscv/multiarch/ifunc-impl-list.c b/sysdeps/unix/sysv/linux/riscv/multiarch/ifunc-impl-list.c
> > > >>>>>> index 9f806d7a9e..7321144a32 100644
> > > >>>>>> --- a/sysdeps/unix/sysv/linux/riscv/multiarch/ifunc-impl-list.c
> > > >>>>>> +++ b/sysdeps/unix/sysv/linux/riscv/multiarch/ifunc-impl-list.c
> > > >>>>>> @@ -20,19 +20,40 @@
> > > >>>>>>  #include <string.h>
> > > >>>>>>  #include <sys/hwprobe.h>
> > > >>>>>>
> > > >>>>>> +#define ARRAY_SIZE(A) (sizeof (A) / sizeof ((A)[0]))
> > > >>>>>> +
> > > >>>>>>  size_t
> > > >>>>>>  __libc_ifunc_impl_list (const char *name, struct libc_ifunc_impl *array,
> > > >>>>>>                       size_t max)
> > > >>>>>>  {
> > > >>>>>>    size_t i = max;
> > > >>>>>> +  struct riscv_hwprobe pairs[] = {
> > > >>>>>> +    { .key = RISCV_HWPROBE_KEY_IMA_EXT_0 },
> > > >>>>>> +    { .key = RISCV_HWPROBE_KEY_CPUPERF_0 },
> > > >>>>>> +  };
> > > >>>>>>
> > > >>>>>> +  bool has_zbb = false;
> > > >>>>>>    bool fast_unaligned = false;
> > > >>>>>>
> > > >>>>>> -  struct riscv_hwprobe pair = { .key = RISCV_HWPROBE_KEY_CPUPERF_0 };
> > > >>>>>> -  if (__riscv_hwprobe (&pair, 1, 0, NULL, 0) == 0
> > > >>>>>> -      && (pair.value & RISCV_HWPROBE_MISALIGNED_MASK)
> > > >>>>>> -          == RISCV_HWPROBE_MISALIGNED_FAST)
> > > >>>>>> -    fast_unaligned = true;
> > > >>>>>> +  if (__riscv_hwprobe (pairs, ARRAY_SIZE (pairs), 0, NULL, 0) == 0)
> > > >>>>>> +    {
> > > >>>>>> +      struct riscv_hwprobe *pair;
> > > >>>>>> +
> > > >>>>>> +      /* RISCV_HWPROBE_KEY_IMA_EXT_0  */
> > > >>>>>> +      pair = &pairs[0];
> > > >>>>>> +      if (pair->value & RISCV_HWPROBE_EXT_ZBB)
> > > >>>>>> +        has_zbb = true;
> > > >>>>>> +
> > > >>>>>> +      /* RISCV_HWPROBE_KEY_CPUPERF_0  */
> > > >>>>>> +      pair = &pairs[1];
> > > >>>>>> +      if ((pair->value & RISCV_HWPROBE_MISALIGNED_MASK)
> > > >>>>>> +        == RISCV_HWPROBE_MISALIGNED_FAST)
> > > >>>>>> +        fast_unaligned = true;
> > > >>>>>> +    }
> > > >>>>>> +
> > > >>>>>> +  IFUNC_IMPL (i, name, memchr,
> > > >>>>>> +           IFUNC_IMPL_ADD (array, i, memchr, has_zbb, __memchr_zbb)
> > > >>>>>> +           IFUNC_IMPL_ADD (array, i, memchr, 1, __memchr_generic))
> > > >>>>>>
> > > >>>>>>    IFUNC_IMPL (i, name, memcpy,
> > > >>>>>>             IFUNC_IMPL_ADD (array, i, memcpy, fast_unaligned,
> > > >>>>>> diff --git a/sysdeps/unix/sysv/linux/riscv/multiarch/memchr.c b/sysdeps/unix/sysv/linux/riscv/multiarch/memchr.c
> > > >>>>>> new file mode 100644
> > > >>>>>> index 0000000000..bc076cbf24
> > > >>>>>> --- /dev/null
> > > >>>>>> +++ b/sysdeps/unix/sysv/linux/riscv/multiarch/memchr.c
> > > >>>>>> @@ -0,0 +1,57 @@
> > > >>>>>> +/* Multiple versions of memchr.
> > > >>>>>> +   All versions must be listed in ifunc-impl-list.c.
> > > >>>>>> +   Copyright (C) 2017-2024 Free Software Foundation, Inc.
> > > >>>>>> +   This file is part of the GNU C Library.
> > > >>>>>> +
> > > >>>>>> +   The GNU C Library is free software; you can redistribute it and/or
> > > >>>>>> +   modify it under the terms of the GNU Lesser General Public
> > > >>>>>> +   License as published by the Free Software Foundation; either
> > > >>>>>> +   version 2.1 of the License, or (at your option) any later version.
> > > >>>>>> +
> > > >>>>>> +   The GNU C Library is distributed in the hope that it will be useful,
> > > >>>>>> +   but WITHOUT ANY WARRANTY; without even the implied warranty of
> > > >>>>>> +   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
> > > >>>>>> +   Lesser General Public License for more details.
> > > >>>>>> +
> > > >>>>>> +   You should have received a copy of the GNU Lesser General Public
> > > >>>>>> +   License along with the GNU C Library; if not, see
> > > >>>>>> +   <https://www.gnu.org/licenses/>.  */
> > > >>>>>> +
> > > >>>>>> +#if IS_IN (libc)
> > > >>>>>> +/* Redefine memchr so that the compiler won't complain about the type
> > > >>>>>> +   mismatch with the IFUNC selector in strong_alias, below.  */
> > > >>>>>> +# undef memchr
> > > >>>>>> +# define memchr __redirect_memchr
> > > >>>>>> +# include <stdint.h>
> > > >>>>>> +# include <string.h>
> > > >>>>>> +# include <ifunc-init.h>
> > > >>>>>> +# include <riscv-ifunc.h>
> > > >>>>>> +# include <sys/hwprobe.h>
> > > >>>>>> +
> > > >>>>>> +extern __typeof (__redirect_memchr) __libc_memchr;
> > > >>>>>> +
> > > >>>>>> +extern __typeof (__redirect_memchr) __memchr_generic attribute_hidden;
> > > >>>>>> +extern __typeof (__redirect_memchr) __memchr_zbb attribute_hidden;
> > > >>>>>> +
> > > >>>>>> +static inline __typeof (__redirect_memchr) *
> > > >>>>>> +select_memchr_ifunc (uint64_t dl_hwcap, __riscv_hwprobe_t hwprobe_func)
> > > >>>>>> +{
> > > >>>>>> +  unsigned long long int v;
> > > >>>>>> +  if (__riscv_hwprobe_one (hwprobe_func, RISCV_HWPROBE_KEY_IMA_EXT_0, &v) == 0
> > > >>>>>> +      && (v & RISCV_HWPROBE_EXT_ZBB))
> > > >>>>>> +    return __memchr_zbb;
> > > >>>>>> +
> > > >>>>>> +  return __memchr_generic;
> > > >>>>>> +}
> > > >>>>>> +
> > > >>>>>> +riscv_libc_ifunc (__libc_memchr, select_memchr_ifunc);
> > > >>>>>> +
> > > >>>>>> +# undef memchr
> > > >>>>>> +strong_alias (__libc_memchr, memchr);
> > > >>>>>> +# ifdef SHARED
> > > >>>>>> +__hidden_ver1 (memchr, __GI_memchr, __redirect_memchr)
> > > >>>>>> +  __attribute__ ((visibility ("hidden"))) __attribute_copy__ (memchr);
> > > >>>>>> +# endif
> > > >>>>>> +#else
> > > >>>>>> +# include <string/memchr.c>
> > > >>>>>> +#endif

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH 2/7] RISC-V: Add Zbb optimized memchr as ifunc
  2024-05-06 13:46                   ` Christoph Müllner
@ 2024-05-06 13:58                     ` Kito Cheng
  0 siblings, 0 replies; 20+ messages in thread
From: Kito Cheng @ 2024-05-06 13:58 UTC (permalink / raw)
  To: Christoph Müllner
  Cc: Palmer Dabbelt, adhemerval.zanella, libc-alpha, Darius Rad,
	Andrew Waterman, philipp.tomsich, Evan Green, DJ Delorie,
	Vineet Gupta, jeffreyalaw

> > `target` attribute is provided by GCC 14, and `-march=+ext` will at
> > least require GCC 15 (if we add),
> > also we relaxed the canonical order requirement for `-march=`, that
> > means we can relatively easy to manipulate the ISA
> > string by just concat that? so I don't really think we need  `-march=+ext`.
>
> Then we require the Makefile author to:
> 1) parse if the current compiler flags include march (repeat for all
> variables that may be used)
> 2a) if so, then amend to that
> 2b) if not, parse if there is a configured default ISA (via "gcc -v")
> 2ba) if so, then create -march accordingly
> 2bb) if not, then <fail to amend>
>
> Bonus problem:
> If there are multiple variables that might set -march=... this gets
> even trickier because
> the order how these variables are integrated in the compiler
> invocation command is unknown.
>
> Using "-march=+zfoo" seems simpler and cleaner.

I agree it will require the compiler to parse the gcc -v to obtain the
default -march, it might be a burden,
but AArch64 also does not allow -march=+ext only it requires
-march=arch{+[no]feature}*,
so I am wondering why do we need that to resolve/improve ifunc
implementation but not `target` attribute and `.option arch, +ext`?

^ permalink raw reply	[flat|nested] 20+ messages in thread

end of thread, other threads:[~2024-05-06 13:58 UTC | newest]

Thread overview: 20+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2024-04-22  7:43 [PATCH 0/7] Add ifunc support for existing Zbb optimizations Christoph Müllner
2024-04-22  7:43 ` [PATCH 1/7] RISC-V: Use .insn directive form for orc.b Christoph Müllner
2024-04-22  7:43 ` [PATCH 2/7] RISC-V: Add Zbb optimized memchr as ifunc Christoph Müllner
2024-04-24 12:53   ` Adhemerval Zanella Netto
2024-04-24 13:16     ` Christoph Müllner
2024-04-24 13:36       ` Adhemerval Zanella Netto
2024-04-26 11:40         ` Christoph Müllner
2024-04-30 15:13         ` Palmer Dabbelt
2024-04-30 17:45           ` Adhemerval Zanella Netto
2024-04-30 17:54             ` Palmer Dabbelt
2024-04-30 18:44               ` Vineet Gupta
2024-05-06 13:20               ` Christoph Müllner
2024-05-06 13:32                 ` Kito Cheng
2024-05-06 13:46                   ` Christoph Müllner
2024-05-06 13:58                     ` Kito Cheng
2024-04-22  7:43 ` [PATCH 3/7] RISC-V: Add Zbb optimized memrchr " Christoph Müllner
2024-04-22  7:44 ` [PATCH 4/7] RISC-V: Add Zbb optimized strchrnul " Christoph Müllner
2024-04-22  7:44 ` [PATCH 5/7] RISC-V: Add Zbb optimized strcmp " Christoph Müllner
2024-04-22  7:44 ` [PATCH 6/7] RISC-V: Add Zbb optimized strlen " Christoph Müllner
2024-04-22  7:44 ` [PATCH 7/7] RISC-V: Add Zbb optimized strncmp " Christoph Müllner

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).