public inbox for libc-alpha@sourceware.org
* [PATCH 0/2] RISC-V: ifunced memcpy using new kernel hwprobe interface
@ 2023-02-06 19:48 Evan Green
  2023-02-06 19:48 ` [PATCH 1/2] riscv: Add Linux hwprobe syscall support Evan Green
                   ` (2 more replies)
  0 siblings, 3 replies; 10+ messages in thread
From: Evan Green @ 2023-02-06 19:48 UTC (permalink / raw)
  To: libc-alpha; +Cc: slewis, vineetg, palmer, Evan Green


This series illustrates the use of a proposed Linux syscall that
enumerates architectural information about the RISC-V cores the system
is running on. In this series we expose a small wrapper function around
the syscall. An ifunc selector for memcpy queries it to see if unaligned
access is "fast" on this hardware. If it is, it selects a newly provided
implementation of memcpy that doesn't work hard at aligning the src and
destination buffers.
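
The selection step described above can be sketched in portable C as
follows. This is only an illustration, not the code in this series: the
struct, the key and flag values, and the three probe functions are
hypothetical stand-ins so that the logic can be exercised without the
syscall being present.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-ins for the kernel's key/value pair and flag; the
   names and values are illustrative, not the real definitions.  */
struct demo_pair { long key; unsigned long value; };
#define DEMO_KEY_CPUPERF_0   0
#define DEMO_MISALIGNED_FAST 0x3UL

typedef int (*demo_probe_fn) (struct demo_pair *pairs, long pair_count);
typedef void *(*copy_fn) (void *, const void *, size_t);

/* Two candidate implementations; only the choice between them matters.  */
static void *
copy_generic (void *d, const void *s, size_t n)
{
  unsigned char *dp = d;
  const unsigned char *sp = s;
  while (n-- > 0)
    *dp++ = *sp++;
  return d;
}

static void *
copy_noalignment (void *d, const void *s, size_t n)
{
  return memcpy (d, s, n);
}

/* Simulated probes: fast hardware, slow hardware, and a kernel that
   does not provide the syscall at all.  */
static int probe_fast (struct demo_pair *p, long n)
{ (void) n; p->value = DEMO_MISALIGNED_FAST; return 0; }
static int probe_slow (struct demo_pair *p, long n)
{ (void) n; p->value = 0x1UL; return 0; }
static int probe_missing (struct demo_pair *p, long n)
{ (void) p; (void) n; return -1; }

/* Pick the alignment-ignorant copy only when the probe succeeds and
   reports fast misaligned access; otherwise stay with the generic one.  */
static copy_fn
select_copy (demo_probe_fn probe)
{
  struct demo_pair pair = { DEMO_KEY_CPUPERF_0, 0 };

  if (probe (&pair, 1) != 0)
    return copy_generic;
  if ((pair.value & DEMO_MISALIGNED_FAST) == DEMO_MISALIGNED_FAST)
    return copy_noalignment;
  return copy_generic;
}
```

Falling back to the generic version on any probe failure keeps the
selector safe on kernels that predate the syscall.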

This is somewhat of a proof of concept for the syscall itself, but I do
find that in my goofy memcpy test [1], the unaligned memcpy performed at
least as well as the generic C version. This is however on QEMU on an M1
Mac, so not a test of any real hardware (more a smoke test that the
implementation isn't silly).

v1 of the Linux series can be found at [2]. I'm about to post v2 (but
haven't yet!); I can reply here with the link once v2 is posted.

[1] https://pastebin.com/Nj8ixpkX
[2] https://yhbt.net/lore/all/20221013163551.6775-1-palmer@rivosinc.com/


Evan Green (2):
  riscv: Add Linux hwprobe syscall support
  riscv: Add and use alignment-ignorant memcpy

 sysdeps/riscv/memcopy.h                       |  28 +++++
 sysdeps/riscv/memcpy.c                        |  65 +++++++++++
 sysdeps/riscv/memcpy_noalignment.S            | 103 ++++++++++++++++++
 sysdeps/unix/sysv/linux/riscv/Makefile        |   8 +-
 sysdeps/unix/sysv/linux/riscv/Versions        |   3 +
 sysdeps/unix/sysv/linux/riscv/hwprobe.c       |  30 +++++
 .../unix/sysv/linux/riscv/memcpy-generic.c    |  24 ++++
 .../unix/sysv/linux/riscv/rv32/arch-syscall.h |   1 +
 .../unix/sysv/linux/riscv/rv32/libc.abilist   |   1 +
 .../unix/sysv/linux/riscv/rv64/arch-syscall.h |   1 +
 .../unix/sysv/linux/riscv/rv64/libc.abilist   |   1 +
 sysdeps/unix/sysv/linux/riscv/sys/hwprobe.h   |  34 ++++++
 sysdeps/unix/sysv/linux/syscall-names.list    |   1 +
 13 files changed, 298 insertions(+), 2 deletions(-)
 create mode 100644 sysdeps/riscv/memcopy.h
 create mode 100644 sysdeps/riscv/memcpy.c
 create mode 100644 sysdeps/riscv/memcpy_noalignment.S
 create mode 100644 sysdeps/unix/sysv/linux/riscv/hwprobe.c
 create mode 100644 sysdeps/unix/sysv/linux/riscv/memcpy-generic.c
 create mode 100644 sysdeps/unix/sysv/linux/riscv/sys/hwprobe.h

-- 
2.25.1



* [PATCH 1/2] riscv: Add Linux hwprobe syscall support
  2023-02-06 19:48 [PATCH 0/2] RISC-V: ifunced memcpy using new kernel hwprobe interface Evan Green
@ 2023-02-06 19:48 ` Evan Green
  2023-02-07 13:05   ` Adhemerval Zanella Netto
  2023-02-06 19:48 ` [PATCH 2/2] riscv: Add and use alignment-ignorant memcpy Evan Green
  2023-02-06 21:28 ` [PATCH 0/2] RISC-V: ifunced memcpy using new kernel hwprobe interface Richard Henderson
  2 siblings, 1 reply; 10+ messages in thread
From: Evan Green @ 2023-02-06 19:48 UTC (permalink / raw)
  To: libc-alpha; +Cc: slewis, vineetg, palmer, Evan Green

Add awareness and a thin wrapper function around a new Linux system call
that allows callers to get architecture and microarchitecture
information about the CPUs from the kernel. This can be used to
do things like dynamically choose a memcpy implementation.

Signed-off-by: Evan Green <evan@rivosinc.com>
---

 sysdeps/unix/sysv/linux/riscv/Makefile        |  4 +--
 sysdeps/unix/sysv/linux/riscv/Versions        |  3 ++
 sysdeps/unix/sysv/linux/riscv/hwprobe.c       | 30 ++++++++++++++++
 .../unix/sysv/linux/riscv/rv32/arch-syscall.h |  1 +
 .../unix/sysv/linux/riscv/rv32/libc.abilist   |  1 +
 .../unix/sysv/linux/riscv/rv64/arch-syscall.h |  1 +
 .../unix/sysv/linux/riscv/rv64/libc.abilist   |  1 +
 sysdeps/unix/sysv/linux/riscv/sys/hwprobe.h   | 34 +++++++++++++++++++
 sysdeps/unix/sysv/linux/syscall-names.list    |  1 +
 9 files changed, 74 insertions(+), 2 deletions(-)
 create mode 100644 sysdeps/unix/sysv/linux/riscv/hwprobe.c
 create mode 100644 sysdeps/unix/sysv/linux/riscv/sys/hwprobe.h

diff --git a/sysdeps/unix/sysv/linux/riscv/Makefile b/sysdeps/unix/sysv/linux/riscv/Makefile
index 4b6eacb32f..45cc29e40d 100644
--- a/sysdeps/unix/sysv/linux/riscv/Makefile
+++ b/sysdeps/unix/sysv/linux/riscv/Makefile
@@ -1,6 +1,6 @@
 ifeq ($(subdir),misc)
-sysdep_headers += sys/cachectl.h
-sysdep_routines += flush-icache
+sysdep_headers += sys/cachectl.h sys/hwprobe.h
+sysdep_routines += flush-icache hwprobe
 endif
 
 ifeq ($(subdir),stdlib)
diff --git a/sysdeps/unix/sysv/linux/riscv/Versions b/sysdeps/unix/sysv/linux/riscv/Versions
index 5625d2a0b8..891ae05730 100644
--- a/sysdeps/unix/sysv/linux/riscv/Versions
+++ b/sysdeps/unix/sysv/linux/riscv/Versions
@@ -8,4 +8,7 @@ libc {
   GLIBC_2.27 {
     __riscv_flush_icache;
   }
+  GLIBC_2.37 {
+    __riscv_hwprobe;
+  }
 }
diff --git a/sysdeps/unix/sysv/linux/riscv/hwprobe.c b/sysdeps/unix/sysv/linux/riscv/hwprobe.c
new file mode 100644
index 0000000000..ef6dccb9db
--- /dev/null
+++ b/sysdeps/unix/sysv/linux/riscv/hwprobe.c
@@ -0,0 +1,30 @@
+/* RISC-V hardware feature probing support on Linux
+   Copyright (C) 2023 Free Software Foundation, Inc.
+
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public License as
+   published by the Free Software Foundation; either version 2.1 of the
+   License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   <https://www.gnu.org/licenses/>.  */
+
+#include <sys/syscall.h>
+#include <asm/hwprobe.h>
+#include <sysdep.h>
+
+int
+__riscv_hwprobe (struct riscv_hwprobe *pairs, long pair_count,
+  long cpu_count, unsigned long *cpus, unsigned long flags)
+{
+  return INLINE_SYSCALL (riscv_hwprobe, 5, pairs, pair_count,
+                         cpu_count, cpus, flags);
+}
diff --git a/sysdeps/unix/sysv/linux/riscv/rv32/arch-syscall.h b/sysdeps/unix/sysv/linux/riscv/rv32/arch-syscall.h
index 202520ee25..2416e041c8 100644
--- a/sysdeps/unix/sysv/linux/riscv/rv32/arch-syscall.h
+++ b/sysdeps/unix/sysv/linux/riscv/rv32/arch-syscall.h
@@ -198,6 +198,7 @@
 #define __NR_request_key 218
 #define __NR_restart_syscall 128
 #define __NR_riscv_flush_icache 259
+#define __NR_riscv_hwprobe 258
 #define __NR_rseq 293
 #define __NR_rt_sigaction 134
 #define __NR_rt_sigpending 136
diff --git a/sysdeps/unix/sysv/linux/riscv/rv32/libc.abilist b/sysdeps/unix/sysv/linux/riscv/rv32/libc.abilist
index ff90d1bff2..f4c391d3be 100644
--- a/sysdeps/unix/sysv/linux/riscv/rv32/libc.abilist
+++ b/sysdeps/unix/sysv/linux/riscv/rv32/libc.abilist
@@ -2396,3 +2396,4 @@ GLIBC_2.36 pidfd_open F
 GLIBC_2.36 pidfd_send_signal F
 GLIBC_2.36 process_madvise F
 GLIBC_2.36 process_mrelease F
+GLIBC_2.37 __riscv_hwprobe F
diff --git a/sysdeps/unix/sysv/linux/riscv/rv64/arch-syscall.h b/sysdeps/unix/sysv/linux/riscv/rv64/arch-syscall.h
index 4e65f337d4..a32bc82f60 100644
--- a/sysdeps/unix/sysv/linux/riscv/rv64/arch-syscall.h
+++ b/sysdeps/unix/sysv/linux/riscv/rv64/arch-syscall.h
@@ -205,6 +205,7 @@
 #define __NR_request_key 218
 #define __NR_restart_syscall 128
 #define __NR_riscv_flush_icache 259
+#define __NR_riscv_hwprobe 258
 #define __NR_rseq 293
 #define __NR_rt_sigaction 134
 #define __NR_rt_sigpending 136
diff --git a/sysdeps/unix/sysv/linux/riscv/rv64/libc.abilist b/sysdeps/unix/sysv/linux/riscv/rv64/libc.abilist
index f1017f6ec5..0f57bbe9e1 100644
--- a/sysdeps/unix/sysv/linux/riscv/rv64/libc.abilist
+++ b/sysdeps/unix/sysv/linux/riscv/rv64/libc.abilist
@@ -2596,3 +2596,4 @@ GLIBC_2.36 pidfd_open F
 GLIBC_2.36 pidfd_send_signal F
 GLIBC_2.36 process_madvise F
 GLIBC_2.36 process_mrelease F
+GLIBC_2.37 __riscv_hwprobe F
diff --git a/sysdeps/unix/sysv/linux/riscv/sys/hwprobe.h b/sysdeps/unix/sysv/linux/riscv/sys/hwprobe.h
new file mode 100644
index 0000000000..da8cdc90bf
--- /dev/null
+++ b/sysdeps/unix/sysv/linux/riscv/sys/hwprobe.h
@@ -0,0 +1,34 @@
+/* RISC-V architecture probe interface
+   Copyright (C) 2023 Free Software Foundation, Inc.
+
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library.  If not, see
+   <https://www.gnu.org/licenses/>.  */
+
+#ifndef _SYS_HWPROBE_H
+#define _SYS_HWPROBE_H 1
+
+#include <features.h>
+#include <asm/hwprobe.h>
+
+__BEGIN_DECLS
+
+int
+__riscv_hwprobe (struct riscv_hwprobe *pairs, long pair_count,
+  long cpu_count, unsigned long *cpus, unsigned long flags);
+
+__END_DECLS
+
+#endif /* sys/hwprobe.h */
diff --git a/sysdeps/unix/sysv/linux/syscall-names.list b/sysdeps/unix/sysv/linux/syscall-names.list
index 822498d3e3..4f4a62e91c 100644
--- a/sysdeps/unix/sysv/linux/syscall-names.list
+++ b/sysdeps/unix/sysv/linux/syscall-names.list
@@ -477,6 +477,7 @@ renameat2
 request_key
 restart_syscall
 riscv_flush_icache
+riscv_hwprobe
 rmdir
 rseq
 rt_sigaction
-- 
2.25.1



* [PATCH 2/2] riscv: Add and use alignment-ignorant memcpy
  2023-02-06 19:48 [PATCH 0/2] RISC-V: ifunced memcpy using new kernel hwprobe interface Evan Green
  2023-02-06 19:48 ` [PATCH 1/2] riscv: Add Linux hwprobe syscall support Evan Green
@ 2023-02-06 19:48 ` Evan Green
  2023-02-06 22:05   ` Richard Henderson
  2023-02-06 21:28 ` [PATCH 0/2] RISC-V: ifunced memcpy using new kernel hwprobe interface Richard Henderson
  2 siblings, 1 reply; 10+ messages in thread
From: Evan Green @ 2023-02-06 19:48 UTC (permalink / raw)
  To: libc-alpha; +Cc: slewis, vineetg, palmer, Evan Green

For CPU implementations that can perform unaligned accesses with little
or no performance penalty, create a memcpy implementation that does not
bother aligning buffers. It will use a block of integer registers, a
single integer register, and fall back to bytewise copy for the
remainder.
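
As a rough C rendering of those three phases (a sketch for readers, not
the assembly in this patch): WORD stands in for SZREG, fixed-size memcpy
calls model the unaligned register loads and stores, and check_copy is a
hypothetical helper added only to exercise the routine.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define WORD  sizeof (uintptr_t)   /* Plays the role of SZREG.  */
#define BLOCK (16 * WORD)          /* One 16-register "page".  */

static void *
copy_noalign_sketch (void *dst, const void *src, size_t n)
{
  unsigned char *d = dst;
  const unsigned char *s = src;

  /* Phase 1: whole 16-word blocks, with no attempt to align.  */
  for (; n >= BLOCK; d += BLOCK, s += BLOCK, n -= BLOCK)
    {
      uintptr_t regs[16];
      memcpy (regs, s, BLOCK);
      memcpy (d, regs, BLOCK);
    }

  /* Phase 2: one native word at a time.  */
  for (; n >= WORD; d += WORD, s += WORD, n -= WORD)
    {
      uintptr_t w;
      memcpy (&w, s, WORD);
      memcpy (d, &w, WORD);
    }

  /* Phase 3: the remaining bytes.  */
  while (n-- > 0)
    *d++ = *s++;

  return dst;
}

/* Hypothetical helper: returns 1 if copying N bytes between
   deliberately misaligned offsets reproduces the source exactly and
   leaves the neighbouring bytes untouched.  */
static int
check_copy (size_t n)
{
  unsigned char src[512], dst[512] = { 0 };

  if (n + 3 > sizeof src)
    return 0;
  for (size_t i = 0; i < sizeof src; i++)
    src[i] = (unsigned char) (i * 7 + 1);
  return copy_noalign_sketch (dst + 1, src + 3, n) == dst + 1
         && memcmp (dst + 1, src + 3, n) == 0
         && dst[0] == 0 && dst[n + 1] == 0;
}
```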

Signed-off-by: Evan Green <evan@rivosinc.com>

---


---
 sysdeps/riscv/memcopy.h                       |  28 +++++
 sysdeps/riscv/memcpy.c                        |  65 +++++++++++
 sysdeps/riscv/memcpy_noalignment.S            | 103 ++++++++++++++++++
 sysdeps/unix/sysv/linux/riscv/Makefile        |   4 +
 .../unix/sysv/linux/riscv/memcpy-generic.c    |  24 ++++
 5 files changed, 224 insertions(+)
 create mode 100644 sysdeps/riscv/memcopy.h
 create mode 100644 sysdeps/riscv/memcpy.c
 create mode 100644 sysdeps/riscv/memcpy_noalignment.S
 create mode 100644 sysdeps/unix/sysv/linux/riscv/memcpy-generic.c

diff --git a/sysdeps/riscv/memcopy.h b/sysdeps/riscv/memcopy.h
new file mode 100644
index 0000000000..21f6081b5f
--- /dev/null
+++ b/sysdeps/riscv/memcopy.h
@@ -0,0 +1,28 @@
+/* memcopy.h -- definitions for memory copy functions. RISC-V version.
+   Copyright (C) 2023 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   <https://www.gnu.org/licenses/>.  */
+
+#include <sysdeps/generic/memcopy.h>
+
+/*
+ * Redefine the generic memcpy implementation to __memcpy_generic, so
+ * the memcpy ifunc can select between generic and special versions.
+ * In rtld, don't bother with all the ifunciness.
+ */
+#if IS_IN (libc)
+#define MEMCPY __memcpy_generic
+#endif
diff --git a/sysdeps/riscv/memcpy.c b/sysdeps/riscv/memcpy.c
new file mode 100644
index 0000000000..1ba25ef976
--- /dev/null
+++ b/sysdeps/riscv/memcpy.c
@@ -0,0 +1,65 @@
+/* Multiple versions of memcpy.
+   All versions must be listed in ifunc-impl-list.c.
+   Copyright (C) 2017-2023 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   <https://www.gnu.org/licenses/>.  */
+
+#if IS_IN (libc)
+/* Redefine memcpy so that the compiler won't complain about the type
+   mismatch with the IFUNC selector in strong_alias, below.  */
+# undef memcpy
+# define memcpy __redirect_memcpy
+# include <string.h>
+#include <ifunc-init.h>
+#include <sys/hwprobe.h>
+
+#define INIT_ARCH()
+
+extern __typeof (__redirect_memcpy) __libc_memcpy;
+
+extern __typeof (__redirect_memcpy) __memcpy_generic attribute_hidden;
+extern __typeof (__redirect_memcpy) __memcpy_noalignment attribute_hidden;
+
+static inline __typeof (__redirect_memcpy) *
+select_memcpy_ifunc (void)
+{
+  INIT_ARCH ();
+
+  struct riscv_hwprobe pair;
+
+  pair.key = RISCV_HWPROBE_KEY_CPUPERF_0;
+  if (__riscv_hwprobe(&pair, 1, 0, NULL, 0) != 0)
+    return __memcpy_generic;
+
+  if ((pair.key > 0) &&
+      (pair.value & RISCV_HWPROBE_MISALIGNED_FAST) ==
+       RISCV_HWPROBE_MISALIGNED_FAST)
+    return __memcpy_noalignment;
+
+  return __memcpy_generic;
+}
+
+libc_ifunc (__libc_memcpy, select_memcpy_ifunc ());
+
+# undef memcpy
+strong_alias (__libc_memcpy, memcpy);
+# ifdef SHARED
+__hidden_ver1 (memcpy, __GI_memcpy, __redirect_memcpy)
+  __attribute__ ((visibility ("hidden"))) __attribute_copy__ (memcpy);
+# endif
+
+#endif
+
diff --git a/sysdeps/riscv/memcpy_noalignment.S b/sysdeps/riscv/memcpy_noalignment.S
new file mode 100644
index 0000000000..fe1d9213c4
--- /dev/null
+++ b/sysdeps/riscv/memcpy_noalignment.S
@@ -0,0 +1,103 @@
+/* memcpy for RISC-V, ignoring buffer alignment
+   Copyright (C) 2023 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library.  If not, see
+   <https://www.gnu.org/licenses/>.  */
+
+#include <sysdep.h>
+#include <sys/asm.h>
+
+/* void *memcpy(void *, const void *, size_t) */
+ENTRY (__memcpy_noalignment)
+	move t6, a0  /* Preserve return value */
+
+	/* Round down to the nearest "page" size */
+	andi a4, a2, ~((16*SZREG)-1)
+	beqz a4, 2f
+	add a3, a1, a4
+1:
+	/* Copy "pages" (chunks of 16 registers) */
+	REG_L a4,       0(a1)
+	REG_L a5,   SZREG(a1)
+	REG_L a6, 2*SZREG(a1)
+	REG_L a7, 3*SZREG(a1)
+	REG_L t0, 4*SZREG(a1)
+	REG_L t1, 5*SZREG(a1)
+	REG_L t2, 6*SZREG(a1)
+	REG_L t3, 7*SZREG(a1)
+	REG_L t4, 8*SZREG(a1)
+	REG_L t5, 9*SZREG(a1)
+	REG_S a4,       0(t6)
+	REG_S a5,   SZREG(t6)
+	REG_S a6, 2*SZREG(t6)
+	REG_S a7, 3*SZREG(t6)
+	REG_S t0, 4*SZREG(t6)
+	REG_S t1, 5*SZREG(t6)
+	REG_S t2, 6*SZREG(t6)
+	REG_S t3, 7*SZREG(t6)
+	REG_S t4, 8*SZREG(t6)
+	REG_S t5, 9*SZREG(t6)
+	REG_L a4, 10*SZREG(a1)
+	REG_L a5, 11*SZREG(a1)
+	REG_L a6, 12*SZREG(a1)
+	REG_L a7, 13*SZREG(a1)
+	REG_L t0, 14*SZREG(a1)
+	REG_L t1, 15*SZREG(a1)
+	addi a1, a1, 16*SZREG
+	REG_S a4, 10*SZREG(t6)
+	REG_S a5, 11*SZREG(t6)
+	REG_S a6, 12*SZREG(t6)
+	REG_S a7, 13*SZREG(t6)
+	REG_S t0, 14*SZREG(t6)
+	REG_S t1, 15*SZREG(t6)
+	addi t6, t6, 16*SZREG
+	bltu a1, a3, 1b
+	andi a2, a2, (16*SZREG)-1  /* Update count */
+
+2:
+	/* Remainder is smaller than a page, compute native word count */
+	beqz a2, 6f
+	andi a5, a2, ~(SZREG-1)
+	andi a2, a2, (SZREG-1)
+	add a3, a1, a5
+	/* Jump directly to byte copy if no words. */
+	beqz a5, 4f
+
+3:
+	/* Use single native register copy */
+	REG_L a4, 0(a1)
+	addi a1, a1, SZREG
+	REG_S a4, 0(t6)
+	addi t6, t6, SZREG
+	bltu a1, a3, 3b
+
+	/* Jump directly out if no more bytes */
+	beqz a2, 6f
+
+4:
+	/* Copy the last few individual bytes */
+	add a3, a1, a2
+5:
+	lb a4, 0(a1)
+	addi a1, a1, 1
+	sb a4, 0(t6)
+	addi t6, t6, 1
+	bltu a1, a3, 5b
+6:
+	ret
+
+END (__memcpy_noalignment)
+
+hidden_def (__memcpy_noalignment)
diff --git a/sysdeps/unix/sysv/linux/riscv/Makefile b/sysdeps/unix/sysv/linux/riscv/Makefile
index 45cc29e40d..aa9ea443d6 100644
--- a/sysdeps/unix/sysv/linux/riscv/Makefile
+++ b/sysdeps/unix/sysv/linux/riscv/Makefile
@@ -7,6 +7,10 @@ ifeq ($(subdir),stdlib)
 gen-as-const-headers += ucontext_i.sym
 endif
 
+ifeq ($(subdir),string)
+sysdep_routines += memcpy memcpy-generic memcpy_noalignment
+endif
+
 abi-variants := ilp32 ilp32d lp64 lp64d
 
 ifeq (,$(filter $(default-abi),$(abi-variants)))
diff --git a/sysdeps/unix/sysv/linux/riscv/memcpy-generic.c b/sysdeps/unix/sysv/linux/riscv/memcpy-generic.c
new file mode 100644
index 0000000000..0abe03f7f5
--- /dev/null
+++ b/sysdeps/unix/sysv/linux/riscv/memcpy-generic.c
@@ -0,0 +1,24 @@
+/* Re-include the default memcpy implementation.
+   Copyright (C) 2023 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   <https://www.gnu.org/licenses/>.  */
+
+#include <string.h>
+
+extern __typeof (memcpy) __memcpy_generic;
+hidden_proto(__memcpy_generic)
+
+#include <string/memcpy.c>
-- 
2.25.1



* Re: [PATCH 0/2] RISC-V: ifunced memcpy using new kernel hwprobe interface
  2023-02-06 19:48 [PATCH 0/2] RISC-V: ifunced memcpy using new kernel hwprobe interface Evan Green
  2023-02-06 19:48 ` [PATCH 1/2] riscv: Add Linux hwprobe syscall support Evan Green
  2023-02-06 19:48 ` [PATCH 2/2] riscv: Add and use alignment-ignorant memcpy Evan Green
@ 2023-02-06 21:28 ` Richard Henderson
  2023-02-07 12:49   ` Adhemerval Zanella Netto
  2 siblings, 1 reply; 10+ messages in thread
From: Richard Henderson @ 2023-02-06 21:28 UTC (permalink / raw)
  To: Evan Green, libc-alpha; +Cc: slewis, vineetg, palmer

On 2/6/23 09:48, Evan Green wrote:
> 
> This series illustrates the use of a proposed Linux syscall that
> enumerates architectural information about the RISC-V cores the system
> is running on. In this series we expose a small wrapper function around
> the syscall. An ifunc selector for memcpy queries it to see if unaligned
> access is "fast" on this hardware. If it is, it selects a newly provided
> implementation of memcpy that doesn't work hard at aligning the src and
> destination buffers.
> 
> This is somewhat of a proof of concept for the syscall itself, but I do
> find that in my goofy  memcpy test [1], the unaligned memcpy performed at
> least as well as the generic C version. This is however on Qemu on an M1
> mac, so not a test of any real hardware (more a smoke test that the
> implementation isn't silly).
> 
> v1 of the Linux series can be found at [2]. I'm about to post v2 (but
> haven't yet!), I can reply here with the link once v2 is posted.
> 
> [1] https://pastebin.com/Nj8ixpkX
> [2] https://yhbt.net/lore/all/20221013163551.6775-1-palmer@rivosinc.com/

Re the syscall:

I question whether the heterogeneous cpu case is something that you really want to query.
In order to handle migration between such cpus, any such query must return the minimum
level of support.

Remove that possibility, and this becomes a simple array reference.  Now you need to 
decide whether a vdso call, or HWCAP2 as pointer to read-only data is more or less 
efficient or extensible.
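
To make the "minimum level, then array reference" idea concrete, here is
a small sketch. It assumes bitmask-valued keys, where the common level is
the AND of the per-cpu values; DEMO_KEY_COUNT and both function names are
hypothetical and do not come from the kernel series.

```c
#include <stddef.h>
#include <stdint.h>

enum { DEMO_KEY_COUNT = 4 };   /* Hypothetical number of keys.  */

/* Reduce the per-cpu values of one bitmask key to the bits that every
   cpu supports.  */
static uint64_t
common_value (const uint64_t *per_cpu, size_t cpu_count)
{
  uint64_t common = cpu_count ? per_cpu[0] : 0;

  for (size_t i = 1; i < cpu_count; i++)
    common &= per_cpu[i];
  return common;
}

/* With one system-wide table, a query is just an array reference.  */
static uint64_t
lookup (const uint64_t table[DEMO_KEY_COUNT], size_t key)
{
  return key < DEMO_KEY_COUNT ? table[key] : 0;
}
```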


r~


* Re: [PATCH 2/2] riscv: Add and use alignment-ignorant memcpy
  2023-02-06 19:48 ` [PATCH 2/2] riscv: Add and use alignment-ignorant memcpy Evan Green
@ 2023-02-06 22:05   ` Richard Henderson
  2023-02-09 21:04     ` Evan Green
  0 siblings, 1 reply; 10+ messages in thread
From: Richard Henderson @ 2023-02-06 22:05 UTC (permalink / raw)
  To: Evan Green, libc-alpha; +Cc: slewis, vineetg, palmer

On 2/6/23 09:48, Evan Green wrote:
> +	/* Remainder is smaller than a page, compute native word count */
> +	beqz a2, 6f
> +	andi a5, a2, ~(SZREG-1)
> +	andi a2, a2, (SZREG-1)
> +	add a3, a1, a5
> +	/* Jump directly to byte copy if no words. */
> +	beqz a5, 4f
> +
> +3:
> +	/* Use single native register copy */
> +	REG_L a4, 0(a1)
> +	addi a1, a1, SZREG
> +	REG_S a4, 0(t6)
> +	addi t6, t6, SZREG
> +	bltu a1, a3, 3b
> +
> +	/* Jump directly out if no more bytes */
> +	beqz a2, 6f
> +
> +4:
> +	/* Copy the last few individual bytes */
> +	add a3, a1, a2
> +5:
> +	lb a4, 0(a1)
> +	addi a1, a1, 1
> +	sb a4, 0(t6)
> +	addi t6, t6, 1
> +	bltu a1, a3, 5b
> +6:
> +	ret

If you know there are at least SZREG bytes in the range, you can avoid the byte loop by 
copying the last word unaligned.  That may copy some bytes twice, but that's ok too. 
Similarly, you can redundantly copy a few bytes at the beginning to align the destination 
(there's usually some cost for unaligned stores, even if it's generally "fast").

For memcpy < SZREG, you don't need a loop; just test the final few bits of len.
Have a look at the tricks in sysdeps/x86_64/multiarch/memmove-ssse3.S for ideas.
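
A C sketch of the two tail tricks, for illustration only: WORD stands in
for SZREG, fixed-size memcpy calls model unaligned loads and stores, and
check_overlap is a hypothetical helper for exercising the routine. The
destination-aligning prologue is left out.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define WORD sizeof (uintptr_t)   /* Plays the role of SZREG.  */

static void *
copy_overlap_tail (void *dst, const void *src, size_t n)
{
  unsigned char *d = dst;
  const unsigned char *s = src;

  if (n >= WORD)
    {
      uintptr_t w;
      size_t i;

      for (i = 0; i + WORD <= n; i += WORD)
        {
          memcpy (&w, s + i, WORD);
          memcpy (d + i, &w, WORD);
        }
      if (i < n)
        {
          /* The last word ends exactly at n, so it re-copies a few
             bytes that the loop already handled.  */
          memcpy (&w, s + n - WORD, WORD);
          memcpy (d + n - WORD, &w, WORD);
        }
      return dst;
    }

  /* n < WORD: no loop, one test per bit of the length.  */
  if (n & 4)
    {
      uint32_t v;
      memcpy (&v, s, 4);
      memcpy (d, &v, 4);
      s += 4;
      d += 4;
    }
  if (n & 2)
    {
      uint16_t v;
      memcpy (&v, s, 2);
      memcpy (d, &v, 2);
      s += 2;
      d += 2;
    }
  if (n & 1)
    *d = *s;
  return dst;
}

/* Hypothetical helper: returns 1 if an N-byte copy between misaligned
   offsets matches the source and leaves the neighbouring bytes alone.  */
static int
check_overlap (size_t n)
{
  unsigned char src[64], dst[64] = { 0 };

  if (n + 3 > sizeof src)
    return 0;
  for (size_t i = 0; i < sizeof src; i++)
    src[i] = (unsigned char) (i * 5 + 1);
  return copy_overlap_tail (dst + 1, src + 3, n) == dst + 1
         && memcmp (dst + 1, src + 3, n) == 0
         && dst[0] == 0 && dst[n + 1] == 0;
}
```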


r~


* Re: [PATCH 0/2] RISC-V: ifunced memcpy using new kernel hwprobe interface
  2023-02-06 21:28 ` [PATCH 0/2] RISC-V: ifunced memcpy using new kernel hwprobe interface Richard Henderson
@ 2023-02-07 12:49   ` Adhemerval Zanella Netto
  0 siblings, 0 replies; 10+ messages in thread
From: Adhemerval Zanella Netto @ 2023-02-07 12:49 UTC (permalink / raw)
  To: libc-alpha



On 06/02/23 18:28, Richard Henderson via Libc-alpha wrote:
> On 2/6/23 09:48, Evan Green wrote:
>>
>> This series illustrates the use of a proposed Linux syscall that
>> enumerates architectural information about the RISC-V cores the system
>> is running on. In this series we expose a small wrapper function around
>> the syscall. An ifunc selector for memcpy queries it to see if unaligned
>> access is "fast" on this hardware. If it is, it selects a newly provided
>> implementation of memcpy that doesn't work hard at aligning the src and
>> destination buffers.
>>
>> This is somewhat of a proof of concept for the syscall itself, but I do
>> find that in my goofy  memcpy test [1], the unaligned memcpy performed at
>> least as well as the generic C version. This is however on Qemu on an M1
>> mac, so not a test of any real hardware (more a smoke test that the
>> implementation isn't silly).
>>
>> v1 of the Linux series can be found at [2]. I'm about to post v2 (but
>> haven't yet!), I can reply here with the link once v2 is posted.
>>
>> [1] https://pastebin.com/Nj8ixpkX
>> [2] https://yhbt.net/lore/all/20221013163551.6775-1-palmer@rivosinc.com/
> 
> Re the syscall:
> 
> I question whether the heterogenous cpu case is something that you really want to query. In order to handle migration between such cpus, any such query must return the minimum level of support.
> 
> Remove that possibility, and this becomes a simple array reference.  Now you need to decide whether a vdso call, or HWCAP2 as pointer to read-only data is more or less efficient or extensible.

It should at least work if the kernel traps and emulates unaligned accesses or
any instruction not supported by the other code, although it would be really
subpar.  I would expect the kernel to report the minimum ISA as well.

I would also recommend caching the values, as we do for aarch64/x86/powerpc,
to avoid issuing multiple syscalls during symbol resolution (check cpu-features.c).
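
A minimal sketch of the caching idea, assuming a single-threaded lazy
initialization (glibc's cpu-features code is structured differently and
runs at startup). demo_probe and probe_calls are hypothetical: the former
simulates the syscall, the latter counts how often it is issued.

```c
/* Counts how many times the simulated syscall is made.  */
static int probe_calls;

/* Hypothetical stand-in for the real probe; returns a fixed mask.  */
static unsigned long
demo_probe (void)
{
  probe_calls++;
  return 0x3UL;
}

/* Query once, then serve every later ifunc selector from the cache.  */
static unsigned long
cached_cpu_features (void)
{
  static int initialized;
  static unsigned long features;

  if (!initialized)
    {
      features = demo_probe ();
      initialized = 1;
    }
  return features;
}
```

With several ifunced string routines, each selector would then read the
cached value instead of entering the kernel again.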


* Re: [PATCH 1/2] riscv: Add Linux hwprobe syscall support
  2023-02-06 19:48 ` [PATCH 1/2] riscv: Add Linux hwprobe syscall support Evan Green
@ 2023-02-07 13:05   ` Adhemerval Zanella Netto
  2023-02-09 20:55     ` Evan Green
  2023-02-12 16:58     ` Jeff Law
  0 siblings, 2 replies; 10+ messages in thread
From: Adhemerval Zanella Netto @ 2023-02-07 13:05 UTC (permalink / raw)
  To: Evan Green, libc-alpha; +Cc: slewis, vineetg, palmer



On 06/02/23 16:48, Evan Green wrote:
> Add awareness and a thin wrapper function around a new Linux system call
> that allows callers to get architecture and microarchitecture
> information about the CPUs from the kernel. This can be used to
> do things like dynamically choose a memcpy implementation.

Do you want to use this symbol in external projects, for instance to
implement __builtin_cpu_supports (as powerpc does)? If so, it has the
drawback that such a feature will only work on glibc.

> 
> Signed-off-by: Evan Green <evan@rivosinc.com>
> ---
> 
>  sysdeps/unix/sysv/linux/riscv/Makefile        |  4 +--
>  sysdeps/unix/sysv/linux/riscv/Versions        |  3 ++
>  sysdeps/unix/sysv/linux/riscv/hwprobe.c       | 30 ++++++++++++++++
>  .../unix/sysv/linux/riscv/rv32/arch-syscall.h |  1 +
>  .../unix/sysv/linux/riscv/rv32/libc.abilist   |  1 +
>  .../unix/sysv/linux/riscv/rv64/arch-syscall.h |  1 +
>  .../unix/sysv/linux/riscv/rv64/libc.abilist   |  1 +
>  sysdeps/unix/sysv/linux/riscv/sys/hwprobe.h   | 34 +++++++++++++++++++
>  sysdeps/unix/sysv/linux/syscall-names.list    |  1 +
>  9 files changed, 74 insertions(+), 2 deletions(-)
>  create mode 100644 sysdeps/unix/sysv/linux/riscv/hwprobe.c
>  create mode 100644 sysdeps/unix/sysv/linux/riscv/sys/hwprobe.h
> 
> diff --git a/sysdeps/unix/sysv/linux/riscv/Makefile b/sysdeps/unix/sysv/linux/riscv/Makefile
> index 4b6eacb32f..45cc29e40d 100644
> --- a/sysdeps/unix/sysv/linux/riscv/Makefile
> +++ b/sysdeps/unix/sysv/linux/riscv/Makefile
> @@ -1,6 +1,6 @@
>  ifeq ($(subdir),misc)
> -sysdep_headers += sys/cachectl.h
> -sysdep_routines += flush-icache
> +sysdep_headers += sys/cachectl.h sys/hwprobe.h
> +sysdep_routines += flush-icache hwprobe
>  endif
>  
>  ifeq ($(subdir),stdlib)
> diff --git a/sysdeps/unix/sysv/linux/riscv/Versions b/sysdeps/unix/sysv/linux/riscv/Versions
> index 5625d2a0b8..891ae05730 100644
> --- a/sysdeps/unix/sysv/linux/riscv/Versions
> +++ b/sysdeps/unix/sysv/linux/riscv/Versions
> @@ -8,4 +8,7 @@ libc {
>    GLIBC_2.27 {
>      __riscv_flush_icache;
>    }
> +  GLIBC_2.37 {
> +    __riscv_hwprobe;
> +  }
>  }
> diff --git a/sysdeps/unix/sysv/linux/riscv/hwprobe.c b/sysdeps/unix/sysv/linux/riscv/hwprobe.c
> new file mode 100644
> index 0000000000..ef6dccb9db
> --- /dev/null
> +++ b/sysdeps/unix/sysv/linux/riscv/hwprobe.c
> @@ -0,0 +1,30 @@
> +/* RISC-V hardware feature probing support on Linux
> +   Copyright (C) 2023 Free Software Foundation, Inc.
> +
> +   This file is part of the GNU C Library.
> +
> +   The GNU C Library is free software; you can redistribute it and/or
> +   modify it under the terms of the GNU Lesser General Public License as
> +   published by the Free Software Foundation; either version 2.1 of the
> +   License, or (at your option) any later version.
> +
> +   The GNU C Library is distributed in the hope that it will be useful,
> +   but WITHOUT ANY WARRANTY; without even the implied warranty of
> +   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
> +   Lesser General Public License for more details.
> +
> +   You should have received a copy of the GNU Lesser General Public
> +   License along with the GNU C Library; if not, see
> +   <https://www.gnu.org/licenses/>.  */
> +
> +#include <sys/syscall.h>
> +#include <asm/hwprobe.h>
> +#include <sysdep.h>
> +
> +int
> +__riscv_hwprobe (struct riscv_hwprobe *pairs, long pair_count,
> +  long cpu_count, unsigned long *cpus, unsigned long flags)
> +{
> +  return INLINE_SYSCALL (riscv_hwprobe, 5, pairs, pair_count,
> +                         cpu_count, cpus, flags);
> +}

Use INLINE_SYSCALL_CALL to avoid the need to specify the argument count.

> diff --git a/sysdeps/unix/sysv/linux/riscv/rv32/arch-syscall.h b/sysdeps/unix/sysv/linux/riscv/rv32/arch-syscall.h
> index 202520ee25..2416e041c8 100644
> --- a/sysdeps/unix/sysv/linux/riscv/rv32/arch-syscall.h
> +++ b/sysdeps/unix/sysv/linux/riscv/rv32/arch-syscall.h
> @@ -198,6 +198,7 @@
>  #define __NR_request_key 218
>  #define __NR_restart_syscall 128
>  #define __NR_riscv_flush_icache 259
> +#define __NR_riscv_hwprobe 258
>  #define __NR_rseq 293
>  #define __NR_rt_sigaction 134
>  #define __NR_rt_sigpending 136
> diff --git a/sysdeps/unix/sysv/linux/riscv/rv32/libc.abilist b/sysdeps/unix/sysv/linux/riscv/rv32/libc.abilist
> index ff90d1bff2..f4c391d3be 100644
> --- a/sysdeps/unix/sysv/linux/riscv/rv32/libc.abilist
> +++ b/sysdeps/unix/sysv/linux/riscv/rv32/libc.abilist
> @@ -2396,3 +2396,4 @@ GLIBC_2.36 pidfd_open F
>  GLIBC_2.36 pidfd_send_signal F
>  GLIBC_2.36 process_madvise F
>  GLIBC_2.36 process_mrelease F
> +GLIBC_2.37 __riscv_hwprobe F
> diff --git a/sysdeps/unix/sysv/linux/riscv/rv64/arch-syscall.h b/sysdeps/unix/sysv/linux/riscv/rv64/arch-syscall.h
> index 4e65f337d4..a32bc82f60 100644
> --- a/sysdeps/unix/sysv/linux/riscv/rv64/arch-syscall.h
> +++ b/sysdeps/unix/sysv/linux/riscv/rv64/arch-syscall.h
> @@ -205,6 +205,7 @@
>  #define __NR_request_key 218
>  #define __NR_restart_syscall 128
>  #define __NR_riscv_flush_icache 259
> +#define __NR_riscv_hwprobe 258
>  #define __NR_rseq 293
>  #define __NR_rt_sigaction 134
>  #define __NR_rt_sigpending 136
> diff --git a/sysdeps/unix/sysv/linux/riscv/rv64/libc.abilist b/sysdeps/unix/sysv/linux/riscv/rv64/libc.abilist
> index f1017f6ec5..0f57bbe9e1 100644
> --- a/sysdeps/unix/sysv/linux/riscv/rv64/libc.abilist
> +++ b/sysdeps/unix/sysv/linux/riscv/rv64/libc.abilist
> @@ -2596,3 +2596,4 @@ GLIBC_2.36 pidfd_open F
>  GLIBC_2.36 pidfd_send_signal F
>  GLIBC_2.36 process_madvise F
>  GLIBC_2.36 process_mrelease F
> +GLIBC_2.37 __riscv_hwprobe F
> diff --git a/sysdeps/unix/sysv/linux/riscv/sys/hwprobe.h b/sysdeps/unix/sysv/linux/riscv/sys/hwprobe.h
> new file mode 100644
> index 0000000000..da8cdc90bf
> --- /dev/null
> +++ b/sysdeps/unix/sysv/linux/riscv/sys/hwprobe.h
> @@ -0,0 +1,34 @@
> +/* RISC-V architecture probe interface
> +   Copyright (C) 2023 Free Software Foundation, Inc.
> +
> +   This file is part of the GNU C Library.
> +
> +   The GNU C Library is free software; you can redistribute it and/or
> +   modify it under the terms of the GNU Lesser General Public
> +   License as published by the Free Software Foundation; either
> +   version 2.1 of the License, or (at your option) any later version.
> +
> +   The GNU C Library is distributed in the hope that it will be useful,
> +   but WITHOUT ANY WARRANTY; without even the implied warranty of
> +   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
> +   Lesser General Public License for more details.
> +
> +   You should have received a copy of the GNU Lesser General Public
> +   License along with the GNU C Library.  If not, see
> +   <https://www.gnu.org/licenses/>.  */
> +
> +#ifndef _SYS_HWPROBE_H
> +#define _SYS_HWPROBE_H 1
> +
> +#include <features.h>
> +#include <asm/hwprobe.h>

Do we really want to tie its usage to the supplied kernel headers? It means that
if the kernel headers are too old, a program that includes sys/hwprobe.h will
just fail to compile, which is not ideal.

For user-exported headers, the usual way is to use __has_include and redefine the
required fields (check sysdeps/unix/sysv/linux/bits/unistd_ext.h or grep for
__has_include).  It has the drawback of requiring the kernel definitions to be
duplicated in the exported header and kept in sync with the kernel.
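
A rough sketch of that pattern, for illustration only. It assumes the pair
is a signed 64-bit key followed by an unsigned 64-bit value (this thread
does not show the struct definition), and HWPROBE_FROM_KERNEL_HEADERS is a
hypothetical guard name:

```c
#include <stdint.h>

/* Prefer the kernel's definition when the installed headers provide it.
   The check is nested because a compiler without __has_include cannot
   parse '__has_include (...)' inside an #if expression.  */
#if defined __has_include
# if __has_include (<asm/hwprobe.h>)
#  include <asm/hwprobe.h>
#  define HWPROBE_FROM_KERNEL_HEADERS 1
# endif
#endif

#ifndef HWPROBE_FROM_KERNEL_HEADERS
/* Fallback copy for older kernel headers.  Assumed layout; it has to be
   kept in sync with the kernel's definition by hand.  */
struct riscv_hwprobe
{
  int64_t key;
  uint64_t value;
};
#endif
```

Either way, a program including the header gets a usable struct
riscv_hwprobe regardless of the age of the installed kernel headers.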

> +
> +__BEGIN_DECLS
> +
> +int
> +__riscv_hwprobe (struct riscv_hwprobe *pairs, long pair_count,
> +  long cpu_count, unsigned long *cpus, unsigned long flags);
> +
> +__END_DECLS
> +
> +#endif /* sys/hwprobe.h */


I think we should also only export it as a GNU extension (__USE_GNU).

> diff --git a/sysdeps/unix/sysv/linux/syscall-names.list b/sysdeps/unix/sysv/linux/syscall-names.list
> index 822498d3e3..4f4a62e91c 100644
> --- a/sysdeps/unix/sysv/linux/syscall-names.list
> +++ b/sysdeps/unix/sysv/linux/syscall-names.list
> @@ -477,6 +477,7 @@ renameat2
>  request_key
>  restart_syscall
>  riscv_flush_icache
> +riscv_hwprobe
>  rmdir
>  rseq
>  rt_sigaction


* Re: [PATCH 1/2] riscv: Add Linux hwprobe syscall support
  2023-02-07 13:05   ` Adhemerval Zanella Netto
@ 2023-02-09 20:55     ` Evan Green
  2023-02-12 16:58     ` Jeff Law
  1 sibling, 0 replies; 10+ messages in thread
From: Evan Green @ 2023-02-09 20:55 UTC (permalink / raw)
  To: Adhemerval Zanella Netto; +Cc: libc-alpha, slewis, vineetg, palmer

On Tue, Feb 7, 2023 at 5:05 AM Adhemerval Zanella Netto
<adhemerval.zanella@linaro.org> wrote:
>
>
>
> On 06/02/23 16:48, Evan Green wrote:
> > Add awareness and a thin wrapper function around a new Linux system call
> > that allows callers to get architecture and microarchitecture
> > information about the CPUs from the kernel. This can be used to
> > do things like dynamically choose a memcpy implementation.
>
> Do you want to use this symbol on external projects, for instance to
> implement builtin_cpu_supports (as powerpc does)? If so, it has the
> drawback that such feature will only work on glibc.

Yes, that's the plan. Most uses will probably be internal, but for the
relatively small set of projects that are trying to decide between
various hand-rolled assembly routines (video/compression/?), I think
this interface could help. I think being glibc-specific is probably an
acceptable limitation. If they're dying to be libc-agnostic, they can
always code the syscall themselves.
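
For illustration, a libc-agnostic caller might look roughly like the
sketch below. The name raw_riscv_hwprobe and the struct hwprobe_pair
layout (signed 64-bit key, unsigned 64-bit value) are assumptions made for
the example; the syscall number 258 is the one this patch adds to
arch-syscall.h.

```c
#ifndef _GNU_SOURCE
# define _GNU_SOURCE 1
#endif
#include <errno.h>
#include <stdint.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Assumed key/value pair layout; hypothetical name for this sketch.  */
struct hwprobe_pair
{
  int64_t key;
  uint64_t value;
};

/* Hypothetical helper: invoke the syscall without any libc wrapper.
   Returns 0 on success, or -1 with errno set on failure.  */
static long
raw_riscv_hwprobe (struct hwprobe_pair *pairs, long pair_count,
                   long cpu_count, unsigned long *cpus,
                   unsigned long flags)
{
#if defined __riscv && defined __linux__
# ifndef __NR_riscv_hwprobe
#  define __NR_riscv_hwprobe 258  /* Number proposed in this series.  */
# endif
  return syscall (__NR_riscv_hwprobe, pairs, pair_count, cpu_count,
                  cpus, flags);
#else
  /* Not a RISC-V Linux build: report the syscall as unavailable.  */
  (void) pairs;
  (void) pair_count;
  (void) cpu_count;
  (void) cpus;
  (void) flags;
  errno = ENOSYS;
  return -1;
#endif
}
```

A caller would still need to treat failure (for example ENOSYS on an older
kernel) as "nothing known" and fall back to a conservative routine.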

>
> >
> > Signed-off-by: Evan Green <evan@rivosinc.com>
> > ---
> >
> >  sysdeps/unix/sysv/linux/riscv/Makefile        |  4 +--
> >  sysdeps/unix/sysv/linux/riscv/Versions        |  3 ++
> >  sysdeps/unix/sysv/linux/riscv/hwprobe.c       | 30 ++++++++++++++++
> >  .../unix/sysv/linux/riscv/rv32/arch-syscall.h |  1 +
> >  .../unix/sysv/linux/riscv/rv32/libc.abilist   |  1 +
> >  .../unix/sysv/linux/riscv/rv64/arch-syscall.h |  1 +
> >  .../unix/sysv/linux/riscv/rv64/libc.abilist   |  1 +
> >  sysdeps/unix/sysv/linux/riscv/sys/hwprobe.h   | 34 +++++++++++++++++++
> >  sysdeps/unix/sysv/linux/syscall-names.list    |  1 +
> >  9 files changed, 74 insertions(+), 2 deletions(-)
> >  create mode 100644 sysdeps/unix/sysv/linux/riscv/hwprobe.c
> >  create mode 100644 sysdeps/unix/sysv/linux/riscv/sys/hwprobe.h
> >
> > diff --git a/sysdeps/unix/sysv/linux/riscv/Makefile b/sysdeps/unix/sysv/linux/riscv/Makefile
> > index 4b6eacb32f..45cc29e40d 100644
> > --- a/sysdeps/unix/sysv/linux/riscv/Makefile
> > +++ b/sysdeps/unix/sysv/linux/riscv/Makefile
> > @@ -1,6 +1,6 @@
> >  ifeq ($(subdir),misc)
> > -sysdep_headers += sys/cachectl.h
> > -sysdep_routines += flush-icache
> > +sysdep_headers += sys/cachectl.h sys/hwprobe.h
> > +sysdep_routines += flush-icache hwprobe
> >  endif
> >
> >  ifeq ($(subdir),stdlib)
> > diff --git a/sysdeps/unix/sysv/linux/riscv/Versions b/sysdeps/unix/sysv/linux/riscv/Versions
> > index 5625d2a0b8..891ae05730 100644
> > --- a/sysdeps/unix/sysv/linux/riscv/Versions
> > +++ b/sysdeps/unix/sysv/linux/riscv/Versions
> > @@ -8,4 +8,7 @@ libc {
> >    GLIBC_2.27 {
> >      __riscv_flush_icache;
> >    }
> > +  GLIBC_2.37 {
> > +    __riscv_hwprobe;
> > +  }
> >  }
> > diff --git a/sysdeps/unix/sysv/linux/riscv/hwprobe.c b/sysdeps/unix/sysv/linux/riscv/hwprobe.c
> > new file mode 100644
> > index 0000000000..ef6dccb9db
> > --- /dev/null
> > +++ b/sysdeps/unix/sysv/linux/riscv/hwprobe.c
> > @@ -0,0 +1,30 @@
> > +/* RISC-V hardware feature probing support on Linux
> > +   Copyright (C) 2023 Free Software Foundation, Inc.
> > +
> > +   This file is part of the GNU C Library.
> > +
> > +   The GNU C Library is free software; you can redistribute it and/or
> > +   modify it under the terms of the GNU Lesser General Public License as
> > +   published by the Free Software Foundation; either version 2.1 of the
> > +   License, or (at your option) any later version.
> > +
> > +   The GNU C Library is distributed in the hope that it will be useful,
> > +   but WITHOUT ANY WARRANTY; without even the implied warranty of
> > +   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
> > +   Lesser General Public License for more details.
> > +
> > +   You should have received a copy of the GNU Lesser General Public
> > +   License along with the GNU C Library; if not, see
> > +   <https://www.gnu.org/licenses/>.  */
> > +
> > +#include <sys/syscall.h>
> > +#include <asm/hwprobe.h>
> > +#include <sysdep.h>
> > +
> > +int
> > +__riscv_hwprobe (struct riscv_hwprobe *pairs, long pair_count,
> > +  long cpu_count, unsigned long *cpus, unsigned long flags)
> > +{
> > +  return INLINE_SYSCALL (riscv_hwprobe, 5, pairs, pair_count,
> > +                         cpu_count, cpus, flags);
> > +}
>
> Use INLINE_SYSCALL_CALL to avoid the need to specify the argument count.
>
> > diff --git a/sysdeps/unix/sysv/linux/riscv/rv32/arch-syscall.h b/sysdeps/unix/sysv/linux/riscv/rv32/arch-syscall.h
> > index 202520ee25..2416e041c8 100644
> > --- a/sysdeps/unix/sysv/linux/riscv/rv32/arch-syscall.h
> > +++ b/sysdeps/unix/sysv/linux/riscv/rv32/arch-syscall.h
> > @@ -198,6 +198,7 @@
> >  #define __NR_request_key 218
> >  #define __NR_restart_syscall 128
> >  #define __NR_riscv_flush_icache 259
> > +#define __NR_riscv_hwprobe 258
> >  #define __NR_rseq 293
> >  #define __NR_rt_sigaction 134
> >  #define __NR_rt_sigpending 136
> > diff --git a/sysdeps/unix/sysv/linux/riscv/rv32/libc.abilist b/sysdeps/unix/sysv/linux/riscv/rv32/libc.abilist
> > index ff90d1bff2..f4c391d3be 100644
> > --- a/sysdeps/unix/sysv/linux/riscv/rv32/libc.abilist
> > +++ b/sysdeps/unix/sysv/linux/riscv/rv32/libc.abilist
> > @@ -2396,3 +2396,4 @@ GLIBC_2.36 pidfd_open F
> >  GLIBC_2.36 pidfd_send_signal F
> >  GLIBC_2.36 process_madvise F
> >  GLIBC_2.36 process_mrelease F
> > +GLIBC_2.37 __riscv_hwprobe F
> > diff --git a/sysdeps/unix/sysv/linux/riscv/rv64/arch-syscall.h b/sysdeps/unix/sysv/linux/riscv/rv64/arch-syscall.h
> > index 4e65f337d4..a32bc82f60 100644
> > --- a/sysdeps/unix/sysv/linux/riscv/rv64/arch-syscall.h
> > +++ b/sysdeps/unix/sysv/linux/riscv/rv64/arch-syscall.h
> > @@ -205,6 +205,7 @@
> >  #define __NR_request_key 218
> >  #define __NR_restart_syscall 128
> >  #define __NR_riscv_flush_icache 259
> > +#define __NR_riscv_hwprobe 258
> >  #define __NR_rseq 293
> >  #define __NR_rt_sigaction 134
> >  #define __NR_rt_sigpending 136
> > diff --git a/sysdeps/unix/sysv/linux/riscv/rv64/libc.abilist b/sysdeps/unix/sysv/linux/riscv/rv64/libc.abilist
> > index f1017f6ec5..0f57bbe9e1 100644
> > --- a/sysdeps/unix/sysv/linux/riscv/rv64/libc.abilist
> > +++ b/sysdeps/unix/sysv/linux/riscv/rv64/libc.abilist
> > @@ -2596,3 +2596,4 @@ GLIBC_2.36 pidfd_open F
> >  GLIBC_2.36 pidfd_send_signal F
> >  GLIBC_2.36 process_madvise F
> >  GLIBC_2.36 process_mrelease F
> > +GLIBC_2.37 __riscv_hwprobe F
> > diff --git a/sysdeps/unix/sysv/linux/riscv/sys/hwprobe.h b/sysdeps/unix/sysv/linux/riscv/sys/hwprobe.h
> > new file mode 100644
> > index 0000000000..da8cdc90bf
> > --- /dev/null
> > +++ b/sysdeps/unix/sysv/linux/riscv/sys/hwprobe.h
> > @@ -0,0 +1,34 @@
> > +/* RISC-V architecture probe interface
> > +   Copyright (C) 2023 Free Software Foundation, Inc.
> > +
> > +   This file is part of the GNU C Library.
> > +
> > +   The GNU C Library is free software; you can redistribute it and/or
> > +   modify it under the terms of the GNU Lesser General Public
> > +   License as published by the Free Software Foundation; either
> > +   version 2.1 of the License, or (at your option) any later version.
> > +
> > +   The GNU C Library is distributed in the hope that it will be useful,
> > +   but WITHOUT ANY WARRANTY; without even the implied warranty of
> > +   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
> > +   Lesser General Public License for more details.
> > +
> > +   You should have received a copy of the GNU Lesser General Public
> > +   License along with the GNU C Library.  If not, see
> > +   <https://www.gnu.org/licenses/>.  */
> > +
> > +#ifndef _SYS_HWPROBE_H
> > +#define _SYS_HWPROBE_H 1
> > +
> > +#include <features.h>
> > +#include <asm/hwprobe.h>
>
> Do we really want to tie its usage to the supplied kernel headers? It means that
> if the kernel headers are too old, a program that includes sys/hwprobe.h will
> just fail to compile, which is not ideal.
>
> For user-exported headers, the usual way is to use __has_include and redefine the
> required fields (check sysdeps/unix/sysv/linux/bits/unistd_ext.h or grep for
> __has_include).  It has the drawback of requiring the kernel definitions to be
> duplicated in the exported header and kept in sync with the kernel.
>

Sure, I can do it the __has_include way. I'll address the other comments
in here as well. Thanks for the review!

> > +
> > +__BEGIN_DECLS
> > +
> > +int
> > +__riscv_hwprobe (struct riscv_hwprobe *pairs, long pair_count,
> > +  long cpu_count, unsigned long *cpus, unsigned long flags);
> > +
> > +__END_DECLS
> > +
> > +#endif /* sys/hwprobe.h */
>
>
> I think we should also only export it as a GNU extension (__USE_GNU).
>
> > diff --git a/sysdeps/unix/sysv/linux/syscall-names.list b/sysdeps/unix/sysv/linux/syscall-names.list
> > index 822498d3e3..4f4a62e91c 100644
> > --- a/sysdeps/unix/sysv/linux/syscall-names.list
> > +++ b/sysdeps/unix/sysv/linux/syscall-names.list
> > @@ -477,6 +477,7 @@ renameat2
> >  request_key
> >  restart_syscall
> >  riscv_flush_icache
> > +riscv_hwprobe
> >  rmdir
> >  rseq
> >  rt_sigaction


* Re: [PATCH 2/2] riscv: Add and use alignment-ignorant memcpy
  2023-02-06 22:05   ` Richard Henderson
@ 2023-02-09 21:04     ` Evan Green
  0 siblings, 0 replies; 10+ messages in thread
From: Evan Green @ 2023-02-09 21:04 UTC (permalink / raw)
  To: Richard Henderson; +Cc: libc-alpha, slewis, vineetg, palmer

On Mon, Feb 6, 2023 at 2:06 PM Richard Henderson
<richard.henderson@linaro.org> wrote:
>
> On 2/6/23 09:48, Evan Green wrote:
> > +     /* Remainder is smaller than a page, compute native word count */
> > +     beqz a2, 6f
> > +     andi a5, a2, ~(SZREG-1)
> > +     andi a2, a2, (SZREG-1)
> > +     add a3, a1, a5
> > +     /* Jump directly to byte copy if no words. */
> > +     beqz a5, 4f
> > +
> > +3:
> > +     /* Use single native register copy */
> > +     REG_L a4, 0(a1)
> > +     addi a1, a1, SZREG
> > +     REG_S a4, 0(t6)
> > +     addi t6, t6, SZREG
> > +     bltu a1, a3, 3b
> > +
> > +     /* Jump directly out if no more bytes */
> > +     beqz a2, 6f
> > +
> > +4:
> > +     /* Copy the last few individual bytes */
> > +     add a3, a1, a2
> > +5:
> > +     lb a4, 0(a1)
> > +     addi a1, a1, 1
> > +     sb a4, 0(t6)
> > +     addi t6, t6, 1
> > +     bltu a1, a3, 5b
> > +6:
> > +     ret
>
> If you know there are at least SZREG bytes in the range, you can avoid the byte loop by
> copying the last word unaligned.  That may copy some bytes twice, but that's ok too.
> Similarly, you can redundantly copy a few bytes at the beginning to align the destination
> (there's usually some cost for unaligned stores, even if it's generally "fast").
>
> For memcpy < SZREG, you don't need a loop; just test the final few bits of len.
> Have a look at the tricks in sysdeps/x86_64/multiarch/memmove-ssse3.S for ideas.

Thanks! I haven't gone too deeply into fine-tuning this routine. I think
you're right that there are probably tweaks to be made for optimal gains.
These are good suggestions, though I might save them for a subsequent
patch.
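
For reference, the two suggestions can be sketched in C as below. This is
only an illustration of the idea, not the assembly from the patch: word_t
stands in for an SZREG-sized register, copy_without_byte_loop is a
hypothetical name, and the fixed-size memcpy calls stand in for single
unaligned loads and stores.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Stand-in for one native register (SZREG bytes in the assembly).  */
typedef unsigned long word_t;

/* Copy n bytes from src to dst.  The buffers must not overlap.  */
static void *
copy_without_byte_loop (void *dst, const void *src, size_t n)
{
  unsigned char *d = dst;
  const unsigned char *s = src;

  if (n >= sizeof (word_t))
    {
      word_t w;
      size_t i;

      /* Copy whole words, stopping before the final one.  */
      for (i = 0; i + sizeof w < n; i += sizeof w)
        {
          memcpy (&w, s + i, sizeof w);
          memcpy (d + i, &w, sizeof w);
        }
      /* Copy the last word so that it ends exactly at n.  It may overlap
         bytes already copied, which is harmless.  */
      memcpy (&w, s + n - sizeof w, sizeof w);
      memcpy (d + n - sizeof w, &w, sizeof w);
      return dst;
    }

  /* Fewer bytes than one word: test the low bits of n, no loop.  */
  if (n & 4)
    {
      uint32_t t;
      memcpy (&t, s, 4);
      memcpy (d, &t, 4);
      s += 4;
      d += 4;
    }
  if (n & 2)
    {
      uint16_t t;
      memcpy (&t, s, 2);
      memcpy (d, &t, 2);
      s += 2;
      d += 2;
    }
  if (n & 1)
    *d = *s;
  return dst;
}
```

The overlapping final word only makes sense where unaligned access is
fast, which is the case this routine is selected for.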
-Evan


* Re: [PATCH 1/2] riscv: Add Linux hwprobe syscall support
  2023-02-07 13:05   ` Adhemerval Zanella Netto
  2023-02-09 20:55     ` Evan Green
@ 2023-02-12 16:58     ` Jeff Law
  1 sibling, 0 replies; 10+ messages in thread
From: Jeff Law @ 2023-02-12 16:58 UTC (permalink / raw)
  To: Adhemerval Zanella Netto, Evan Green, libc-alpha; +Cc: slewis, vineetg, palmer



On 2/7/23 06:05, Adhemerval Zanella Netto via Libc-alpha wrote:
> 
> 
> On 06/02/23 16:48, Evan Green wrote:
>> Add awareness and a thin wrapper function around a new Linux system call
>> that allows callers to get architecture and microarchitecture
>> information about the CPUs from the kernel. This can be used to
>> do things like dynamically choose a memcpy implementation.
> 
> Do you want to use this symbol on external projects, for instance to
> implement builtin_cpu_supports (as powerpc does)? If so, it has the
> drawback that such feature will only work on glibc.

What's being built really needs to work outside of glibc as well;
consider GCC's function multi-versioning, for example.
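
One libc-independent shape for that kind of dispatch is a plain function
pointer chosen once from the probe result. A minimal sketch, assuming a
hypothetical three-state summary of what was probed; the enum,
select_memcpy, and both stand-in routines are illustrative and are not
part of the patch or of GCC's multi-versioning machinery:

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical summary of what a probe said about misaligned access.  */
enum misaligned_perf
{
  MISALIGNED_UNKNOWN,
  MISALIGNED_SLOW,
  MISALIGNED_FAST
};

typedef void *(*memcpy_fn) (void *, const void *, size_t);

/* Stand-in for the generic, alignment-aware implementation.  */
static void *
memcpy_generic (void *dst, const void *src, size_t n)
{
  return memcpy (dst, src, n);
}

/* Stand-in for the alignment-ignorant implementation.  A byte loop keeps
   the sketch short; the real routine would copy whole registers.  */
static void *
memcpy_noalignment (void *dst, const void *src, size_t n)
{
  unsigned char *d = dst;
  const unsigned char *s = src;
  for (size_t i = 0; i < n; i++)
    d[i] = s[i];
  return dst;
}

/* Pick the alignment-ignorant routine only when the probe positively
   reported fast misaligned access; anything else stays generic.  */
static memcpy_fn
select_memcpy (enum misaligned_perf perf)
{
  return perf == MISALIGNED_FAST ? memcpy_noalignment : memcpy_generic;
}
```

Defaulting to the generic routine means a failed or missing probe never
selects code that depends on fast unaligned access.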

Jeff


Thread overview: 10+ messages
2023-02-06 19:48 [PATCH 0/2] RISC-V: ifunced memcpy using new kernel hwprobe interface Evan Green
2023-02-06 19:48 ` [PATCH 1/2] riscv: Add Linux hwprobe syscall support Evan Green
2023-02-07 13:05   ` Adhemerval Zanella Netto
2023-02-09 20:55     ` Evan Green
2023-02-12 16:58     ` Jeff Law
2023-02-06 19:48 ` [PATCH 2/2] riscv: Add and use alignment-ignorant memcpy Evan Green
2023-02-06 22:05   ` Richard Henderson
2023-02-09 21:04     ` Evan Green
2023-02-06 21:28 ` [PATCH 0/2] RISC-V: ifunced memcpy using new kernel hwprobe interface Richard Henderson
2023-02-07 12:49   ` Adhemerval Zanella Netto
