public inbox for libc-alpha@sourceware.org
 help / color / mirror / Atom feed
From: "Stefan O'Rear" <sorear@fastmail.com>
To: "Evan Green" <evan@rivosinc.com>,
	"Stefan O'Rear via Libc-alpha" <libc-alpha@sourceware.org>
Cc: "Palmer Dabbelt" <palmer@rivosinc.com>,
	slewis@rivosinc.com, "Vineet Gupta" <vineetg@rivosinc.com>,
	"Florian Weimer" <fweimer@redhat.com>
Subject: Re: [PATCH v4 3/3] riscv: Add and use alignment-ignorant memcpy
Date: Fri, 07 Jul 2023 22:16:43 -0400	[thread overview]
Message-ID: <ab3d5a2e-f800-4736-bc8b-8869bdc59cf0@app.fastmail.com> (raw)
In-Reply-To: <20230706192947.1566767-4-evan@rivosinc.com>



On Thu, Jul 6, 2023, at 3:29 PM, Evan Green wrote:
> For CPU implementations that can perform unaligned accesses with little
> or no performance penalty, create a memcpy implementation that does not
> bother aligning buffers. It will use a block of integer registers, a
> single integer register, and fall back to bytewise copy for the
> remainder.
>
> Signed-off-by: Evan Green <evan@rivosinc.com>
> Reviewed-by: Palmer Dabbelt <palmer@rivosinc.com>
>
> ---
>
> Changes in v4:
>  - Fixed comment style (Florian)
>
> Changes in v3:
>  - Word align dest for large memcpy()s.
>  - Add tags
>  - Remove spurious blank line from sysdeps/riscv/memcpy.c
>
> Changes in v2:
>  - Used _MASK instead of _FAST value itself.
>
>
> ---
>  sysdeps/riscv/memcopy.h                       |  26 ++++
>  sysdeps/riscv/memcpy.c                        |  64 +++++++++
>  sysdeps/riscv/memcpy_noalignment.S            | 121 ++++++++++++++++++
>  sysdeps/unix/sysv/linux/riscv/Makefile        |   4 +
>  .../unix/sysv/linux/riscv/memcpy-generic.c    |  24 ++++
>  5 files changed, 239 insertions(+)
>  create mode 100644 sysdeps/riscv/memcopy.h
>  create mode 100644 sysdeps/riscv/memcpy.c
>  create mode 100644 sysdeps/riscv/memcpy_noalignment.S
>  create mode 100644 sysdeps/unix/sysv/linux/riscv/memcpy-generic.c
>
> diff --git a/sysdeps/riscv/memcopy.h b/sysdeps/riscv/memcopy.h
> new file mode 100644
> index 0000000000..2b685c8aa0
> --- /dev/null
> +++ b/sysdeps/riscv/memcopy.h
> @@ -0,0 +1,26 @@
> +/* memcopy.h -- definitions for memory copy functions. RISC-V version.
> +   Copyright (C) 2023 Free Software Foundation, Inc.
> +   This file is part of the GNU C Library.
> +
> +   The GNU C Library is free software; you can redistribute it and/or
> +   modify it under the terms of the GNU Lesser General Public
> +   License as published by the Free Software Foundation; either
> +   version 2.1 of the License, or (at your option) any later version.
> +
> +   The GNU C Library is distributed in the hope that it will be useful,
> +   but WITHOUT ANY WARRANTY; without even the implied warranty of
> +   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
> +   Lesser General Public License for more details.
> +
> +   You should have received a copy of the GNU Lesser General Public
> +   License along with the GNU C Library; if not, see
> +   <https://www.gnu.org/licenses/>.  */
> +
> +#include <sysdeps/generic/memcopy.h>
> +
> +/* Redefine the generic memcpy implementation to __memcpy_generic, so
> +   the memcpy ifunc can select between generic and special versions.
> +   In rtld, don't bother with all the ifunciness. */
> +#if IS_IN (libc)
> +#define MEMCPY __memcpy_generic
> +#endif
> diff --git a/sysdeps/riscv/memcpy.c b/sysdeps/riscv/memcpy.c
> new file mode 100644
> index 0000000000..fdb8dc3208
> --- /dev/null
> +++ b/sysdeps/riscv/memcpy.c
> @@ -0,0 +1,64 @@
> +/* Multiple versions of memcpy.
> +   All versions must be listed in ifunc-impl-list.c.
> +   Copyright (C) 2017-2023 Free Software Foundation, Inc.
> +   This file is part of the GNU C Library.
> +
> +   The GNU C Library is free software; you can redistribute it and/or
> +   modify it under the terms of the GNU Lesser General Public
> +   License as published by the Free Software Foundation; either
> +   version 2.1 of the License, or (at your option) any later version.
> +
> +   The GNU C Library is distributed in the hope that it will be useful,
> +   but WITHOUT ANY WARRANTY; without even the implied warranty of
> +   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
> +   Lesser General Public License for more details.
> +
> +   You should have received a copy of the GNU Lesser General Public
> +   License along with the GNU C Library; if not, see
> +   <https://www.gnu.org/licenses/>.  */
> +
> +#if IS_IN (libc)
> +/* Redefine memcpy so that the compiler won't complain about the type
> +   mismatch with the IFUNC selector in strong_alias, below.  */
> +# undef memcpy
> +# define memcpy __redirect_memcpy
> +# include <string.h>
> +#include <ifunc-init.h>
> +#include <sys/hwprobe.h>
> +
> +#define INIT_ARCH()
> +
> +extern __typeof (__redirect_memcpy) __libc_memcpy;
> +
> +extern __typeof (__redirect_memcpy) __memcpy_generic attribute_hidden;
> +extern __typeof (__redirect_memcpy) __memcpy_noalignment 
> attribute_hidden;
> +
> +static inline __typeof (__redirect_memcpy) *
> +select_memcpy_ifunc (void)
> +{
> +  INIT_ARCH ();
> +
> +  struct riscv_hwprobe pair;
> +
> +  pair.key = RISCV_HWPROBE_KEY_CPUPERF_0;
> +  if (__riscv_hwprobe(&pair, 1, 0, NULL, 0) != 0)
> +    return __memcpy_generic;
> +
> +  if ((pair.key > 0) &&
> +      (pair.value & RISCV_HWPROBE_MISALIGNED_MASK) ==
> +       RISCV_HWPROBE_MISALIGNED_FAST)
> +    return __memcpy_noalignment;

It's unclear whether this is semantically correct as a use of
__riscv_hwprobe.  [1] describes the result of hwprobe as "what's possible
to enable", leaving open the possibility that additional system calls are
needed to determine whether unaligned accesses are supported right now in
the current process, and [2] adds an (inherited, IIUC) prctl for
unaligned access which doesn't affect the return value of hwprobe and
would break this code as written.

(There is nothing in either the privileged spec or the SBI spec to
prohibit an implementation which provides FAST unaligned access from
supporting an optional strict alignment checking mode and making it
available through fw_feature.)

-s

[1]: https://lore.kernel.org/linux-riscv/mhng-97928779-5d76-4390-a84c-398fdc6a0a4f@palmer-ri-x1c9/
[2]: https://lore.kernel.org/linux-riscv/20230624122049.7886-6-cleger@rivosinc.com/

> +
> +  return __memcpy_generic;
> +}
> +
> +libc_ifunc (__libc_memcpy, select_memcpy_ifunc ());
> +
> +# undef memcpy
> +strong_alias (__libc_memcpy, memcpy);
> +# ifdef SHARED
> +__hidden_ver1 (memcpy, __GI_memcpy, __redirect_memcpy)
> +  __attribute__ ((visibility ("hidden"))) __attribute_copy__ (memcpy);
> +# endif
> +
> +#endif
> diff --git a/sysdeps/riscv/memcpy_noalignment.S 
> b/sysdeps/riscv/memcpy_noalignment.S
> new file mode 100644
> index 0000000000..80f5e09ebb
> --- /dev/null
> +++ b/sysdeps/riscv/memcpy_noalignment.S
> @@ -0,0 +1,121 @@
> +/* memcpy for RISC-V, ignoring buffer alignment
> +   Copyright (C) 2023 Free Software Foundation, Inc.
> +   This file is part of the GNU C Library.
> +
> +   The GNU C Library is free software; you can redistribute it and/or
> +   modify it under the terms of the GNU Lesser General Public
> +   License as published by the Free Software Foundation; either
> +   version 2.1 of the License, or (at your option) any later version.
> +
> +   The GNU C Library is distributed in the hope that it will be useful,
> +   but WITHOUT ANY WARRANTY; without even the implied warranty of
> +   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
> +   Lesser General Public License for more details.
> +
> +   You should have received a copy of the GNU Lesser General Public
> +   License along with the GNU C Library.  If not, see
> +   <https://www.gnu.org/licenses/>.  */
> +
> +#include <sysdep.h>
> +#include <sys/asm.h>
> +
> +/* void *memcpy(void *, const void *, size_t) */
> +ENTRY (__memcpy_noalignment)
> +	move t6, a0  /* Preserve return value */
> +
> +	/* Round down to the nearest "page" size */
> +	andi a4, a2, ~((16*SZREG)-1)
> +	beqz a4, 2f
> +	add a3, a1, a4
> +
> +	/* Copy the first word to get dest word aligned */
> +	andi a5, t6, SZREG-1
> +	beqz a5, 1f
> +	REG_L a6, (a1)
> +	REG_S a6, (t6)
> +
> +	/* Align dst up to a word, move src and size as well. */
> +	addi t6, t6, SZREG-1
> +	andi t6, t6, ~(SZREG-1)
> +	sub a5, t6, a0
> +	add a1, a1, a5
> +	sub a2, a2, a5
> +
> +	/* Recompute page count */
> +	andi a4, a2, ~((16*SZREG)-1)
> +	beqz a4, 2f
> +
> +1:
> +	/* Copy "pages" (chunks of 16 registers) */
> +	REG_L a4,       0(a1)
> +	REG_L a5,   SZREG(a1)
> +	REG_L a6, 2*SZREG(a1)
> +	REG_L a7, 3*SZREG(a1)
> +	REG_L t0, 4*SZREG(a1)
> +	REG_L t1, 5*SZREG(a1)
> +	REG_L t2, 6*SZREG(a1)
> +	REG_L t3, 7*SZREG(a1)
> +	REG_L t4, 8*SZREG(a1)
> +	REG_L t5, 9*SZREG(a1)
> +	REG_S a4,       0(t6)
> +	REG_S a5,   SZREG(t6)
> +	REG_S a6, 2*SZREG(t6)
> +	REG_S a7, 3*SZREG(t6)
> +	REG_S t0, 4*SZREG(t6)
> +	REG_S t1, 5*SZREG(t6)
> +	REG_S t2, 6*SZREG(t6)
> +	REG_S t3, 7*SZREG(t6)
> +	REG_S t4, 8*SZREG(t6)
> +	REG_S t5, 9*SZREG(t6)
> +	REG_L a4, 10*SZREG(a1)
> +	REG_L a5, 11*SZREG(a1)
> +	REG_L a6, 12*SZREG(a1)
> +	REG_L a7, 13*SZREG(a1)
> +	REG_L t0, 14*SZREG(a1)
> +	REG_L t1, 15*SZREG(a1)
> +	addi a1, a1, 16*SZREG
> +	REG_S a4, 10*SZREG(t6)
> +	REG_S a5, 11*SZREG(t6)
> +	REG_S a6, 12*SZREG(t6)
> +	REG_S a7, 13*SZREG(t6)
> +	REG_S t0, 14*SZREG(t6)
> +	REG_S t1, 15*SZREG(t6)
> +	addi t6, t6, 16*SZREG
> +	bltu a1, a3, 1b
> +	andi a2, a2, (16*SZREG)-1  /* Update count */
> +
> +2:
> +	/* Remainder is smaller than a page, compute native word count */
> +	beqz a2, 6f
> +	andi a5, a2, ~(SZREG-1)
> +	andi a2, a2, (SZREG-1)
> +	add a3, a1, a5
> +	/* Jump directly to byte copy if no words. */
> +	beqz a5, 4f
> +
> +3:
> +	/* Use single native register copy */
> +	REG_L a4, 0(a1)
> +	addi a1, a1, SZREG
> +	REG_S a4, 0(t6)
> +	addi t6, t6, SZREG
> +	bltu a1, a3, 3b
> +
> +	/* Jump directly out if no more bytes */
> +	beqz a2, 6f
> +
> +4:
> +	/* Copy the last few individual bytes */
> +	add a3, a1, a2
> +5:
> +	lb a4, 0(a1)
> +	addi a1, a1, 1
> +	sb a4, 0(t6)
> +	addi t6, t6, 1
> +	bltu a1, a3, 5b
> +6:
> +	ret
> +
> +END (__memcpy_noalignment)
> +
> +hidden_def (__memcpy_noalignment)
> diff --git a/sysdeps/unix/sysv/linux/riscv/Makefile 
> b/sysdeps/unix/sysv/linux/riscv/Makefile
> index 45cc29e40d..aa9ea443d6 100644
> --- a/sysdeps/unix/sysv/linux/riscv/Makefile
> +++ b/sysdeps/unix/sysv/linux/riscv/Makefile
> @@ -7,6 +7,10 @@ ifeq ($(subdir),stdlib)
>  gen-as-const-headers += ucontext_i.sym
>  endif
> 
> +ifeq ($(subdir),string)
> +sysdep_routines += memcpy memcpy-generic memcpy_noalignment
> +endif
> +
>  abi-variants := ilp32 ilp32d lp64 lp64d
> 
>  ifeq (,$(filter $(default-abi),$(abi-variants)))
> diff --git a/sysdeps/unix/sysv/linux/riscv/memcpy-generic.c 
> b/sysdeps/unix/sysv/linux/riscv/memcpy-generic.c
> new file mode 100644
> index 0000000000..0abe03f7f5
> --- /dev/null
> +++ b/sysdeps/unix/sysv/linux/riscv/memcpy-generic.c
> @@ -0,0 +1,24 @@
> +/* Re-include the default memcpy implementation.
> +   Copyright (C) 2023 Free Software Foundation, Inc.
> +   This file is part of the GNU C Library.
> +
> +   The GNU C Library is free software; you can redistribute it and/or
> +   modify it under the terms of the GNU Lesser General Public
> +   License as published by the Free Software Foundation; either
> +   version 2.1 of the License, or (at your option) any later version.
> +
> +   The GNU C Library is distributed in the hope that it will be useful,
> +   but WITHOUT ANY WARRANTY; without even the implied warranty of
> +   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
> +   Lesser General Public License for more details.
> +
> +   You should have received a copy of the GNU Lesser General Public
> +   License along with the GNU C Library; if not, see
> +   <https://www.gnu.org/licenses/>.  */
> +
> +#include <string.h>
> +
> +extern __typeof (memcpy) __memcpy_generic;
> +hidden_proto(__memcpy_generic)
> +
> +#include <string/memcpy.c>
> -- 
> 2.34.1

  parent reply	other threads:[~2023-07-08  2:17 UTC|newest]

Thread overview: 17+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-07-06 19:29 [PATCH v4 0/3] RISC-V: ifunced memcpy using new kernel hwprobe interface Evan Green
2023-07-06 19:29 ` [PATCH v4 1/3] riscv: Add Linux hwprobe syscall support Evan Green
2023-07-07  8:15   ` Florian Weimer
2023-07-07 22:10     ` Evan Green
2023-07-10  9:17       ` Florian Weimer
2023-07-11 17:08         ` Evan Green
2023-07-06 19:29 ` [PATCH v4 2/3] riscv: Add hwprobe vdso call support Evan Green
2023-07-06 19:29 ` [PATCH v4 3/3] riscv: Add and use alignment-ignorant memcpy Evan Green
2023-07-07  9:22   ` Richard Henderson
2023-07-07 15:25     ` Jeff Law
2023-07-07 21:37       ` Evan Green
2023-07-07 22:15         ` Jeff Law
2023-07-08  2:16   ` Stefan O'Rear [this message]
2023-07-10 16:19     ` Evan Green
2023-07-12  5:22       ` Stefan O'Rear
2023-07-06 20:11 ` [PATCH v4 0/3] RISC-V: ifunced memcpy using new kernel hwprobe interface Palmer Dabbelt
2023-07-06 22:20   ` Jeff Law

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=ab3d5a2e-f800-4736-bc8b-8869bdc59cf0@app.fastmail.com \
    --to=sorear@fastmail.com \
    --cc=evan@rivosinc.com \
    --cc=fweimer@redhat.com \
    --cc=libc-alpha@sourceware.org \
    --cc=palmer@rivosinc.com \
    --cc=slewis@rivosinc.com \
    --cc=vineetg@rivosinc.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).