public inbox for libc-alpha@sourceware.org
 help / color / mirror / Atom feed
From: "Christoph Müllner" <christoph.muellner@vrull.eu>
To: Palmer Dabbelt <palmer@dabbelt.com>
Cc: libc-alpha@sourceware.org, adhemerval.zanella@linaro.org,
	 Darius Rad <darius@bluespec.com>,
	Andrew Waterman <andrew@sifive.com>,
	philipp.tomsich@vrull.eu,  Evan Green <evan@rivosinc.com>,
	kito.cheng@sifive.com, jeffreyalaw@gmail.com,
	 Vineet Gupta <vineetg@rivosinc.com>
Subject: Re: [RFC PATCH 3/3] RISC-V: Implement CPU yielding for busy loops with Zihintpause/Zawrs
Date: Thu, 18 Apr 2024 22:19:04 +0200	[thread overview]
Message-ID: <CAEg0e7gXi15=85-9sc7Ap+K8_AArGBkNsFixAjjYiCNQA3im=g@mail.gmail.com> (raw)
In-Reply-To: <mhng-08eebb15-4e67-4616-abdd-64ea7522ab70@palmer-ri-x1c9a>

On Thu, Apr 18, 2024 at 7:17 PM Palmer Dabbelt <palmer@dabbelt.com> wrote:
>
> On Thu, 18 Apr 2024 02:46:35 PDT (-0700), christoph.muellner@vrull.eu wrote:
> > The macro atomic_spin_nop can be used to implement arch-specific
> > CPU yielding that is used in busy loops (e.g. in pthread_spin_lock).
> > This patch introduces an ifunc-based implementation for RISC-V,
> > that uses Zihintpause's PAUSE instruction for that matter (as PAUSE
> > is a HINT instruction there is not dependency to Zihintpause at
> > runtime).  Further, we test for Zawrs via hwprobe() and if found
> > we use WRS.STO instead of PAUSE.
> >
> > Signed-off-by: Christoph Müllner <christoph.muellner@vrull.eu>
> > ---
> >  sysdeps/riscv/multiarch/cpu-relax_generic.S   | 31 +++++++++++++++
> >  sysdeps/riscv/multiarch/cpu-relax_zawrs.S     | 28 +++++++++++++
> >  .../unix/sysv/linux/riscv/atomic-machine.h    |  3 ++
> >  .../unix/sysv/linux/riscv/multiarch/Makefile  |  8 ++++
> >  .../sysv/linux/riscv/multiarch/cpu-relax.c    | 39 +++++++++++++++++++
> >  .../linux/riscv/multiarch/ifunc-impl-list.c   | 32 +++++++++++++--
> >  6 files changed, 137 insertions(+), 4 deletions(-)
> >  create mode 100644 sysdeps/riscv/multiarch/cpu-relax_generic.S
> >  create mode 100644 sysdeps/riscv/multiarch/cpu-relax_zawrs.S
> >  create mode 100644 sysdeps/unix/sysv/linux/riscv/multiarch/cpu-relax.c
> >
> > diff --git a/sysdeps/riscv/multiarch/cpu-relax_generic.S b/sysdeps/riscv/multiarch/cpu-relax_generic.S
> > new file mode 100644
> > index 0000000000..d3ccfdce84
> > --- /dev/null
> > +++ b/sysdeps/riscv/multiarch/cpu-relax_generic.S
> > @@ -0,0 +1,31 @@
> > +/* CPU strand yielding for busy loops.  RISC-V version.
> > +   Copyright (C) 2024 Free Software Foundation, Inc.
> > +   This file is part of the GNU C Library.
> > +
> > +   The GNU C Library is free software; you can redistribute it and/or
> > +   modify it under the terms of the GNU Lesser General Public
> > +   License as published by the Free Software Foundation; either
> > +   version 2.1 of the License, or (at your option) any later version.
> > +
> > +   The GNU C Library is distributed in the hope that it will be useful,
> > +   but WITHOUT ANY WARRANTY; without even the implied warranty of
> > +   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
> > +   Lesser General Public License for more details.
> > +
> > +   You should have received a copy of the GNU Lesser General Public
> > +   License along with the GNU C Library.  If not, see
> > +   <https://www.gnu.org/licenses/>.  */
> > +
> > +#include <sysdep.h>
> > +#include <sys/asm.h>
> > +
> > +.option push
> > +.option arch, +zihintpause
> > +ENTRY (__cpu_relax_generic)
> > +     /* While we can use the `pause` instruction without
> > +        the need of Zihintpause (because it is a HINT instruction),
> > +        we still have to enable Zihintpause for the assembler.  */
> > +     pause
> > +     ret
> > +END (__cpu_relax_generic)
> > +.option pop
> > diff --git a/sysdeps/riscv/multiarch/cpu-relax_zawrs.S b/sysdeps/riscv/multiarch/cpu-relax_zawrs.S
> > new file mode 100644
> > index 0000000000..6d27b354df
> > --- /dev/null
> > +++ b/sysdeps/riscv/multiarch/cpu-relax_zawrs.S
> > @@ -0,0 +1,28 @@
> > +/* CPU strand yielding for busy loops.  RISC-V version with Zawrs.
> > +   Copyright (C) 2024 Free Software Foundation, Inc.
> > +   This file is part of the GNU C Library.
> > +
> > +   The GNU C Library is free software; you can redistribute it and/or
> > +   modify it under the terms of the GNU Lesser General Public
> > +   License as published by the Free Software Foundation; either
> > +   version 2.1 of the License, or (at your option) any later version.
> > +
> > +   The GNU C Library is distributed in the hope that it will be useful,
> > +   but WITHOUT ANY WARRANTY; without even the implied warranty of
> > +   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
> > +   Lesser General Public License for more details.
> > +
> > +   You should have received a copy of the GNU Lesser General Public
> > +   License along with the GNU C Library.  If not, see
> > +   <https://www.gnu.org/licenses/>.  */
> > +
> > +#include <sysdep.h>
> > +#include <sys/asm.h>
> > +
> > +.option push
> > +.option arch, +zawrs
> > +ENTRY (__cpu_relax_zawrs)
> > +     wrs.sto
>
> This has the same forward progress/eventual success violation as the
> code you sent for GCC and Linux does.  It doesn't really matter if the
> user of the reservation is in a builtin, an asm block, or a function.
> The compiler just doesn't know about those reservation rules and isn't
> going to generate code that follows them.

I see. The main issue is that we don't have a valid reservation when
calling WRS,
so the whole use of Zawrs instructions is pointless.
So the only way to move Zawrs forward would be to adjust the locking routines
(introducing new primitives that have to be implemented for all architectures).


>
> That said, as far as I can tell the same grey area in the WRS spec that
> I pointed out during one of the Linux reviews still exists.   So if
> there's some hardware that has a more generous set of LR->WRS rules than
> the ISA-required LR->SC rules that's fine, we just need to know what the
> hardware actually is so we can write down the rules and then stick to
> them -- it's just a performance thing for WRS, so having weaker rules
> seems entirely plausible.
>
> I don't know of any hardware with WRS, but sorry if I missed one.  Do
> you have something publicly availiable that behaves this way?
>
> > +     ret
> > +END (__cpu_relax_zawrs)
> > +.option pop
> > diff --git a/sysdeps/unix/sysv/linux/riscv/atomic-machine.h b/sysdeps/unix/sysv/linux/riscv/atomic-machine.h
> > index c1c9d949a0..02b9b7a421 100644
> > --- a/sysdeps/unix/sysv/linux/riscv/atomic-machine.h
> > +++ b/sysdeps/unix/sysv/linux/riscv/atomic-machine.h
> > @@ -178,4 +178,7 @@
> >  # error "ISAs that do not subsume the A extension are not supported"
> >  #endif /* !__riscv_atomic */
> >
> > +extern void __cpu_relax (void);
> > +#define atomic_spin_nop() __cpu_relax()
> > +
> >  #endif /* bits/atomic.h */
> > diff --git a/sysdeps/unix/sysv/linux/riscv/multiarch/Makefile b/sysdeps/unix/sysv/linux/riscv/multiarch/Makefile
> > index fcef5659d4..0cdf37a38b 100644
> > --- a/sysdeps/unix/sysv/linux/riscv/multiarch/Makefile
> > +++ b/sysdeps/unix/sysv/linux/riscv/multiarch/Makefile
> > @@ -1,3 +1,11 @@
> > +# nscd uses atomic_spin_nop which in turn requires cpu_relax
> > +ifeq ($(subdir),nscd)
> > +sysdep_routines += \
> > +  cpu-relax \
> > +  cpu-relax_generic \
> > +  cpu-relax_zawrs
> > +endif
> > +
> >  ifeq ($(subdir),string)
> >  sysdep_routines += \
> >    memcpy \
> > diff --git a/sysdeps/unix/sysv/linux/riscv/multiarch/cpu-relax.c b/sysdeps/unix/sysv/linux/riscv/multiarch/cpu-relax.c
> > new file mode 100644
> > index 0000000000..5aeb120e21
> > --- /dev/null
> > +++ b/sysdeps/unix/sysv/linux/riscv/multiarch/cpu-relax.c
> > @@ -0,0 +1,39 @@
> > +/* Multiple versions of cpu-relax.
> > +   All versions must be listed in ifunc-impl-list.c.
> > +   Copyright (C) 2024 Free Software Foundation, Inc.
> > +   This file is part of the GNU C Library.
> > +
> > +   The GNU C Library is free software; you can redistribute it and/or
> > +   modify it under the terms of the GNU Lesser General Public
> > +   License as published by the Free Software Foundation; either
> > +   version 2.1 of the License, or (at your option) any later version.
> > +
> > +   The GNU C Library is distributed in the hope that it will be useful,
> > +   but WITHOUT ANY WARRANTY; without even the implied warranty of
> > +   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
> > +   Lesser General Public License for more details.
> > +
> > +   You should have received a copy of the GNU Lesser General Public
> > +   License along with the GNU C Library; if not, see
> > +   <https://www.gnu.org/licenses/>.  */
> > +
> > +# include <ifunc-init.h>
> > +# include <riscv-ifunc.h>
> > +# include <sys/hwprobe.h>
> > +
> > +void __cpu_relax (void);
> > +extern __typeof (__cpu_relax) __cpu_relax_generic attribute_hidden;
> > +extern __typeof (__cpu_relax) __cpu_relax_zawrs attribute_hidden;
> > +
> > +static inline __typeof (__cpu_relax) *
> > +select_cpu_relax_ifunc (uint64_t dl_hwcap, __riscv_hwprobe_t hwprobe_func)
> > +{
> > +  unsigned long long int v;
> > +  if (__riscv_hwprobe_one (hwprobe_func, RISCV_HWPROBE_KEY_IMA_EXT_0, &v) == 0
> > +      && (v & RISCV_HWPROBE_EXT_ZAWRS))
> > +    return __cpu_relax_zawrs;
> > +
> > +  return __cpu_relax_generic;
> > +}
> > +
> > +riscv_libc_ifunc (__cpu_relax, select_cpu_relax_ifunc);
> > diff --git a/sysdeps/unix/sysv/linux/riscv/multiarch/ifunc-impl-list.c b/sysdeps/unix/sysv/linux/riscv/multiarch/ifunc-impl-list.c
> > index 9f806d7a9e..9c7a8c2e1f 100644
> > --- a/sysdeps/unix/sysv/linux/riscv/multiarch/ifunc-impl-list.c
> > +++ b/sysdeps/unix/sysv/linux/riscv/multiarch/ifunc-impl-list.c
> > @@ -20,24 +20,48 @@
> >  #include <string.h>
> >  #include <sys/hwprobe.h>
> >
> > +#define ARRAY_SIZE(A) (sizeof (A) / sizeof ((A)[0]))
> > +
> > +void __cpu_relax (void);
> > +
> >  size_t
> >  __libc_ifunc_impl_list (const char *name, struct libc_ifunc_impl *array,
> >                       size_t max)
> >  {
> >    size_t i = max;
> > +  struct riscv_hwprobe pairs[] = {
> > +    { .key = RISCV_HWPROBE_KEY_IMA_EXT_0 },
> > +    { .key = RISCV_HWPROBE_KEY_CPUPERF_0 },
> > +  };
> >
> >    bool fast_unaligned = false;
> > +  bool has_zawrs = false;
> > +
> > +  if (__riscv_hwprobe (pairs, ARRAY_SIZE (pairs), 0, NULL, 0) == 0)
> > +    {
> > +      struct riscv_hwprobe *pair;
> >
> > -  struct riscv_hwprobe pair = { .key = RISCV_HWPROBE_KEY_CPUPERF_0 };
> > -  if (__riscv_hwprobe (&pair, 1, 0, NULL, 0) == 0
> > -      && (pair.value & RISCV_HWPROBE_MISALIGNED_MASK)
> > +      /* RISCV_HWPROBE_KEY_IMA_EXT_0  */
> > +      pair = &pairs[0];
> > +      if (pair->value & RISCV_HWPROBE_EXT_ZAWRS)
> > +     has_zawrs = true;
> > +
> > +      /* RISCV_HWPROBE_KEY_CPUPERF_0  */
> > +      pair = &pairs[1];
> > +      if ((pair->value & RISCV_HWPROBE_MISALIGNED_MASK)
> >            == RISCV_HWPROBE_MISALIGNED_FAST)
> > -    fast_unaligned = true;
> > +     fast_unaligned = true;
> > +    }
> >
> >    IFUNC_IMPL (i, name, memcpy,
> >             IFUNC_IMPL_ADD (array, i, memcpy, fast_unaligned,
> >                             __memcpy_noalignment)
> >             IFUNC_IMPL_ADD (array, i, memcpy, 1, __memcpy_generic))
> >
> > +  IFUNC_IMPL (i, name, __cpu_relax,
> > +           IFUNC_IMPL_ADD (array, i, __cpu_relax, has_zawrs,
> > +                           __cpu_relax_zawrs)
> > +           IFUNC_IMPL_ADD (array, i, __cpu_relax, 1, __cpu_relax_generic))
> > +
> >    return 0;
> >  }

  parent reply	other threads:[~2024-04-18 20:19 UTC|newest]

Thread overview: 11+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-04-18  9:46 [RFC PATCH 0/3] RISC-V: Use WRS.STO for atomic_spin_nop Christoph Müllner
2024-04-18  9:46 ` [RFC PATCH 1/3] RISC-V: Sync hwprobe: Sync extension bits with Linux 6.8 Christoph Müllner
2024-04-18  9:46 ` [RFC PATCH 2/3] RISC-V: hwprobe: Add Zawrs test bit Christoph Müllner
2024-04-18  9:46 ` [RFC PATCH 3/3] RISC-V: Implement CPU yielding for busy loops with Zihintpause/Zawrs Christoph Müllner
2024-04-18 17:17   ` Palmer Dabbelt
2024-04-18 20:03     ` Vineet Gupta
2024-04-18 20:25       ` Christoph Müllner
2024-04-18 20:19     ` Christoph Müllner [this message]
2024-04-18 20:36       ` Vineet Gupta
2024-04-18 21:10         ` Palmer Dabbelt
2024-04-19 14:09           ` Andrew Jones

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to='CAEg0e7gXi15=85-9sc7Ap+K8_AArGBkNsFixAjjYiCNQA3im=g@mail.gmail.com' \
    --to=christoph.muellner@vrull.eu \
    --cc=adhemerval.zanella@linaro.org \
    --cc=andrew@sifive.com \
    --cc=darius@bluespec.com \
    --cc=evan@rivosinc.com \
    --cc=jeffreyalaw@gmail.com \
    --cc=kito.cheng@sifive.com \
    --cc=libc-alpha@sourceware.org \
    --cc=palmer@dabbelt.com \
    --cc=philipp.tomsich@vrull.eu \
    --cc=vineetg@rivosinc.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).