From mboxrd@z Thu Jan 1 00:00:00 1970
From: Christoph Müllner
To: libc-alpha@sourceware.org, Adhemerval Zanella, Palmer Dabbelt, Darius Rad, Andrew Waterman, Philipp Tomsich, Evan Green, Kito Cheng, Jeff Law, Vineet Gupta
Cc: Christoph Müllner
Subject: [RFC PATCH 3/3] RISC-V: Implement CPU yielding for busy loops with Zihintpause/Zawrs
Date: Thu, 18 Apr 2024 11:46:35 +0200
Message-ID: <20240418094635.3502009-4-christoph.muellner@vrull.eu>
In-Reply-To: <20240418094635.3502009-1-christoph.muellner@vrull.eu>
References: <20240418094635.3502009-1-christoph.muellner@vrull.eu>

The macro atomic_spin_nop can be used to implement arch-specific
CPU yielding in busy loops (e.g. in pthread_spin_lock).
This patch introduces an ifunc-based implementation for RISC-V
that uses Zihintpause's PAUSE instruction for this purpose
(as PAUSE is a HINT instruction, there is no runtime dependency
on Zihintpause).  Further, we test for Zawrs via hwprobe() and,
if it is found, use WRS.STO instead of PAUSE.
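For readers unfamiliar with the pattern, the selection described above can
be modelled in plain C.  This is only a rough sketch and not the code in
this patch: the demo_* names and DEMO_FEATURE_ZAWRS are hypothetical, a
function pointer stands in for the ifunc resolver, and the two relax
routines merely count calls where the real ones execute `pause` or
`wrs.sto`.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical feature bit; stands in for RISCV_HWPROBE_EXT_ZAWRS.  */
#define DEMO_FEATURE_ZAWRS (UINT64_C (1) << 0)

typedef void (*demo_relax_fn) (void);

/* Call counters so the dispatch is observable (single-threaded demo).  */
static int demo_generic_calls;
static int demo_zawrs_calls;

/* Stand-in for __cpu_relax_generic, which executes `pause` in the patch.  */
static void
demo_relax_generic (void)
{
  demo_generic_calls++;
}

/* Stand-in for __cpu_relax_zawrs, which executes `wrs.sto` in the patch.  */
static void
demo_relax_zawrs (void)
{
  demo_zawrs_calls++;
}

/* Models the resolver: prefer the Zawrs variant when the (assumed)
   feature mask reports it, otherwise fall back to the generic one.  */
static demo_relax_fn
demo_select_relax (uint64_t features)
{
  if (features & DEMO_FEATURE_ZAWRS)
    return demo_relax_zawrs;
  return demo_relax_generic;
}

typedef struct { atomic_flag locked; } demo_spinlock;

/* Test-and-set spin lock; the relax hook plays the role of
   atomic_spin_nop and only runs while the lock is contended.  */
static void
demo_lock (demo_spinlock *lock, demo_relax_fn relax)
{
  assert (lock != NULL && relax != NULL);
  while (atomic_flag_test_and_set_explicit (&lock->locked,
					    memory_order_acquire))
    relax ();
}

static void
demo_unlock (demo_spinlock *lock)
{
  atomic_flag_clear_explicit (&lock->locked, memory_order_release);
}
```

A caller would pick the routine once, e.g. demo_select_relax (features),
and pass it to demo_lock; with a real ifunc the dynamic loader performs
that choice at relocation time, so the call site stays a plain function
call.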

Signed-off-by: Christoph Müllner
---
 sysdeps/riscv/multiarch/cpu-relax_generic.S   | 31 +++++++++++++++
 sysdeps/riscv/multiarch/cpu-relax_zawrs.S     | 28 +++++++++++++
 .../unix/sysv/linux/riscv/atomic-machine.h    |  3 ++
 .../unix/sysv/linux/riscv/multiarch/Makefile  |  8 ++++
 .../sysv/linux/riscv/multiarch/cpu-relax.c    | 39 +++++++++++++++++++
 .../linux/riscv/multiarch/ifunc-impl-list.c   | 32 +++++++++++++--
 6 files changed, 137 insertions(+), 4 deletions(-)
 create mode 100644 sysdeps/riscv/multiarch/cpu-relax_generic.S
 create mode 100644 sysdeps/riscv/multiarch/cpu-relax_zawrs.S
 create mode 100644 sysdeps/unix/sysv/linux/riscv/multiarch/cpu-relax.c

diff --git a/sysdeps/riscv/multiarch/cpu-relax_generic.S b/sysdeps/riscv/multiarch/cpu-relax_generic.S
new file mode 100644
index 0000000000..d3ccfdce84
--- /dev/null
+++ b/sysdeps/riscv/multiarch/cpu-relax_generic.S
@@ -0,0 +1,31 @@
+/* CPU strand yielding for busy loops.  RISC-V version.
+   Copyright (C) 2024 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library.  If not, see
+   .  */
+
+#include
+#include
+
+.option push
+.option arch, +zihintpause
+ENTRY (__cpu_relax_generic)
+	/* While we can use the `pause` instruction without
+	   the need of Zihintpause (because it is a HINT instruction),
+	   we still have to enable Zihintpause for the assembler.  */
+	pause
+	ret
+END (__cpu_relax_generic)
+.option pop
diff --git a/sysdeps/riscv/multiarch/cpu-relax_zawrs.S b/sysdeps/riscv/multiarch/cpu-relax_zawrs.S
new file mode 100644
index 0000000000..6d27b354df
--- /dev/null
+++ b/sysdeps/riscv/multiarch/cpu-relax_zawrs.S
@@ -0,0 +1,28 @@
+/* CPU strand yielding for busy loops.  RISC-V version with Zawrs.
+   Copyright (C) 2024 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library.  If not, see
+   .  */
+
+#include
+#include
+
+.option push
+.option arch, +zawrs
+ENTRY (__cpu_relax_zawrs)
+	wrs.sto
+	ret
+END (__cpu_relax_zawrs)
+.option pop
diff --git a/sysdeps/unix/sysv/linux/riscv/atomic-machine.h b/sysdeps/unix/sysv/linux/riscv/atomic-machine.h
index c1c9d949a0..02b9b7a421 100644
--- a/sysdeps/unix/sysv/linux/riscv/atomic-machine.h
+++ b/sysdeps/unix/sysv/linux/riscv/atomic-machine.h
@@ -178,4 +178,7 @@
 # error "ISAs that do not subsume the A extension are not supported"
 #endif /* !__riscv_atomic */
 
+extern void __cpu_relax (void);
+#define atomic_spin_nop() __cpu_relax()
+
 #endif /* bits/atomic.h */
diff --git a/sysdeps/unix/sysv/linux/riscv/multiarch/Makefile b/sysdeps/unix/sysv/linux/riscv/multiarch/Makefile
index fcef5659d4..0cdf37a38b 100644
--- a/sysdeps/unix/sysv/linux/riscv/multiarch/Makefile
+++ b/sysdeps/unix/sysv/linux/riscv/multiarch/Makefile
@@ -1,3 +1,11 @@
+# nscd uses atomic_spin_nop which in turn requires cpu_relax
+ifeq ($(subdir),nscd)
+sysdep_routines += \
+  cpu-relax \
+  cpu-relax_generic \
+  cpu-relax_zawrs
+endif
+
 ifeq ($(subdir),string)
 sysdep_routines += \
   memcpy \
diff --git a/sysdeps/unix/sysv/linux/riscv/multiarch/cpu-relax.c b/sysdeps/unix/sysv/linux/riscv/multiarch/cpu-relax.c
new file mode 100644
index 0000000000..5aeb120e21
--- /dev/null
+++ b/sysdeps/unix/sysv/linux/riscv/multiarch/cpu-relax.c
@@ -0,0 +1,39 @@
+/* Multiple versions of cpu-relax.
+   All versions must be listed in ifunc-impl-list.c.
+   Copyright (C) 2024 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   .  */
+
+# include
+# include
+# include
+
+void __cpu_relax (void);
+extern __typeof (__cpu_relax) __cpu_relax_generic attribute_hidden;
+extern __typeof (__cpu_relax) __cpu_relax_zawrs attribute_hidden;
+
+static inline __typeof (__cpu_relax) *
+select_cpu_relax_ifunc (uint64_t dl_hwcap, __riscv_hwprobe_t hwprobe_func)
+{
+  unsigned long long int v;
+  if (__riscv_hwprobe_one (hwprobe_func, RISCV_HWPROBE_KEY_IMA_EXT_0, &v) == 0
+      && (v & RISCV_HWPROBE_EXT_ZAWRS))
+    return __cpu_relax_zawrs;
+
+  return __cpu_relax_generic;
+}
+
+riscv_libc_ifunc (__cpu_relax, select_cpu_relax_ifunc);
diff --git a/sysdeps/unix/sysv/linux/riscv/multiarch/ifunc-impl-list.c b/sysdeps/unix/sysv/linux/riscv/multiarch/ifunc-impl-list.c
index 9f806d7a9e..9c7a8c2e1f 100644
--- a/sysdeps/unix/sysv/linux/riscv/multiarch/ifunc-impl-list.c
+++ b/sysdeps/unix/sysv/linux/riscv/multiarch/ifunc-impl-list.c
@@ -20,24 +20,48 @@
 #include
 #include
 
+#define ARRAY_SIZE(A) (sizeof (A) / sizeof ((A)[0]))
+
+void __cpu_relax (void);
+
 size_t
 __libc_ifunc_impl_list (const char *name, struct libc_ifunc_impl *array,
 			size_t max)
 {
   size_t i = max;
 
+  struct riscv_hwprobe pairs[] = {
+    { .key = RISCV_HWPROBE_KEY_IMA_EXT_0 },
+    { .key = RISCV_HWPROBE_KEY_CPUPERF_0 },
+  };
   bool fast_unaligned = false;
+  bool has_zawrs = false;
+
+  if (__riscv_hwprobe (pairs, ARRAY_SIZE (pairs), 0, NULL, 0) == 0)
+    {
+      struct riscv_hwprobe *pair;
 
-  struct riscv_hwprobe pair = { .key = RISCV_HWPROBE_KEY_CPUPERF_0 };
-  if (__riscv_hwprobe (&pair, 1, 0, NULL, 0) == 0
-      && (pair.value & RISCV_HWPROBE_MISALIGNED_MASK)
+      /* RISCV_HWPROBE_KEY_IMA_EXT_0 */
+      pair = &pairs[0];
+      if (pair->value & RISCV_HWPROBE_EXT_ZAWRS)
+	has_zawrs = true;
+
+      /* RISCV_HWPROBE_KEY_CPUPERF_0 */
+      pair = &pairs[1];
+      if ((pair->value & RISCV_HWPROBE_MISALIGNED_MASK)
 	  == RISCV_HWPROBE_MISALIGNED_FAST)
-    fast_unaligned = true;
+	fast_unaligned = true;
+    }
 
   IFUNC_IMPL (i, name, memcpy,
 	      IFUNC_IMPL_ADD (array, i, memcpy, fast_unaligned,
 			      __memcpy_noalignment)
 	      IFUNC_IMPL_ADD (array, i, memcpy, 1, __memcpy_generic))
 
+  IFUNC_IMPL (i, name, __cpu_relax,
+	      IFUNC_IMPL_ADD (array, i, __cpu_relax, has_zawrs,
+			      __cpu_relax_zawrs)
+	      IFUNC_IMPL_ADD (array, i, __cpu_relax, 1, __cpu_relax_generic))
+
   return 0;
 }
-- 
2.44.0