From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from wout3-smtp.messagingengine.com (wout3-smtp.messagingengine.com [64.147.123.19]) by sourceware.org (Postfix) with ESMTPS id 4D2593858D20 for ; Wed, 12 Jul 2023 05:22:41 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org 4D2593858D20 Authentication-Results: sourceware.org; dmarc=pass (p=none dis=none) header.from=fastmail.com Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=fastmail.com Received: from compute1.internal (compute1.nyi.internal [10.202.2.41]) by mailout.west.internal (Postfix) with ESMTP id 349033200909; Wed, 12 Jul 2023 01:22:37 -0400 (EDT) Received: from imap50 ([10.202.2.100]) by compute1.internal (MEProxy); Wed, 12 Jul 2023 01:22:37 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=fastmail.com; h= cc:cc:content-transfer-encoding:content-type:content-type:date :date:from:from:in-reply-to:in-reply-to:message-id:mime-version :references:reply-to:sender:subject:subject:to:to; s=fm2; t= 1689139356; x=1689225756; bh=aNrywzCGsr5IQoEZSAI9I1Jt7O3yJ4qGDcX 6FWqhKNQ=; b=frlAAtSHVvpusEUnLG4JqNEG99BrFrSgPXuvwupifSTnEvsQ0SZ Hvjle3HVOIDyyjVlHiSiSKM+paBJ1ddo/eiEapLvmU/EINziwma3aSV4ET/9rZ/2 cSSdTV1wj6WMqWPtVrjJXN7PegItaCnb7t9yb+JS7P9ljyUkql+ITXbOqQSl2/0p IDkDe4ewSkG+TPwA+lNwDXSAR116/yE32OsMPSxoB3xZnq6Cah5d1byP3EUj8P/7 PqIxdUqeQdR7AFpvEXHSDgsDPrWICsvEu4H7SQLUbVtTBDk6arq9Nm6aSnbypoiR bS/3BCbLg9jN4HJCyLy9YQ62usLNxHJHu8A== DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d= messagingengine.com; h=cc:cc:content-transfer-encoding :content-type:content-type:date:date:feedback-id:feedback-id :from:from:in-reply-to:in-reply-to:message-id:mime-version :references:reply-to:sender:subject:subject:to:to:x-me-proxy :x-me-proxy:x-me-sender:x-me-sender:x-sasl-enc; s=fm2; t= 1689139356; x=1689225756; bh=aNrywzCGsr5IQoEZSAI9I1Jt7O3yJ4qGDcX 6FWqhKNQ=; b=hf3q5lp2Bz/7ZLYH5Ze4EdbVom+Oiimtm+hTb4QA34Kmw45K4xH uB00XFnXcrES/lw0LCf8T9qRVq/e4Hs4jfERxuk27rio+jGQjeiR/cMdAV/HAu2W c921jEjt6mc52W3N3S4eBObWKEa6DN2MhCv/EZ+jZtdS0Po4T5Z9UdKXjOwbGGXg HSrQ0ZQOJUmdxjtKGq3oqqYnixuwMwCr9Z0EzA7Nifd4rja51eJyLHU31pptjMdi p2AZQJUGTUR/snWWLL73ufqoZjnj5z9Otcg5DJS7ppX9K1uSrAxjXU2woUPmjEhu X7gSfqE0+MEKIpC2CzjW68s9BRFXMxebN6w== X-ME-Sender: X-ME-Proxy-Cause: gggruggvucftvghtrhhoucdtuddrgedviedrfedugdelfecutefuodetggdotefrodftvf curfhrohhfihhlvgemucfhrghsthforghilhdpqfgfvfdpuffrtefokffrpgfnqfghnecu uegrihhlohhuthemuceftddtnecusecvtfgvtghiphhivghnthhsucdlqddutddtmdenuc fjughrpefofgggkfgjfhffhffvvefutgfgsehtqhertderreejnecuhfhrohhmpedfufht vghfrghnucfqkdftvggrrhdfuceoshhorhgvrghrsehfrghsthhmrghilhdrtghomheqne cuggftrfgrthhtvghrnhepgfdvjeelleeghffhvdevveetveevgfeuteejveettdfhueel vdejhfektdefvdeinecuffhomhgrihhnpehgnhhurdhorhhgpdhkvghrnhgvlhdrohhrgh enucevlhhushhtvghrufhiiigvpedtnecurfgrrhgrmhepmhgrihhlfhhrohhmpehsohhr vggrrhesfhgrshhtmhgrihhlrdgtohhm X-ME-Proxy: Feedback-ID: i84414492:Fastmail Received: by mailuser.nyi.internal (Postfix, from userid 501) id 63BF61700089; Wed, 12 Jul 2023 01:22:36 -0400 (EDT) X-Mailer: MessagingEngine.com Webmail Interface User-Agent: Cyrus-JMAP/3.9.0-alpha0-531-gfdfa13a06d-fm-20230703.001-gfdfa13a0 Mime-Version: 1.0 Message-Id: <0086d635-5479-45a1-a27a-84ece3f1a9fc@app.fastmail.com> In-Reply-To: References: <20230706192947.1566767-1-evan@rivosinc.com> <20230706192947.1566767-4-evan@rivosinc.com> Date: Wed, 12 Jul 2023 01:22:16 -0400 From: "Stefan O'Rear" To: "Evan Green" Cc: "Stefan O'Rear via Libc-alpha" , "Palmer Dabbelt" , slewis@rivosinc.com, "Vineet Gupta" , "Florian Weimer" Subject: Re: [PATCH v4 3/3] riscv: Add and use alignment-ignorant memcpy Content-Type: text/plain;charset=utf-8 Content-Transfer-Encoding: quoted-printable X-Spam-Status: No, score=-11.0 required=5.0 tests=BAYES_00,DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,FREEMAIL_FROM,GIT_PATCH_0,KAM_SHORT,RCVD_IN_DNSWL_LOW,RCVD_IN_MSPIKE_H5,RCVD_IN_MSPIKE_WL,SPF_HELO_PASS,SPF_PASS,TXREP,T_SCC_BODY_TEXT_LINE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org List-Id: On Mon, Jul 10, 2023, at 12:19 PM, Evan Green wrote: > On Fri, Jul 7, 2023 at 7:17=E2=80=AFPM Stefan O'Rear wrote: >> >> >> >> On Thu, Jul 6, 2023, at 3:29 PM, Evan Green wrote: >> > For CPU implementations that can perform unaligned accesses with li= ttle >> > or no performance penalty, create a memcpy implementation that does= not >> > bother aligning buffers. It will use a block of integer registers, a >> > single integer register, and fall back to bytewise copy for the >> > remainder. >> > >> > Signed-off-by: Evan Green >> > Reviewed-by: Palmer Dabbelt >> > >> > --- >> > >> > Changes in v4: >> > - Fixed comment style (Florian) >> > >> > Changes in v3: >> > - Word align dest for large memcpy()s. >> > - Add tags >> > - Remove spurious blank line from sysdeps/riscv/memcpy.c >> > >> > Changes in v2: >> > - Used _MASK instead of _FAST value itself. >> > >> > >> > --- >> > sysdeps/riscv/memcopy.h | 26 ++++ >> > sysdeps/riscv/memcpy.c | 64 +++++++++ >> > sysdeps/riscv/memcpy_noalignment.S | 121 ++++++++++++++= ++++ >> > sysdeps/unix/sysv/linux/riscv/Makefile | 4 + >> > .../unix/sysv/linux/riscv/memcpy-generic.c | 24 ++++ >> > 5 files changed, 239 insertions(+) >> > create mode 100644 sysdeps/riscv/memcopy.h >> > create mode 100644 sysdeps/riscv/memcpy.c >> > create mode 100644 sysdeps/riscv/memcpy_noalignment.S >> > create mode 100644 sysdeps/unix/sysv/linux/riscv/memcpy-generic.c >> > >> > diff --git a/sysdeps/riscv/memcopy.h b/sysdeps/riscv/memcopy.h >> > new file mode 100644 >> > index 0000000000..2b685c8aa0 >> > --- /dev/null >> > +++ b/sysdeps/riscv/memcopy.h >> > @@ -0,0 +1,26 @@ >> > +/* memcopy.h -- definitions for memory copy functions. RISC-V vers= ion. >> > + Copyright (C) 2023 Free Software Foundation, Inc. >> > + This file is part of the GNU C Library. >> > + >> > + The GNU C Library is free software; you can redistribute it and= /or >> > + modify it under the terms of the GNU Lesser General Public >> > + License as published by the Free Software Foundation; either >> > + version 2.1 of the License, or (at your option) any later versi= on. >> > + >> > + The GNU C Library is distributed in the hope that it will be us= eful, >> > + but WITHOUT ANY WARRANTY; without even the implied warranty of >> > + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the G= NU >> > + Lesser General Public License for more details. >> > + >> > + You should have received a copy of the GNU Lesser General Public >> > + License along with the GNU C Library; if not, see >> > + . */ >> > + >> > +#include >> > + >> > +/* Redefine the generic memcpy implementation to __memcpy_generic,= so >> > + the memcpy ifunc can select between generic and special version= s. >> > + In rtld, don't bother with all the ifunciness. */ >> > +#if IS_IN (libc) >> > +#define MEMCPY __memcpy_generic >> > +#endif >> > diff --git a/sysdeps/riscv/memcpy.c b/sysdeps/riscv/memcpy.c >> > new file mode 100644 >> > index 0000000000..fdb8dc3208 >> > --- /dev/null >> > +++ b/sysdeps/riscv/memcpy.c >> > @@ -0,0 +1,64 @@ >> > +/* Multiple versions of memcpy. >> > + All versions must be listed in ifunc-impl-list.c. >> > + Copyright (C) 2017-2023 Free Software Foundation, Inc. >> > + This file is part of the GNU C Library. >> > + >> > + The GNU C Library is free software; you can redistribute it and= /or >> > + modify it under the terms of the GNU Lesser General Public >> > + License as published by the Free Software Foundation; either >> > + version 2.1 of the License, or (at your option) any later versi= on. >> > + >> > + The GNU C Library is distributed in the hope that it will be us= eful, >> > + but WITHOUT ANY WARRANTY; without even the implied warranty of >> > + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the G= NU >> > + Lesser General Public License for more details. >> > + >> > + You should have received a copy of the GNU Lesser General Public >> > + License along with the GNU C Library; if not, see >> > + . */ >> > + >> > +#if IS_IN (libc) >> > +/* Redefine memcpy so that the compiler won't complain about the t= ype >> > + mismatch with the IFUNC selector in strong_alias, below. */ >> > +# undef memcpy >> > +# define memcpy __redirect_memcpy >> > +# include >> > +#include >> > +#include >> > + >> > +#define INIT_ARCH() >> > + >> > +extern __typeof (__redirect_memcpy) __libc_memcpy; >> > + >> > +extern __typeof (__redirect_memcpy) __memcpy_generic attribute_hid= den; >> > +extern __typeof (__redirect_memcpy) __memcpy_noalignment >> > attribute_hidden; >> > + >> > +static inline __typeof (__redirect_memcpy) * >> > +select_memcpy_ifunc (void) >> > +{ >> > + INIT_ARCH (); >> > + >> > + struct riscv_hwprobe pair; >> > + >> > + pair.key =3D RISCV_HWPROBE_KEY_CPUPERF_0; >> > + if (__riscv_hwprobe(&pair, 1, 0, NULL, 0) !=3D 0) >> > + return __memcpy_generic; >> > + >> > + if ((pair.key > 0) && >> > + (pair.value & RISCV_HWPROBE_MISALIGNED_MASK) =3D=3D >> > + RISCV_HWPROBE_MISALIGNED_FAST) >> > + return __memcpy_noalignment; >> >> It's unclear whether this is semantically correct as a use of >> __riscv_hwprobe. [1] describes the result of hwprobe as "what's poss= ible >> to enable", leaving open the possibility that additional system calls= are > > Right, think of it like "cpuid for risc-v". We're missing a good analog of cpuid's OSXSAVE result, or perhaps more accurately xgetbv. I was pushing for including the xgetbv information in hwprobe, under the mistaken impression that vvar was per-process; given = that hwprobe cannot usefully return per-process data, it's probably doing the right thing right now, but this puts us back in the place of having no usable userspace ABI for features that can be disabled per-process, or m= ight be disabled per-process in the future. >> needed to determine whether unaligned accesses are supported right no= w in >> the current process, and [2] adds an (inherited, IIUC) prctl for >> unaligned access which doesn't affect the return value of hwprobe and >> would break this code as written. >> >> (There is nothing in either the privileged spec or the SBI spec to >> prohibit an implementation which provides FAST unaligned access from >> supporting an optional strict alignment checking mode and making it >> available through fw_feature.) > > I think your point about prctls() muddying the waters of what hwprobe > is trying to report is generally valid. However in this case, the > prctl() proposed in [2] only disables unaligned accesses for systems > where Linux is handling the trapping/emulation of unaligned accesses. > These systems would only ever report SLOW/EMULATED out of hwprobe. > Since we're only making changes on FAST systems, the prctl()'s setting > doesn't come into play. Looking at other architectures with PR_SET_UNALIGN support, none of them do anything to stop the status from affecting children after fork or exec, so I think the proposed riscv handling is not special and we don't need to do anything special to handle PR_GET_UNALIGN here; if user code relies on fixups to handle mostly-aligned data, it will crash under PR_UNALIGN_SIGBUS, and that's expected and not an ABI issue. I wonder if we should take the same approach with PR_RISCV_V_VSTATE_CTRL_OFF and say that it requires buyin from any processes run under its influence and crashes are expected otherwise. -s > -Evan > >> >> -s >> >> [1]: https://lore.kernel.org/linux-riscv/mhng-97928779-5d76-4390-a84c= -398fdc6a0a4f@palmer-ri-x1c9/ >> [2]: https://lore.kernel.org/linux-riscv/20230624122049.7886-6-cleger= @rivosinc.com/ >>