Date: Tue, 28 Mar 2023 15:54:57 -0700 (PDT)
Subject: Re: [PATCH v2 0/3] RISC-V: ifunced memcpy using new kernel hwprobe interface
In-Reply-To: <20230221191537.3159966-1-evan@rivosinc.com>
CC: libc-alpha@sourceware.org, slewis@rivosinc.com, Vineet Gupta, Evan Green
From: Palmer Dabbelt
To: Evan Green

On Tue, 21 Feb 2023 11:15:34 PST (-0800), Evan Green wrote:
>
> This series illustrates the use of a proposed Linux syscall that
> enumerates architectural information about the RISC-V cores the system
> is running on. In this series we expose a small wrapper function around
> the syscall. An ifunc selector for memcpy queries it to see if unaligned
> access is "fast" on this hardware. If it is, it selects a newly provided
> implementation of memcpy that doesn't work hard at aligning the src and
> destination buffers.
>
> This is somewhat of a proof of concept for the syscall itself, but I do
> find that in my goofy memcpy test [1], the unaligned memcpy performed at
> least as well as the generic C version. This is however on Qemu on an M1
> mac, so not a test of any real hardware (more a smoke test that the
> implementation isn't silly).

QEMU isn't a good enough benchmark to justify a new memcpy routine in
glibc. Evan has a D1, which does support misaligned access and runs
some simple benchmarks faster. There have also been some minor changes
to the Linux side of things that warrant a v3 anyway, so he'll just
post some benchmarks on HW along with that.

Aside from those comments,

Reviewed-by: Palmer Dabbelt

There's a lot more stuff to probe for, but I think we've got enough of
a proof of concept for the hwprobe stuff that we can move forward with
the core interface bits in Linux/glibc and then unleash the chaos...

Unless anyone else has comments?

> v3 of the Linux series can be found at [2].
>
> [1] https://pastebin.com/Nj8ixpkX
> [2] https://lore.kernel.org/lkml/20230221190858.3159617-1-evan@rivosinc.com/T/#t
>
> Changes in v2:
>  - hwprobe.h: Use __has_include and duplicate Linux content to make
>    compilation work when Linux headers are absent (Adhemerval)
>  - hwprobe.h: Put declaration under __USE_GNU (Adhemerval)
>  - Use INLINE_SYSCALL_CALL (Adhemerval)
>  - Update versions
>  - Update UNALIGNED_MASK to match kernel v3 series.
>  - Add vDSO interface
>  - Used _MASK instead of _FAST value itself.
>
> Evan Green (3):
>   riscv: Add Linux hwprobe syscall support
>   riscv: Add hwprobe vdso call support
>   riscv: Add and use alignment-ignorant memcpy
>
>  sysdeps/riscv/memcopy.h                       |  28 +++++
>  sysdeps/riscv/memcpy.c                        |  65 +++++++++++
>  sysdeps/riscv/memcpy_noalignment.S            | 103 ++++++++++++++++++
>  sysdeps/unix/sysv/linux/dl-vdso-setup.c       |  10 ++
>  sysdeps/unix/sysv/linux/dl-vdso-setup.h       |   3 +
>  sysdeps/unix/sysv/linux/riscv/Makefile        |   8 +-
>  sysdeps/unix/sysv/linux/riscv/Versions        |   3 +
>  sysdeps/unix/sysv/linux/riscv/hwprobe.c       |  36 ++++++
>  .../unix/sysv/linux/riscv/memcpy-generic.c    |  24 ++++
>  .../unix/sysv/linux/riscv/rv32/arch-syscall.h |   1 +
>  .../unix/sysv/linux/riscv/rv32/libc.abilist   |   1 +
>  .../unix/sysv/linux/riscv/rv64/arch-syscall.h |   1 +
>  .../unix/sysv/linux/riscv/rv64/libc.abilist   |   1 +
>  sysdeps/unix/sysv/linux/riscv/sys/hwprobe.h   |  67 ++++++++++++
>  sysdeps/unix/sysv/linux/riscv/sysdep.h        |   1 +
>  sysdeps/unix/sysv/linux/syscall-names.list    |   1 +
>  16 files changed, 351 insertions(+), 2 deletions(-)
>  create mode 100644 sysdeps/riscv/memcopy.h
>  create mode 100644 sysdeps/riscv/memcpy.c
>  create mode 100644 sysdeps/riscv/memcpy_noalignment.S
>  create mode 100644 sysdeps/unix/sysv/linux/riscv/hwprobe.c
>  create mode 100644 sysdeps/unix/sysv/linux/riscv/memcpy-generic.c
>  create mode 100644 sysdeps/unix/sysv/linux/riscv/sys/hwprobe.h

Thanks!
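
For readers skimming the thread, the selection flow described in the
cover letter (probe for misaligned-access performance, then pick a
memcpy) can be sketched roughly as below. This is not the code from the
series. Every name prefixed with sketch_ or SKETCH_ is hypothetical, the
constant values are assumptions modeled on the kernel series, and the
probe is passed in as a function pointer so the logic can be exercised
without the syscall. In glibc the choice would be wired up through an
ifunc resolver rather than a plain function returning a pointer, and the
byte loop is only a placeholder for the assembly routine.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for the key/value pair passed to the probe. */
struct sketch_hwprobe {
  int64_t key;
  uint64_t value;
};

/* Assumed values for illustration only; check the real kernel headers. */
#define SKETCH_KEY_CPUPERF_0   5
#define SKETCH_MISALIGNED_SLOW (2u << 0)
#define SKETCH_MISALIGNED_FAST (3u << 0)
#define SKETCH_MISALIGNED_MASK (7u << 0)

typedef int (*sketch_probe_fn)(struct sketch_hwprobe *pairs, size_t count);
typedef void *(*sketch_memcpy_fn)(void *, const void *, size_t);

/* Placeholder for the generic C memcpy. */
static void *sketch_memcpy_generic(void *dst, const void *src, size_t n) {
  return memcpy(dst, src, n);
}

/* Placeholder for the alignment-ignorant routine (a byte loop here). */
static void *sketch_memcpy_noalignment(void *dst, const void *src, size_t n) {
  unsigned char *d = (unsigned char *)dst;
  const unsigned char *s = (const unsigned char *)src;
  while (n--)
    *d++ = *s++;
  return dst;
}

/* Pick the alignment-ignorant copy only if the probe succeeds and the
   masked misaligned-access field reads "fast"; otherwise fall back. */
static sketch_memcpy_fn sketch_select_memcpy(sketch_probe_fn probe) {
  struct sketch_hwprobe pair;
  assert(probe != NULL);
  pair.key = SKETCH_KEY_CPUPERF_0;
  pair.value = 0;
  if (probe(&pair, 1) == 0
      && (pair.value & SKETCH_MISALIGNED_MASK) == SKETCH_MISALIGNED_FAST)
    return sketch_memcpy_noalignment;
  return sketch_memcpy_generic;
}

/* Fake probes standing in for the syscall/vDSO wrapper. */
static int sketch_probe_fast(struct sketch_hwprobe *pairs, size_t count) {
  for (size_t i = 0; i < count; i++)
    if (pairs[i].key == SKETCH_KEY_CPUPERF_0)
      pairs[i].value = SKETCH_MISALIGNED_FAST;
  return 0;
}

static int sketch_probe_slow(struct sketch_hwprobe *pairs, size_t count) {
  for (size_t i = 0; i < count; i++)
    if (pairs[i].key == SKETCH_KEY_CPUPERF_0)
      pairs[i].value = SKETCH_MISALIGNED_SLOW;
  return 0;
}

static int sketch_probe_fail(struct sketch_hwprobe *pairs, size_t count) {
  (void)pairs;
  (void)count;
  return -1; /* e.g. a kernel without the syscall */
}
```

Comparing the masked field against the "fast" value, instead of testing
the "fast" bits directly, keeps other values of the same field (such as
"slow") from matching by accident. A failed probe falls back to the
generic routine.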