Date: Wed, 29 Mar 2023 12:45:53 -0700 (PDT)
Subject: Re: [PATCH v2 0/3] RISC-V: ifunced memcpy using new kernel hwprobe interface
CC: Evan Green, libc-alpha@sourceware.org, slewis@rivosinc.com, Vineet Gupta
From: Palmer Dabbelt
To: adhemerval.zanella@linaro.org

On Wed, 29 Mar 2023 12:16:39 PDT (-0700), adhemerval.zanella@linaro.org wrote:
>
>
> On 28/03/23 21:01, Palmer Dabbelt wrote:
>> On Tue, 28 Mar 2023 16:41:10 PDT (-0700), adhemerval.zanella@linaro.org wrote:
>>>
>>>
>>> On 28/03/23 19:54, Palmer Dabbelt wrote:
>>>> On Tue, 21 Feb 2023 11:15:34 PST (-0800), Evan Green wrote:
>>>>>
>>>>> This series illustrates the use of a proposed Linux syscall that
>>>>> enumerates architectural information about the RISC-V cores the system
>>>>> is running on. In this series we expose a small wrapper function around
>>>>> the syscall. An ifunc selector for memcpy queries it to see if unaligned
>>>>> access is "fast" on this hardware.
>>>>> If it is, it selects a newly provided
>>>>> implementation of memcpy that doesn't work hard at aligning the src and
>>>>> destination buffers.
>>>>>
>>>>> This is somewhat of a proof of concept for the syscall itself, but I do
>>>>> find that in my goofy memcpy test [1], the unaligned memcpy performed at
>>>>> least as well as the generic C version. This is however on QEMU on an M1
>>>>> Mac, so not a test of any real hardware (more a smoke test that the
>>>>> implementation isn't silly).
>>>>
>>>> QEMU isn't a good enough benchmark to justify a new memcpy routine in glibc. Evan has a D1, which does support misaligned access and runs some simple benchmarks faster. There have also been some minor changes to the Linux side of things that warrant a v3 anyway, so he'll just post some benchmarks on HW along with that.
>>>>
>>>> Aside from those comments,
>>>>
>>>> Reviewed-by: Palmer Dabbelt
>>>>
>>>> There's a lot more stuff to probe for, but I think we've got enough of a proof of concept for the hwprobe stuff that we can move forward with the core interface bits in Linux/glibc and then unleash the chaos...
>>>>
>>>> Unless anyone else has comments?
>>>
>>> Until riscv_hwprobe is in Linus' tree as official Linux ABI, this patchset
>>> cannot be installed. We failed to enforce this on some occasions (like Intel
>>> CET) and it turned out to be a complete mess after some years...
>>
>> Sorry if that wasn't clear, I was asking if there were any more comments from the glibc side of things before merging the Linux code.
>
> Right, so is this already settled as the de facto ABI to query for system
> information in RISC-V? Or is it still being discussed? Is it in a next branch
> already, and/or has it been tested with a patched glibc?

It's not in for-next yet, but various patch sets / proposals have been on the lists for a few months and it seems like discussion on the kernel side has pretty much died down.
That's why I was pinging the glibc side of things: if anyone here has comments on the interface, then it's time to chime in. If there are no comments then we're likely to end up with this in the next release (so queued into for-next soon, Linus' master in a month or so).

IIUC Evan's been testing the kernel+glibc stuff on QEMU, but he should be able to ack that explicitly (it's a little vague in the cover letter). There's also a glibc-independent kselftest as part of the kernel patch set: https://lore.kernel.org/all/20230327163203.2918455-6-evan@rivosinc.com/ .

>
> In any case I added some minimal comments. With the vDSO approach I think
> there is no need to cache the result at startup, as aarch64 and x86 do.
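
For anyone skimming the thread, here is a rough sketch of the selection logic the cover letter describes. It is not the code from the series and it makes no real hwprobe call: `probe_fn`, `probe_fast`, `probe_slow`, `memcpy_noalign`, `memcpy_generic` and `select_memcpy` are all hypothetical names made up for illustration, and the "fast unaligned" answer is injected through a callback so the sketch builds on any machine. A real selector would be an ifunc resolver that asks the kernel instead.

```c
#include <assert.h>   /* only needed if you add checks like the ones noted below */
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the hwprobe wrapper: returns nonzero when the
   platform reports misaligned accesses as "fast".  Assumption: a real
   selector would query the kernel rather than take a callback.  */
typedef int (*probe_fn) (void);

typedef void *(*memcpy_fn) (void *, const void *, size_t);

/* Placeholder for a routine that makes no effort to align src or dst.
   It copies byte by byte only to keep the sketch portable.  */
static void *
memcpy_noalign (void *dst, const void *src, size_t n)
{
  unsigned char *d = dst;
  const unsigned char *s = src;

  while (n-- > 0)
    *d++ = *s++;
  return dst;
}

/* Placeholder for the generic version; it just defers to libc.  */
static void *
memcpy_generic (void *dst, const void *src, size_t n)
{
  return memcpy (dst, src, n);
}

/* The selection step: pick the no-align routine only when the probe says
   misaligned access is fast, otherwise fall back to the generic one.  */
static memcpy_fn
select_memcpy (probe_fn probe)
{
  return (probe != NULL && probe ()) ? memcpy_noalign : memcpy_generic;
}

/* Two fake probes so both branches can be exercised.  */
static int probe_fast (void) { return 1; }
static int probe_slow (void) { return 0; }
```

Calling `select_memcpy (probe_fast)` hands back `memcpy_noalign`, while `select_memcpy (probe_slow)` or a NULL probe falls back to `memcpy_generic`. The fallback on a missing probe is a choice made for this sketch (older kernels would not have the syscall), not something the thread specifies.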