From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mail-lf1-x12a.google.com (mail-lf1-x12a.google.com [IPv6:2a00:1450:4864:20::12a]) by sourceware.org (Postfix) with ESMTPS id 4A2F7385840B for ; Mon, 7 Aug 2023 22:11:29 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org 4A2F7385840B Authentication-Results: sourceware.org; dmarc=none (p=none dis=none) header.from=rivosinc.com Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=rivosinc.com Received: by mail-lf1-x12a.google.com with SMTP id 2adb3069b0e04-4fe383c1a26so8239343e87.1 for ; Mon, 07 Aug 2023 15:11:29 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=rivosinc-com.20221208.gappssmtp.com; s=20221208; t=1691446288; x=1692051088; h=content-transfer-encoding:cc:to:subject:message-id:date:from :in-reply-to:references:mime-version:from:to:cc:subject:date :message-id:reply-to; bh=P1t+0teYKTxH6Y8H0LbnWoDTKyl+uQ8k07lOx6BSq1o=; b=njolsqxDjIvYhS2y5gARQgkidGnRw0hN3M6xdqA5dSni33tHGzqVK+pNv7F0QXjQVn TQYpiofDI1W6HTMBQP4vimi9JC48S+eW7LJjnloFoTWj9g0qiheZfHy4f82f/oTY/Blf 6T2r4sEDhjFjm3AHQWukiLRu0LL4z2KTV+X7BMIO5tPVo/ggT+gnoXNlOXG8Cx1hupLC kv/lzHVbZJDEe74dUMrFRvmyaEofuOarww5nsIDboaNU5Y6iqD/mZOATMLe7/NdYb/v4 LlKwBmtKWFgaNAJluxXFP7JvX6gsNxETrEx9XRHCdmCfHu0VgViMJkyzyokYLPi1oZbb TpRw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20221208; t=1691446288; x=1692051088; h=content-transfer-encoding:cc:to:subject:message-id:date:from :in-reply-to:references:mime-version:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=P1t+0teYKTxH6Y8H0LbnWoDTKyl+uQ8k07lOx6BSq1o=; b=UKpt4fwB/nvPoJvPpCdxZ7K+SPQ6vZtTgBDGHXGiJ+dGwW7Y+LTMLJ7nDACLu8t4Ui NnhjCYNdzHMWcY07TQ8iUWMcAeP0UJ/1IRDydTsLEP7bSeVx8yVnufXXGG5EdxwNET0b oRziP5DEipr56IbO5Yynn6o0arAY/8ylzzVHg3J5jyl6VrTOzv5VfU8k2m48/g12lAu/ 64WRnLW+8ZflVRcvUHGoGHG3XpGpZpzzsSuJoG6EwsB+JVCcy6OwMvqBWrRIWTRhB1TV PodglKdgjbF5Qt6Bp8nGbhLXZQyedVWTLGylXIGJsf3sgalkuHMgDWveSQwMYE1mcQvp IMHA== X-Gm-Message-State: AOJu0Yx+r+dxOhm8k12MqVZJLWbT6dU6FvrD/O7/aWgOq6v2rtj6ewur lkp4jVIjNQDY4B+Bi235jmofvQ3POW5eQSRyEVqrxw== X-Google-Smtp-Source: AGHT+IEC36icUK1PyE/eFjmYb3hMjWc2OP00G5X99EWy+VeIoyNU7sf8rNbZcYtPjl4MCZAMyKrLt4jV+Et15KdTWws= X-Received: by 2002:a05:6512:3b2:b0:4fe:ecd:4959 with SMTP id v18-20020a05651203b200b004fe0ecd4959mr6280992lfp.32.1691446287802; Mon, 07 Aug 2023 15:11:27 -0700 (PDT) MIME-Version: 1.0 References: <20230802155903.2552780-1-evan@rivosinc.com> <20230802155903.2552780-6-evan@rivosinc.com> <87il9w37vi.fsf@oldenburg.str.redhat.com> <6f0911c6-b24b-444c-4b4b-a62e49a51734@linaro.org> <548fc7d5-6225-69e7-f4a7-47669d2fdbd5@linaro.org> In-Reply-To: <548fc7d5-6225-69e7-f4a7-47669d2fdbd5@linaro.org> From: Evan Green Date: Mon, 7 Aug 2023 15:10:51 -0700 Message-ID: Subject: Re: [PATCH v6 5/5] riscv: Add and use alignment-ignorant memcpy To: Richard Henderson Cc: Florian Weimer , libc-alpha@sourceware.org, slewis@rivosinc.com, palmer@rivosinc.com, vineetg@rivosinc.com Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable X-Spam-Status: No, score=-2.7 required=5.0 tests=BAYES_00,DKIM_SIGNED,DKIM_VALID,RCVD_IN_DNSWL_NONE,SPF_HELO_NONE,SPF_PASS,TXREP autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org List-Id: On Thu, Aug 3, 2023 at 3:30=E2=80=AFPM Richard Henderson wrote: > > On 8/3/23 11:42, Evan Green wrote: > > On Thu, Aug 3, 2023 at 10:50=E2=80=AFAM Richard Henderson > > wrote: > >> Outside libc something is required. > >> > >> An extra parameter to ifunc is surprising though, and clearly not idea= l per the extra > >> hoops above. I would hope for something with hidden visibility in lib= c_nonshared.a that > >> could always be called directly. > > > > My previous spin took that approach, defining a > > __riscv_hwprobe_early() in libc_nonshared that could route to the real > > function if available, or make the syscall directly if not. But that > > approach had the drawback that ifunc users couldn't take advantage of > > the vDSO, and then all users had to comprehend the difference between > > __riscv_hwprobe() and __riscv_hwprobe_early(). > > I would define __riscv_hwprobe such that it could take advantage of the v= DSO once > initialization reaches a certain point, but cope with being run earlier t= han that point by > falling back to the syscall. > > That constrains the implementation, I guess, in that it can't set errno, = but just > returning the negative errno from the syscall seems fine. > > It might be tricky to get a reference to GLRO(dl_vdso_riscv_hwprobe) very= early, but I > would hope that some application of __attribute__((weak)) might correctly= get you a NULL > prior to full relocations being complete. Right, this is what we had in the previous iteration of this series, and it did work ok. But it wasn't as good since it meant ifunc selectors always got stuck in the null/fallback case and were forced to make the syscall. With this mechanism they get to take advantage of the vDSO. > > > > In contrast, IMO this approach is much nicer. Ifunc writers are > > already used to getting hwcap info via a parameter. Adding this second > > parameter, which also provides hwcap-like things, seems like a natural > > extension. I didn't quite follow what you meant by the "extra hoops > > above". > > The check for null function pointer, for sure. But also consider how __r= iscv_hwprobe is > going to be used. > > It might be worth defining some helper functions for probing a single key= or a single > field. E.g. > > uint64_t __riscv_hwprobe_one_key(int64_t key, unsigned int flags) > { > struct riscv_hwprobe pair =3D { .key =3D key }; > int err =3D __riscv_hwprobe(&pair, 1, 0, NULL, flags); > if (err) > return err; > if (pair.key =3D=3D -1) > return -ENOENT; > return pair.value; > } > > This implementation requires that no future hwprobe key define a value wh= ich as a valid > value in the errno range (or better, bit 63 unused). Alternately, or add= itionally: > > bool __riscv_hwprobe_one_mask(int64_t key, uint64_t mask, uint64_t val, i= nt flags) > { > struct riscv_hwprobe pair =3D { .key =3D key }; > return (__riscv_hwprobe(&pair, 1, 0, NULL, flags) =3D=3D 0 > && pair.key !=3D -1 > && (pair.value & mask) =3D=3D val); > } > > These yield either > > int64_t v =3D __riscv_hwprobe_one_key(CPUPERF_0, 0); > if (v >=3D 0 && (v & MISALIGNED_MASK) =3D=3D MISALIGNED_FAST) > return __memcpy_noalignment; > return __memcpy_generic; > > or > > if (__riscv_hwprobe_one_mask(CPUPERF_0, MISALIGNED_MASK, MISALIGNED_= FAST, 0)) > return __memcpy_noalignment; > return __memcpy_generic; > > which to my mind looks much better for a pattern you'll be replicating so= very many times > across all of the ifunc implementations in the system. Ah, I see. I could make a static inline function in the header that looks something like this (mangled by gmail, sorry): /* Helper function usable from ifunc selectors that probes a single key. */ static inline int __riscv_hwprobe_one(__riscv_hwprobe_t hwprobe_func, signed long long int key, unsigned long long int *value) { struct riscv_hwprobe pair; int rc; if (!hwprobe_func) return -ENOSYS; pair.key =3D key; rc =3D hwprobe_func(&pair, 1, 0, NULL, 0); if (rc) { return rc; } if (pair.key < 0) { return -ENOENT; } *value =3D pair.value; return 0; } The ifunc selector would then be significantly cleaned up, looking something like: if (__riscv_hwprobe_one(hwprobe_func, RISCV_HWPROBE_KEY_CPUPERF_0, &value)) return __memcpy_generic; if (value & RISCV_HWPROBE_MISALIGNED_MASK) =3D=3D RISCV_HWPROBE_MISALIGNED_= FAST) return __memcpy_noalignment; -Evan