Message-ID: <548fc7d5-6225-69e7-f4a7-47669d2fdbd5@linaro.org>
Date: Thu, 3 Aug 2023 15:30:27 -0700
Subject: Re: [PATCH v6 5/5] riscv: Add and use alignment-ignorant memcpy
To: Evan Green
Cc: Florian Weimer, libc-alpha@sourceware.org, slewis@rivosinc.com,
 palmer@rivosinc.com, vineetg@rivosinc.com
References: <20230802155903.2552780-1-evan@rivosinc.com>
 <20230802155903.2552780-6-evan@rivosinc.com>
 <87il9w37vi.fsf@oldenburg.str.redhat.com>
 <6f0911c6-b24b-444c-4b4b-a62e49a51734@linaro.org>
From: Richard Henderson

On 8/3/23 11:42, Evan Green wrote:
> On Thu, Aug 3, 2023 at 10:50 AM Richard Henderson wrote:
>> Outside libc something is required.
>>
>> An extra parameter to ifunc is surprising though, and clearly not ideal per the extra
>> hoops above. I would hope for something with hidden visibility in libc_nonshared.a that
>> could always be called directly.
>
> My previous spin took that approach, defining a
> __riscv_hwprobe_early() in libc_nonshared that could route to the real
> function if available, or make the syscall directly if not. But that
> approach had the drawback that ifunc users couldn't take advantage of
> the vDSO, and then all users had to comprehend the difference between
> __riscv_hwprobe() and __riscv_hwprobe_early().

I would define __riscv_hwprobe such that it could take advantage of the vDSO once
initialization reaches a certain point, but cope with being run earlier than that point by
falling back to the syscall.

That constrains the implementation, I guess, in that it can't set errno, but just
returning the negative errno from the syscall seems fine.

It might be tricky to get a reference to GLRO(dl_vdso_riscv_hwprobe) very early, but I
would hope that some application of __attribute__((weak)) might correctly get you a NULL
prior to full relocations being complete.

> In contrast, IMO this approach is much nicer. Ifunc writers are
> already used to getting hwcap info via a parameter. Adding this second
> parameter, which also provides hwcap-like things, seems like a natural
> extension. I didn't quite follow what you meant by the "extra hoops
> above".

The check for null function pointer, for sure. But also consider how __riscv_hwprobe is
going to be used.

It might be worth defining some helper functions for probing a single key or a single
field. E.g.

    /* Returns the value for KEY, or a negative errno on failure.  */
    int64_t __riscv_hwprobe_one_key(int64_t key, unsigned int flags)
    {
      struct riscv_hwprobe pair = { .key = key };
      int err = __riscv_hwprobe(&pair, 1, 0, NULL, flags);
      if (err)
        return err;
      /* An unrecognized key comes back with key set to -1.  */
      if (pair.key == -1)
        return -ENOENT;
      return pair.value;
    }

This implementation requires that no future hwprobe key define a value that falls in the
negative errno range (or better, that bit 63 stay unused).

Alternately, or additionally:

    /* True only if the probe succeeds, KEY is known, and the masked value matches VAL.  */
    bool __riscv_hwprobe_one_mask(int64_t key, uint64_t mask, uint64_t val, unsigned int flags)
    {
      struct riscv_hwprobe pair = { .key = key };
      return (__riscv_hwprobe(&pair, 1, 0, NULL, flags) == 0
              && pair.key != -1
              && (pair.value & mask) == val);
    }

These yield either

    int64_t v = __riscv_hwprobe_one_key(CPUPERF_0, 0);
    if (v >= 0 && (v & MISALIGNED_MASK) == MISALIGNED_FAST)
      return __memcpy_noalignment;
    return __memcpy_generic;

or

    if (__riscv_hwprobe_one_mask(CPUPERF_0, MISALIGNED_MASK, MISALIGNED_FAST, 0))
      return __memcpy_noalignment;
    return __memcpy_generic;

which to my mind looks much better for a pattern you'll be replicating so very many times
across all of the ifunc implementations in the system.


r~
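
For illustration only, here is a minimal sketch of the fall-back behaviour suggested above (use the vDSO entry once it is available, otherwise make the syscall and return its negative errno), written as a portable mock rather than real glibc code. Every demo_* name, the shortened three-argument signature, and the constant values are assumptions made up for this sketch. The NULL function pointer stands in for what a weak reference to the vDSO entry might yield before relocation is complete, and the "syscall" is simulated so the example runs on any host.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Demo pair type: a key/value pair, modelled loosely on struct riscv_hwprobe. */
struct demo_hwprobe { int64_t key; uint64_t value; };

/* Placeholder constants for the sketch, not taken from any real header. */
#define DEMO_KEY_CPUPERF_0   5
#define DEMO_MISALIGNED_MASK UINT64_C(7)
#define DEMO_MISALIGNED_FAST UINT64_C(3)

typedef int (*demo_hwprobe_fn)(struct demo_hwprobe *, size_t, unsigned int);

/* Stand-in for the vDSO entry point; NULL models "not resolved yet". */
static demo_hwprobe_fn demo_vdso_entry = NULL;
/* Counts how often the simulated syscall path was taken. */
static int demo_syscall_count = 0;

/* Shared mock back end: returns 0 or a negative errno, never sets errno. */
static int demo_fill(struct demo_hwprobe *pairs, size_t count, unsigned int flags)
{
  assert(count == 0 || pairs != NULL);
  if (flags != 0)
    return -EINVAL;
  for (size_t i = 0; i < count; i++) {
    if (pairs[i].key == DEMO_KEY_CPUPERF_0) {
      pairs[i].value = DEMO_MISALIGNED_FAST;
    } else {
      pairs[i].key = -1;        /* unknown key is reported as -1 */
      pairs[i].value = 0;
    }
  }
  return 0;
}

/* Simulated raw syscall path. */
static int demo_raw_syscall(struct demo_hwprobe *pairs, size_t count, unsigned int flags)
{
  demo_syscall_count++;
  return demo_fill(pairs, count, flags);
}

/* Simulated vDSO path: same answer, no syscall. */
static int demo_vdso_impl(struct demo_hwprobe *pairs, size_t count, unsigned int flags)
{
  return demo_fill(pairs, count, flags);
}

/* Models the point in startup after which the vDSO entry is usable. */
void demo_finish_startup(void)
{
  demo_vdso_entry = demo_vdso_impl;
}

/* Single entry point: prefer the vDSO, fall back to the syscall when early. */
int demo_hwprobe(struct demo_hwprobe *pairs, size_t count, unsigned int flags)
{
  demo_hwprobe_fn fn = demo_vdso_entry;
  return fn != NULL ? fn(pairs, count, flags)
                    : demo_raw_syscall(pairs, count, flags);
}

/* One-field helper in the style of __riscv_hwprobe_one_mask above. */
bool demo_hwprobe_one_mask(int64_t key, uint64_t mask, uint64_t val, unsigned int flags)
{
  struct demo_hwprobe pair = { .key = key };
  return demo_hwprobe(&pair, 1, flags) == 0
         && pair.key != -1
         && (pair.value & mask) == val;
}
```

With this shape a caller never needs to know which path was taken: the helper gives the same answer before and after demo_finish_startup(), and only the early call goes through the simulated syscall.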