From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mail-qv1-xf32.google.com (mail-qv1-xf32.google.com [IPv6:2607:f8b0:4864:20::f32]) by sourceware.org (Postfix) with ESMTPS id 1C1BE3858426 for ; Tue, 22 Aug 2023 15:06:14 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org 1C1BE3858426 Authentication-Results: sourceware.org; dmarc=pass (p=reject dis=none) header.from=google.com Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=google.com Received: by mail-qv1-xf32.google.com with SMTP id 6a1803df08f44-64f444fe7a7so8713966d6.0 for ; Tue, 22 Aug 2023 08:06:14 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20221208; t=1692716773; x=1693321573; h=content-transfer-encoding:cc:to:subject:message-id:date:from :in-reply-to:references:mime-version:from:to:cc:subject:date :message-id:reply-to; bh=H/CxOee2ODLp0oUivd7DjjvQR9VSPdAJtLI4sL2zSmQ=; b=rZDJnJN11Po23eBMbWbQ1wFgOuinP3cih3HaDPiJjsDBbhvr5DB/pn5cXdGpxUAJpO j7xkZevGwN3f+N2x9nlJlWzzJOkYr8ZChKw6IrkgIujVygrs4WPmJwT0kio3+HZgUFQO 9WFH9xWyp6jqprZoc3gbn1r06i5QBzFRDw+4We1xTwmSOMrkh9jYHcCYR/HGGpihMK/k aXVMH59MMK2ZTopGyzFQZCI2Qm9UwWbmGABz4DZoHuD2GG8O6eA7/g77E6sbrpgl9wEz ZvFpw13luMqXpRBdNIS2ScgZ1sqH1pH5qpioMuSQ25xG1qjrmUUunnP8ZiWiL5MLm3WR 8cWQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20221208; t=1692716773; x=1693321573; h=content-transfer-encoding:cc:to:subject:message-id:date:from :in-reply-to:references:mime-version:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=H/CxOee2ODLp0oUivd7DjjvQR9VSPdAJtLI4sL2zSmQ=; b=iRRbx0kTdQw3CMkf7LOnLj0dcOXNJbau9QEFEom/w5osqZoKW4QO/ZChXLaM4WxAZw wp9VN4Lw6drzNZDZf8C3NnX9O5DK44eQ/TCu8C2kGLJskVBu06A8VP1+cdAoZWzAnz/F 1yZyDW//yqLrErJWsJWlU7Xkty3tv+rZcvx6Z5tqYMHH0myM7CdGFt8AKnDv7+88tb+y 1OIMXxtw6r3VQU4X2Gi+a6uQ9WZGVuUapxbabN7a2vlWkiLatUfSwQ5zAHgXdIbHaTlo eIUGfMIPJR4AJ3/orzi/6d0mi55j6uh+ahcWtKdDV1fl+4aMfwHmf/WGn9l20kLsV2w5 KmZg== X-Gm-Message-State: AOJu0Yxgyygvq4dPJexp80MaG8NyxFuKQ40z9zwhGUZ7HUyUZ4gAUYOm 0tdktCfcGKl6ljjezwyDfoci336A+IMFFmrhp7UgTA== X-Google-Smtp-Source: AGHT+IEOasKenKqu/dIYZyja9ORPHu3L7z6ZRLhoPq7DVZ7uYI11K/vUXi63z1sSXaozQZq4HNkE8DbS/c/5dNC+RCg= X-Received: by 2002:a05:6214:cac:b0:62e:d9db:c788 with SMTP id s12-20020a0562140cac00b0062ed9dbc788mr13981389qvs.3.1692716773103; Tue, 22 Aug 2023 08:06:13 -0700 (PDT) MIME-Version: 1.0 References: <20230802155903.2552780-1-evan@rivosinc.com> <20230802155903.2552780-6-evan@rivosinc.com> <87il9w37vi.fsf@oldenburg.str.redhat.com> <6f0911c6-b24b-444c-4b4b-a62e49a51734@linaro.org> <548fc7d5-6225-69e7-f4a7-47669d2fdbd5@linaro.org> In-Reply-To: From: enh Date: Tue, 22 Aug 2023 08:06:01 -0700 Message-ID: Subject: Re: [PATCH v6 5/5] riscv: Add and use alignment-ignorant memcpy To: Evan Green Cc: Richard Henderson , Florian Weimer , libc-alpha@sourceware.org, slewis@rivosinc.com, palmer@rivosinc.com, vineetg@rivosinc.com Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable X-Spam-Status: No, score=-17.5 required=5.0 tests=BAYES_00,DKIMWL_WL_MED,DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,ENV_AND_HDR_SPF_MATCH,RCVD_IN_DNSWL_NONE,SPF_HELO_NONE,SPF_PASS,TXREP,USER_IN_DEF_DKIM_WL,USER_IN_DEF_SPF_WL autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org List-Id: On Thu, Aug 17, 2023 at 10:41=E2=80=AFAM Evan Green wro= te: > > On Thu, Aug 17, 2023 at 9:37=E2=80=AFAM enh wrote: > > > > On Thu, Aug 17, 2023 at 9:27=E2=80=AFAM Evan Green = wrote: > > > > > > On Wed, Aug 16, 2023 at 4:18=E2=80=AFPM enh wrote: > > > > > > > > On Tue, Aug 15, 2023 at 4:02=E2=80=AFPM Evan Green wrote: > > > > > > > > > > On Tue, Aug 15, 2023 at 2:54=E2=80=AFPM enh wrot= e: > > > > > > > > > > > > On Tue, Aug 15, 2023 at 9:41=E2=80=AFAM Evan Green wrote: > > > > > > > > > > > > > > On Fri, Aug 11, 2023 at 5:01=E2=80=AFPM enh = wrote: > > > > > > > > > > > > > > > > On Mon, Aug 7, 2023 at 5:01=E2=80=AFPM Evan Green wrote: > > > > > > > > > > > > > > > > > > On Mon, Aug 7, 2023 at 3:48=E2=80=AFPM enh wrote: > > > > > > > > > > > > > > > > > > > > On Mon, Aug 7, 2023 at 3:11=E2=80=AFPM Evan Green wrote: > > > > > > > > > > > > > > > > > > > > > > On Thu, Aug 3, 2023 at 3:30=E2=80=AFPM Richard Hender= son > > > > > > > > > > > wrote: > > > > > > > > > > > > > > > > > > > > > > > > On 8/3/23 11:42, Evan Green wrote: > > > > > > > > > > > > > On Thu, Aug 3, 2023 at 10:50=E2=80=AFAM Richard H= enderson > > > > > > > > > > > > > wrote: > > > > > > > > > > > > >> Outside libc something is required. > > > > > > > > > > > > >> > > > > > > > > > > > > >> An extra parameter to ifunc is surprising though= , and clearly not ideal per the extra > > > > > > > > > > > > >> hoops above. I would hope for something with hi= dden visibility in libc_nonshared.a that > > > > > > > > > > > > >> could always be called directly. > > > > > > > > > > > > > > > > > > > > > > > > > > My previous spin took that approach, defining a > > > > > > > > > > > > > __riscv_hwprobe_early() in libc_nonshared that co= uld route to the real > > > > > > > > > > > > > function if available, or make the syscall direct= ly if not. But that > > > > > > > > > > > > > approach had the drawback that ifunc users couldn= 't take advantage of > > > > > > > > > > > > > the vDSO, and then all users had to comprehend th= e difference between > > > > > > > > > > > > > __riscv_hwprobe() and __riscv_hwprobe_early(). > > > > > > > > > > > > > > > > > > > > > > > > I would define __riscv_hwprobe such that it could t= ake advantage of the vDSO once > > > > > > > > > > > > initialization reaches a certain point, but cope wi= th being run earlier than that point by > > > > > > > > > > > > falling back to the syscall. > > > > > > > > > > > > > > > > > > > > > > > > That constrains the implementation, I guess, in tha= t it can't set errno, but just > > > > > > > > > > > > returning the negative errno from the syscall seems= fine. > > > > > > > > > > > > > > > > > > > > > > > > It might be tricky to get a reference to GLRO(dl_vd= so_riscv_hwprobe) very early, but I > > > > > > > > > > > > would hope that some application of __attribute__((= weak)) might correctly get you a NULL > > > > > > > > > > > > prior to full relocations being complete. > > > > > > > > > > > > > > > > > > > > > > Right, this is what we had in the previous iteration = of this series, > > > > > > > > > > > and it did work ok. But it wasn't as good since it me= ant ifunc > > > > > > > > > > > selectors always got stuck in the null/fallback case = and were forced > > > > > > > > > > > to make the syscall. With this mechanism they get to = take advantage of > > > > > > > > > > > the vDSO. > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > In contrast, IMO this approach is much nicer. Ifu= nc writers are > > > > > > > > > > > > > already used to getting hwcap info via a paramete= r. Adding this second > > > > > > > > > > > > > parameter, which also provides hwcap-like things,= seems like a natural > > > > > > > > > > > > > extension. I didn't quite follow what you meant b= y the "extra hoops > > > > > > > > > > > > > above". > > > > > > > > > > > > > > > > > > > > > > > > The check for null function pointer, for sure. But= also consider how __riscv_hwprobe is > > > > > > > > > > > > going to be used. > > > > > > > > > > > > > > > > > > > > > > > > It might be worth defining some helper functions fo= r probing a single key or a single > > > > > > > > > > > > field. E.g. > > > > > > > > > > > > > > > > > > > > > > > > uint64_t __riscv_hwprobe_one_key(int64_t key, unsig= ned int flags) > > > > > > > > > > > > { > > > > > > > > > > > > struct riscv_hwprobe pair =3D { .key =3D key }; > > > > > > > > > > > > int err =3D __riscv_hwprobe(&pair, 1, 0, NULL, f= lags); > > > > > > > > > > > > if (err) > > > > > > > > > > > > return err; > > > > > > > > > > > > if (pair.key =3D=3D -1) > > > > > > > > > > > > return -ENOENT; > > > > > > > > > > > > return pair.value; > > > > > > > > > > > > } > > > > > > > > > > > > > > > > > > > > > > > > This implementation requires that no future hwprobe= key define a value which as a valid > > > > > > > > > > > > value in the errno range (or better, bit 63 unused)= . Alternately, or additionally: > > > > > > > > > > > > > > > > > > > > > > > > bool __riscv_hwprobe_one_mask(int64_t key, uint64_t= mask, uint64_t val, int flags) > > > > > > > > > > > > { > > > > > > > > > > > > struct riscv_hwprobe pair =3D { .key =3D key }; > > > > > > > > > > > > return (__riscv_hwprobe(&pair, 1, 0, NULL, flags= ) =3D=3D 0 > > > > > > > > > > > > && pair.key !=3D -1 > > > > > > > > > > > > && (pair.value & mask) =3D=3D val); > > > > > > > > > > > > } > > > > > > > > > > > > > > > > > > > > > > > > These yield either > > > > > > > > > > > > > > > > > > > > > > > > int64_t v =3D __riscv_hwprobe_one_key(CPUPERF_= 0, 0); > > > > > > > > > > > > if (v >=3D 0 && (v & MISALIGNED_MASK) =3D=3D M= ISALIGNED_FAST) > > > > > > > > > > > > return __memcpy_noalignment; > > > > > > > > > > > > return __memcpy_generic; > > > > > > > > > > > > > > > > > > > > > > > > or > > > > > > > > > > > > > > > > > > > > > > > > if (__riscv_hwprobe_one_mask(CPUPERF_0, MISALI= GNED_MASK, MISALIGNED_FAST, 0)) > > > > > > > > > > > > return __memcpy_noalignment; > > > > > > > > > > > > return __memcpy_generic; > > > > > > > > > > > > > > > > > > > > > > > > which to my mind looks much better for a pattern yo= u'll be replicating so very many times > > > > > > > > > > > > across all of the ifunc implementations in the syst= em. > > > > > > > > > > > > > > > > > > > > > > Ah, I see. I could make a static inline function in t= he header that > > > > > > > > > > > looks something like this (mangled by gmail, sorry): > > > > > > > > > > > > > > > > > > > > > > /* Helper function usable from ifunc selectors that p= robes a single key. */ > > > > > > > > > > > static inline int __riscv_hwprobe_one(__riscv_hwprobe= _t hwprobe_func, > > > > > > > > > > > signed long long int key, > > > > > > > > > > > unsigned long long int *value) > > > > > > > > > > > { > > > > > > > > > > > struct riscv_hwprobe pair; > > > > > > > > > > > int rc; > > > > > > > > > > > > > > > > > > > > > > if (!hwprobe_func) > > > > > > > > > > > return -ENOSYS; > > > > > > > > > > > > > > > > > > > > > > pair.key =3D key; > > > > > > > > > > > rc =3D hwprobe_func(&pair, 1, 0, NULL, 0); > > > > > > > > > > > if (rc) { > > > > > > > > > > > return rc; > > > > > > > > > > > } > > > > > > > > > > > > > > > > > > > > > > if (pair.key < 0) { > > > > > > > > > > > return -ENOENT; > > > > > > > > > > > } > > > > > > > > > > > > > > > > > > > > > > *value =3D pair.value; > > > > > > > > > > > return 0; > > > > > > > > > > > } > > > > > > > > > > > > > > > > > > > > > > The ifunc selector would then be significantly cleane= d up, looking > > > > > > > > > > > something like: > > > > > > > > > > > > > > > > > > > > > > if (__riscv_hwprobe_one(hwprobe_func, RISCV_HWPROBE_K= EY_CPUPERF_0, &value)) > > > > > > > > > > > return __memcpy_generic; > > > > > > > > > > > > > > > > > > > > > > if (value & RISCV_HWPROBE_MISALIGNED_MASK) =3D=3D RIS= CV_HWPROBE_MISALIGNED_FAST) > > > > > > > > > > > return __memcpy_noalignment; > > > > > > > > > > > > > > > > > > > > (Android's libc maintainer here, having joined the list= just to talk > > > > > > > > > > about risc-v ifuncs :-) ) > > > > > > > > > > > > > > > > > > > > has anyone thought about calling ifunc resolvers more l= ike this... > > > > > > > > > > > > > > > > > > > > --same part of the dynamic loader that caches the two g= etauxval()s for arm64-- > > > > > > > > > > static struct riscv_hwprobe probes[] =3D { > > > > > > > > > > {.value =3D RISCV_HWPROBE_KEY_MVENDORID}, > > > > > > > > > > {.value =3D RISCV_HWPROBE_KEY_MARCHID}, > > > > > > > > > > {.value =3D RISCV_HWPROBE_KEY_MIMPID}, > > > > > > > > > > {.value =3D RISCV_HWPROBE_KEY_BASE_BEHAVIOR}, > > > > > > > > > > {.value =3D RISCV_HWPROBE_KEY_IMA_EXT}, > > > > > > > > > > {.value =3D RISCV_HWPROBE_KEY_CPUPERF_0}, > > > > > > > > > > ... // every time a new key is added to the kernel, we = add it here > > > > > > > > > > }; > > > > > > > > > > __riscv_hwprobe(...); // called once > > > > > > > > > > > > > > > > > > > > --part of the dynamic loader that calls ifunc resolvers= -- > > > > > > > > > > (*ifunc_resolver)(sizeof(probes)/sizeof(probes[0]), pro= bes); > > > > > > > > > > > > > > > > > > > > this is similar to what we already have for arm64 (wher= e there's a > > > > > > > > > > getauxval(AT_HWCAP) and a pointer to a struct for AT_HW= CAP2 and > > > > > > > > > > potentially others), but more uniform, and avoiding the= source > > > > > > > > > > (in)compatibility issues of adding new fields to a stru= ct [even if it > > > > > > > > > > does have a size_t to "version" it like the arm64 ifunc= struct]. > > > > > > > > > > > > > > > > > > > > yes, it means everyone pays to get all the hwprobes, bu= t that gets > > > > > > > > > > amortized. and lookup in the ifunc resolver is simple a= nd quick. if we > > > > > > > > > > know that the keys will be kept dense, we can even have= code in ifunc > > > > > > > > > > resolvers like > > > > > > > > > > > > > > > > > > > > if (probes[RISCV_HWPROBE_BASE_BEHAVIOR_IMA].value & RIS= CV_HWPROBE_IMA_V) ... > > > > > > > > > > > > > > > > > > > > though personally for the "big ticket items" that get a= letter to > > > > > > > > > > themselves like V, i'd be tempted to pass `(getauxval(A= T_HWCAP), > > > > > > > > > > probe_count, probes_ptr)` to the resolver, but i hear t= hat's > > > > > > > > > > controversial :-) > > > > > > > > > > > > > > > > > > Hello, welcome to the fun! :) > > > > > > > > > > > > > > > > (sorry for the delay. i've been thinking :-) ) > > > > > > > > > > > > > > > > > What you're describing here is almost exactly what we did= inside the > > > > > > > > > vDSO function. The vDSO function acts as a front for a ha= ndful of > > > > > > > > > probe values that we've already completed and cached in u= serspace. We > > > > > > > > > opted to make it a function, rather than exposing the dat= a itself via > > > > > > > > > vDSO, so that we had future flexibility in what elements = we cached in > > > > > > > > > userspace and their storage format. We can update the ker= nel as needed > > > > > > > > > to cache the hottest things in userspace, even if that me= ans > > > > > > > > > rearranging the data format, passing through some extra i= nformation, > > > > > > > > > or adding an extra snip of code. My hope is callers can d= irectly > > > > > > > > > interact with the vDSO function (though maybe as Richard = suggested > > > > > > > > > maybe with the help of a tidy inline helper), rather than= trying to > > > > > > > > > add a second layer of userspace caching. > > > > > > > > > > > > > > > > on reflection i think i might be too focused on the FMV use= case, in > > > > > > > > part because we're looking at those compiler-generated ifun= cs for > > > > > > > > arm64 on Android atm. i think i'm imagining a world where t= here's a > > > > > > > > lot of that, and worrying about having to pay for the setup= , call, and > > > > > > > > loop for each ifunc, and wondering why we don't just pay on= ce instead. > > > > > > > > (as a bit of background context, Android "app" start is act= ually a > > > > > > > > dlopen() in a clone of an existing zygote process, and in g= eneral app > > > > > > > > launch time is one of the key metrics anyone who's serious = is > > > > > > > > optimizing for. you'd be surprised how much of my life i sp= end > > > > > > > > explaining to people that if they want dlopen() to be faste= r, maybe > > > > > > > > they shouldn't ask us to run thousands of ELF constructors.= ) > > > > > > > > > > > > > > > > but... the more time i spend looking at what we actually ne= ed in > > > > > > > > third-party open source libraries right now i realize that = libc and > > > > > > > > FMV (which is still a future thing for us anyway) are reall= y the only > > > > > > > > _actual_ ifunc users. perhaps in part because macOS/iOS don= 't have > > > > > > > > ifuncs, all the libraries that are part of the OS itself, f= or example, > > > > > > > > are just doing their own thing with function pointers and > > > > > > > > pthread_once() or whatever. > > > > > > > > > > > > > > > > (i have yet to try to get any data on actual apps. i have n= o reason to > > > > > > > > think they'll be very different, but that could easily be s= kewed by > > > > > > > > popular middleware or a popular game engine using ifuncs, s= o i do plan > > > > > > > > on following up on that.) > > > > > > > > > > > > > > > > "how do they decide what to set that function pointer to?".= well, it > > > > > > > > looks like in most cases cpuid on x86 and calls to getauxva= l() > > > > > > > > everywhere else. in some cases that's actually via some oth= er library: > > > > > > > > https://github.com/pytorch/cpuinfo or > > > > > > > > https://github.com/google/cpu_features for example. so they= have a > > > > > > > > layer of caching there, even in cases where they don't have= a single > > > > > > > > function that sets all the function pointers. > > > > > > > > > > > > > > Right, function multi-versioning is just the sort of spot whe= re we'd > > > > > > > imagine hwprobe gets used, since it's providing similar/equiv= alent > > > > > > > information to what cpuid does on x86. It may not be quite as= fast as > > > > > > > cpuid (I don't know how fast cpuid actually is). But with the= vDSO > > > > > > > function+data in userspace it should be able to match getauxv= al() in > > > > > > > performance, as they're both a function pointer plus a loop. = We're > > > > > > > sort of planning for a world in which RISC-V has a wider set = of these > > > > > > > values to fetch, such that a ifunc selector may need a more c= omplex > > > > > > > set of information. Hwprobe and the vDSO gives us the ability= both to > > > > > > > answer multiple queries fast, and freely allocate more keys t= hat may > > > > > > > represent versioned features or even compound features. > > > > > > > > > > > > yeah, my incorrect mental model was that -- primarily because o= f > > > > > > x86-64 and cpuid -- every function would get its own ifunc reso= lver > > > > > > that would have to make a query. but the [in progress] arm64 > > > > > > implementation shows that that's not really the case anyway, an= d we > > > > > > can just cache __riscv_hwprobe() in the same [one] place that > > > > > > getauxval() is already being cached for arm64. > > > > > > > > > > Sounds good. > > > > > > > > > > > > > > > > > > > so assuming i don't find that apps look very different from= the OS > > > > > > > > (that is: that apps use lots of ifuncs), i probably don't c= are at all > > > > > > > > until we get to FMV. and i probably don't care for FMV, bec= ause > > > > > > > > compiler-rt (or gcc's equivalent) will be the "caching laye= r" there. > > > > > > > > (and on Android it'll be a while before i have to worry abo= ut libc's > > > > > > > > ifuncs because we'll require V and not use ifuncs there for= the > > > > > > > > foreseeable future.) > > > > > > > > > > > > > > > > so, yeah, given that i've adopted the "pass a null pointer = rather than > > > > > > > > no arguments" convention you have, we have room for expansi= on if/when > > > > > > > > FMV is a big thing, and until then -- unless i'm shocked by= what i > > > > > > > > find looking at actual apps -- i don't think i have any rea= son to > > > > > > > > believe that ifuncs matter that much, and if compiler-rt ma= kes one > > > > > > > > __riscv_hwprobe() call per .so, that's probably fine. (i al= ready spend > > > > > > > > a big chunk of my life advising people to just have one .so= file, > > > > > > > > exporting nothing but a JNI_OnLoad symbol, so this will jus= t make that > > > > > > > > advice even better advice :-) ) > > > > > > > > > > > > > > Just to confirm, by "pass a null pointer", you're saying that= the > > > > > > > Android libc also passes NULL as the second ifunc selector ar= gument > > > > > > > (or first)? > > > > > > > > > > > > #elif defined(__riscv) > > > > > > // This argument and its value is just a placeholder for now, > > > > > > // but it means that if we do pass something in future (such = as > > > > > > // getauxval() and/or hwprobe key/value pairs), callees will = be able to > > > > > > // recognize what they're being given. > > > > > > typedef ElfW(Addr) (*ifunc_resolver_t)(void*); > > > > > > return reinterpret_cast(resolver_addr)(null= ptr); > > > > > > > > > > > > it's arm64 that has the initial getauxval() argument: > > > > > > > > > > > > #if defined(__aarch64__) > > > > > > typedef ElfW(Addr) (*ifunc_resolver_t)(uint64_t, __ifunc_arg_= t*); > > > > > > static __ifunc_arg_t arg; > > > > > > static bool initialized =3D false; > > > > > > if (!initialized) { > > > > > > initialized =3D true; > > > > > > arg._size =3D sizeof(__ifunc_arg_t); > > > > > > arg._hwcap =3D getauxval(AT_HWCAP); > > > > > > arg._hwcap2 =3D getauxval(AT_HWCAP2); > > > > > > } > > > > > > return reinterpret_cast(resolver_addr)(arg.= _hwcap > > > > > > | _IFUNC_ARG_HWCAP, &arg); > > > > > > > > > > > > https://android.googlesource.com/platform/bionic/+/main/libc/bi= onic/bionic_call_ifunc_resolver.cpp > > > > > > > > > > > > > That's good. It sounds like you're planning to just > > > > > > > continue passing NULL for now, and wait for people to start c= lamoring > > > > > > > for this in android libc? > > > > > > > > > > > > yeah, and i'm assuming there will never be any clamor ... yeste= rday > > > > > > and today i actually checked a bunch of popular apks, and didn'= t find > > > > > > any that were currently using ifuncs. > > > > > > > > > > > > the only change i'm thinking of making right now is that "there= 's a > > > > > > single argument, and it's null" should probably be the default. > > > > > > obviously since Android doesn't add new architectures very ofte= n, this > > > > > > is only likely to affect x86/x86-64 for the foreseeable future,= but > > > > > > being able to recognize at a glance "am i running under a libc = new > > > > > > enough to pass me arguments?" would certainly have helped for a= rm64. > > > > > > even if x86/x86-64 never benefit, it seems like the right defau= lt for > > > > > > the #else clause... > > > > > > > > > > Sounds good, thanks for the pointers. The paranoid person in me w= ould > > > > > also add a comment in the risc-v section that if a pointer to hwp= robe > > > > > is added, it should be added as the second argument, behind hwcap= as > > > > > the first (assuming this change lands). > > > > > > > > > > Come to think of it, the static inline helper I'm proposing in my > > > > > discussion with Richard needs to take both arguments, since calle= rs > > > > > need to check both ((arg1 !=3D 0) && (arg2 !=3D NULL)) to safely = know that > > > > > arg2 is a pointer to __riscv_hwprobe(). > > > > > > > > presumably not `(arg1 !=3D 0)` but `(arg1 & _IFUNC_ARG_HWCAP)` to m= atch arm64? > > > > > > It looks like we didn't do that _IFUNC_ARG_HWCAP bit on riscv. > > > Actually, looking at the history of sysdeps/riscv/dl-irel.h, hwcap ha= s > > > always been passed as the first argument. So I think I don't need to > > > check it in the (glibc-specific) inline helper function, I can safely > > > assume it's there and go straight to checking the second argument. > > > > oh i misunderstood what you were saying earlier. > > My bad for causing confusion, I learned something from your reply and > changed my conclusion. > > > > > > If you were coding this directly in a library or application, you > > > would need to check the first arg to be compatible with other libcs > > > like Android's. > > > > we haven't shipped yet, so if you're telling me glibc is passing > > `(getauxval(AT_HWCAP), nullptr)`, i'll change bionic to do the same > > today ... the whole reason i'm here on this thread is to ensure source > > compatibility for anyone writing ifuncs :-) > > Ah, yes, they are (and have always on risc-v) passed > (getauxval(AT_HWCAP), nullptr), so do exactly that. done: https://android-review.googlesource.com/c/platform/bionic/+/2695693 > -Evan