Date: Tue, 23 Apr 2024 12:46:30 -0700 (PDT)
Subject: Re: Maybe we should get rid of ifuncs
CC: zack@owlfolio.org, libc-alpha@sourceware.org
From: Palmer Dabbelt
To: enh@google.com

On Tue, 23 Apr 2024 11:39:21 PDT (-0700), enh@google.com wrote:
> On Tue, Apr 23, 2024 at 11:14 AM Zack Weinberg wrote:
>>
>> I've been thinking about the XZ exploit (two versions of the compression
>> library `liblzma` included Trojan horse code that injected a back
>> door into sshd; see https://research.swtch.com/xz-timeline) and what
>> it means for glibc, and what I've come to is we should reconsider the
>> entire idea of ifuncs.
>>
>> The SSH protocol does not use XZ compression. liblzma.so was loaded
>> into sshd's address space because some Linux distributions patched
>> sshd to use libsystemd, and some libsystemd functions (having to do
>> with systemd's "journal" logging subsystem, IIUC) do use liblzma, but
>> by itself that wouldn't have been enough to give the exploit control,
>> because the patched sshd doesn't use any of those functions. But
>> these same Linux distributions also compile sshd with -z now
>> (ironically, as a hardening measure, together with -z relro) and that
>> means the resolvers for all the ifuncs in *all* the loaded shared
>> libraries will be invoked, early enough in process startup that the
>> PLT and GOT are still writable. The XZ exploit used an ifunc resolver
>> to rewrite a whole bunch of PLT entries, intercepting both calls
>> within sshd proper, and calls from sshd to libcrypto.so
>> (i.e. OpenSSL's general-purpose cryptography library).
>>
>> Ifuncs were already a problem -- resolvers are arbitrary application
>> code that gets called from deep within the guts of the dynamic loader,
>> possibly while internal locks are held (I don't know for sure).
>> In -z now mode, they are called not just before the core C library is
>> fully initialized, but before symbol resolution is complete, meaning
>> that they can't necessarily make *any* function calls; we've had any
>> number of bug reports about this.
>
> (apparently FreeBSD uses two passes to avoid this, but bionic has the
> same issue as glibc, and iirc musl doesn't implement ifuncs.)
>
>> They seem to be less troublesome in
>> lazy binding mode, as far as I can tell, but I can still imagine them
>> causing trouble (e.g. due to recursive invocation of the lazy symbol
>> resolution machinery, or due to injecting non-async-signal-safe code
>> into a call, from a signal handler, to a function that's *supposed* to
>> be async-signal-safe). The glibc wiki page for ifuncs
>> (https://sourceware.org/glibc/wiki/GNU_IFUNC) warns readers that ifunc
>> resolvers are subject to severe restrictions that aren't documented or
>> even agreed upon.
>>
>> As far as I know, the only legitimate (non-malicious) use case anyone
>> wants for ifuncs is to allow a library to select one of several
>> implementations of a single function, based on the characteristics of
>> the CPU -- such as how glibc itself selects the best available
>> implementation of `memcpy` for the CPU. It seems to me that we ought
>> to be able to come up with a completely declarative mechanism for this
>> use case. Perhaps a library could supply an array of candidate
>> implementations of a function, each paired with a bit vector that
>> declares all of the CPU capabilities that that implementation
>> requires, sorted from most to least stringent, and the dynamic loader
>> could run down the list and pick the first one that will work. This
>> would avoid all the problems with calling application code from the
>> guts of the loader.
>> And, in -z relro -z now mode, it would mean that
>> no application code could run before the PLT and GOT are made
>> read-only, closing the path that the XZ trojan used to hook itself
>> into sshd. We'd have to keep STT_GNU_IFUNC support around for at
>> least a few releases, but we could officially deprecate it and provide
>> a tunable and/or a build-time switch to disable it.
>>
>> To figure out if this is a workable idea, questions for you all:
>> (1) Are there other use cases for ifuncs that I don't know about?
>
> one thing i think is interesting (having been looking at ifuncs while
> adding riscv64 support to Android) is that afaict bionic [Android's
> libc] is basically the only current user. every library i'm aware of
> (and certainly every library that's part of the OS) does _not_ use
> ifuncs, if only because iOS/macOS has no equivalent (and Windows
> too?), and if you've got to have the manual function pointer
> manipulation implementation for them...
>
> that said, for llvm at least there's work on function multi-versioning
> where the compiler basically writes the ifunc resolver. but (a) that's
> not quite finished yet (?) and (b) i haven't seen anyone _use_ it yet
> and (c) is at least by definition of being machine-generated pretty
> regular.
>
>> (2) Are there existing ifuncs that perform CPU-capability-based
>> function selection that *could not* be replaced with an array of bit
>> vectors like what I sketched in the previous paragraph?
>
> arm32 was a horrific mess where SoCs and their kernels were pretty
> confused about what they did/didn't have. Android's libc ifunc
> resolvers basically had to end up with "look, just write down in this
> file what you want us to use, and we'll use those" so the _device_
> owners could override the mess the SoC vendors had made.
>
> arm64 was slightly awkward when hwcap2 appeared out of nowhere...
>
> ...and riscv64 is the pathological case of that where you don't have
> hwcap but do have __riscv_hwprobe(2) and an unbounded set of keys.

I don't really have an opinion on the global glibc side of things here,
but I don't think RISC-V is a good argument for keeping anything around
if people don't want it otherwise. We're just not that important of a
target right now: there's no competitive hardware, and the fragmentation
is so pathologically bad that we'll just end up adding some crazy
constraints to whatever people are trying to do. Hopefully that will
change at some point, but for now it's hard to make much of an argument
for adding global complexity to improve RISC-V performance.

That said, I think some sort of "list of required features ->
implementation pairs" would match what we're doing with IFUNCs. The
"required feature" set would have to be very big, as we have many
ISA/vendor features in RISC-V, and the features are a bit more complex
than just a bitmap (some are multiple bits, possibly even quantitative),
but otherwise that's just what we're doing with the IFUNCs I can think
of. Right now we're just using IFUNCs because we can write ad hoc
arbitrary C code to encode those rules, but I don't see any reason we
couldn't have an interface that's more specific to the features and
priority rules. We'll need to end up doing something along those lines
when we do function multi-versioning in GCC, so maybe we just piggyback
on that effort?

> but that's not a "no" :-)
>
> i'm curious if anyone has real-world examples of hand-written ifunc
> resolvers _not_ in a libc?
>
>> zw
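
To make the "list of required features -> implementation pairs" idea a bit more concrete, here is a rough sketch in C. It is only an illustration under stated assumptions, not an existing glibc or kernel interface: the feature bits, the `KEY_VLEN_BITS` key, the struct layouts, and `select_candidate` are all hypothetical names made up for this example. Each table entry pairs a bitmap of required features with one optional quantitative requirement (a key plus a minimum value), and the selection loop takes the first entry the CPU satisfies.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical boolean feature bits; a real interface would define these. */
#define FEAT_VECTOR         (UINT64_C(1) << 0)
#define FEAT_FAST_UNALIGNED (UINT64_C(1) << 1)

/* Hypothetical keys for quantitative features (values, not single bits). */
enum feat_key { KEY_NONE = 0, KEY_VLEN_BITS = 1, KEY_COUNT };

typedef void *(*memcpy_fn)(void *dst, const void *src, size_t n);

/* What the running CPU reports: a bitmap plus one value per key. */
struct cpu_features {
    uint64_t bits;
    uint64_t values[KEY_COUNT];
};

/* One declarative table entry: requirements paired with an implementation. */
struct candidate {
    uint64_t required_bits;   /* every bit here must be set on the CPU */
    enum feat_key quant_key;  /* KEY_NONE: no quantitative requirement */
    uint64_t quant_min;       /* minimum value required for quant_key */
    memcpy_fn impl;
};

/* Stand-in implementations; real ones would use the features they require. */
static void *copy_bytes(void *dst, const void *src, size_t n)
{
    unsigned char *d = dst;
    const unsigned char *s = src;
    while (n--)
        *d++ = *s++;
    return dst;
}
void *memcpy_vector_wide(void *d, const void *s, size_t n) { return copy_bytes(d, s, n); }
void *memcpy_unaligned(void *d, const void *s, size_t n) { return copy_bytes(d, s, n); }
void *memcpy_generic(void *d, const void *s, size_t n) { return copy_bytes(d, s, n); }

/* Sorted from most to least stringent; the last entry requires nothing. */
const struct candidate memcpy_candidates[] = {
    { FEAT_VECTOR,         KEY_VLEN_BITS, 256, memcpy_vector_wide },
    { FEAT_FAST_UNALIGNED, KEY_NONE,      0,   memcpy_unaligned },
    { 0,                   KEY_NONE,      0,   memcpy_generic },
};
const size_t memcpy_candidate_count =
    sizeof memcpy_candidates / sizeof memcpy_candidates[0];

/* Return the first candidate whose requirements the CPU meets, or NULL. */
const struct candidate *select_candidate(const struct candidate *table,
                                         size_t count,
                                         const struct cpu_features *cpu)
{
    assert(table != NULL && cpu != NULL);
    for (size_t i = 0; i < count; i++) {
        const struct candidate *c = &table[i];
        if ((cpu->bits & c->required_bits) != c->required_bits)
            continue;  /* a required feature bit is missing */
        if (c->quant_key != KEY_NONE &&
            (c->quant_key >= KEY_COUNT ||
             cpu->values[c->quant_key] < c->quant_min))
            continue;  /* unknown key, or reported value too small */
        return c;
    }
    return NULL;
}
```

The point of the sketch is that the selection loop is the only code that runs at load time, and it is the same for every library; the library supplies data only. A real RISC-V version would presumably need more than one quantitative requirement per entry and some way to handle vendor-specific keys, which this sketch does not attempt.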