From: enh
Date: Tue, 23 Apr 2024 11:39:21 -0700
Subject: Re: Maybe we should get rid of ifuncs
To: Zack Weinberg
Cc: libc-alpha@sourceware.org

On Tue, Apr 23, 2024 at
11:14 AM Zack Weinberg wrote:
>
> I've been thinking about the XZ exploit (two versions of the compression
> library `liblzma` included Trojan horse code that injected a back
> door into sshd; see https://research.swtch.com/xz-timeline) and what
> it means for glibc, and what I've come to is we should reconsider the
> entire idea of ifuncs.
>
> The SSH protocol does not use XZ compression. liblzma.so was loaded
> into sshd's address space because some Linux distributions patched
> sshd to use libsystemd, and some libsystemd functions (having to do
> with systemd's "journal" logging subsystem, IIUC) do use liblzma, but
> by itself that wouldn't have been enough to give the exploit control,
> because the patched sshd doesn't use any of those functions. But
> these same Linux distributions also compile libsshd with -z now
> (ironically, as a hardening measure, together with -z relro) and that
> means the resolvers for all the ifuncs in *all* the loaded shared
> libraries will be invoked, early enough in process startup that the
> PLT and GOT are still writable. The XZ exploit used an ifunc resolver
> to rewrite a whole bunch of PLT entries, intercepting both calls
> within sshd proper, and calls from sshd to libcrypto.so
> (i.e. OpenSSL's general-purpose cryptography library).
>
> Ifuncs were already a problem -- resolvers are arbitrary application
> code that gets called from deep within the guts of the dynamic loader,
> possibly while internal locks are held (I don't know for sure).
> In -z now mode, they are called not just before the core C library is
> fully initialized, but before symbol resolution is complete, meaning
> that they can't necessarily make *any* function calls; we've had any
> number of bug reports about this.

(apparently FreeBSD uses two passes to avoid this, but bionic has the
same issue as glibc, and iirc musl doesn't implement ifuncs.)

> They seem to be less troublesome in
> lazy binding mode, as far as I can tell, but I can still imagine them
> causing trouble (e.g. due to recursive invocation of the lazy symbol
> resolution machinery, or due to injecting non-async-signal-safe code
> into a call, from a signal handler, to a function that's *supposed* to
> be async-signal-safe). The glibc wiki page for ifuncs
> (https://sourceware.org/glibc/wiki/GNU_IFUNC) warns readers that ifunc
> resolvers are subject to severe restrictions that aren't documented or
> even agreed upon.
>
> As far as I know, the only legitimate (non-malicious) use case anyone
> wants for ifuncs is to allow a library to select one of several
> implementations of a single function, based on the characteristics of
> the CPU -- such as how glibc itself selects the best available
> implementation of `memcpy` for the CPU. It seems to me that we ought
> to be able to come up with a completely declarative mechanism for this
> use case. Perhaps a library could supply an array of candidate
> implementations of a function, each paired with a bit vector that
> declares all of the CPU capabilities that that implementation
> requires, sorted from most to least stringent, and the dynamic loader
> could run down the list and pick the first one that will work. This
> would avoid all the problems with calling application code from the
> guts of the loader. And, in -z relro -z now mode, it would mean that
> no application code could run before the PLT and GOT are made
> read-only, closing the path that the XZ trojan used to hook itself
> into sshd. We'd have to keep STT_GNU_IFUNC support around for at
> least a few releases, but we could officially deprecate it and provide
> a tunable and/or a build-time switch to disable it.
>
> To figure out if this is a workable idea, questions for you all:
> (1) Are there other use cases for ifuncs that I don't know about?

one thing i think is interesting (having been looking at ifuncs while
adding riscv64 support to Android) is that afaict bionic [Android's
libc] is basically the only current user. every library i'm aware of
(and certainly every library that's part of the OS) does _not_ use
ifuncs, if only because iOS/macOS has no equivalent (and Windows too?),
and if you've got to have the manual function pointer manipulation
implementation for them...

that said, for llvm at least there's work on function multi-versioning
where the compiler basically writes the ifunc resolver. but (a) that's
not quite finished yet (?) and (b) i haven't seen anyone _use_ it yet
and (c) is at least by definition of being machine-generated pretty
regular.

> (2) Are there existing ifuncs that perform CPU-capability-based
> function selection that *could not* be replaced with an array of bit
> vectors like what I sketched in the previous paragraph?

arm32 was a horrific mess where SoCs and their kernels were pretty
confused about what they did/didn't have. Android's libc ifunc resolvers
basically had to end up with "look, just write down in this file what
you want us to use, and we'll use those" so the _device_ owners could
override the mess the SoC vendors had made.

arm64 was slightly awkward when hwcap2 appeared out of nowhere...

...and riscv64 is the pathological case of that where you don't have
hwcap but do have __riscv_hwprobe(2) and an unbounded set of keys.

but that's not a "no" :-)

i'm curious if anyone has real-world examples of hand-written ifunc
resolvers _not_ in a libc?

> zw
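
for concreteness, the "array of candidates, each paired with a bit
vector" idea from the quoted message could look roughly like the sketch
below. this is only an illustration under stated assumptions, not
anything glibc or bionic implements: the CAP_* bits, struct candidate,
the copy_* stand-ins and select_impl() are all hypothetical names, and
a single 64-bit mask is assumed to be enough (which the riscv64 hwprobe
case above suggests it may not be).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical capability bits. A real design would need an agreed,
   per-architecture numbering, and possibly more than 64 bits. */
#define CAP_SSE2   (UINT64_C(1) << 0)
#define CAP_AVX2   (UINT64_C(1) << 1)
#define CAP_AVX512 (UINT64_C(1) << 2)

typedef void *(*memcpy_fn)(void *, const void *, size_t);

/* One table entry: an implementation plus every capability it needs. */
struct candidate {
    uint64_t  required;
    memcpy_fn impl;
};

/* Stand-ins for tuned variants; here they all just forward to memcpy. */
static void *copy_avx512(void *d, const void *s, size_t n) { return memcpy(d, s, n); }
static void *copy_avx2(void *d, const void *s, size_t n)   { return memcpy(d, s, n); }
static void *copy_sse2(void *d, const void *s, size_t n)   { return memcpy(d, s, n); }
static void *copy_generic(void *d, const void *s, size_t n) { return memcpy(d, s, n); }

/* Sorted from most to least stringent. The last entry requires
   nothing, so selection always succeeds for this table. */
static const struct candidate memcpy_candidates[] = {
    { CAP_SSE2 | CAP_AVX2 | CAP_AVX512, copy_avx512  },
    { CAP_SSE2 | CAP_AVX2,              copy_avx2    },
    { CAP_SSE2,                         copy_sse2    },
    { 0,                                copy_generic },
};

/* What the loader side would do: walk the table and return the first
   entry whose required bits are all present in cpu_caps. Only data is
   compared; no code from the library runs during selection. */
static memcpy_fn select_impl(const struct candidate *tab, size_t n,
                             uint64_t cpu_caps)
{
    assert(n > 0);
    for (size_t i = 0; i < n; i++) {
        if ((tab[i].required & ~cpu_caps) == 0)
            return tab[i].impl;
    }
    return NULL; /* nothing fits; a loader would have to report an error */
}
```

the point of the shape is that the loader only reads a table, so in
-z relro -z now mode nothing supplied by the library has to execute
before the GOT is made read-only. the arm32 "write down in a file what
to use" case would need some override for cpu_caps on top of this.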