public inbox for libc-alpha@sourceware.org
From: enh <enh@google.com>
To: Zack Weinberg <zack@owlfolio.org>
Cc: libc-alpha@sourceware.org
Subject: Re: Maybe we should get rid of ifuncs
Date: Tue, 23 Apr 2024 11:39:21 -0700
Message-ID: <CAJgzZormDUXPKzNTAT6tRxFYTMjH4KEUG6YmyFEotdb7s6MApw@mail.gmail.com>
In-Reply-To: <D0RPG6348D0S.1F9SCYCGKZ3VI@owlfolio.org>

On Tue, Apr 23, 2024 at 11:14 AM Zack Weinberg <zack@owlfolio.org> wrote:
>
> I've been thinking about the XZ exploit (two versions of the compression
> library `liblzma` included Trojan horse code that injected a back
> door into sshd; see https://research.swtch.com/xz-timeline) and what
> it means for glibc, and what I've come to is we should reconsider the
> entire idea of ifuncs.
>
> The SSH protocol does not use XZ compression.  liblzma.so was loaded
> into sshd's address space because some Linux distributions patched
> sshd to use libsystemd, and some libsystemd functions (having to do
> with systemd's "journal" logging subsystem, IIUC) do use liblzma, but
> by itself that wouldn't have been enough to give the exploit control,
> because the patched sshd doesn't use any of those functions.  But
> these same Linux distributions also compile sshd with -z now
> (ironically, as a hardening measure, together with -z relro) and that
> means the resolvers for all the ifuncs in *all* the loaded shared
> libraries will be invoked, early enough in process startup that the
> PLT and GOT are still writable.  The XZ exploit used an ifunc resolver
> to rewrite a whole bunch of PLT entries, intercepting both calls
> within sshd proper, and calls from sshd to libcrypto.so
> (i.e. OpenSSL's general-purpose cryptography library).
>
> Ifuncs were already a problem -- resolvers are arbitrary application
> code that gets called from deep within the guts of the dynamic loader,
> possibly while internal locks are held (I don't know for sure).
> In -z now mode, they are called not just before the core C library is
> fully initialized, but before symbol resolution is complete, meaning
> that they can't necessarily make *any* function calls; we've had any
> number of bug reports about this.

(apparently FreeBSD uses two passes to avoid this, but bionic has the
same issue as glibc, and iirc musl doesn't implement ifuncs.)

> They seem to be less troublesome in
> lazy binding mode, as far as I can tell, but I can still imagine them
> causing trouble (e.g. due to recursive invocation of the lazy symbol
> resolution machinery, or due to injecting non-async-signal-safe code
> into a call, from a signal handler, to a function that's *supposed* to
> be async-signal-safe).  The glibc wiki page for ifuncs
> (https://sourceware.org/glibc/wiki/GNU_IFUNC) warns readers that ifunc
> resolvers are subject to severe restrictions that aren't documented or
> even agreed upon.
>
> As far as I know, the only legitimate (non-malicious) use case anyone
> wants for ifuncs is to allow a library to select one of several
> implementations of a single function, based on the characteristics of
> the CPU -- such as how glibc itself selects the best available
> implementation of `memcpy` for the CPU.  It seems to me that we ought
> to be able to come up with a completely declarative mechanism for this
> use case.  Perhaps a library could supply an array of candidate
> implementations of a function, each paired with a bit vector that
> declares all of the CPU capabilities that that implementation
> requires, sorted from most to least stringent, and the dynamic loader
> could run down the list and pick the first one that will work.  This
> would avoid all the problems with calling application code from the
> guts of the loader.  And, in -z relro -z now mode, it would mean that
> no application code could run before the PLT and GOT are made
> read-only, closing the path that the XZ trojan used to hook itself
> into sshd.  We'd have to keep STT_GNU_IFUNC support around for at
> least a few releases, but we could officially deprecate it and provide
> a tunable and/or a build-time switch to disable it.
>
> To figure out if this is a workable idea, questions for you all:
> (1) Are there other use cases for ifuncs that I don't know about?

one thing i think is interesting (having been looking at ifuncs while
adding riscv64 support to Android) is that afaict bionic [Android's
libc] is basically the only current user. every library i'm aware of
(and certainly every library that's part of the OS) does _not_ use
ifuncs, if only because iOS/macOS has no equivalent (and Windows
too?), and if you've got to have the manual function pointer
manipulation implementation for them...
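
a rough sketch of what that "manual function pointer manipulation" usually looks like in portable code. every name here (count_zero, cpu_has_fast_path, ...) is made up for illustration, the probe is a stub, and a real library would want an atomic store or a once-initializer rather than a plain pointer write:

```c
#include <stddef.h>

typedef size_t (*count_fn)(const unsigned char *, size_t);

/* Baseline implementation: count the zero bytes in a buffer. */
size_t count_zero_generic(const unsigned char *p, size_t n)
{
    size_t hits = 0;
    for (size_t i = 0; i < n; i++)
        if (p[i] == 0)
            hits++;
    return hits;
}

/* Stand-in for a CPU-specific variant; same behaviour, no real SIMD here. */
size_t count_zero_fast(const unsigned char *p, size_t n)
{
    return count_zero_generic(p, n);
}

/* Hypothetical CPU probe. Stubbed to "feature absent" for this sketch. */
int cpu_has_fast_path(void)
{
    return 0;
}

static size_t count_zero_init(const unsigned char *p, size_t n);

/* Starts out pointing at the initializer; swapped on the first call. */
static count_fn count_zero_impl = count_zero_init;

/* First call lands here, picks an implementation, then forwards. */
static size_t count_zero_init(const unsigned char *p, size_t n)
{
    count_zero_impl = cpu_has_fast_path() ? count_zero_fast
                                          : count_zero_generic;
    return count_zero_impl(p, n);
}

/* Public entry point: one indirect call, no loader involvement. */
size_t count_zero(const unsigned char *p, size_t n)
{
    return count_zero_impl(p, n);
}
```

the selection runs in ordinary application context on first use, so none of the resolver restrictions apply, at the cost of a writable function pointer and an indirect call.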

that said, for llvm at least there's work on function multi-versioning
where the compiler basically writes the ifunc resolver. but (a) that's
not quite finished yet (?), (b) i haven't seen anyone _use_ it yet,
and (c) the result is, by virtue of being machine-generated, pretty
regular.
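
for anyone who hasn't looked at one, a hand-written resolver of the regular kind a compiler might emit looks roughly like this. it's a sketch assuming GCC or Clang on x86-64 with glibc (the `ifunc` attribute and `__builtin_cpu_supports` are real there); the function names are invented, both implementations are stand-ins, and other targets fall back to a plain function:

```c
#include <stdlib.h>  /* only to pull in libc feature macros such as __GLIBC__ */

typedef int (*add_fn)(int, int);

/* Two stand-in implementations with identical behaviour. */
int add_generic(int a, int b) { return a + b; }
int add_avx2(int a, int b)    { return a + b; }

#if defined(__GNUC__) && defined(__ELF__) && defined(__GLIBC__) && defined(__x86_64__)
/* Resolver: invoked by the dynamic loader while it processes relocations,
 * so it should avoid calling into other libraries. */
static add_fn resolve_add(void)
{
    __builtin_cpu_init();  /* needed before __builtin_cpu_supports in a resolver */
    return __builtin_cpu_supports("avx2") ? add_avx2 : add_generic;
}

/* The symbol `add` is bound to whatever resolve_add() returns. */
int add(int a, int b) __attribute__((ifunc("resolve_add")));
#else
/* No ifunc support assumed here: bind directly to the baseline. */
int add(int a, int b) { return add_generic(a, b); }
#endif
```

callers just call `add(2, 3)`; the choice was made once, at relocation time.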

> (2) Are there existing ifuncs that perform CPU-capability-based
> function selection that *could not* be replaced with an array of bit
> vectors like what I sketched in the previous paragraph?
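
to make the question concrete, the declarative table proposed above might look something like the sketch below. nothing here is an existing glibc or ELF interface: the capability bits, struct layout and function names are all hypothetical, and the implementations are placeholders.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical capability bits; a real design would define these per ABI. */
enum {
    CAP_SSE2   = 1 << 0,
    CAP_AVX2   = 1 << 1,
    CAP_AVX512 = 1 << 2
};

typedef void (*generic_fn)(void);

/* One candidate: the implementation plus every capability it requires. */
struct candidate {
    uint64_t   required_caps;
    generic_fn impl;
};

/* Loader side: walk the table (most to least stringent) and take the first
 * entry whose requirements are all present. No library code is executed. */
generic_fn select_candidate(const struct candidate *tab, size_t n,
                            uint64_t cpu_caps)
{
    for (size_t i = 0; i < n; i++)
        if ((tab[i].required_caps & ~cpu_caps) == 0)
            return tab[i].impl;
    return NULL;  /* no usable implementation listed */
}

/* Library side: placeholder implementations of one function. */
int sum_generic(const int *v, size_t n)
{
    int s = 0;
    for (size_t i = 0; i < n; i++)
        s += v[i];
    return s;
}

int sum_avx2(const int *v, size_t n)
{
    return sum_generic(v, n);  /* stand-in for a vectorised version */
}

/* Pure data: this is all the library would hand to the loader. */
const struct candidate sum_candidates[] = {
    { CAP_AVX2, (generic_fn)sum_avx2 },
    { 0,        (generic_fn)sum_generic }  /* baseline, always matches */
};
```

a fixed-width bit vector like this is exactly what gets awkward once the set of capability keys is open-ended.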

arm32 was a horrific mess where SoCs and their kernels were pretty
confused about what they did/didn't have. Android's libc ifunc
resolvers basically had to end up with "look, just write down in this
file what you want us to use, and we'll use those" so the _device_
owners could override the mess the SoC vendors had made.

arm64 was slightly awkward when hwcap2 appeared out of nowhere...

...and riscv64 is the pathological case of that where you don't have
hwcap but do have __riscv_hwprobe(2) and an unbounded set of keys.

but that's not a "no" :-)

i'm curious if anyone has real-world examples of hand-written ifunc
resolvers _not_ in a libc?

> zw
