Re: Sourceware mitigating and preventing the next xz-backdoor

From: anderson.jonathonm@gmail.com
To: Michael Matz <matz@suse.de>
Cc: Martin Uecker <uecker@tugraz.at>,
	Ian Lance Taylor <iant@golang.org>,
	 Paul Koning <paulkoning@comcast.net>,
	Paul Eggert <eggert@cs.ucla.edu>,
	Sandra Loosemore <sloosemore@baylibre.com>,
	Mark Wielaard <mark@klomp.org>,
	overseers@sourceware.org, gcc@gcc.gnu.org,
	binutils@sourceware.org,  gdb@sourceware.org,
	libc-alpha@sourceware.org
Subject: Re: Sourceware mitigating and preventing the next xz-backdoor
Date: Tue, 09 Apr 2024 09:44:20 -0700
Message-ID: <4dd125546c920da4cc744a93f230917a7311c7fb.camel@gmail.com>
In-Reply-To: <41394737-6f2d-86e7-5742-e0a794f9f63c@suse.de>

Hello,

On Thu, Apr 4, 2024, 09:00 Michael Matz <matz@suse.de> wrote:

> Hello,
>
> On Wed, 3 Apr 2024, Jonathon Anderson wrote:
>
> > Of course, this doesn't make the build system any less complex, but 
> > projects using newer build systems seem easier to secure and audit than 
> > those using overly flexible build systems like Autotools and maybe even 
> > CMake. IMHO using a late-model build system is a relatively low 
> > technical hurdle to overcome for the benefits noted above, switching 
> > should be considered and in a positive light.
>
> Note that we're talking not (only) about the build system itself, i.e. how 
> to declare dependencies within the sources, and how to declare how to 
> build them. [...]
>
> But Martin also specifically asked about alternatives for feature tests, 
> i.e. autoconfs purpose.  I simply don't see how any alternative to it 
> could be majorly "easier" or "less complex" at its core.  

My point was not that newer build systems are any less complex taken holistically. The discussion upthread has already covered why configuration tests are necessary and aren't going away anytime soon. My horror stories would not add much to that conversation.

My point, or rather my opinion as a humble end-user, is that newer build systems (e.g. Meson) make identifying the xz backdoor and injections following the same pattern much easier than older build systems (e.g. autoconf+automake+libtool). Paraphrasing my original message to highlight the backdoor protections:

- This xz backdoor injection unpacked attacker-controlled files and ran them during `configure`. Newer build systems implement a build abstraction (aka DSL) that acts similarly to a sandbox and enforces rules (e.g. the only code run during `meson setup` is from `meson.build` files and CMake). Generally speaking the only way to disobey those rules is via an "escape" command (e.g. `run_command()`), of which there are few. This significantly reduces the task of auditing the build scripts for sandbox-breaking malicious intent: only the "escapes" need investigation, and they should(tm) be rare for well-behaved projects.

- This xz backdoor injection worked because altered versions of the build system implementation were shipped with the source. Newer build systems do not ship any part of the build system's implementation with the project's source; it comes as a separate package, and the most the project does is specify a compatible version range. This removes the possibility of "hidden" code being mixed into the shipped source tarball: well-behaved projects will have source tarballs that can be byte-for-byte reproduced from the VCS (e.g. Git).

- This xz backdoor injected the payload code by altering the files in the source directory. Newer build systems do not allow altering the source directory (for good reason) in their build abstraction. For build workers/CI, this restriction can be enforced using low-level sandboxing (e.g. containers or landlock), and well-behaved projects should(tm) be unaffected. This reduces the ways code can appear in the project: only code generation sequences (e.g. `custom_target()` or `generator()`) can produce arbitrary code.
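
As a rough illustration of the "only the escapes need investigation" point, here is a minimal Python sketch that lists candidate escape calls in a project's `meson.build` files. The function name `find_escapes` and the `ESCAPES` list are hypothetical and chosen for this example; the list is certainly not exhaustive, and a plain text scan is only a starting point for a human review, not a real parser of the Meson language.

```python
import re
from pathlib import Path

# Meson functions named in this message that can run arbitrary programs or
# generate arbitrary files. ASSUMPTION: this list is illustrative and
# incomplete; a real audit would need a fuller, maintained list.
ESCAPES = ("run_command", "custom_target", "generator")

# Match any escape name followed by an opening parenthesis.
PATTERN = re.compile(r"\b(" + "|".join(ESCAPES) + r")\s*\(")


def find_escapes(root):
    """Yield (path, line_number, function_name) for each candidate escape.

    Comments are deliberately not stripped, so commented-out calls are
    reported too: for an audit, over-reporting is the safer failure mode.
    """
    for path in sorted(Path(root).rglob("meson.build")):
        text = path.read_text(encoding="utf-8")
        for lineno, line in enumerate(text.splitlines(), start=1):
            for match in PATTERN.finditer(line):
                yield path, lineno, match.group(1)
```

Calling `find_escapes(".")` from the top of a source tree and printing each result gives a short list of places to read by hand. For a well-behaved project that list should(tm) be small.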

Just by transitioning to a newer build system (and being "well-behaved") it seems like you get certain protections almost for free. And IMHO compared to other options to improve project security (e.g. automated fuzzing), transitioning build systems is a medium-effort, low-maintenance option with a comparatively high yield.

> make is just fine for that (as are many others).  (In a way 
> I think we meanwhile wouldn't really need automake and autogen, but 
> rewriting all that in pure GNUmake is a major undertaking).

Makefile (the language) is extremely complex; it is nigh impossible to ensure something won't expand to some malicious code injection when the right file arrangement is present. (IMHO Make also is not a great build system, it has none of the protections described above... and it's [just plain slow](https://david.rothlis.net/ninja-benchmark/).)

Newer build systems seem to be moving in a direction where the bulk of the build configuration/scripts are non-Turing-complete. This is a highly effective method for clarifying the high-level complexity of the DSL: simple things are short and sweet, complex things (and discouraged solutions) are long and complex. Less complex high-level build logic makes it easier to verify and audit for security (and other) purposes.

> Going with the 
> examples given upthread there is usually only one major solution: to check 
> if a given system supports FOOBAR you need to bite the bullet and compile 
> (and potentially run!) a small program using FOOBAR.  A configuration 
> system that can do that (and I don't see any real alternative to that), no 
> matter in which language it's written and how traditional or modern it is, 
> also gives you enough rope to hang yourself, if you so choose.

Both very true, as discussed upthread.

It may be a helpful policy to use templates for configure tests where possible. For example the [now-famous XZ landlock configure test (that contained a rogue syntax error)](https://git.tukaani.org/?p=xz.git;a=commitdiff;h=328c52da8a2bbb81307644efdb58db2c422d9ba7) could be written via Meson template configure tests as:

    cc = meson.get_compiler('c')
    if (
        cc.has_function('prctl', prefix: '#include <sys/prctl.h>')
        and cc.has_header_symbol('sys/syscall.h', 'SYS_landlock_create_ruleset', prefix: '#include <linux/landlock.h>')
        and cc.has_header_symbol('sys/syscall.h', 'SYS_landlock_restrict_self', prefix: '#include <linux/landlock.h>')
        and cc.has_header_symbol('linux/landlock.h', 'LANDLOCK_CREATE_RULESET_VERSION')
    )
    # ...
    endif

The templates are well-tested as part of the build system, so if the `prefix:` has no syntax error (and the spelling is right) I'm confident these checks are sane. And IMHO this expresses what is being checked much better than an actual snippet of C/C++ code.

>
> If you get away without many configuration tests in your project then this 
> is because what (e.g.) the compiler gives you, in the form of libstdc++ 
> for example, abstracts away many of the peculiarities of a system.  But 
> in order to be able to do that something (namely the config system of 
> libstdc++) needs to determine what is or isn't supported by the system in 
> order to correctly implement these abstractions.  I.e. things you depend 
> on did the major lifting of hiding system divergence.
>
> (Well, that, or you are very limited in the number of systems you support, 
> which can be the right thing as well!)

We do get away with less than a dozen configuration tests, in part because we are fairly limited in what we support (GNU/Linux) and we use the C, C++ and POSIX standards quite heavily. But we're also an end-user application with a heavy preference towards bleeding-edge systems, not a low-level library that can run on 1980-something technology, so our experience might be a bit... unique. ;)

Thanks,
-Jonathon