From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <SRS0=RE+4=LO=gmail.com=anderson.jonathonm@sourceware.org>
Received: from mail-oi1-x22a.google.com (mail-oi1-x22a.google.com [IPv6:2607:f8b0:4864:20::22a])
	by sourceware.org (Postfix) with ESMTPS id 30B743858D20;
	Tue,  9 Apr 2024 16:44:24 +0000 (GMT)
DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org 30B743858D20
Authentication-Results: sourceware.org; dmarc=pass (p=none dis=none) header.from=gmail.com
Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=gmail.com
ARC-Filter: OpenARC Filter v1.0.0 sourceware.org 30B743858D20
Authentication-Results: server2.sourceware.org; arc=none smtp.remote-ip=2607:f8b0:4864:20::22a
ARC-Seal: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1712681069; cv=none;
	b=ROH2W6wcb9Nvm+VDhL/IO6WinlxhypqZze82YKKIBU7Cly8C16D7mY9fIZRpSK0gP2CydYRJStHgI0I4/3pGuni3kliNJwXA6wdXf1Pezd+MMWVHW9i2ISeXaLjXToCvZZZOuMBEkC+pm1D9FNZr77DjJDzQvwvdzur3YGDVbeo=
ARC-Message-Signature: i=1; a=rsa-sha256; d=sourceware.org; s=key;
	t=1712681069; c=relaxed/simple;
	bh=nMRzdDwR3bB3pSZOcGtJ8QCDbAC8sqdUb+hAYTKgXYU=;
	h=DKIM-Signature:Message-ID:Subject:From:To:Date:MIME-Version; b=q5JgMj/t3O+dsyhkFNwRZ4gqzQuapr3HAspeoS1b+i2BZQFFjrWAZymAoB1fPm2xj1WyfexVG/uJqNzfDyl3IkdFGJ6lTmCOu1aKxZWe75xb2uCkb6s8CM0WHDyqmMHaxTmkMb2cDpevgADGu6zuiGhRffkQImVcBkjq70p1lGM=
ARC-Authentication-Results: i=1; server2.sourceware.org
Received: by mail-oi1-x22a.google.com with SMTP id 5614622812f47-3c3e6ea6d2fso4033469b6e.2;
        Tue, 09 Apr 2024 09:44:24 -0700 (PDT)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
        d=gmail.com; s=20230601; t=1712681063; x=1713285863; darn=sourceware.org;
        h=mime-version:user-agent:references:in-reply-to:date:cc:to:from
         :subject:message-id:from:to:cc:subject:date:message-id:reply-to;
        bh=FFEMG/vKqh59l5N0FYbHGJ3ICTY0wFyQEaI+4WO8yEc=;
        b=aym4dH4lHeE8aAT5tQ5gmBuKyWfhFmLOh4cdBOE2SGNfZWvUIkBCLWe2cqXKKvOqOm
         Kx9WrDN+36dvWR4KwBy1MXR4ughwUxbj/pEyxdTFyDK36BWK3lOVCwzbUwA6SDuUgxDp
         RWpZ+/zO23d0CxIx7EWJIboIuTtzlT6AlsL3OURgbg1pnc487LhsvV+CxA2orfZ80mDx
         bDCAUWouItOhGbytzoVW4oXcxqyeaSXDnJ52ZdYTYJyvxwZws9g4vd+JEKjIfxZzVVZw
         A2ZtF87FlNQeRZFlKWRebnzhKEZN8yKWdnR7imYgUageVBnR4PbvPlpdiPXoDHHMFSbu
         rkDA==
X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
        d=1e100.net; s=20230601; t=1712681063; x=1713285863;
        h=mime-version:user-agent:references:in-reply-to:date:cc:to:from
         :subject:message-id:x-gm-message-state:from:to:cc:subject:date
         :message-id:reply-to;
        bh=FFEMG/vKqh59l5N0FYbHGJ3ICTY0wFyQEaI+4WO8yEc=;
        b=IobyzMGqlnrScsukZp6OZTq3dkfcbuDrZ8xcll0Pa45fSLngsVCAqzoVQo/x7mVEqT
         xdGZhYojZEtZElV2SHQlnXGF8aYj7ybs7DJc8TDvVe0ZN9KP47suw2WmwyhAVlhaN9xw
         Y9amhNOKsmx0in/QmB1YwaU031r/bVdhrY4Jifv0dsIdB/dhm1PSaHhUigW//C1oS01u
         c1GrFDvyubCA+JNyoXeDf+W3PCwz48fHq/cDGuIWkvr0xhL+5E007YdELTcd2+tvWUuV
         mWV09Pznjtx1tTTcEp/wMryEQZ2oBLSV/F4Aqht3E/DJcZWXn88Ik4iWFk1pcFgWDVe7
         lg6A==
X-Forwarded-Encrypted: i=1; AJvYcCW08MJeUdjC5yQr6B5VgaSZdd5XpLNfNYMgqUrABSeUY40lSMd9B91dSqTBm3XOvAMd4NeQg/vkvT7kPyXIy18MX2e1cCMruQWqe4ip4VhT6ZiHNkQJYaxamzWuegBk/VZDsEhJLz6ZdXoT1wj2Ke4QveZvZ9hp/aP4j5WiqcardZJq2XoC12NDUiqonzqayeM=
X-Gm-Message-State: AOJu0YykMi7OQIgEIBiV9OVCpVzD6Al4AWuoWIkhzTqy9/2dJuE8opoB
	NZFYR/uOLx6FkGzMKMRMpH3a1w+O3O0nin9w0RkPN/F34z83gFbA
X-Google-Smtp-Source: AGHT+IFkD5gtDQ/PV/hlsSWQHmLhlDJURkJNUwNGDvtWkXOpPWBHN6F7yPUJzd7jorGnWp93mKNY9A==
X-Received: by 2002:aca:1c0e:0:b0:3c5:f94e:2f6a with SMTP id c14-20020aca1c0e000000b003c5f94e2f6amr67926oic.2.1712681063072;
        Tue, 09 Apr 2024 09:44:23 -0700 (PDT)
Received: from [10.41.6.67] ([24.75.238.76])
        by smtp.gmail.com with ESMTPSA id d9-20020a05680805c900b003c5ef82d3b9sm935271oij.55.2024.04.09.09.44.21
        (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256);
        Tue, 09 Apr 2024 09:44:22 -0700 (PDT)
Message-ID: <4dd125546c920da4cc744a93f230917a7311c7fb.camel@gmail.com>
Subject: Re: Sourceware mitigating and preventing the next xz-backdoor
From: anderson.jonathonm@gmail.com
To: Michael Matz <matz@suse.de>
Cc: Martin Uecker <uecker@tugraz.at>, Ian Lance Taylor <iant@golang.org>, 
 Paul Koning <paulkoning@comcast.net>, Paul Eggert <eggert@cs.ucla.edu>,
 Sandra Loosemore <sloosemore@baylibre.com>, Mark Wielaard <mark@klomp.org>,
  overseers@sourceware.org, gcc@gcc.gnu.org, binutils@sourceware.org, 
 gdb@sourceware.org, libc-alpha@sourceware.org
Date: Tue, 09 Apr 2024 09:44:20 -0700
In-Reply-To: <41394737-6f2d-86e7-5742-e0a794f9f63c@suse.de>
References: <20240329203909.GS9427@gnu.wildebeest.org>
	 <20240401150617.GF19478@gnu.wildebeest.org>
	 <fc3d8fe6-bfec-474d-a9ed-895067654603@baylibre.com>
	 <12215cd2-16db-4ee4-bd98-6a4bcf318592@cs.ucla.edu>
	 <FC0B14DF-0B05-4896-B70F-7D994961D5C2@comcast.net>
	 <CAKOQZ8zfgq4xyFkQSFBZpU26g7eh=nr+fiQeAWVeba68DX7DeQ@mail.gmail.com>
	 <6239192ba9ff8aad0752309a54b633dc75a57c77.camel@tugraz.at>
	 <8e877d2f-01e0-c786-dea5-265edbdc0c07@suse.de>
	 <cfa8d1d4ffddc1133e3477de31d2237e13bda2d0.camel@gmail.com>
	 <41394737-6f2d-86e7-5742-e0a794f9f63c@suse.de>
Content-Type: multipart/alternative; boundary="=-LcAskMQKQBGbgePnp0TF"
User-Agent: Evolution 3.46.4-2 
MIME-Version: 1.0
X-Spam-Status: No, score=-1.5 required=5.0 tests=BAYES_00,DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,FREEMAIL_FROM,HTML_MESSAGE,RCVD_IN_DNSWL_NONE,SPF_HELO_NONE,SPF_PASS,TXREP autolearn=ham autolearn_force=no version=3.4.6
X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org
List-Id: <libc-alpha.sourceware.org>

--=-LcAskMQKQBGbgePnp0TF
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable

Hello,

On Thu, Apr 4, 2024, 09:00 Michael Matz <matz@suse.de> wrote:

> Hello,
>
> On Wed, 3 Apr 2024, Jonathon Anderson wrote:
>
> > Of course, this doesn't make the build system any less complex, but=20
> > projects using newer build systems seem easier to secure and audit than=
=20
> > those using overly flexible build systems like Autotools and maybe even=
=20
> > CMake. IMHO using a late-model build system is a relatively low=20
> > technical hurdle to overcome for the benefits noted above, switching=20
> > should be considered and in a positive light.
>
> Note that we're talking not (only) about the build system itself, i.e. ho=
w=20
> to declare dependencies within the sources, and how to declare how to=20
> build them. [...]
>
> But Martin also specifically asked about alternatives for feature tests,=
=20
> i.e. autoconfs purpose.  I simply don't see how any alternative to it=20
> could be majorly "easier" or "less complex" at its core.=20=20

My point was not that newer build systems are any less complex taken =
holistically. The discussion upthread has already covered why =
configuration tests are necessary and aren't going away anytime soon. My =
horror stories would not add much to that conversation.

My point, or rather my opinion as a humble end-user, is that newer build sy=
stems (e.g. Meson) make identifying the xz backdoor and injections followin=
g the same pattern much easier than older build systems (e.g. autoconf+auto=
make+libtool). Paraphrasing my original message to highlight the backdoor p=
rotections:

- This xz backdoor injection unpacked attacker-controlled files and ran =
them during `configure`. Newer build systems implement a build =
abstraction (aka DSL) that acts much like a sandbox and enforces rules =
(e.g. the only code run during `meson setup` is from `meson.build` files =
and CMake). Generally speaking, the only way to break those rules is via =
an "escape" command (e.g. `run_command()`), of which there are few. This =
significantly reduces the task of auditing the build scripts for =
sandbox-breaking malicious intent: only the "escapes" need =
investigation, and they should(tm) be rare for well-behaved projects.

- This xz backdoor injection worked due to altered versions of the build =
system implementation being shipped with the source. Newer build systems =
do not ship any part of the build system's implementation with the =
project's source; it comes as a separate package, and the most the =
project does is specify a compatible version range. This removes the =
possibility of "hidden" code being mixed into the shipped source =
tarball: well-behaved projects will have source tarballs that can be =
byte-for-byte reproduced from the VCS (e.g. Git).

- This xz backdoor injected the payload code by altering the files in the =
source directory. Newer build systems do not allow altering the source =
directory (for good reason) in their build abstraction. For build =
workers/CI, this restriction can be enforced using low-level sandboxing =
(e.g. containers or landlock), and well-behaved projects should(tm) be =
unaffected. This reduces the ways in which code can appear in the =
project: only code generation sequences (e.g. `custom_target()` or =
`generator()`) can produce arbitrary code.
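
To make the "escapes" concrete, here is a minimal sketch of what an =
auditor would look for in a `meson.build`. The project, script and file =
names are made up for illustration:

    project('example', 'c')

    # Escape 1: runs an arbitrary program at `meson setup` time.
    rev =3D run_command('tools/get-rev.sh', check: true).stdout().strip()

    # Escape 2: runs an arbitrary program at build time; its output lands
    # in the build directory, not the source directory.
    gen_c =3D custom_target('gen-table',
      input: 'table.txt',
      output: 'table.c',
      command: [find_program('tools/gen.py'), '@INPUT@', '@OUTPUT@'],
    )

    executable('example', 'main.c', gen_c)

Grepping for `run_command`, `custom_target` and `generator` is a =
reasonable starting point for a review; other escapes (e.g. =
`configure_file()` with `command:`) need the same scrutiny.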

Just by transitioning to a newer build system (and being "well-behaved"), =
it seems like you get certain protections almost for free. And IMHO, =
compared to other options to improve project security (e.g. automated =
fuzzing), transitioning build systems is a medium-effort, =
low-maintenance option with a comparatively high yield.

> make is just fine for that (as are many others).  (In a way=20
> I think we meanwhile wouldn't really need automake and autogen, but=20
> rewriting all that in pure GNUmake is a major undertaking).

Makefile (the language) is extremely complex; it is nigh impossible to =
ensure something won't expand to some malicious code injection when the =
right file arrangement is present. (IMHO Make also is not a great build =
system: it has none of the protections described above... and it's [just =
plain slow](https://david.rothlis.net/ninja-benchmark/).)

Newer build systems seem to be moving in a direction where the bulk of the =
build configuration/scripts are non-Turing-complete. This is a highly effec=
tive method for clarifying the high-level complexity of the DSL: simple thi=
ngs are short and sweet, complex things (and discouraged solutions) are lon=
g and complex. Less complex high-level build logic makes it easier to verif=
y and audit for security (and other) purposes.

> Going with the=20
> examples given upthread there is usually only one major solution: to chec=
k=20
> if a given system supports FOOBAR you need to bite the bullet and compile=
=20
> (and potentially run!) a small program using FOOBAR.  A configuration=20
> system that can do that (and I don't see any real alternative to that), n=
o=20
> matter in which language it's written and how traditional or modern it is=
,=20
> also gives you enough rope to hang yourself, if you so choose.

Both very true, as discussed upthread.

It may be a helpful policy to use templates for configure tests where possi=
ble. For example the [now-famous XZ landlock configure test (that contained=
 a rogue syntax error)](https://git.tukaani.org/?p=3Dxz.git;a=3Dcommitdiff;=
h=3D328c52da8a2bbb81307644efdb58db2c422d9ba7) could be written via Meson te=
mplate configure tests as:

    cc =3D meson.get_compiler('c')
    if (
        cc.has_function('prctl', prefix: '#include <sys/prctl.h>')
        and cc.has_header_symbol('sys/syscall.h', 'SYS_landlock_create_rule=
set', prefix: '#include <linux/landlock.h>')
        and cc.has_header_symbol('sys/syscall.h', 'SYS_landlock_restrict_=
self', prefix: '#include <linux/landlock.h>')
        and cc.has_header_symbol('linux/landlock.h', 'LANDLOCK_CREATE_RULES=
ET_VERSION')
    )
    # ...
    endif

The templates are well-tested as part of the build system, so if the `prefi=
x:` has no syntax error (and the spelling is right) I'm confident these che=
cks are sane. And IMHO this expresses what is being checked much better tha=
n an actual snippet of C/C++ code.
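
For contrast, here is a sketch of roughly the same check as a =
hand-written snippet passed to `cc.compiles()`. The C code is =
reconstructed from memory of the upstream xz test, so treat its details =
as an assumption:

    cc =3D meson.get_compiler('c')
    landlock_src =3D '''
    #include <linux/landlock.h>
    #include <sys/syscall.h>
    #include <sys/prctl.h>
    void my_sandbox(void)
    {
        (void)prctl(PR_SET_NO_NEW_PRIVS, 1, 0, 0, 0);
        (void)SYS_landlock_create_ruleset;
        (void)SYS_landlock_restrict_self;
        (void)LANDLOCK_CREATE_RULESET_VERSION;
    }
    '''
    have_landlock =3D cc.compiles(landlock_src, name: 'landlock sandbox')

A single stray character anywhere in that string makes the check fail to =
compile, which quietly reads as "landlock not supported".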

>
> If you get away without many configuration tests in your project then thi=
s=20
> is because what (e.g.) the compiler gives you, in the form of libstdc++=20
> for example, abstracts away many of the peculiarities of a system.  But=20
> in order to be able to do that something (namely the config system of=20
> libstdc++) needs to determine what is or isn't supported by the system in=
=20
> order to correctly implement these abstractions.  I.e. things you depend=
=20
> on did the major lifting of hiding system divergence.
>
> (Well, that, or you are very limited in the number of systems you support=
,=20
> which can be the right thing as well!)

We do get away with less than a dozen configuration tests, in part because =
we are fairly limited in what we support (GNU/Linux) and we use the C, C++ =
and POSIX standards quite heavily. But we're also an end-user application w=
ith a heavy preference towards bleeding-edge systems, not a low-level libra=
ry that can run on 1980-something technology, so our experience might be a =
bit... unique. ;)

Thanks,\
-Jonathon

--=-LcAskMQKQBGbgePnp0TF--