From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <SRS0=rtrk=GG=klomp.org=mark@sourceware.org>
Received: from gnu.wildebeest.org (gnu.wildebeest.org [45.83.234.184])
	by sourceware.org (Postfix) with ESMTPS id 8BC0A3858C2D
	for <elfutils-devel@sourceware.org>; Tue, 24 Oct 2023 21:03:46 +0000 (GMT)
DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org 8BC0A3858C2D
Authentication-Results: sourceware.org; dmarc=none (p=none dis=none) header.from=klomp.org
Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=klomp.org
ARC-Filter: OpenARC Filter v1.0.0 sourceware.org 8BC0A3858C2D
Authentication-Results: server2.sourceware.org; arc=none smtp.remote-ip=45.83.234.184
ARC-Seal: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1698181429; cv=none;
	b=Heve/EW2JwW+k1PrUM69XIUrgDPuhUV/2Tm5YQKfev4AmZYw+ejQ13TBqDEMvuPMaCuLflWt88G3h5+UX4tpwfZZ6b5AYXvhOvwLWMdNQmsWkxwJhMyrxYN3SPafwpdZmoEarQ70wFJCLzqMgKuJ6VBxCfWjCqGIH/CR0tjh5fI=
ARC-Message-Signature: i=1; a=rsa-sha256; d=sourceware.org; s=key;
	t=1698181429; c=relaxed/simple;
	bh=19ZiTrTdmMqJfB2aGVUYdewNKjD8OL1vDghuPVwJC9U=;
	h=Date:From:To:Subject:Message-ID:MIME-Version; b=dIwYxa0TGTpLaTiREhukqPODWNQZNL9DMKQdqk7N0lIugIPNrQXzVSvbPERR2UbgKuHI2EzAE4kBWILnoVsmSSx7UeMov8/glaDyEg3gmXAloLHi0UkdclaA7+KzIDd7O4cm2vuszrwoPKEixl/RPuT/j1m8uRYg2ZoxKkm6Ayo=
ARC-Authentication-Results: i=1; server2.sourceware.org
Received: by gnu.wildebeest.org (Postfix, from userid 1000)
	id 67FEC302FDC0; Tue, 24 Oct 2023 23:03:45 +0200 (CEST)
Date: Tue, 24 Oct 2023 23:03:45 +0200
From: Mark Wielaard <mark@klomp.org>
To: "Frank Ch. Eigler" <fche@redhat.com>
Cc: elfutils-devel@sourceware.org
Subject: Re: [PATCH] PR28204, debuginfod IMA
Message-ID: <20231024210345.GE2863@gnu.wildebeest.org>
References: <ZPnILmIj587Y/SSV@elastic.org>
 <20231023193347.GB2863@gnu.wildebeest.org>
 <20231024132743.GC9683@redhat.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <20231024132743.GC9683@redhat.com>
User-Agent: Mutt/1.5.21 (2010-09-15)
X-Spam-Status: No, score=-3028.4 required=5.0 tests=BAYES_00,JMQ_SPF_NEUTRAL,KAM_DMARC_STATUS,SPF_HELO_NONE,SPF_PASS,TXREP autolearn=no autolearn_force=no version=3.4.6
X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org
List-Id: <elfutils-devel.sourceware.org>

Hi Frank,

On Tue, Oct 24, 2023 at 09:27:43AM -0400, Frank Ch. Eigler wrote:
> > BTW. The diff doesn't show the newly added binary files. So the patch
> > cannot be applied. Please use git send-email or git format-patch for
> > that.
> 
> I would not expect the emailed patch to apply, esp. with all the other
> work done in the intermediate months, which is why the code is also in
> the git branch.  The binary files do not seem effectively reviewable
> anyway.

It would be really convenient though. And modern git format-patch does
includes base tree information which allows tools to stich commits at
the right place. It would also help with patchwork and pre-commit CI.
https://git-scm.com/docs/git-format-patch#_base_tree_information

> > [...]
> > >     The default is ima:permissive mode, which allows signatures to
> > >     function like a checksum to detect accidental corruption, but accepts
> > >     operation in a mix of signed and unsigned packages & servers.
> > 
> > I still think "permissive" is confusing. Since it is a term also used
> > by e.g. selinux, but doesn't work that way. And it doesn't seem
> > connected with the threat-model that enforcing protects against.
> 
> The connection is the following:
> "enforcing" mode protects against accidental or deliberate (MITM) corruption.
> "permissive" mode protects against accidental corruption.
> 
> > Since it is a different concept maybe it shouldn't be part of this
> > patch. It is a form of integrity checking, but doesn't protect (or
> > warns) about integrity failures. 
> 
> It does protect and warn against integrity failures of the form of 
> incorrect signatures.

My issue is that I don't really understand "permissive". Originally I
assumed it was like selinux permissive mode, it does do the checks,
but if they fail we just warn and continue. That seems a clear concept.

So instead permissive is more like a optional checksum. And the
checksum we are using is really a signature. So we do more than
checksumming, we also check the certificate (is it still valid,
expired, etc.) It is just imho a very "sloppy" concept. I don't
understand the permissive threat-model. And I don't really understand
the name "permissive". Which makes it hard to know if we have
implemented it correctly. e.g. if there is a signature, but we don't
have the corresponding certificate to check it against, should it
still fail, or is it more like a none-signed file and we can be
"permissive" and accept it? Maybe I don't have enough imagination.

On the other hand I think the enforcing mode is pretty clear. It
provides the user with those files that the distro provider really
meant to ship. It makes sure that the tools relying on debuginfod only
process known-good/trusted files.

> > As pointed out in another bug (#30978) if you want to do checking
> > for (accidental) corruption of files you can also use the
> > .gnu_debuglink CRC.
> >
> > Or maybe add this is a followup to this patch, adding an ima:checking
> > and crc:checking marker (or maybe a generic checking marker that might
> > do both)?
> 
> A .gnu_debuglink checksum that is part of the executable can to some
> extent protect the debuginfo that it is a checksum of, but cannot
> protect the executable, or the source files.

Yeah, you are right, it wouldn't work for the executables, and for the
debug files it only works if you already have the executable on disk
(and trust them). In theory a debugfile can contain MD5 sums in the
DWARF line tables which could be used to check the source files (there
is also the time and size of the file). But in practice those aren't
added (at least not by gcc).

This probably shows how I am confused by the permissive/checking
concept.

> > >     NB: debuginfod section queries are excluded from signature
> > >     verification at this time, and function as though ima:ignore were in
> > >     effect.
> > 
> > imho this is odd. You either enforce the data produced by the server
> > is certified, or it isn't. I understand that doing that for the
> > section data is difficult because you effectively have to download and
> > check the whole file. But it feels wrong to claim to enforce the
> > signatures and then not do it.
> 
> Yes, it is odd.  Unfortunately, it is not possible to enforce crypto
> signatures from distros upon unsigned slices.  A couple of possible
> solutions:
> 
> - accept it as is with documentation
> 
> - disable section queries from enforcing-mode servers (which could
>   then nuke gdbindex capability for e.g. future fedora/gdb users)
> 
> - have debuginfod servers operate their own crypto signing engine,
>   sign sections and everything, distribute keys somehow (DNS?
>   distributed with elfutils?), let clients fetch & trust those
>   certificates; however does note prove distro provenance 

I think only option 2 makes sense given the enforcing threat-model.

Optionally we could do the section part locally, download the whole
file, check the ima signature, then provide the application with just
the section data.

Option 3 really seems something different. An alternative would be
marking https servers as trusted as long as their TLS certificate is
trusted.

> > >     IMA signatures are verified against a set of signing certificates.
> > >     These are normally published by distributions.  A selection of such
> > >     certificates is included with the debuginfod client, but some system
> > >     directories are also searched.  See $DEBUGINFOD_IMA_CERT_PATH.  These
> > >     certificates are assumed trusted.
> > 
> > Including default system directories seems fine. But I don't think
> > elfutils should ship certificates itself. That is the job of the
> > distro or user.
> 
> The user or the distro the user is running may not be the same one
> that the binaries the user is debugging comes from.  By shipping
> Fedora/RHEL/CentOS certificates, we allow a Ubuntu person to debug a
> RHEL container, and trust debuginfod content for it.

But it should be the distro/user who makes that choice. We cannot
decide for others who they trust as provider of the files they
download.

> > We aren't in a position to make sure the certificates are valid
> > and/or can be trusted.
> 
> Why not?  We can document where we got them - I believe they are all
> public somewhere or other already.

We certainly should document that and provide pointers to where
distros publish their certificates. But we shouldn't install them by
default. The distro/user can make their own choice of using them, just
like they decide whether or not the have default DEBUGINFOD_URLS.

> > > +  static inline unsigned char hex2dec(char c)
> > > +  {
> > > +    if (c >= '0' && c <= '9') return (c - '0');
> > > +    if (c >= 'a' && c <= 'f') return (c - 'a') + 10;
> > > +    if (c >= 'A' && c <= 'F') return (c - 'A') + 10;
> > > +    return 0;
> > > +  }
> > > +
> > 
> > OK. Since this doesn't signal error (except by returning 0, which is a
> > valid value) it should probably be guarded by some kind of input check
> > before being called.
> 
> Well, this need not signal an error at all.  If the hex encoding of
> the signature is corrupt (and get 0s or whatever from this function),
> then it will fail crypto verification, and that's all the error we
> need.

OK.
 
> > [...]
> > It would be good to add some comments for extract_skid, I am not sure
> > I understand how this works.
> 
> (Ditto.)

I do understand hex2dec, but I don't understand what extract_skid
does. Maybe add an explanation what a certificates subject key id is
and why we need it.
 
> > Do we need name[PATH_MAX]? That could be really big.
> 
> Yeah, just simpler than doing asprintf etc. everywhere, and doing all
> the C free cleanups, I guess.
> 

> > > +static int
> > > +debuginfod_validate_imasig (debuginfod_client *c, const char* tmp_path, int fd)
> > > +{
> > > +  (void) c;
> > > +  (void) tmp_path;
> > > +  (void) fd;
> > > +  int rc = ENOSYS;
> > 
> > Why not define c, tmp_path and fd under ENABLE_IMA_VERIFICATION?
> 
> Not sure what you mean.  This is not a definition, it is a "use", to
> eschew gcc warnings.

Ah, yeah, I got that wrong. Sorry. Those are unused arguments.

> > > +    // Compute the binary digest of the cached file (with file descriptor fd)
> > > +    ctx = EVP_MD_CTX_new();
> > > +    int hash_algo = imaevm_hash_algo_from_sig(bin_sig + 1);
> > 
> > Why bin_sig + 1?
> 
> Not sure, but this is how libimaevm.c similar code does it.  I assume
> the first byte of the signature is something else.
> (https://git.code.sf.net/p/linux-ima/ima-evm-utils)

ewww. Does this pass ubsan (--enable-sanitize-undefined)?  The issue
is that this seems to access structure values at a possibly unaligned
address.

> > > +      if (-1 == (n = pread(fd, file_data, DATA_SIZE, k)))
> 
> > Nice, but where is DATA_SIZE defined?
> 
> /usr/include/imaevm.h:#define  DATA_SIZE        4096

OK, reading 4K blocks is fine. But bleah, that is a very generic name
to use in an include header. Boo imaevm.h.

> > > +    /* Iterate over the directories in DEBUGINFOD_IMA_CERT_PATH looking
> > > +     * for the first verification certificate which matches keyid
> > > +     */
> > > +    uint32_t keyid = ntohl(((struct signature_v2_hdr *)(bin_sig + 1))->keyid); // The signature's keyid
> > 
> > Eep. Could this have a comment and/or pointer to docs? +1?
> > cast to struct? ntohl?
> 
> Yeah.  ntohl endianness conversion is not unusual in binary transport
> protocols.

So same as above, this seems like it might trigger undefined behavior.

> > > +          init_public_keys(tmp_file);
> > 
> > What does init_public_keys do? Is it thread-safe?
> 
> Good catch.  It initialized a global inside libimaevm.c.  It does not
> appear thread-safe.  Will wrap this in a pthread-once or somesuch.

libimaevm.c seems not thread-safe in general. You might have to put a big lock around the whole signature extraction/checking block which uses those library functions.

Another possible issue might be the initialization of openssl in the
static constructure. How does that interact with how libcurl inits
openssl?

Cheers,

Mark