From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <SRS0=PAhz=5X=klomp.org=mark@sourceware.org>
Received: from gnu.wildebeest.org (gnu.wildebeest.org [45.83.234.184])
	by sourceware.org (Postfix) with ESMTPS id 925063858D28
	for <elfutils-devel@sourceware.org>; Thu, 26 Jan 2023 21:05:41 +0000 (GMT)
DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org 925063858D28
Authentication-Results: sourceware.org; dmarc=none (p=none dis=none) header.from=klomp.org
Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=klomp.org
Received: by gnu.wildebeest.org (Postfix, from userid 1000)
	id D69F9302BBEC; Thu, 26 Jan 2023 22:05:39 +0100 (CET)
Date: Thu, 26 Jan 2023 22:05:39 +0100
From: Mark Wielaard <mark@klomp.org>
To: vvvvvv@google.com
Cc: elfutils-devel@sourceware.org, kernel-team@android.com,
	maennich@google.com
Subject: Re: [PATCH] libdw: check memory access in get_(u|s)leb128
Message-ID: <20230126210539.GC2781@gnu.wildebeest.org>
References: <20230125160530.949622-1-vvvvvv@google.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <20230125160530.949622-1-vvvvvv@google.com>
User-Agent: Mutt/1.5.21 (2010-09-15)
X-Spam-Status: No, score=-3031.6 required=5.0 tests=BAYES_00,JMQ_SPF_NEUTRAL,KAM_DMARC_STATUS,KAM_NUMSUBJECT,SPF_HELO_NONE,SPF_PASS,TXREP autolearn=no autolearn_force=no version=3.4.6
X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org
List-Id: <elfutils-devel.sourceware.org>

Hi Aleksei,

On Wed, Jan 25, 2023 at 04:05:30PM +0000, Aleksei Vetrov via Elfutils-devel wrote:
> From: Aleksei Vetrov <vvvvvv@google.com>
> 
> __libdw_get_uleb128 and __libdw_get_sleb128 should check if addrp has
> already reached the end before unrolling the first step. It is done by
> moving __libdw_max_len to the beginning of the function, which already
> has all the checks.

I understand why you might want to add these extra safeguards. But
when the bounds checking and unrolling for get_uleb128/get_sleb128 was
introduced the idea was that they should only be called with addrp <
endp. Precisely so at least one byte could be read without needing to
bounds check.

See the following commits:

commit 7a053473c7bedd22e3db39c444a4cd8f97eace25
Author: Mark Wielaard <mjw@redhat.com>
Date:   Sun Dec 14 21:48:23 2014 +0100

    libdw: Add get_uleb128 and get_sleb128 bounds checking.
    
    Both get_uleb128 and get_sleb128 now take an end pointer to prevent
    reading too much data. Adjust all callers to provide the end pointer.
    
    There are still two exceptions. "Raw" dwarf_getabbrevattr and
    read_encoded_valued don't have a end pointer associated yet.
    They will have to be provided in the future.
    
    Signed-off-by: Mark Wielaard <mjw@redhat.com>

commit 54662f13d14d59d44943543c48bdb21d50dd008d
Author: Josh Stone <jistone@redhat.com>
Date:   Mon Dec 15 12:18:25 2014 -0800

    libdw: pre-compute leb128 loop limits
    
    Signed-off-by: Josh Stone <jistone@redhat.com>

commit 1b5477ddf360af0f6262c6f15a590448b4e1a65a
Author: Mark Wielaard <mjw@redhat.com>
Date:   Tue Dec 16 10:53:22 2014 +0100

    libdw: Unroll the first get_sleb128 step to help the compiler optimize.
    
    The common case is a single-byte. So no extra (max len) calculation is
    necessary then.
    
    Signed-off-by: Mark Wielaard <mjw@redhat.com>

Now this was 8 years ago and might have been too optimistic. Maybe we
missed some checks? Did you actually find situations where these
functions were called with addrp >= endp?

It turns out that get_[su]leb128 dominates some operations and really
does have to be as fast as possible. So I do like to know what the
impact is of this change.

Thanks,

Mark