public inbox for elfutils@sourceware.org
 help / color / mirror / Atom feed
* [Bug libdw/26773] New: sleb128 values near INT64_MAX/MIN not correctly read
@ 2020-10-22 11:06 mark at klomp dot org
  2020-10-22 20:07 ` [Bug libdw/26773] " tromey at sourceware dot org
                   ` (4 more replies)
  0 siblings, 5 replies; 6+ messages in thread
From: mark at klomp dot org @ 2020-10-22 11:06 UTC (permalink / raw)
  To: elfutils-devel

https://sourceware.org/bugzilla/show_bug.cgi?id=26773

            Bug ID: 26773
           Summary: sleb128 values near INT64_MAX/MIN not correctly read
           Product: elfutils
           Version: unspecified
            Status: NEW
          Severity: normal
          Priority: P2
         Component: libdw
          Assignee: unassigned at sourceware dot org
          Reporter: mark at klomp dot org
                CC: elfutils-devel at sourceware dot org, tromey at sourceware dot org
  Target Milestone: ---

sleb128 values near INT64_MAX/MIN are not correctly read since:

commit 65a211f9028304757df8f4fa7cb3cc77d1501420
Author: Mark Wielaard <mjw@redhat.com>
Date:   Wed Apr 22 12:28:30 2015 +0200

    libdw: Undefined behavior in get_sleb128_step.

    gcc -fsanitize=undefined pointed out that for too big sleb128 values we
    could shift into the sign bit. So for sleb128 values that have to fit
    in a (signed) int64_t variable reduce the max number of steps by one.

    https://bugzilla.redhat.com/show_bug.cgi?id=1170810#c29

    Signed-off-by: Mark Wielaard <mjw@redhat.com>

The idea was the we didn't want to shift 1 << (9 * 7) since that would shift 63
places in a signed value (which is undefined behavior). Where 9 is the number
of bytes/steps (counting from zero). But sleb128 values near INT64_MIN/MAX are
represented by 10 bytes. Where the 10th byte doesn't have its high bit set to
indicate it is the last one. So for sleb128 values that use 10 bytes we fail to
read all and return INT64_MAX.

The reason we haven't seen this before is because normally using data8 is
cheaper for representing such values (if we know the type is signed). But with
DWARF5 we might be able to share the value between DIEs using
DW_FORM_implicit_const (which is represented in the abbrev itself using an
sleb128).

An example that triggers this is:

#include <stdint.h>

int foo ()
{
  int64_t maxA, maxB, maxC;
  maxA = maxB = maxC = INT64_MAX - 16; /*  9223372036854775791 */

  int64_t minA, minB, minC;
  minA = minB = minC = INT64_MIN + 16; /* -9223372036854775792 */

  return (maxA + maxB + maxC
          + minA + minB + minC);
}

compiled with gcc -gdwarf-5 -g -O2 -c consts.c

$ eu-readelf --debug-dump=abbrev consts.o

DWARF section [ 6] '.debug_abbrev' at offset 0x114:
 [ Code]

Abbreviation section at offset 0:
 [    1] offset: 0, children: no, tag: base_type
          attr: byte_size, form: data1, offset: 0
          attr: encoding, form: data1, offset: 0x2
          attr: name, form: strp, offset: 0x4
 [    2] offset: 11, children: no, tag: variable
          attr: name, form: strp, offset: 0xb
          attr: decl_file, form: implicit_const (1), offset: 0xd
          attr: decl_line, form: implicit_const (5), offset: 0x10
          attr: decl_column, form: data1, offset: 0x13
          attr: type, form: ref4, offset: 0x15
          attr: const_value, form: implicit_const (9223372036854775807),
offset: 0x17

Abbreviation section at offset 40:
 [    3] offset: 40, children: no, tag: variable
          attr: name, form: strp, offset: 0x28
          attr: decl_file, form: implicit_const (1), offset: 0x2a
          attr: decl_line, form: implicit_const (8), offset: 0x2d
          attr: decl_column, form: data1, offset: 0x30
          attr: type, form: ref4, offset: 0x32
          attr: const_value, form: implicit_const (9223372036854775807),
offset: 0x34
          attr: call_origin, form: ??? (0), offset: 0x3f
          attr: ??? (0), form: block4, offset: 0x41
[...]

Note how both values are read as INT64_MAX and that after (what is supposed to
be the negative value -9223372036854775792) the rest of the abbrev is read
wrongly:

$ eu-readelf --debug-dump=info consts.o


DWARF section [ 4] '.debug_info' at offset 0x46:
 [Offset]
 Compilation unit at offset 0:
 Version: 5, Abbreviation section offset: 0, Address size: 8, Offset size: 4
 Unit type: compile (1)
eu-readelf: cannot get tag of DIE at offset [c] in section '.debug_info':
invalid DWARF

We have to allow reading 10-byte sleb128 values without triggering undefined
behavior.

-- 
You are receiving this mail because:
You are on the CC list for the bug.

^ permalink raw reply	[flat|nested] 6+ messages in thread

* [Bug libdw/26773] sleb128 values near INT64_MAX/MIN not correctly read
  2020-10-22 11:06 [Bug libdw/26773] New: sleb128 values near INT64_MAX/MIN not correctly read mark at klomp dot org
@ 2020-10-22 20:07 ` tromey at sourceware dot org
  2020-10-22 23:46 ` jistone at redhat dot com
                   ` (3 subsequent siblings)
  4 siblings, 0 replies; 6+ messages in thread
From: tromey at sourceware dot org @ 2020-10-22 20:07 UTC (permalink / raw)
  To: elfutils-devel

https://sourceware.org/bugzilla/show_bug.cgi?id=26773

--- Comment #1 from Tom Tromey <tromey at sourceware dot org> ---
I looked at this a little today.

In addition to this bug, I think the _unchecked variants
have another bug; namely they do:

  const size_t max = len_leb128 (int64_t) - 1;

This limits the number of bytes read-- but it seems like
it maybe ought to read until the first byte without the
high bit set.

FWIW gdb seems to rely on implementation-defined behavior
here.  It does all the sleb work in an unsigned type
and then casts it to signed on return.

-- 
You are receiving this mail because:
You are on the CC list for the bug.

^ permalink raw reply	[flat|nested] 6+ messages in thread

* [Bug libdw/26773] sleb128 values near INT64_MAX/MIN not correctly read
  2020-10-22 11:06 [Bug libdw/26773] New: sleb128 values near INT64_MAX/MIN not correctly read mark at klomp dot org
  2020-10-22 20:07 ` [Bug libdw/26773] " tromey at sourceware dot org
@ 2020-10-22 23:46 ` jistone at redhat dot com
  2020-10-23 15:57 ` mark at klomp dot org
                   ` (2 subsequent siblings)
  4 siblings, 0 replies; 6+ messages in thread
From: jistone at redhat dot com @ 2020-10-22 23:46 UTC (permalink / raw)
  To: elfutils-devel

https://sourceware.org/bugzilla/show_bug.cgi?id=26773

Josh Stone <jistone at redhat dot com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |jistone at redhat dot com

--- Comment #2 from Josh Stone <jistone at redhat dot com> ---
(In reply to Tom Tromey from comment #1)
> This limits the number of bytes read-- but it seems like
> it maybe ought to read until the first byte without the
> high bit set.

Beware, golang just dealt with a CVE for reading unlimited varints:
https://github.com/golang/go/issues/40618

(But the context is different since that's used in stuff like protocol buffers
that may be streamed.)

-- 
You are receiving this mail because:
You are on the CC list for the bug.

^ permalink raw reply	[flat|nested] 6+ messages in thread

* [Bug libdw/26773] sleb128 values near INT64_MAX/MIN not correctly read
  2020-10-22 11:06 [Bug libdw/26773] New: sleb128 values near INT64_MAX/MIN not correctly read mark at klomp dot org
  2020-10-22 20:07 ` [Bug libdw/26773] " tromey at sourceware dot org
  2020-10-22 23:46 ` jistone at redhat dot com
@ 2020-10-23 15:57 ` mark at klomp dot org
  2020-10-23 16:08 ` mark at klomp dot org
  2020-10-29 23:07 ` mark at klomp dot org
  4 siblings, 0 replies; 6+ messages in thread
From: mark at klomp dot org @ 2020-10-23 15:57 UTC (permalink / raw)
  To: elfutils-devel

https://sourceware.org/bugzilla/show_bug.cgi?id=26773

--- Comment #3 from Mark Wielaard <mark at klomp dot org> ---
(In reply to Tom Tromey from comment #1)
> I looked at this a little today.
> 
> In addition to this bug, I think the _unchecked variants
> have another bug; namely they do:
> 
>   const size_t max = len_leb128 (int64_t) - 1;
> 
> This limits the number of bytes read-- but it seems like
> it maybe ought to read until the first byte without the
> high bit set.

Yes, that is the bug (plus the same -1 we do inside __libdw_max_len_sleb128 for
the "checked" variant.

> FWIW gdb seems to rely on implementation-defined behavior
> here.  It does all the sleb work in an unsigned type
> and then casts it to signed on return.

That would probably work. Is the implementation-defined part that it depends on
signed numbers being represented as Two's complement? In that case I think we
are fine with that as fix.

-- 
You are receiving this mail because:
You are on the CC list for the bug.

^ permalink raw reply	[flat|nested] 6+ messages in thread

* [Bug libdw/26773] sleb128 values near INT64_MAX/MIN not correctly read
  2020-10-22 11:06 [Bug libdw/26773] New: sleb128 values near INT64_MAX/MIN not correctly read mark at klomp dot org
                   ` (2 preceding siblings ...)
  2020-10-23 15:57 ` mark at klomp dot org
@ 2020-10-23 16:08 ` mark at klomp dot org
  2020-10-29 23:07 ` mark at klomp dot org
  4 siblings, 0 replies; 6+ messages in thread
From: mark at klomp dot org @ 2020-10-23 16:08 UTC (permalink / raw)
  To: elfutils-devel

https://sourceware.org/bugzilla/show_bug.cgi?id=26773

--- Comment #4 from Mark Wielaard <mark at klomp dot org> ---
(In reply to Josh Stone from comment #2)
> (In reply to Tom Tromey from comment #1)
> > This limits the number of bytes read-- but it seems like
> > it maybe ought to read until the first byte without the
> > high bit set.
> 
> Beware, golang just dealt with a CVE for reading unlimited varints:
> https://github.com/golang/go/issues/40618
> 
> (But the context is different since that's used in stuff like protocol
> buffers that may be streamed.)

Note that there are two "limits" in play here. For the normal get_sleb128
(__libdw_get_sleb128) variant we already require that and end pointer is
provided, which we won't move past. We require the caller to make sure we can
read at least the first byte. For the get_sleb128_unchecked
(__libdw_get_sleb128_unchecked) variant we assume the called already knows
there is a valid sleb128 at the given address, so we don't require an end
pointer and can just read till the end (first byte that doesn't have the high
bit set).

The other limit is that we only read as many bytes that can "fit" the type.
That is the "step limit" which is given by len_leb128 (int64_t) (or
__libdw_max_len_uleb128/__libdw_max_len_sleb128 for the unchecked variant).
This limits makes sure we don't "overflow" the int64_t or uint64_t. So we
normally only read 10 bytes anyway.

The bug is in "the other limit". For signed types we subtract 1 to make sure we
didn't shift into the sign bit. But that means we can only read up to 9 bytes
of the sleb128 and will fail (return INT64_MAX and leave the addr pointer at
the last byte instead of consuming it).

I think Tom's suggestion to use an unsigned type for the sleb128 calculation
and then cast it to int64_t when done is the correct fix.

-- 
You are receiving this mail because:
You are on the CC list for the bug.

^ permalink raw reply	[flat|nested] 6+ messages in thread

* [Bug libdw/26773] sleb128 values near INT64_MAX/MIN not correctly read
  2020-10-22 11:06 [Bug libdw/26773] New: sleb128 values near INT64_MAX/MIN not correctly read mark at klomp dot org
                   ` (3 preceding siblings ...)
  2020-10-23 16:08 ` mark at klomp dot org
@ 2020-10-29 23:07 ` mark at klomp dot org
  4 siblings, 0 replies; 6+ messages in thread
From: mark at klomp dot org @ 2020-10-29 23:07 UTC (permalink / raw)
  To: elfutils-devel

https://sourceware.org/bugzilla/show_bug.cgi?id=26773

Mark Wielaard <mark at klomp dot org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
         Resolution|---                         |FIXED
             Status|NEW                         |RESOLVED

--- Comment #5 from Mark Wielaard <mark at klomp dot org> ---
commit 70343f484481184f9fa216071399690ff833256b
Author: Tom Tromey <tom@tromey.com>
Date:   Wed Oct 28 17:26:42 2020 -0600

    Fix leb128 reading

    PR 26773 points out that some sleb128 values are decoded incorrectly.

    This version of the fix only examines the sleb128 conversion.
    Overlong encodings are not handled, and the uleb128 decoders are not
    touched.  The approach taken here is to do the work in an unsigned
    type, and then rely on an implementation-defined cast to convert to
    signed.

    Signed-off-by: Tom Tromey <tom@tromey.com>

-- 
You are receiving this mail because:
You are on the CC list for the bug.

^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2020-10-29 23:07 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-10-22 11:06 [Bug libdw/26773] New: sleb128 values near INT64_MAX/MIN not correctly read mark at klomp dot org
2020-10-22 20:07 ` [Bug libdw/26773] " tromey at sourceware dot org
2020-10-22 23:46 ` jistone at redhat dot com
2020-10-23 15:57 ` mark at klomp dot org
2020-10-23 16:08 ` mark at klomp dot org
2020-10-29 23:07 ` mark at klomp dot org

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).