Re: [PATCH 2/3] Simplify and inline get_uleb128 and get

public inbox for elfutils@sourceware.org
 help / color / mirror / Atom feed

* Re: [PATCH 2/3] Simplify and inline get_uleb128 and get_sleb128
@ 2014-04-22 15:01 Richard Henderson
  0 siblings, 0 replies; 19+ messages in thread
From: Richard Henderson @ 2014-04-22 15:01 UTC (permalink / raw)
  To: elfutils-devel

[-- Attachment #1: Type: text/plain, Size: 1853 bytes --]

On 04/22/2014 07:55 AM, Mark Wielaard wrote:
> On Thu, 2013-12-12 at 11:06 -0800, Josh Stone wrote:
>> On 12/12/2013 04:13 AM, Petr Machata wrote:
>>> Josh Stone <jistone@redhat.com> writes:
>>>> +#define get_sleb128_step(var, addr, nth)			  \
>>>>    do {							  \
>>>> +    unsigned char __b = *(addr)++;				  \
>>>> +    if (likely ((__b & 0x80) == 0))				  \
>>>>        {							  \
>>>> +	   struct { signed int i:7; } __s = { .i = __b };	  \
>>>> +	   (var) |= (typeof (var)) __s.i << ((nth) * 7);	  \
>>>
>>> Oh, the bitfield trick is clever!
>>
>> I should give credit, I found that trick here:
>> http://graphics.stanford.edu/~seander/bithacks.html#FixedSignExtend
>>
>> The former code was trying to sign-extend after the value was shifted
>> and combined, which as a variable width is harder.  I really like that
>> this way the compiler is fully aware that this is a sign extension,
>> rather than being a side effect of ORing bits or left-right shifts.
> 
> Sadly the neat trick triggers undefined behavior since we are trying to
> left shift a negative value. Even though it appears to work currently I
> am slightly afraid a compiler optimization might take advantage of this
> in the future (since it is undefined behavior it could just assume
> negative values won't occur) especially since this code is inlined in a
> lot of places, causing hard to diagnose errors.
> 
> The attached patch is very much not clever, but does what is intended in
> a well-defined way (it is basically what the DWARF spec gives as pseudo
> code). Does anybody see a better way?

Multiply instead of shift.  I.e.

  (typeof (var)) (__b & 0x7f) * (1 << ((nth) * 7));

With any luck the compiler will even notice it's a multiply by a value with one
bit set, and convert it back to a shift during optimization.


r~

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH 2/3] Simplify and inline get_uleb128 and get_sleb128
@ 2014-04-24  9:49 Mark Wielaard
  0 siblings, 0 replies; 19+ messages in thread
From: Mark Wielaard @ 2014-04-24  9:49 UTC (permalink / raw)
  To: elfutils-devel

[-- Attachment #1: Type: text/plain, Size: 1008 bytes --]

On Thu, 2014-04-24 at 00:32 +0200, Petr Machata wrote:
> Mark Wielaard <mjw@redhat.com> writes:
> > But I think I prefer the fix using multiplication as posted here:
> > https://lists.fedorahosted.org/pipermail/elfutils-devel/2014-April/003991.html
> > That one is also just a oneliner and more generic since it doesn't rely
> > on having to cast to uint64_t. And Richard checked it still just
> > generates a shift.
> 
> Ah, cool, I didn't notice that e-mail.

OK, I pushed that fix and the other undefined behavior fixes posted
earlier to master now:

  readelf: handle_core_item make sure variable length array isn't zero size.
  libdwfl: __libdwfl_frame_reg_[gs]et use uint64_t when checking bits.
  readelf.c (print_gdb_index_section): Use unsigned int for 31 bits left shift.
  libdw (get_sleb128_step): Remove undefined behavior.

make CFLAGS="-g -fsanitize=undefined" && \
make CFLAGS="-g -fsanitize=undefined" check
with gcc 4.9 should have zero FAILs now.

Cheers,

Mark

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH 2/3] Simplify and inline get_uleb128 and get_sleb128
@ 2014-04-23 22:32 Petr Machata
  0 siblings, 0 replies; 19+ messages in thread
From: Petr Machata @ 2014-04-23 22:32 UTC (permalink / raw)
  To: elfutils-devel

[-- Attachment #1: Type: text/plain, Size: 687 bytes --]

Mark Wielaard <mjw@redhat.com> writes:

> On Wed, 2014-04-23 at 22:29 +0200, Petr Machata wrote:
>> How's this for a counter-proposal?  I shamelessly reused your commit
>> message and ChangeLog wording, please let me know if that's overstepping
>> boundaries.
>
> No it isn't. And thanks for the proposal.
> But I think I prefer the fix using multiplication as posted here:
> https://lists.fedorahosted.org/pipermail/elfutils-devel/2014-April/003991.html
> That one is also just a oneliner and more generic since it doesn't rely
> on having to cast to uint64_t. And Richard checked it still just
> generates a shift.

Ah, cool, I didn't notice that e-mail.

Thanks,
PM

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH 2/3] Simplify and inline get_uleb128 and get_sleb128
@ 2014-04-23 21:54 Mark Wielaard
  0 siblings, 0 replies; 19+ messages in thread
From: Mark Wielaard @ 2014-04-23 21:54 UTC (permalink / raw)
  To: elfutils-devel

[-- Attachment #1: Type: text/plain, Size: 1061 bytes --]

On Wed, 2014-04-23 at 22:29 +0200, Petr Machata wrote:
> How's this for a counter-proposal?  I shamelessly reused your commit
> message and ChangeLog wording, please let me know if that's overstepping
> boundaries.

No it isn't. And thanks for the proposal.
But I think I prefer the fix using multiplication as posted here:
https://lists.fedorahosted.org/pipermail/elfutils-devel/2014-April/003991.html
That one is also just a oneliner and more generic since it doesn't rely
on having to cast to uint64_t. And Richard checked it still just
generates a shift.

You are right that we don't really need the generic version, but just a
int64_t one. But if we make it non-generic than I think we should go all
the way and also do the other cleanups suggested to get rid of the
typeofs and just use the 64bit types everywhere (and maybe inline or
undef the macro). But please don't feel you need to clean it all up and
"ungenericize" it all. That would be nice, but the main thing is that we
get rid of the undefined behavior.

Thanks,

Mark

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH 2/3] Simplify and inline get_uleb128 and get_sleb128
@ 2014-04-23 20:29 Petr Machata
  0 siblings, 0 replies; 19+ messages in thread
From: Petr Machata @ 2014-04-23 20:29 UTC (permalink / raw)
  To: elfutils-devel

[-- Attachment #1: Type: text/plain, Size: 172 bytes --]

How's this for a counter-proposal?  I shamelessly reused your commit
message and ChangeLog wording, please let me know if that's overstepping
boundaries.

Thanks,
PM

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH 2/3] Simplify and inline get_uleb128 and get_sleb128
@ 2014-04-23 19:01 Josh Stone
  0 siblings, 0 replies; 19+ messages in thread
From: Josh Stone @ 2014-04-23 19:01 UTC (permalink / raw)
  To: elfutils-devel

[-- Attachment #1: Type: text/plain, Size: 1752 bytes --]

On 04/23/2014 11:51 AM, Richard Henderson wrote:
> On 04/23/2014 11:32 AM, Petr Machata wrote:
>> Richard Henderson <rth@redhat.com> writes:
>>
>>> On 04/23/2014 03:17 AM, Petr Machata wrote:
>>>> Wouldn't something like this get us off the hook as well?
>>>>
>>>> -	(var) |= (typeof (var)) __s.i << ((nth) * 7);			      \
>>>> +	(var) |= (typeof (var))						      \
>>>> +	  (((uint64_t) (typeof (var)) __s.i) << ((nth) * 7));		      \
>>>>
>>>> We are really only using the bitfield trick to avoid having to
>>>> sign-extend by hand, but we can shift unsigned without losing anything.
>>>
>>> It gets us off the hook, but might introduce a 64-bit shift where
>>> only a 32-bit shift was required.
>>
>> Good point, but get_sleb128_step is only used from __libdw_get_sleb128,
>> where the type is int64_t.  This macro is not safe for outside use
>> anyway, as it uses its parameters more than once.
>>
>> Hmm, should we maybe #undef it after __libdw_get_sleb128, so that it's
>> clear that it's for local use only?
> 
> In that case, why play around with the typeof at all?  Just merge the macro
> into its only user.  At which point using uint64_t makes total sense in context.
> 
> Similarly with get_uleb128_step, I assume, at least wrt to the typeof.

The way these are macro'd is partly a legacy of earlier code.  It does
help for the uleb case since unrolling the first step proved to benefit
optimization.  Keeping sleb this way is helpful to maintain their
similarity as much as possible.

Sure, the typeof's are probably unnecessary anymore, but this is
splitting hairs IMO.  It doesn't help anything to drop this, and we
could decide to provide narrower _libdw_get... in the future for certain
callers.

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH 2/3] Simplify and inline get_uleb128 and get_sleb128
@ 2014-04-23 18:51 Richard Henderson
  0 siblings, 0 replies; 19+ messages in thread
From: Richard Henderson @ 2014-04-23 18:51 UTC (permalink / raw)
  To: elfutils-devel

[-- Attachment #1: Type: text/plain, Size: 1214 bytes --]

On 04/23/2014 11:32 AM, Petr Machata wrote:
> Richard Henderson <rth@redhat.com> writes:
> 
>> On 04/23/2014 03:17 AM, Petr Machata wrote:
>>> Wouldn't something like this get us off the hook as well?
>>>
>>> -	(var) |= (typeof (var)) __s.i << ((nth) * 7);			      \
>>> +	(var) |= (typeof (var))						      \
>>> +	  (((uint64_t) (typeof (var)) __s.i) << ((nth) * 7));		      \
>>>
>>> We are really only using the bitfield trick to avoid having to
>>> sign-extend by hand, but we can shift unsigned without losing anything.
>>
>> It gets us off the hook, but might introduce a 64-bit shift where
>> only a 32-bit shift was required.
> 
> Good point, but get_sleb128_step is only used from __libdw_get_sleb128,
> where the type is int64_t.  This macro is not safe for outside use
> anyway, as it uses its parameters more than once.
> 
> Hmm, should we maybe #undef it after __libdw_get_sleb128, so that it's
> clear that it's for local use only?

In that case, why play around with the typeof at all?  Just merge the macro
into its only user.  At which point using uint64_t makes total sense in context.

Similarly with get_uleb128_step, I assume, at least wrt to the typeof.


r~


^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH 2/3] Simplify and inline get_uleb128 and get_sleb128
@ 2014-04-23 18:32 Petr Machata
  0 siblings, 0 replies; 19+ messages in thread
From: Petr Machata @ 2014-04-23 18:32 UTC (permalink / raw)
  To: elfutils-devel

[-- Attachment #1: Type: text/plain, Size: 911 bytes --]

Richard Henderson <rth@redhat.com> writes:

> On 04/23/2014 03:17 AM, Petr Machata wrote:
>> Wouldn't something like this get us off the hook as well?
>> 
>> -	(var) |= (typeof (var)) __s.i << ((nth) * 7);			      \
>> +	(var) |= (typeof (var))						      \
>> +	  (((uint64_t) (typeof (var)) __s.i) << ((nth) * 7));		      \
>> 
>> We are really only using the bitfield trick to avoid having to
>> sign-extend by hand, but we can shift unsigned without losing anything.
>
> It gets us off the hook, but might introduce a 64-bit shift where
> only a 32-bit shift was required.

Good point, but get_sleb128_step is only used from __libdw_get_sleb128,
where the type is int64_t.  This macro is not safe for outside use
anyway, as it uses its parameters more than once.

Hmm, should we maybe #undef it after __libdw_get_sleb128, so that it's
clear that it's for local use only?

Thanks,
PM

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH 2/3] Simplify and inline get_uleb128 and get_sleb128
@ 2014-04-23 15:27 Richard Henderson
  0 siblings, 0 replies; 19+ messages in thread
From: Richard Henderson @ 2014-04-23 15:27 UTC (permalink / raw)
  To: elfutils-devel

[-- Attachment #1: Type: text/plain, Size: 732 bytes --]

On 04/23/2014 03:17 AM, Petr Machata wrote:
> Wouldn't something like this get us off the hook as well?
> 
> -	(var) |= (typeof (var)) __s.i << ((nth) * 7);			      \
> +	(var) |= (typeof (var))						      \
> +	  (((uint64_t) (typeof (var)) __s.i) << ((nth) * 7));		      \
> 
> We are really only using the bitfield trick to avoid having to
> sign-extend by hand, but we can shift unsigned without losing anything.

It gets us off the hook, but might introduce a 64-bit shift where
only a 32-bit shift was required.

Sadly, (unsigned typeof(var)) doesn't work.  ;-)

But I gave the multiplication change a look, and the compiler is in fact
optimizing the multiplication by a power of 2 back into a shift.


r~

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH 2/3] Simplify and inline get_uleb128 and get_sleb128
@ 2014-04-23 10:17 Petr Machata
  0 siblings, 0 replies; 19+ messages in thread
From: Petr Machata @ 2014-04-23 10:17 UTC (permalink / raw)
  To: elfutils-devel

[-- Attachment #1: Type: text/plain, Size: 1777 bytes --]

Mark Wielaard <mjw@redhat.com> writes:

> Sadly the neat trick triggers undefined behavior since we are trying to
> left shift a negative value. Even though it appears to work currently I
> am slightly afraid a compiler optimization might take advantage of this
> in the future (since it is undefined behavior it could just assume
> negative values won't occur) especially since this code is inlined in a
> lot of places, causing hard to diagnose errors.

Ouch.  Yeah, I agree, it is essentially a matter of time before this
whole thing is optimized away or something.

> diff --git a/libdw/memory-access.h b/libdw/memory-access.h
> index d0ee63c..c6e4bdc 100644
> --- a/libdw/memory-access.h
> +++ b/libdw/memory-access.h
> @@ -70,8 +70,9 @@ __libdw_get_uleb128 (const unsigned char **addrp)
>      unsigned char __b = *(addr)++;					      \
>      if (likely ((__b & 0x80) == 0))					      \
>        {									      \
> -	struct { signed int i:7; } __s = { .i = __b };			      \
> -	(var) |= (typeof (var)) __s.i << ((nth) * 7);			      \
> +	(var) |= (typeof (var)) (__b & 0x7f) << ((nth) * 7);		      \
> +	if ((((nth) + 1) < 8 * sizeof (var)) && (__b & 0x40))		      \
> +	  (var) |= -(((uint64_t) 1) << (((nth) + 1) * 7));		      \
>  	return (var);							      \
>        }									      \
>      (var) |= (typeof (var)) (__b & 0x7f) << ((nth) * 7);		      \

Wouldn't something like this get us off the hook as well?

-	(var) |= (typeof (var)) __s.i << ((nth) * 7);			      \
+	(var) |= (typeof (var))						      \
+	  (((uint64_t) (typeof (var)) __s.i) << ((nth) * 7));		      \

We are really only using the bitfield trick to avoid having to
sign-extend by hand, but we can shift unsigned without losing anything.

Thanks,
PM

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH 2/3] Simplify and inline get_uleb128 and get_sleb128
@ 2014-04-22 22:04 Mark Wielaard
  0 siblings, 0 replies; 19+ messages in thread
From: Mark Wielaard @ 2014-04-22 22:04 UTC (permalink / raw)
  To: elfutils-devel

[-- Attachment #1: Type: text/plain, Size: 433 bytes --]

On Tue, 2014-04-22 at 08:58 -0700, Richard Henderson wrote:
> On 04/22/2014 08:52 AM, Josh Stone wrote:
> > So in total:
> > 
> >   struct { signed int i:7; } __s = { .i = __b };
> >   (var) |= (typeof (var)) __s.i * ((typeof (var)) 1 << ((nth) * 7));
> > 
> > Better?
> 
> Yes, that's what I had in mind.

Excellent. I changed the patch as attached. No undefined behavior
detected, all tests PASS.

Thanks,

Mark


[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #2: 0001-libdw-get_sleb128_step-Remove-undefined-behavior.patch --]
[-- Type: text/x-patch, Size: 1844 bytes --]

>From 9ab5b0eba3343e3a75cdc8a8fd5dea76c7a2d351 Mon Sep 17 00:00:00 2001
From: Mark Wielaard <mjw@redhat.com>
Date: Tue, 22 Apr 2014 16:43:11 +0200
Subject: [PATCH] libdw (get_sleb128_step): Remove undefined behavior.

As pointed out by gcc -fsanitize=undefined left shifting a negative value
is undefined. Replace it with a multiplication of the signed value as
suggested by Richard Henderson and Josh Stone.

Signed-off-by: Mark Wielaard <mjw@redhat.com>
---
 libdw/ChangeLog       |    5 +++++
 libdw/memory-access.h |    4 ++--
 2 files changed, 7 insertions(+), 2 deletions(-)

diff --git a/libdw/ChangeLog b/libdw/ChangeLog
index 49d70af..a7b0400 100644
--- a/libdw/ChangeLog
+++ b/libdw/ChangeLog
@@ -1,3 +1,8 @@
+2014-04-22  Mark Wielaard  <mjw@redhat.com>
+
+	* memory-access.h (get_sleb128_step): Remove undefined behavior
+	of left shifting a signed value. Replace it with a multiplication.
+
 2014-04-13  Mark Wielaard  <mjw@redhat.com>
 
 	* Makefile.am: Remove !MUDFLAP conditions.
diff --git a/libdw/memory-access.h b/libdw/memory-access.h
index d0ee63c..f41f783 100644
--- a/libdw/memory-access.h
+++ b/libdw/memory-access.h
@@ -1,5 +1,5 @@
 /* Unaligned memory access functionality.
-   Copyright (C) 2000-2013 Red Hat, Inc.
+   Copyright (C) 2000-2014 Red Hat, Inc.
    This file is part of elfutils.
    Written by Ulrich Drepper <drepper@redhat.com>, 2001.
 
@@ -71,7 +71,7 @@ __libdw_get_uleb128 (const unsigned char **addrp)
     if (likely ((__b & 0x80) == 0))					      \
       {									      \
 	struct { signed int i:7; } __s = { .i = __b };			      \
-	(var) |= (typeof (var)) __s.i << ((nth) * 7);			      \
+	(var) |= (typeof (var)) __s.i * ((typeof (var)) 1 << ((nth) * 7));    \
 	return (var);							      \
       }									      \
     (var) |= (typeof (var)) (__b & 0x7f) << ((nth) * 7);		      \
-- 
1.7.1


^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH 2/3] Simplify and inline get_uleb128 and get_sleb128
@ 2014-04-22 15:58 Richard Henderson
  0 siblings, 0 replies; 19+ messages in thread
From: Richard Henderson @ 2014-04-22 15:58 UTC (permalink / raw)
  To: elfutils-devel

[-- Attachment #1: Type: text/plain, Size: 476 bytes --]

On 04/22/2014 08:52 AM, Josh Stone wrote:
> The trick in the original was that "(typeof (var)) __s.i" would extend
> the sign from the 7th bit, which I don't your mask will do.

Yes, that was another cut/paste error.
Not a great batting average this morning...  ;-)

> So in total:
> 
>   struct { signed int i:7; } __s = { .i = __b };
>   (var) |= (typeof (var)) __s.i * ((typeof (var)) 1 << ((nth) * 7));
> 
> Better?

Yes, that's what I had in mind.


r~

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH 2/3] Simplify and inline get_uleb128 and get_sleb128
@ 2014-04-22 15:52 Josh Stone
  0 siblings, 0 replies; 19+ messages in thread
From: Josh Stone @ 2014-04-22 15:52 UTC (permalink / raw)
  To: elfutils-devel

[-- Attachment #1: Type: text/plain, Size: 2358 bytes --]

On 04/22/2014 08:01 AM, Richard Henderson wrote:
> On 04/22/2014 07:55 AM, Mark Wielaard wrote:
>> On Thu, 2013-12-12 at 11:06 -0800, Josh Stone wrote:
>>> On 12/12/2013 04:13 AM, Petr Machata wrote:
>>>> Josh Stone <jistone@redhat.com> writes:
>>>>> +#define get_sleb128_step(var, addr, nth)			  \
>>>>>    do {							  \
>>>>> +    unsigned char __b = *(addr)++;				  \
>>>>> +    if (likely ((__b & 0x80) == 0))				  \
>>>>>        {							  \
>>>>> +	   struct { signed int i:7; } __s = { .i = __b };	  \
>>>>> +	   (var) |= (typeof (var)) __s.i << ((nth) * 7);	  \
>>>>
>>>> Oh, the bitfield trick is clever!
>>>
>>> I should give credit, I found that trick here:
>>> http://graphics.stanford.edu/~seander/bithacks.html#FixedSignExtend
>>>
>>> The former code was trying to sign-extend after the value was shifted
>>> and combined, which as a variable width is harder.  I really like that
>>> this way the compiler is fully aware that this is a sign extension,
>>> rather than being a side effect of ORing bits or left-right shifts.
>>
>> Sadly the neat trick triggers undefined behavior since we are trying to
>> left shift a negative value. Even though it appears to work currently I
>> am slightly afraid a compiler optimization might take advantage of this
>> in the future (since it is undefined behavior it could just assume
>> negative values won't occur) especially since this code is inlined in a
>> lot of places, causing hard to diagnose errors.
>>
>> The attached patch is very much not clever, but does what is intended in
>> a well-defined way (it is basically what the DWARF spec gives as pseudo
>> code). Does anybody see a better way?
> 
> Multiply instead of shift.  I.e.
> 
>   (typeof (var)) (__b & 0x7f) * (1 << ((nth) * 7));
> 
> With any luck the compiler will even notice it's a multiply by a value with one
> bit set, and convert it back to a shift during optimization.

The trick in the original was that "(typeof (var)) __s.i" would extend
the sign from the 7th bit, which I don't your mask will do.  But if the
only UB is left-shifting negative, then I think we can combine them.

> Bah.  That 1 needs to be typeof (var) too.

So in total:

  struct { signed int i:7; } __s = { .i = __b };
  (var) |= (typeof (var)) __s.i * ((typeof (var)) 1 << ((nth) * 7));

Better?



^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH 2/3] Simplify and inline get_uleb128 and get_sleb128
@ 2014-04-22 15:03 Richard Henderson
  0 siblings, 0 replies; 19+ messages in thread
From: Richard Henderson @ 2014-04-22 15:03 UTC (permalink / raw)
  To: elfutils-devel

[-- Attachment #1: Type: text/plain, Size: 159 bytes --]

On 04/22/2014 08:01 AM, Richard Henderson wrote:
>   (typeof (var)) (__b & 0x7f) * (1 << ((nth) * 7));

Bah.  That 1 needs to be typeof (var) too.


r~

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH 2/3] Simplify and inline get_uleb128 and get_sleb128
@ 2014-04-22 14:55 Mark Wielaard
  0 siblings, 0 replies; 19+ messages in thread
From: Mark Wielaard @ 2014-04-22 14:55 UTC (permalink / raw)
  To: elfutils-devel

[-- Attachment #1: Type: text/plain, Size: 1552 bytes --]

On Thu, 2013-12-12 at 11:06 -0800, Josh Stone wrote:
> On 12/12/2013 04:13 AM, Petr Machata wrote:
> > Josh Stone <jistone@redhat.com> writes:
> >> +#define get_sleb128_step(var, addr, nth)			  \
> >>    do {							  \
> >> +    unsigned char __b = *(addr)++;				  \
> >> +    if (likely ((__b & 0x80) == 0))				  \
> >>        {							  \
> >> +	   struct { signed int i:7; } __s = { .i = __b };	  \
> >> +	   (var) |= (typeof (var)) __s.i << ((nth) * 7);	  \
> > 
> > Oh, the bitfield trick is clever!
> 
> I should give credit, I found that trick here:
> http://graphics.stanford.edu/~seander/bithacks.html#FixedSignExtend
> 
> The former code was trying to sign-extend after the value was shifted
> and combined, which as a variable width is harder.  I really like that
> this way the compiler is fully aware that this is a sign extension,
> rather than being a side effect of ORing bits or left-right shifts.

Sadly the neat trick triggers undefined behavior since we are trying to
left shift a negative value. Even though it appears to work currently I
am slightly afraid a compiler optimization might take advantage of this
in the future (since it is undefined behavior it could just assume
negative values won't occur) especially since this code is inlined in a
lot of places, causing hard to diagnose errors.

The attached patch is very much not clever, but does what is intended in
a well-defined way (it is basically what the DWARF spec gives as pseudo
code). Does anybody see a better way?

Thanks,

Mark

[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #2: 0001-libdw-get_sleb128_step-Remove-undefined-behavior.patch --]
[-- Type: text/x-patch, Size: 1707 bytes --]

>From 0eedb6486806dba9c454edcc249238e096961e09 Mon Sep 17 00:00:00 2001
From: Mark Wielaard <mjw@redhat.com>
Date: Tue, 22 Apr 2014 16:43:11 +0200
Subject: [PATCH] libdw (get_sleb128_step): Remove undefined behavior.

As pointed out by gcc -fsanitize=undefined left shifting a negative value
is undefined. Replace it with an explicit sign extension step.

Signed-off-by: Mark Wielaard <mjw@redhat.com>
---
 libdw/ChangeLog       | 5 +++++
 libdw/memory-access.h | 5 +++--
 2 files changed, 8 insertions(+), 2 deletions(-)

diff --git a/libdw/ChangeLog b/libdw/ChangeLog
index 49d70af..e561e00 100644
--- a/libdw/ChangeLog
+++ b/libdw/ChangeLog
@@ -1,3 +1,8 @@
+2014-04-22  Mark Wielaard  <mjw@redhat.com>
+
+	* memory-access.h (get_sleb128_step): Remove undefined behavior
+	of left shifting a signed value. Add explicit sign extension.
+
 2014-04-13  Mark Wielaard  <mjw@redhat.com>
 
 	* Makefile.am: Remove !MUDFLAP conditions.
diff --git a/libdw/memory-access.h b/libdw/memory-access.h
index d0ee63c..c6e4bdc 100644
--- a/libdw/memory-access.h
+++ b/libdw/memory-access.h
@@ -70,8 +70,9 @@ __libdw_get_uleb128 (const unsigned char **addrp)
     unsigned char __b = *(addr)++;					      \
     if (likely ((__b & 0x80) == 0))					      \
       {									      \
-	struct { signed int i:7; } __s = { .i = __b };			      \
-	(var) |= (typeof (var)) __s.i << ((nth) * 7);			      \
+	(var) |= (typeof (var)) (__b & 0x7f) << ((nth) * 7);		      \
+	if ((((nth) + 1) < 8 * sizeof (var)) && (__b & 0x40))		      \
+	  (var) |= -(((uint64_t) 1) << (((nth) + 1) * 7));		      \
 	return (var);							      \
       }									      \
     (var) |= (typeof (var)) (__b & 0x7f) << ((nth) * 7);		      \
-- 
1.9.0


^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH 2/3] Simplify and inline get_uleb128 and get_sleb128
@ 2013-12-12 22:23 Petr Machata
  0 siblings, 0 replies; 19+ messages in thread
From: Petr Machata @ 2013-12-12 22:23 UTC (permalink / raw)
  To: elfutils-devel

[-- Attachment #1: Type: text/plain, Size: 707 bytes --]

Josh Stone <jistone@redhat.com> writes:

> On 12/12/2013 04:13 AM, Petr Machata wrote:
>> Josh Stone <jistone@redhat.com> writes:
>>> +static inline uint64_t
>>> +__libdw_get_uleb128 (const unsigned char **addrp)
>>> +{
>>> +  uint64_t acc = 0;
>>> +  get_uleb128_step (acc, *addrp, 0);
>>> +  for (unsigned int i = 1; i < len_leb128(acc); ++i)
>>> +    get_uleb128_step (acc, *addrp, i);
>> 
>> Is there a reason not to use for (i = 0; ...) instead of the pre-step
>> followed by a for (i = 1; ...)?
>
> Ah, yes, I should explain that a little.  I found that the code was
> actually faster with the first step unrolled.

Ah, I think that's fine then, but needs a comment.

Thanks,
PM

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH 2/3] Simplify and inline get_uleb128 and get_sleb128
@ 2013-12-12 19:06 Josh Stone
  0 siblings, 0 replies; 19+ messages in thread
From: Josh Stone @ 2013-12-12 19:06 UTC (permalink / raw)
  To: elfutils-devel

[-- Attachment #1: Type: text/plain, Size: 3598 bytes --]

On 12/12/2013 04:13 AM, Petr Machata wrote:
> Josh Stone <jistone@redhat.com> writes:
>> +static inline uint64_t
>> +__libdw_get_uleb128 (const unsigned char **addrp)
>> +{
>> +  uint64_t acc = 0;
>> +  get_uleb128_step (acc, *addrp, 0);
>> +  for (unsigned int i = 1; i < len_leb128(acc); ++i)
>> +    get_uleb128_step (acc, *addrp, i);
> 
> Is there a reason not to use for (i = 0; ...) instead of the pre-step
> followed by a for (i = 1; ...)?  The only difference would be for
> len_leb128 of 0, where your code makes one step anyway, but that can't
> happen.  Then you could just expand get_uleb128_step inline, and get rid
> of it altogether.

Ah, yes, I should explain that a little.  I found that the code was
actually faster with the first step unrolled.  And you may have noticed
I didn't do this for sleb128, because I found that slightly slower.

It seems to work like an extra hint to the compiler.  There's already a
likely-return in the loop, but writing it this way seems to convince gcc
that the first step in particular is *really* likely to return.  Then
that step also gets to const-prop the zero.

I can only guess that the sign-extension for sleb128 is complicated
enough to nullify the gain of unrolling.

>> +#define get_uleb128(var, addr)						      \
>>    do {									      \
>> +    (var) = __libdw_get_uleb128(&(addr));				      \
>> +    (void)(var);							      \
>>    } while (0)
> 
> Space before (...).  Space after type cast operator.
> 
> Actually, could this actually be defined like this?
> 
>         #define get_uleb128(var, addr) ((var) = __libdw_get_uleb128 (&(addr)))
> 
> Then you wouldn't need the cast as the only set-but-not-used warnings
> would come from sites that actually set but not use, and those could be
> changed to simply call __libdw_get_uleb128.  But that's probably a
> logically separate change.

Right, there are two reasons I kept it this way.  One is the cast for
set-but-not-used, as you note, which didn't trigger before because the
former macros did both read and write the value in place.  I think there
was only one caller affected though, so I could just change that.  The
do-while keeps the macro as a statement, vs your expression, but maybe
that doesn't matter much.

>> +#define get_sleb128_step(var, addr, nth)				      \
>>    do {									      \
>> +    unsigned char __b = *(addr)++;					      \
>> +    if (likely ((__b & 0x80) == 0))					      \
>>        {									      \
>> +	   struct { signed int i:7; } __s = { .i = __b };			      \
>> +	   (var) |= (typeof (var)) __s.i << ((nth) * 7);			      \
> 
> Oh, the bitfield trick is clever!

I should give credit, I found that trick here:
http://graphics.stanford.edu/~seander/bithacks.html#FixedSignExtend

The former code was trying to sign-extend after the value was shifted
and combined, which as a variable width is harder.  I really like that
this way the compiler is fully aware that this is a sign extension,
rather than being a side effect of ORing bits or left-right shifts.

> This all looks fine to me.  I find the new code much clearer than the
> original, which seemed to wind every which way and was hard to follow.

I agree the original was convoluted, and I think it was actually broken
for some edge cases, though you almost have to look at it preprocessed
to make sense of it.  In particular, __libdw_get_sleb128 ORed with acc
in the final step, but only _v was kept updated throughout the loop.
(But it's unlikely any real DWARF had a 10-byte sleb128 to hit this.)

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH 2/3] Simplify and inline get_uleb128 and get_sleb128
@ 2013-12-12 12:13 Petr Machata
  0 siblings, 0 replies; 19+ messages in thread
From: Petr Machata @ 2013-12-12 12:13 UTC (permalink / raw)
  To: elfutils-devel

[-- Attachment #1: Type: text/plain, Size: 3057 bytes --]

Josh Stone <jistone@redhat.com> writes:

> diff --git a/libdw/ChangeLog b/libdw/ChangeLog
> index a2e4b142a107..93396e404d97 100644
> --- a/libdw/ChangeLog
> +++ b/libdw/ChangeLog
> @@ -1,3 +1,17 @@
> +2013-12-10  Josh Stone  <jistone@redhat.com>
> +
> +	* memory-access.h (get_uleb128_rest_return): Removed.
> +	(get_sleb128_rest_return): Removed

Period.

> +static inline uint64_t
> +__libdw_get_uleb128 (const unsigned char **addrp)
> +{
> +  uint64_t acc = 0;
> +  get_uleb128_step (acc, *addrp, 0);
> +  for (unsigned int i = 1; i < len_leb128(acc); ++i)
> +    get_uleb128_step (acc, *addrp, i);

Is there a reason not to use for (i = 0; ...) instead of the pre-step
followed by a for (i = 1; ...)?  The only difference would be for
len_leb128 of 0, where your code makes one step anyway, but that can't
happen.  Then you could just expand get_uleb128_step inline, and get rid
of it altogether.

Also, the len_leb128 call needs space before (...).

> +  /* Other implementations set VALUE to UINT_MAX in this
> +     case.  So we better do this as well.  */
> +  return UINT64_MAX;
> +}
> +
> +#define get_uleb128(var, addr)						      \
>    do {									      \
> +    (var) = __libdw_get_uleb128(&(addr));				      \
> +    (void)(var);							      \
>    } while (0)

Space before (...).  Space after type cast operator.

Actually, could this actually be defined like this?

        #define get_uleb128(var, addr) ((var) = __libdw_get_uleb128 (&(addr)))

Then you wouldn't need the cast as the only set-but-not-used warnings
would come from sites that actually set but not use, and those could be
changed to simply call __libdw_get_uleb128.  But that's probably a
logically separate change.

> +#define get_sleb128_step(var, addr, nth)				      \
>    do {									      \
> +    unsigned char __b = *(addr)++;					      \
> +    if (likely ((__b & 0x80) == 0))					      \
>        {									      \
> +	   struct { signed int i:7; } __s = { .i = __b };			      \
> +	   (var) |= (typeof (var)) __s.i << ((nth) * 7);			      \

Oh, the bitfield trick is clever!

> +	   return (var);							      \
>        }									      \
> +    (var) |= (typeof (var)) (__b & 0x7f) << ((nth) * 7);		      \
>    } while (0)

>  
>  static inline int64_t
> +__libdw_get_sleb128 (const unsigned char **addrp)
>  {
> +  int64_t acc = 0;
> +  for (unsigned int i = 0; i < len_leb128(acc); ++i)

Space before (...).

> +    get_sleb128_step (acc, *addrp, i);
> +  /* Other implementations set VALUE to INT_MAX in this
> +     case.  So we better do this as well.  */
> +  return INT64_MAX;
>  }
> +
> +#define get_sleb128(var, addr)						      \
> +  do {									      \
> +    (var) = __libdw_get_sleb128(&(addr));				      \
> +    (void)(var);							      \
> +  } while (0)

Same comments as for get_uleb128.

This all looks fine to me.  I find the new code much clearer than the
original, which seemed to wind every which way and was hard to follow.

Thanks,
PM

^ permalink raw reply	[flat|nested] 19+ messages in thread

* [PATCH 2/3] Simplify and inline get_uleb128 and get_sleb128
@ 2013-12-11  1:35 Josh Stone
  0 siblings, 0 replies; 19+ messages in thread
From: Josh Stone @ 2013-12-11  1:35 UTC (permalink / raw)
  To: elfutils-devel

[-- Attachment #1: Type: text/plain, Size: 9031 bytes --]

This removes the IS_LIBDW distinction so LEB128 operations are now
always inlined, and the implementations are simplified, more direct.

Signed-off-by: Josh Stone <jistone@redhat.com>
---
 libdw/ChangeLog       |  14 +++++++
 libdw/Makefile.am     |   3 +-
 libdw/memory-access.c |  50 -----------------------
 libdw/memory-access.h | 108 +++++++++++++++++++-------------------------------
 4 files changed, 56 insertions(+), 119 deletions(-)
 delete mode 100644 libdw/memory-access.c

diff --git a/libdw/ChangeLog b/libdw/ChangeLog
index a2e4b142a107..93396e404d97 100644
--- a/libdw/ChangeLog
+++ b/libdw/ChangeLog
@@ -1,3 +1,17 @@
+2013-12-10  Josh Stone  <jistone@redhat.com>
+
+	* memory-access.h (get_uleb128_rest_return): Removed.
+	(get_sleb128_rest_return): Removed
+	(get_uleb128_step): Make this a self-contained block.
+	(get_sleb128_step): Ditto, and use a bitfield to extend signs.
+	(get_uleb128): Make this wholly implemented by __libdw_get_uleb128.
+	(get_sleb128): Make this wholly implemented by __libdw_get_sleb128.
+	(__libdw_get_uleb128): Simplify and inline for all callers.
+	(__libdw_get_sleb128): Ditto.
+	* memory-access.c: Delete file.
+	* Makefile.am (libdw_a_SOURCES): Remove it.
+	(DEFS): Remove the now unused -DIS_LIBDW.
+
 2013-12-09  Josh Stone  <jistone@redhat.com>
 
 	* libdw_form.c (__libdw_form_val_compute_len): Renamed function from
diff --git a/libdw/Makefile.am b/libdw/Makefile.am
index a22166a99e73..cd9e31435317 100644
--- a/libdw/Makefile.am
+++ b/libdw/Makefile.am
@@ -28,7 +28,6 @@
 ## not, see <http://www.gnu.org/licenses/>.
 ##
 include $(top_srcdir)/config/eu.am
-DEFS += -DIS_LIBDW
 if BUILD_STATIC
 AM_CFLAGS += -fpic
 endif
@@ -79,7 +78,7 @@ libdw_a_SOURCES = dwarf_begin.c dwarf_begin_elf.c dwarf_end.c dwarf_getelf.c \
 		  dwarf_getfuncs.c  \
 		  dwarf_decl_file.c dwarf_decl_line.c dwarf_decl_column.c \
 		  dwarf_func_inline.c dwarf_getsrc_file.c \
-		  libdw_findcu.c libdw_form.c libdw_alloc.c memory-access.c \
+		  libdw_findcu.c libdw_form.c libdw_alloc.c \
 		  libdw_visit_scopes.c \
 		  dwarf_entry_breakpoints.c \
 		  dwarf_next_cfi.c \
diff --git a/libdw/memory-access.c b/libdw/memory-access.c
deleted file mode 100644
index 7666fb60a64c..000000000000
--- a/libdw/memory-access.c
+++ /dev/null
@@ -1,50 +0,0 @@
-/* Out of line functions for memory-access.h macros.
-   Copyright (C) 2005, 2006 Red Hat, Inc.
-   This file is part of elfutils.
-
-   This file is free software; you can redistribute it and/or modify
-   it under the terms of either
-
-     * the GNU Lesser General Public License as published by the Free
-       Software Foundation; either version 3 of the License, or (at
-       your option) any later version
-
-   or
-
-     * the GNU General Public License as published by the Free
-       Software Foundation; either version 2 of the License, or (at
-       your option) any later version
-
-   or both in parallel, as here.
-
-   elfutils is distributed in the hope that it will be useful, but
-   WITHOUT ANY WARRANTY; without even the implied warranty of
-   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
-   General Public License for more details.
-
-   You should have received copies of the GNU General Public License and
-   the GNU Lesser General Public License along with this program.  If
-   not, see <http://www.gnu.org/licenses/>.  */
-
-#ifdef HAVE_CONFIG_H
-# include <config.h>
-#endif
-#include "libdwP.h"
-#include "memory-access.h"
-
-uint64_t
-internal_function
-__libdw_get_uleb128 (uint64_t acc, unsigned int i, const unsigned char **addrp)
-{
-  unsigned char __b;
-  get_uleb128_rest_return (acc, i, addrp);
-}
-
-int64_t
-internal_function
-__libdw_get_sleb128 (int64_t acc, unsigned int i, const unsigned char **addrp)
-{
-  unsigned char __b;
-  int64_t _v = acc;
-  get_sleb128_rest_return (acc, i, addrp);
-}
diff --git a/libdw/memory-access.h b/libdw/memory-access.h
index 16471990dfa0..0aa46c9994b9 100644
--- a/libdw/memory-access.h
+++ b/libdw/memory-access.h
@@ -37,90 +37,64 @@
 
 /* Number decoding macros.  See 7.6 Variable Length Data.  */
 
-#define get_uleb128_step(var, addr, nth, break)				      \
-    __b = *(addr)++;							      \
-    var |= (uintmax_t) (__b & 0x7f) << (nth * 7);			      \
-    if (likely ((__b & 0x80) == 0))					      \
-      break
+#define len_leb128(var) ((8 * sizeof (var) + 6) / 7)
 
-#define get_uleb128(var, addr)						      \
+#define get_uleb128_step(var, addr, nth)				      \
   do {									      \
-    unsigned char __b;							      \
-    var = 0;								      \
-    get_uleb128_step (var, addr, 0, break);				      \
-    var = __libdw_get_uleb128 (var, 1, &(addr));			      \
+    unsigned char __b = *(addr)++;					      \
+    (var) |= (typeof (var)) (__b & 0x7f) << ((nth) * 7);		      \
+    if (likely ((__b & 0x80) == 0))					      \
+      return (var);							      \
   } while (0)
 
-#define get_uleb128_rest_return(var, i, addrp)				      \
+static inline uint64_t
+__libdw_get_uleb128 (const unsigned char **addrp)
+{
+  uint64_t acc = 0;
+  get_uleb128_step (acc, *addrp, 0);
+  for (unsigned int i = 1; i < len_leb128(acc); ++i)
+    get_uleb128_step (acc, *addrp, i);
+  /* Other implementations set VALUE to UINT_MAX in this
+     case.  So we better do this as well.  */
+  return UINT64_MAX;
+}
+
+#define get_uleb128(var, addr)						      \
   do {									      \
-    for (; i < 10; ++i)							      \
-      {									      \
-	get_uleb128_step (var, *addrp, i, return var);			      \
-      }									      \
-    /* Other implementations set VALUE to UINT_MAX in this		      \
-       case.  So we better do this as well.  */				      \
-    return UINT64_MAX;							      \
+    (var) = __libdw_get_uleb128(&(addr));				      \
+    (void)(var);							      \
   } while (0)
 
 /* The signed case is similar, but we sign-extend the result.  */
 
-#define get_sleb128_step(var, addr, nth, break)				      \
-    __b = *(addr)++;							      \
-    _v |= (uint64_t) (__b & 0x7f) << (nth * 7);				      \
-    if (likely ((__b & 0x80) == 0))					      \
-      {									      \
-	var = (_v << (64 - (nth * 7) - 7)) >> (64 - (nth * 7) - 7);	      \
-        break;					 			      \
-      }									      \
-    else do {} while (0)
-
-#define get_sleb128(var, addr)						      \
-  do {									      \
-    unsigned char __b;							      \
-    int64_t _v = 0;							      \
-    get_sleb128_step (var, addr, 0, break);				      \
-    var = __libdw_get_sleb128 (_v, 1, &(addr));				      \
-  } while (0)
-
-#define get_sleb128_rest_return(var, i, addrp)				      \
+#define get_sleb128_step(var, addr, nth)				      \
   do {									      \
-    for (; i < 9; ++i)							      \
+    unsigned char __b = *(addr)++;					      \
+    if (likely ((__b & 0x80) == 0))					      \
       {									      \
-	get_sleb128_step (var, *addrp, i, return var);			      \
+	struct { signed int i:7; } __s = { .i = __b };			      \
+	(var) |= (typeof (var)) __s.i << ((nth) * 7);			      \
+	return (var);							      \
       }									      \
-    __b = *(*addrp)++;							      \
-    if (likely ((__b & 0x80) == 0))					      \
-      return var | ((uint64_t) __b << 63);				      \
-    else								      \
-      /* Other implementations set VALUE to INT_MAX in this		      \
-	 case.  So we better do this as well.  */			      \
-      return INT64_MAX;							      \
+    (var) |= (typeof (var)) (__b & 0x7f) << ((nth) * 7);		      \
   } while (0)
 
-#ifdef IS_LIBDW
-extern uint64_t __libdw_get_uleb128 (uint64_t acc, unsigned int i,
-				     const unsigned char **addrp)
-     internal_function attribute_hidden;
-extern int64_t __libdw_get_sleb128 (int64_t acc, unsigned int i,
-				    const unsigned char **addrp)
-     internal_function attribute_hidden;
-#else
-static inline uint64_t
-__attribute__ ((unused))
-__libdw_get_uleb128 (uint64_t acc, unsigned int i, const unsigned char **addrp)
-{
-  unsigned char __b;
-  get_uleb128_rest_return (acc, i, addrp);
-}
 static inline int64_t
-__attribute__ ((unused))
-__libdw_get_sleb128 (int64_t acc, unsigned int i, const unsigned char **addrp)
+__libdw_get_sleb128 (const unsigned char **addrp)
 {
-  unsigned char __b;
-  int64_t _v = acc;
-  get_sleb128_rest_return (acc, i, addrp);
+  int64_t acc = 0;
+  for (unsigned int i = 0; i < len_leb128(acc); ++i)
+    get_sleb128_step (acc, *addrp, i);
+  /* Other implementations set VALUE to INT_MAX in this
+     case.  So we better do this as well.  */
+  return INT64_MAX;
 }
-#endif
+
+#define get_sleb128(var, addr)						      \
+  do {									      \
+    (var) = __libdw_get_sleb128(&(addr));				      \
+    (void)(var);							      \
+  } while (0)
 
 
 /* We use simple memory access functions in case the hardware allows it.
-- 
1.8.4.2


^ permalink raw reply	[flat|nested] 19+ messages in thread

end of thread, other threads:[~2014-04-24  9:49 UTC | newest]

Thread overview: 19+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2014-04-22 15:01 [PATCH 2/3] Simplify and inline get_uleb128 and get_sleb128 Richard Henderson
  -- strict thread matches above, loose matches on Subject: below --
2014-04-24  9:49 Mark Wielaard
2014-04-23 22:32 Petr Machata
2014-04-23 21:54 Mark Wielaard
2014-04-23 20:29 Petr Machata
2014-04-23 19:01 Josh Stone
2014-04-23 18:51 Richard Henderson
2014-04-23 18:32 Petr Machata
2014-04-23 15:27 Richard Henderson
2014-04-23 10:17 Petr Machata
2014-04-22 22:04 Mark Wielaard
2014-04-22 15:58 Richard Henderson
2014-04-22 15:52 Josh Stone
2014-04-22 15:03 Richard Henderson
2014-04-22 14:55 Mark Wielaard
2013-12-12 22:23 Petr Machata
2013-12-12 19:06 Josh Stone
2013-12-12 12:13 Petr Machata
2013-12-11  1:35 Josh Stone

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).