public inbox for elfutils@sourceware.org
 help / color / mirror / Atom feed
* Problems with dwarf-getmacros test
@ 2017-05-09 14:11 Ulf Hermann
  2017-05-09 15:11 ` Mark Wielaard
  0 siblings, 1 reply; 5+ messages in thread
From: Ulf Hermann @ 2017-05-09 14:11 UTC (permalink / raw)
  To: elfutils-devel

Hi,

I frequently get failures from the run-dwarf-getmacros.sh test, on 
testfile-macros:0xb. The test repeatedly outputs "(null)" instead of the 
actual macros and then runs into the assert at dwarf-getmacros.c:50. The 
failure is very nondeterministic, though. I haven't found a reliable way 
to trigger it.

Further examination reveals that the __libdw_in_section check in 
READ_AND_RELOCATE (libdwP.h:656), when called from __libdw_read_offset 
seems to be bogus. The "return -1" in there is what produces the null 
results and ultimately the assert. Experiments show that the address is 
frequently not in the section we're checking there, but still valid. 
Just dropping the check makes the test succeed.

I'm currently at a loss about why this happens. One thing that strikes 
me was that the additional dbg_ret mechamism was added in 2012 with 
commit 775375e3, but the check in READ_AND_RELOCATE was not adapted then.

However, the address is also not necessarily in dbg_ret at that point. 
Checking dbg_ret in addition to dbg still fails sometimes, and also that 
wouldn't explain the nondeterminism.

Ulf

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: Problems with dwarf-getmacros test
  2017-05-09 14:11 Problems with dwarf-getmacros test Ulf Hermann
@ 2017-05-09 15:11 ` Mark Wielaard
  2017-05-09 16:28   ` Ulf Hermann
  0 siblings, 1 reply; 5+ messages in thread
From: Mark Wielaard @ 2017-05-09 15:11 UTC (permalink / raw)
  To: Ulf Hermann; +Cc: elfutils-devel

On Mon, 2017-05-08 at 18:22 +0200, Ulf Hermann wrote:
> I frequently get failures from the run-dwarf-getmacros.sh test, on 
> testfile-macros:0xb. The test repeatedly outputs "(null)" instead of the 
> actual macros and then runs into the assert at dwarf-getmacros.c:50. The 
> failure is very nondeterministic, though. I haven't found a reliable way 
> to trigger it.

Is it only with testfile-macros?
All other testfiles always run correctly?

> Further examination reveals that the __libdw_in_section check in 
> READ_AND_RELOCATE (libdwP.h:656), when called from __libdw_read_offset 
> seems to be bogus. The "return -1" in there is what produces the null 
> results and ultimately the assert.

Do you have the whole call stack of that failed __libdw_read_offset
call? Which source line in tests/dwarf-getmacros.c prints the "(null)"?

> Experiments show that the address is 
> frequently not in the section we're checking there, but still valid. 
> Just dropping the check makes the test succeed.

I think this might be related to our "fake" CU and attributes we invent
in libdw/dwarf_getmacros.c (read_macros). See around this comment:

          /* We pretend this is a DW_AT_GNU_macros attribute so that
             DW_FORM_sec_offset forms get correctly interpreted as
             offset into .debug_macro.  */

If that is the issue then we might need to somehow make
READ_AND_RELOCATE and/or __libdw_in_section aware that the CU is fake
and the check isn't needed. In which case we probably need to add some
flag "fake" to the CU and pass the CU to the various __libdw_read_*
functions to disable that sanity check in READ_AND_RELOCATE.

> I'm currently at a loss about why this happens. One thing that strikes 
> me was that the additional dbg_ret mechamism was added in 2012 with 
> commit 775375e3, but the check in READ_AND_RELOCATE was not adapted then.

The READ_AND_RELOCATE macro is hard to read because it captures the
names of some of the variables it uses instead of getting them passed as
arguments. It took me a couple of times to double check what it does
seems correct. The check in READ_AND_RELOCATE is against (dbg,
sec_index, addr) checking that the address is inside the section for the
dbg Dwarf. While the __libdw_read_offset (dbg_ret, sec_ret, *ret) is for
the returned valued in the returned section in the returned Dwarf.

> However, the address is also not necessarily in dbg_ret at that point. 
> Checking dbg_ret in addition to dbg still fails sometimes, and also that 
> wouldn't explain the nondeterminism.

The nondeterminism is weird indeed.

Cheers,

Mark

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: Problems with dwarf-getmacros test
  2017-05-09 15:11 ` Mark Wielaard
@ 2017-05-09 16:28   ` Ulf Hermann
  2017-05-09 16:28     ` Ulf Hermann
  0 siblings, 1 reply; 5+ messages in thread
From: Ulf Hermann @ 2017-05-09 16:28 UTC (permalink / raw)
  To: Mark Wielaard; +Cc: elfutils-devel

> Is it only with testfile-macros?
> All other testfiles always run correctly?

Yes, it only happens with the testfile-macros check at 0xb.

> Do you have the whole call stack of that failed __libdw_read_offset
> call? Which source line in tests/dwarf-getmacros.c prints the "(null)"?

I'm having a hard time triggering the problem in a controlled environment. So far I didn't manage to create a stack trace. The (null) is the result of passing a value of null to dwarf-getmacros.c:88. I get an errno of DWARF_E_INVALID_OFFSET at the assert failure. So I put some debug messages into all places where that is generated and this is how I found the source of this. It seems I can't trigger the effect if I put the debug output directly into __libdw_in_section (but maybe I was just very unlucky). If I put it in READ_AND_RELOCATE I can see that the check always fails before it outputs (null) or hits the assertion.

>> Experiments show that the address is 
>> frequently not in the section we're checking there, but still valid. 
>> Just dropping the check makes the test succeed.
> 
> I think this might be related to our "fake" CU and attributes we invent
> in libdw/dwarf_getmacros.c (read_macros). See around this comment:
> 
>           /* We pretend this is a DW_AT_GNU_macros attribute so that
>              DW_FORM_sec_offset forms get correctly interpreted as
>              offset into .debug_macro.  */
> 
> If that is the issue then we might need to somehow make
> READ_AND_RELOCATE and/or __libdw_in_section aware that the CU is fake
> and the check isn't needed. In which case we probably need to add some
> flag "fake" to the CU and pass the CU to the various __libdw_read_*
> functions to disable that sanity check in READ_AND_RELOCATE.

We don't set type_offset on the fake_cu, but we read it via cu_sec_idx when calling __libdw_read_offset from dwarf_formstring. This seems wrong.

Ulf

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: Problems with dwarf-getmacros test
  2017-05-09 16:28   ` Ulf Hermann
@ 2017-05-09 16:28     ` Ulf Hermann
  2017-05-10 17:09       ` Ulf Hermann
  0 siblings, 1 reply; 5+ messages in thread
From: Ulf Hermann @ 2017-05-09 16:28 UTC (permalink / raw)
  To: elfutils-devel

>> Do you have the whole call stack of that failed __libdw_read_offset
>> call? Which source line in tests/dwarf-getmacros.c prints the "(null)"?

Actually I just managed to catch a backtrace by inserting an infinite 
loop at the point where it would normally return -1. It turns out the 
address we are looking for is not in the IDX_debug_info section, but in 
the IDX_debug_macro section (which by itself isn't all that surprising, 
but if that is a rule, how can this code ever work??).

Thread 1 (Thread 220048.0x35af0):
#0  0x0000000070ec311e in __libdw_read_offset (sec_ret=8, size=1, 
ret=<synthetic pointer>, width=4, addr=0xe90089 ".\027", sec_index=0, 
dbg_ret=<optimized out>, dbg=<optimized out>) at 
D:/qtcreator-super-master/elfutils/libdw/libdwP.h:712
         orig_addr = <optimized out>
#1  dwarf_formstring (attrp=attrp@entry=0x6af950) at 
D:/qtcreator-super-master/elfutils/libdw/dwarf_formstring.c:69
         dbg = <optimized out>
         dbg_ret = <optimized out>
         off = <optimized out>
#2  0x0000000070ecc709 in dwarf_macro_param2 
(macro=macro@entry=0x6afaa0, paramp=paramp@entry=0x0, 
strp=strp@entry=0x6af9f0) at 
D:/qtcreator-super-master/elfutils/libdw/dwarf_macro_param2.c:50
         param = {code = 8473, form = 14, valp = 0xe90089 ".\027", cu = 
0x6afac0}
#3  0x0000000000401784 in mac (macro=0x6afaa0, dbg=0xf926f0) at 
D:/qtcreator-super-master/elfutils/tests/dwarf-getmacros.c:87
         value = 0xf927b8 "p?ù"
         level = 0
         opcode = 5
#4  0x0000000070ecbf1d in read_macros (dbg=dbg@entry=0xf926f0, 
sec_index=sec_index@entry=10, macoff=macoff@entry=0, callback=<optimized 
out>, callback@entry=0x4015e0 <mac>, arg=<optimized out>, 
offset=<optimized out>, offset@entry=0, accept_0xff=<optimized out>, 
accept_0xff@entry=false, cudie=<optimized out>, cudie@entry=0x6afe10) at 
D:/qtcreator-super-master/elfutils/libdw/dwarf_getmacros.c:409
         opcode = <optimized out>
         idx = <optimized out>
         attributesp = <optimized out>
         nattributes = {{code = 8473, form = 15, valp = 0xe90088 
"\001.\027", cu = 0x6afac0}, {code = 8473, form = 14, valp = 0xe90089 
".\027", cu = 0x6afac0}, {code = 0, form = 0, valp = 0x0, cu = 0x0}, 
{code = 0, form = 0, valp = 0x0, cu = 0x0}, {code = 0, form = 0, valp = 
0x0, cu = 0x0}, {code = 0, form = 0, valp = 0x0, cu = 0x0}, {code = 0, 
form = 0, valp = 0x0, cu = 0x0}, {code = 0, form = 0, valp = 0x0, cu = 0x0}}
         fake_cu = {dbg = 0xf926f0, start = 0, end = 0, address_size = 0 
'\000', offset_size = 4 '\004', version = 4, type_offset = 0, type_sig8 
= 0, abbrev_hash = {size = 0, filled = 0, table = 0x0}, 
orig_abbrev_offset = 0, last_abbrev_offset = 0, lines = 0x0, files = 
0x0, locs = 0x0, startp = 0xe90087, endp = 0xe90ac4}
         attributes = <optimized out>
         macro = {table = 0xf928b0, attributes = 0x6afb40, opcode = 5 
'\005'}
         res = <optimized out>
         d = <optimized out>
         startp = <optimized out>
         endp = <optimized out>
         readp = 0xe9008d "\005\001î!"
#5  0x0000000070ecc302 in gnu_macros_getmacros_off (dbg=0xf926f0, 
macoff=0, callback=callback@entry=0x4015e0 <mac>, 
arg=arg@entry=0xf926f0, offset=offset@entry=0, 
accept_0xff=accept_0xff@entry=false, cudie=cudie@entry=0x6afe10) at 
D:/qtcreator-super-master/elfutils/libdw/dwarf_getmacros.c:478
No locals.
#6  0x0000000070ecc502 in dwarf_getmacros (cudie=cudie@entry=0x6afe10, 
callback=callback@entry=0x4015e0 <mac>, arg=arg@entry=0xf926f0, token=0) 
at D:/qtcreator-super-master/elfutils/libdw/dwarf_getmacros.c:564
         macoff = 0
         accept_0xff = false
         offset = <optimized out>
#7  0x0000000000409a36 in main (argc=3, argv=<optimized out>) at 
D:/qtcreator-super-master/elfutils/tests/dwarf-getmacros.c:133
         off = <optimized out>
         name = <optimized out>
         cuoff = <optimized out>
         new_style = <optimized out>
         fd = <optimized out>
         dbg = 0xf926f0
         cudie_mem = {addr = 0xf93a5b, cu = 0xf92810, abbrev = 0xf92890, 
padding__ = 0}
         cudie = 0x6afe10

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: Problems with dwarf-getmacros test
  2017-05-09 16:28     ` Ulf Hermann
@ 2017-05-10 17:09       ` Ulf Hermann
  0 siblings, 0 replies; 5+ messages in thread
From: Ulf Hermann @ 2017-05-10 17:09 UTC (permalink / raw)
  To: elfutils-devel; +Cc: Mark Wielaard

>>> Do you have the whole call stack of that failed __libdw_read_offset
>>> call? Which source line in tests/dwarf-getmacros.c prints the "(null)"?
>
> Actually I just managed to catch a backtrace by inserting an infinite
> loop at the point where it would normally return -1. It turns out the
> address we are looking for is not in the IDX_debug_info section, but in
> the IDX_debug_macro section (which by itself isn't all that surprising,
> but if that is a rule, how can this code ever work??).

The likely solution of the mystery is this piece of code in 
__libdw_in_section:

if (unlikely (addr < data->d_buf)
       || unlikely (data->d_size - (addr - data->d_buf) < size))
     {
       __libdw_seterrno (DWARF_E_INVALID_OFFSET);
       return false;
     }

If addr < data->d_buf we fail immediately. That's what's happening to me 
sometimes. However, if addr > data->d_buf, then it's likely much bigger, 
as the allocations of the different sections are unrelated. data->d_size 
is always 159 in this test case. Therefore, data->d_size - (addr - 
data->d_buf) will produce something negative, which then gets compared 
to a size_t, forcing it to overflow. size is 4, so this case never gets 
detected.

Ulf

^ permalink raw reply	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2017-05-09 16:05 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-05-09 14:11 Problems with dwarf-getmacros test Ulf Hermann
2017-05-09 15:11 ` Mark Wielaard
2017-05-09 16:28   ` Ulf Hermann
2017-05-09 16:28     ` Ulf Hermann
2017-05-10 17:09       ` Ulf Hermann

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).