public inbox for dwz@sourceware.org
 help / color / mirror / Atom feed
* Re: [PATCH] Error out on DW_AT_location with invalid encoding
  2019-01-01  0:00     ` Jakub Jelinek
@ 2019-01-01  0:00       ` Tom de Vries
  2019-01-01  0:00         ` Jakub Jelinek
  2019-01-01  0:00         ` Mark Wielaard
  0 siblings, 2 replies; 7+ messages in thread
From: Tom de Vries @ 2019-01-01  0:00 UTC (permalink / raw)
  To: Jakub Jelinek; +Cc: dwz

On 07-03-19 09:27, Jakub Jelinek wrote:
> On Thu, Mar 07, 2019 at 09:23:19AM +0100, Tom de Vries wrote:
>> On 07-03-19 08:55, Jakub Jelinek wrote:
>>> On Thu, Mar 07, 2019 at 08:22:42AM +0100, Tom de Vries wrote:
>>>> Hi,
>>>>
>>>> When processing a file containing an DW_AT_location with encoding DW_FORM_addr,
>>>
>>> What kind of generator generates that?  Ugh.
>>
>> AFAICT it's the result of generating a .s file from a .c file and then
>> hand-editing the debug info (
>> https://sourceware.org/git/gitweb.cgi?p=binutils-gdb.git;a=blob;f=gdb/testsuite/gdb.dwarf2/dw2-restrict.S;h=7108b86ec109ce5a1bd299e2c36696a12024e0d5;hb=HEAD
>> ):
>> ...
>>         .byte   16                      # DW_AT_stmt_list
>>         .byte   1                       # DW_FORM_addr
>> ...
> 

Hmm, I guessed it was an error introduced by hand-editing, but digging a
little further, it was in fact a generator error (clang 2.9):
https://bugs.llvm.org/show_bug.cgi?id=9995 .

> Is it worth bothering with that then?  Shouldn't just gdb be fixed?
> Or is it intentional that the test tests how it deals with invalid DWARF?
> 

I'm not sure if the test-case should be fixed. Gdb testcases are known
to contain invalid dwarf, although indeed in this case the test-case is
just to test handling of DW_TAG_restrict_type, not invalid dwarf.

> I mean, dwz has over hundred asserts and many of them will fail if fed
> complete garbage.  Do we want to turn them all into errors?

I'd propose we just fix the ones that show up with gdb testsuite (which
are just a couple), because it's a known way to test dwz.

Thanks,
- Tom

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH] Error out on DW_AT_location with invalid encoding
  2019-01-01  0:00 [PATCH] Error out on DW_AT_location with invalid encoding Tom de Vries
@ 2019-01-01  0:00 ` Jakub Jelinek
  2019-01-01  0:00   ` Tom de Vries
  0 siblings, 1 reply; 7+ messages in thread
From: Jakub Jelinek @ 2019-01-01  0:00 UTC (permalink / raw)
  To: Tom de Vries; +Cc: dwz

On Thu, Mar 07, 2019 at 08:22:42AM +0100, Tom de Vries wrote:
> Hi,
> 
> When processing a file containing an DW_AT_location with encoding DW_FORM_addr,

What kind of generator generates that?  Ugh.
Above you mention DW_AT_location (and in ChangeLog too), but the patch
actually handles DW_AT_stmt_list that way.  So, which one it is?

	Jakub

^ permalink raw reply	[flat|nested] 7+ messages in thread

* [PATCH] Error out on DW_AT_location with invalid encoding
@ 2019-01-01  0:00 Tom de Vries
  2019-01-01  0:00 ` Jakub Jelinek
  0 siblings, 1 reply; 7+ messages in thread
From: Tom de Vries @ 2019-01-01  0:00 UTC (permalink / raw)
  To: dwz, jakub

Hi,

When processing a file containing an DW_AT_location with encoding DW_FORM_addr,
we run into this assert in write_die:
...
dwz: dwz.c:9068: write_die: \
  Assertion `p && (form == DW_FORM_sec_offset || form == DW_FORM_data4)' failed
...

Error out instead (and do it earlier, in read_debug_info):
...
$ ./dwz -m 3 1 2
./dwz: 1: DW_AT_stmt_list not DW_FORM_sec_offset or DW_FORM_data4
./dwz: 2: DW_AT_stmt_list not DW_FORM_sec_offset or DW_FORM_data4
./dwz: Too few files for multifile optimization
...

OK for trunk?

Thanks,
- Tom

Error out on DW_AT_location with invalid encoding

2019-02-14  Tom de Vries  <tdevries@suse.de>

	PR dwz/24171
	* dwz.c (get_AT_int): Add and handle formp parameter.
	(read_debug_info): Error out on invalid DW_AT_location encoding.

---
 dwz.c | 19 ++++++++++++++-----
 1 file changed, 14 insertions(+), 5 deletions(-)

diff --git a/dwz.c b/dwz.c
index 4ef8657..79cddb6 100644
--- a/dwz.c
+++ b/dwz.c
@@ -1325,16 +1325,16 @@ get_AT (dw_die_ref die, enum dwarf_attribute at, enum dwarf_form *formp)
 /* Return an integer attribute AT of DIE.  Set *PRESENT to true
    if found.  */
 static uint64_t
-get_AT_int (dw_die_ref die, enum dwarf_attribute at, bool *present)
+get_AT_int (dw_die_ref die, enum dwarf_attribute at, bool *present,
+	    enum dwarf_form *formp)
 {
-  enum dwarf_form form;
   unsigned char *ptr;
-  ptr = get_AT (die, at, &form);
+  ptr = get_AT (die, at, formp);
   *present = false;
   if (ptr == NULL)
     return 0;
   *present = true;
-  switch (form)
+  switch (*formp)
     {
     case DW_FORM_ref_addr:
       return read_size (ptr, die_cu (die)->cu_version == 2 ? ptr_size : 4);
@@ -5000,9 +5000,18 @@ read_debug_info (DSO *dso, int kind)
 	}
 
       cu->cu_comp_dir = get_AT_string (cu->cu_die, DW_AT_comp_dir);
-      debug_line_off = get_AT_int (cu->cu_die, DW_AT_stmt_list, &present);
+      enum dwarf_form form;
+      debug_line_off
+	= get_AT_int (cu->cu_die, DW_AT_stmt_list, &present, &form);
       if (present)
 	{
+	  if (!(form == DW_FORM_sec_offset || form == DW_FORM_data4))
+	    {
+	      error (0, 0, "%s: DW_AT_stmt_list not DW_FORM_sec_offset or"
+		     " DW_FORM_data4", dso->filename);
+	      goto fail;
+	    }
+
 	  if (cu_files != NULL && last_debug_line_off == debug_line_off)
 	    {
 	      cu->cu_nfiles = cu_nfiles;

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH] Error out on DW_AT_location with invalid encoding
  2019-01-01  0:00       ` Tom de Vries
@ 2019-01-01  0:00         ` Jakub Jelinek
  2019-01-01  0:00         ` Mark Wielaard
  1 sibling, 0 replies; 7+ messages in thread
From: Jakub Jelinek @ 2019-01-01  0:00 UTC (permalink / raw)
  To: Tom de Vries; +Cc: dwz

On Thu, Mar 07, 2019 at 09:59:09AM +0100, Tom de Vries wrote:
> >>> What kind of generator generates that?  Ugh.
> >>
> >> AFAICT it's the result of generating a .s file from a .c file and then
> >> hand-editing the debug info (
> >> https://sourceware.org/git/gitweb.cgi?p=binutils-gdb.git;a=blob;f=gdb/testsuite/gdb.dwarf2/dw2-restrict.S;h=7108b86ec109ce5a1bd299e2c36696a12024e0d5;hb=HEAD
> >> ):
> >> ...
> >>         .byte   16                      # DW_AT_stmt_list
> >>         .byte   1                       # DW_FORM_addr
> >> ...
> > 
> 
> Hmm, I guessed it was an error introduced by hand-editing, but digging a
> little further, it was in fact a generator error (clang 2.9):
> https://bugs.llvm.org/show_bug.cgi?id=9995 .

The patch is ok then.

	Jakub

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH] Error out on DW_AT_location with invalid encoding
  2019-01-01  0:00   ` Tom de Vries
@ 2019-01-01  0:00     ` Jakub Jelinek
  2019-01-01  0:00       ` Tom de Vries
  0 siblings, 1 reply; 7+ messages in thread
From: Jakub Jelinek @ 2019-01-01  0:00 UTC (permalink / raw)
  To: Tom de Vries; +Cc: dwz

On Thu, Mar 07, 2019 at 09:23:19AM +0100, Tom de Vries wrote:
> On 07-03-19 08:55, Jakub Jelinek wrote:
> > On Thu, Mar 07, 2019 at 08:22:42AM +0100, Tom de Vries wrote:
> >> Hi,
> >>
> >> When processing a file containing an DW_AT_location with encoding DW_FORM_addr,
> > 
> > What kind of generator generates that?  Ugh.
> 
> AFAICT it's the result of generating a .s file from a .c file and then
> hand-editing the debug info (
> https://sourceware.org/git/gitweb.cgi?p=binutils-gdb.git;a=blob;f=gdb/testsuite/gdb.dwarf2/dw2-restrict.S;h=7108b86ec109ce5a1bd299e2c36696a12024e0d5;hb=HEAD
> ):
> ...
>         .byte   16                      # DW_AT_stmt_list
>         .byte   1                       # DW_FORM_addr
> ...

Is it worth bothering with that then?  Shouldn't just gdb be fixed?
Or is it intentional that the test tests how it deals with invalid DWARF?

I mean, dwz has over hundred asserts and many of them will fail if fed
complete garbage.  Do we want to turn them all into errors?

	Jakub

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH] Error out on DW_AT_location with invalid encoding
  2019-01-01  0:00       ` Tom de Vries
  2019-01-01  0:00         ` Jakub Jelinek
@ 2019-01-01  0:00         ` Mark Wielaard
  1 sibling, 0 replies; 7+ messages in thread
From: Mark Wielaard @ 2019-01-01  0:00 UTC (permalink / raw)
  To: Tom de Vries, Jakub Jelinek; +Cc: dwz

On Thu, 2019-03-07 at 09:59 +0100, Tom de Vries wrote:
> On 07-03-19 09:27, Jakub Jelinek wrote:
> > Is it worth bothering with that then?  Shouldn't just gdb be fixed?
> > Or is it intentional that the test tests how it deals with invalid DWARF?
> > 
> 
> I'm not sure if the test-case should be fixed. Gdb testcases are known
> to contain invalid dwarf, although indeed in this case the test-case is
> just to test handling of DW_TAG_restrict_type, not invalid dwarf.
> 
> > I mean, dwz has over hundred asserts and many of them will fail if fed
> > complete garbage.  Do we want to turn them all into errors?
> 
> I'd propose we just fix the ones that show up with gdb testsuite (which
> are just a couple), because it's a known way to test dwz.

Jakub is right that there are a lot of asserts in the code (128 at the
moment). But I do think it makes sense to clean them all up eventually.

Keep asserts just for internal data invariants but turn anything that
checks ELF/DWARF input into a normal error check. Unfortunately there
will always be bad DWARF out there, intentionally or by accident. An
assert is a terrible way to report an error when dwz is used in some
toolchain to process all files IMHO.

Cheers,

Mark

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH] Error out on DW_AT_location with invalid encoding
  2019-01-01  0:00 ` Jakub Jelinek
@ 2019-01-01  0:00   ` Tom de Vries
  2019-01-01  0:00     ` Jakub Jelinek
  0 siblings, 1 reply; 7+ messages in thread
From: Tom de Vries @ 2019-01-01  0:00 UTC (permalink / raw)
  To: Jakub Jelinek; +Cc: dwz

[-- Attachment #1: Type: text/plain, Size: 835 bytes --]

On 07-03-19 08:55, Jakub Jelinek wrote:
> On Thu, Mar 07, 2019 at 08:22:42AM +0100, Tom de Vries wrote:
>> Hi,
>>
>> When processing a file containing an DW_AT_location with encoding DW_FORM_addr,
> 
> What kind of generator generates that?  Ugh.

AFAICT it's the result of generating a .s file from a .c file and then
hand-editing the debug info (
https://sourceware.org/git/gitweb.cgi?p=binutils-gdb.git;a=blob;f=gdb/testsuite/gdb.dwarf2/dw2-restrict.S;h=7108b86ec109ce5a1bd299e2c36696a12024e0d5;hb=HEAD
):
...
        .byte   16                      # DW_AT_stmt_list
        .byte   1                       # DW_FORM_addr
...

> Above you mention DW_AT_location (and in ChangeLog too), but the patch
> actually handles DW_AT_stmt_list that way.  So, which one it is?

It's DW_AT_stmt_list. Updated patch attached.

Thanks,
- Tom



[-- Attachment #2: 0001-Error-out-on-DW_AT_stmt_list-with-invalid-encoding.patch --]
[-- Type: text/x-patch, Size: 2285 bytes --]

Error out on DW_AT_stmt_list with invalid encoding

When processing a file containing an DW_AT_stmt_list with encoding DW_FORM_addr,
we run into this assert in write_die:
...
dwz: dwz.c:9068: write_die: \
  Assertion `p && (form == DW_FORM_sec_offset || form == DW_FORM_data4)' failed
...

Error out instead (and do it earlier, in read_debug_info):
...
$ ./dwz -m 3 1 2
./dwz: 1: DW_AT_stmt_list not DW_FORM_sec_offset or DW_FORM_data4
./dwz: 2: DW_AT_stmt_list not DW_FORM_sec_offset or DW_FORM_data4
./dwz: Too few files for multifile optimization
...

2019-02-14  Tom de Vries  <tdevries@suse.de>

	PR dwz/24171
	* dwz.c (get_AT_int): Add and handle formp parameter.
	(read_debug_info): Error out on invalid DW_AT_stmt_list encoding.

---
 dwz.c | 19 ++++++++++++++-----
 1 file changed, 14 insertions(+), 5 deletions(-)

diff --git a/dwz.c b/dwz.c
index fc87abe..476807a 100644
--- a/dwz.c
+++ b/dwz.c
@@ -1325,16 +1325,16 @@ get_AT (dw_die_ref die, enum dwarf_attribute at, enum dwarf_form *formp)
 /* Return an integer attribute AT of DIE.  Set *PRESENT to true
    if found.  */
 static uint64_t
-get_AT_int (dw_die_ref die, enum dwarf_attribute at, bool *present)
+get_AT_int (dw_die_ref die, enum dwarf_attribute at, bool *present,
+	    enum dwarf_form *formp)
 {
-  enum dwarf_form form;
   unsigned char *ptr;
-  ptr = get_AT (die, at, &form);
+  ptr = get_AT (die, at, formp);
   *present = false;
   if (ptr == NULL)
     return 0;
   *present = true;
-  switch (form)
+  switch (*formp)
     {
     case DW_FORM_ref_addr:
       return read_size (ptr, die_cu (die)->cu_version == 2 ? ptr_size : 4);
@@ -5000,9 +5000,18 @@ read_debug_info (DSO *dso, int kind)
 	}
 
       cu->cu_comp_dir = get_AT_string (cu->cu_die, DW_AT_comp_dir);
-      debug_line_off = get_AT_int (cu->cu_die, DW_AT_stmt_list, &present);
+      enum dwarf_form form;
+      debug_line_off
+	= get_AT_int (cu->cu_die, DW_AT_stmt_list, &present, &form);
       if (present)
 	{
+	  if (!(form == DW_FORM_sec_offset || form == DW_FORM_data4))
+	    {
+	      error (0, 0, "%s: DW_AT_stmt_list not DW_FORM_sec_offset or"
+		     " DW_FORM_data4", dso->filename);
+	      goto fail;
+	    }
+
 	  if (cu_files != NULL && last_debug_line_off == debug_line_off)
 	    {
 	      cu->cu_nfiles = cu_nfiles;

^ permalink raw reply	[flat|nested] 7+ messages in thread

end of thread, other threads:[~2019-03-07  9:19 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-01-01  0:00 [PATCH] Error out on DW_AT_location with invalid encoding Tom de Vries
2019-01-01  0:00 ` Jakub Jelinek
2019-01-01  0:00   ` Tom de Vries
2019-01-01  0:00     ` Jakub Jelinek
2019-01-01  0:00       ` Tom de Vries
2019-01-01  0:00         ` Jakub Jelinek
2019-01-01  0:00         ` Mark Wielaard

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).