public inbox for elfutils@sourceware.org
 help / color / mirror / Atom feed
* [PATCH][v2] Support .debug_macro
@ 2014-09-12 18:03 Petr Machata
  0 siblings, 0 replies; 8+ messages in thread
From: Petr Machata @ 2014-09-12 18:03 UTC (permalink / raw)
  To: elfutils-devel

[-- Attachment #1: Type: text/plain, Size: 37496 bytes --]

Hi there,

in comparison with the previous version, this doesn't invent custom
bounds-checking primitives and just does what the rest of libdw does.
It also uses the read_addr_unaligned_inc that I introduced in the
other patch that I posted for review.

Thanks,
PM

---8<-------------------------------------------------------------
- This code is based on the following proposal:
	http://www.dwarfstd.org/ShowIssue.php?issue=110722.1

- dwarf_getmacros is left to support only .debug_macinfo.  The only
  problematic opcode really is DW_MACINFO_vendor_ext, that the new
  standard lacks, and that could in theory be misinterpreted by extant
  code as whatever operand a vendor decides to map to 0xff.

- Instead, two principial iteration interfaces are provided for access
  to new-style sections: dwarf_getmacros_die and dwarf_getmacros_off.
  And a suite of new helper interfaces for analysis of individual
  opcodes: dwarf_macro_version, dwarf_macro_line_offset,
  dwarf_macro_getparamcnt, dwarf_macro_param.

- The existing interfaces dwarf_macro_opcode, dwarf_macro_param1 and
  dwarf_macro_param2 remain operational for old- as well as new-style
  Dwarf macro sections, if applicable.

Signed-off-by: Petr Machata <pmachata@redhat.com>
---
 libdw/ChangeLog                 |   32 +++
 libdw/Makefile.am               |    8 +-
 libdw/dwarf_end.c               |    1 +
 libdw/dwarf_error.c             |    3 +-
 libdw/dwarf_getmacros.c         |  496 ++++++++++++++++++++++++++++++++-------
 libdw/dwarf_macro_getparamcnt.c |   43 ++++
 libdw/dwarf_macro_line_offset.c |   45 ++++
 libdw/dwarf_macro_param.c       |   46 ++++
 libdw/dwarf_macro_param1.c      |    8 +-
 libdw/dwarf_macro_param2.c      |   18 +-
 libdw/dwarf_macro_version.c     |   44 ++++
 libdw/libdw.h                   |   87 +++++++-
 libdw/libdw.map                 |   11 +-
 libdw/libdwP.h                  |   54 ++++-
 14 files changed, 785 insertions(+), 111 deletions(-)
 create mode 100644 libdw/dwarf_macro_getparamcnt.c
 create mode 100644 libdw/dwarf_macro_line_offset.c
 create mode 100644 libdw/dwarf_macro_param.c
 create mode 100644 libdw/dwarf_macro_version.c

diff --git a/libdw/ChangeLog b/libdw/ChangeLog
index 410b31a..a9d2166 100644
--- a/libdw/ChangeLog
+++ b/libdw/ChangeLog
@@ -1,3 +1,35 @@
+2014-09-10  Petr Machata  <pmachata@redhat.com>
+
+	* dwarf_macro_getparamcnt.c: New file.
+	* dwarf_macro_line_offset.c: New file.
+	* dwarf_macro_param.c: New file.
+	* dwarf_macro_version.c: New file.
+	* Makefile.am (libdw_a_SOURCES): Add the new files.
+	* libdwP.h (DWARF_E_INVALID_OPCODE): New enumerator.
+	(struct Dwarf): New field macro_ops.
+	(Dwarf_Macro_Op_Proto, Dwarf_Macro_Op_Table): New structures for
+	keeping macro opcode prototypes in.
+	(Dwarf_Macro_s): Drop all fields, add new ones.
+	* dwarf_error.c (errmsgs): Add a message for
+	DWARF_E_INVALID_OPCODE.
+	* dwarf_end.c (dwarf_end): Destroy it here.
+	* libdw.h (dwarf_getmacros_die, dwarf_getmacros_off)
+	(dwarf_macro_version, dwarf_macro_line_offset)
+	(dwarf_macro_getparamcnt)
+	(dwarf_macro_param): New public	interfaces.
+	* dwarf_getmacros.c: Implement them here.
+	(get_offset_from, macro_op_compare, build_table)
+	(init_macinfo_table, get_macinfo_table, get_table_for_offset)
+	(read_macros, gnu_macros_getmacros_off)
+	(macro_info_getmacros_off): Helpers.
+	(dwarf_getmacros): Adjust to dispatch to the new interfaces.
+	* dwarf_macro_param1.c (dwarf_macro_param1): Adjust to dispatch to
+	the new interfaces.
+	* dwarf_macro_param2.c (dwarf_macro_param2): Likewise.
+	* libdw.map (ELFUTILS_0.161): New. Add dwarf_getmacros_die,
+	dwarf_getmacros_off, dwarf_macro_version, dwarf_macro_getparamcnt,
+	dwarf_macro_param.
+
 2014-09-12  Petr Machata  <pmachata@redhat.com>
 
 	* memory-access.h (read_ubyte_unaligned_inc): Allow only 4- and
diff --git a/libdw/Makefile.am b/libdw/Makefile.am
index 2e42a37..51b6a0c 100644
--- a/libdw/Makefile.am
+++ b/libdw/Makefile.am
@@ -71,9 +71,11 @@ libdw_a_SOURCES = dwarf_begin.c dwarf_begin_elf.c dwarf_end.c dwarf_getelf.c \
 		  dwarf_getlocation.c dwarf_getstring.c dwarf_offabbrev.c \
 		  dwarf_getaranges.c dwarf_onearange.c dwarf_getarangeinfo.c \
 		  dwarf_getarange_addr.c dwarf_getattrs.c dwarf_formflag.c \
-		  dwarf_getmacros.c dwarf_macro_opcode.c dwarf_macro_param1.c \
-		  dwarf_macro_param2.c dwarf_addrdie.c \
-		  dwarf_getfuncs.c  \
+		  dwarf_getmacros.c dwarf_macro_getparamcnt.c \
+		  dwarf_macro_line_offset.c dwarf_macro_opcode.c \
+		  dwarf_macro_param.c dwarf_macro_param1.c \
+		  dwarf_macro_param2.c dwarf_macro_version.c \
+		  dwarf_addrdie.c dwarf_getfuncs.c \
 		  dwarf_decl_file.c dwarf_decl_line.c dwarf_decl_column.c \
 		  dwarf_func_inline.c dwarf_getsrc_file.c \
 		  libdw_findcu.c libdw_form.c libdw_alloc.c \
diff --git a/libdw/dwarf_end.c b/libdw/dwarf_end.c
index 241a257..26c38cb 100644
--- a/libdw/dwarf_end.c
+++ b/libdw/dwarf_end.c
@@ -92,6 +92,7 @@ dwarf_end (dwarf)
 	 to be handled.  */
       tdestroy (dwarf->cu_tree, cu_free);
       tdestroy (dwarf->tu_tree, cu_free);
+      tdestroy (dwarf->macro_ops, noop_free);
 
       struct libdw_memblock *memp = dwarf->mem_tail;
       /* The first block is allocated together with the Dwarf object.  */
diff --git a/libdw/dwarf_error.c b/libdw/dwarf_error.c
index 2292914..08b691a 100644
--- a/libdw/dwarf_error.c
+++ b/libdw/dwarf_error.c
@@ -1,5 +1,5 @@
 /* Retrieve ELF descriptor used for DWARF access.
-   Copyright (C) 2002, 2003, 2004, 2005, 2009 Red Hat, Inc.
+   Copyright (C) 2002, 2003, 2004, 2005, 2009, 2014 Red Hat, Inc.
    This file is part of elfutils.
    Written by Ulrich Drepper <drepper@redhat.com>, 2002.
 
@@ -92,6 +92,7 @@ static const char *errmsgs[] =
     [DWARF_E_NO_DEBUG_RANGES] = N_(".debug_ranges section missing"),
     [DWARF_E_INVALID_CFI] = N_("invalid CFI section"),
     [DWARF_E_NO_ALT_DEBUGLINK] = N_("no alternative debug link found"),
+    [DWARF_E_INVALID_OPCODE] = N_("invalid opcode"),
   };
 #define nerrmsgs (sizeof (errmsgs) / sizeof (errmsgs[0]))
 
diff --git a/libdw/dwarf_getmacros.c b/libdw/dwarf_getmacros.c
index a3d2c81..52ba512 100644
--- a/libdw/dwarf_getmacros.c
+++ b/libdw/dwarf_getmacros.c
@@ -1,5 +1,5 @@
 /* Get macro information.
-   Copyright (C) 2002-2009 Red Hat, Inc.
+   Copyright (C) 2002-2009, 2014 Red Hat, Inc.
    This file is part of elfutils.
    Written by Ulrich Drepper <drepper@redhat.com>, 2002.
 
@@ -33,112 +33,446 @@
 
 #include <dwarf.h>
 #include <string.h>
+#include <search.h>
 
 #include <libdwP.h>
 
+#include <stdio.h>
+#include <assert.h>
+#include <inttypes.h>
+#include <stdlib.h>
 
-ptrdiff_t
-dwarf_getmacros (die, callback, arg, offset)
-     Dwarf_Die *die;
-     int (*callback) (Dwarf_Macro *, void *);
-     void *arg;
-     ptrdiff_t offset;
+static int
+get_offset_from (Dwarf_Die *die, int name, Dwarf_Word *retp)
 {
-  if (die == NULL)
+  /* Get the appropriate attribute.  */
+  Dwarf_Attribute attr;
+  if (INTUSE(dwarf_attr) (die, name, &attr) == NULL)
+    return -1;
+
+  /* Offset into the corresponding section.  */
+  return INTUSE(dwarf_formudata) (&attr, retp);
+}
+
+static int
+macro_op_compare (const void *p1, const void *p2)
+{
+  const Dwarf_Macro_Op_Table *t1 = (const Dwarf_Macro_Op_Table *) p1;
+  const Dwarf_Macro_Op_Table *t2 = (const Dwarf_Macro_Op_Table *) p2;
+
+  if (t1->offset < t2->offset)
     return -1;
+  if (t1->offset > t2->offset + t2->read)
+    return 1;
+
+  return 0;
+}
+
+static void
+build_table (Dwarf_Macro_Op_Table *table,
+	     Dwarf_Macro_Op_Proto op_protos[static 255])
+{
+  unsigned ct = 0;
+  for (unsigned i = 1; i < 256; ++i)
+    if (op_protos[i - 1].forms != NULL)
+      table->table[table->opcodes[i - 1] = ct++] = op_protos[i - 1];
+    else
+      table->opcodes[i] = 0xff;
+}
+
+#define MACRO_PROTO(NAME, ...)					\
+  Dwarf_Macro_Op_Proto NAME = ({				\
+      static const uint8_t proto[] = {__VA_ARGS__};		\
+      (Dwarf_Macro_Op_Proto) {sizeof proto, proto};		\
+    })
+
+static Dwarf_Macro_Op_Table *
+init_macinfo_table (void)
+{
+  MACRO_PROTO (p_udata_str, DW_FORM_udata, DW_FORM_string);
+  MACRO_PROTO (p_udata_udata, DW_FORM_udata, DW_FORM_udata);
+  MACRO_PROTO (p_none);
+
+  Dwarf_Macro_Op_Proto op_protos[255] =
+    {
+      [DW_MACINFO_define - 1] = p_udata_str,
+      [DW_MACINFO_undef - 1] = p_udata_str,
+      [DW_MACINFO_vendor_ext - 1] = p_udata_str,
+      [DW_MACINFO_start_file - 1] = p_udata_udata,
+      [DW_MACINFO_end_file - 1] = p_none,
+    };
+
+  static Dwarf_Macro_Op_Table table;
+  memset (&table, 0, sizeof table);
+
+  build_table (&table, op_protos);
+  return &table;
+}
+
+static inline Dwarf_Macro_Op_Table *
+get_macinfo_table (void)
+{
+  static Dwarf_Macro_Op_Table *ret = NULL;
+  if (unlikely (ret == NULL))
+    ret = init_macinfo_table ();
+  return ret;
+}
+
+static Dwarf_Macro_Op_Table *
+get_table_for_offset (Dwarf *dbg, Dwarf_Word macoff,
+		      const unsigned char *readp,
+		      const unsigned char *const endp)
+{
+  Dwarf_Macro_Op_Table fake = { .offset = macoff };
+  Dwarf_Macro_Op_Table **found = tfind (&fake, &dbg->macro_ops,
+					macro_op_compare);
+  if (found != NULL)
+    return *found;
+
+  const unsigned char *startp = readp;
+
+  /* Request at least 3 bytes for header.  */
+  if (readp + 3 > endp)
+    {
+    invalid_dwarf:
+      __libdw_seterrno (DWARF_E_INVALID_DWARF);
+      return NULL;
+    }
 
-  Elf_Data *d = die->cu->dbg->sectiondata[IDX_debug_macinfo];
-  if (unlikely (d == NULL) || unlikely (d->d_buf == NULL))
+  uint16_t version = read_2ubyte_unaligned_inc (dbg, readp);
+  if (version != 4)
+    {
+      __libdw_seterrno (DWARF_E_INVALID_VERSION);
+      return NULL;
+    }
+
+  uint8_t flags = *readp++;
+  bool is_64bit = (flags & 0x1) != 0;
+
+  Dwarf_Off line_offset = (Dwarf_Off) -1;
+  if ((flags & 0x2) != 0)
+    {
+      line_offset = read_addr_unaligned_inc (is_64bit ? 8 : 4, dbg, readp);
+      if (readp > endp)
+	goto invalid_dwarf;
+    }
+
+  /* """The macinfo entry types defined in this standard may, but
+     might not, be described in the table""".
+
+     I.e. these may be present.  It's tempting to simply skip them,
+     but it's probably more correct to tolerate that a producer tweaks
+     the way certain opcodes are encoded, for whatever reasons.  */
+
+  MACRO_PROTO (p_udata_str, DW_FORM_udata, DW_FORM_string);
+  MACRO_PROTO (p_udata_strp, DW_FORM_udata, DW_FORM_strp);
+  MACRO_PROTO (p_udata_udata, DW_FORM_udata, DW_FORM_udata);
+  MACRO_PROTO (p_secoffset, DW_FORM_sec_offset);
+  MACRO_PROTO (p_none);
+
+  Dwarf_Macro_Op_Proto op_protos[255] =
+    {
+      [DW_MACRO_GNU_define - 1] = p_udata_str,
+      [DW_MACRO_GNU_undef - 1] = p_udata_str,
+      [DW_MACRO_GNU_define_indirect - 1] = p_udata_strp,
+      [DW_MACRO_GNU_undef_indirect - 1] = p_udata_strp,
+      [DW_MACRO_GNU_start_file - 1] = p_udata_udata,
+      [DW_MACRO_GNU_end_file - 1] = p_none,
+      [DW_MACRO_GNU_transparent_include - 1] = p_secoffset,
+      /* N.B. DW_MACRO_undef_indirectx, DW_MACRO_define_indirectx
+	 should be added when 130313.1 is supported.  */
+    };
+
+  if ((flags & 0x4) != 0)
+    {
+      unsigned count = *readp++;
+      for (unsigned i = 0; i < count; ++i)
+	{
+	  unsigned opcode = *readp++;
+
+	  Dwarf_Macro_Op_Proto e;
+	  get_uleb128 (e.nforms, readp); // XXX checking
+	  e.forms = readp;
+	  op_protos[opcode] = e;
+
+	  readp += e.nforms;
+	  if (readp > endp)
+	    {
+	      __libdw_seterrno (DWARF_E_INVALID_DWARF);
+	      return NULL;
+	    }
+	}
+    }
+
+  size_t ct = 0;
+  for (unsigned i = 1; i < 256; ++i)
+    if (op_protos[i - 1].forms != NULL)
+      ++ct;
+
+  /* We support at most 0xfe opcodes defined in the table, as 0xff is
+     a value that means that given opcode is not stored at all.  But
+     that should be fine, as opcode 0 is not allocated.  */
+  assert (ct < 0xff);
+
+  size_t macop_table_size = offsetof (Dwarf_Macro_Op_Table, table[ct]);
+
+  Dwarf_Macro_Op_Table *table = libdw_alloc (dbg, Dwarf_Macro_Op_Table,
+					     macop_table_size, 1);
+
+  *table = (Dwarf_Macro_Op_Table) {
+    .offset = macoff,
+    .line_offset = line_offset,
+    .header_len = readp - startp,
+    .version = version,
+    .is_64bit = is_64bit,
+  };
+  build_table (table, op_protos);
+
+  Dwarf_Macro_Op_Table **ret = tsearch (table, &dbg->macro_ops,
+					macro_op_compare);
+  if (unlikely (ret == NULL))
+    {
+      __libdw_seterrno (DWARF_E_NOMEM);
+      return NULL;
+    }
+
+  return *ret;
+}
+
+static ptrdiff_t
+read_macros (Dwarf *dbg, Dwarf_Macro_Op_Table *table, int secindex,
+	     Dwarf_Off macoff, int (*callback) (Dwarf_Macro *, void *),
+	     void *arg, ptrdiff_t offset)
+{
+  Elf_Data *d = dbg->sectiondata[secindex];
+  if (unlikely (d == NULL || d->d_buf == NULL))
     {
       __libdw_seterrno (DWARF_E_NO_ENTRY);
       return -1;
     }
 
-  if (offset == 0)
+  if (unlikely (macoff >= d->d_size))
     {
-      /* Get the appropriate attribute.  */
-      Dwarf_Attribute attr;
-      if (INTUSE(dwarf_attr) (die, DW_AT_macro_info, &attr) == NULL)
-	return -1;
+      __libdw_seterrno (DWARF_E_INVALID_DWARF);
+      return -1;
+    }
 
-      /* Offset into the .debug_macinfo section.  */
-      Dwarf_Word macoff;
-      if (INTUSE(dwarf_formudata) (&attr, &macoff) != 0)
-	return -1;
+  const unsigned char *const startp = d->d_buf + macoff;
+  const unsigned char *const endp = d->d_buf + d->d_size;
 
-      offset = macoff;
+  if (table == NULL)
+    {
+      table = get_table_for_offset (dbg, macoff, startp, endp);
+      if (table == NULL)
+	return -1;
     }
-  if (unlikely (offset > (ptrdiff_t) d->d_size))
-    goto invalid;
 
-  const unsigned char *readp = d->d_buf + offset;
-  const unsigned char *readendp = d->d_buf + d->d_size;
+  if (offset == 0)
+    offset = table->header_len;
 
-  if (readp == readendp)
-    return 0;
+  assert (offset >= 0);
+  assert (offset < endp - startp);
+  const unsigned char *readp = startp + offset;
 
-  while (readp < readendp)
+  while (readp < endp)
     {
       unsigned int opcode = *readp++;
-      unsigned int u128;
-      unsigned int u128_2 = 0;
-      const char *str = NULL;
-      const unsigned char *endp;
+      if (opcode == 0)
+	/* Nothing more to do.  */
+	return 0;
+
+      unsigned int idx = table->opcodes[opcode - 1];
+      if (idx == 0xff)
+	{
+	  __libdw_seterrno (DWARF_E_INVALID_OPCODE);
+	  return -1;
+	}
+
+      Dwarf_Macro_Op_Proto *proto = &table->table[idx];
 
-      switch (opcode)
+      /* A fake CU with bare minimum data to fool dwarf_formX into
+	 doing the right thing with the attributes that we put
+	 out.  */
+      Dwarf_CU fake_cu = {
+	.dbg = dbg,
+	.version = 4,
+	.offset_size = table->is_64bit ? 8 : 4,
+      };
+
+      Dwarf_Attribute attributes[proto->nforms];
+      for (Dwarf_Word i = 0; i < proto->nforms; ++i)
 	{
-	case DW_MACINFO_define:
-	case DW_MACINFO_undef:
-	case DW_MACINFO_vendor_ext:
-	  /*  For the first two opcodes the parameters are
-	        line, string
-	      For the latter
-	        number, string.
-	      We can treat these cases together.  */
-	  get_uleb128 (u128, readp);
-
-	  endp = memchr (readp, '\0', readendp - readp);
-	  if (endp == NULL)
-	    goto invalid;
-
-	  str = (char *) readp;
-	  readp = endp + 1;
-	  break;
-
-	case DW_MACINFO_start_file:
-	  /* The two parameters are line and file index.  */
-	  get_uleb128 (u128, readp);
-	  get_uleb128 (u128_2, readp);
-	  break;
-
-	case DW_MACINFO_end_file:
-	  /* No parameters for this one.  */
-	  u128 = 0;
-	  break;
-
-	case 0:
-	  /* Nothing more to do.  */
-	  return 0;
-
-	default:
-	  goto invalid;
+	  /* We pretend this is a DW_AT_GNU_macros attribute so that
+	     DW_FORM_sec_offset forms get correctly interpreted as
+	     offset into .debug_macro.  */
+	  attributes[i].code = DW_AT_GNU_macros;
+	  attributes[i].form = proto->forms[i];
+	  attributes[i].valp = (void *) readp;
+	  attributes[i].cu = &fake_cu;
+
+	  readp += __libdw_form_val_len (dbg, &fake_cu,
+					 proto->forms[i], readp);
 	}
 
-      Dwarf_Macro mac;
-      mac.opcode = opcode;
-      mac.param1 = u128;
-      if (str == NULL)
-	mac.param2.u = u128_2;
-      else
-	mac.param2.s = str;
+      Dwarf_Macro macro = {
+	.line_offset = table->line_offset,
+	.version = table->version,
+	.opcode = opcode,
+	.nargs = proto->nforms,
+	.attributes = attributes,
+      };
+
+      Dwarf_Off nread = readp - startp;
+      if (nread > table->read)
+	table->read = nread;
+
+      if (callback (&macro, arg) != DWARF_CB_OK)
+	return readp - startp;
+    }
+
+  return 0;
+}
+
+static ptrdiff_t
+gnu_macros_getmacros_off (Dwarf *dbg, Dwarf_Off macoff,
+			  int (*callback) (Dwarf_Macro *, void *),
+			  void *arg, ptrdiff_t token)
+{
+  assert (token <= 0);
 
-      if (callback (&mac, arg) != DWARF_CB_OK)
-	return readp - (const unsigned char *) d->d_buf;
+  ptrdiff_t ret = read_macros (dbg, NULL, IDX_debug_macro,
+			       macoff, callback, arg, -token);
+  if (ret == -1)
+    return -1;
+  else
+    return -ret;
+}
+
+static ptrdiff_t
+macro_info_getmacros_off (Dwarf *dbg, Dwarf_Off macoff,
+			  int (*callback) (Dwarf_Macro *, void *),
+			  void *arg, ptrdiff_t token)
+{
+  assert (token >= 0);
+
+  Dwarf_Macro_Op_Table *table = get_macinfo_table ();
+  assert (table != NULL);
+
+  return read_macros (dbg, table, IDX_debug_macinfo,
+		      macoff, callback, arg, token);
+}
+
+ptrdiff_t
+dwarf_getmacros_off (Dwarf *dbg, Dwarf_Off macoff,
+		     int (*callback) (Dwarf_Macro *, void *),
+		     void *arg, ptrdiff_t token)
+{
+  if (dbg == NULL)
+    {
+      __libdw_seterrno (DWARF_E_NO_DWARF);
+      return -1;
+    }
+
+  /* We use token values > 0 for iteration through .debug_macinfo and
+     values < 0 for iteration through .debug_macro.  Return value of
+     -1 also signifies an error, but that's fine, because .debug_macro
+     always contains at least three bytes of headers and after
+     iterating one opcode, we should never see anything above -4.  */
+
+  if (token > 0)
+    /* A continuation call from DW_AT_macro_info iteration.  */
+    return macro_info_getmacros_off (dbg, macoff, callback, arg, token);
+
+  /* Either a DW_AT_GNU_macros continuation, or a fresh start
+     thereof.  */
+  return gnu_macros_getmacros_off (dbg, macoff, callback, arg, token);
+}
+
+ptrdiff_t
+dwarf_getmacros_die (Dwarf_Die *cudie, Dwarf_Off *macoffp,
+		     int (*callback) (Dwarf_Macro *, void *),
+		     void *arg, ptrdiff_t token)
+{
+  if (cudie == NULL)
+    {
+      __libdw_seterrno (DWARF_E_NO_DWARF);
+      return -1;
+    }
+
+  if (token > 0 && macoffp != NULL)
+    /* A continuation call from DW_AT_macro_info iteration, meaning
+       *MACOFF contains previously-cached offset.  */
+    return macro_info_getmacros_off (cudie->cu->dbg, *macoffp,
+				     callback, arg, token);
+
+  /* A fresh start of DW_AT_macro_info iteration, or a continuation
+     thereof without a cache.  */
+  if (token > 0
+      || (token == 0 && dwarf_hasattr (cudie, DW_AT_macro_info)))
+    {
+      Dwarf_Word macoff;
+      if (macoffp == NULL)
+	macoffp = &macoff;
+      if (get_offset_from (cudie, DW_AT_macro_info, macoffp) != 0)
+	return -1;
+      return macro_info_getmacros_off (cudie->cu->dbg, *macoffp,
+				       callback, arg, token);
+    }
+
+  if (token < 0 && macoffp != NULL)
+    /* A continuation call from DW_AT_GNU_macros iteration.  */
+    return gnu_macros_getmacros_off (cudie->cu->dbg, *macoffp,
+				     callback, arg, token);
+
+  /* Likewise without cache, or iteration start.  */
+  Dwarf_Word macoff;
+  if (macoffp == NULL)
+    macoffp = &macoff;
+  if (get_offset_from (cudie, DW_AT_GNU_macros, macoffp) != 0)
+    return -1;
+  return gnu_macros_getmacros_off (cudie->cu->dbg, *macoffp,
+				   callback, arg, token);
+}
+
+ptrdiff_t
+dwarf_getmacros (die, callback, arg, offset)
+     Dwarf_Die *die;
+     int (*callback) (Dwarf_Macro *, void *);
+     void *arg;
+     ptrdiff_t offset;
+{
+  if (die == NULL)
+    return -1;
+
+  if (offset == 0)
+    {
+      /* We can't support .debug_macro transparently by
+	 dwarf_getmacros, because extant callers would think that the
+	 returned macro opcodes come from DW_MACINFO_* domain and be
+	 confused.  */
+      if (unlikely (! dwarf_hasattr (die, DW_AT_macro_info)))
+	{
+	  __libdw_seterrno (DWARF_E_NO_ENTRY);
+	  return -1;
+	}
+
+      /* DIE's with both DW_AT_GNU_macros and DW_AT_macro_info are
+	 disallowed by the proposal that DW_AT_GNU_macros support is
+	 based on, and this attribute would derail us above, so check
+	 for it now.  */
+      if (unlikely (dwarf_hasattr (die, DW_AT_GNU_macros)))
+	{
+	  __libdw_seterrno (DWARF_E_INVALID_DWARF);
+	  return -1;
+	}
     }
+  else
+    /* If non-zero, this ought to be a continuation from previous
+       DW_AT_macro_info iteration, meaning offset can't be
+       negative.  */
+    assert (offset > 0);
 
-  /* If we come here the termination of the data for the CU is not
-     present.  */
- invalid:
-  __libdw_seterrno (DWARF_E_INVALID_DWARF);
-  return -1;
+  /* At this point we can safely piggy-back on existing new-style
+     interfaces.  */
+  return dwarf_getmacros_die (die, NULL, callback, arg, offset);
 }
diff --git a/libdw/dwarf_macro_getparamcnt.c b/libdw/dwarf_macro_getparamcnt.c
new file mode 100644
index 0000000..294134d
--- /dev/null
+++ b/libdw/dwarf_macro_getparamcnt.c
@@ -0,0 +1,43 @@
+/* Return number of parameters of a macro.
+   Copyright (C) 2014 Red Hat, Inc.
+   This file is part of elfutils.
+
+   This file is free software; you can redistribute it and/or modify
+   it under the terms of either
+
+     * the GNU Lesser General Public License as published by the Free
+       Software Foundation; either version 3 of the License, or (at
+       your option) any later version
+
+   or
+
+     * the GNU General Public License as published by the Free
+       Software Foundation; either version 2 of the License, or (at
+       your option) any later version
+
+   or both in parallel, as here.
+
+   elfutils is distributed in the hope that it will be useful, but
+   WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   General Public License for more details.
+
+   You should have received copies of the GNU General Public License and
+   the GNU Lesser General Public License along with this program.  If
+   not, see <http://www.gnu.org/licenses/>.  */
+
+#ifdef HAVE_CONFIG_H
+# include <config.h>
+#endif
+
+#include "libdwP.h"
+
+int
+dwarf_macro_getparamcnt (Dwarf_Macro *macro, size_t *paramcntp)
+{
+  if (macro == NULL)
+    return -1;
+
+  *paramcntp = macro->nargs;
+  return 0;
+}
diff --git a/libdw/dwarf_macro_line_offset.c b/libdw/dwarf_macro_line_offset.c
new file mode 100644
index 0000000..531085c
--- /dev/null
+++ b/libdw/dwarf_macro_line_offset.c
@@ -0,0 +1,45 @@
+/* Return .debug_line offset associated with given macro.
+   Copyright (C) 2014 Red Hat, Inc.
+   This file is part of elfutils.
+   Written by Ulrich Drepper <drepper@redhat.com>, 2005.
+
+   This file is free software; you can redistribute it and/or modify
+   it under the terms of either
+
+     * the GNU Lesser General Public License as published by the Free
+       Software Foundation; either version 3 of the License, or (at
+       your option) any later version
+
+   or
+
+     * the GNU General Public License as published by the Free
+       Software Foundation; either version 2 of the License, or (at
+       your option) any later version
+
+   or both in parallel, as here.
+
+   elfutils is distributed in the hope that it will be useful, but
+   WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   General Public License for more details.
+
+   You should have received copies of the GNU General Public License and
+   the GNU Lesser General Public License along with this program.  If
+   not, see <http://www.gnu.org/licenses/>.  */
+
+#ifdef HAVE_CONFIG_H
+# include <config.h>
+#endif
+
+#include "libdwP.h"
+
+
+int
+dwarf_macro_line_offset (Dwarf_Macro *macro, Dwarf_Off *offp)
+{
+  if (macro == NULL)
+    return -1;
+
+  *offp = macro->line_offset;
+  return 0;
+}
diff --git a/libdw/dwarf_macro_param.c b/libdw/dwarf_macro_param.c
new file mode 100644
index 0000000..b66b8ac
--- /dev/null
+++ b/libdw/dwarf_macro_param.c
@@ -0,0 +1,46 @@
+/* Return a given parameter of a macro.
+   Copyright (C) 2014 Red Hat, Inc.
+   This file is part of elfutils.
+
+   This file is free software; you can redistribute it and/or modify
+   it under the terms of either
+
+     * the GNU Lesser General Public License as published by the Free
+       Software Foundation; either version 3 of the License, or (at
+       your option) any later version
+
+   or
+
+     * the GNU General Public License as published by the Free
+       Software Foundation; either version 2 of the License, or (at
+       your option) any later version
+
+   or both in parallel, as here.
+
+   elfutils is distributed in the hope that it will be useful, but
+   WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   General Public License for more details.
+
+   You should have received copies of the GNU General Public License and
+   the GNU Lesser General Public License along with this program.  If
+   not, see <http://www.gnu.org/licenses/>.  */
+
+#ifdef HAVE_CONFIG_H
+# include <config.h>
+#endif
+
+#include "libdwP.h"
+
+int
+dwarf_macro_param (Dwarf_Macro *macro, size_t idx, Dwarf_Attribute *ret)
+{
+  if (macro == NULL)
+    return -1;
+
+  if (idx >= macro->nargs)
+    return -1;
+
+  *ret = macro->attributes[idx];
+  return 0;
+}
diff --git a/libdw/dwarf_macro_param1.c b/libdw/dwarf_macro_param1.c
index 35d4a71..337ad6f 100644
--- a/libdw/dwarf_macro_param1.c
+++ b/libdw/dwarf_macro_param1.c
@@ -1,5 +1,5 @@
 /* Return first macro parameter.
-   Copyright (C) 2005 Red Hat, Inc.
+   Copyright (C) 2005, 2014 Red Hat, Inc.
    This file is part of elfutils.
    Written by Ulrich Drepper <drepper@redhat.com>, 2005.
 
@@ -40,7 +40,9 @@ dwarf_macro_param1 (Dwarf_Macro *macro, Dwarf_Word *paramp)
   if (macro == NULL)
     return -1;
 
-  *paramp = macro->param1;
+  Dwarf_Attribute param;
+  if (dwarf_macro_param (macro, 1, &param) != 0)
+    return -1;
 
-  return 0;
+  return dwarf_formudata (&param, paramp);
 }
diff --git a/libdw/dwarf_macro_param2.c b/libdw/dwarf_macro_param2.c
index b49e60e..cc902c9 100644
--- a/libdw/dwarf_macro_param2.c
+++ b/libdw/dwarf_macro_param2.c
@@ -1,5 +1,5 @@
 /* Return second macro parameter.
-   Copyright (C) 2005 Red Hat, Inc.
+   Copyright (C) 2005, 2014 Red Hat, Inc.
    This file is part of elfutils.
    Written by Ulrich Drepper <drepper@redhat.com>, 2005.
 
@@ -40,10 +40,16 @@ dwarf_macro_param2 (Dwarf_Macro *macro, Dwarf_Word *paramp, const char **strp)
   if (macro == NULL)
     return -1;
 
-  if (paramp != NULL)
-    *paramp = macro->param2.u;
-  if (strp != NULL)
-    *strp = macro->param2.s;
+  Dwarf_Attribute param;
+  if (dwarf_macro_param (macro, 1, &param) != 0)
+    return -1;
 
-  return 0;
+  if (param.form == DW_FORM_string
+      || param.form == DW_FORM_strp)
+    {
+      *strp = dwarf_formstring (&param);
+      return 0;
+    }
+  else
+    return dwarf_formudata (&param, paramp);
 }
diff --git a/libdw/dwarf_macro_version.c b/libdw/dwarf_macro_version.c
new file mode 100644
index 0000000..0b4b189
--- /dev/null
+++ b/libdw/dwarf_macro_version.c
@@ -0,0 +1,44 @@
+/* Return a version of a macro.
+   Copyright (C) 2014 Red Hat, Inc.
+   This file is part of elfutils.
+
+   This file is free software; you can redistribute it and/or modify
+   it under the terms of either
+
+     * the GNU Lesser General Public License as published by the Free
+       Software Foundation; either version 3 of the License, or (at
+       your option) any later version
+
+   or
+
+     * the GNU General Public License as published by the Free
+       Software Foundation; either version 2 of the License, or (at
+       your option) any later version
+
+   or both in parallel, as here.
+
+   elfutils is distributed in the hope that it will be useful, but
+   WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   General Public License for more details.
+
+   You should have received copies of the GNU General Public License and
+   the GNU Lesser General Public License along with this program.  If
+   not, see <http://www.gnu.org/licenses/>.  */
+
+#ifdef HAVE_CONFIG_H
+# include <config.h>
+#endif
+
+#include "libdwP.h"
+
+int
+dwarf_macro_version (Dwarf_Macro *macro, unsigned int *versionp)
+{
+  if (macro == NULL)
+    return -1;
+
+  *versionp = macro->version;
+
+  return 0;
+}
diff --git a/libdw/libdw.h b/libdw/libdw.h
index 196d54a..bd5a0c6 100644
--- a/libdw/libdw.h
+++ b/libdw/libdw.h
@@ -827,25 +827,98 @@ extern int dwarf_entry_breakpoints (Dwarf_Die *die, Dwarf_Addr **bkpts);
 
 
 /* Call callback function for each of the macro information entry for
-   the CU.  */
+   the CU.  Similar in operation to dwarf_getmacros_die, but only
+   works for .debug_macinfo style macro entries (those where
+   dwarf_macro_version returns 0.)  */
 extern ptrdiff_t dwarf_getmacros (Dwarf_Die *cudie,
 				  int (*callback) (Dwarf_Macro *, void *),
-				  void *arg, ptrdiff_t offset)
-     __nonnull_attribute__ (2);
+				  void *arg, ptrdiff_t token)
+     __nonnull_attribute__ (2);
+
+/* Iterate through the macro section referenced by CUDIE and call
+   CALLBACK for each macro information entry.  Keeps iterating while
+   CALLBACK returns DWARF_CB_OK.  If the callback returns
+   DWARF_CB_ABORT, it stops iterating and returns a continuation
+   token, which can be used to restart the iteration at the point
+   where it ended.  Returns -1 for errors or 0 if there are no more
+   macro entries.
+
+   If MACOFFP is non-NULL, offset to macro section referenced by CUDIE
+   is stored to *MACOFFP.  That can later be passed together with a
+   continuation token to dwarf_getmacros_off.
+
+   If this is called with non-0 TOKEN (i.e. it is a continuation
+   call), MACOFFP shall be either NULL, or point to a location that
+   contains the same section offset as was at *MACOFFP of the call
+   that the passed-in continuation token came from.  */
+extern ptrdiff_t dwarf_getmacros_die (Dwarf_Die *cudie, Dwarf_Off *macoffp,
+				      int (*callback) (Dwarf_Macro *, void *),
+				      void *arg, ptrdiff_t token)
+     __nonnull_attribute__ (3);
+
+/* This is similar in operation to dwarf_getmacros_die, but selects
+   the section to iterate through by offset instead of by CU.  This
+   can be used for handling DW_MACRO_GNU_transparent_include's or
+   similar opcodes, or for continuation calls seeded with MACOFF and
+   TOKEN that came back from dwarf_getmacros_die (or a previous call
+   to dwarf_getmacros_off).  Note that with TOKEN of 0, this will
+   always iterate through DW_AT_GNU_macros style section.  */
+extern ptrdiff_t dwarf_getmacros_off (Dwarf *dbg, Dwarf_Off macoff,
+				      int (*callback) (Dwarf_Macro *, void *),
+				      void *arg, ptrdiff_t token)
+  __nonnull_attribute__ (3);
+
+/* Return Dwarf version of this macro opcode.  The versions are 0 for
+   macro elements coming from DW_AT_macro_info, and 4 for macro
+   elements coming from DW_AT_GNU_macros.  It is expected that 5 and
+   further will be for macro elements coming from standardized
+   DW_AT_macros.  */
+extern int dwarf_macro_version (Dwarf_Macro *macro, unsigned int *versionp)
+  __nonnull_attribute__ (2);
+
+/* Set *OFFP to .debug_line offset associated with the Dwarf macro
+   section that this MACRO comes from.  Returns -1 for errors or 0 for
+   success.
 
-/* Return macro opcode.  */
+   The offset is considered to be 0xffffffffffffffff (or (Dwarf_Off)
+   -1) if .debug_line offset was not set in the section header, which
+   will always be the case for .debug_macinfo macro sections, and may
+   be the case for .debug_macro sections as well.  This condition is
+   not considered an error.  */
+extern int dwarf_macro_line_offset (Dwarf_Macro *macro, Dwarf_Off *offp)
+  __nonnull_attribute__ (2);
+
+/* Return macro opcode.  That's a constant that can be either from
+   DW_MACINFO_* domain if version of MACRO is 0, or from
+   DW_MACRO_GNU_* domain if the version is 4.  */
 extern int dwarf_macro_opcode (Dwarf_Macro *macro, unsigned int *opcodep)
      __nonnull_attribute__ (2);
 
-/* Return first macro parameter.  */
+/* Get number of parameters of MACRO and store it to *PARAMCNTP.  */
+extern int dwarf_macro_getparamcnt (Dwarf_Macro *macro, size_t *paramcntp);
+
+/* Get IDX-th parameter of MACRO, and stores it to *ATTRIBUTE.
+   Returns 0 on success or -1 for errors.
+
+   After a successful call, you can query ATTRIBUTE by dwarf_whatform
+   to determine which of the dwarf_formX calls to make to get actual
+   value out of ATTRIBUTE.  Note that calling dwarf_whatattr is not
+   meaningful for pseudo-attributes formed this way.  */
+extern int dwarf_macro_param (Dwarf_Macro *macro, size_t idx,
+			      Dwarf_Attribute *attribute);
+
+/* Return first macro parameter.  This will return -1 if the parameter
+   is not an integral value.  Use dwarf_macro_param for more general
+   access.  */
 extern int dwarf_macro_param1 (Dwarf_Macro *macro, Dwarf_Word *paramp)
      __nonnull_attribute__ (2);
 
-/* Return second macro parameter.  */
+/* Return second macro parameter.  This will return -1 if the
+   parameter is not an integral or string value.  Use
+   dwarf_macro_param for more general access.  */
 extern int dwarf_macro_param2 (Dwarf_Macro *macro, Dwarf_Word *paramp,
 			       const char **strp);
 
-
 /* Compute what's known about a call frame when the PC is at ADDRESS.
    Returns 0 for success or -1 for errors.
    On success, *FRAME is a malloc'd pointer.  */
diff --git a/libdw/libdw.map b/libdw/libdw.map
index 55bc537..5a88bdc 100644
--- a/libdw/libdw.map
+++ b/libdw/libdw.map
@@ -306,4 +306,13 @@ ELFUTILS_0.160 {
   global:
     dwarf_cu_getdwarf;
     dwarf_cu_die;
-} ELFUTILS_0.159;
\ No newline at end of file
+} ELFUTILS_0.159;
+
+ELFUTILS_0.161 {
+  global:
+    dwarf_getmacros_die;
+    dwarf_getmacros_off;
+    dwarf_macro_version;
+    dwarf_macro_getparamcnt;
+    dwarf_macro_param;
+} ELFUTILS_0.160;
diff --git a/libdw/libdwP.h b/libdw/libdwP.h
index ce8a83d..47c2cc1 100644
--- a/libdw/libdwP.h
+++ b/libdw/libdwP.h
@@ -118,7 +118,8 @@ enum
   DWARF_E_INVALID_OFFSET,
   DWARF_E_NO_DEBUG_RANGES,
   DWARF_E_INVALID_CFI,
-  DWARF_E_NO_ALT_DEBUGLINK
+  DWARF_E_NO_ALT_DEBUGLINK,
+  DWARF_E_INVALID_OPCODE,
 };
 
 
@@ -167,6 +168,9 @@ struct Dwarf
   Dwarf_Off next_tu_offset;
   Dwarf_Sig8_Hash sig8_hash;
 
+  /* Search tree for .debug_macro operator tables.  */
+  void *macro_ops;
+
   /* Address ranges.  */
   Dwarf_Aranges *aranges;
 
@@ -326,16 +330,48 @@ struct Dwarf_CU
    })									      \
 
 
-/* Macro information.  */
+/* Prototype of a single .debug_macro operator.  */
+typedef struct
+{
+  Dwarf_Word nforms;
+  unsigned char const *forms;
+} Dwarf_Macro_Op_Proto;
+
+/* Prototype table.  */
+typedef struct
+{
+  /* Offset of .debug_macro section.  */
+  Dwarf_Off offset;
+
+  /* Offset of associated .debug_line section.  */
+  Dwarf_Off line_offset;
+
+  /* How far have we read in this section.  This starts as 0 and is
+     increased as the section is worked through.  */
+  Dwarf_Off read;
+
+  /* Header length.  */
+  Dwarf_Half header_len;
+
+  uint16_t version;
+  bool is_64bit;
+
+  /* Shows where in TABLE each opcode is defined.  Since opcode 0 is
+     never used, it stores index of opcode X in X-1'th element.  The
+     value of 0xff means not stored at all.  */
+  unsigned char opcodes[255];
+
+  /* Individual opcode prototypes.  */
+  Dwarf_Macro_Op_Proto table[];
+} Dwarf_Macro_Op_Table;
+
 struct Dwarf_Macro_s
 {
-  unsigned int opcode;
-  Dwarf_Word param1;
-  union
-  {
-    Dwarf_Word u;
-    const char *s;
-  } param2;
+  Dwarf_Off line_offset;
+  Dwarf_Word nargs;
+  uint16_t version;
+  uint8_t opcode;
+  Dwarf_Attribute *attributes;
 };
 
 
-- 
1.7.6.5


^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH][v2] Support .debug_macro
@ 2014-10-01 17:16 Petr Machata
  0 siblings, 0 replies; 8+ messages in thread
From: Petr Machata @ 2014-10-01 17:16 UTC (permalink / raw)
  To: elfutils-devel

[-- Attachment #1: Type: text/plain, Size: 2465 bytes --]

Mark Wielaard <mjw@redhat.com> writes:

> On Tue, 2014-09-30 at 18:06 +0200, Petr Machata wrote:
>> extern int dwarf_macro_srcfiles (Dwarf_Macro *macro, Dwarf_Files **files,
>> 				 size_t *nfiles)
>
> Yeah, this interface is nicer than trying to describe the file entry
> with attributes/forms.  And it mimics what we have for CUs. So all file
> accessors/helper functions work just the same. The one difference is
> that for CUs we only make it work on the CU Dwarf_Die, not all
> Dwarf_Dies in the CU. So we will have to make it clear what we expect
> the user to do here.

The cleanest solution would probably be to expose the existence of macro
units to the user.  Fundamentally, things like versions, file tables and
even macro prototypes are associated with macro units.  Presenting them
in any other way risks there will be impedance mismatch (like having to
ask whether a Files pointer obtained through one macro is still valid in
another macro).

But this smells of overengineering.  We would then have Dwarf macro
units, Dwarf macros and Dwarf macro prototypes, which is a lot of
concepts for something that few clients care about (essentially only
dwgrep does, I expect).

> I assume it makes most sense/is the most flexible
> to allow the user to call it on every Dwarf_Macro and to make no
> promises about whether the Dwarf_Files returned are cached/related
> between Dwarf_Macros.

Yeah.

> Or is there a case to be made for the user to cache the
> dwarf_macro_srcfiles result between processing different Dwarf_Macros
> (and should we warn for/against users depending on them being the same
> if they know the Dwarf_Macros come from the same macros table/header)?

*shrug*

A point could be made that individual opcodes could set a different file
section (say, DW_MACRO_push_lineoff, DW_MACRO_pop_lineoff), so that's
somewhat hazy argument against caching.  Then again this argument just
illustrates that we should have had dwarf_offsrcfiles all along, as that
better reflects the underlying reality of Dwarf.

For now I'd just ban caching, because that's what we do with these sorts
of interfaces.  Principle of least surprise and all that.

> http://dwarfstd.org/ShowIssue.php?issue=140724.1

Hmm, interesting.  We'll have to wait until what appears in one of the
Dwarf 5 drafts.

> Hopefully we can reuse some of the param/form/attribute design.

Yep, that would be cool.

Thanks,
Petr

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH][v2] Support .debug_macro
@ 2014-10-01 10:06 Mark Wielaard
  0 siblings, 0 replies; 8+ messages in thread
From: Mark Wielaard @ 2014-10-01 10:06 UTC (permalink / raw)
  To: elfutils-devel

[-- Attachment #1: Type: text/plain, Size: 3054 bytes --]

On Tue, 2014-09-30 at 18:06 +0200, Petr Machata wrote:
> Mark Wielaard <mjw@redhat.com> writes:
> > The question is does the user of Dwarf_Macro need them? The user isn't
> > going to check against the version, they are just matching against the
> > DW_MACRO constants and for interpreting the arguments and file names
> > they will use the attribute and helper functions provided.
> 
> The user needs at least the .debug_line offset, if they are supposed to
> handle any start_file-like opcodes.  But I noticed we don't even have a
> function like dwarf_offline, or dwarf_offsrcfiles, which I just assumed
> would be present.  So the offset itself is not really useful anyway.

Ah, I missed that what is provided is a "File" not just a file name. I
see why this was probably not an issue for macinfo. Since those are
explicitly associated with a specific CU. So you would get the files
from there. But macro tables are more generic and carry their own file
list (reference).

> I think the way forward then is to have an interface like this:
> 
> extern int dwarf_macro_srcfiles (Dwarf_Macro *macro, Dwarf_Files **files,
> 				 size_t *nfiles)
> 
> ... this would return the files-section associated with the macro unit.
> The user could then use the usual interfaces to query the files section
> however they like.  I would like such interface betten than one that
> works akin to dwarf_macro_param*, because it's more general.  There's no
> telling where the file number comes from, it might be implied by the
> opcode for all I care.

Yeah, this interface is nicer than trying to describe the file entry
with attributes/forms.  And it mimics what we have for CUs. So all file
accessors/helper functions work just the same. The one difference is
that for CUs we only make it work on the CU Dwarf_Die, not all
Dwarf_Dies in the CU. So we will have to make it clear what we expect
the user to do here. I assume it makes most sense/is the most flexible
to allow the user to call it on every Dwarf_Macro and to make no
promises about whether the Dwarf_Files returned are cached/related
between Dwarf_Macros. Or is there a case to be made for the user to
cache the dwarf_macro_srcfiles result between processing different
Dwarf_Macros (and should we warn for/against users depending on them
being the same if they know the Dwarf_Macros come from the same macros
table/header)?

BTW. There is a proposal for DWARF5 to extend the line file table with
vendor extensions to describe more properties like with macros, but it
is a little unclear what the current status is. There are multiple
proposals that seem to have been (dis|re)approved. And sadly most of
them seem not to have been publicly proposed or discussed. See
http://dwarfstd.org/ShowIssue.php?issue=140724.1
And it isn't really related to this macro table interface. Just a
pointer that we might need a similar cleanup of file tables as we do now
for macros. Hopefully we can reuse some of the param/form/attribute
design.

Thanks,

Mark

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH][v2] Support .debug_macro
@ 2014-09-30 16:06 Petr Machata
  0 siblings, 0 replies; 8+ messages in thread
From: Petr Machata @ 2014-09-30 16:06 UTC (permalink / raw)
  To: elfutils-devel

[-- Attachment #1: Type: text/plain, Size: 4006 bytes --]

Mark Wielaard <mjw@redhat.com> writes:

>> > Why you decide to a include the version and line_offset interfaces on
>> > Dwarf_Macro? Isn't that a CU DIE concept? (See also below, maybe I just
>> > misunderstand the data representation.)
>> 
>> Both version and line offset belong to macro units (not CU's, each macro
>> unit may set its own .debug_line offset, and macro units have their own
>> version).  But there's a question of how to get to that value.  You need
>> version to interpret the opcode, and you need .debug_line offset to
>> interpret file names.
>
> The question is does the user of Dwarf_Macro need them? The user isn't
> going to check against the version, they are just matching against the
> DW_MACRO constants and for interpreting the arguments and file names
> they will use the attribute and helper functions provided.

The user needs at least the .debug_line offset, if they are supposed to
handle any start_file-like opcodes.  But I noticed we don't even have a
function like dwarf_offline, or dwarf_offsrcfiles, which I just assumed
would be present.  So the offset itself is not really useful anyway.  I
think the way forward then is to have an interface like this:

extern int dwarf_macro_srcfiles (Dwarf_Macro *macro, Dwarf_Files **files,
				 size_t *nfiles)

... this would return the files-section associated with the macro unit.
The user could then use the usual interfaces to query the files section
however they like.  I would like such interface betten than one that
works akin to dwarf_macro_param*, because it's more general.  There's no
telling where the file number comes from, it might be implied by the
opcode for all I care.

> For the version I guess this depends on whether we can treat
> DW_MACINFO_vendor_ext specially or not.
>
> So if possible I wouldn't expose them at all in the interface and just
> treat them as implementation details. Does that make sense?

If 0xff is made reserved in Dwarf 5, then version is not necessary.
Other than that... I don't know.  I guess should it be needed in the
future, we could probably add bits of version-dependent logic where it's
needed.  After all, CU's do have versions, and users generally don't
need to care about them, as it's all hidden in the attribute decoding
routines.

> Can the table be setup by _init() by marking init_macinfo_table with
> __attribute__ ((constructor))? It seems small enough to not really
> matter that it is done eagerly instead of lazily when macros are used
> and solves all the locking trickery.

I don't see why not, the initialization is fairly minimal.  I'll do it
this way.

>> But I guess we should match the CU version to whatever corresponds to
>> the .debug_macro version that this macro comes from (the two versions
>> can in theory evolve independently.  It's hard to imagine something like
>> that happening, but there could be a version of Dwarf that doesn't
>> change .debug_info at all, but does change .debug_macro).
>
> I am fine with either hard coding 4 for now or pick the MU version.
> Please just add a comment for future reference.

OK.

> This is just an implementation detail, so it doesn't really matter that
> much. I just had a bit of a hard time keeping track of which value came
> from where and thought it was easier to understand (and less value
> copying) if the struct looked like:
>
>  struct Dwarf_Macro_s
>  {
>    Dwarf_Macro_Op_Table *table;
>    uint8_t opcode;
>    Dwarf_Attribute *attributes;
>  };
>
> And then explicitly lookup line_offset, version and nargs using the
> table when using a Dwarf_Macro_s later in the code. Certainly not a big
> deal, I don't think the data representation really matters here. Just
> wondering why you had chosen this one.

I can't recall doing this for any reason in particular.  I'll look into
this again and possibly rewrite to what you propose, I don't immediately
see why it wouldn't work.

Thank you,
Petr

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH][v2] Support .debug_macro
@ 2014-09-30  8:57 Mark Wielaard
  0 siblings, 0 replies; 8+ messages in thread
From: Mark Wielaard @ 2014-09-30  8:57 UTC (permalink / raw)
  To: elfutils-devel

[-- Attachment #1: Type: text/plain, Size: 7867 bytes --]

On Mon, 2014-09-29 at 17:26 +0200, Petr Machata wrote:
> Mark Wielaard <mjw@redhat.com> writes:
> 
> > On Fri, 2014-09-12 at 20:03 +0200, Petr Machata wrote:
> >> - dwarf_getmacros is left to support only .debug_macinfo.  The only
> >>   problematic opcode really is DW_MACINFO_vendor_ext
> >
> > You could suggest that the standard reserves that value to enable
> > consumers to use the same parsers for both.
> 
> That's worth a try.

Thanks.

> > Could you also include a small testcase? That could then also be used as
> > an example. I don't think it is immediately clear how you would use the
> > dual style die/off accessors. So an test/example would be helpful.
> 
> I posted a test case patch with my previous submission.  That's still
> active, but I didn't ping it until this one gets through:
> 
>         https://lists.fedorahosted.org/pipermail/elfutils-devel/2014-September/004143.html

Sorry, I missed that one. It does look like what I was hoping for.

> > And I was wondering if having a readelf --debug-dump=decodedmacro using
> > this interface was useful. The raw macro dump doesn't handle transparent
> > includes (you don't have to implement that, but maybe it is useful to do
> > if you do decide to include a test/example anyway).
> 
> Sure, why not.  I actually think readelf should just use libdw/libelf
> interfaces instead of duplicating the decoding logic, though that's not
> what you are proposing.

I admit that the "raw" and "decoded" variant of the macro sections are
mostly the same in this case. I think in this case only the include
magic would be different. The raw variant would just show the macro
names and params as is, while the decoded variant does the actual
transparent include magic and produces the "full list". But in general
it is nice that eu-readelf has a "raw" data parsing and dump of the
section data that you can then compare with what the "decoded" variant
based on libdw functions produces. An extreme example is the
--debug-dump=line vs --debug-dump=decodedline.

> > Why you decide to a include the version and line_offset interfaces on
> > Dwarf_Macro? Isn't that a CU DIE concept? (See also below, maybe I just
> > misunderstand the data representation.)
> 
> Both version and line offset belong to macro units (not CU's, each macro
> unit may set its own .debug_line offset, and macro units have their own
> version).  But there's a question of how to get to that value.  You need
> version to interpret the opcode, and you need .debug_line offset to
> interpret file names.

Ah, right, it is Macro Unit, not Compile Unit. Sorry for confusing them.
That makes my comments even more cryptic :) The question is does the
user of Dwarf_Macro need them? The user isn't going to check against the
version, they are just matching against the DW_MACRO constants and for
interpreting the arguments and file names they will use the attribute
and helper functions provided.

For the version I guess this depends on whether we can treat
DW_MACINFO_vendor_ext specially or not.

So if possible I wouldn't expose them at all in the interface and just
treat them as implementation details. Does that make sense?

> >> +static inline Dwarf_Macro_Op_Table *
> >> +get_macinfo_table (void)
> >> +{
> >> +  static Dwarf_Macro_Op_Table *ret = NULL;
> >> +  if (unlikely (ret == NULL))
> >> +    ret = init_macinfo_table ();
> >> +  return ret;
> >> +}
> >
> > The statics here make me a little uncomfortable. What if the user is
> > working with multiple Dwarfs at the same time iterating through the
> > simultaneously (maybe to compare the macros each defines in their CUs).
> > Won't they overwrite each others Macro_Op_Table?
> 
> This is a singleton macro prototype table for .debug_macinfo (not
> .debug_macro) section.  It's the same in all Dwarf's.  The reason it's
> there is just for uniformness of decoding both types of sections.  So
> there's no problem if you have more than one Dwarf.

Aha, thanks. I misunderstood. Yes, for .debug_macinfo they cannot be
redefined so no issue there.

> One problem this could cause however is that in a multi-threaded
> application, there's a race between the threads for initialization of
> this global resource.  If this is deemed a problem, we could do this:
> 
> __thread Dwarf_Macro_Op_Table *ret = NULL;
> if (unlikely (ret == NULL))
>   {
>     lock;
>     static Dwarf_Macro_Op_Table *global_table = NULL;
>     if (global_table == NULL)
>       global_table = init_macinfo_table ();
>     ret = global_table;
>     unlock;
>   }
> 
> This is essentially double-checked locking, but that's not thread-safe,
> so I make ret thread-local, and always lock before touching the global
> resource.  That should be both thread-safe, have reasonable performance,
> and not waste memory by storing one table for each thread.
> 
> Initially I dismissed this problem because we all know that libdw is
> thread unsafe anyway.  But here we potentially get thread-unsafe
> behavior even if we touch different Dwarf's, which might be unexpected.
> May be worth reconsidering.

Can the table be setup by _init() by marking init_macinfo_table with
__attribute__ ((constructor))? It seems small enough to not really
matter that it is done eagerly instead of lazily when macros are used
and solves all the locking trickery.

> >> -      switch (opcode)
> >> +      /* A fake CU with bare minimum data to fool dwarf_formX into
> >> +	 doing the right thing with the attributes that we put
> >> +	 out.  */
> >> +      Dwarf_CU fake_cu = {
> >> +	.dbg = dbg,
> >> +	.version = 4,
> >> +	.offset_size = table->is_64bit ? 8 : 4,
> >> +      };
> >
> > Do we want/need to match the version here to match the CU DIE this macro
> > table came from? Might be hard to get at at this point though. Maybe
> > match it to the Macro_Op_Table version?
> 
> The problem is that some macro units don't have a related CU per se.
> They could be accessed through offset from other macro units.  There
> typically will be more than one CU that these independent units
> logically belong to, otherwise this transparent inclusion scheme
> wouldn't be advantageous.

And again I confused CU and MU.

> But I guess we should match the CU version to whatever corresponds to
> the .debug_macro version that this macro comes from (the two versions
> can in theory evolve independently.  It's hard to imagine something like
> that happening, but there could be a version of Dwarf that doesn't
> change .debug_info at all, but does change .debug_macro).

I am fine with either hard coding 4 for now or pick the MU version.
Please just add a comment for future reference.

> >> +      Dwarf_Macro macro = {
> >> +	.line_offset = table->line_offset,
> >> +	.version = table->version,
> >> +	.opcode = opcode,
> >> +	.nargs = proto->nforms,
> >> +	.attributes = attributes,
> >> +      };
> >
> > Couldn't you just store the table, opcode and attributes in Dwarf_Macro?
> > The rest can then be retrieved from the associated table.
> 
> Isn't that what I'm doing?  I guess I don't understand what you mean.

This is just an implementation detail, so it doesn't really matter that
much. I just had a bit of a hard time keeping track of which value came
from where and thought it was easier to understand (and less value
copying) if the struct looked like:

 struct Dwarf_Macro_s
 {
   Dwarf_Macro_Op_Table *table;
   uint8_t opcode;
   Dwarf_Attribute *attributes;
 };

And then explicitly lookup line_offset, version and nargs using the
table when using a Dwarf_Macro_s later in the code. Certainly not a big
deal, I don't think the data representation really matters here. Just
wondering why you had chosen this one.

Thanks,

Mark

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH][v2] Support .debug_macro
@ 2014-09-29 15:26 Petr Machata
  0 siblings, 0 replies; 8+ messages in thread
From: Petr Machata @ 2014-09-29 15:26 UTC (permalink / raw)
  To: elfutils-devel

[-- Attachment #1: Type: text/plain, Size: 7271 bytes --]

Mark Wielaard <mjw@redhat.com> writes:

> On Fri, 2014-09-12 at 20:03 +0200, Petr Machata wrote:
>> - dwarf_getmacros is left to support only .debug_macinfo.  The only
>>   problematic opcode really is DW_MACINFO_vendor_ext
>
> You could suggest that the standard reserves that value to enable
> consumers to use the same parsers for both.

That's worth a try.

> Could you also include a small testcase? That could then also be used as
> an example. I don't think it is immediately clear how you would use the
> dual style die/off accessors. So an test/example would be helpful.

I posted a test case patch with my previous submission.  That's still
active, but I didn't ping it until this one gets through:

        https://lists.fedorahosted.org/pipermail/elfutils-devel/2014-September/004143.html

> And I was wondering if having a readelf --debug-dump=decodedmacro using
> this interface was useful. The raw macro dump doesn't handle transparent
> includes (you don't have to implement that, but maybe it is useful to do
> if you do decide to include a test/example anyway).

Sure, why not.  I actually think readelf should just use libdw/libelf
interfaces instead of duplicating the decoding logic, though that's not
what you are proposing.

> Why you decide to a include the version and line_offset interfaces on
> Dwarf_Macro? Isn't that a CU DIE concept? (See also below, maybe I just
> misunderstand the data representation.)

Both version and line offset belong to macro units (not CU's, each macro
unit may set its own .debug_line offset, and macro units have their own
version).  But there's a question of how to get to that value.  You need
version to interpret the opcode, and you need .debug_line offset to
interpret file names.

One alternative is to have an interface like this:

	extern ptrdiff_t dwarf_getmacros_off (Dwarf *dbg, Dwarf_Off macoff,
					      int (*callback) (Dwarf_Macro *, void *),
					      void *arg, ptrdiff_t token,
        	                              unsigned int *versionp,
                	                      Dwarf_Off *line_offp);

... which would set *VERSIONP and *LINE_OFFP before invoking CALLBACK.
Then use it like this:

	struct data { some stuff; unsigned int version; Dwarf_Off line_off; };

	int callback (Dwarf_Macro *macro, void *user) {
	  // Cast user to struct data, use version and line_off.
	}

        struct data data = {...};
	dwarf_getmacros_off (dbg, off, callback, &data, token,
			     &data.version, &data.line_off);

But arranging this exchange smells of reinventing the wheel every time
you need to use the interface, and the interface is not extensible (if
we need to pass one more thing, we need to add one more parameter).

You could also extend the callback signature and just pass them down
directly.  That's much better in my opinion, but again has the
extensibility problem.

For this reason I thought it best to just carry these per-unit variables
with macros.  That addresses both the extensibility (for each new thing
that you need to pass around, you add one more private field, and one
more public interface), as well as the don't-repeat-yourself concerns.
The obvious downside is that you'll never know what these values are for
macro-less macro unit, but for macro-less unit you don't care.

Yet another possibility is to have a type like Dwarf_Macro_Unit, akin to
Dwarf_CU, which could be accessed from macro, or possibly be passed
together with macro to the callback, and which would carry these things,
and possibly more--like the prototype table.  I didn't think it
necessary to go to these lengths.

>> +static inline Dwarf_Macro_Op_Table *
>> +get_macinfo_table (void)
>> +{
>> +  static Dwarf_Macro_Op_Table *ret = NULL;
>> +  if (unlikely (ret == NULL))
>> +    ret = init_macinfo_table ();
>> +  return ret;
>> +}
>
> The statics here make me a little uncomfortable. What if the user is
> working with multiple Dwarfs at the same time iterating through the
> simultaneously (maybe to compare the macros each defines in their CUs).
> Won't they overwrite each others Macro_Op_Table?

This is a singleton macro prototype table for .debug_macinfo (not
.debug_macro) section.  It's the same in all Dwarf's.  The reason it's
there is just for uniformness of decoding both types of sections.  So
there's no problem if you have more than one Dwarf.

One problem this could cause however is that in a multi-threaded
application, there's a race between the threads for initialization of
this global resource.  If this is deemed a problem, we could do this:

__thread Dwarf_Macro_Op_Table *ret = NULL;
if (unlikely (ret == NULL))
  {
    lock;
    static Dwarf_Macro_Op_Table *global_table = NULL;
    if (global_table == NULL)
      global_table = init_macinfo_table ();
    ret = global_table;
    unlock;
  }

This is essentially double-checked locking, but that's not thread-safe,
so I make ret thread-local, and always lock before touching the global
resource.  That should be both thread-safe, have reasonable performance,
and not waste memory by storing one table for each thread.

Initially I dismissed this problem because we all know that libdw is
thread unsafe anyway.  But here we potentially get thread-unsafe
behavior even if we touch different Dwarf's, which might be unexpected.
May be worth reconsidering.

>> -      switch (opcode)
>> +      /* A fake CU with bare minimum data to fool dwarf_formX into
>> +	 doing the right thing with the attributes that we put
>> +	 out.  */
>> +      Dwarf_CU fake_cu = {
>> +	.dbg = dbg,
>> +	.version = 4,
>> +	.offset_size = table->is_64bit ? 8 : 4,
>> +      };
>
> Do we want/need to match the version here to match the CU DIE this macro
> table came from? Might be hard to get at at this point though. Maybe
> match it to the Macro_Op_Table version?

The problem is that some macro units don't have a related CU per se.
They could be accessed through offset from other macro units.  There
typically will be more than one CU that these independent units
logically belong to, otherwise this transparent inclusion scheme
wouldn't be advantageous.

But I guess we should match the CU version to whatever corresponds to
the .debug_macro version that this macro comes from (the two versions
can in theory evolve independently.  It's hard to imagine something like
that happening, but there could be a version of Dwarf that doesn't
change .debug_info at all, but does change .debug_macro).

>> +      Dwarf_Macro macro = {
>> +	.line_offset = table->line_offset,
>> +	.version = table->version,
>> +	.opcode = opcode,
>> +	.nargs = proto->nforms,
>> +	.attributes = attributes,
>> +      };
>
> Couldn't you just store the table, opcode and attributes in Dwarf_Macro?
> The rest can then be retrieved from the associated table.

Isn't that what I'm doing?  I guess I don't understand what you mean.

> Although it is common for callback based functions in libdw I would
> explicitly say that the Dwarf_Macro is only valid during the callback
> and no information can be retrieved using the pointer after the callback
> returns.

OK, agreed.

Thanks,
PM

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH][v2] Support .debug_macro
@ 2014-09-26 11:44 Mark Wielaard
  0 siblings, 0 replies; 8+ messages in thread
From: Mark Wielaard @ 2014-09-26 11:44 UTC (permalink / raw)
  To: elfutils-devel

[-- Attachment #1: Type: text/plain, Size: 13757 bytes --]

On Fri, 2014-09-12 at 20:03 +0200, Petr Machata wrote:
> - This code is based on the following proposal:
> 	http://www.dwarfstd.org/ShowIssue.php?issue=110722.1
> 
> - dwarf_getmacros is left to support only .debug_macinfo.  The only
>   problematic opcode really is DW_MACINFO_vendor_ext, that the new
>   standard lacks, and that could in theory be misinterpreted by extant
>   code as whatever operand a vendor decides to map to 0xff.

Since the standard is not completed yet, and I don't think anybody used
vendor extensions (especially not the last one 0xff and they are
versioned anyway, version 5 is not yet really out there). You could
suggest that the standard reserves that value to enable consumers to use
the same parsers for both. Would it help if the standard says
DW_MACRO_hi_user 0xef and DW_MACRO_reserver_do_not_use 0xff?
Maybe you can suggest that on dwarf-discuss@lists.dwarfstd.org
http://lists.dwarfstd.org/listinfo.cgi/dwarf-discuss-dwarfstd.org
http://dir.gmane.org/gmane.comp.standards.dwarf

> - Instead, two principial iteration interfaces are provided for access
>   to new-style sections: dwarf_getmacros_die and dwarf_getmacros_off.
>   And a suite of new helper interfaces for analysis of individual
>   opcodes: dwarf_macro_version, dwarf_macro_line_offset,
>   dwarf_macro_getparamcnt, dwarf_macro_param.

Could you also include a small testcase? That could then also be used as
an example. I don't think it is immediately clear how you would use the
dual style die/off accessors. So an test/example would be helpful.

And I was wondering if having a readelf --debug-dump=decodedmacro using
this interface was useful. The raw macro dump doesn't handle transparent
includes (you don't have to implement that, but maybe it is useful to do
if you do decide to include a test/example anyway).

The usage of Dwarf_Attribute to return the param value is a nice idea.

Why you decide to a include the version and line_offset interfaces on
Dwarf_Macro? Isn't that a CU DIE concept? (See also below, maybe I just
misunderstand the data representation.)

> - The existing interfaces dwarf_macro_opcode, dwarf_macro_param1 and
>   dwarf_macro_param2 remain operational for old- as well as new-style
>   Dwarf macro sections, if applicable.

Nice.

> +static void
> +build_table (Dwarf_Macro_Op_Table *table,
> +	     Dwarf_Macro_Op_Proto op_protos[static 255])
> +{
> +  unsigned ct = 0;
> +  for (unsigned i = 1; i < 256; ++i)
> +    if (op_protos[i - 1].forms != NULL)
> +      table->table[table->opcodes[i - 1] = ct++] = op_protos[i - 1];
> +    else
> +      table->opcodes[i] = 0xff;
> +}
> +
> +#define MACRO_PROTO(NAME, ...)					\
> +  Dwarf_Macro_Op_Proto NAME = ({				\
> +      static const uint8_t proto[] = {__VA_ARGS__};		\
> +      (Dwarf_Macro_Op_Proto) {sizeof proto, proto};		\
> +    })
> +
> +static Dwarf_Macro_Op_Table *
> +init_macinfo_table (void)
> +{
> +  MACRO_PROTO (p_udata_str, DW_FORM_udata, DW_FORM_string);
> +  MACRO_PROTO (p_udata_udata, DW_FORM_udata, DW_FORM_udata);
> +  MACRO_PROTO (p_none);
> +
> +  Dwarf_Macro_Op_Proto op_protos[255] =
> +    {
> +      [DW_MACINFO_define - 1] = p_udata_str,
> +      [DW_MACINFO_undef - 1] = p_udata_str,
> +      [DW_MACINFO_vendor_ext - 1] = p_udata_str,
> +      [DW_MACINFO_start_file - 1] = p_udata_udata,
> +      [DW_MACINFO_end_file - 1] = p_none,
> +    };
> +
> +  static Dwarf_Macro_Op_Table table;
> +  memset (&table, 0, sizeof table);
> +
> +  build_table (&table, op_protos);
> +  return &table;
> +}
> +
> +static inline Dwarf_Macro_Op_Table *
> +get_macinfo_table (void)
> +{
> +  static Dwarf_Macro_Op_Table *ret = NULL;
> +  if (unlikely (ret == NULL))
> +    ret = init_macinfo_table ();
> +  return ret;
> +}

The statics here make me a little uncomfortable. What if the user is
working with multiple Dwarfs at the same time iterating through the
simultaneously (maybe to compare the macros each defines in their CUs).
Won't they overwrite each others Macro_Op_Table?

> +static Dwarf_Macro_Op_Table *
> +get_table_for_offset (Dwarf *dbg, Dwarf_Word macoff,
> +		      const unsigned char *readp,
> +		      const unsigned char *const endp)
> +{
> +  Dwarf_Macro_Op_Table fake = { .offset = macoff };
> +  Dwarf_Macro_Op_Table **found = tfind (&fake, &dbg->macro_ops,
> +					macro_op_compare);
> +  if (found != NULL)
> +    return *found;
> +
> +  const unsigned char *startp = readp;
> +
> +  /* Request at least 3 bytes for header.  */
> +  if (readp + 3 > endp)
> +    {
> +    invalid_dwarf:
> +      __libdw_seterrno (DWARF_E_INVALID_DWARF);
> +      return NULL;
> +    }
>  
> -  Elf_Data *d = die->cu->dbg->sectiondata[IDX_debug_macinfo];
> -  if (unlikely (d == NULL) || unlikely (d->d_buf == NULL))
> +  uint16_t version = read_2ubyte_unaligned_inc (dbg, readp);
> +  if (version != 4)
> +    {
> +      __libdw_seterrno (DWARF_E_INVALID_VERSION);
> +      return NULL;
> +    }
> +
> +  uint8_t flags = *readp++;
> +  bool is_64bit = (flags & 0x1) != 0;
> +
> +  Dwarf_Off line_offset = (Dwarf_Off) -1;
> +  if ((flags & 0x2) != 0)
> +    {
> +      line_offset = read_addr_unaligned_inc (is_64bit ? 8 : 4, dbg, readp);
> +      if (readp > endp)
> +	goto invalid_dwarf;
> +    }
> +
> +  /* """The macinfo entry types defined in this standard may, but
> +     might not, be described in the table""".
> +
> +     I.e. these may be present.  It's tempting to simply skip them,
> +     but it's probably more correct to tolerate that a producer tweaks
> +     the way certain opcodes are encoded, for whatever reasons.  */
> +
> +  MACRO_PROTO (p_udata_str, DW_FORM_udata, DW_FORM_string);
> +  MACRO_PROTO (p_udata_strp, DW_FORM_udata, DW_FORM_strp);
> +  MACRO_PROTO (p_udata_udata, DW_FORM_udata, DW_FORM_udata);
> +  MACRO_PROTO (p_secoffset, DW_FORM_sec_offset);
> +  MACRO_PROTO (p_none);
> +
> +  Dwarf_Macro_Op_Proto op_protos[255] =
> +    {
> +      [DW_MACRO_GNU_define - 1] = p_udata_str,
> +      [DW_MACRO_GNU_undef - 1] = p_udata_str,
> +      [DW_MACRO_GNU_define_indirect - 1] = p_udata_strp,
> +      [DW_MACRO_GNU_undef_indirect - 1] = p_udata_strp,
> +      [DW_MACRO_GNU_start_file - 1] = p_udata_udata,
> +      [DW_MACRO_GNU_end_file - 1] = p_none,
> +      [DW_MACRO_GNU_transparent_include - 1] = p_secoffset,
> +      /* N.B. DW_MACRO_undef_indirectx, DW_MACRO_define_indirectx
> +	 should be added when 130313.1 is supported.  */
> +    };
> +
> +  if ((flags & 0x4) != 0)
> +    {
> +      unsigned count = *readp++;
> +      for (unsigned i = 0; i < count; ++i)
> +	{
> +	  unsigned opcode = *readp++;
> +
> +	  Dwarf_Macro_Op_Proto e;
> +	  get_uleb128 (e.nforms, readp); // XXX checking
> +	  e.forms = readp;
> +	  op_protos[opcode] = e;
> +
> +	  readp += e.nforms;
> +	  if (readp > endp)
> +	    {
> +	      __libdw_seterrno (DWARF_E_INVALID_DWARF);
> +	      return NULL;
> +	    }
> +	}
> +    }
> +
> +  size_t ct = 0;
> +  for (unsigned i = 1; i < 256; ++i)
> +    if (op_protos[i - 1].forms != NULL)
> +      ++ct;
> +
> +  /* We support at most 0xfe opcodes defined in the table, as 0xff is
> +     a value that means that given opcode is not stored at all.  But
> +     that should be fine, as opcode 0 is not allocated.  */
> +  assert (ct < 0xff);
> +
> +  size_t macop_table_size = offsetof (Dwarf_Macro_Op_Table, table[ct]);
> +
> +  Dwarf_Macro_Op_Table *table = libdw_alloc (dbg, Dwarf_Macro_Op_Table,
> +					     macop_table_size, 1);
> +
> +  *table = (Dwarf_Macro_Op_Table) {
> +    .offset = macoff,
> +    .line_offset = line_offset,
> +    .header_len = readp - startp,
> +    .version = version,
> +    .is_64bit = is_64bit,
> +  };
> +  build_table (table, op_protos);
> +
> +  Dwarf_Macro_Op_Table **ret = tsearch (table, &dbg->macro_ops,
> +					macro_op_compare);
> +  if (unlikely (ret == NULL))
> +    {
> +      __libdw_seterrno (DWARF_E_NOMEM);
> +      return NULL;
> +    }
> +
> +  return *ret;
> +}
> +
> +static ptrdiff_t
> +read_macros (Dwarf *dbg, Dwarf_Macro_Op_Table *table, int secindex,
> +	     Dwarf_Off macoff, int (*callback) (Dwarf_Macro *, void *),
> +	     void *arg, ptrdiff_t offset)
> +{
> +  Elf_Data *d = dbg->sectiondata[secindex];
> +  if (unlikely (d == NULL || d->d_buf == NULL))
>      {
>        __libdw_seterrno (DWARF_E_NO_ENTRY);
>        return -1;
>      }
>  
> -  if (offset == 0)
> +  if (unlikely (macoff >= d->d_size))
>      {
> -      /* Get the appropriate attribute.  */
> -      Dwarf_Attribute attr;
> -      if (INTUSE(dwarf_attr) (die, DW_AT_macro_info, &attr) == NULL)
> -	return -1;
> +      __libdw_seterrno (DWARF_E_INVALID_DWARF);
> +      return -1;
> +    }
>  
> -      /* Offset into the .debug_macinfo section.  */
> -      Dwarf_Word macoff;
> -      if (INTUSE(dwarf_formudata) (&attr, &macoff) != 0)
> -	return -1;
> +  const unsigned char *const startp = d->d_buf + macoff;
> +  const unsigned char *const endp = d->d_buf + d->d_size;
>  
> -      offset = macoff;
> +  if (table == NULL)
> +    {
> +      table = get_table_for_offset (dbg, macoff, startp, endp);
> +      if (table == NULL)
> +	return -1;
>      }
> -  if (unlikely (offset > (ptrdiff_t) d->d_size))
> -    goto invalid;
>  
> -  const unsigned char *readp = d->d_buf + offset;
> -  const unsigned char *readendp = d->d_buf + d->d_size;
> +  if (offset == 0)
> +    offset = table->header_len;
>  
> -  if (readp == readendp)
> -    return 0;
> +  assert (offset >= 0);
> +  assert (offset < endp - startp);
> +  const unsigned char *readp = startp + offset;
>  
> -  while (readp < readendp)
> +  while (readp < endp)
>      {
>        unsigned int opcode = *readp++;
> -      unsigned int u128;
> -      unsigned int u128_2 = 0;
> -      const char *str = NULL;
> -      const unsigned char *endp;
> +      if (opcode == 0)
> +	/* Nothing more to do.  */
> +	return 0;
> +
> +      unsigned int idx = table->opcodes[opcode - 1];
> +      if (idx == 0xff)
> +	{
> +	  __libdw_seterrno (DWARF_E_INVALID_OPCODE);
> +	  return -1;
> +	}
> +
> +      Dwarf_Macro_Op_Proto *proto = &table->table[idx];
>  
> -      switch (opcode)
> +      /* A fake CU with bare minimum data to fool dwarf_formX into
> +	 doing the right thing with the attributes that we put
> +	 out.  */
> +      Dwarf_CU fake_cu = {
> +	.dbg = dbg,
> +	.version = 4,
> +	.offset_size = table->is_64bit ? 8 : 4,
> +      };

Do we want/need to match the version here to match the CU DIE this macro
table came from? Might be hard to get at at this point though. Maybe
match it to the Macro_Op_Table version?

> +      Dwarf_Macro macro = {
> +	.line_offset = table->line_offset,
> +	.version = table->version,
> +	.opcode = opcode,
> +	.nargs = proto->nforms,
> +	.attributes = attributes,
> +      };

Couldn't you just store the table, opcode and attributes in Dwarf_Macro?
The rest can then be retrieved from the associated table.

>  /* Call callback function for each of the macro information entry for
> -   the CU.  */
> +   the CU.  Similar in operation to dwarf_getmacros_die, but only
> +   works for .debug_macinfo style macro entries (those where
> +   dwarf_macro_version returns 0.)  */
>  extern ptrdiff_t dwarf_getmacros (Dwarf_Die *cudie,
>  				  int (*callback) (Dwarf_Macro *, void *),
> -				  void *arg, ptrdiff_t offset)
> -     __nonnull_attribute__ (2);
> +				  void *arg, ptrdiff_t token)
> +     __nonnull_attribute__ (2);
> +
> +/* Iterate through the macro section referenced by CUDIE and call
> +   CALLBACK for each macro information entry.  Keeps iterating while
> +   CALLBACK returns DWARF_CB_OK.  If the callback returns
> +   DWARF_CB_ABORT, it stops iterating and returns a continuation
> +   token, which can be used to restart the iteration at the point
> +   where it ended.  Returns -1 for errors or 0 if there are no more
> +   macro entries.
> +
> +   If MACOFFP is non-NULL, offset to macro section referenced by CUDIE
> +   is stored to *MACOFFP.  That can later be passed together with a
> +   continuation token to dwarf_getmacros_off.
> +
> +   If this is called with non-0 TOKEN (i.e. it is a continuation
> +   call), MACOFFP shall be either NULL, or point to a location that
> +   contains the same section offset as was at *MACOFFP of the call
> +   that the passed-in continuation token came from.  */
> +extern ptrdiff_t dwarf_getmacros_die (Dwarf_Die *cudie, Dwarf_Off *macoffp,
> +				      int (*callback) (Dwarf_Macro *, void *),
> +				      void *arg, ptrdiff_t token)
> +     __nonnull_attribute__ (3);
> +
> +/* This is similar in operation to dwarf_getmacros_die, but selects
> +   the section to iterate through by offset instead of by CU.  This
> +   can be used for handling DW_MACRO_GNU_transparent_include's or
> +   similar opcodes, or for continuation calls seeded with MACOFF and
> +   TOKEN that came back from dwarf_getmacros_die (or a previous call
> +   to dwarf_getmacros_off).  Note that with TOKEN of 0, this will
> +   always iterate through DW_AT_GNU_macros style section.  */
> +extern ptrdiff_t dwarf_getmacros_off (Dwarf *dbg, Dwarf_Off macoff,
> +				      int (*callback) (Dwarf_Macro *, void *),
> +				      void *arg, ptrdiff_t token)
> +  __nonnull_attribute__ (3);

Although it is common for callback based functions in libdw I would
explicitly say that the Dwarf_Macro is only valid during the callback
and no information can be retrieved using the pointer after the callback
returns.

Thanks,

Mark

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH][v2] Support .debug_macro
@ 2014-09-22 22:14 Petr Machata
  0 siblings, 0 replies; 8+ messages in thread
From: Petr Machata @ 2014-09-22 22:14 UTC (permalink / raw)
  To: elfutils-devel

[-- Attachment #1: Type: text/plain, Size: 22 bytes --]

Ping.

Thanks,
PM

^ permalink raw reply	[flat|nested] 8+ messages in thread

end of thread, other threads:[~2014-10-01 17:16 UTC | newest]

Thread overview: 8+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2014-09-12 18:03 [PATCH][v2] Support .debug_macro Petr Machata
2014-09-22 22:14 Petr Machata
2014-09-26 11:44 Mark Wielaard
2014-09-29 15:26 Petr Machata
2014-09-30  8:57 Mark Wielaard
2014-09-30 16:06 Petr Machata
2014-10-01 10:06 Mark Wielaard
2014-10-01 17:16 Petr Machata

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).