* [PATCH 2/2] libdw: Reduce size of struct Dwarf_Abbrev.
2017-12-26 19:38 Simpler abbrev parsing using less memory Mark Wielaard
2017-12-26 19:38 ` [PATCH 1/2] libdw: New get_uleb128_unchecked to use with already checked Dwarf_Abbrev Mark Wielaard
@ 2017-12-26 19:38 ` Mark Wielaard
2018-01-01 22:59 ` Simpler abbrev parsing using less memory Mark Wielaard
2 siblings, 0 replies; 4+ messages in thread
From: Mark Wielaard @ 2017-12-26 19:38 UTC (permalink / raw)
To: elfutils-devel; +Cc: Mark Wielaard
If we don't cache the attrcnt and use bitfields for the has_children and
code we can reduce the size of struct Dwarf Abbrev from 32 to 24 bytes on
64bit architectures and from 28 to 20 bytes on 32bit arches.
Signed-off-by: Mark Wielaard <mark@klomp.org>
---
libdw/ChangeLog | 7 +++++++
libdw/dwarf_getabbrev.c | 7 +++----
libdw/dwarf_getattrcnt.c | 19 +++++++++++++++++--
libdw/libdwP.h | 15 +++++++--------
4 files changed, 34 insertions(+), 14 deletions(-)
diff --git a/libdw/ChangeLog b/libdw/ChangeLog
index 3e3b7169..c284dbbd 100644
--- a/libdw/ChangeLog
+++ b/libdw/ChangeLog
@@ -1,3 +1,10 @@
+2017-12-26 Mark Wielaard <mark@klomp.org>
+
+ * libdwP.h (struct Dwarf_Abbrev): Pack struct. Remove attrcnt,
+ use bitfields for has_children and code.
+ * dwarf_getabbrev.c (__libdw_getabbrev): Don't count attrs.
+ * dwarf_getattrcnt.c (dwarf_getattrcnt): Count attrs.
+
2017-12-26 Mark Wielaard <mark@klomp.org>
* memory-access.h (__libdw_get_uleb128_unchecked): New function.
diff --git a/libdw/dwarf_getabbrev.c b/libdw/dwarf_getabbrev.c
index ef51b847..a3a68b37 100644
--- a/libdw/dwarf_getabbrev.c
+++ b/libdw/dwarf_getabbrev.c
@@ -1,5 +1,5 @@
/* Get abbreviation at given offset.
- Copyright (C) 2003, 2004, 2005, 2006, 2014 Red Hat, Inc.
+ Copyright (C) 2003, 2004, 2005, 2006, 2014, 2017 Red Hat, Inc.
This file is part of elfutils.
Written by Ulrich Drepper <drepper@redhat.com>, 2003.
@@ -121,8 +121,7 @@ __libdw_getabbrev (Dwarf *dbg, struct Dwarf_CU *cu, Dwarf_Off offset,
abb->attrp = (unsigned char *) abbrevp;
abb->offset = offset;
- /* Skip over all the attributes and count them while doing so. */
- abb->attrcnt = 0;
+ /* Skip over all the attributes and check rest of the abbrev is valid. */
unsigned int attrname;
unsigned int attrform;
do
@@ -134,7 +133,7 @@ __libdw_getabbrev (Dwarf *dbg, struct Dwarf_CU *cu, Dwarf_Off offset,
goto invalid;
get_uleb128 (attrform, abbrevp, end);
}
- while (attrname != 0 && attrform != 0 && ++abb->attrcnt);
+ while (attrname != 0 && attrform != 0);
/* Return the length to the caller if she asked for it. */
if (lengthp != NULL)
diff --git a/libdw/dwarf_getattrcnt.c b/libdw/dwarf_getattrcnt.c
index 2bfb4ac5..a05976d4 100644
--- a/libdw/dwarf_getattrcnt.c
+++ b/libdw/dwarf_getattrcnt.c
@@ -1,5 +1,5 @@
/* Get number of attributes of abbreviation.
- Copyright (C) 2003, 2004 Red Hat, Inc.
+ Copyright (C) 2003, 2004, 2017 Red Hat, Inc.
This file is part of elfutils.
Written by Ulrich Drepper <drepper@redhat.com>, 2003.
@@ -40,7 +40,22 @@ dwarf_getattrcnt (Dwarf_Abbrev *abbrev, size_t *attrcntp)
if (abbrev == NULL)
return -1;
- *attrcntp = abbrev->attrcnt;
+ const unsigned char *abbrevp = abbrev->attrp;
+
+ /* Skip over all the attributes and count them while doing so. */
+ int attrcnt = 0;
+ unsigned int attrname;
+ unsigned int attrform;
+ do
+ {
+ /* We can use unchecked since they were checked when the Dwrf_Abbrev
+ was created. */
+ get_uleb128_unchecked (attrname, abbrevp);
+ get_uleb128_unchecked (attrform, abbrevp);
+ }
+ while (attrname != 0 && attrform != 0 && ++attrcnt);
+
+ *attrcntp = attrcnt;
return 0;
}
diff --git a/libdw/libdwP.h b/libdw/libdwP.h
index 82b47d09..fbb8ed08 100644
--- a/libdw/libdwP.h
+++ b/libdw/libdwP.h
@@ -1,5 +1,5 @@
/* Internal definitions for libdwarf.
- Copyright (C) 2002-2011, 2013-2016 Red Hat, Inc.
+ Copyright (C) 2002-2011, 2013-2017 Red Hat, Inc.
This file is part of elfutils.
Written by Ulrich Drepper <drepper@redhat.com>, 2002.
@@ -212,13 +212,12 @@ struct Dwarf
/* Abbreviation representation. */
struct Dwarf_Abbrev
{
- Dwarf_Off offset;
- unsigned char *attrp;
- unsigned int attrcnt;
- unsigned int code;
- unsigned int tag;
- bool has_children;
-};
+ Dwarf_Off offset; /* Offset to start of abbrev into .debug_abbrev. */
+ unsigned char *attrp; /* Pointer to start of attribute name/form pairs. */
+ bool has_children : 1; /* Whether or not the DIE has children. */
+ unsigned int code : 31; /* The (unique) abbrev code. */
+ unsigned int tag; /* The tag of the DIE. */
+} attribute_packed;
#include "dwarf_abbrev_hash.h"
--
2.14.3
^ permalink raw reply [flat|nested] 4+ messages in thread
* [PATCH 1/2] libdw: New get_uleb128_unchecked to use with already checked Dwarf_Abbrev.
2017-12-26 19:38 Simpler abbrev parsing using less memory Mark Wielaard
@ 2017-12-26 19:38 ` Mark Wielaard
2017-12-26 19:38 ` [PATCH 2/2] libdw: Reduce size of struct Dwarf_Abbrev Mark Wielaard
2018-01-01 22:59 ` Simpler abbrev parsing using less memory Mark Wielaard
2 siblings, 0 replies; 4+ messages in thread
From: Mark Wielaard @ 2017-12-26 19:38 UTC (permalink / raw)
To: elfutils-devel; +Cc: Mark Wielaard
When creating a Dwarf_Abbrev in dwarf_getabbrev (__libdw_getabbrev) we
already check it is fully readable from the .debug_abbrev section. So
whenever we reread it later using the attrp pointer we don't have to
check it again. Introduce get_uleb128_unchecked to use for ulebs we
know are safe to read directly.
Signed-off-by: Mark Wielaard <mark@klomp.org>
---
libdw/ChangeLog | 10 ++++++++++
libdw/dwarf_child.c | 19 +++++--------------
libdw/dwarf_getabbrevattr.c | 10 +++++-----
libdw/dwarf_getattrs.c | 20 +++++---------------
libdw/dwarf_hasattr.c | 22 +++++-----------------
libdw/memory-access.h | 18 ++++++++++++++++++
6 files changed, 48 insertions(+), 51 deletions(-)
diff --git a/libdw/ChangeLog b/libdw/ChangeLog
index eb1cb709..3e3b7169 100644
--- a/libdw/ChangeLog
+++ b/libdw/ChangeLog
@@ -1,3 +1,13 @@
+2017-12-26 Mark Wielaard <mark@klomp.org>
+
+ * memory-access.h (__libdw_get_uleb128_unchecked): New function.
+ (get_uleb128_unchecked): New define.
+ * dwarf_child.c (__libdw_find_attr): Use get_uleb128_unchecked to
+ read attr name and form.
+ * dwarf_getabbrevattr.c (dwarf_getabbrevattr): Likewise.
+ * dwarf_getattrs.c (dwarf_getattrs): Likewise.
+ * dwarf_hasattr.c (dwarf_hasattr): Likewise.
+
2017-05-09 Ulf Hermann <ulf.hermann@qt.io>
Mark Wielaard <mark@klomp.org>
diff --git a/libdw/dwarf_child.c b/libdw/dwarf_child.c
index cc95fb3f..248338eb 100644
--- a/libdw/dwarf_child.c
+++ b/libdw/dwarf_child.c
@@ -1,5 +1,5 @@
/* Return child of current DIE.
- Copyright (C) 2003-2011, 2014 Red Hat, Inc.
+ Copyright (C) 2003-2011, 2014, 2017 Red Hat, Inc.
This file is part of elfutils.
Written by Ulrich Drepper <drepper@redhat.com>, 2003.
@@ -43,36 +43,27 @@ internal_function
__libdw_find_attr (Dwarf_Die *die, unsigned int search_name,
unsigned int *codep, unsigned int *formp)
{
- Dwarf *dbg = die->cu->dbg;
const unsigned char *readp;
/* Find the abbreviation entry. */
Dwarf_Abbrev *abbrevp = __libdw_dieabbrev (die, &readp);
if (unlikely (abbrevp == DWARF_END_ABBREV))
{
- invalid_dwarf:
__libdw_seterrno (DWARF_E_INVALID_DWARF);
return NULL;
}
- /* Search the name attribute. */
- unsigned char *const endp
- = ((unsigned char *) dbg->sectiondata[IDX_debug_abbrev]->d_buf
- + dbg->sectiondata[IDX_debug_abbrev]->d_size);
-
+ /* Search the name attribute. Attribute has been checked when
+ Dwarf_Abbrev was created, we can read unchecked. */
const unsigned char *attrp = abbrevp->attrp;
while (1)
{
/* Get attribute name and form. */
- if (unlikely (attrp >= endp))
- goto invalid_dwarf;
unsigned int attr_name;
- get_uleb128 (attr_name, attrp, endp);
+ get_uleb128_unchecked (attr_name, attrp);
- if (unlikely (attrp >= endp))
- goto invalid_dwarf;
unsigned int attr_form;
- get_uleb128 (attr_form, attrp, endp);
+ get_uleb128_unchecked (attr_form, attrp);
/* We can stop if we found the attribute with value zero. */
if (attr_name == 0 && attr_form == 0)
diff --git a/libdw/dwarf_getabbrevattr.c b/libdw/dwarf_getabbrevattr.c
index 3b4da99c..57fe3637 100644
--- a/libdw/dwarf_getabbrevattr.c
+++ b/libdw/dwarf_getabbrevattr.c
@@ -1,5 +1,5 @@
/* Get specific attribute of abbreviation.
- Copyright (C) 2003, 2004, 2005, 2014 Red Hat, Inc.
+ Copyright (C) 2003, 2004, 2005, 2014, 2017 Red Hat, Inc.
This file is part of elfutils.
Written by Ulrich Drepper <drepper@redhat.com>, 2003.
@@ -53,10 +53,10 @@ dwarf_getabbrevattr (Dwarf_Abbrev *abbrev, size_t idx, unsigned int *namep,
{
start_attrp = attrp;
- /* Attribute code and form are encoded as ULEB128 values.i
- XXX We have no way to bounds check. */
- get_uleb128 (name, attrp, attrp + len_leb128 (name));
- get_uleb128 (form, attrp, attrp + len_leb128 (form));
+ /* Attribute code and form are encoded as ULEB128 values.
+ Already checked when Dwarf_Abbrev was created, read unchecked. */
+ get_uleb128_unchecked (name, attrp);
+ get_uleb128_unchecked (form, attrp);
/* If both values are zero the index is out of range. */
if (name == 0 && form == 0)
diff --git a/libdw/dwarf_getattrs.c b/libdw/dwarf_getattrs.c
index 0da8b5ba..7f55fafc 100644
--- a/libdw/dwarf_getattrs.c
+++ b/libdw/dwarf_getattrs.c
@@ -1,5 +1,5 @@
/* Get attributes of the DIE.
- Copyright (C) 2004, 2005, 2008, 2009, 2014 Red Hat, Inc.
+ Copyright (C) 2004, 2005, 2008, 2009, 2014, 2017 Red Hat, Inc.
This file is part of elfutils.
Written by Ulrich Drepper <drepper@redhat.com>, 2004.
@@ -51,7 +51,6 @@ dwarf_getattrs (Dwarf_Die *die, int (*callback) (Dwarf_Attribute *, void *),
if (unlikely (abbrevp == DWARF_END_ABBREV))
{
- invalid_dwarf:
__libdw_seterrno (DWARF_E_INVALID_DWARF);
return -1l;
}
@@ -61,24 +60,15 @@ dwarf_getattrs (Dwarf_Die *die, int (*callback) (Dwarf_Attribute *, void *),
const unsigned char *const offset_attrp = abbrevp->attrp + offset;
/* Go over the list of attributes. */
- Dwarf *dbg = die->cu->dbg;
- const unsigned char *endp;
- endp = ((const unsigned char *) dbg->sectiondata[IDX_debug_abbrev]->d_buf
- + dbg->sectiondata[IDX_debug_abbrev]->d_size);
while (1)
{
- /* Are we still in bounds? */
- if (unlikely (attrp >= endp))
- goto invalid_dwarf;
-
- /* Get attribute name and form. */
+ /* Get attribute name and form. Dwarf_Abbrev was checked when
+ created, so we can read unchecked. */
Dwarf_Attribute attr;
const unsigned char *remembered_attrp = attrp;
- get_uleb128 (attr.code, attrp, endp);
- if (unlikely (attrp >= endp))
- goto invalid_dwarf;
- get_uleb128 (attr.form, attrp, endp);
+ get_uleb128_unchecked (attr.code, attrp);
+ get_uleb128_unchecked (attr.form, attrp);
/* We can stop if we found the attribute with value zero. */
if (attr.code == 0 && attr.form == 0)
diff --git a/libdw/dwarf_hasattr.c b/libdw/dwarf_hasattr.c
index 2bb8dc82..90b333e6 100644
--- a/libdw/dwarf_hasattr.c
+++ b/libdw/dwarf_hasattr.c
@@ -1,5 +1,5 @@
/* Check whether given DIE has specific attribute.
- Copyright (C) 2003, 2005, 2014 Red Hat, Inc.
+ Copyright (C) 2003, 2005, 2014, 2017 Red Hat, Inc.
This file is part of elfutils.
Written by Ulrich Drepper <drepper@redhat.com>, 2003.
@@ -45,32 +45,20 @@ dwarf_hasattr (Dwarf_Die *die, unsigned int search_name)
Dwarf_Abbrev *abbrevp = __libdw_dieabbrev (die, NULL);
if (unlikely (abbrevp == DWARF_END_ABBREV))
{
- invalid_dwarf:
__libdw_seterrno (DWARF_E_INVALID_DWARF);
return 0;
}
- Dwarf *dbg = die->cu->dbg;
-
- /* Search the name attribute. */
- unsigned char *const endp
- = ((unsigned char *) dbg->sectiondata[IDX_debug_abbrev]->d_buf
- + dbg->sectiondata[IDX_debug_abbrev]->d_size);
-
+ /* Search the name attribute. Dwarf_Abbrev was checked when created,
+ so we can read unchecked here. */
const unsigned char *attrp = abbrevp->attrp;
while (1)
{
- /* Are we still in bounds? This test needs to be refined. */
- if (unlikely (attrp >= endp))
- goto invalid_dwarf;
-
/* Get attribute name and form. */
unsigned int attr_name;
- get_uleb128 (attr_name, attrp, endp);
+ get_uleb128_unchecked (attr_name, attrp);
unsigned int attr_form;
- if (unlikely (attrp >= endp))
- goto invalid_dwarf;
- get_uleb128 (attr_form, attrp, endp);
+ get_uleb128_unchecked (attr_form, attrp);
/* We can stop if we found the attribute with value zero. */
if (attr_name == 0 || attr_form == 0)
diff --git a/libdw/memory-access.h b/libdw/memory-access.h
index afb651fc..ed68bdb9 100644
--- a/libdw/memory-access.h
+++ b/libdw/memory-access.h
@@ -87,8 +87,26 @@ __libdw_get_uleb128 (const unsigned char **addrp, const unsigned char *end)
return UINT64_MAX;
}
+static inline uint64_t
+__libdw_get_uleb128_unchecked (const unsigned char **addrp)
+{
+ uint64_t acc = 0;
+
+ /* Unroll the first step to help the compiler optimize
+ for the common single-byte case. */
+ get_uleb128_step (acc, *addrp, 0);
+
+ const size_t max = len_leb128 (uint64_t);
+ for (size_t i = 1; i < max; ++i)
+ get_uleb128_step (acc, *addrp, i);
+ /* Other implementations set VALUE to UINT_MAX in this
+ case. So we better do this as well. */
+ return UINT64_MAX;
+}
+
/* Note, addr needs to me smaller than end. */
#define get_uleb128(var, addr, end) ((var) = __libdw_get_uleb128 (&(addr), end))
+#define get_uleb128_unchecked(var, addr) ((var) = __libdw_get_uleb128_unchecked (&(addr)))
/* The signed case is similar, but we sign-extend the result. */
--
2.14.3
^ permalink raw reply [flat|nested] 4+ messages in thread
* Simpler abbrev parsing using less memory
@ 2017-12-26 19:38 Mark Wielaard
2017-12-26 19:38 ` [PATCH 1/2] libdw: New get_uleb128_unchecked to use with already checked Dwarf_Abbrev Mark Wielaard
` (2 more replies)
0 siblings, 3 replies; 4+ messages in thread
From: Mark Wielaard @ 2017-12-26 19:38 UTC (permalink / raw)
To: elfutils-devel
Hi,
When we added bounds checking to almost all data reading functions
(commit 7a05347 libdw: Add get_uleb128 and get_sleb128 bounds checking)
we also added extra checks to the abbrev reading. But since we didn't
really have bounds for the "raw" Dwarf_Abbrev reading functions we
just "guessed" the maximum of a uleb128. This wasn't really correct
and not really needed. A struct Dwarf_Abbrev can only be created by
__libdw_getabbrev, which checks the whole abbrev (code, tag, children
and attribute names/forms) is valid already. So whenever we use the
attrp pointer from the Dwarf_Abbrev to read the name/forms we already
know they are in the .debug_abbrev bounds).
[PATCH 1/2] libdw: New get_uleb128_unchecked to use with already
checked Dwarf_Abbrev.
So the first patch introduces a get_uleb128_unchecked function that
is used for re-reading such uleb128 values.
The second patch reduces the size of the struct Dwarf_Abbrev by not
storing the attrcnt and by using bitfields for has_children and code.
[PATCH 2/2] libdw: Reduce size of struct Dwarf_Abbrev.
The attrcnt was only used by the dwarf_getattrcnt function. Which
is only used in one testcase. And which seems mostly unnecessary
for real programs. The function now explicitly counts the attrs
instead of using a cached value.
The combined patches very slightly reduces the time for parsing
abbrevs and make the abbrev cache somewhat smaller.
On my machine eu-readelf -N --debug-dump=info libstdc++.so.debug
goes down from 1.79 to 1.71 secs. And max rss goes down from 15.296
to 14.684 kbytes.
Cheers,
Mark
^ permalink raw reply [flat|nested] 4+ messages in thread
* Re: Simpler abbrev parsing using less memory
2017-12-26 19:38 Simpler abbrev parsing using less memory Mark Wielaard
2017-12-26 19:38 ` [PATCH 1/2] libdw: New get_uleb128_unchecked to use with already checked Dwarf_Abbrev Mark Wielaard
2017-12-26 19:38 ` [PATCH 2/2] libdw: Reduce size of struct Dwarf_Abbrev Mark Wielaard
@ 2018-01-01 22:59 ` Mark Wielaard
2 siblings, 0 replies; 4+ messages in thread
From: Mark Wielaard @ 2018-01-01 22:59 UTC (permalink / raw)
To: elfutils-devel
On Tue, Dec 26, 2017 at 08:38:38PM +0100, Mark Wielaard wrote:
> [PATCH 1/2] libdw: New get_uleb128_unchecked to use with already
> checked Dwarf_Abbrev.
>
> So the first patch introduces a get_uleb128_unchecked function that
> is used for re-reading such uleb128 values.
>
> The second patch reduces the size of the struct Dwarf_Abbrev by not
> storing the attrcnt and by using bitfields for has_children and code.
>
> [PATCH 2/2] libdw: Reduce size of struct Dwarf_Abbrev.
I pushed both patches to master.
^ permalink raw reply [flat|nested] 4+ messages in thread
end of thread, other threads:[~2018-01-01 22:59 UTC | newest]
Thread overview: 4+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-12-26 19:38 Simpler abbrev parsing using less memory Mark Wielaard
2017-12-26 19:38 ` [PATCH 1/2] libdw: New get_uleb128_unchecked to use with already checked Dwarf_Abbrev Mark Wielaard
2017-12-26 19:38 ` [PATCH 2/2] libdw: Reduce size of struct Dwarf_Abbrev Mark Wielaard
2018-01-01 22:59 ` Simpler abbrev parsing using less memory Mark Wielaard
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).