From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <elfutils-devel-return-6459-listarch-elfutils-devel=sourceware.org@sourceware.org>
Received: (qmail 21481 invoked by alias); 26 Dec 2017 19:38:51 -0000
Mailing-List: contact elfutils-devel-help@sourceware.org; run by ezmlm
Precedence: bulk
List-Id: <elfutils-devel.sourceware.org>
List-Post: <mailto:elfutils-devel@sourceware.org>
List-Help: <mailto:elfutils-devel-help@sourceware.org>
List-Subscribe: <mailto:elfutils-devel-subscribe@sourceware.org>
Sender: elfutils-devel-owner@sourceware.org
Received: (qmail 21469 invoked by uid 89); 26 Dec 2017 19:38:51 -0000
Authentication-Results: sourceware.org; auth=none
X-Virus-Checked: by ClamAV 0.99.2 on sourceware.org
X-Virus-Found: No
X-Spam-SWARE-Status: No, score=-0.9 required=5.0 tests=BAYES_00,KAM_LAZY_DOMAIN_SECURITY autolearn=no version=3.3.2 spammy=Hx-languages-length:1620, secs, rss
X-Spam-Status: No, score=-0.9 required=5.0 tests=BAYES_00,KAM_LAZY_DOMAIN_SECURITY autolearn=no version=3.3.2
X-Spam-Checker-Version: SpamAssassin 3.3.2 (2011-06-06) on sourceware.org
X-Spam-Level:
X-HELO: gnu.wildebeest.org
Received: from wildebeest.demon.nl (HELO gnu.wildebeest.org) (212.238.236.112) by sourceware.org (qpsmtpd/0.93/v0.84-503-g423c35a) with ESMTP; Tue, 26 Dec 2017 19:38:49 +0000
Received: from stream.wildebeest.org (a80-101-194-232.adsl.xs4all.nl [80.101.194.232])	(using TLSv1.2 with cipher ADH-AES256-GCM-SHA384 (256/256 bits))	(No client certificate requested)	by gnu.wildebeest.org (Postfix) with ESMTPSA id A4C73302BAA3	for <elfutils-devel@sourceware.org>; Tue, 26 Dec 2017 20:38:47 +0100 (CET)
Received: by stream.wildebeest.org (Postfix, from userid 1000)	id 0A63D109054; Tue, 26 Dec 2017 20:38:47 +0100 (CET)
From: Mark Wielaard <mark@klomp.org>
To: elfutils-devel@sourceware.org
Subject: Simpler abbrev parsing using less memory
Date: Tue, 26 Dec 2017 19:38:00 -0000
Message-Id: <20171226193840.27387-1-mark@klomp.org>
X-Mailer: git-send-email 2.14.3
X-Spam-Flag: NO
X-IsSubscribed: yes
X-SW-Source: 2017-q4/txt/msg00124.txt.bz2

Hi,

When we added bounds checking to almost all data reading functions
(commit 7a05347 libdw: Add get_uleb128 and get_sleb128 bounds checking)
we also added extra checks to the abbrev reading. But since we didn't
really have bounds for the "raw" Dwarf_Abbrev reading functions we
just "guessed" the maximum length of a uleb128. This wasn't really
correct, and it wasn't really needed either. A struct Dwarf_Abbrev can
only be created by __libdw_getabbrev, which already checks that the
whole abbrev (code, tag, children and attribute names/forms) is valid.
So whenever we use the attrp pointer from the Dwarf_Abbrev to read the
names/forms we already know they are within the .debug_abbrev bounds.

[PATCH 1/2] libdw: New get_uleb128_unchecked to use with already
                   checked Dwarf_Abbrev.

So the first patch introduces a get_uleb128_unchecked function that
is used for re-reading such uleb128 values.
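
To illustrate the difference between a checked and an unchecked read,
here is a minimal standalone sketch. The names read_uleb128_checked
and read_uleb128_unchecked are made up for this example; this is not
the libdw code, just the general idea under the assumption that the
data was validated once when it was first parsed.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical bounds-checked reader.  Never reads at or past END.
   Returns 0 and advances *ADDRP on success, -1 if the value is
   truncated.  */
int
read_uleb128_checked (const unsigned char **addrp,
                      const unsigned char *end, uint64_t *result)
{
  uint64_t value = 0;
  unsigned int shift = 0;
  const unsigned char *p = *addrp;

  while (p < end)
    {
      unsigned char byte = *p++;
      if (shift < 64)
        value |= (uint64_t) (byte & 0x7f) << shift;
      shift += 7;
      if ((byte & 0x80) == 0)
        {
          *addrp = p;
          *result = value;
          return 0;
        }
    }
  return -1; /* Ran off the end of the data.  */
}

/* Hypothetical unchecked reader.  The caller guarantees that the
   encoded value was already validated, so no END is needed.  */
uint64_t
read_uleb128_unchecked (const unsigned char **addrp)
{
  uint64_t value = 0;
  unsigned int shift = 0;
  unsigned char byte;

  do
    {
      byte = *(*addrp)++;
      if (shift < 64)
        value |= (uint64_t) (byte & 0x7f) << shift;
      shift += 7;
    }
  while (byte & 0x80);
  return value;
}
```

The unchecked variant is only safe on bytes that an earlier checked
pass has already walked over completely.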

The second patch reduces the size of the struct Dwarf_Abbrev by not
storing the attrcnt and by using bitfields for has_children and code.

[PATCH 2/2] libdw: Reduce size of struct Dwarf_Abbrev.
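
As a rough illustration of where the savings come from, the sketch
below compares two hypothetical layouts. The struct and field names
are assumptions made for this example and are not copied from the
patch; exact sizes depend on the ABI.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical "before" layout: caches the attribute count and uses
   full-width fields for the children flag and the code.  */
struct wide_abbrev
{
  uint64_t offset;            /* Offset into .debug_abbrev.  */
  unsigned char *attrp;       /* Start of the name/form pairs.  */
  size_t attrcnt;             /* Cached number of attributes.  */
  unsigned char has_children; /* Whether the DIE has children.  */
  unsigned int code;          /* Abbrev code.  */
  unsigned int tag;           /* DIE tag.  */
};

/* Hypothetical "after" layout: no cached count, and the flag and the
   code share one 32-bit unit through bitfields.  */
struct compact_abbrev
{
  uint64_t offset;
  unsigned char *attrp;
  unsigned int has_children : 1;
  unsigned int code : 31;
  unsigned int tag;
};
```

On a typical x86_64 ABI this goes from 40 to 24 bytes per entry. The
price in this sketch is that the code must fit in 31 bits.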

The attrcnt was only used by the dwarf_getattrcnt function, which is
only used in one testcase and seems mostly unnecessary for real
programs. The function now explicitly counts the attrs instead of
using a cached value.
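
Counting on demand can be sketched as below. This is not the
dwarf_getattrcnt implementation; count_abbrev_attrs and next_uleb128
are hypothetical names. The sketch assumes the attribute list is a
sequence of uleb128 name/form pairs terminated by a 0/0 pair, that it
was validated when first parsed, and it ignores
DW_FORM_implicit_const, which carries an extra value in DWARF 5.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: read one uleb128 from already validated data
   and advance *P past it.  */
static uint64_t
next_uleb128 (const unsigned char **p)
{
  uint64_t value = 0;
  unsigned int shift = 0;
  unsigned char byte;

  do
    {
      byte = *(*p)++;
      if (shift < 64)
        value |= (uint64_t) (byte & 0x7f) << shift;
      shift += 7;
    }
  while (byte & 0x80);
  return value;
}

/* Count the name/form pairs starting at ATTRP by walking them until
   the terminating 0/0 pair.  */
size_t
count_abbrev_attrs (const unsigned char *attrp)
{
  size_t count = 0;

  for (;;)
    {
      uint64_t name = next_uleb128 (&attrp);
      uint64_t form = next_uleb128 (&attrp);
      if (name == 0 && form == 0)
        break;
      count++;
    }
  return count;
}
```

This trades a little work per call for one less field per cached
abbrev, which seems reasonable when the count is rarely requested.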

Combined, the patches very slightly reduce the time for parsing
abbrevs and make the abbrev cache somewhat smaller.

On my machine eu-readelf -N --debug-dump=info libstdc++.so.debug
goes down from 1.79 to 1.71 secs, and max rss goes down from 15.296
to 14.684 kbytes.

Cheers,

Mark