From: Mark Wielaard
To: elfutils-devel@sourceware.org
Subject: Simpler abbrev parsing using less memory
Date: Tue, 26 Dec 2017 19:38:00 -0000
Message-Id: <20171226193840.27387-1-mark@klomp.org>

Hi,

When we added bounds checking to almost all data reading functions
(commit 7a05347 libdw: Add get_uleb128 and get_sleb128 bounds checking)
we also added extra checks to the abbrev reading.
But since we didn't really have bounds for the "raw" Dwarf_Abbrev
reading functions, we just "guessed" the maximum length of a uleb128.
This wasn't really correct and wasn't really needed. A struct
Dwarf_Abbrev can only be created by __libdw_getabbrev, which already
checks that the whole abbrev (code, tag, children and attribute
names/forms) is valid. So whenever we use the attrp pointer from the
Dwarf_Abbrev to read the names/forms we already know they are within
the .debug_abbrev bounds.

[PATCH 1/2] libdw: New get_uleb128_unchecked to use with already checked Dwarf_Abbrev.

So the first patch introduces a get_uleb128_unchecked function that is
used for re-reading such uleb128 values. The second patch reduces the
size of the struct Dwarf_Abbrev by not storing the attrcnt and by using
bitfields for has_children and code.

[PATCH 2/2] libdw: Reduce size of struct Dwarf_Abbrev.

The attrcnt was only used by the dwarf_getattrcnt function, which is
only used in one testcase and seems mostly unnecessary for real
programs. The function now explicitly counts the attrs instead of using
a cached value.

The combined patches very slightly reduce the time for parsing abbrevs
and make the abbrev cache somewhat smaller. On my machine
eu-readelf -N --debug-dump=info libstdc++.so.debug goes down from 1.79
to 1.71 secs, and max rss goes down from 15.296 to 14.684 kbytes.

Cheers,

Mark
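
P.S. For readers who don't have the libdw sources at hand, here is a rough, standalone sketch of the idea behind both patches. It is not the code from the patches: every name starting with sketch_ is made up for illustration, the struct layout and bitfield widths are assumptions, and the attribute walk ignores forms that carry extra data in the abbrev (such as DW_FORM_implicit_const). It only shows why a second read of already validated bytes needs no end pointer, and how attributes can be counted on demand instead of being cached.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical bounds-checked ULEB128 reader, the kind used when the
   abbrev is parsed for the first time.  Returns 0 on success and -1 if
   the value runs past END.  */
static int
sketch_uleb128_checked (const unsigned char **p, const unsigned char *end,
                        uint64_t *out)
{
  uint64_t value = 0;
  unsigned int shift = 0;
  while (*p < end)
    {
      unsigned char byte = *(*p)++;
      if (shift < 64)
        value |= (uint64_t) (byte & 0x7f) << shift;
      shift += 7;
      if ((byte & 0x80) == 0)
        {
          *out = value;
          return 0;
        }
    }
  return -1; /* Ran off the end of the data.  */
}

/* Hypothetical unchecked reader.  Only safe on bytes that an earlier
   checked pass has already validated, so no END pointer is needed.  */
static uint64_t
sketch_uleb128_unchecked (const unsigned char **p)
{
  assert (p != NULL && *p != NULL);
  uint64_t value = 0;
  unsigned int shift = 0;
  unsigned char byte;
  do
    {
      byte = *(*p)++;
      if (shift < 64)
        value |= (uint64_t) (byte & 0x7f) << shift;
      shift += 7;
    }
  while (byte & 0x80);
  return value;
}

/* Assumed, simplified stand-in for a slimmed-down abbrev record: no
   cached attribute count, and the flag shares a word with the code.
   The real struct Dwarf_Abbrev has different fields and widths.  */
struct sketch_abbrev
{
  const unsigned char *attrp;   /* Start of the name/form pairs.  */
  unsigned int tag;
  unsigned int has_children : 1;
  unsigned int code : 31;
};

/* Count attributes by walking the name/form pairs up to the 0,0
   terminator instead of reading a stored count.  */
static size_t
sketch_count_attrs (const struct sketch_abbrev *abbrev)
{
  const unsigned char *attrp = abbrev->attrp;
  size_t count = 0;
  for (;;)
    {
      uint64_t name = sketch_uleb128_unchecked (&attrp);
      uint64_t form = sketch_uleb128_unchecked (&attrp);
      if (name == 0 && form == 0)
        return count;
      count++;
    }
}
```

The trade-off is the one described above: counting becomes a linear walk over the attribute list, which is acceptable only because the count is so rarely asked for, while every cached abbrev gets smaller.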