Date: Tue, 01 Jan 2019 00:00:00 -0000
From: Tom de Vries
To: dwz@sourceware.org, jakub@redhat.com
Cc: Michael Matz
Subject: [RFC 1/13][odr] Cover letter
Message-ID: <20191210171830.GA13804@delia>

Hi,

This patch series adds an optimization option, --odr, which exploits the
one-definition rule of C++ for struct, class and union types.  It's on by
default.

---
I.   Patch series
II.  Optimization description
III. Optimization modes basic and link
IV.  Effect
V.   Cost
VI.  Testing
VII. Todo
---

I.
Patch series

[odr] Add odr variable
[odr] Add lang field to struct dw_cu
[odr] Add die_odr_state field to struct dw_die
[odr] Construct maximal duplicate chains
[odr] Split the maximal duplicate chains
[odr] Combine decls duplicate chain with def duplicate chain
[odr] Add --odr/--no-odr and --odr-mode={basic,link} command line options
[odr, testsuite] Add test-cases odr-{struct,class,union}.sh
[odr, testsuite] Add test-case odr-loc.sh
[odr, testsuite] Add odr-def-decl.sh
[odr] Enable --odr by default
[odr] Add --odr/--no-odr and --odr-mode entries to man page

II. Optimization description

When passing --odr, dwz merges a struct/class/union declaration in one CU
with a corresponding definition with the same name in another CU.

For instance, for dwarf describing compilation units:
...
struct bbb; // decl
struct ccc { int c; }; // def
struct aaa {
  struct bbb *b; // pointer to decl
  struct ccc *c; // pointer to def
};
...
and:
...
struct bbb { int b; }; // def
struct ccc; // decl
struct aaa {
  struct bbb *b; // pointer to def
  struct ccc *c; // pointer to decl
};
...
we manage to get a partial unit containing dwarf describing:
...
struct bbb { int b; }; // def
struct ccc { int c; }; // def
struct aaa {
  struct bbb *b; // pointer to def
  struct ccc *c; // pointer to def
};
...

So, instead of two definitions of aaa, we get one, with both fields pointing
to definitions of bbb and ccc.

III. Optimization modes basic and link

The result at II describes the default optimization mode, --odr-mode=link.

A less aggressive optimization mode, --odr-mode=basic, gets us instead a
partial unit containing dwarf describing:
...
struct aaa {
  struct bbb *b; // pointer to decl
  struct ccc *c; // pointer to def
};
...

This mode is provided as a fallback, in case there are problems with
--odr-mode=link.

IV. Effect

We use a cc1 executable to generate executables compressed with and without
odr:
...
$ dwz cc1 -lnone --no-odr -o cc1.dwz
$ dwz cc1 -lnone --odr -o cc1.dwz.odr
...

Then we can inspect the difference:
...
$ diff.sh cc1 cc1.dwz
.debug_info   red: 44.84%  111527248  61527733
.debug_abbrev red: 40.28%    1722726   1028968
.debug_str    red: 0.00%     6609355   6609355
total         red: 42.30%  119859329  69166056
$ diff.sh cc1 cc1.dwz.odr
.debug_info   red: 57.40%  111527248  47516686
.debug_abbrev red: 73.87%    1722726    450319
.debug_str    red: 0.00%     6609355   6609355
total         red: 54.47%  119859329  54576360
...
[ Note that the total mentioned here relates to the 3 mentioned debug
sections, not the size of all the debug sections or the entire executable. ]

In summary, the mentioned debug sections are reduced in size:
- by 42.30% when not using odr, and
- by 54.47% when using --odr,
making the impact of --odr an extra 12.17 percentage points of size
reduction.

The result for --odr-mode=basic is:
...
$ dwz cc1 -lnone --odr --odr-mode=basic -o cc1.dwz.odr.basic
$ diff.sh cc1 cc1.dwz.odr.basic
.debug_info   red: 56.16%  111527248  48903796
.debug_abbrev red: 71.48%    1722726    491371
.debug_str    red: 0%        6609355   6609355
total         red: 53.28%  119859329  56004522
...
making the impact of --odr --odr-mode=basic a (slightly lesser) extra 10.98
percentage points of size reduction.

V. Cost

Using the same cc1 example as in IV, we can see the cost of the optimization:
...
$ time.sh dwz cc1 -lnone --no-odr -o cc1.dwz
maxmem: 1259084
real:   5.44
user:   5.26
system: 0.18
$ time.sh dwz cc1 -lnone --odr -o cc1.dwz.odr
maxmem: 1252996
real:   6.02
user:   5.85
system: 0.16
...

So, roughly the same amount of memory, and a bit (~11%) slower.

A more detailed measurement of execution time confirms that:
...
real:
mean: 5282.60  100.00%  stddev: 32.33
mean: 5843.30  110.61%  stddev: 32.00
user:
mean: 5178.30  100.00%  stddev: 20.03
mean: 5742.30  110.89%  stddev: 25.21
sys:
mean:  104.00  100.00%  stddev: 38.41
mean:  100.80   96.92%  stddev: 26.98
...

Note, though, that without the patch series applied, dwz uses 6.5% less
memory (in absolute numbers: 79.5 MB) than the patched dwz with --no-odr,
because struct dw_die then does not have the die_hash2 field:
...
$ time.sh dwz cc1 -lnone -o cc1.dwz
maxmem: 1177712
real:   5.28
user:   5.11
system: 0.17
...

It needs to be investigated whether it makes sense to get rid of this memory
usage regression for --no-odr.

VI. Testing

The patch series contains test-cases exercising the --odr optimization
option.

The patch series has been tested in conjunction with the gdb testsuite, using
target boards cc-with-dwz.exp and cc-with-dwz-m.exp, with both
--odr-mode=basic and --odr-mode=link.

VII. Todo

The optimization is disabled in low memory mode.  It needs to be investigated
whether it makes sense to enable it in low memory mode.  In theory, the
optimization requires extra memory (at the moment we lazily claim the same
amount with and without the optimization, but that might change), which
conflicts with the idea of the low memory mode.

Any comments?

Thanks,
- Tom

[odr] Cover letter

2019-12-10  Tom de Vries

	* COVER-LETTER: New file, meant to avoid dropping empty commits
	containing the cover letter in the log message.

---
 COVER-LETTER | 1 +
 1 file changed, 1 insertion(+)

diff --git a/COVER-LETTER b/COVER-LETTER
new file mode 100644
index 0000000..9176886
--- /dev/null
+++ b/COVER-LETTER
@@ -0,0 +1 @@
+One definition rule optimization