* [Bug default/24388] Disabling DIE deduplication improves compression for hello
2019-01-01 0:00 [Bug default/24388] New: Disabling DIE deduplication improves compression for hello vries at gcc dot gnu.org
@ 2019-01-01 0:00 ` vries at gcc dot gnu.org
2020-01-01 0:00 ` vries at gcc dot gnu.org
1 sibling, 0 replies; 3+ messages in thread
From: vries at gcc dot gnu.org @ 2019-01-01 0:00 UTC (permalink / raw)
To: dwz
https://sourceware.org/bugzilla/show_bug.cgi?id=24388
--- Comment #1 from Tom de Vries <vries at gcc dot gnu.org> ---
I tried the same experiment with dwz itself as the dwz input:
...
$ dwz dwz-for-test -o dwz-for-test.dwz
$ diff.sh dwz-for-test dwz-for-test.dwz
.debug_info red: 15% 119827 102801
.debug_abbrev red: 2% 4009 3964
.debug_str red: 0% 20731 20731
total red: 12% 144567 127496
$ dwz dwz-for-test -o dwz-for-test.dwz.2
$ diff.sh dwz-for-test dwz-for-test.dwz.2
.debug_info red: 14% 119827 103872
.debug_abbrev red: 23% 4009 3094
.debug_str red: 0% 20731 20731
total red: 12% 144567 127697
...
This seems to be a point where using DIE deduplication is only marginally
better than not using it (127496 versus 127697 bytes in total).
So, given these DIE counts:
...
$ count-dies.sh hello
130
$ count-dies.sh dwz-for-test
10.393
$ count-dies.sh cc1
10.188.941
...
perhaps a cut-off point of 100.000 DIEs would do.
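A minimal sketch of such a cut-off check, reading the counts above as 130,
10,393 and 10,188,941 DIEs. The names DIE_COUNT_CUTOFF and try_without_dedup
are hypothetical and do not exist in dwz; the value is just the one floated
above, not a tuned constant.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical cut-off: the value suggested in this comment, not a
   constant taken from dwz.  */
#define DIE_COUNT_CUTOFF 100000

/* Return true if an input with DIE_COUNT DIEs is small enough that
   generating the no-deduplication output as well is worth trying.  */
bool
try_without_dedup (size_t die_count)
{
  return die_count < DIE_COUNT_CUTOFF;
}
```

With this value, hello and dwz-for-test would still get the extra attempt,
while cc1 would skip it.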
--
You are receiving this mail because:
You are on the CC list for the bug.
* [Bug default/24388] New: Disabling DIE deduplication improves compression for hello
From: vries at gcc dot gnu.org @ 2019-01-01 0:00 UTC (permalink / raw)
To: dwz
https://sourceware.org/bugzilla/show_bug.cgi?id=24388
Bug ID: 24388
Summary: Disabling DIE deduplication improves compression for
hello
Product: dwz
Version: unspecified
Status: NEW
Severity: enhancement
Priority: P2
Component: default
Assignee: nobody at sourceware dot org
Reporter: vries at gcc dot gnu.org
CC: dwz at sourceware dot org
Target Milestone: ---
If we run dwz on a hello world executable, we measure a reduction of 8% in the
size of the relevant debug sections:
...
$ gcc hello.c -g
$ dwz hello -o hello.dwz
$ diff.sh hello hello.dwz
.debug_info red: 17% 1467 1221
.debug_abbrev red: 7% 624 584
.debug_str red: 0% 1619 1619
total red: 8% 3710 3424
...
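For reference, the percentages in these tables are consistent with rounding
the reduction up to a whole percent (for example 3710 -> 3424 is a 7.7%
reduction, reported as 8%). The sketch below reproduces that arithmetic;
diff.sh itself is not shown in this report, so the rounding rule is an
inference from the figures, and reduction_pct is a hypothetical name.

```c
#include <assert.h>

/* Percentage by which AFTER is smaller than BEFORE, rounded up to a
   whole percent.  The rounding is an assumption inferred from the
   figures in this report, not taken from diff.sh.  */
unsigned int
reduction_pct (unsigned long long before, unsigned long long after)
{
  assert (before > 0 && after <= before);
  /* Integer division truncates the remaining share, which rounds the
     reduction itself up.  */
  return (unsigned int) (100 - after * 100 / before);
}
```

Called with the hello figures, reduction_pct (1467, 1221) gives 17 and
reduction_pct (3710, 3424) gives 8, matching the table.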
However, if we disable the DIE deduplication optimization, like so:
...
diff --git a/dwz.c b/dwz.c
index 045bda5..4b9b5e6 100644
--- a/dwz.c
+++ b/dwz.c
@@ -5038,8 +5038,8 @@ read_debug_info (DSO *dso, int kind)
dump_dies (0, cu->cu_die);
#endif
- if (find_dups (cu->cu_die))
- goto fail;
+ //if (find_dups (cu->cu_die))
+ //goto fail;
}
if (unlikely (kind == DEBUG_TYPES))
{
@@ -11080,9 +11080,9 @@ dwz
ret = read_dwarf (dso, quiet && outfile == NULL);
if (ret)
cleanup ();
- else if (partition_dups ()
- || create_import_tree ()
- || (unlikely (fi_multifile)
+ else if (// partition_dups ()
+ // || create_import_tree ()
+ (unlikely (fi_multifile)
&& (remove_empty_pus ()
|| read_macro (dso)))
|| read_debug_info (dso, DEBUG_TYPES)
...
we get a better result (12%) instead:
...
$ dwz hello -o hello.dwz.2
$ diff.sh hello hello.dwz.2
.debug_info red: 20% 1467 1183
.debug_abbrev red: 25% 624 474
.debug_str red: 0% 1619 1619
total red: 12% 3710 3276
...
It would be nice if we could pick up the 12% benefit here, by generating this
output as an intermediate step, and preferring it if it's smaller than the
result of the subsequent DIE deduplication optimization.
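The selection step could look roughly like the sketch below. It only
compares section sizes; how dwz would keep both outputs around is not
addressed. struct debug_sizes, total_size and pick_smaller are hypothetical
names, not part of dwz.

```c
#include <stddef.h>

/* Sizes in bytes of the debug sections reported by diff.sh.  */
struct debug_sizes
{
  size_t info;   /* .debug_info */
  size_t abbrev; /* .debug_abbrev */
  size_t str;    /* .debug_str */
};

/* Sum of the three section sizes.  */
size_t
total_size (const struct debug_sizes *s)
{
  return s->info + s->abbrev + s->str;
}

/* Return whichever candidate has the smaller total.  A tie goes to the
   deduplicated result, which is what dwz produces today.  */
const struct debug_sizes *
pick_smaller (const struct debug_sizes *intermediate,
              const struct debug_sizes *deduplicated)
{
  return (total_size (intermediate) < total_size (deduplicated)
          ? intermediate : deduplicated);
}
```

For hello this would pick the intermediate output (3276 bytes against 3424).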
I tried the same experiment with a cc1 (from pr24275, with the tentative fix
applied):
...
$ dwz cc1 -o cc1.dwz
$ diff.sh cc1 cc1.dwz
.debug_info red: 45% 111527248 61570632
.debug_abbrev red: 41% 1722726 1030935
.debug_str red: 0% 6609355 6609355
total red: 43% 119859329 69210922
$ dwz cc1 -o cc1.dwz.2
$ diff.sh cc1 cc1.dwz.2
.debug_info red: 11% 111527248 100313798
.debug_abbrev red: 11% 1722726 1542574
.debug_str red: 0% 6609355 6609355
total red: 10% 119859329 108465727
...
Here we see the opposite result: with deduplication disabled, the total
reduction drops from 43% to 10%.
By disabling the intermediate step above some cut-off point (say, x DIEs),
we might be able to get:
- better compression for smaller programs
- without spending noticeable extra time for smaller programs
- without spending extra time for larger programs.
* [Bug default/24388] Disabling DIE deduplication improves compression for hello
From: vries at gcc dot gnu.org @ 2020-01-01 0:00 UTC (permalink / raw)
To: dwz
https://sourceware.org/bugzilla/show_bug.cgi?id=24388
--- Comment #2 from Tom de Vries <vries at gcc dot gnu.org> ---
(In reply to Tom de Vries from comment #0)
> However, if we disable the DIE deduplication optimization, like so:
I've added a commit "Add --devel-deduplication-mode={none,intra-cu,inter-cu}".
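As an illustration only, the three values named by that option could be
mapped to an enum as sketched below. This is not the code from the commit;
enum dedup_mode and parse_dedup_mode are hypothetical names, and only the
strings "none", "intra-cu" and "inter-cu" come from the option itself.

```c
#include <string.h>

/* Hypothetical representation of the three option values.  */
enum dedup_mode
{
  DEDUP_NONE,
  DEDUP_INTRA_CU,
  DEDUP_INTER_CU,
  DEDUP_INVALID /* Returned for any unrecognized string.  */
};

/* Map the argument of --devel-deduplication-mode to an enum value.  */
enum dedup_mode
parse_dedup_mode (const char *arg)
{
  if (strcmp (arg, "none") == 0)
    return DEDUP_NONE;
  if (strcmp (arg, "intra-cu") == 0)
    return DEDUP_INTRA_CU;
  if (strcmp (arg, "inter-cu") == 0)
    return DEDUP_INTER_CU;
  return DEDUP_INVALID;
}
```

Presumably "none" corresponds to the experiment quoted above, with
deduplication disabled entirely.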
end of thread, other threads:[~2020-02-17 11:30 UTC | newest]