public inbox for dwz@sourceware.org
* [Bug default/24388] Disabling DIE deduplication improves compression for hello
  2019-01-01  0:00 [Bug default/24388] New: Disabling DIE deduplication improves compression for hello vries at gcc dot gnu.org
@ 2019-01-01  0:00 ` vries at gcc dot gnu.org
  2020-01-01  0:00 ` vries at gcc dot gnu.org
  1 sibling, 0 replies; 3+ messages in thread
From: vries at gcc dot gnu.org @ 2019-01-01  0:00 UTC (permalink / raw)
  To: dwz

https://sourceware.org/bugzilla/show_bug.cgi?id=24388

--- Comment #1 from Tom de Vries <vries at gcc dot gnu.org> ---
I tried the same experiment with the dwz binary itself as input:
...
$ dwz dwz-for-test -o dwz-for-test.dwz
$ diff.sh dwz-for-test dwz-for-test.dwz 
.debug_info      red: 15%       119827  102801
.debug_abbrev    red: 2%        4009    3964
.debug_str       red: 0%        20731   20731
total            red: 12%       144567  127496
$ dwz dwz-for-test -o dwz-for-test.dwz.2
$ diff.sh dwz-for-test dwz-for-test.dwz.2 
.debug_info      red: 14%       119827  103872
.debug_abbrev    red: 23%       4009    3094
.debug_str       red: 0%        20731   20731
total            red: 12%       144567  127697
...
This seems to be around the point where using DIE deduplication becomes better
than not using it: the totals differ by only 201 bytes.

So, given these DIE counts:
...
$ count-dies.sh hello
130
$ count-dies.sh dwz-for-test
10.393
$ count-dies.sh cc1
10.188.941
...
perhaps a cut-off point of 100.000 DIEs would do.
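
A minimal sketch of such a cut-off check, for illustration only. It assumes
the dots in the counts above are thousands separators (so 10.393 is 10393 and
100.000 is one hundred thousand). The names DIE_CUTOFF and try_without_dedup
are hypothetical and not part of dwz.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed cut-off, taken from the suggestion above.  */
#define DIE_CUTOFF 100000u

/* Hypothetical helper: return true if the input is small enough that
   also producing the output without DIE deduplication is worth trying.  */
bool
try_without_dedup (uint64_t die_count)
{
  return die_count <= DIE_CUTOFF;
}
```

With the counts above, hello (130) and dwz-for-test (10393) fall below the
cut-off, while cc1 (10188941) does not. For dwz-for-test the deduplicated
output would still be chosen in the end, since it is slightly smaller.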

-- 
You are receiving this mail because:
You are on the CC list for the bug.


* [Bug default/24388] New: Disabling DIE deduplication improves compression for hello
@ 2019-01-01  0:00 vries at gcc dot gnu.org
  2019-01-01  0:00 ` [Bug default/24388] " vries at gcc dot gnu.org
  2020-01-01  0:00 ` vries at gcc dot gnu.org
  0 siblings, 2 replies; 3+ messages in thread
From: vries at gcc dot gnu.org @ 2019-01-01  0:00 UTC (permalink / raw)
  To: dwz

https://sourceware.org/bugzilla/show_bug.cgi?id=24388

            Bug ID: 24388
           Summary: Disabling DIE deduplication improves compression for
                    hello
           Product: dwz
           Version: unspecified
            Status: NEW
          Severity: enhancement
          Priority: P2
         Component: default
          Assignee: nobody at sourceware dot org
          Reporter: vries at gcc dot gnu.org
                CC: dwz at sourceware dot org
  Target Milestone: ---

If we run dwz on a hello world executable, we measure a reduction of 8% in size
of the relevant debug sections:
...
$ gcc hello.c -g
$ dwz hello -o hello.dwz
$ diff.sh hello hello.dwz 
.debug_info      red: 17%       1467    1221
.debug_abbrev    red: 7%        624     584
.debug_str       red: 0%        1619    1619
total            red: 8%        3710    3424
...
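
The diff.sh script itself is not shown in this report. The percentages it
prints are consistent with the size reduction rounded up to a whole percent,
so the following sketch assumes that behaviour. The function name
reduction_pct is hypothetical.

```c
#include <stdint.h>

/* Assumed reading of the "red:" column: percentage by which a section
   shrank, rounded up to a whole percent.  Returns 0 if nothing shrank.  */
uint64_t
reduction_pct (uint64_t old_size, uint64_t new_size)
{
  if (old_size == 0 || new_size >= old_size)
    return 0;
  /* Integer ceiling of 100 * (old - new) / old.  */
  return ((old_size - new_size) * 100 + old_size - 1) / old_size;
}
```

For example, reduction_pct (3710, 3424) gives 8, matching the total line
above.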

However, if we disable the DIE deduplication optimization, like so:
...
diff --git a/dwz.c b/dwz.c
index 045bda5..4b9b5e6 100644
--- a/dwz.c
+++ b/dwz.c
@@ -5038,8 +5038,8 @@ read_debug_info (DSO *dso, int kind)
          dump_dies (0, cu->cu_die);
 #endif

-         if (find_dups (cu->cu_die))
-           goto fail;
+         //if (find_dups (cu->cu_die))
+         //goto fail;
        }
       if (unlikely (kind == DEBUG_TYPES))
        {
@@ -11080,9 +11080,9 @@ dwz
       ret = read_dwarf (dso, quiet && outfile == NULL);
       if (ret)
        cleanup ();
-      else if (partition_dups ()
-              || create_import_tree ()
-              || (unlikely (fi_multifile)
+      else if (// partition_dups ()
+              // || create_import_tree ()
+              (unlikely (fi_multifile)
                   && (remove_empty_pus ()
                       || read_macro (dso)))
               || read_debug_info (dso, DEBUG_TYPES)
...

we get a better result (12%) instead:
...
$ dwz hello -o hello.dwz.2
$ diff.sh hello hello.dwz.2 
.debug_info      red: 20%       1467    1183
.debug_abbrev    red: 25%       624     474
.debug_str       red: 0%        1619    1619
total            red: 12%       3710    3276
...

It would be nice if we could pick up the 12% benefit here by generating this
output as an intermediate step and preferring it if it is smaller than the
result of the subsequent DIE deduplication optimization.
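
The selection step could look roughly like the sketch below. It only compares
the section sizes that diff.sh reports; the struct, the enum and the function
names are hypothetical and do not exist in dwz.c.

```c
#include <stdint.h>

/* Sizes in bytes of the sections that diff.sh reports.  */
struct debug_sizes
{
  uint64_t info;    /* .debug_info */
  uint64_t abbrev;  /* .debug_abbrev */
  uint64_t str;     /* .debug_str */
};

enum dwz_choice { KEEP_DEDUP, KEEP_INTERMEDIATE };

uint64_t
total_size (const struct debug_sizes *s)
{
  return s->info + s->abbrev + s->str;
}

/* Prefer the intermediate (no deduplication) output only if it is
   strictly smaller than the deduplicated one.  */
enum dwz_choice
choose_output (const struct debug_sizes *intermediate,
               const struct debug_sizes *dedup)
{
  return (total_size (intermediate) < total_size (dedup)
          ? KEEP_INTERMEDIATE : KEEP_DEDUP);
}
```

For hello, the intermediate total of 3276 bytes is below the deduplicated
total of 3424 bytes, so the intermediate output would be kept.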

I tried the same experiment with a cc1 (from pr24275, with the tentative fix
applied):
...
$ dwz cc1 -o cc1.dwz
$ diff.sh cc1 cc1.dwz 
.debug_info      red: 45%       111527248 61570632
.debug_abbrev    red: 41%       1722726    1030935
.debug_str       red: 0%        6609355    6609355
total            red: 43%       119859329 69210922
$ dwz cc1 -o cc1.dwz.2
$ diff.sh cc1 cc1.dwz.2 
.debug_info      red: 11%       111527248 100313798
.debug_abbrev    red: 11%       1722726    1542574
.debug_str       red: 0%        6609355    6609355
total            red: 10%       119859329 108465727
...
Here we see the opposite result.

By disabling the intermediate step above some cut-off point (say, x DIEs),
we might be able to get:
- better compression for smaller programs,
- without spending noticeable extra time for smaller programs, and
- without spending extra time for larger programs.


* [Bug default/24388] Disabling DIE deduplication improves compression for hello
  2019-01-01  0:00 [Bug default/24388] New: Disabling DIE deduplication improves compression for hello vries at gcc dot gnu.org
  2019-01-01  0:00 ` [Bug default/24388] " vries at gcc dot gnu.org
@ 2020-01-01  0:00 ` vries at gcc dot gnu.org
  1 sibling, 0 replies; 3+ messages in thread
From: vries at gcc dot gnu.org @ 2020-01-01  0:00 UTC (permalink / raw)
  To: dwz

https://sourceware.org/bugzilla/show_bug.cgi?id=24388

--- Comment #2 from Tom de Vries <vries at gcc dot gnu.org> ---
(In reply to Tom de Vries from comment #0)
> However, if we disable the DIE deduplication optimization, like so:

I've added a commit "Add --devel-deduplication-mode={none,intra-cu,inter-cu}".


end of thread, other threads:[~2020-02-17 11:30 UTC | newest]
