public inbox for dwz@sourceware.org
 help / color / mirror / Atom feed
* Re: [RFC 1/13][odr] Cover letter
  2019-01-01  0:00       ` Tom de Vries
@ 2019-01-01  0:00         ` Michael Matz
  0 siblings, 0 replies; 7+ messages in thread
From: Michael Matz @ 2019-01-01  0:00 UTC (permalink / raw)
  To: Tom de Vries; +Cc: dwz, jakub

Hello,

On Wed, 11 Dec 2019, Tom de Vries wrote:

> > but would expand a little bit on 
> > the explanations.  For --odr I would add a sentence describing the effect 
> > of applying the ODR rule (without going into too much detail), i.e. 
> > mention that this causes type DIEs that have to be the same per C++ rules 
> > (same name basically) to be more forcefully deduped.  Also a 
> > description of the two modes basic/link might be in order for the user to 
> > make a meaningful choice; alteratively describe it as debug/development 
> > option if the user isn't supposed to use it in regular use.
> > 
> 
> I've updated the man page entries with your suggestions.
> 
> Better like this?

Yep.


Ciao,
Michael.

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [RFC 1/13][odr] Cover letter
  2019-01-01  0:00   ` Tom de Vries
@ 2019-01-01  0:00     ` Michael Matz
  2019-01-01  0:00       ` Tom de Vries
  0 siblings, 1 reply; 7+ messages in thread
From: Michael Matz @ 2019-01-01  0:00 UTC (permalink / raw)
  To: Tom de Vries; +Cc: dwz, jakub

Hello,

On Tue, 10 Dec 2019, Tom de Vries wrote:

> > Seriously, this is nice.  The additional reduction of using the ODR rule 
> > on cc1 isn't as big as I thought in the past, probably the largest stuff 
> > in .debug_info aren't type descriptions anymore, but still, it's a 22% 
> > reduction on top, so nothing to sneeze at.  (Maybe for firefox it's more, 
> > I remember Honza moaning about many multiple struct type chains that 
> > only differed in the completeness of their pointer members)
> 
> I'd hope that the --odr-mode=unify in a future patch series will bring
> further improvement, but we'll have to see how much.
> 
> As for this patch series, I'm planning to commit in a couple of days,
> unless there are serious objections.
> 
> Could you at least review the user visible parts? Command line option 
> names, man page entries (and usage ... oops, I forgot to update usage).

If you mean me: I'm fine with the names, but would expand a little bit on 
the explanations.  For --odr I would add a sentence describing the effect 
of applying the ODR rule (without going into too much detail), i.e. 
mention that this causes type DIEs that have to be the same per C++ rules 
(same name basically) to be more forcefully deduped.  Also a 
description of the two modes basic/link might be in order for the user to 
make a meaningful choice; alteratively describe it as debug/development 
option if the user isn't supposed to use it in regular use.

Otherwise I have no comments.


Ciao,
Michael.

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [RFC 1/13][odr] Cover letter
  2019-01-01  0:00 [RFC 1/13][odr] Cover letter Tom de Vries
@ 2019-01-01  0:00 ` Michael Matz
  2019-01-01  0:00   ` Tom de Vries
  2019-01-01  0:00   ` Tom de Vries
  0 siblings, 2 replies; 7+ messages in thread
From: Michael Matz @ 2019-01-01  0:00 UTC (permalink / raw)
  To: Tom de Vries; +Cc: dwz, jakub

Hello,

On Tue, 10 Dec 2019, Tom de Vries wrote:

> This patch series adds optimization option --odr, that exploits the
> one-definition-rule for C++ for struct, class and union.  It's on by
> default.
... 
> Any comments?

Like Yeehah!?  :-)

Seriously, this is nice.  The additional reduction of using the ODR rule 
on cc1 isn't as big as I thought in the past, probably the largest stuff 
in .debug_info aren't type descriptions anymore, but still, it's a 22% 
reduction on top, so nothing to sneeze at.  (Maybe for firefox it's more, 
I remember Honza moaning about many multiple struct type chains that 
only differed in the completeness of their pointer members)


Ciao,
Michael.

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [RFC 1/13][odr] Cover letter
  2019-01-01  0:00 ` Michael Matz
  2019-01-01  0:00   ` Tom de Vries
@ 2019-01-01  0:00   ` Tom de Vries
  2019-01-01  0:00     ` Michael Matz
  1 sibling, 1 reply; 7+ messages in thread
From: Tom de Vries @ 2019-01-01  0:00 UTC (permalink / raw)
  To: Michael Matz; +Cc: dwz, jakub

On 10-12-2019 18:41, Michael Matz wrote:
> Hello,
> 
> On Tue, 10 Dec 2019, Tom de Vries wrote:
> 
>> This patch series adds optimization option --odr, that exploits the
>> one-definition-rule for C++ for struct, class and union.  It's on by
>> default.
> ... 
>> Any comments?
> 
> Like Yeehah!?  :-)
> 

Heh :)

> Seriously, this is nice.  The additional reduction of using the ODR rule 
> on cc1 isn't as big as I thought in the past, probably the largest stuff 
> in .debug_info aren't type descriptions anymore, but still, it's a 22% 
> reduction on top, so nothing to sneeze at.  (Maybe for firefox it's more, 
> I remember Honza moaning about many multiple struct type chains that 
> only differed in the completeness of their pointer members)

I'd hope that the --odr-mode=unify in a future patch series will bring
further improvement, but we'll have to see how much.

As for this patch series, I'm planning to commit in a couple of days,
unless there are serious objections.

Could you at least review the user visible parts? Command line option
names, man page entries (and usage ... oops, I forgot to update usage).

Thanks,
- Tom

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [RFC 1/13][odr] Cover letter
  2019-01-01  0:00 ` Michael Matz
@ 2019-01-01  0:00   ` Tom de Vries
  2019-01-01  0:00   ` Tom de Vries
  1 sibling, 0 replies; 7+ messages in thread
From: Tom de Vries @ 2019-01-01  0:00 UTC (permalink / raw)
  To: Michael Matz; +Cc: dwz, jakub


On 10-12-2019 18:41, Michael Matz wrote:
> The additional reduction of using the ODR rule 
> on cc1 isn't as big as I thought in the past, probably the largest stuff 
> in .debug_info aren't type descriptions anymore, but still, it's a 22% 
> reduction on top, so nothing to sneeze at.  (Maybe for firefox it's more, 
> I remember Honza moaning about many multiple struct type chains that 
> only differed in the completeness of their pointer members)

FYI, I'm currently investigating a factor 3 execution time regression on
clang-10 (the largest I've tested sofar, at 111 million DIEs), and there
the difference between --no-odr and --odr --odr-mode=basic is larger.

--no-odr:
..
$ diff.sh clang-10-10.0.0-0.20190817snap5.fc30.x86_64.debug 1
.debug_info      red: 30.55%    1769020251  1228593816
.debug_abbrev    red: -2.54%      12122350    12430622
.debug_str       red: 0%         135646131   135646131
total            red: 28.18%    1916788732  1376670569
...

--odr --odr-mode=basic:
...
$ diff.sh clang-10-10.0.0-0.20190817snap5.fc30.x86_64.debug 1
.debug_info      red: 66.90%    1769020251  585587333
.debug_abbrev    red: 20.81%      12122350    9599834
.debug_str       red: 0%         135646131  135646131
total            red: 61.88%    1916788732  730833298
...

Thanks,
- Tom

^ permalink raw reply	[flat|nested] 7+ messages in thread

* [RFC 1/13][odr] Cover letter
@ 2019-01-01  0:00 Tom de Vries
  2019-01-01  0:00 ` Michael Matz
  0 siblings, 1 reply; 7+ messages in thread
From: Tom de Vries @ 2019-01-01  0:00 UTC (permalink / raw)
  To: dwz, jakub; +Cc: Michael Matz

Hi,

This patch series adds optimization option --odr, that exploits the
one-definition-rule for C++ for struct, class and union.  It's on by
default.

---
I.   Patch series
II.  Optimization description
III. Optimization modes basic and link
IV.  Effect
V.   Cost
VI.  Testing
VII. Todo
---

I. Patch series

[odr] Add odr variable
[odr] Add lang field to struct dw_cu
[odr] Add die_odr_state field to struct dw_die
[odr] Construct maximal duplicate chains
[odr] Split the maximal duplicate chains
[odr] Combine decls duplicate chain with def duplicate chain
[odr] Add --odr/--no-odr and --odr-mode={basic,link} command line options
[odr, testsuite] Add test-cases odr-{struct,class,union}.sh
[odr, testsuite] Add test-case odr-loc.sh
[odr, testsuite] Add odr-def-decl.sh
[odr] Enable --odr by default
[odr] Add --odr/--no-odr and --odr-mode entries to man page

II. Optimization description

When passing --odr, dwz merges a struct/class/union declaration in one CU
with a corresponding definition with the same name in another CU.

F.i., for dwarf describing compilation units:
...
struct bbb;            // decl
struct ccc { int c; }; // def
struct aaa {
  struct bbb *b;       // pointer to decl
  struct ccc *c;       // pointer to def
};
...
and:
...
struct bbb { int b; }; // def
struct ccc;            // decl
struct aaa {
  struct bbb *b;       // pointer to def
  struct ccc *c;       // pointer to decl
};
...
we manage to get a partial unit containing dwarf describing:
...
struct bbb { int b; }; // def
struct ccc { int c; }; // def
struct aaa {
  struct bbb *b;       // pointer to def
  struct ccc *c;       // pointer to def
}
...

So, instead of two, we get one definition of aaa with both fields pointing to
definitions of bbb and ccc.

III. Optimization modes basic and link

The result at I describes the default optimization mode --odr-mode=link.

A less aggressive optimization mode --odr-mode=basic gets us instead a partial
unit containing dwarf describing:
...
struct aaa {
  struct bbb *b;       // pointer to decl
  struct ccc *c;       // pointer to def
};
...

This mode is provided as a fall back, in case there are problems in
--odr-mode=link.

IV. Effect

We use a cc1 executable to generate executables compressed with and without odr:
...
$ dwz cc1 -lnone --no-odr -o cc1.dwz
$ dwz cc1 -lnone --odr -o cc1.dwz.odr
...

Then we can inspect the difference:
...
$ diff.sh cc1 cc1.dwz
.debug_info      red: 44.84%    111527248  61527733
.debug_abbrev    red: 40.28%      1722726   1028968
.debug_str       red:  0.00%      6609355   6609355
total            red: 42.30%    119859329  69166056
$ diff.sh cc1 cc1.dwz.odr
.debug_info      red: 57.40%    111527248  47516686
.debug_abbrev    red: 73.87%      1722726    450319
.debug_str       red:  0.00%      6609355   6609355
total            red: 54.47%    119859329  54576360
...

[ Note that the total mentioned here relates to the 3 mentioned debug
sections, not the size of all the debug sections or the entire executable.  ]

In summary, the mentioned debug sections are reduced in size by:
- by 42.30% when not using odr, and
- by 54.47% when using --odr,
making the impact of --odr an extra 12,17% of size reduction.

The result for --odr-mode=basic is:
...
$ dwz cc1 -lnone --odr --odr-mode=basic -o cc1.dwz.odr.basic
$ diff.sh cc1 cc1.dwz.odr.basic
.debug_info      red: 56.16%    111527248  48903796
.debug_abbrev    red: 71.48%    1722726  491371
.debug_str       red: 0%        6609355  6609355
total            red: 53.28%    119859329 56004522
...
making the impact of --odr --odr-mode=basic a (slightly lesser) extra 10,98%
of size reduction.

V. Cost

Using the same cc1 example as in IV, we can see the cost of the optimization:
...
$ time.sh dwz cc1 -lnone --no-odr -o cc1.dwz
maxmem: 1259084
real: 5.44
user: 5.26
system: 0.18
$ time.sh dwz cc1 -lnone --odr -o cc1.dwz.odr
maxmem: 1252996
real: 6.02
user: 5.85
system: 0.16
...
So, roughly the same amount of memory, and a bit (~11%) slower.

A more detailed measurement of execution time confirms that:
...
real:  mean:  5282.60  100.00%  stddev:  32.33
       mean:  5843.30  110.61%  stddev:  32.00
user:  mean:  5178.30  100.00%  stddev:  20.03
       mean:  5742.30  110.89%  stddev:  25.21
sys:   mean:   104.00  100.00%  stddev:  38.41
       mean:   100.80   96.92%  stddev:  26.98
...

It's good to note though that without the patch series applied, we use 6.5% less
memory (in absolute numbers: 79.5 MB) than with --no-odr, due to the struct
dw_die not having the die_hash2 field:
...
$ time.sh dwz cc1 -lnone -o cc1.dwz
maxmem: 1177712
real: 5.28
user: 5.11
system: 0.17
...
It needs to be investigated whether it makes sense to get rid of this memory
usage regression for --no-odr.

VI. Testing

The patch series contains test-cases exercising the --odr optimization option.

The patch series has been in conjunction with the gdb testsuite, using target
boards cc-with-dwz.exp and cc-with-dwz-m.exp, both using --odr-mode=basic and
--odr-mode=link.

VII. Todo

The optimization is disabled in low memory mode.  It needs to be investigated
whether it makes sense to enable it in low memory mode.  The optimization
requires extra memory (in theory, atm we lazily claim the same amount with
and without the optimization, but that might change), which conflicts with
the idea of the low memory mode.

Any comments?

Thanks,
- Tom

[odr] Cover letter

2019-12-10  Tom de Vries  <tdevries@suse.de>

	* COVER-LETTER: New file, meant to avoid dropping empty commits
	containing the cover letter in the log message.

---
 COVER-LETTER | 1 +
 1 file changed, 1 insertion(+)

diff --git a/COVER-LETTER b/COVER-LETTER
new file mode 100644
index 0000000..9176886
--- /dev/null
+++ b/COVER-LETTER
@@ -0,0 +1 @@
+One definition rule optimization

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [RFC 1/13][odr] Cover letter
  2019-01-01  0:00     ` Michael Matz
@ 2019-01-01  0:00       ` Tom de Vries
  2019-01-01  0:00         ` Michael Matz
  0 siblings, 1 reply; 7+ messages in thread
From: Tom de Vries @ 2019-01-01  0:00 UTC (permalink / raw)
  To: Michael Matz; +Cc: dwz, jakub

[-- Attachment #1: Type: text/plain, Size: 1707 bytes --]



On 11-12-2019 14:15, Michael Matz wrote:
> Hello,
> 
> On Tue, 10 Dec 2019, Tom de Vries wrote:
> 
>>> Seriously, this is nice.  The additional reduction of using the ODR rule 
>>> on cc1 isn't as big as I thought in the past, probably the largest stuff 
>>> in .debug_info aren't type descriptions anymore, but still, it's a 22% 
>>> reduction on top, so nothing to sneeze at.  (Maybe for firefox it's more, 
>>> I remember Honza moaning about many multiple struct type chains that 
>>> only differed in the completeness of their pointer members)
>>
>> I'd hope that the --odr-mode=unify in a future patch series will bring
>> further improvement, but we'll have to see how much.
>>
>> As for this patch series, I'm planning to commit in a couple of days,
>> unless there are serious objections.
>>
>> Could you at least review the user visible parts? Command line option 
>> names, man page entries (and usage ... oops, I forgot to update usage).
> 

That's done: https://sourceware.org/ml/dwz/2019-q4/msg00147.html .

> If you mean me: I'm fine with the names,

Ack.

> but would expand a little bit on 
> the explanations.  For --odr I would add a sentence describing the effect 
> of applying the ODR rule (without going into too much detail), i.e. 
> mention that this causes type DIEs that have to be the same per C++ rules 
> (same name basically) to be more forcefully deduped.  Also a 
> description of the two modes basic/link might be in order for the user to 
> make a meaningful choice; alteratively describe it as debug/development 
> option if the user isn't supposed to use it in regular use.
> 

I've updated the man page entries with your suggestions.

Better like this?

Thanks,
- Tom

[-- Attachment #2: 0013-odr-Add-odr-no-odr-and-odr-mode-entries-to-man-page.patch --]
[-- Type: text/x-patch, Size: 1511 bytes --]

[odr] Add --odr/--no-odr and --odr-mode entries to man page

Add an --odr/--no-odr entry and a --odr-mode entry to the man page, advertising
the optimization to the user.

2019-12-10  Tom de Vries  <tdevries@suse.de>

	* dwz.1: Add --odr/--no-odr and a --odr-mode entries.

---
 dwz.1 | 17 +++++++++++++++++
 1 file changed, 17 insertions(+)

diff --git a/dwz.1 b/dwz.1
index a5a1ef6..3df7e59 100644
--- a/dwz.1
+++ b/dwz.1
@@ -88,6 +88,23 @@ containing more than
 \fICOUNT\fR DIEs at all.  The default is 50 million DIEs.  Specifying none as
 argument disables the limit.
 .TP
+.B \-\-odr / \-\-no-odr
+Enable/disable One-Definition-Rule optimization for C++ compilation units.
+This optimization causes struct/union/class DIEs with the same name to be
+considered equal.  This has the effect that DIEs referring to distinct DIEs
+representing the same type (like f.i. pointer type DIEs) are considered equal,
+and may be deduplicated.
+Enabled by default.
+.TP
+.B \-\-odr-mode=<basic|link>
+Set the One-Definition-Rule optimization aggressiveness: basic or link.
+When using the link setting, the optimization will attempt to replace
+declarations of a struct/union/class with a corresponding definition.  When
+using the basic setting, that part of the optimization is disabled.
+In normal operation, the link setting should be used.  The basic setting is
+provided only as fallback in case of problems with the link setting.  Set to
+link by default.
+.TP
 .B \-? \-\-help
 Print short help and exit.
 .TP

^ permalink raw reply	[flat|nested] 7+ messages in thread

end of thread, other threads:[~2019-12-12 16:18 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-01-01  0:00 [RFC 1/13][odr] Cover letter Tom de Vries
2019-01-01  0:00 ` Michael Matz
2019-01-01  0:00   ` Tom de Vries
2019-01-01  0:00   ` Tom de Vries
2019-01-01  0:00     ` Michael Matz
2019-01-01  0:00       ` Tom de Vries
2019-01-01  0:00         ` Michael Matz

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).