[RFC] Implement C++ One Definition Rule for struct, class and union

* [RFC] Implement C++ One Definition Rule for struct, class and union
@ 2019-01-01  0:00 Tom de Vries
  2019-01-01  0:00 ` Tom de Vries
  2019-01-01  0:00 ` Michael Matz
  0 siblings, 2 replies; 3+ messages in thread
From: Tom de Vries @ 2019-01-01  0:00 UTC
  To: dwz, Jakub Jelinek, Mark Wielaard; +Cc: Michael Matz

[-- Attachment #1: Type: text/plain, Size: 645 bytes --]

Hi,

I reached a feature-complete state for the patch series implementing the
ODR optimization (PR dwz/24198).  [ Feature-complete meaning, AFAICT it
does what it's supposed to do, though it may be able to do it quicker
and/or using less memory. ]

The patch series comes with a number of test-cases testing the
optimization.  Furthermore, the optimization was tested on-by-default in
conjunction with the gdb testsuite, using target boards cc-with-dwz.exp
and cc-with-dwz-m.exp.

I'm attaching the patch series here in git bundle and patches tarball
formats, and including the complete cover letter below.

Any comments welcome.

Thanks,
- Tom

[-- Attachment #2: 0001-COVER-LETTER-Implement-odr-for-struct-class-and-union.patch --]
[-- Type: text/x-patch, Size: 5923 bytes --]

[COVER-LETTER] Implement odr for struct, class and union

PR dwz/24198

Add optimization options --odr and --odr-unify, which exploit the C++
one-definition rule (ODR) for struct, class and union types.

I. -DODR

The optimization options are enabled in the build by -DODR.  They are enabled
conditionally because a build with -DODR regresses in memory usage and
execution time, even when the optimization options are not used.

The patch series sets -DODR for both dwz and dwz-for-test, but that might not
be part of an initial commit.

II. Optimization option --odr

When passing --odr, merge a struct/class/union declaration in one CU with a
corresponding definition with the same name in another CU.

For instance, for DWARF describing the compilation units:
...
struct bbb;            // decl
struct ccc { int c; }; // def
struct aaa {
  struct bbb *b;       // pointer to decl
  struct ccc *c;       // pointer to def
};
...
and:
...
struct bbb { int b; }; // def
struct ccc;            // decl
struct aaa {
  struct bbb *b;       // pointer to def
  struct ccc *c;       // pointer to decl
};
...
we manage to get a partial unit containing DWARF describing:
...
struct bbb { int b; }; // def
struct ccc { int c; }; // def
struct aaa {
  struct bbb *b;       // pointer to def
  struct ccc *c;       // pointer to def
};
...
So, one definition of aaa with both fields pointing to definitions of bbb and
ccc.
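
As a rough illustration of the merge described above (a minimal sketch, not
the dwz implementation): the hypothetical TypeDie struct and merge_by_name
function below model each struct/class/union DIE by its name and a
declaration flag, and prefer a definition over a declaration when both exist
under the same name.  All names here are assumptions made for this example.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Hypothetical, simplified stand-in for a struct/class/union DIE.
struct TypeDie {
  std::string name;                  // models DW_AT_name
  bool is_declaration;               // models DW_AT_declaration
  std::vector<std::string> members;  // member names; empty for a declaration
};

// For each type name, keep one representative across all CUs: a definition
// if any CU provides one, otherwise the declaration.
std::map<std::string, TypeDie>
merge_by_name(const std::vector<std::vector<TypeDie>> &cus) {
  std::map<std::string, TypeDie> merged;
  for (const auto &cu : cus) {
    for (const auto &die : cu) {
      assert(!die.name.empty());  // this sketch only handles named types
      auto it = merged.find(die.name);
      if (it == merged.end())
        merged.emplace(die.name, die);
      else if (it->second.is_declaration && !die.is_declaration)
        it->second = die;  // a definition replaces an earlier declaration
    }
  }
  return merged;
}
```

Fed the two compilation units from the example, this yields one entry each
for aaa, bbb and ccc, with bbb and ccc both resolved to their definitions.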

III. Optimization option --odr-unify

When passing --odr-unify, we enable the optimization further.  For example,
DIEs describing a struct containing member templates may have a different
number and type of instantiations of the member template in different
compilation units.  The optimization merges these DIEs by constructing a union
of the struct members.

This type of compression is lossy in a way that is visible in gdb.  Before
compression we get, for instance, either:
...
(gdb) ptype aaa
type = struct aaa {
    int var;
  public:
    void foo<float>(float);
    void foo<int>(int);
}
...
or:
...
(gdb) ptype aaa
type = struct aaa {
    int var;
  public:
    void foo<double>(double);
}
...
depending on which compilation unit is in scope, while after compression we
only have this unified definition:
...
(gdb) ptype aaa
type = struct aaa {
    int var;
  public:
    void foo<float>(float);
    void foo<int>(int);
    void foo<double>(double);
}
...
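
The member union can be pictured with a small sketch.  This is an
illustration under assumptions, not the dwz code: Member and unify_members
are hypothetical names, and members are matched purely by name, keeping the
first one seen.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical, simplified model of a struct member (data or function).
struct Member {
  std::string name;  // e.g. "var" or "foo<float>"
  std::string type;  // e.g. "int" or "void (float)"
};

// Build the union of the members of several variants of the same struct.
// Order of first appearance is kept; a member whose name is already present
// is skipped, so the first-seen member wins.
std::vector<Member>
unify_members(const std::vector<std::vector<Member>> &variants) {
  std::vector<Member> result;
  for (const auto &variant : variants) {
    for (const auto &m : variant) {
      assert(!m.name.empty());  // this sketch only handles named members
      bool seen = false;
      for (const auto &r : result)
        if (r.name == m.name) { seen = true; break; }
      if (!seen)
        result.push_back(m);
    }
  }
  return result;
}
```

For the two variants of struct aaa shown above, this gives var, foo<float>,
foo<int> and foo<double>, in that order.  Under the match-by-name assumption,
two members with the same name but different types collapse into the first
one seen.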

IV. ODR errors

No effort is made to detect ODR errors.

The behaviour of the optimization in the presence of ODR errors is as follows.

Consider a test-case consisting of odr-error.h, odr-error.c, odr-error-2.c and
main.c:
...
$ cat odr-error.h
 struct aaa {
   FIELD;
 };
$ cat odr-error.c
 #define FIELD FIELD1
 #include "odr-error.h"

 struct aaa var1;
$ cat odr-error-2.c
 #define FIELD FIELD2
 #include "odr-error.h"

 struct aaa var2;
$ cat main.c
 int main (void) { return 0; }
...

When we define struct aaa with two different fields with different
names, we get both fields in the resulting struct:
...
$ g++ main.c -DFIELD1="int x" -DFIELD2="float y" odr-error.c odr-error-2.c -g
$ dwz --odr-unify --devel-ignore-size a.out
$ gdb -batch a.out -ex "ptype var1" -ex "ptype var2"
type = struct aaa {
    int x;
    float y;
}
type = struct aaa {
    int x;
    float y;
}
...

OTOH, if we define struct aaa with two different fields with the same name, we
get only one of the two fields:
...
$ g++ main.c -DFIELD1="int x" -DFIELD2="float x" odr-error.c odr-error-2.c -g
$ dwz --odr-unify --devel-ignore-size a.out
$ gdb -batch a.out -ex "ptype var1" -ex "ptype var2"
type = struct aaa {
    int x;
}
type = struct aaa {
    int x;
}
...

V. Effect

We use a cc1 executable to generate executables compressed with no odr,
--odr and --odr-unify:
...
$ dwz -l50000000 cc1 -o 1
$ dwz -l50000000 cc1 -o 2 --odr
$ dwz -l50000000 cc1 -o 3 --odr-unify
...

Then we can inspect the differences:
...
$ diff.sh cc1 1
.debug_info      red: 44.80%    111527248  61570632
.debug_abbrev    red: 40.16%    1722726  1030935
.debug_str       red: 0%        6609355  6609355
total            red: 42.26%    119859329 69210922
$ diff.sh cc1 2
.debug_info      red: 55.16%    111527248  50019425
.debug_abbrev    red: 68.13%    1722726  549035
.debug_str       red: 0%        6609355  6609355
total            red: 52.30%    119859329 57177815
$ diff.sh cc1 3
.debug_info      red: 58.18%    111527248  46649959
.debug_abbrev    red: 79.57%    1722726  352080
.debug_str       red: 0%        6609355  6609355
total            red: 55.28%    119859329 53611394
...

So, the total size of the debug sections, with all of the reduction coming
from .debug_info and .debug_abbrev, is reduced:
- by 42% when not using odr,
- by 52% when using --odr, and
- by 55% when using --odr-unify.
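
The "red:" column can be reproduced from the two size columns, assuming it is
the relative size reduction, 1 - after/before.  A minimal sketch follows;
reduction_percent is a name invented for this example, and the exact rounding
that diff.sh applies is not shown in this mail, so agreement is only expected
to about 0.01 percentage point.

```cpp
#include <cassert>
#include <cstdint>

// Relative size reduction in percent, given section sizes in bytes before
// and after compression.  A negative result means the section grew.
double reduction_percent(std::uint64_t before, std::uint64_t after) {
  assert(before > 0);  // a zero-sized original has no meaningful reduction
  return 100.0 * (1.0 - static_cast<double>(after) /
                            static_cast<double>(before));
}
```

For the totals above, reduction_percent(119859329, 69210922) corresponds to
the run without odr, reduction_percent(119859329, 57177815) to --odr, and
reduction_percent(119859329, 53611394) to --odr-unify.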

VI. Cost

Using the same cc1 example as in V, we can see the cost of the optimization:
...
$ time.sh dwz -L50000000 cc1 -o 1
maxmem: 1150352
real: 15.35
user: 15.10
system: 0.25
$ time.sh dwz -L50000000 cc1 -o 2 --odr
maxmem: 1151216
real: 20.17
user: 19.91
system: 0.19
$ time.sh dwz -L50000000 cc1 -o 2 --odr-unify
maxmem: 1151916
real: 14.57
user: 14.34
system: 0.22
...

It's good to note though that without the patch series applied, we use less
memory, due to the struct dw_die not having the copy/origin fields:
...
$ time.sh ./dwz -L50000000 cc1 -o 1
maxmem: 993276
real: 14.56
user: 14.29
system: 0.23
...

VII. Testing

The patch series contains test-cases exercising the --odr and --odr-unify
optimization options.

The patch series has been tested on-by-default in conjunction with the gdb
testsuite, using target boards cc-with-dwz.exp and cc-with-dwz-m.exp.

VIII. Todo

- minimize runtime/memory performance degradation when the optimization
  options are not used,
- minimize runtime/memory performance impact of the optimizations, and
- enable the optimization in low-mem mode.

---
 COVER-LETTER | 0
 1 file changed, 0 insertions(+), 0 deletions(-)

diff --git a/COVER-LETTER b/COVER-LETTER
new file mode 100644
index 0000000..e69de29

[-- Attachment #3: odr-publish-v1.bundle --]
[-- Type: application/octet-stream, Size: 17281 bytes --]

[-- Attachment #4: patches.tgz --]
[-- Type: application/x-compressed-tar, Size: 14310 bytes --]

* Re: [RFC] Implement C++ One Definition Rule for struct, class and union
  2019-01-01  0:00 [RFC] Implement C++ One Definition Rule for struct, class and union Tom de Vries
@ 2019-01-01  0:00 ` Tom de Vries
  2019-01-01  0:00 ` Michael Matz
  1 sibling, 0 replies; 3+ messages in thread
From: Tom de Vries @ 2019-01-01  0:00 UTC
  To: dwz, Jakub Jelinek, Mark Wielaard; +Cc: Michael Matz

On 12-11-2019 14:56, Tom de Vries wrote:
> V. Effect
> 
> We use a cc1 executable to generate executables compressed with no odr,
> --odr and --odr-unify:
> ...
> $ dwz -l50000000 cc1 -o 1
> $ dwz -l50000000 cc1 -o 2 --odr
> $ dwz -l50000000 cc1 -o 3 --odr-unify
> ...
> 
> Then we can inspect the differences:
> ...
> $ diff.sh cc1 1
> .debug_info      red: 44.80%    111527248  61570632
> .debug_abbrev    red: 40.16%    1722726  1030935
> .debug_str       red: 0%        6609355  6609355
> total            red: 42.26%    119859329 69210922
> $ diff.sh cc1 2
> .debug_info      red: 55.16%    111527248  50019425
> .debug_abbrev    red: 68.13%    1722726  549035
> .debug_str       red: 0%        6609355  6609355
> total            red: 52.30%    119859329 57177815
> $ diff.sh cc1 3
> .debug_info      red: 58.18%    111527248  46649959
> .debug_abbrev    red: 79.57%    1722726  352080
> .debug_str       red: 0%        6609355  6609355
> total            red: 55.28%    119859329 53611394
> ...
> 
> So, the .debug_info and .debug_abbrev sections are reduced in size by:
> - by 42% when not using odr,
> - by 52% when using --odr, and
> - by 55% when using --odr-unify.
> 
> VI. Cost
> 

> Using the same cc1 example as in V, we can see the cost of the optimization:

At V, I correctly used -l50000000 (lower-case l), but here I accidentally
used -L50000000 (upper-case L).  That means low-mem mode kicked in and
disabled the optimization midway, so the time and memory results presented
here earlier were off.
Let's try again:
...
$ time.sh dwz -l50000000 cc1 -o 1
maxmem: 1341888
real: 7.09
user: 6.90
system: 0.18
$ time.sh dwz -l50000000 cc1 -o 2 --odr
maxmem: 1336612
real: 18.72
user: 18.54
system: 0.17
$ time.sh dwz -l50000000 cc1 -o 3 --odr-unify
maxmem: 1336216
real: 13.76
user: 13.57
system: 0.18
...

> It's good to note though that without the patch series applied, we use less
> memory, due to the struct dw_die not having the copy/origin fields:

And here again:
...
$ time.sh dwz -lnone cc1 -o 1
maxmem: 1179928
real: 6.98
user: 6.83
system: 0.14
...
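
To put the corrected numbers in relative terms, here is a small sketch that
expresses one measurement as a percentage increase over a baseline.
overhead_percent is a hypothetical helper made up for this illustration; it
is not part of dwz or time.sh.

```cpp
#include <cassert>

// Percentage by which `value` exceeds `baseline`; negative if it is lower.
double overhead_percent(double baseline, double value) {
  assert(baseline > 0.0);  // the baseline measurement must be positive
  return 100.0 * (value / baseline - 1.0);
}
```

With the "real" times above, overhead_percent(7.09, 18.72) comes to roughly
164% for --odr and overhead_percent(7.09, 13.76) to roughly 94% for
--odr-unify.  For maxmem, overhead_percent(1179928, 1341888) is roughly 14%
when comparing the last run against the first run of the corrected set.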

Thanks,
- Tom

* Re: [RFC] Implement C++ One Definition Rule for struct, class and union
  2019-01-01  0:00 [RFC] Implement C++ One Definition Rule for struct, class and union Tom de Vries
  2019-01-01  0:00 ` Tom de Vries
@ 2019-01-01  0:00 ` Michael Matz
  1 sibling, 0 replies; 3+ messages in thread
From: Michael Matz @ 2019-01-01  0:00 UTC
  To: Tom de Vries; +Cc: dwz, Jakub Jelinek, Mark Wielaard

Hello,

On Tue, 12 Nov 2019, Tom de Vries wrote:

> I reached a feature-complete state for the patch series implementing the
> ODR optimization (PR dwz/24198).  [ Feature-complete meaning, AFAICT it
> does what it's supposed to do, though it may be able to do it quicker
> and/or using less memory. ]

That's awesome!  I was looking forward to this for some time ;-)  Let's
look at the compression impact:

> $ dwz -l50000000 cc1 -o 1
> $ dwz -l50000000 cc1 -o 2 --odr
> $ dwz -l50000000 cc1 -o 3 --odr-unify
> ...
> $ diff.sh cc1 1
> .debug_info      red: 44.80%    111527248  61570632
> .debug_abbrev    red: 40.16%    1722726  1030935
> $ diff.sh cc1 2
> .debug_info      red: 55.16%    111527248  50019425
> .debug_abbrev    red: 68.13%    1722726  549035
> $ diff.sh cc1 3
> .debug_info      red: 58.18%    111527248  46649959
> .debug_abbrev    red: 79.57%    1722726  352080

So, while it is an appreciable reduction I somehow hoped for even more.  
Which probably means that type information, at least in the --odr-unify 
file, isn't the largest component of .debug_info anymore.  I wonder what 
the large components are now.  (I'm not sure how to measure this, possibly 
some size measure per DIE type or a subset of DIE types, or even 
attributes (so that one could say that so and so many bytes are associated 
with variable location descriptions, and so and so many by subprogram 
descriptions, and types, and so on)).

(And I note that --odr-unify is faster than no option, with -DODR, and the 
same time as without -DODR ;-) )

Ciao,
Michael.

end of thread, other threads:[~2019-11-13 22:04 UTC | newest]
