public inbox for archer@sourceware.org
* gcc dwarf2out: Drop the size + performance overhead of DW_AT_sibling
       [not found]   ` <1318929963.8669.2.camel@springer.wildebeest.org>
@ 2011-10-18  9:45     ` Jan Kratochvil
  2011-10-18  9:49       ` Jan Kratochvil
  2011-10-19  9:34       ` Mark Wielaard
  0 siblings, 2 replies; 5+ messages in thread
From: Jan Kratochvil @ 2011-10-18  9:45 UTC (permalink / raw)
  To: Mark Wielaard; +Cc: Project Archer

Hi Mark,

<warning> moved to a public list </warning>

On Tue, 18 Oct 2011 11:26:03 +0200, Mark Wielaard wrote:
> On Mon, 2011-10-17 at 15:36 +0200, Jan Kratochvil wrote:
> > gcc.post: Drop DW_AT_sibling; remove 27 LoC: -3.49% .debug size, -1.7%
> > GDB time.
> 
> Do you have more information about that? Systemtap for example, which
> uses elfutils libdw uses DW_AT_sibling to more efficiently go through
> the debug_info DIEs.

The patch with various benchmarks is:
	[patch] dwarf2out: Drop the size + performance overhead of DW_AT_sibling
	http://gcc.gnu.org/ml/gcc-patches/2011-10/msg00992.html

GDB also uses DW_AT_sibling when available (skip_one_die and
locate_pdi_sibling).  Quoting from the mail above:
# I guess DW_AT_sibling had real performance gains on CPUs with 1x (=no) clock
# multipliers.  Nowadays mostly only the data size transferred over FSB matters.

The problem is that skipping DIEs on the CPU is so cheap on current hardware
that it is negligible compared with the overhead of providing the helper data
for it.  I did not expect that dropping DW_AT_sibling would even be a consumer
performance _improvement_.  I rather expected the difference to be either not
measurable or not significant enough to justify the .debug on-disk size cost.
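
As a rough illustration of the trade-off described above, here is a minimal
sketch of a consumer skipping one DIE and its subtree, with and without a
sibling hint.  The flattened stream, struct toy_entry and toy_skip_die are
hypothetical names for this example only; this is not the GDB or libdw code,
and a real reader also has to decode abbreviations and attribute forms.

```c
#include <stddef.h>

/* Toy flattened DIE stream (an assumption for illustration): an entry
   with children is followed by those children and then by a
   terminating null entry.  */
struct toy_entry {
    int is_null;       /* 1 = end-of-children marker */
    int has_children;  /* 1 = children follow this entry */
    size_t sibling;    /* index of the next sibling; 0 = no sibling hint */
};

/* Return the index of the entry following entry i and its whole subtree.
   *visited is incremented once for every entry that had to be examined.  */
size_t toy_skip_die(const struct toy_entry *d, size_t i, size_t *visited)
{
    ++*visited;
    if (d[i].sibling != 0)
        return d[i].sibling;   /* one jump, the subtree is never read */
    if (!d[i].has_children)
        return i + 1;

    /* No hint: walk every child until the null entry is reached.  */
    size_t j = i + 1;
    while (!d[j].is_null)
        j = toy_skip_die(d, j, visited);
    ++*visited;                /* the null entry itself */
    return j + 1;
}
```

With the hint, skipping costs one examined entry regardless of subtree size;
without it, every entry of the subtree is touched.  The hint itself costs
space in every DIE that carries it, which is the size overhead in question.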

I did only gdb and idb benchmarks.  A systemtap benchmark is welcome; here are
the libstdc++ files for benchmarking, if they are enough for systemtap this way:
	http://people.redhat.com/jkratoch/ns.tar.xz


Thanks,
Jan


* Re: gcc dwarf2out: Drop the size + performance overhead of DW_AT_sibling
From: Jan Kratochvil @ 2011-10-18  9:49 UTC (permalink / raw)
  To: Mark Wielaard; +Cc: Project Archer

On Tue, 18 Oct 2011 11:44:57 +0200, Jan Kratochvil wrote:
> The problem is that skipping DIEs on the CPU is so cheap on current hardware
> that it is negligible compared with the overhead of providing the helper data
> for it.  I did not expect that dropping DW_AT_sibling would even be a consumer
> performance _improvement_.  I rather expected the difference to be either not
> measurable or not significant enough to justify the .debug on-disk size cost.

Maybe it would be worth tuning for the specific cases where DW_AT_sibling
skips a larger set of DIEs and some consumer benefits from that.

But at least in the case of GDB there are so many performance issues several
orders of magnitude worse than reading the CU data that I do not think it
matters much.  The on-disk size should be the primary concern even if it means
some performance degradation, which will be barely measurable.

It is true systemtap is a different kind of consumer, thanks for pointing it
out.


Thanks,
Jan


* Re: gcc dwarf2out: Drop the size + performance overhead of DW_AT_sibling
From: Mark Wielaard @ 2011-10-19  9:34 UTC (permalink / raw)
  To: Jan Kratochvil; +Cc: Project Archer

Hi Jan,

On Tue, 2011-10-18 at 11:44 +0200, Jan Kratochvil wrote:
> On Tue, 18 Oct 2011 11:26:03 +0200, Mark Wielaard wrote:
> > On Mon, 2011-10-17 at 15:36 +0200, Jan Kratochvil wrote:
> > > gcc.post: Drop DW_AT_sibling; remove 27 LoC: -3.49% .debug size, -1.7%
> > > GDB time.
> > 
> > Do you have more information about that? Systemtap for example, which
> > uses elfutils libdw uses DW_AT_sibling to more efficiently go through
> > the debug_info DIEs.
> 
> The patch with various benchmarks is:
> 	[patch] dwarf2out: Drop the size + performance overhead of DW_AT_sibling
> 	http://gcc.gnu.org/ml/gcc-patches/2011-10/msg00992.html
> 
> GDB also uses DW_AT_sibling when available (skip_one_die and
> locate_pdi_sibling).  Quoting from the mail above:
> # I guess DW_AT_sibling had real performance gains on CPUs with 1x (=no) clock
> # multipliers.  Nowadays mostly only the data size transferred over FSB matters.
> 
> The problem is that skipping DIEs on the CPU is so cheap on current hardware
> that it is negligible compared with the overhead of providing the helper data
> for it.  I did not expect that dropping DW_AT_sibling would even be a consumer
> performance _improvement_.  I rather expected the difference to be either not
> measurable or not significant enough to justify the .debug on-disk size cost.
> 
> I did only gdb and idb benchmarks.  A systemtap benchmark is welcome; here are
> the libstdc++ files for benchmarking, if they are enough for systemtap this way:
> 	http://people.redhat.com/jkratoch/ns.tar.xz

Thanks for those. Some quick benchmarks show systemtap selection of
functions and function parameters is slightly slower without
DW_AT_sibling being available. But not dramatically.

$ for i in $(find ns -name libstdc++.so.6.0.17.debug); do echo $i; \
    time stap -l "process(\"$i\").function(\"*\")" | wc --lines; done
ns/gccgit-c-xxxxxxxxxxxxx-test/default/libstdc++.so.6.0.17.debug
1852

real	0m0.427s
user	0m0.392s
sys	0m0.036s
ns/gccgit-c-xxxxxxxxxxxxx-test/gdbindex/libstdc++.so.6.0.17.debug
1852

real	0m0.422s
user	0m0.384s
sys	0m0.035s
ns/gccgit-c-ns-xxxxxxxxxx-test/default/libstdc++.so.6.0.17.debug
1852

real	0m0.447s
user	0m0.406s
sys	0m0.042s
ns/gccgit-c-ns-xxxxxxxxxx-test/gdbindex/libstdc++.so.6.0.17.debug
1852

real	0m0.443s
user	0m0.404s
sys	0m0.037s

That is selecting all functions in libstdc++. Systemtap doesn't use
gdbindex, but I included it so you can see the "noise". Here the
slowdown seems roughly equal to the noise.

If we also want parameters/variables listed for each function probe
point (using -L) the difference is a bit more visible:

$ for i in $(find ns -name libstdc++.so.6.0.17.debug); do echo $i; \
    time stap -L "process(\"$i\").function(\"*\")" | wc --lines; done
ns/gccgit-c-xxxxxxxxxxxxx-test/default/libstdc++.so.6.0.17.debug
1852

real	0m0.573s
user	0m0.522s
sys	0m0.043s
ns/gccgit-c-xxxxxxxxxxxxx-test/gdbindex/libstdc++.so.6.0.17.debug
1852

real	0m0.558s
user	0m0.505s
sys	0m0.052s
ns/gccgit-c-ns-xxxxxxxxxx-test/default/libstdc++.so.6.0.17.debug
1852

real	0m0.611s
user	0m0.556s
sys	0m0.056s
ns/gccgit-c-ns-xxxxxxxxxx-test/gdbindex/libstdc++.so.6.0.17.debug
1852

real	0m0.603s
user	0m0.557s
sys	0m0.045s

Still no dramatic slowdown, but probably enough to rule out random noise
in the measurements.
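
To put a number on "not dramatic", the relative slowdown can be worked out
from the real times above.  This is a minimal sketch; percent_change is a
hypothetical helper, and it assumes the "ns" directories are the builds
without DW_AT_sibling.

```c
/* Relative change in percent when going from `before` to `after`.
   A positive result means `after` is slower (larger) than `before`.  */
double percent_change(double before, double after)
{
    return (after - before) / before * 100.0;
}
```

For the -L runs, percent_change(0.573, 0.611) is about 6.6 (default) and
percent_change(0.558, 0.603) is about 8.1 (gdbindex), while the two runs on
identical DWARF with DW_AT_sibling (0.573 vs. 0.558) differ by under 3%.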

So DW_AT_sibling definitely does help systemtap/libdw walk the DIE tree a
little more efficiently. But you are right that walking the DIE tree even
without it can be done pretty quickly.

Cheers,

Mark


* Re: gcc dwarf2out: Drop the size + performance overhead of DW_AT_sibling
From: Jan Kratochvil @ 2011-10-19 11:29 UTC (permalink / raw)
  To: Mark Wielaard; +Cc: Project Archer

Hi Mark,

On Wed, 19 Oct 2011 11:34:18 +0200, Mark Wielaard wrote:
> real	0m0.558s
->
> real	0m0.603s

I find that a significant enough performance degradation.

I will look into a compromise that emits DW_AT_sibling entries selectively.

Thanks for the stap commands for performance tuning.


Regards,
Jan


* Re: gcc dwarf2out: Drop the size + performance overhead of DW_AT_sibling
From: Jan Kratochvil @ 2011-10-19 15:38 UTC (permalink / raw)
  To: Mark Wielaard; +Cc: Project Archer

Hi Mark,

(.debug size; stap and gdb times are real time)
5059640 = -3.71%; no DW_AT_sibling
stap	0m0.577s
gdb	0m0.234s

5061160 = -3.68%; DW_AT_sibling if >= 256 total children
stap	0m0.572s
gdb	0m0.232s

5084040 = -3.25%; DW_AT_sibling if >= 16 total children
stap	0m0.545s
gdb	0m0.231s

5169888 = -1.62%; DW_AT_sibling if >= 4 total children
stap	0m0.540s
gdb	0m0.235s

5254792 = ------; all DW_AT_sibling
stap	0m0.536s
gdb	0m0.243s


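So the stap and gdb results point in exactly opposite directions: stap gets
faster with more DW_AT_sibling attributes, while gdb is slowest with all of
them present.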

I will redo the timings after another change (DW_FORM_ref_udata) which may yet
change the timing / magic threshold.


Thanks,
Jan


gcc/
2011-10-19  Jan Kratochvil  <jan.kratochvil@redhat.com>

	* dwarf2out.c (add_sibling_attributes): New variables next_die_no and
	this_die_no.  Do not produce DW_AT_sibling for too few children and no
	-gstrict-dwarf.

--- a/gcc/dwarf2out.c
+++ b/gcc/dwarf2out.c
@@ -7490,14 +7490,21 @@ static void
 add_sibling_attributes (dw_die_ref die)
 {
   dw_die_ref c;
+  static unsigned long next_die_no;
+  unsigned long this_die_no = next_die_no++;
 
   if (! die->die_child)
     return;
 
+  FOR_EACH_CHILD (die, c, add_sibling_attributes (c));
+
+  /* -gstrict-dwarf can be used for unconditional DW_AT_sibling for backward
+     compatibility.  */
+  if (!dwarf_strict && next_die_no - this_die_no < 16)
+    return;
+
   if (die->die_parent && die != die->die_parent->die_child)
     add_AT_die_ref (die, DW_AT_sibling, die->die_sib);
-
-  FOR_EACH_CHILD (die, c, add_sibling_attributes (c));
 }
 
 /* Output all location lists for the DIE and its children.  */
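
The threshold logic of the patch can be tried outside GCC with a small
stand-alone model.  This is only a sketch under assumptions: struct toy_die,
its first_child/next pointers and the has_sibling_attr flag are hypothetical
and do not reproduce GCC's own DIE list representation; the threshold and the
strict flag are passed as parameters instead of the hard-coded 16 and
dwarf_strict.

```c
#include <stddef.h>

struct toy_die {
    struct toy_die *parent;
    struct toy_die *first_child;
    struct toy_die *next;      /* next sibling, NULL for the last child */
    int has_sibling_attr;      /* set where DW_AT_sibling would be emitted */
};

/* Running DIE counter, as in the patch.  */
static unsigned long next_die_no;

void toy_add_sibling_attributes(struct toy_die *die,
                                unsigned long threshold, int strict)
{
    unsigned long this_die_no = next_die_no++;

    if (die->first_child == NULL)
        return;

    /* Visit the children first so the counter covers the whole subtree.  */
    for (struct toy_die *c = die->first_child; c != NULL; c = c->next)
        toy_add_sibling_attributes(c, threshold, strict);

    /* next_die_no - this_die_no is 1 + the number of descendants.  */
    if (!strict && next_die_no - this_die_no < threshold)
        return;

    /* Only a DIE that has a parent and a following sibling gets the hint.  */
    if (die->parent != NULL && die->next != NULL)
        die->has_sibling_attr = 1;
}
```

For a DIE with three leaf children and a following sibling, the subtree
counts as 4, so a threshold of 16 leaves it without the attribute while a
threshold of 4 (or strict mode) marks it.  Leaf DIEs are never marked.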
