public inbox for dwz@sourceware.org
 help / color / mirror / Atom feed
From: Tom de Vries <tdevries@suse.de>
To: dwz@sourceware.org, jakub@redhat.com
Subject: [committed] Avoid htab_expand for off_htab
Date: Tue, 01 Jan 2019 00:00:00 -0000	[thread overview]
Message-ID: <20191121163349.GA23595@delia> (raw)

Hi,

Every time the number of elements in a hash table from hashtab.c grow beyond
75% occupancy, the hash table grows in size.

Using a debug patch to print this growth for off_htab, we get:
...
$ dwz cc1 -o 1 -lnone
Allocating htab with size            131071
Growing htab from size   131071 to   524287
Growing htab from size   524287 to  2097143
Growing htab from size  2097143 to  4194301
Growing htab from size  4194301 to 16777213
$ dwz vmlinux -o 1 -lnone
Allocating htab with size            131071
Growing htab from size   131071 to   524287
Growing htab from size   524287 to  2097143
Growing htab from size  2097143 to  4194301
Growing htab from size  4194301 to 16777213
Growing htab from size 16777213 to 67108859
...

There is a cost associated to this growing.

Every time the hash table grows:
- a new array is allocated, after which the old array is is freed.  This can
  cause memory fragmentation.
- the elements need to be copied from the old to the new array.

Also, hash table collisions will be higher due to operating on a hash table
with on average higher occupancy.

In the case of off_htab, we can estimate the final size and use that as
initial size, which skips the growth costs.

I've tested this on cc1 and vmlinux, and the gain is ~6%:
...
cc1:
real:  mean:  6985.50  100.00%  stddev:  34.55
       mean:  6531.40   93.50%  stddev:  18.82
user:  mean:  6816.70  100.00%  stddev:  30.84
       mean:  6374.00   93.51%  stddev:  28.43
sys:   mean:   167.20  100.00%  stddev:  17.77
       mean:   156.80   93.78%  stddev:  17.05

vmlinux:
real:  mean:  30748.60  100.00%  stddev:  377.31
       mean:  28773.90   93.58%  stddev:   61.33
user:  mean:  30338.40  100.00%  stddev:  139.17
       mean:  28499.70   93.94%  stddev:   49.44
sys:   mean:    336.50  100.00%  stddev:  102.51
       mean:    272.00   80.83%  stddev:   23.55
...

As for the collisions (measured using another debug patch), there's a clear
reduction in collision rate for both cc1 (-0.13):
...
 $ dwz cc1 -o 1 -lnone 2>&1 | sort -V | tail -n 1
-searches: 34256212, collisions: 0.684419, occupancy: 0.607308, line: 11031
+searches: 34256212, collisions: 0.551075, occupancy: 0.607308, line: 11086
...
as well as vmlinux (-0.57):
...
 $ dwz vmlinux -o 1 -lnone 2>&1 | sort -V | tail -n 1
-searches: 83198411, collisions: 0.672008, occupancy: 0.260028, line: 11031
+searches: 83198411, collisions: 0.100454, occupancy: 0.260028, line: 11086
...

To confirm, I've tried a yet larger example, clang (with cc1 at 10.2 million
DIEs, vmlinux at 17.5 million DIES, and clang at 111.4 million DIEs):
...
real:  mean:  127261.67  100.00%  stddev:  422.04
       mean:  112483.33   88.39%  stddev:  443.20
user:  mean:  116865.67  100.00%  stddev:  101.59
       mean:  102581.33   87.78%  stddev:  344.42
sys:   mean:    5881.33  100.00%  stddev:  261.95
       mean:    5218.00   88.72%  stddev:  144.81
...
Here we see an even higher performance gain, of ~11%.

We don't do this optimization when operating in low-mem or optimize_multifile
mode.  In either mode, not all DIEs will be present in off_htab after parsing
the entire .debug_info, so it's hard to estimate what the final size will be.

Also, we don't do this optimization if we're suspecting that we'll run into
the low-mem limit (again using the estimated number of DIEs).  There's no
need to allocate that much memory if we don't think we'll end up using all
of it.

Likewise, we don't do this optimization if we're suspecting that we'll run into
the max-die limit (again using the estimated number of DIEs).

Committed to trunk.

Thanks,
- Tom

Avoid htab_expand for off_htab

2019-11-20  Tom de Vries  <tdevries@suse.de>

	* dwz.c (off_htab_add_die): Allocate off_htab with estimated final
	size.

---
 dwz.c | 16 +++++++++++++++-
 1 file changed, 15 insertions(+), 1 deletion(-)

diff --git a/dwz.c b/dwz.c
index db134ab..84a7080 100644
--- a/dwz.c
+++ b/dwz.c
@@ -1279,7 +1279,21 @@ off_htab_add_die (dw_cu_ref cu, dw_die_ref die)
 
   if (off_htab == NULL)
     {
-      off_htab = htab_try_create (100000, off_hash, off_eq, NULL);
+      unsigned int estimated_nr_dies = estimate_nr_dies ();
+      size_t default_initial_size = 100000;
+      size_t initial_size;
+      if (low_mem
+	  || op_multifile
+	  || estimated_nr_dies >= low_mem_die_limit
+	  || estimated_nr_dies >= max_die_limit)
+	initial_size = default_initial_size;
+      else
+	{
+	  size_t estimated_final_hashtab_size
+	    = emulate_htab (default_initial_size, estimated_nr_dies);
+	  initial_size = estimated_final_hashtab_size;
+	}
+      off_htab = htab_try_create (initial_size, off_hash, off_eq, NULL);
       if (off_htab == NULL)
 	dwz_oom ();
       if (rd_multifile)

                 reply	other threads:[~2019-11-21 16:33 UTC|newest]

Thread overview: [no followups] expand[flat|nested]  mbox.gz  Atom feed

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20191121163349.GA23595@delia \
    --to=tdevries@suse.de \
    --cc=dwz@sourceware.org \
    --cc=jakub@redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).