public inbox for dwz@sourceware.org
 help / color / mirror / Atom feed
* [RFC] Try to ensure 62.5% occupancy in off_htab
@ 2019-01-01  0:00 Tom de Vries
  0 siblings, 0 replies; only message in thread
From: Tom de Vries @ 2019-01-01  0:00 UTC (permalink / raw)
  To: dwz, jakub

Hi,

A noticible fact about the vmlinux test-case is the low final occupancy of
26% of off_htab:
...
searches: 83198411, collisions: 0.044957, occupancy: 0.260028, line: 11088
...

This is caused by the following.

The hashtable.c in the gcc repo started out with a growing strategy that
grows to twice the number of elements:
...
  new_htab = create_hash_table (htab->number_of_elements * 2,
                                htab->hash_function, htab->eq_function);
...

However, in 2000 commmit 07c967f908b changed that to twice the size:
...
  htab->size = higher_prime_number (htab->size * 2);
...

So, before this commit, in a hashtab with size 100 and 75 elements, we
would grow (ignoring the higher_prime_number complication) to size 150
(with occupancy 50%), but after this commit we would grow to size 200
(with occupancy 37.5%).

This change was reverted again in 2003 in commit b9af005f330:
...
    nsize = higher_prime_number ((htab->n_elements - htab->n_deleted) * 2);
...
But the hashtable.c in the dwz repo has the 'twice the size' variant.

If we change the hashtable implementation to use the 'twice the elements'
variant:
...
-  htab->size = higher_prime_number (htab->size * 2);
+  htab->size = higher_prime_number (htab_elements (htab) * 2);
...
on top of the commit before commit b7935b7 "Avoid htab_expand for off_htab",
we get an occupancy of 52% instead of 26:
...
 $ dwz -lnone -Lnone ./vmlinux-4.20.13-0-vanilla.debug -o 1 2>&1 | sort -V | tail -n 1
-searches: 83198411, collisions: 0.672008, occupancy: 0.260028, line: 11073
+searches: 83198411, collisions: 0.989007, occupancy: 0.520056, line: 11073
...

This patch doesn't change hashtab.c growth strategy, nor the emulation thereof
in emulate_htab.

But it does attempt to ensure at least a 62.5% occupancy (chosen as inbetween
the 75% growth trigger and the 50% after-growth target) for the estimated
final occupancy.

The resulting change is 52% occupancy (which is somewhat lower than the
targeted 62.5% due to the higher_prime_number usage):
...
-searches: 83198411, collisions: 0.044957, occupancy: 0.260028, line: 11088
+searches: 83198411, collisions: 0.060844, occupancy: 0.520056, line: 11094
...
which gives a ~10% reduction in memory usage:
...
$ time.sh dwz.1 -lnone -Lnone vmlinux -o 1 \
    | grep maxmem
maxmem: 2123368
$ time.sh dwz.2 -lnone -Lnone vmlinux -o 1 \
    | grep maxmem
maxmem: 1903212
...
without any adverse performance effects:
...
real:  mean:  27212.20  100.00%  stddev:  28.20
       mean:  27189.10   99.92%  stddev:  62.44
user:  mean:  26667.60  100.00%  stddev:  36.77
       mean:  26704.80  100.14%  stddev:  61.91
sys:   mean:    542.80  100.00%  stddev:  23.25
       mean:    482.40   88.87%  stddev:  36.33
...

For now, this is an RFC due having been validated on only one test-case.

---
 dwz.c | 7 +++++++
 1 file changed, 7 insertions(+)

diff --git a/dwz.c b/dwz.c
index 945eae9..04d38cd 100644
--- a/dwz.c
+++ b/dwz.c
@@ -1292,6 +1292,13 @@ off_htab_add_die (dw_cu_ref cu, dw_die_ref die)
 	  size_t estimated_final_hashtab_size
 	    = emulate_htab (default_initial_size, estimated_nr_dies);
 	  initial_size = estimated_final_hashtab_size;
+	  if (8 * estimated_nr_dies < 5 * initial_size)
+	    {
+	      /* If occupancy is less 62.5% (5/8), try to make it at least
+		 that.  */
+	      size_t alt_size = higher_prime_number (8 * estimated_nr_dies / 5);
+	      initial_size = alt_size;
+	    }
 	}
       off_htab = htab_try_create (initial_size, off_hash, off_eq, NULL);
       if (off_htab == NULL)

Any comments?

Thanks,
- Tom

Try to ensure 62.5% occupancy in off_htab

^ permalink raw reply	[flat|nested] only message in thread

only message in thread, other threads:[~2019-11-22 15:02 UTC | newest]

Thread overview: (only message) (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-01-01  0:00 [RFC] Try to ensure 62.5% occupancy in off_htab Tom de Vries

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).