public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed
* Decrease fill-ratio of hash tables
@ 2011-08-09  7:41 Dimitrios Apostolou
  0 siblings, 0 replies; only message in thread
From: Dimitrios Apostolou @ 2011-08-09  7:41 UTC (permalink / raw)
  To: gcc-patches; +Cc: Steven Bosscher, Nicola Pero

[-- Attachment #1: Type: TEXT/PLAIN, Size: 1613 bytes --]

Hello list,

the attach simple patch saves reproducively a tiny bit of time on various 
-O0 compilations I've performed, for example 5ms out of 627ms. Tested on 
i386.

We trade a little bit of memory (maxmem2.sh from [1] reported 25240 KB 
instead of 25080 KB without the patch) to allow our hash tables be more 
sparsely populated (always at least half-empty). Since our hash tables 
resolve conflicts by rehashing, reducing collisions is good.

Another patch that I'll post soon helped me measure collisions/searches 
ratio for the hottest hash tables. What could be easily noticed was that 
the ratio was too high, reaching 0.5 or even 0.7 in some cases. This made 
clear that we need some deep refactoring of our hash tables, either 
changing hash functions or the complete hash table implementation, to not 
involve any rehashing for collision handling. Unfortunately such tries 
failed either because they were too simple to show any difference, or they 
were too intrusive and involved changes everywhere htab_t is used 
(almost everywhere).

This patch is the simplest one to show positive change but I believe that 
some careful hash table redesign must be planned. For example, for the 
mem_attrs_htab hash table, coll/searches ratio is still sometimes higher 
than 0.5.

Changelog:

2011-08-09  Dimitrios Apostolou  <jimis@gmx.net>

 	* symtab.c (ht_lookup_with_hash): Hash table will now be doubled
 	when 50% full, not 75%, to reduce collisions.
 	* hashtab.c (find_empty_slot_for_expand)
 	(htab_find_slot_with_hash): The same.



Thanks,
Dimitris



[1] http://gcc.gnu.org/wiki/PerformanceTesting

[-- Attachment #2: Type: TEXT/plain, Size: 1758 bytes --]

=== modified file 'libcpp/symtab.c'
--- libcpp/symtab.c	2009-07-18 02:22:16 +0000
+++ libcpp/symtab.c	2011-08-09 02:39:45 +0000
@@ -172,7 +172,7 @@
     HT_STR (node) = (const unsigned char *) obstack_copy0 (&table->stack,
 							   str, len);
 
-  if (++table->nelements * 4 >= table->nslots * 3)
+  if (++table->nelements * 2 > table->nslots)
     /* Must expand the string table.  */
     ht_expand (table);
 

=== modified file 'libiberty/hashtab.c'
--- libiberty/hashtab.c	2011-02-03 07:23:20 +0000
+++ libiberty/hashtab.c	2011-08-09 02:39:45 +0000
@@ -515,8 +515,9 @@
 }
 
 /* The following function changes size of memory allocated for the
-   entries and repeatedly inserts the table elements.  The occupancy
-   of the table after the call will be about 50%.  Naturally the hash
+   entries and repeatedly inserts the table elements.  It is designed to
+   be called when table is half-full. The occupancy
+   of the table after the call will be about 25%.  Naturally the hash
    table must already exist.  Remember also that the place of the
    table entries is changed.  If memory allocation failures are allowed,
    this function will return zero, indicating that the table could not be
@@ -542,7 +543,7 @@
      too full or too empty.  */
   if (elts * 2 > osize || (elts * 8 < osize && osize > 32))
     {
-      nindex = higher_prime_index (elts * 2);
+      nindex = higher_prime_index (elts * 4);
       nsize = prime_tab[nindex].prime;
     }
   else
@@ -648,7 +649,7 @@
   PTR entry;
 
   size = htab_size (htab);
-  if (insert == INSERT && size * 3 <= htab->n_elements * 4)
+  if (insert == INSERT && htab->n_elements * 2 > size)
     {
       if (htab_expand (htab) == 0)
 	return NULL;

^ permalink raw reply	[flat|nested] only message in thread

only message in thread, other threads:[~2011-08-09  3:17 UTC | newest]

Thread overview: (only message) (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2011-08-09  7:41 Decrease fill-ratio of hash tables Dimitrios Apostolou

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).