public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed
* [var-tracking] small speed-ups
@ 2011-08-22 11:38 Dimitrios Apostolou
  2011-08-22 12:21 ` [var-tracking] [not-good!] disable shared_hash and other simplifications Dimitrios Apostolou
  2011-08-22 13:44 ` [var-tracking] small speed-ups Jakub Jelinek
  0 siblings, 2 replies; 9+ messages in thread
From: Dimitrios Apostolou @ 2011-08-22 11:38 UTC (permalink / raw)
  To: gcc-patches; +Cc: jimis, Steven Bosscher, Alexandre Oliva

[-- Attachment #1: Type: TEXT/PLAIN, Size: 2179 bytes --]

Hello list,

this patch has all of my changes to var-tracking that I believe are worth 
keeping. These are all minor changes not touching algorithmic issues, I 
lost much time trying to understand what is actually done in 
var-tracking.c but I still don't get it. I wish there was some document 
describing current implementation. I'd appreciate all help I can get here.

Bootstrapped/tested on x86_64.

Thanks to lxo for helping me, hopefully his plan for better var-tracking 
throughout all optimisation passes will be implemented. This is the only 
var-tracking related doc I could find... ( 
http://gcc.gnu.org/wiki/Var_Tracking_Assignments).

This patch also includes minor stylistic changes that made the code just a 
tiny bit more accessible to me, the indirection was so much that it hardly 
reminded me of C, let me know if I should remove these parts. :-s For the 
sake of completion I'll also post a follow-up patch where I 
delete/simplify a big part of var-tracking, unfortunately with some impact 
on performance.


2011-08-22  Dimitrios Apostolou  <jimis@gmx.net>

 	* var-tracking.c (init_attrs_list_set): Remove function, instead
 	use a memset() call to zero the attr list in...
 	(dataflow_set_init).
 	(vars_copy): Remove function because inserting each element into a
 	new hash table was too costly. Replaced with the ...
 	(htab_dup): ... new function that only does a memcpy() of the
 	element table in htab_t, without rehashing any elements.
 	(shared_hash_unshare): Replace the vars_copy() call with
 	htab_dup(), plus do a little extra work (reference counting) which
 	was in vars_copy.
 	(shared_hash_destroy, dataflow_set_destroy): Add an extra
 	"do_free" bool argument, to avoid iterating over hash tables
 	freeing elements, when not needed.
 	(vt_finalize, vt_emit_notes): Call the above with do_free=false
 	since all pools will be freed later.
 	(dataflow_set_clear, dataflow_set_copy, dataflow_set_union)
 	(dataflow_set_merge, dataflow_set_different, compute_bb_dataflow)
 	(vt_find_locations): Call shared_hash_destroy with do_free=true.
 	(attrs_list_copy): Do not free destination list but reuse already
 	allocated elements if possible.

[-- Attachment #2: Type: TEXT/plain, Size: 10901 bytes --]

=== modified file 'gcc/var-tracking.c'
--- gcc/var-tracking.c	2011-06-03 01:42:31 +0000
+++ gcc/var-tracking.c	2011-08-22 09:52:08 +0000
@@ -414,7 +414,6 @@ static hashval_t variable_htab_hash (con
 static int variable_htab_eq (const void *, const void *);
 static void variable_htab_free (void *);
 
-static void init_attrs_list_set (attrs *);
 static void attrs_list_clear (attrs *);
 static attrs attrs_list_member (attrs, decl_or_value, HOST_WIDE_INT);
 static void attrs_list_insert (attrs *, decl_or_value, HOST_WIDE_INT, rtx);
@@ -423,7 +422,6 @@ static void attrs_list_union (attrs *, a
 
 static void **unshare_variable (dataflow_set *set, void **slot, variable var,
 				enum var_init_status);
-static void vars_copy (htab_t, htab_t);
 static tree var_debug_decl (tree);
 static void var_reg_set (dataflow_set *, rtx, enum var_init_status, rtx);
 static void var_reg_delete_and_set (dataflow_set *, rtx, bool,
@@ -447,7 +445,7 @@ static bool variable_part_different_p (v
 static bool onepart_variable_different_p (variable, variable);
 static bool variable_different_p (variable, variable);
 static bool dataflow_set_different (dataflow_set *, dataflow_set *);
-static void dataflow_set_destroy (dataflow_set *);
+static void dataflow_set_destroy (dataflow_set *, bool);
 
 static bool contains_symbol_ref (rtx);
 static bool track_expr_p (tree, bool);
@@ -1069,14 +1067,14 @@ adjust_insn (basic_block bb, rtx insn)
 static inline bool
 dv_is_decl_p (decl_or_value dv)
 {
-  return !dv || (int) TREE_CODE ((tree) dv) != (int) VALUE;
+  return !dv || ((int) TREE_CODE ((tree) dv) != (int) VALUE);
 }
 
 /* Return true if a decl_or_value is a VALUE rtl.  */
 static inline bool
 dv_is_value_p (decl_or_value dv)
 {
-  return dv && !dv_is_decl_p (dv);
+  return !dv_is_decl_p (dv);
 }
 
 /* Return the decl in the decl_or_value.  */
@@ -1092,7 +1090,7 @@ static inline rtx
 dv_as_value (decl_or_value dv)
 {
   gcc_checking_assert (dv_is_value_p (dv));
-  return (rtx)dv;
+  return (rtx) dv;
 }
 
 /* Return the opaque pointer in the decl_or_value.  */
@@ -1191,7 +1189,7 @@ dv_uid2hash (dvuid uid)
 static inline hashval_t
 dv_htab_hash (decl_or_value dv)
 {
-  return dv_uid2hash (dv_uid (dv));
+  return (hashval_t) (dv_uid (dv));
 }
 
 /* The hash function for variable_htab, computes the hash value
@@ -1202,7 +1200,7 @@ variable_htab_hash (const void *x)
 {
   const_variable const v = (const_variable) x;
 
-  return dv_htab_hash (v->dv);
+  return (hashval_t) (dv_uid (v->dv));
 }
 
 /* Compare the declaration of variable X with declaration Y.  */
@@ -1211,9 +1209,8 @@ static int
 variable_htab_eq (const void *x, const void *y)
 {
   const_variable const v = (const_variable) x;
-  decl_or_value dv = CONST_CAST2 (decl_or_value, const void *, y);
 
-  return (dv_as_opaque (v->dv) == dv_as_opaque (dv));
+  return (v->dv) == y;
 }
 
 /* Free the element of VARIABLE_HTAB (its type is struct variable_def).  */
@@ -1265,17 +1262,6 @@ value_chain_htab_eq (const void *x, cons
   return dv_as_opaque (v->dv) == dv_as_opaque (dv);
 }
 
-/* Initialize the set (array) SET of attrs to empty lists.  */
-
-static void
-init_attrs_list_set (attrs *set)
-{
-  int i;
-
-  for (i = 0; i < FIRST_PSEUDO_REGISTER; i++)
-    set[i] = NULL;
-}
-
 /* Make the list *LISTP empty.  */
 
 static void
@@ -1323,18 +1309,33 @@ attrs_list_insert (attrs *listp, decl_or
 static void
 attrs_list_copy (attrs *dstp, attrs src)
 {
-  attrs n;
+  attrs n, next, *nextp;
 
-  attrs_list_clear (dstp);
-  for (; src; src = src->next)
+  /* Copy to already existing nodes of *dstp, allocating new only if needed */
+  nextp = dstp;
+  while (src)
     {
-      n = (attrs) pool_alloc (attrs_pool);
+      if (*nextp == NULL)
+	{
+	  (*nextp) = (attrs) pool_alloc (attrs_pool);
+	  (*nextp)->next = NULL;
+	}
+      n = *nextp;
       n->loc = src->loc;
       n->dv = src->dv;
       n->offset = src->offset;
-      n->next = *dstp;
-      *dstp = n;
+      nextp = &n->next;
+      src = src->next;
     }
+  /* Free remaining elements of *dstp, if it was longer than src. */
+  n = *nextp;
+  while (n != NULL)
+    {
+      next = n->next;
+      pool_free (attrs_pool, n);
+      n = next;
+    }
+  (*nextp) = NULL;
 }
 
 /* Add all nodes from SRC which are not in *DSTP to *DSTP.  */
@@ -1397,19 +1398,40 @@ shared_var_p (variable var, shared_hash 
 	  || shared_hash_shared (vars));
 }
 
+/* Copy all variables from hash table SRC to hash table DST without rehashing
+   any values.  */
+
+static htab_t
+htab_dup (htab_t src)
+{
+  htab_t dst;
+
+  dst = (htab_t) xmalloc (sizeof (*src));
+  memcpy (dst, src, sizeof (*src));
+  dst->entries = (void **) xmalloc (src->size * sizeof (*src->entries));
+  memcpy (dst->entries, src->entries,
+	  src->size * sizeof (*src->entries));
+  return dst;
+}
+
 /* Copy variables into a new hash table.  */
 
 static shared_hash
 shared_hash_unshare (shared_hash vars)
 {
+  variable var;
+  htab_iterator hi;
   shared_hash new_vars = (shared_hash) pool_alloc (shared_hash_pool);
+
   gcc_assert (vars->refcount > 1);
   new_vars->refcount = 1;
-  new_vars->htab
-    = htab_create (htab_elements (vars->htab) + 3, variable_htab_hash,
-		   variable_htab_eq, variable_htab_free);
-  vars_copy (new_vars->htab, vars->htab);
+  new_vars->htab = htab_dup (vars->htab);
+  FOR_EACH_HTAB_ELEMENT (new_vars->htab, var, variable, hi)
+    {
+      var->refcount++;
+    }
   vars->refcount--;
+
   return new_vars;
 }
 
@@ -1426,13 +1448,17 @@ shared_hash_copy (shared_hash vars)
    anymore.  */
 
 static void
-shared_hash_destroy (shared_hash vars)
+shared_hash_destroy (shared_hash vars, bool do_free)
 {
   gcc_checking_assert (vars->refcount > 0);
   if (--vars->refcount == 0)
     {
+      if (!do_free)
+	/* Dirty hack to prevent htab_delete() iterating over all elements */
+	vars->htab->del_f = NULL;
       htab_delete (vars->htab);
-      pool_free (shared_hash_pool, vars);
+      if (do_free)
+	pool_free (shared_hash_pool, vars);
     }
 }
 
@@ -1593,25 +1619,6 @@ unshare_variable (dataflow_set *set, voi
   return slot;
 }
 
-/* Copy all variables from hash table SRC to hash table DST.  */
-
-static void
-vars_copy (htab_t dst, htab_t src)
-{
-  htab_iterator hi;
-  variable var;
-
-  FOR_EACH_HTAB_ELEMENT (src, var, variable, hi)
-    {
-      void **dstp;
-      var->refcount++;
-      dstp = htab_find_slot_with_hash (dst, var->dv,
-				       dv_htab_hash (var->dv),
-				       INSERT);
-      *dstp = var;
-    }
-}
-
 /* Map a decl to its main debug decl.  */
 
 static inline tree
@@ -2034,7 +2041,8 @@ val_resolve (dataflow_set *set, rtx val,
 static void
 dataflow_set_init (dataflow_set *set)
 {
-  init_attrs_list_set (set->regs);
+  /* Initialize the set (array) SET of attrs to empty lists.  */
+  memset (set->regs, 0, sizeof (set->regs));
   set->vars = shared_hash_copy (empty_shared_hash);
   set->stack_adjust = 0;
   set->traversed_vars = NULL;
@@ -2050,7 +2058,7 @@ dataflow_set_clear (dataflow_set *set)
   for (i = 0; i < FIRST_PSEUDO_REGISTER; i++)
     attrs_list_clear (&set->regs[i]);
 
-  shared_hash_destroy (set->vars);
+  shared_hash_destroy (set->vars, true);
   set->vars = shared_hash_copy (empty_shared_hash);
 }
 
@@ -2064,7 +2072,7 @@ dataflow_set_copy (dataflow_set *dst, da
   for (i = 0; i < FIRST_PSEUDO_REGISTER; i++)
     attrs_list_copy (&dst->regs[i], src->regs[i]);
 
-  shared_hash_destroy (dst->vars);
+  shared_hash_destroy (dst->vars, true);
   dst->vars = shared_hash_copy (src->vars);
   dst->stack_adjust = src->stack_adjust;
 }
@@ -2489,7 +2497,7 @@ dataflow_set_union (dataflow_set *dst, d
 
   if (dst->vars == empty_shared_hash)
     {
-      shared_hash_destroy (dst->vars);
+      shared_hash_destroy (dst->vars, true);
       dst->vars = shared_hash_copy (src->vars);
     }
   else
@@ -3711,11 +3719,11 @@ dataflow_set_merge (dataflow_set *dst, d
   src2_elems = htab_elements (shared_hash_htab (src2->vars));
   dataflow_set_init (dst);
   dst->stack_adjust = cur.stack_adjust;
-  shared_hash_destroy (dst->vars);
+  shared_hash_destroy (dst->vars, true);
   dst->vars = (shared_hash) pool_alloc (shared_hash_pool);
   dst->vars->refcount = 1;
   dst->vars->htab
-    = htab_create (MAX (src1_elems, src2_elems), variable_htab_hash,
+    = htab_create (2 * MAX (src1_elems, src2_elems), variable_htab_hash,
 		   variable_htab_eq, variable_htab_free);
 
   for (i = 0; i < FIRST_PSEUDO_REGISTER; i++)
@@ -3734,7 +3742,7 @@ dataflow_set_merge (dataflow_set *dst, d
   if (dsm.src_onepart_cnt)
     dst_can_be_shared = false;
 
-  dataflow_set_destroy (src1);
+  dataflow_set_destroy (src1, true);
 }
 
 /* Mark register equivalences.  */
@@ -4503,14 +4511,15 @@ dataflow_set_different (dataflow_set *ol
 /* Free the contents of dataflow set SET.  */
 
 static void
-dataflow_set_destroy (dataflow_set *set)
+dataflow_set_destroy (dataflow_set *set, bool do_free)
 {
   int i;
 
-  for (i = 0; i < FIRST_PSEUDO_REGISTER; i++)
-    attrs_list_clear (&set->regs[i]);
+  if (do_free)
+    for (i = 0; i < FIRST_PSEUDO_REGISTER; i++)
+      attrs_list_clear (&set->regs[i]);
 
-  shared_hash_destroy (set->vars);
+  shared_hash_destroy (set->vars, do_free);
   set->vars = NULL;
 }
 
@@ -6339,7 +6348,7 @@ compute_bb_dataflow (basic_block bb)
 #endif
     }
   changed = dataflow_set_different (&old_out, out);
-  dataflow_set_destroy (&old_out);
+  dataflow_set_destroy (&old_out, true);
   return changed;
 }
 
@@ -6456,7 +6465,7 @@ vt_find_locations (void)
 #endif
 		      if (dst_can_be_shared)
 			{
-			  shared_hash_destroy (in->vars);
+			  shared_hash_destroy (in->vars, true);
 			  in->vars = shared_hash_copy (first_out->vars);
 			}
 		    }
@@ -8343,7 +8352,8 @@ vt_emit_notes (void)
       gcc_assert (htab_elements (value_chains) == 0);
     }
 #endif
-  dataflow_set_destroy (&cur);
+  /* don't free memory here, we'll free pools right after in vt_finalize */
+  dataflow_set_destroy (&cur, false);
 
   if (MAY_HAVE_DEBUG_INSNS)
     {
@@ -8996,11 +9006,13 @@ vt_finalize (void)
 
   FOR_ALL_BB (bb)
     {
-      dataflow_set_destroy (&VTI (bb)->in);
-      dataflow_set_destroy (&VTI (bb)->out);
+      /* The "false" do_free parameter means to not bother to iterate and free
+	 all hash table elements, since we'll destroy the pools. */
+      dataflow_set_destroy (&VTI (bb)->in, false);
+      dataflow_set_destroy (&VTI (bb)->out, false);
       if (VTI (bb)->permp)
 	{
-	  dataflow_set_destroy (VTI (bb)->permp);
+	  dataflow_set_destroy (VTI (bb)->permp, false);
 	  XDELETE (VTI (bb)->permp);
 	}
     }


^ permalink raw reply	[flat|nested] 9+ messages in thread

* [var-tracking] [not-good!] disable shared_hash and other simplifications
  2011-08-22 11:38 [var-tracking] small speed-ups Dimitrios Apostolou
@ 2011-08-22 12:21 ` Dimitrios Apostolou
  2011-08-22 12:57   ` Jakub Jelinek
  2011-08-22 13:44 ` [var-tracking] small speed-ups Jakub Jelinek
  1 sibling, 1 reply; 9+ messages in thread
From: Dimitrios Apostolou @ 2011-08-22 12:21 UTC (permalink / raw)
  To: Dimitrios Apostolou; +Cc: gcc-patches, Steven Bosscher, Alexandre Oliva

[-- Attachment #1: Type: TEXT/PLAIN, Size: 1685 bytes --]

Hello,

the attached patch applies after my previous one, and actually cancels all 
runtime gains from it. It doesn't make things worse than initially, so 
it's not *that* bad.

While trying to understand var-tracking I deleted the whole shared hash 
table concept and some other indirections. It didn't make things much 
worse, and it makes me think that getting rid of other indirections like 
shared variables reference counting would actually help. But it got too 
complicated and I had to give up.

While I don't understand variable tracking, I couldn't help but notice 
some things that could be discussed:

* This is the part of the compiler that makes the heaviest use of hash 
tables. htab_traverse(), htab_delete(), FOR_EACH_HTAB_ELEMENT (should that 
be moved in libiberty?) all iterate way too many times over hash tables. 
This is all because decl/value UIDs are sparse.
* Since var-tracking is happening per-function, is there a way to produce 
per-function UIDs and use those in a regular array instead of a sparse 
hash table? These UIDs could be produced at tree/rtx level, and I bet 
other parts in the compiler could make good use of them.
* Another alternative would be to store variables in a vector, sorted by 
UID value. That way checking if it's there would be O(logn) but merging 
would be a very simple operation. Still, merging dataflow sets is way to 
complex to follow so I didn't manage to change this.
* From a very wide field of view, all this DF solving reminded me a lot of 
what I've seen in df-*.c. Why can't variable tracking be integrated in the 
main DF pass of the compiler, the one that happens *after* register 
allocation?


Thanks,
Dimitris

[-- Attachment #2: Type: TEXT/plain, Size: 40881 bytes --]

=== modified file 'gcc/var-tracking.c'
--- gcc/var-tracking.c	2011-08-22 08:27:47 +0000
+++ gcc/var-tracking.c	2011-08-22 10:36:15 +0000
@@ -233,17 +233,6 @@ typedef struct attrs_def
   HOST_WIDE_INT offset;
 } *attrs;
 
-/* Structure holding a refcounted hash table.  If refcount > 1,
-   it must be first unshared before modified.  */
-typedef struct shared_hash_def
-{
-  /* Reference count.  */
-  int refcount;
-
-  /* Actual hash table.  */
-  htab_t htab;
-} *shared_hash;
-
 /* Structure holding the IN or OUT set for a basic block.  */
 typedef struct dataflow_set_def
 {
@@ -253,11 +242,11 @@ typedef struct dataflow_set_def
   /* Attributes for registers (lists of attrs).  */
   attrs regs[FIRST_PSEUDO_REGISTER];
 
-  /* Variable locations.  */
-  shared_hash vars;
+  /* Variable locations, stored in a hash table.  */
+  htab_t vars;
 
   /* Vars that is being traversed.  */
-  shared_hash traversed_vars;
+  htab_t traversed_vars;
 } dataflow_set;
 
 /* The structure (one for each basic block) containing the information
@@ -379,9 +368,6 @@ static alloc_pool valvar_pool;
 /* Alloc pool for struct location_chain_def.  */
 static alloc_pool loc_chain_pool;
 
-/* Alloc pool for struct shared_hash_def.  */
-static alloc_pool shared_hash_pool;
-
 /* Alloc pool for struct value_chain_def.  */
 static alloc_pool value_chain_pool;
 
@@ -395,7 +381,7 @@ static htab_t value_chains;
 static bool emit_notes;
 
 /* Empty shared hashtable.  */
-static shared_hash empty_shared_hash;
+static htab_t empty_htab;
 
 /* Scratch register bitmap used by cselib_expand_value_rtx.  */
 static bitmap scratch_regs = NULL;
@@ -478,7 +464,7 @@ static void **delete_slot_part (dataflow
 static void delete_variable_part (dataflow_set *, rtx,
 				  decl_or_value, HOST_WIDE_INT);
 static int emit_note_insn_var_location (void **, void *);
-static void emit_notes_for_changes (rtx, enum emit_note_where, shared_hash);
+static void emit_notes_for_changes (rtx, enum emit_note_where, htab_t);
 static int emit_notes_for_differences_1 (void **, void *);
 static int emit_notes_for_differences_2 (void **, void *);
 static void emit_notes_for_differences (rtx, dataflow_set *, dataflow_set *);
@@ -1370,164 +1356,30 @@ attrs_list_mpdv_union (attrs *dstp, attr
     }
 }
 
-/* Shared hashtable support.  */
-
-/* Return true if VARS is shared.  */
-
-static inline bool
-shared_hash_shared (shared_hash vars)
-{
-  return vars->refcount > 1;
-}
-
-/* Return the hash table for VARS.  */
-
-static inline htab_t
-shared_hash_htab (shared_hash vars)
-{
-  return vars->htab;
-}
-
 /* Return true if VAR is shared, or maybe because VARS is shared.  */
 
 static inline bool
-shared_var_p (variable var, shared_hash vars)
+shared_var_p (variable var)
 {
   /* Don't count an entry in the changed_variables table as a duplicate.  */
-  return ((var->refcount > 1 + (int) var->in_changed_variables)
-	  || shared_hash_shared (vars));
+  return (var->refcount - (int) var->in_changed_variables) > 1;
 }
 
-/* Copy all variables from hash table SRC to hash table DST without rehashing
-   any values.  */
-
-static htab_t
-htab_dup (htab_t src)
-{
-  htab_t dst;
-
-  dst = (htab_t) xmalloc (sizeof (*src));
-  memcpy (dst, src, sizeof (*src));
-  dst->entries = (void **) xmalloc (src->size * sizeof (*src->entries));
-  memcpy (dst->entries, src->entries,
-	  src->size * sizeof (*src->entries));
-  return dst;
-}
-
-/* Copy variables into a new hash table.  */
-
-static shared_hash
-shared_hash_unshare (shared_hash vars)
-{
-  variable var;
-  htab_iterator hi;
-  shared_hash new_vars = (shared_hash) pool_alloc (shared_hash_pool);
-
-  gcc_assert (vars->refcount > 1);
-  new_vars->refcount = 1;
-  new_vars->htab = htab_dup (vars->htab);
-  FOR_EACH_HTAB_ELEMENT (new_vars->htab, var, variable, hi)
-    {
-      var->refcount++;
-    }
-  vars->refcount--;
-
-  return new_vars;
-}
-
-/* Increment reference counter on VARS and return it.  */
-
-static inline shared_hash
-shared_hash_copy (shared_hash vars)
-{
-  vars->refcount++;
-  return vars;
-}
-
-/* Decrement reference counter and destroy hash table if not shared
-   anymore.  */
-
-static void
-shared_hash_destroy (shared_hash vars, bool do_free)
-{
-  gcc_checking_assert (vars->refcount > 0);
-  if (--vars->refcount == 0)
-    {
-      if (!do_free)
-	/* Dirty hack to prevent htab_delete() iterating over all elements */
-	vars->htab->del_f = NULL;
-      htab_delete (vars->htab);
-      if (do_free)
-	pool_free (shared_hash_pool, vars);
-    }
-}
-
-/* Unshare *PVARS if shared and return slot for DV.  If INS is
-   INSERT, insert it if not already present.  */
+/* Search for a dv in a hash table of variables/value_chains, return its
+   slot. */
 
 static inline void **
-shared_hash_find_slot_unshare_1 (shared_hash *pvars, decl_or_value dv,
-				 hashval_t dvhash, enum insert_option ins)
+htab_slot_dv (htab_t vars, decl_or_value dv, enum insert_option ins)
 {
-  if (shared_hash_shared (*pvars))
-    *pvars = shared_hash_unshare (*pvars);
-  return htab_find_slot_with_hash (shared_hash_htab (*pvars), dv, dvhash, ins);
+  return htab_find_slot_with_hash (vars, dv, dv_htab_hash (dv), ins);
 }
 
-static inline void **
-shared_hash_find_slot_unshare (shared_hash *pvars, decl_or_value dv,
-			       enum insert_option ins)
-{
-  return shared_hash_find_slot_unshare_1 (pvars, dv, dv_htab_hash (dv), ins);
-}
-
-/* Return slot for DV, if it is already present in the hash table.
-   If it is not present, insert it only VARS is not shared, otherwise
-   return NULL.  */
-
-static inline void **
-shared_hash_find_slot_1 (shared_hash vars, decl_or_value dv, hashval_t dvhash)
-{
-  return htab_find_slot_with_hash (shared_hash_htab (vars), dv, dvhash,
-				   shared_hash_shared (vars)
-				   ? NO_INSERT : INSERT);
-}
-
-static inline void **
-shared_hash_find_slot (shared_hash vars, decl_or_value dv)
-{
-  return shared_hash_find_slot_1 (vars, dv, dv_htab_hash (dv));
-}
-
-/* Return slot for DV only if it is already present in the hash table.  */
-
-static inline void **
-shared_hash_find_slot_noinsert_1 (shared_hash vars, decl_or_value dv,
-				  hashval_t dvhash)
-{
-  return htab_find_slot_with_hash (shared_hash_htab (vars), dv, dvhash,
-				   NO_INSERT);
-}
-
-static inline void **
-shared_hash_find_slot_noinsert (shared_hash vars, decl_or_value dv)
-{
-  return shared_hash_find_slot_noinsert_1 (vars, dv, dv_htab_hash (dv));
-}
-
-/* Return variable for DV or NULL if not already present in the hash
-   table.  */
+/* Search for a dv in a hash table of variables. */
 
 static inline variable
-shared_hash_find_1 (shared_hash vars, decl_or_value dv, hashval_t dvhash)
+htab_find_dv (htab_t vars, decl_or_value dv)
 {
-  return (variable) htab_find_with_hash (shared_hash_htab (vars), dv, dvhash);
-}
-
-static inline variable
-shared_hash_find (shared_hash vars, decl_or_value dv)
-{
-  return shared_hash_find_1 (vars, dv, dv_htab_hash (dv));
+  return (variable) htab_find_with_hash (vars, dv, dv_htab_hash (dv));
 }
 
 /* Return true if TVAL is better than CVAL as a canonival value.  We
@@ -1546,8 +1398,6 @@ canon_value_cmp (rtx tval, rtx cval)
     || CSELIB_VAL_PTR (tval)->uid < CSELIB_VAL_PTR (cval)->uid;
 }
 
-static bool dst_can_be_shared;
-
 /* Return a copy of a variable VAR and insert it to dataflow set SET.  */
 
 static void **
@@ -1599,17 +1449,13 @@ unshare_variable (dataflow_set *set, voi
       new_var->var_part[i].cur_loc = var->var_part[i].cur_loc;
     }
 
-  dst_can_be_shared = false;
-  if (shared_hash_shared (set->vars))
-    slot = shared_hash_find_slot_unshare (&set->vars, var->dv, NO_INSERT);
-  else if (set->traversed_vars && set->vars != set->traversed_vars)
-    slot = shared_hash_find_slot_noinsert (set->vars, var->dv);
+  if (set->traversed_vars && set->vars != set->traversed_vars)
+    slot = htab_slot_dv (set->vars, var->dv, NO_INSERT);
   *slot = new_var;
   if (var->in_changed_variables)
     {
       void **cslot
-	= htab_find_slot_with_hash (changed_variables, var->dv,
-				    dv_htab_hash (var->dv), NO_INSERT);
+	= htab_slot_dv (changed_variables, var->dv, NO_INSERT);
       gcc_assert (*cslot == (void *) var);
       var->in_changed_variables = false;
       variable_htab_free (var);
@@ -1680,7 +1526,7 @@ get_init_value (dataflow_set *set, rtx l
   if (! flag_var_tracking_uninit)
     return VAR_INIT_STATUS_INITIALIZED;
 
-  var = shared_hash_find (set->vars, dv);
+  var = htab_find_dv (set->vars, dv);
   if (var)
     {
       for (i = 0; i < var->n_var_parts && ret_val == VAR_INIT_STATUS_UNKNOWN; i++)
@@ -1910,7 +1756,7 @@ val_store (dataflow_set *set, rtx val, r
 static void
 val_reset (dataflow_set *set, decl_or_value dv)
 {
-  variable var = shared_hash_find (set->vars, dv) ;
+  variable var = htab_find_dv (set->vars, dv);
   location_chain node;
   rtx cval;
 
@@ -2043,7 +1889,8 @@ dataflow_set_init (dataflow_set *set)
 {
   /* Initialize the set (array) SET of attrs to empty lists.  */
   memset (set->regs, 0, sizeof (set->regs));
-  set->vars = shared_hash_copy (empty_shared_hash);
+  set->vars = htab_create (5, variable_htab_hash, variable_htab_eq,
+			   variable_htab_free);
   set->stack_adjust = 0;
   set->traversed_vars = NULL;
 }
@@ -2058,8 +1905,23 @@ dataflow_set_clear (dataflow_set *set)
   for (i = 0; i < FIRST_PSEUDO_REGISTER; i++)
     attrs_list_clear (&set->regs[i]);
 
-  shared_hash_destroy (set->vars, true);
-  set->vars = shared_hash_copy (empty_shared_hash);
+  htab_empty (set->vars);
+}
+
+/* Copy all variables from hash table SRC to hash table DST without rehashing
+   any values.  */
+
+static htab_t
+htab_dup (htab_t src)
+{
+  htab_t dst;
+
+  dst = (htab_t) xmalloc (sizeof (*src));
+  memcpy (dst, src, sizeof (*src));
+  dst->entries = (void **) xmalloc (src->size * sizeof (*src->entries));
+  memcpy (dst->entries, src->entries,
+	  src->size * sizeof (*src->entries));
+  return dst;
 }
 
 /* Copy the contents of dataflow set SRC to DST.  */
@@ -2067,14 +1929,19 @@ dataflow_set_clear (dataflow_set *set)
 static void
 dataflow_set_copy (dataflow_set *dst, dataflow_set *src)
 {
+  htab_iterator hi;
+  variable var;
   int i;
 
+  /* TODO memcpy */
   for (i = 0; i < FIRST_PSEUDO_REGISTER; i++)
     attrs_list_copy (&dst->regs[i], src->regs[i]);
 
-  shared_hash_destroy (dst->vars, true);
-  dst->vars = shared_hash_copy (src->vars);
+  htab_delete (dst->vars);
   dst->stack_adjust = src->stack_adjust;
+  dst->vars = htab_dup (src->vars);
+  FOR_EACH_HTAB_ELEMENT (dst->vars, var, variable, hi)
+    var->refcount++;
 }
 
 /* Information for merging lists of locations for a given offset of variable.
@@ -2128,22 +1995,17 @@ variable_union (variable src, dataflow_s
   void **dstp;
   int i, j, k;
 
-  dstp = shared_hash_find_slot (set->vars, src->dv);
-  if (!dstp || !*dstp)
+  dstp = htab_slot_dv (set->vars, src->dv, INSERT);
+  if (*dstp == HTAB_EMPTY_ENTRY)
     {
       src->refcount++;
-
-      dst_can_be_shared = false;
-      if (!dstp)
-	dstp = shared_hash_find_slot_unshare (&set->vars, src->dv, INSERT);
-
       *dstp = src;
 
       /* Continue traversing the hash table.  */
       return 1;
     }
-  else
-    dst = (variable) *dstp;
+  
+  dst = (variable) *dstp;
 
   gcc_assert (src->n_var_parts);
 
@@ -2172,7 +2034,7 @@ variable_union (variable src, dataflow_s
 	    {
 	      location_chain nnode;
 
-	      if (shared_var_p (dst, set->vars))
+	      if (shared_var_p (dst))
 		{
 		  dstp = unshare_variable (set, dstp, dst,
 					   VAR_INIT_STATUS_INITIALIZED);
@@ -2224,7 +2086,7 @@ variable_union (variable src, dataflow_s
      thus there are at most MAX_VAR_PARTS different offsets.  */
   gcc_assert (dv_onepart_p (dst->dv) ? k == 1 : k <= MAX_VAR_PARTS);
 
-  if (dst->n_var_parts != k && shared_var_p (dst, set->vars))
+  if (dst->n_var_parts != k && shared_var_p (dst))
     {
       dstp = unshare_variable (set, dstp, dst, VAR_INIT_STATUS_UNKNOWN);
       dst = (variable)*dstp;
@@ -2251,7 +2113,7 @@ variable_union (variable src, dataflow_s
 	  /* If DST is shared compare the location chains.
 	     If they are different we will modify the chain in DST with
 	     high probability so make a copy of DST.  */
-	  if (shared_var_p (dst, set->vars))
+	  if (shared_var_p (dst))
 	    {
 	      for (node = src->var_part[i].loc_chain,
 		   node2 = dst->var_part[j].loc_chain; node && node2;
@@ -2490,24 +2352,15 @@ variable_union (variable src, dataflow_s
 static void
 dataflow_set_union (dataflow_set *dst, dataflow_set *src)
 {
+  htab_iterator hi;
+  variable var;
   int i;
 
   for (i = 0; i < FIRST_PSEUDO_REGISTER; i++)
     attrs_list_union (&dst->regs[i], src->regs[i]);
 
-  if (dst->vars == empty_shared_hash)
-    {
-      shared_hash_destroy (dst->vars, true);
-      dst->vars = shared_hash_copy (src->vars);
-    }
-  else
-    {
-      htab_iterator hi;
-      variable var;
-
-      FOR_EACH_HTAB_ELEMENT (shared_hash_htab (src->vars), var, variable, hi)
-	variable_union (var, dst);
-    }
+  FOR_EACH_HTAB_ELEMENT (src->vars, var, variable, hi)
+    variable_union (var, dst);
 }
 
 /* Whether the value is currently being expanded.  */
@@ -2605,7 +2458,7 @@ find_loc_in_1pdv (rtx loc, variable var,
       gcc_checking_assert (!node->next);
 
       dv = dv_from_value (node->loc);
-      rvar = (variable) htab_find_with_hash (vars, dv, dv_htab_hash (dv));
+      rvar = htab_find_dv (vars, dv);
       return find_loc_in_1pdv (loc, rvar, vars);
     }
 
@@ -2694,8 +2547,7 @@ intersect_loc_chains (rtx val, location_
       if (s1node->loc == val)
 	continue;
 
-      if ((found = find_loc_in_1pdv (s1node->loc, s2var,
-				     shared_hash_htab (s2set->vars))))
+      if ((found = find_loc_in_1pdv (s1node->loc, s2var, s2set->vars)))
 	{
 	  insert_into_intersection (dest, s1node->loc,
 				    MIN (s1node->init, found->init));
@@ -2706,7 +2558,7 @@ intersect_loc_chains (rtx val, location_
 	  && !VALUE_RECURSED_INTO (s1node->loc))
 	{
 	  decl_or_value dv = dv_from_value (s1node->loc);
-	  variable svar = shared_hash_find (s1set->vars, dv);
+	  variable svar = htab_find_dv (s1set->vars, dv);
 	  if (svar)
 	    {
 	      if (svar->n_var_parts == 1)
@@ -2924,8 +2776,7 @@ add_value_chain (rtx *loc, void *dvp)
     return 0;
 
   dv = (decl_or_value) dvp;
-  slot = htab_find_slot_with_hash (value_chains, ldv, dv_htab_hash (ldv),
-				   INSERT);
+  slot = htab_slot_dv (value_chains, ldv, INSERT);
   if (!*slot)
     {
       vc = (value_chain) pool_alloc (value_chain_pool);
@@ -3013,8 +2864,7 @@ remove_value_chain (rtx *loc, void *dvp)
     return 0;
 
   dv = (decl_or_value) dvp;
-  slot = htab_find_slot_with_hash (value_chains, ldv, dv_htab_hash (ldv),
-				   NO_INSERT);
+  slot = htab_slot_dv (value_chains, ldv, NO_INSERT);
   for (vc = (value_chain) *slot; vc->next; vc = vc->next)
     if (dv_as_opaque (vc->next->dv) == dv_as_opaque (dv))
       {
@@ -3128,7 +2978,7 @@ canonicalize_values_mark (void **slot, v
 	else
 	  {
 	    decl_or_value odv = dv_from_value (node->loc);
-	    void **oslot = shared_hash_find_slot_noinsert (set->vars, odv);
+	    void **oslot = htab_slot_dv (set->vars, odv, NO_INSERT);
 
 	    set_slot_part (set, val, oslot, odv, 0,
 			   node->init, NULL_RTX);
@@ -3209,7 +3059,7 @@ canonicalize_values_star (void **slot, v
 	  restart_with_cval:
 	    VALUE_RECURSED_INTO (cval) = false;
 	    dv = dv_from_value (cval);
-	    slot = shared_hash_find_slot_noinsert (set->vars, dv);
+	    slot = htab_slot_dv (set->vars, dv, NO_INSERT);
 	    if (!slot)
 	      {
 		gcc_assert (dv_is_decl_p (var->dv));
@@ -3233,7 +3083,7 @@ canonicalize_values_star (void **slot, v
 
   /* Push values to the canonical one.  */
   cdv = dv_from_value (cval);
-  cslot = shared_hash_find_slot_noinsert (set->vars, cdv);
+  cslot = htab_slot_dv (set->vars, cdv, NO_INSERT);
 
   for (node = var->var_part[0].loc_chain; node; node = node->next)
     if (node->loc != cval)
@@ -3393,7 +3243,7 @@ canonicalize_vars_star (void **slot, voi
 
   /* Push values to the canonical one.  */
   cdv = dv_from_value (cval);
-  cslot = shared_hash_find_slot_noinsert (set->vars, cdv);
+  cslot = htab_slot_dv (set->vars, cdv, NO_INSERT);
   if (!cslot)
     return 1;
   cvar = (variable)*cslot;
@@ -3454,10 +3304,9 @@ variable_merge_over_cur (variable s1var,
   else
     val = NULL;
 
-  s2var = shared_hash_find_1 (dsm->src->vars, dv, dvhash);
+  s2var = (variable) htab_find_with_hash (dsm->src->vars, dv, dvhash);
   if (!s2var)
     {
-      dst_can_be_shared = false;
       return 1;
     }
 
@@ -3466,7 +3315,7 @@ variable_merge_over_cur (variable s1var,
 	      && s2var->n_var_parts == 1
 	      && s2var->var_part[0].offset == 0);
 
-  dstslot = shared_hash_find_slot_noinsert_1 (dst->vars, dv, dvhash);
+  dstslot = htab_find_slot_with_hash (dst->vars, dv, dvhash, NO_INSERT);
   if (dstslot)
     {
       dvar = (variable)*dstslot;
@@ -3483,15 +3332,13 @@ variable_merge_over_cur (variable s1var,
 
   if (!dstslot && !onepart_variable_different_p (s1var, s2var))
     {
-      dstslot = shared_hash_find_slot_unshare_1 (&dst->vars, dv,
-						 dvhash, INSERT);
+      dstslot = htab_find_slot_with_hash (dst->vars, dv,
+					  dvhash, INSERT);
       *dstslot = dvar = s2var;
       dvar->refcount++;
     }
   else
     {
-      dst_can_be_shared = false;
-
       intersect_loc_chains (val, nodep, dsm,
 			    s1var->var_part[0].loc_chain, s2var);
 
@@ -3509,9 +3356,8 @@ variable_merge_over_cur (variable s1var,
 	      dvar->var_part[0].loc_chain = node;
 	      dvar->var_part[0].cur_loc = NULL;
 
-	      dstslot
-		= shared_hash_find_slot_unshare_1 (&dst->vars, dv, dvhash,
-						   INSERT);
+	      dstslot = htab_find_slot_with_hash (dst->vars, dv,
+						  dvhash, INSERT);
 	      gcc_assert (!*dstslot);
 	      *dstslot = dvar;
 	    }
@@ -3581,12 +3427,12 @@ variable_merge_over_cur (variable s1var,
 			       node->init, NULL, INSERT);
 	  }
 
-      dstslot = shared_hash_find_slot_noinsert_1 (dst->vars, dv, dvhash);
+      dstslot = htab_find_slot_with_hash (dst->vars, dv, dvhash, NO_INSERT);
       gcc_assert (*dstslot == dvar);
       canonicalize_values_star (dstslot, dst);
       gcc_checking_assert (dstslot
-			   == shared_hash_find_slot_noinsert_1 (dst->vars,
-								dv, dvhash));
+			   == htab_find_slot_with_hash (dst->vars, dv,
+							dvhash, NO_INSERT));
       dvar = (variable)*dstslot;
     }
   else
@@ -3625,11 +3471,10 @@ variable_merge_over_cur (variable s1var,
 		  decl_or_value dv = dv_from_value (node->loc);
 		  void **slot = NULL;
 
-		  if (shared_hash_shared (dst->vars))
-		    slot = shared_hash_find_slot_noinsert (dst->vars, dv);
-		  if (!slot)
-		    slot = shared_hash_find_slot_unshare (&dst->vars, dv,
-							  INSERT);
+		  /* if (shared_hash_shared (dst->vars)) */
+		  /*   slot = shared_hash_find_slot_noinsert (dst->vars, dv); */
+		  /* if (!slot) */
+		  slot = htab_slot_dv (dst->vars, dv, INSERT);
 		  if (!*slot)
 		    {
 		      variable var = (variable) pool_alloc (dv_pool (dv));
@@ -3648,12 +3493,12 @@ variable_merge_over_cur (variable s1var,
 		}
 	    }
 
-	  dstslot = shared_hash_find_slot_noinsert_1 (dst->vars, dv, dvhash);
+	  dstslot = htab_find_slot_with_hash (dst->vars, dv, dvhash, NO_INSERT);
 	  gcc_assert (*dstslot == dvar);
 	  canonicalize_values_star (dstslot, dst);
 	  gcc_checking_assert (dstslot
-			       == shared_hash_find_slot_noinsert_1 (dst->vars,
-								    dv, dvhash));
+			       == htab_find_slot_with_hash (dst->vars, dv,
+							    dvhash, NO_INSERT));
 	  dvar = (variable)*dstslot;
 	}
     }
@@ -3669,10 +3514,7 @@ variable_merge_over_cur (variable s1var,
       variable_htab_free (dvar);
       *dstslot = dvar = s1var;
       dvar->refcount++;
-      dst_can_be_shared = false;
     }
-  else
-    dst_can_be_shared = false;
 
   return 1;
 }
@@ -3691,7 +3533,7 @@ variable_merge_over_src (variable s2var,
 
   if (!onepart)
     {
-      void **dstp = shared_hash_find_slot (dst->vars, dv);
+      void **dstp = htab_slot_dv (dst->vars, dv, INSERT);
       *dstp = s2var;
       s2var->refcount++;
       return 1;
@@ -3707,7 +3549,7 @@ variable_merge_over_src (variable s2var,
 static void
 dataflow_set_merge (dataflow_set *dst, dataflow_set *src2)
 {
-  dataflow_set cur = *dst;
+  dataflow_set cur = *dst;	/* Keep original dst as src1 */
   dataflow_set *src1 = &cur;
   struct dfset_merge dsm;
   int i;
@@ -3715,16 +3557,15 @@ dataflow_set_merge (dataflow_set *dst, d
   htab_iterator hi;
   variable var;
 
-  src1_elems = htab_elements (shared_hash_htab (src1->vars));
-  src2_elems = htab_elements (shared_hash_htab (src2->vars));
-  dataflow_set_init (dst);
+  src1_elems = htab_elements (src1->vars);
+  src2_elems = htab_elements (src2->vars);
+
+  memset (dst->regs, 0, sizeof (dst->regs));
+  /* htab_empty (dst->vars); */
+  dst->vars = htab_create (2 * MAX (src1_elems, src2_elems), variable_htab_hash,
+			   variable_htab_eq, variable_htab_free);
+  dst->traversed_vars = NULL;
   dst->stack_adjust = cur.stack_adjust;
-  shared_hash_destroy (dst->vars, true);
-  dst->vars = (shared_hash) pool_alloc (shared_hash_pool);
-  dst->vars->refcount = 1;
-  dst->vars->htab
-    = htab_create (2 * MAX (src1_elems, src2_elems), variable_htab_hash,
-		   variable_htab_eq, variable_htab_free);
 
   for (i = 0; i < FIRST_PSEUDO_REGISTER; i++)
     attrs_list_mpdv_union (&dst->regs[i], src1->regs[i], src2->regs[i]);
@@ -3734,14 +3575,11 @@ dataflow_set_merge (dataflow_set *dst, d
   dsm.cur = src1;
   dsm.src_onepart_cnt = 0;
 
-  FOR_EACH_HTAB_ELEMENT (shared_hash_htab (dsm.src->vars), var, variable, hi)
+  FOR_EACH_HTAB_ELEMENT (dsm.src->vars, var, variable, hi)
     variable_merge_over_src (var, &dsm);
-  FOR_EACH_HTAB_ELEMENT (shared_hash_htab (dsm.cur->vars), var, variable, hi)
+  FOR_EACH_HTAB_ELEMENT (dsm.cur->vars, var, variable, hi)
     variable_merge_over_cur (var, &dsm);
 
-  if (dsm.src_onepart_cnt)
-    dst_can_be_shared = false;
-
   dataflow_set_destroy (src1, true);
 }
 
@@ -3818,7 +3656,7 @@ dataflow_set_equiv_regs (dataflow_set *s
 		  continue;
 	      }
 
-	    slot = shared_hash_find_slot_noinsert (set->vars, list->dv);
+	    slot = htab_slot_dv (set->vars, list->dv, NO_INSERT);
 	    canonicalize_values_star (slot, set);
 	    if (*listp != list)
 	      list = NULL;
@@ -4032,8 +3870,7 @@ variable_post_merge_perm_vals (void **ps
 	      && REG_P (pnode->loc));
 
   dv = pvar->dv;
-
-  var = shared_hash_find (set->vars, dv);
+  var = htab_find_dv (set->vars, dv);
   if (var)
     {
       /* Although variable_post_merge_new_vals may have made decls
@@ -4042,7 +3879,7 @@ variable_post_merge_perm_vals (void **ps
 	 REG, so they are canonical as well.  Since VAR has the
 	 location list for a VALUE, using find_loc_in_1pdv for it is
 	 fine, since VALUEs don't map back to DECLs.  */
-      if (find_loc_in_1pdv (pnode->loc, var, shared_hash_htab (set->vars)))
+      if (find_loc_in_1pdv (pnode->loc, var, set->vars))
 	return 1;
       val_reset (set, dv);
     }
@@ -4083,13 +3920,11 @@ dataflow_post_merge_adjust (dataflow_set
   dfpm.set = set;
   dfpm.permp = permp;
 
-  htab_traverse (shared_hash_htab (set->vars), variable_post_merge_new_vals,
-		 &dfpm);
+  htab_traverse (set->vars, variable_post_merge_new_vals, &dfpm);
   if (*permp)
-    htab_traverse (shared_hash_htab ((*permp)->vars),
-		   variable_post_merge_perm_vals, &dfpm);
-  htab_traverse (shared_hash_htab (set->vars), canonicalize_values_star, set);
-  htab_traverse (shared_hash_htab (set->vars), canonicalize_vars_star, set);
+    htab_traverse ((*permp)->vars, variable_post_merge_perm_vals, &dfpm);
+  htab_traverse (set->vars, canonicalize_values_star, set);
+  htab_traverse (set->vars, canonicalize_vars_star, set);
 }
 
 /* Return a node whose loc is a MEM that refers to EXPR in the
@@ -4111,7 +3946,7 @@ find_mem_expr_in_1pdv (tree expr, rtx va
 	      && !VALUE_RECURSED_INTO (val));
 
   dv = dv_from_value (val);
-  var = (variable) htab_find_with_hash (vars, dv, dv_htab_hash (dv));
+  var = htab_find_dv (vars, dv);
 
   if (!var)
     return NULL;
@@ -4187,7 +4022,7 @@ dataflow_set_preserve_mem_locs (void **s
 
       gcc_assert (var->n_var_parts == 1);
 
-      if (shared_var_p (var, set->vars))
+      if (shared_var_p (var))
 	{
 	  for (loc = var->var_part[0].loc_chain; loc; loc = loc->next)
 	    {
@@ -4199,8 +4034,7 @@ dataflow_set_preserve_mem_locs (void **s
 		break;
 	      /* We want to move here MEMs that do refer to DECL.  */
 	      else if (GET_CODE (loc->loc) == VALUE
-		       && find_mem_expr_in_1pdv (decl, loc->loc,
-						 shared_hash_htab (set->vars)))
+		       && find_mem_expr_in_1pdv (decl, loc->loc, set->vars))
 		break;
 	    }
 
@@ -4219,8 +4053,7 @@ dataflow_set_preserve_mem_locs (void **s
 	  if (GET_CODE (old_loc) == VALUE)
 	    {
 	      location_chain mem_node
-		= find_mem_expr_in_1pdv (decl, loc->loc,
-					 shared_hash_htab (set->vars));
+		= find_mem_expr_in_1pdv (decl, loc->loc, set->vars);
 
 	      /* ??? This picks up only one out of multiple MEMs that
 		 refer to the same variable.  Do we ever need to be
@@ -4298,7 +4131,7 @@ dataflow_set_remove_mem_locs (void **slo
 
       gcc_assert (var->n_var_parts == 1);
 
-      if (shared_var_p (var, set->vars))
+      if (shared_var_p (var))
 	{
 	  for (loc = var->var_part[0].loc_chain; loc; loc = loc->next)
 	    if (GET_CODE (loc->loc) == MEM
@@ -4365,11 +4198,9 @@ dataflow_set_clear_at_call (dataflow_set
   if (MAY_HAVE_DEBUG_INSNS)
     {
       set->traversed_vars = set->vars;
-      htab_traverse (shared_hash_htab (set->vars),
-		     dataflow_set_preserve_mem_locs, set);
+      htab_traverse (set->vars, dataflow_set_preserve_mem_locs, set);
       set->traversed_vars = set->vars;
-      htab_traverse (shared_hash_htab (set->vars), dataflow_set_remove_mem_locs,
-		     set);
+      htab_traverse (set->vars, dataflow_set_remove_mem_locs, set);
       set->traversed_vars = NULL;
     }
 }
@@ -4470,15 +4301,12 @@ dataflow_set_different (dataflow_set *ol
   if (old_set->vars == new_set->vars)
     return false;
 
-  if (htab_elements (shared_hash_htab (old_set->vars))
-      != htab_elements (shared_hash_htab (new_set->vars)))
+  if (htab_elements (old_set->vars) != htab_elements (new_set->vars))
     return true;
 
-  FOR_EACH_HTAB_ELEMENT (shared_hash_htab (old_set->vars), var1, variable, hi)
+  FOR_EACH_HTAB_ELEMENT (old_set->vars, var1, variable, hi)
     {
-      htab_t htab = shared_hash_htab (new_set->vars);
-      variable var2 = (variable) htab_find_with_hash (htab, var1->dv,
-						      dv_htab_hash (var1->dv));
+      variable var2 = htab_find_dv (new_set->vars, var1->dv);
       if (!var2)
 	{
 	  if (dump_file && (dump_flags & TDF_DETAILS))
@@ -4519,7 +4347,10 @@ dataflow_set_destroy (dataflow_set *set,
     for (i = 0; i < FIRST_PSEUDO_REGISTER; i++)
       attrs_list_clear (&set->regs[i]);
 
-  shared_hash_destroy (set->vars, do_free);
+  if (!do_free)
+    /* Dirty hack to prevent htab_delete() iterating over all elements */
+    set->vars->del_f = NULL;
+  htab_delete (set->vars);
   set->vars = NULL;
 }
 
@@ -6039,7 +5870,7 @@ find_src_set_src (dataflow_set *set, rtx
     {
       decl_or_value dv = dv_from_decl (decl);
 
-      var = shared_hash_find (set->vars, dv);
+      var = htab_find_dv (set->vars, dv);
       if (var)
 	{
 	  found = false;
@@ -6338,13 +6169,10 @@ compute_bb_dataflow (basic_block bb)
   if (MAY_HAVE_DEBUG_INSNS)
     {
       dataflow_set_equiv_regs (out);
-      htab_traverse (shared_hash_htab (out->vars), canonicalize_values_mark,
-		     out);
-      htab_traverse (shared_hash_htab (out->vars), canonicalize_values_star,
-		     out);
+      htab_traverse (out->vars, canonicalize_values_mark, out);
+      htab_traverse (out->vars, canonicalize_values_star, out);
 #if ENABLE_CHECKING
-      htab_traverse (shared_hash_htab (out->vars),
-		     canonicalize_loc_order_check, out);
+      htab_traverse (out->vars, canonicalize_loc_order_check, out);
 #endif
     }
   changed = dataflow_set_different (&old_out, out);
@@ -6415,27 +6243,24 @@ vt_find_locations (void)
 
 	      if (VTI (bb)->in.vars)
 		{
-		  htabsz
-		    -= (htab_size (shared_hash_htab (VTI (bb)->in.vars))
-			+ htab_size (shared_hash_htab (VTI (bb)->out.vars)));
-		  oldinsz
-		    = htab_elements (shared_hash_htab (VTI (bb)->in.vars));
-		  oldoutsz
-		    = htab_elements (shared_hash_htab (VTI (bb)->out.vars));
+		  htabsz -= (htab_size (VTI (bb)->in.vars)
+			     + htab_size (VTI (bb)->out.vars));
+		  oldinsz = htab_elements (VTI (bb)->in.vars);
+		  oldoutsz = htab_elements (VTI (bb)->out.vars);
 		}
 	      else
 		oldinsz = oldoutsz = 0;
 
 	      if (MAY_HAVE_DEBUG_INSNS)
 		{
-		  dataflow_set *in = &VTI (bb)->in, *first_out = NULL;
+		  dataflow_set *in = &VTI (bb)->in;
+		  /* dataflow_set *first_out = NULL; */
 		  bool first = true, adjust = false;
 
 		  /* Calculate the IN set as the intersection of
 		     predecessor OUT sets.  */
 
 		  dataflow_set_clear (in);
-		  dst_can_be_shared = true;
 
 		  FOR_EACH_EDGE (e, ei, bb->preds)
 		    if (!VTI (e->src)->flooded)
@@ -6444,7 +6269,7 @@ vt_find_locations (void)
 		    else if (first)
 		      {
 			dataflow_set_copy (in, &VTI (e->src)->out);
-			first_out = &VTI (e->src)->out;
+			/* first_out = &VTI (e->src)->out; */
 			first = false;
 		      }
 		    else
@@ -6459,15 +6284,9 @@ vt_find_locations (void)
 #if ENABLE_CHECKING
 		      /* Merge and merge_adjust should keep entries in
 			 canonical order.  */
-		      htab_traverse (shared_hash_htab (in->vars),
-				     canonicalize_loc_order_check,
+		      htab_traverse (in->vars, canonicalize_loc_order_check,
 				     in);
 #endif
-		      if (dst_can_be_shared)
-			{
-			  shared_hash_destroy (in->vars, true);
-			  in->vars = shared_hash_copy (first_out->vars);
-			}
 		    }
 
 		  VTI (bb)->flooded = true;
@@ -6481,8 +6300,8 @@ vt_find_locations (void)
 		}
 
 	      changed = compute_bb_dataflow (bb);
-	      htabsz += (htab_size (shared_hash_htab (VTI (bb)->in.vars))
-			 + htab_size (shared_hash_htab (VTI (bb)->out.vars)));
+	      htabsz += (htab_size (VTI (bb)->in.vars)
+			 + htab_size (VTI (bb)->out.vars));
 
 	      if (htabmax && htabsz > htabmax)
 		{
@@ -6529,11 +6348,11 @@ vt_find_locations (void)
 		fprintf (dump_file,
 			 "BB %i: in %i (was %i), out %i (was %i), rem %i + %i, tsz %i\n",
 			 bb->index,
-			 (int)htab_elements (shared_hash_htab (VTI (bb)->in.vars)),
+			 (int) htab_elements (VTI (bb)->in.vars),
 			 oldinsz,
-			 (int)htab_elements (shared_hash_htab (VTI (bb)->out.vars)),
+			 (int) htab_elements (VTI (bb)->out.vars),
 			 oldoutsz,
-			 (int)worklist->nodes, (int)pending->nodes, htabsz);
+			 (int) worklist->nodes, (int) pending->nodes, htabsz);
 
 	      if (dump_file && (dump_flags & TDF_DETAILS))
 		{
@@ -6664,7 +6483,7 @@ dump_dataflow_set (dataflow_set *set)
 	  dump_attrs_list (set->regs[i]);
 	}
     }
-  dump_vars (shared_hash_htab (set->vars));
+  dump_vars (set->vars);
   fprintf (dump_file, "\n");
 }
 
@@ -6745,14 +6564,9 @@ variable_was_changed (variable var, data
 	  void **slot;
 
 	drop_var:
-	  slot = shared_hash_find_slot_noinsert (set->vars, var->dv);
+	  slot = htab_slot_dv (set->vars, var->dv, NO_INSERT);
 	  if (slot)
-	    {
-	      if (shared_hash_shared (set->vars))
-		slot = shared_hash_find_slot_unshare (&set->vars, var->dv,
-						      NO_INSERT);
-	      htab_clear_slot (shared_hash_htab (set->vars), slot);
-	    }
+	      htab_clear_slot (set->vars, slot);
 	}
     }
 }
@@ -6913,7 +6727,7 @@ set_slot_part (dataflow_set *set, rtx lo
       if (r == 0)
 	return slot;
 
-      if (shared_var_p (var, set->vars))
+      if (shared_var_p (var))
 	{
 	  slot = unshare_variable (set, slot, var, initialized);
 	  var = (variable)*slot;
@@ -6952,7 +6766,7 @@ set_slot_part (dataflow_set *set, rtx lo
 	  else
 	    {
 	      /* We have to make a copy of a shared variable.  */
-	      if (shared_var_p (var, set->vars))
+	      if (shared_var_p (var))
 		{
 		  slot = unshare_variable (set, slot, var, initialized);
 		  var = (variable)*slot;
@@ -6964,7 +6778,7 @@ set_slot_part (dataflow_set *set, rtx lo
 	  /* We have not found the location part, new one will be created.  */
 
 	  /* We have to make a copy of the shared variable.  */
-	  if (shared_var_p (var, set->vars))
+	  if (shared_var_p (var))
 	    {
 	      slot = unshare_variable (set, slot, var, initialized);
 	      var = (variable)*slot;
@@ -7049,14 +6863,8 @@ set_variable_part (dataflow_set *set, rt
 {
   void **slot;
 
-  if (iopt == NO_INSERT)
-    slot = shared_hash_find_slot_noinsert (set->vars, dv);
-  else
-    {
-      slot = shared_hash_find_slot (set->vars, dv);
-      if (!slot)
-	slot = shared_hash_find_slot_unshare (&set->vars, dv, iopt);
-    }
+  slot = htab_slot_dv (set->vars, dv, iopt);
+
   set_slot_part (set, loc, slot, dv, offset, initialized, set_src);
 }
 
@@ -7134,7 +6942,7 @@ clobber_variable_part (dataflow_set *set
       || (!dv_is_value_p (dv) && ! DECL_P (dv_as_decl (dv))))
     return;
 
-  slot = shared_hash_find_slot_noinsert (set->vars, dv);
+  slot = htab_slot_dv (set->vars, dv, NO_INSERT);
   if (!slot)
     return;
 
@@ -7158,7 +6966,7 @@ delete_slot_part (dataflow_set *set, rtx
       location_chain *nextp;
       bool changed;
 
-      if (shared_var_p (var, set->vars))
+      if (shared_var_p (var))
 	{
 	  /* If the variable contains the location part we have to
 	     make a copy of the variable.  */
@@ -7233,7 +7041,7 @@ static void
 delete_variable_part (dataflow_set *set, rtx loc, decl_or_value dv,
 		      HOST_WIDE_INT offset)
 {
-  void **slot = shared_hash_find_slot_noinsert (set->vars, dv);
+  void **slot = htab_slot_dv (set->vars, dv, NO_INSERT);
   if (!slot)
     return;
 
@@ -7330,7 +7138,7 @@ vt_expand_loc_callback (rtx x, bitmap re
   if (VALUE_RECURSED_INTO (x))
     return NULL;
 
-  var = (variable) htab_find_with_hash (elcd->vars, dv, dv_htab_hash (dv));
+  var = htab_find_dv (elcd->vars, dv);
 
   if (!var)
     {
@@ -7710,19 +7518,18 @@ static VEC (rtx, heap) *changed_values_s
 /* Helper function for check_changed_vars_1 and check_changed_vars_2.  */
 
 static void
-check_changed_vars_0 (decl_or_value dv, htab_t htab)
+check_changed_vars_0 (decl_or_value dv, htab_t vars)
 {
   value_chain vc
-    = (value_chain) htab_find_with_hash (value_chains, dv, dv_htab_hash (dv));
+    = (value_chain) htab_find_with_hash (value_chains, dv,
+					 dv_htab_hash (dv));
 
   if (vc == NULL)
     return;
   for (vc = vc->next; vc; vc = vc->next)
     if (!dv_changed_p (vc->dv))
       {
-	variable vcvar
-	  = (variable) htab_find_with_hash (htab, vc->dv,
-					    dv_htab_hash (vc->dv));
+	variable vcvar = htab_find_dv (vars, vc->dv);
 	if (vcvar)
 	  {
 	    set_dv_changed (vc->dv, true);
@@ -7733,7 +7540,7 @@ check_changed_vars_0 (decl_or_value dv, 
 	    set_dv_changed (vc->dv, true);
 	    VEC_safe_push (rtx, heap, changed_values_stack,
 			   dv_as_value (vc->dv));
-	    check_changed_vars_0 (vc->dv, htab);
+	    check_changed_vars_0 (vc->dv, vars);
 	  }
       }
 }
@@ -7813,11 +7620,9 @@ check_changed_vars_3 (void **slot, void 
    shall be emitted before of after instruction INSN.  */
 
 static void
-emit_notes_for_changes (rtx insn, enum emit_note_where where,
-			shared_hash vars)
+emit_notes_for_changes (rtx insn, enum emit_note_where where, htab_t vars)
 {
   emit_note_data data;
-  htab_t htab = shared_hash_htab (vars);
 
   if (!htab_elements (changed_variables))
     return;
@@ -7827,19 +7632,19 @@ emit_notes_for_changes (rtx insn, enum e
       /* Unfortunately this has to be done in two steps, because
 	 we can't traverse a hashtab into which we are inserting
 	 through variable_was_changed.  */
-      htab_traverse (changed_variables, check_changed_vars_1, htab);
+      htab_traverse (changed_variables, check_changed_vars_1, vars);
       while (VEC_length (variable, changed_variables_stack) > 0)
 	check_changed_vars_2 (VEC_pop (variable, changed_variables_stack),
-			      htab);
+			      vars);
       while (VEC_length (rtx, changed_values_stack) > 0)
 	set_dv_changed (dv_from_value (VEC_pop (rtx, changed_values_stack)),
 			false);
-      htab_traverse (changed_variables, check_changed_vars_3, htab);
+      htab_traverse (changed_variables, check_changed_vars_3, vars);
     }
 
   data.insn = insn;
   data.where = where;
-  data.vars = htab;
+  data.vars = vars;
 
   htab_traverse (changed_variables, emit_note_insn_var_location, &data);
 }
@@ -7854,8 +7659,7 @@ emit_notes_for_differences_1 (void **slo
   variable old_var, new_var;
 
   old_var = (variable) *slot;
-  new_var = (variable) htab_find_with_hash (new_vars, old_var->dv,
-					    dv_htab_hash (old_var->dv));
+  new_var = htab_find_dv (new_vars, old_var->dv);
 
   if (!new_var)
     {
@@ -7947,8 +7751,7 @@ emit_notes_for_differences_2 (void **slo
   variable old_var, new_var;
 
   new_var = (variable) *slot;
-  old_var = (variable) htab_find_with_hash (old_vars, new_var->dv,
-					    dv_htab_hash (new_var->dv));
+  old_var = htab_find_dv(old_vars, new_var->dv);
   if (!old_var)
     {
       int i;
@@ -7977,12 +7780,8 @@ static void
 emit_notes_for_differences (rtx insn, dataflow_set *old_set,
 			    dataflow_set *new_set)
 {
-  htab_traverse (shared_hash_htab (old_set->vars),
-		 emit_notes_for_differences_1,
-		 shared_hash_htab (new_set->vars));
-  htab_traverse (shared_hash_htab (new_set->vars),
-		 emit_notes_for_differences_2,
-		 shared_hash_htab (old_set->vars));
+  htab_traverse (old_set->vars, emit_notes_for_differences_1, new_set->vars);
+  htab_traverse (new_set->vars, emit_notes_for_differences_2, old_set->vars);
   emit_notes_for_changes (insn, EMIT_NOTE_BEFORE_INSN, new_set->vars);
 }
 
@@ -8011,8 +7810,7 @@ emit_notes_in_bb (basic_block bb, datafl
 	      while (*p)
 		{
 		  XEXP (XEXP (*p, 0), 1)
-		    = vt_expand_loc (XEXP (XEXP (*p, 0), 1),
-				     shared_hash_htab (set->vars), true);
+		    = vt_expand_loc (XEXP (XEXP (*p, 0), 1), set->vars, true);
 		  /* If expansion is successful, keep it in the list.  */
 		  if (XEXP (XEXP (*p, 0), 1))
 		    p = &XEXP (*p, 1);
@@ -8339,9 +8137,7 @@ vt_emit_notes (void)
       dataflow_set_clear (&VTI (bb)->in);
     }
 #ifdef ENABLE_CHECKING
-  htab_traverse (shared_hash_htab (cur.vars),
-		 emit_notes_for_differences_1,
-		 shared_hash_htab (empty_shared_hash));
+  htab_traverse (cur.vars, emit_notes_for_differences_1, empty_htab);
   if (MAY_HAVE_DEBUG_INSNS)
     {
       unsigned int i;
@@ -8703,13 +8499,9 @@ vt_initialize (void)
   loc_chain_pool = create_alloc_pool ("location_chain_def pool",
 				      sizeof (struct location_chain_def),
 				      1024);
-  shared_hash_pool = create_alloc_pool ("shared_hash_def pool",
-					sizeof (struct shared_hash_def), 256);
-  empty_shared_hash = (shared_hash) pool_alloc (shared_hash_pool);
-  empty_shared_hash->refcount = 1;
-  empty_shared_hash->htab
-    = htab_create (1, variable_htab_hash, variable_htab_eq,
-		   variable_htab_free);
+  /* htab_pool = create_alloc_pool ("htab pool", sizeof (*htab_t), 256); */
+  empty_htab = htab_create (1, variable_htab_hash, variable_htab_eq,
+			    variable_htab_free);
   changed_variables = htab_create (10, variable_htab_hash, variable_htab_eq,
 				   variable_htab_free);
   if (MAY_HAVE_DEBUG_INSNS)
@@ -9017,12 +8809,11 @@ vt_finalize (void)
 	}
     }
   free_aux_for_blocks ();
-  htab_delete (empty_shared_hash->htab);
+  htab_delete (empty_htab);
   htab_delete (changed_variables);
   free_alloc_pool (attrs_pool);
   free_alloc_pool (var_pool);
   free_alloc_pool (loc_chain_pool);
-  free_alloc_pool (shared_hash_pool);
 
   if (MAY_HAVE_DEBUG_INSNS)
     {


^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [var-tracking] [not-good!] disable shared_hash and other simplifications
  2011-08-22 12:21 ` [var-tracking] [not-good!] disable shared_hash and other simplifications Dimitrios Apostolou
@ 2011-08-22 12:57   ` Jakub Jelinek
  2011-08-22 18:53     ` Steven Bosscher
  0 siblings, 1 reply; 9+ messages in thread
From: Jakub Jelinek @ 2011-08-22 12:57 UTC (permalink / raw)
  To: Dimitrios Apostolou; +Cc: gcc-patches, Steven Bosscher, Alexandre Oliva

On Mon, Aug 22, 2011 at 02:26:58PM +0300, Dimitrios Apostolou wrote:
> the attached patch applies after my previous one, and actually
> cancels all runtime gains from it. It doesn't make things worse than
> initially, so it's not *that* bad.

You should really test it on the testcases which lead to the changes,
e.g. http://gcc.gnu.org/ml/gcc-patches/2009-06/msg01586.html
You might not notice significant differences on small or common testcases,
but on larger testcases it may be a question whether something can be
compiled at all or not.  You'll see that reverting it is actually very
bad for the worst cases.

> * This is the part of the compiler that makes the heaviest use of
> hash tables. htab_traverse(), htab_delete(), FOR_EACH_HTAB_ELEMENT
> (should that be moved in libiberty?) all iterate way too many times
> over hash tables. This is all because decl/value UIDs are sparse.
> * Since var-tracking is happening per-function, is there a way to
> produce per-function UIDs and use those in a regular array instead
> of a sparse hash table? These UIDs could be produced at tree/rtx
> level, and I bet other parts in the compiler could make good use of
> them.

That wouldn't help you, you don't want/need to represent every DECL_UID and
every VALUE seen in the whole function in every bb, only those that were
actually seen there.  Having
(number_of_seen_decl_uids+number_of_values)*number_of_basic_blocks*sizeof(void*)
memory requirements would be still bigger than the size of all the hash
tables together, I think much bigger actually, and also slower when you'd
need to traverse it.

> * From a very wide field of view, all this DF solving reminded me a
> lot of what I've seen in df-*.c. Why can't variable tracking be
> integrated in the main DF pass of the compiler, the one that happens
> *after* register allocation?

I don't think there is any significant overlap.  Yes, var-tracking is a data
flow algorithm, but the code that does that is just a few lines, the big
part is how the merges are done and how the tracking propagates through a
basic block, and that has nothing in common with df-*.c.

	Jakub

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [var-tracking] small speed-ups
  2011-08-22 11:38 [var-tracking] small speed-ups Dimitrios Apostolou
  2011-08-22 12:21 ` [var-tracking] [not-good!] disable shared_hash and other simplifications Dimitrios Apostolou
@ 2011-08-22 13:44 ` Jakub Jelinek
  2011-08-23 12:36   ` Dimitrios Apostolou
  1 sibling, 1 reply; 9+ messages in thread
From: Jakub Jelinek @ 2011-08-22 13:44 UTC (permalink / raw)
  To: Dimitrios Apostolou; +Cc: gcc-patches, Steven Bosscher, Alexandre Oliva

On Mon, Aug 22, 2011 at 01:30:33PM +0300, Dimitrios Apostolou wrote:
> --- gcc/var-tracking.c	2011-06-03 01:42:31 +0000
> +++ gcc/var-tracking.c	2011-08-22 09:52:08 +0000
> @@ -1069,14 +1067,14 @@ adjust_insn (basic_block bb, rtx insn)
>  static inline bool
>  dv_is_decl_p (decl_or_value dv)
>  {
> -  return !dv || (int) TREE_CODE ((tree) dv) != (int) VALUE;
> +  return !dv || ((int) TREE_CODE ((tree) dv) != (int) VALUE);
>  }

Why?

>  /* Return true if a decl_or_value is a VALUE rtl.  */
>  static inline bool
>  dv_is_value_p (decl_or_value dv)
>  {
> -  return dv && !dv_is_decl_p (dv);
> +  return !dv_is_decl_p (dv);
>  }

This is fine, though I don't think it makes any difference if var-tracking
is compiled with -O+ - gcc should be able to optimize the second
NULL/non-NULL check out.

> @@ -1191,7 +1189,7 @@ dv_uid2hash (dvuid uid)
>  static inline hashval_t
>  dv_htab_hash (decl_or_value dv)
>  {
> -  return dv_uid2hash (dv_uid (dv));
> +  return (hashval_t) (dv_uid (dv));
>  }

Why?  dv_uid2hash is an inline that does exactly that.

> @@ -1202,7 +1200,7 @@ variable_htab_hash (const void *x)
>  {
>    const_variable const v = (const_variable) x;
>  
> -  return dv_htab_hash (v->dv);
> +  return (hashval_t) (dv_uid (v->dv));
>  }

Why?

>  /* Compare the declaration of variable X with declaration Y.  */
> @@ -1211,9 +1209,8 @@ static int
>  variable_htab_eq (const void *x, const void *y)
>  {
>    const_variable const v = (const_variable) x;
> -  decl_or_value dv = CONST_CAST2 (decl_or_value, const void *, y);
>  
> -  return (dv_as_opaque (v->dv) == dv_as_opaque (dv));
> +  return (v->dv) == y;
>  }

Why?

> @@ -1397,19 +1398,40 @@ shared_var_p (variable var, shared_hash 
>  	  || shared_hash_shared (vars));
>  }
>  
> +/* Copy all variables from hash table SRC to hash table DST without rehashing
> +   any values.  */
> +
> +static htab_t
> +htab_dup (htab_t src)
> +{
> +  htab_t dst;
> +
> +  dst = (htab_t) xmalloc (sizeof (*src));
> +  memcpy (dst, src, sizeof (*src));
> +  dst->entries = (void **) xmalloc (src->size * sizeof (*src->entries));
> +  memcpy (dst->entries, src->entries,
> +	  src->size * sizeof (*src->entries));
> +  return dst;
> +}
> +

This certainly doesn't belong here, it should go into libiberty/hashtab.c
and prototype into include/hashtab.h.  It relies on hashtab.c
implementation details.

> @@ -2034,7 +2041,8 @@ val_resolve (dataflow_set *set, rtx val,
>  static void
>  dataflow_set_init (dataflow_set *set)
>  {
> -  init_attrs_list_set (set->regs);
> +  /* Initialize the set (array) SET of attrs to empty lists.  */
> +  memset (set->regs, 0, sizeof (set->regs));
>    set->vars = shared_hash_copy (empty_shared_hash);
>    set->stack_adjust = 0;
>    set->traversed_vars = NULL;

I'd say you should instead just implement init_attrs_list_set inline using
memset.

>    dst->vars = (shared_hash) pool_alloc (shared_hash_pool);
>    dst->vars->refcount = 1;
>    dst->vars->htab
> -    = htab_create (MAX (src1_elems, src2_elems), variable_htab_hash,
> +    = htab_create (2 * MAX (src1_elems, src2_elems), variable_htab_hash,
>  		   variable_htab_eq, variable_htab_free);

This looks wrong, 2 * max is definitely too much.

> @@ -8996,11 +9006,13 @@ vt_finalize (void)
>  
>    FOR_ALL_BB (bb)
>      {
> -      dataflow_set_destroy (&VTI (bb)->in);
> -      dataflow_set_destroy (&VTI (bb)->out);
> +      /* The "false" do_free parameter means to not bother to iterate and free
> +	 all hash table elements, since we'll destroy the pools. */
> +      dataflow_set_destroy (&VTI (bb)->in, false);
> +      dataflow_set_destroy (&VTI (bb)->out, false);
>        if (VTI (bb)->permp)
>  	{
> -	  dataflow_set_destroy (VTI (bb)->permp);
> +	  dataflow_set_destroy (VTI (bb)->permp, false);
>  	  XDELETE (VTI (bb)->permp);
>  	}
>      }
> 

How much does this actually speed things up (the not freeing pool allocated
stuff during finalizaqtion)?  Is it really worth it?

	Jakub

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [var-tracking] [not-good!] disable shared_hash and other simplifications
  2011-08-22 12:57   ` Jakub Jelinek
@ 2011-08-22 18:53     ` Steven Bosscher
  2011-08-22 18:55       ` Jakub Jelinek
  0 siblings, 1 reply; 9+ messages in thread
From: Steven Bosscher @ 2011-08-22 18:53 UTC (permalink / raw)
  To: Jakub Jelinek; +Cc: Dimitrios Apostolou, gcc-patches, Alexandre Oliva

On Mon, Aug 22, 2011 at 1:54 PM, Jakub Jelinek <jakub@redhat.com> wrote:
>> * From a very wide field of view, all this DF solving reminded me a
>> lot of what I've seen in df-*.c. Why can't variable tracking be
>> integrated in the main DF pass of the compiler, the one that happens
>> *after* register allocation?
>
> I don't think there is any significant overlap.  Yes, var-tracking is a data
> flow algorithm, but the code that does that is just a few lines, the big
> part is how the merges are done and how the tracking propagates through a
> basic block, and that has nothing in common with df-*.c.

But perhaps with the DF solver fewer blocks are visited...?

Ciao!
Steven

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [var-tracking] [not-good!] disable shared_hash and other simplifications
  2011-08-22 18:53     ` Steven Bosscher
@ 2011-08-22 18:55       ` Jakub Jelinek
  0 siblings, 0 replies; 9+ messages in thread
From: Jakub Jelinek @ 2011-08-22 18:55 UTC (permalink / raw)
  To: Steven Bosscher; +Cc: Dimitrios Apostolou, gcc-patches, Alexandre Oliva

On Mon, Aug 22, 2011 at 07:49:43PM +0200, Steven Bosscher wrote:
> On Mon, Aug 22, 2011 at 1:54 PM, Jakub Jelinek <jakub@redhat.com> wrote:
> >> * From a very wide field of view, all this DF solving reminded me a
> >> lot of what I've seen in df-*.c. Why can't variable tracking be
> >> integrated in the main DF pass of the compiler, the one that happens
> >> *after* register allocation?
> >
> > I don't think there is any significant overlap.  Yes, var-tracking is a data
> > flow algorithm, but the code that does that is just a few lines, the big
> > part is how the merges are done and how the tracking propagates through a
> > basic block, and that has nothing in common with df-*.c.
> 
> But perhaps with the DF solver fewer blocks are visited...?

Or perhaps more.  Who knows.

	Jakub

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [var-tracking] small speed-ups
  2011-08-22 13:44 ` [var-tracking] small speed-ups Jakub Jelinek
@ 2011-08-23 12:36   ` Dimitrios Apostolou
  2011-08-23 14:51     ` Jakub Jelinek
  0 siblings, 1 reply; 9+ messages in thread
From: Dimitrios Apostolou @ 2011-08-23 12:36 UTC (permalink / raw)
  To: Jakub Jelinek
  Cc: Dimitrios Apostolou, gcc-patches, Steven Bosscher, Alexandre Oliva

Hi jakub,

On Mon, 22 Aug 2011, Jakub Jelinek wrote:
> On Mon, Aug 22, 2011 at 01:30:33PM +0300, Dimitrios Apostolou wrote:
>
>> @@ -1191,7 +1189,7 @@ dv_uid2hash (dvuid uid)
>>  static inline hashval_t
>>  dv_htab_hash (decl_or_value dv)
>>  {
>> -  return dv_uid2hash (dv_uid (dv));
>> +  return (hashval_t) (dv_uid (dv));
>>  }
>
> Why?  dv_uid2hash is an inline that does exactly that.
>
>> @@ -1202,7 +1200,7 @@ variable_htab_hash (const void *x)
>>  {
>>    const_variable const v = (const_variable) x;
>>
>> -  return dv_htab_hash (v->dv);
>> +  return (hashval_t) (dv_uid (v->dv));
>>  }
>
> Why?
>
>>  /* Compare the declaration of variable X with declaration Y.  */
>> @@ -1211,9 +1209,8 @@ static int
>>  variable_htab_eq (const void *x, const void *y)
>>  {
>>    const_variable const v = (const_variable) x;
>> -  decl_or_value dv = CONST_CAST2 (decl_or_value, const void *, y);
>>
>> -  return (dv_as_opaque (v->dv) == dv_as_opaque (dv));
>> +  return (v->dv) == y;
>>  }
>
> Why?

I was hoping you'd ask so I'll ask back :-) Why are we doing it that way? 
Why so much indirection in the first place? Why create inline functions 
just to typecast and why do we need this CONST_CAST2 ugliness in C code. I 
bet there are things I don't understand so I'd be happy to listen...

The reason I did this (and many more I didn't publish) simplifications 
within var-tracking is because it hurt my brains to follow the 
logic. Even with the help of TAGS I have a specific stack depth before I 
forget where I begun diving into TAG declarations. Well in var-tracking 
this limit was surpassed by much...

>
>> @@ -1397,19 +1398,40 @@ shared_var_p (variable var, shared_hash
>>  	  || shared_hash_shared (vars));
>>  }
>>
>> +/* Copy all variables from hash table SRC to hash table DST without rehashing
>> +   any values.  */
>> +
>> +static htab_t
>> +htab_dup (htab_t src)
>> +{
>> +  htab_t dst;
>> +
>> +  dst = (htab_t) xmalloc (sizeof (*src));
>> +  memcpy (dst, src, sizeof (*src));
>> +  dst->entries = (void **) xmalloc (src->size * sizeof (*src->entries));
>> +  memcpy (dst->entries, src->entries,
>> +	  src->size * sizeof (*src->entries));
>> +  return dst;
>> +}
>> +
>
> This certainly doesn't belong here, it should go into libiberty/hashtab.c
> and prototype into include/hashtab.h.  It relies on hashtab.c
> implementation details.

OK I'll do that in the future. Should I also move some other htab 
functions I saw in var-tracking and rtl? FOR_EACH_HTAB_ELEMENT comes to 
mind, probably other too.

>
>> @@ -2034,7 +2041,8 @@ val_resolve (dataflow_set *set, rtx val,
>>  static void
>>  dataflow_set_init (dataflow_set *set)
>>  {
>> -  init_attrs_list_set (set->regs);
>> +  /* Initialize the set (array) SET of attrs to empty lists.  */
>> +  memset (set->regs, 0, sizeof (set->regs));
>>    set->vars = shared_hash_copy (empty_shared_hash);
>>    set->stack_adjust = 0;
>>    set->traversed_vars = NULL;
>
> I'd say you should instead just implement init_attrs_list_set inline using
> memset.

It's used only once, that's why I deleted the function. I'll bring it back 
if you think it helps.

>
>>    dst->vars = (shared_hash) pool_alloc (shared_hash_pool);
>>    dst->vars->refcount = 1;
>>    dst->vars->htab
>> -    = htab_create (MAX (src1_elems, src2_elems), variable_htab_hash,
>> +    = htab_create (2 * MAX (src1_elems, src2_elems), variable_htab_hash,
>>  		   variable_htab_eq, variable_htab_free);
>
> This looks wrong, 2 * max is definitely too much.

For a hash table to fit N elements, it has to have at least 4/3*N 
slots, or 2*N slots if htab has the 50% load factor I was proposing.

>> @@ -8996,11 +9006,13 @@ vt_finalize (void)
>>
>>    FOR_ALL_BB (bb)
>>      {
>> -      dataflow_set_destroy (&VTI (bb)->in);
>> -      dataflow_set_destroy (&VTI (bb)->out);
>> +      /* The "false" do_free parameter means to not bother to iterate and free
>> +	 all hash table elements, since we'll destroy the pools. */
>> +      dataflow_set_destroy (&VTI (bb)->in, false);
>> +      dataflow_set_destroy (&VTI (bb)->out, false);
>>        if (VTI (bb)->permp)
>>  	{
>> -	  dataflow_set_destroy (VTI (bb)->permp);
>> +	  dataflow_set_destroy (VTI (bb)->permp, false);
>>  	  XDELETE (VTI (bb)->permp);
>>  	}
>>      }
>>
>
> How much does this actually speed things up (the not freeing pool allocated
> stuff during finalizaqtion)?  Is it really worth it?

In total for dataflow_set_destroy I can see that calls to 
attrs_list_clear() have been reduced from 500K to 250K, and I can also see 
a reduction of free() calls from htab_delete(), from 30K to 10K. I'm 
willing to bet that much of this is because of this change, I have kept 
only the ones that showed difference and remember clearly that 
var-tracking is iterating over hash tables too much, either directly or 
from htab_traverse()/htab_delete().


Thanks,
Dimitris

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [var-tracking] small speed-ups
  2011-08-23 12:36   ` Dimitrios Apostolou
@ 2011-08-23 14:51     ` Jakub Jelinek
  2011-08-23 14:56       ` Dimitrios Apostolou
  0 siblings, 1 reply; 9+ messages in thread
From: Jakub Jelinek @ 2011-08-23 14:51 UTC (permalink / raw)
  To: Dimitrios Apostolou; +Cc: gcc-patches, Steven Bosscher, Alexandre Oliva

On Tue, Aug 23, 2011 at 02:40:56PM +0300, Dimitrios Apostolou wrote:
> I was hoping you'd ask so I'll ask back :-) Why are we doing it that
> way? Why so much indirection in the first place? Why create inline
> functions just to typecast and why do we need this CONST_CAST2
> ugliness in C code. I bet there are things I don't understand so I'd
> be happy to listen...

It is not indirection, it is abstraction, which should make the code more
readable and allow changing the implementation details.

> OK I'll do that in the future. Should I also move some other htab
> functions I saw in var-tracking and rtl? FOR_EACH_HTAB_ELEMENT comes
> to mind, probably other too.

I guess FOR_EACH_HTAB_ELEMENT could move too (together with all the support
though - tree-flow.h and tree-flow-inline.h related stuff).

> It's used only once, that's why I deleted the function. I'll bring
> it back if you think it helps.

Yes, please.

> >>   dst->vars = (shared_hash) pool_alloc (shared_hash_pool);
> >>   dst->vars->refcount = 1;
> >>   dst->vars->htab
> >>-    = htab_create (MAX (src1_elems, src2_elems), variable_htab_hash,
> >>+    = htab_create (2 * MAX (src1_elems, src2_elems), variable_htab_hash,
> >> 		   variable_htab_eq, variable_htab_free);
> >
> >This looks wrong, 2 * max is definitely too much.
> 
> For a hash table to fit N elements, it has to have at least 4/3*N
> slots, or 2*N slots if htab has the 50% load factor I was proposing.

For var-tracking, 50% load factor is IMHO a very bad idea, memory
consumption of var-tracking is already high right now, we IMHO don't have
the luxury to waste more RAM.

> In total for dataflow_set_destroy I can see that calls to
> attrs_list_clear() have been reduced from 500K to 250K, and I can
> also see a reduction of free() calls from htab_delete(), from 30K to

free calls?  If you avoid free calls, then you end up with a memory leak.
I can understand when you avoid pool_free calls...

	Jakub

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [var-tracking] small speed-ups
  2011-08-23 14:51     ` Jakub Jelinek
@ 2011-08-23 14:56       ` Dimitrios Apostolou
  0 siblings, 0 replies; 9+ messages in thread
From: Dimitrios Apostolou @ 2011-08-23 14:56 UTC (permalink / raw)
  To: Jakub Jelinek
  Cc: Dimitrios Apostolou, gcc-patches, Steven Bosscher, Alexandre Oliva

On Tue, 23 Aug 2011, Jakub Jelinek wrote:
> On Tue, Aug 23, 2011 at 02:40:56PM +0300, Dimitrios Apostolou wrote:
>
>>>>   dst->vars = (shared_hash) pool_alloc (shared_hash_pool);
>>>>   dst->vars->refcount = 1;
>>>>   dst->vars->htab
>>>> -    = htab_create (MAX (src1_elems, src2_elems), variable_htab_hash,
>>>> +    = htab_create (2 * MAX (src1_elems, src2_elems), variable_htab_hash,
>>>> 		   variable_htab_eq, variable_htab_free);
>>>
>>> This looks wrong, 2 * max is definitely too much.
>>
>> For a hash table to fit N elements, it has to have at least 4/3*N
>> slots, or 2*N slots if htab has the 50% load factor I was proposing.
>
> For var-tracking, 50% load factor is IMHO a very bad idea, memory
> consumption of var-tracking is already high right now, we IMHO don't have
> the luxury to waste more RAM.

Agreed, then I 'll change it to 4/3 * MAX so that I avoid expansions.

>> In total for dataflow_set_destroy I can see that calls to
>> attrs_list_clear() have been reduced from 500K to 250K, and I can
>> also see a reduction of free() calls from htab_delete(), from 30K to
>
> free calls?  If you avoid free calls, then you end up with a memory leak.
> I can understand when you avoid pool_free calls...

You are right, I just mentioned the total difference in free() calls from 
all my patches. But in this part there is no free() involved, so the major 
gain should be from avoiding htab_delete() iterating many times over the 
hash tables. Annotated source from the callgrind profiler (shows 
instruction count):

Before:

          .  void
15,820,597  htab_delete (htab_t htab)
    250,914  {
     41,819    size_t size = htab_size (htab);
     41,819    PTR *entries = htab->entries;
          .    int i;
          .
     83,638    if (htab->del_f)
66,493,884      for (i = size - 1; i >= 0; i--)
49,825,998        if (entries[i] != HTAB_EMPTY_ENTRY && entries[i] != HTAB_DELETED_ENTRY)
  1,813,018  	(*htab->del_f) (entries[i]);
  1,800,731  => tree-into-ssa.c:def_blocks_free (10988x)
    345,910  => tree-into-ssa.c:repl_map_free (2815x)
     53,950  => tree-scalar-evolution.c:del_scev_info (1197x)
    139,354  => tree-ssa-loop-im.c:memref_free (198x)
    281,777  => tree-ssa-sccvn.c:free_phi (3788x)
     81,726  => tree-ssa-uncprop.c:equiv_free (463x)
17,359,572  => var-tracking.c:variable_htab_free (835512x)
     42,161  => cfgloop.c:loop_exit_free (720x)
    284,904  => ira-costs.c:cost_classes_del (2270x)
        157  => tree-ssa-loop-im.c:vtoe_free (1x)
11,454,221  => ???:free (37228x)
  1,460,684  => tree-ssa-sccvn.c:free_reference (11329x)
          .
    125,457    if (htab->free_f != NULL)



After:

          .  void
  6,543,474  htab_delete (htab_t htab)
    250,914  {
     41,819    size_t size = htab_size (htab);
     41,819    PTR *entries = htab->entries;
          .    int i;
          .
     83,638    if (htab->del_f)
29,288,268      for (i = size - 1; i >= 0; i--)
21,927,330        if (entries[i] != HTAB_EMPTY_ENTRY && entries[i] != HTAB_DELETED_ENTRY)
  1,738,584  	(*htab->del_f) (entries[i]);
    139,344  => tree-ssa-loop-im.c:memref_free (198x)
        157  => tree-ssa-loop-im.c:vtoe_free (1x)
     81,762  => tree-ssa-uncprop.c:equiv_free (463x)
    281,884  => tree-ssa-sccvn.c:free_phi (3788x)
     42,145  => cfgloop.c:loop_exit_free (720x)
    345,870  => tree-into-ssa.c:repl_map_free (2815x)
  1,462,000  => tree-ssa-sccvn.c:free_reference (11329x)
  1,800,951  => tree-into-ssa.c:def_blocks_free (10988x)
16,518,004  => var-tracking.c:variable_htab_free (824010x)
    284,979  => ira-costs.c:cost_classes_del (2270x)
  9,080,245  => ???:free (11513x)
     53,921  => tree-scalar-evolution.c:del_scev_info (1197x)
          .
    125,457    if (htab->free_f != NULL)



So if the part relevant to this patch is the number of calls to 
variable_htab_free, I suppose it won't make a big difference.

But if I take a look at the top callers and top callees of htab_delete():

Before:

  19,590,149  < tree-ssa-structalias.c:solve_constraints (2058x) [cc1]
  33,299,064  < var-tracking.c:shared_hash_destroy (6923x) [cc1]
  68,941,719  < tree-ssa-pre.c:execute_pre (2058x) [cc1]
134,915,334  *  hashtab.c:htab_delete [cc1]
  17,359,572  >   var-tracking.c:variable_htab_free (835512x) [cc1]
  11,454,221  >   ???:free (108456x) [libc-2.12.so]
   1,800,731  >   tree-into-ssa.c:def_blocks_free (10988x) [cc1]


After:

   8,756,328  < tree-ssa-structalias.c:delete_points_to_sets (1029x) [cc1]
   9,316,165  < tree-ssa-dom.c:tree_ssa_dominator_optimize (562x) [cc1]
  83,696,211  < var-tracking.c:shared_hash_destroy (6923x) [cc1]
  60,459,493  *  hashtab.c:htab_delete [cc1]
  16,518,004  >   var-tracking.c:variable_htab_free (824010x) [cc1]
   9,080,245  >   ???:free (82741x) [libc-2.12.so]
   1,800,951  >   tree-into-ssa.c:def_blocks_free (10988x) [cc1]


Then I'm more confused, htab_delete() seems to be much colder, probably 
due to less traversals of hash tables, but It's not clear how much 
var-tracking changes are responsible for this. I'll keep in mind to revert 
the do_free parameter and report back. I guess for less than 10 M instr 
it's not worth the hassle.


Thanks,
Dimitris

^ permalink raw reply	[flat|nested] 9+ messages in thread

end of thread, other threads:[~2011-08-23 14:37 UTC | newest]

Thread overview: 9+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2011-08-22 11:38 [var-tracking] small speed-ups Dimitrios Apostolou
2011-08-22 12:21 ` [var-tracking] [not-good!] disable shared_hash and other simplifications Dimitrios Apostolou
2011-08-22 12:57   ` Jakub Jelinek
2011-08-22 18:53     ` Steven Bosscher
2011-08-22 18:55       ` Jakub Jelinek
2011-08-22 13:44 ` [var-tracking] small speed-ups Jakub Jelinek
2011-08-23 12:36   ` Dimitrios Apostolou
2011-08-23 14:51     ` Jakub Jelinek
2011-08-23 14:56       ` Dimitrios Apostolou

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).