public inbox for gcc-patches@gcc.gnu.org
* Various minor speed-ups
@ 2011-08-22  7:50 Dimitrios Apostolou
  2011-08-22  7:53 ` mem_attrs_htab Dimitrios Apostolou
                   ` (6 more replies)
  0 siblings, 7 replies; 22+ messages in thread
From: Dimitrios Apostolou @ 2011-08-22  7:50 UTC (permalink / raw)
  To: gcc-patches; +Cc: Steven Bosscher, Dimitrios Apostolou

Hello list,

the following patches are a selection of minor changes introduced at
various times during my GSoC project. Most of them are too simple or too
minor to be posted alone, so I'll post them all together under
this thread. Nevertheless, they have been carefully selected from a pool of
other changes: they are the ones that *do* offer some (minor) speed
improvement and have the least impact on memory usage, if any.

They have all been tested on x86_64, and some also on i386. I have seen
no regressions in production builds.


Thanks,
Dimitris

^ permalink raw reply	[flat|nested] 22+ messages in thread

* graphds.[ch]: alloc_pool for edges
  2011-08-22  7:50 Various minor speed-ups Dimitrios Apostolou
  2011-08-22  7:53 ` mem_attrs_htab Dimitrios Apostolou
@ 2011-08-22  7:53 ` Dimitrios Apostolou
  2011-08-22  8:46   ` Jakub Jelinek
  2011-08-22 10:11   ` Richard Guenther
  2011-08-22  7:59 ` tree-ssa*: reduce malloc() calls by preallocating hot VECs on the stack Dimitrios Apostolou
                   ` (4 subsequent siblings)
  6 siblings, 2 replies; 22+ messages in thread
From: Dimitrios Apostolou @ 2011-08-22  7:53 UTC (permalink / raw)
  To: Dimitrios Apostolou; +Cc: gcc-patches, Steven Bosscher

[-- Attachment #1: Type: TEXT/PLAIN, Size: 524 bytes --]

free() was called far too often before; this patch reduces the number of
calls significantly. There is a minor speed-up here too, but I don't quote
it individually since the numbers are within the noise margin.


2011-08-22  Dimitrios Apostolou  <jimis@gmx.net>

 	* graphds.h (struct graph): Add edge_pool as a pool for
 	allocating edges.
 	* graphds.c (new_graph): Initialise edge_pool.
 	(add_edge): Allocate edge from edge_pool rather than with malloc.
 	(free_graph): Instead of iterating across the graph freeing edges,
 	just destroy the edge_pool.
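
To illustrate the idea behind this change, here is a minimal sketch of a
fixed-size object pool, assuming edges are only ever released all at once.
It is not GCC's alloc-pool.h: every demo_* name and the block size of 32
are hypothetical, chosen only for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define DEMO_BLOCK_EDGES 32   /* Hypothetical: edges per malloc'ed block.  */

/* Hypothetical stand-in for struct graph_edge.  */
struct demo_edge
{
  int src, dest;
  struct demo_edge *succ_next;
};

/* One malloc'ed block that holds several edges.  */
struct demo_block
{
  struct demo_block *next;
  size_t used;
  struct demo_edge slots[DEMO_BLOCK_EDGES];
};

struct demo_pool
{
  struct demo_block *blocks;   /* Most recently allocated block first.  */
  size_t n_blocks;             /* Number of malloc calls made so far.  */
};

static void
demo_pool_init (struct demo_pool *p)
{
  p->blocks = NULL;
  p->n_blocks = 0;
}

/* Hand out one edge; malloc is called only when the current block is full.  */
static struct demo_edge *
demo_pool_alloc (struct demo_pool *p)
{
  if (p->blocks == NULL || p->blocks->used == DEMO_BLOCK_EDGES)
    {
      struct demo_block *b = malloc (sizeof *b);
      if (b == NULL)
        abort ();
      b->next = p->blocks;
      b->used = 0;
      p->blocks = b;
      p->n_blocks++;
    }
  return &p->blocks->slots[p->blocks->used++];
}

/* Release every edge at once: one free per block instead of one per edge.
   Returns the number of free calls made.  */
static size_t
demo_pool_destroy (struct demo_pool *p)
{
  size_t n_freed = 0;
  while (p->blocks)
    {
      struct demo_block *next = p->blocks->next;
      free (p->blocks);
      p->blocks = next;
      n_freed++;
    }
  p->n_blocks = 0;
  return n_freed;
}
```

With this layout, allocating 100 edges costs 4 malloc calls and tearing the
pool down costs 4 free calls, rather than 100 of each. The trade-off is that
a single edge cannot be returned to the system on its own.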

[-- Attachment #2: Type: TEXT/plain, Size: 1966 bytes --]

=== modified file 'gcc/graphds.c'
--- gcc/graphds.c	2009-11-25 10:55:54 +0000
+++ gcc/graphds.c	2011-08-19 16:44:41 +0000
@@ -62,7 +62,8 @@ new_graph (int n_vertices)
 
   g->n_vertices = n_vertices;
   g->vertices = XCNEWVEC (struct vertex, n_vertices);
-
+  g->edge_pool = create_alloc_pool ("edge_pool",
+				    sizeof (struct graph_edge), 32);
   return g;
 }
 
@@ -71,7 +72,7 @@ new_graph (int n_vertices)
 struct graph_edge *
 add_edge (struct graph *g, int f, int t)
 {
-  struct graph_edge *e = XNEW (struct graph_edge);
+  struct graph_edge *e = (struct graph_edge *) pool_alloc (g->edge_pool);
   struct vertex *vf = &g->vertices[f], *vt = &g->vertices[t];
 
 
@@ -326,19 +327,7 @@ for_each_edge (struct graph *g, graphds_
 void
 free_graph (struct graph *g)
 {
-  struct graph_edge *e, *n;
-  struct vertex *v;
-  int i;
-
-  for (i = 0; i < g->n_vertices; i++)
-    {
-      v = &g->vertices[i];
-      for (e = v->succ; e; e = n)
-	{
-	  n = e->succ_next;
-	  free (e);
-	}
-    }
+  free_alloc_pool (g->edge_pool);
   free (g->vertices);
   free (g);
 }

=== modified file 'gcc/graphds.h'
--- gcc/graphds.h	2009-02-20 15:20:38 +0000
+++ gcc/graphds.h	2011-08-19 16:44:41 +0000
@@ -18,6 +18,10 @@ You should have received a copy of the G
 along with GCC; see the file COPYING3.  If not see
 <http://www.gnu.org/licenses/>.  */
 
+
+#include "alloc-pool.h"
+
+
 /* Structure representing edge of a graph.  */
 
 struct graph_edge
@@ -44,10 +48,10 @@ struct vertex
 
 struct graph
 {
-  int n_vertices;	/* Number of vertices.  */
-  struct vertex *vertices;
-			/* The vertices.  */
-  htab_t indices;	/* Fast lookup for indices.  */
+  int n_vertices;		/* Number of vertices.  */
+  struct vertex *vertices;	/* The vertices.  */
+  htab_t indices;		/* Fast lookup for indices.  */
+  alloc_pool edge_pool;		/* Pool for allocating edges. */
 };
 
 struct graph *new_graph (int);



* mem_attrs_htab
  2011-08-22  7:50 Various minor speed-ups Dimitrios Apostolou
@ 2011-08-22  7:53 ` Dimitrios Apostolou
  2011-08-22  8:37   ` mem_attrs_htab Jakub Jelinek
  2011-08-22  7:53 ` graphds.[ch]: alloc_pool for edges Dimitrios Apostolou
                   ` (5 subsequent siblings)
  6 siblings, 1 reply; 22+ messages in thread
From: Dimitrios Apostolou @ 2011-08-22  7:53 UTC (permalink / raw)
  To: Dimitrios Apostolou; +Cc: gcc-patches, Steven Bosscher

[-- Attachment #1: Type: TEXT/PLAIN, Size: 470 bytes --]


2011-08-22  Dimitrios Apostolou  <jimis@gmx.net>

 	* emit-rtl.c (mem_attrs_htab_hash): Hash the mem_attrs struct in
 	bulk by calling iterative_hash(). We disregard the offset and size
 	rtx fields of the mem_attrs struct, but overall this hash is a
 	*huge* improvement over the previous one: it reduces the
 	collisions/searches ratio from 8 to 0.8 in some cases.
 	(init_emit_once): Slightly increase the mem_attrs_htab initial
 	size because it's frequently used and expanded many times.
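
To show why a hash built from XORed, scaled fields can collide easily, here
is a minimal sketch that compares it with a byte-wise iterative hash. It is
an illustration under assumptions, not the code in emit-rtl.c: struct
demo_attrs is a hypothetical, integer-only stand-in for mem_attrs, and
FNV-1a is used as a substitute for iterative_hash().

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for mem_attrs: integer fields only.  */
struct demo_attrs
{
  uint32_t alias;
  uint32_t align;
  uint32_t addrspace;
};

/* Old style: XOR of the fields, each scaled by a constant.  */
static uint32_t
demo_hash_xor (const struct demo_attrs *p)
{
  return p->alias ^ (p->align * 1000u) ^ (p->addrspace * 4000u);
}

/* FNV-1a over a byte range, chained through SEED.  This is a substitute
   chosen for brevity, not the algorithm behind iterative_hash().  */
static uint32_t
demo_hash_bytes (const void *data, size_t len, uint32_t seed)
{
  const unsigned char *bytes = data;
  uint32_t h = seed ^ 2166136261u;
  for (size_t i = 0; i < len; i++)
    {
      h ^= bytes[i];
      h *= 16777619u;
    }
  return h;
}

/* New style: feed each field's bytes into the running hash.  Hashing field
   by field avoids mixing in any padding bytes.  */
static uint32_t
demo_hash_iter (const struct demo_attrs *p)
{
  uint32_t h = demo_hash_bytes (&p->alias, sizeof p->alias, 0);
  h = demo_hash_bytes (&p->align, sizeof p->align, h);
  return demo_hash_bytes (&p->addrspace, sizeof p->addrspace, h);
}
```

In this sketch, demo_hash_xor() gives the same value for
{alias = 4000, align = 0, addrspace = 0} and
{alias = 0, align = 0, addrspace = 1}, because one field's value can cancel
another field's scaled contribution. A byte-wise hash is not collision-free
either, but it does not have this structural weakness.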

[-- Attachment #2: Type: TEXT/plain, Size: 1222 bytes --]

=== modified file 'gcc/emit-rtl.c'
--- gcc/emit-rtl.c	2011-05-29 17:40:05 +0000
+++ gcc/emit-rtl.c	2011-08-21 04:44:25 +0000
@@ -256,11 +256,10 @@ mem_attrs_htab_hash (const void *x)
 {
   const mem_attrs *const p = (const mem_attrs *) x;
 
-  return (p->alias ^ (p->align * 1000)
-	  ^ (p->addrspace * 4000)
-	  ^ ((p->offset ? INTVAL (p->offset) : 0) * 50000)
-	  ^ ((p->size ? INTVAL (p->size) : 0) * 2500000)
-	  ^ (size_t) iterative_hash_expr (p->expr, 0));
+  /* By massively feeding the mem_attrs struct to iterative_hash() we
+     disregard the p->offset and p->size rtx, but in total the hash is
+     quick and good enough. */
+  return iterative_hash_object (*p, iterative_hash_expr (p->expr, 0));
 }
 
 /* Returns nonzero if the value represented by X (which is really a
@@ -5494,7 +5500,7 @@ init_emit_once (void)
   const_fixed_htab = htab_create_ggc (37, const_fixed_htab_hash,
 				      const_fixed_htab_eq, NULL);
 
-  mem_attrs_htab = htab_create_ggc (37, mem_attrs_htab_hash,
+  mem_attrs_htab = htab_create_ggc (128, mem_attrs_htab_hash,
 				    mem_attrs_htab_eq, NULL);
   reg_attrs_htab = htab_create_ggc (37, reg_attrs_htab_hash,
 				    reg_attrs_htab_eq, NULL);



* tree-ssa*: reduce malloc() calls by preallocating hot VECs on the stack
  2011-08-22  7:50 Various minor speed-ups Dimitrios Apostolou
  2011-08-22  7:53 ` mem_attrs_htab Dimitrios Apostolou
  2011-08-22  7:53 ` graphds.[ch]: alloc_pool for edges Dimitrios Apostolou
@ 2011-08-22  7:59 ` Dimitrios Apostolou
  2011-08-22 10:07   ` Richard Guenther
  2011-08-22  8:05 ` tree-ssa-structalias.c: alloc_pool for struct equiv_class_label Dimitrios Apostolou
                   ` (3 subsequent siblings)
  6 siblings, 1 reply; 22+ messages in thread
From: Dimitrios Apostolou @ 2011-08-22  7:59 UTC (permalink / raw)
  To: Dimitrios Apostolou; +Cc: gcc-patches, Steven Bosscher

[-- Attachment #1: Type: TEXT/PLAIN, Size: 1057 bytes --]


2011-08-22  Dimitrios Apostolou  <jimis@gmx.net>

 	Allocate some very frequently used vectors on the stack:
 	* vecir.h: Define a tree vector on the stack.
 	* tree-ssa-sccvn.c (print_scc, sort_scc, process_scc)
 	(extract_and_process_scc_for_name): Allocate the scc vector on the
 	stack instead of the heap, giving it a minimal initial size
 	instead of 0.
 	* tree-ssa-structalias.c (get_constraint_for_1)
 	(get_constraint_for, get_constraint_for_rhs, do_deref)
 	(get_constraint_for_ssa_var, get_constraint_for_ptr_offset)
 	(get_constraint_for_component_ref, get_constraint_for_address_of)
 	(process_all_all_constraints, do_structure_copy)
 	(make_constraints_to, make_constraint_to, handle_rhs_call)
 	(handle_lhs_call, handle_const_call, handle_pure_call)
 	(find_func_aliases_for_builtin_call, find_func_aliases_for_call)
 	(find_func_aliases, process_ipa_clobber, find_func_clobbers)
 	(create_variable_info_for): Convert the rhsc and lhsc vectors from
 	heap to stack, with a minimal initial size, since they were very
 	frequently allocated.
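
The general technique can be sketched as a small vector that starts in a
preallocated buffer and falls back to the heap only when it outgrows it.
This is a minimal illustration, not GCC's VEC implementation: the demo_*
names are hypothetical, and the inline capacity of 32 simply mirrors the
initial size used in the patch.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define DEMO_INLINE_CAP 32   /* Mirrors the initial size used in the patch.  */

struct demo_vec
{
  int *data;                        /* Points at inline_buf or a heap block.  */
  size_t len, cap;
  int inline_buf[DEMO_INLINE_CAP];  /* Lives wherever the struct lives.  */
};

static void
demo_vec_init (struct demo_vec *v)
{
  v->data = v->inline_buf;
  v->len = 0;
  v->cap = DEMO_INLINE_CAP;
}

static int
demo_vec_on_heap (const struct demo_vec *v)
{
  return v->data != v->inline_buf;
}

/* Append VALUE; malloc is reached only once the inline buffer is full.  */
static void
demo_vec_push (struct demo_vec *v, int value)
{
  if (v->len == v->cap)
    {
      size_t new_cap = v->cap * 2;
      int *p;
      if (demo_vec_on_heap (v))
        p = realloc (v->data, new_cap * sizeof *p);
      else
        {
          /* First overflow: move the inline contents to the heap.  */
          p = malloc (new_cap * sizeof *p);
          if (p)
            memcpy (p, v->inline_buf, v->len * sizeof *p);
        }
      if (p == NULL)
        abort ();
      v->data = p;
      v->cap = new_cap;
    }
  v->data[v->len++] = value;
}

/* Free heap storage, if any, and reset the vector to its inline buffer.  */
static void
demo_vec_free (struct demo_vec *v)
{
  if (demo_vec_on_heap (v))
    free (v->data);
  demo_vec_init (v);
}
```

A struct demo_vec declared as a local variable therefore handles up to 32
elements with no malloc call at all. Because data may point into the struct
itself, such a vector must not be copied by plain assignment.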

[-- Attachment #2: Type: TEXT/plain, Size: 28822 bytes --]

=== modified file 'gcc/tree-ssa-structalias.c'
--- gcc/tree-ssa-structalias.c	2011-08-18 06:53:12 +0000
+++ gcc/tree-ssa-structalias.c	2011-08-19 09:43:41 +0000
@@ -477,11 +477,14 @@ struct constraint_expr
 
 typedef struct constraint_expr ce_s;
 DEF_VEC_O(ce_s);
-DEF_VEC_ALLOC_O(ce_s, heap);
-static void get_constraint_for_1 (tree, VEC(ce_s, heap) **, bool, bool);
-static void get_constraint_for (tree, VEC(ce_s, heap) **);
-static void get_constraint_for_rhs (tree, VEC(ce_s, heap) **);
-static void do_deref (VEC (ce_s, heap) **);
+DEF_VEC_ALLOC_O_STACK(ce_s);
+#define VEC_ce_s_stack_alloc(alloc) \
+  VEC_stack_alloc (ce_s, alloc)
+
+static void get_constraint_for_1 (tree, VEC(ce_s, stack) **, bool, bool);
+static void get_constraint_for (tree, VEC(ce_s, stack) **);
+static void get_constraint_for_rhs (tree, VEC(ce_s, stack) **);
+static void do_deref (VEC (ce_s, stack) **);
 
 /* Our set constraints are made up of two constraint expressions, one
    LHS, and one RHS.
@@ -2736,7 +2739,7 @@ new_scalar_tmp_constraint_exp (const cha
    If address_p is true, the result will be taken its address of.  */
 
 static void
-get_constraint_for_ssa_var (tree t, VEC(ce_s, heap) **results, bool address_p)
+get_constraint_for_ssa_var (tree t, VEC(ce_s, stack) **results, bool address_p)
 {
   struct constraint_expr cexpr;
   varinfo_t vi;
@@ -2776,12 +2779,12 @@ get_constraint_for_ssa_var (tree t, VEC(
       for (; vi; vi = vi->next)
 	{
 	  cexpr.var = vi->id;
-	  VEC_safe_push (ce_s, heap, *results, &cexpr);
+	  VEC_safe_push (ce_s, stack, *results, &cexpr);
 	}
       return;
     }
 
-  VEC_safe_push (ce_s, heap, *results, &cexpr);
+  VEC_safe_push (ce_s, stack, *results, &cexpr);
 }
 
 /* Process constraint T, performing various simplifications and then
@@ -2861,7 +2864,7 @@ bitpos_of_field (const tree fdecl)
 
 static void
 get_constraint_for_ptr_offset (tree ptr, tree offset,
-			       VEC (ce_s, heap) **results)
+			       VEC (ce_s, stack) **results)
 {
   struct constraint_expr c;
   unsigned int j, n;
@@ -2920,7 +2923,7 @@ get_constraint_for_ptr_offset (tree ptr,
 	      c2.type = ADDRESSOF;
 	      c2.offset = 0;
 	      if (c2.var != c.var)
-		VEC_safe_push (ce_s, heap, *results, &c2);
+		VEC_safe_push (ce_s, stack, *results, &c2);
 	      temp = temp->next;
 	    }
 	  while (temp);
@@ -2955,7 +2958,7 @@ get_constraint_for_ptr_offset (tree ptr,
 	      c2.var = temp->next->id;
 	      c2.type = ADDRESSOF;
 	      c2.offset = 0;
-	      VEC_safe_push (ce_s, heap, *results, &c2);
+	      VEC_safe_push (ce_s, stack, *results, &c2);
 	    }
 	  c.var = temp->id;
 	  c.offset = 0;
@@ -2974,7 +2977,7 @@ get_constraint_for_ptr_offset (tree ptr,
    as the lhs.  */
 
 static void
-get_constraint_for_component_ref (tree t, VEC(ce_s, heap) **results,
+get_constraint_for_component_ref (tree t, VEC(ce_s, stack) **results,
 				  bool address_p, bool lhs_p)
 {
   tree orig_t = t;
@@ -2999,7 +3002,7 @@ get_constraint_for_component_ref (tree t
       temp.offset = 0;
       temp.var = integer_id;
       temp.type = SCALAR;
-      VEC_safe_push (ce_s, heap, *results, &temp);
+      VEC_safe_push (ce_s, stack, *results, &temp);
       return;
     }
 
@@ -3021,7 +3024,7 @@ get_constraint_for_component_ref (tree t
 	    temp.offset = 0;
 	    temp.var = anything_id;
 	    temp.type = ADDRESSOF;
-	    VEC_safe_push (ce_s, heap, *results, &temp);
+	    VEC_safe_push (ce_s, stack, *results, &temp);
 	    return;
 	  }
     }
@@ -3062,7 +3065,7 @@ get_constraint_for_component_ref (tree t
 				    bitpos, bitmaxsize))
 		{
 		  cexpr.var = curr->id;
-		  VEC_safe_push (ce_s, heap, *results, &cexpr);
+		  VEC_safe_push (ce_s, stack, *results, &cexpr);
 		  if (address_p)
 		    break;
 		}
@@ -3077,7 +3080,7 @@ get_constraint_for_component_ref (tree t
 	      while (curr->next != NULL)
 		curr = curr->next;
 	      cexpr.var = curr->id;
-	      VEC_safe_push (ce_s, heap, *results, &cexpr);
+	      VEC_safe_push (ce_s, stack, *results, &cexpr);
 	    }
 	  else if (VEC_length (ce_s, *results) == 0)
 	    /* Assert that we found *some* field there. The user couldn't be
@@ -3090,7 +3093,7 @@ get_constraint_for_component_ref (tree t
 	      cexpr.type = SCALAR;
 	      cexpr.var = anything_id;
 	      cexpr.offset = 0;
-	      VEC_safe_push (ce_s, heap, *results, &cexpr);
+	      VEC_safe_push (ce_s, stack, *results, &cexpr);
 	    }
 	}
       else if (bitmaxsize == 0)
@@ -3136,7 +3139,7 @@ get_constraint_for_component_ref (tree t
    This is needed so that we can handle dereferencing DEREF constraints.  */
 
 static void
-do_deref (VEC (ce_s, heap) **constraints)
+do_deref (VEC (ce_s, stack) **constraints)
 {
   struct constraint_expr *c;
   unsigned int i = 0;
@@ -3163,7 +3166,7 @@ do_deref (VEC (ce_s, heap) **constraints
    address of it.  */
 
 static void
-get_constraint_for_address_of (tree t, VEC (ce_s, heap) **results)
+get_constraint_for_address_of (tree t, VEC (ce_s, stack) **results)
 {
   struct constraint_expr *c;
   unsigned int i;
@@ -3182,7 +3185,7 @@ get_constraint_for_address_of (tree t, V
 /* Given a tree T, return the constraint expression for it.  */
 
 static void
-get_constraint_for_1 (tree t, VEC (ce_s, heap) **results, bool address_p,
+get_constraint_for_1 (tree t, VEC (ce_s, stack) **results, bool address_p,
 		      bool lhs_p)
 {
   struct constraint_expr temp;
@@ -3214,7 +3217,7 @@ get_constraint_for_1 (tree t, VEC (ce_s,
 	temp.var = nonlocal_id;
       temp.type = ADDRESSOF;
       temp.offset = 0;
-      VEC_safe_push (ce_s, heap, *results, &temp);
+      VEC_safe_push (ce_s, stack, *results, &temp);
       return;
     }
 
@@ -3224,7 +3227,7 @@ get_constraint_for_1 (tree t, VEC (ce_s,
       temp.var = readonly_id;
       temp.type = SCALAR;
       temp.offset = 0;
-      VEC_safe_push (ce_s, heap, *results, &temp);
+      VEC_safe_push (ce_s, stack, *results, &temp);
       return;
     }
 
@@ -3275,7 +3278,7 @@ get_constraint_for_1 (tree t, VEC (ce_s,
 		      if (curr->offset - vi->offset < size)
 			{
 			  cs.var = curr->id;
-			  VEC_safe_push (ce_s, heap, *results, &cs);
+			  VEC_safe_push (ce_s, stack, *results, &cs);
 			}
 		      else
 			break;
@@ -3310,17 +3313,17 @@ get_constraint_for_1 (tree t, VEC (ce_s,
 	    {
 	      unsigned int i;
 	      tree val;
-	      VEC (ce_s, heap) *tmp = NULL;
+	      VEC (ce_s, stack) *tmp = VEC_alloc (ce_s, stack, 32);
 	      FOR_EACH_CONSTRUCTOR_VALUE (CONSTRUCTOR_ELTS (t), i, val)
 		{
 		  struct constraint_expr *rhsp;
 		  unsigned j;
 		  get_constraint_for_1 (val, &tmp, address_p, lhs_p);
 		  FOR_EACH_VEC_ELT (ce_s, tmp, j, rhsp)
-		    VEC_safe_push (ce_s, heap, *results, rhsp);
+		    VEC_safe_push (ce_s, stack, *results, rhsp);
 		  VEC_truncate (ce_s, tmp, 0);
 		}
-	      VEC_free (ce_s, heap, tmp);
+	      VEC_free (ce_s, stack, tmp);
 	      /* We do not know whether the constructor was complete,
 	         so technically we have to add &NOTHING or &ANYTHING
 		 like we do for an empty constructor as well.  */
@@ -3341,7 +3344,7 @@ get_constraint_for_1 (tree t, VEC (ce_s,
 	temp.type = ADDRESSOF;
 	temp.var = nonlocal_id;
 	temp.offset = 0;
-	VEC_safe_push (ce_s, heap, *results, &temp);
+	VEC_safe_push (ce_s, stack, *results, &temp);
 	return;
       }
     default:;
@@ -3351,13 +3354,13 @@ get_constraint_for_1 (tree t, VEC (ce_s,
   temp.type = ADDRESSOF;
   temp.var = anything_id;
   temp.offset = 0;
-  VEC_safe_push (ce_s, heap, *results, &temp);
+  VEC_safe_push (ce_s, stack, *results, &temp);
 }
 
 /* Given a gimple tree T, return the constraint expression vector for it.  */
 
 static void
-get_constraint_for (tree t, VEC (ce_s, heap) **results)
+get_constraint_for (tree t, VEC (ce_s, stack) **results)
 {
   gcc_assert (VEC_length (ce_s, *results) == 0);
 
@@ -3368,7 +3371,7 @@ get_constraint_for (tree t, VEC (ce_s, h
    to be used as the rhs of a constraint.  */
 
 static void
-get_constraint_for_rhs (tree t, VEC (ce_s, heap) **results)
+get_constraint_for_rhs (tree t, VEC (ce_s, stack) **results)
 {
   gcc_assert (VEC_length (ce_s, *results) == 0);
 
@@ -3380,7 +3383,7 @@ get_constraint_for_rhs (tree t, VEC (ce_
    entries in *LHSC.  */
 
 static void
-process_all_all_constraints (VEC (ce_s, heap) *lhsc, VEC (ce_s, heap) *rhsc)
+process_all_all_constraints (VEC (ce_s, stack) *lhsc, VEC (ce_s, stack) *rhsc)
 {
   struct constraint_expr *lhsp, *rhsp;
   unsigned i, j;
@@ -3410,8 +3413,9 @@ static void
 do_structure_copy (tree lhsop, tree rhsop)
 {
   struct constraint_expr *lhsp, *rhsp;
-  VEC (ce_s, heap) *lhsc = NULL, *rhsc = NULL;
   unsigned j;
+  VEC (ce_s, stack) *lhsc = VEC_alloc (ce_s, stack, 32);
+  VEC (ce_s, stack) *rhsc = VEC_alloc (ce_s, stack, 32);
 
   get_constraint_for (lhsop, &lhsc);
   get_constraint_for_rhs (rhsop, &rhsc);
@@ -3470,14 +3474,14 @@ do_structure_copy (tree lhsop, tree rhso
   else
     gcc_unreachable ();
 
-  VEC_free (ce_s, heap, lhsc);
-  VEC_free (ce_s, heap, rhsc);
+  VEC_free (ce_s, stack, lhsc);
+  VEC_free (ce_s, stack, rhsc);
 }
 
 /* Create constraints ID = { rhsc }.  */
 
 static void
-make_constraints_to (unsigned id, VEC(ce_s, heap) *rhsc)
+make_constraints_to (unsigned id, VEC(ce_s, stack) *rhsc)
 {
   struct constraint_expr *c;
   struct constraint_expr includes;
@@ -3496,10 +3500,10 @@ make_constraints_to (unsigned id, VEC(ce
 static void
 make_constraint_to (unsigned id, tree op)
 {
-  VEC(ce_s, heap) *rhsc = NULL;
+  VEC(ce_s, stack) *rhsc = VEC_alloc (ce_s, stack, 32);
   get_constraint_for_rhs (op, &rhsc);
   make_constraints_to (id, rhsc);
-  VEC_free (ce_s, heap, rhsc);
+  VEC_free (ce_s, stack, rhsc);
 }
 
 /* Create a constraint ID = &FROM.  */
@@ -3690,7 +3694,7 @@ get_function_part_constraint (varinfo_t 
    RHS.  */
 
 static void
-handle_rhs_call (gimple stmt, VEC(ce_s, heap) **results)
+handle_rhs_call (gimple stmt, VEC(ce_s, stack) **results)
 {
   struct constraint_expr rhsc;
   unsigned i;
@@ -3744,7 +3748,7 @@ handle_rhs_call (gimple stmt, VEC(ce_s, 
       rhsc.var = get_call_use_vi (stmt)->id;
       rhsc.offset = 0;
       rhsc.type = SCALAR;
-      VEC_safe_push (ce_s, heap, *results, &rhsc);
+      VEC_safe_push (ce_s, stack, *results, &rhsc);
     }
 
   /* The static chain escapes as well.  */
@@ -3756,22 +3760,23 @@ handle_rhs_call (gimple stmt, VEC(ce_s, 
       && gimple_call_lhs (stmt) != NULL_TREE
       && TREE_ADDRESSABLE (TREE_TYPE (gimple_call_lhs (stmt))))
     {
-      VEC(ce_s, heap) *tmpc = NULL;
       struct constraint_expr lhsc, *c;
+      VEC(ce_s, stack) *tmpc = VEC_alloc (ce_s, stack, 32);
+
       get_constraint_for_address_of (gimple_call_lhs (stmt), &tmpc);
       lhsc.var = escaped_id;
       lhsc.offset = 0;
       lhsc.type = SCALAR;
       FOR_EACH_VEC_ELT (ce_s, tmpc, i, c)
 	process_constraint (new_constraint (lhsc, *c));
-      VEC_free(ce_s, heap, tmpc);
+      VEC_free(ce_s, stack, tmpc);
     }
 
   /* Regular functions return nonlocal memory.  */
   rhsc.var = nonlocal_id;
   rhsc.offset = 0;
   rhsc.type = SCALAR;
-  VEC_safe_push (ce_s, heap, *results, &rhsc);
+  VEC_safe_push (ce_s, stack, *results, &rhsc);
 }
 
 /* For non-IPA mode, generate constraints necessary for a call
@@ -3779,10 +3784,10 @@ handle_rhs_call (gimple stmt, VEC(ce_s, 
    the LHS point to global and escaped variables.  */
 
 static void
-handle_lhs_call (gimple stmt, tree lhs, int flags, VEC(ce_s, heap) *rhsc,
+handle_lhs_call (gimple stmt, tree lhs, int flags, VEC(ce_s, stack) *rhsc,
 		 tree fndecl)
 {
-  VEC(ce_s, heap) *lhsc = NULL;
+  VEC(ce_s, stack) *lhsc = VEC_alloc (ce_s, stack, 32);
 
   get_constraint_for (lhs, &lhsc);
   /* If the store is to a global decl make sure to
@@ -3796,7 +3801,7 @@ handle_lhs_call (gimple stmt, tree lhs, 
       tmpc.var = escaped_id;
       tmpc.offset = 0;
       tmpc.type = SCALAR;
-      VEC_safe_push (ce_s, heap, lhsc, &tmpc);
+      VEC_safe_push (ce_s, stack, lhsc, &tmpc);
     }
 
   /* If the call returns an argument unmodified override the rhs
@@ -3810,7 +3815,7 @@ handle_lhs_call (gimple stmt, tree lhs, 
       arg = gimple_call_arg (stmt, flags & ERF_RETURN_ARG_MASK);
       get_constraint_for (arg, &rhsc);
       process_all_all_constraints (lhsc, rhsc);
-      VEC_free (ce_s, heap, rhsc);
+      VEC_free (ce_s, stack, rhsc);
     }
   else if (flags & ERF_NOALIAS)
     {
@@ -3831,19 +3836,19 @@ handle_lhs_call (gimple stmt, tree lhs, 
       tmpc.var = vi->id;
       tmpc.offset = 0;
       tmpc.type = ADDRESSOF;
-      VEC_safe_push (ce_s, heap, rhsc, &tmpc);
+      VEC_safe_push (ce_s, stack, rhsc, &tmpc);
     }
 
   process_all_all_constraints (lhsc, rhsc);
 
-  VEC_free (ce_s, heap, lhsc);
+  VEC_free (ce_s, stack, lhsc);
 }
 
 /* For non-IPA mode, generate constraints necessary for a call of a
    const function that returns a pointer in the statement STMT.  */
 
 static void
-handle_const_call (gimple stmt, VEC(ce_s, heap) **results)
+handle_const_call (gimple stmt, VEC(ce_s, stack) **results)
 {
   struct constraint_expr rhsc;
   unsigned int k;
@@ -3858,34 +3863,35 @@ handle_const_call (gimple stmt, VEC(ce_s
       rhsc.var = uses->id;
       rhsc.offset = 0;
       rhsc.type = SCALAR;
-      VEC_safe_push (ce_s, heap, *results, &rhsc);
+      VEC_safe_push (ce_s, stack, *results, &rhsc);
     }
 
   /* May return arguments.  */
   for (k = 0; k < gimple_call_num_args (stmt); ++k)
     {
       tree arg = gimple_call_arg (stmt, k);
-      VEC(ce_s, heap) *argc = NULL;
       unsigned i;
       struct constraint_expr *argp;
+      VEC(ce_s, stack) *argc = VEC_alloc (ce_s, stack, 32);
+
       get_constraint_for_rhs (arg, &argc);
       FOR_EACH_VEC_ELT (ce_s, argc, i, argp)
-	VEC_safe_push (ce_s, heap, *results, argp);
-      VEC_free(ce_s, heap, argc);
+	VEC_safe_push (ce_s, stack, *results, argp);
+      VEC_free(ce_s, stack, argc);
     }
 
   /* May return addresses of globals.  */
   rhsc.var = nonlocal_id;
   rhsc.offset = 0;
   rhsc.type = ADDRESSOF;
-  VEC_safe_push (ce_s, heap, *results, &rhsc);
+  VEC_safe_push (ce_s, stack, *results, &rhsc);
 }
 
 /* For non-IPA mode, generate constraints necessary for a call to a
    pure function in statement STMT.  */
 
 static void
-handle_pure_call (gimple stmt, VEC(ce_s, heap) **results)
+handle_pure_call (gimple stmt, VEC(ce_s, stack) **results)
 {
   struct constraint_expr rhsc;
   unsigned i;
@@ -3920,12 +3926,12 @@ handle_pure_call (gimple stmt, VEC(ce_s,
       rhsc.var = uses->id;
       rhsc.offset = 0;
       rhsc.type = SCALAR;
-      VEC_safe_push (ce_s, heap, *results, &rhsc);
+      VEC_safe_push (ce_s, stack, *results, &rhsc);
     }
   rhsc.var = nonlocal_id;
   rhsc.offset = 0;
   rhsc.type = SCALAR;
-  VEC_safe_push (ce_s, heap, *results, &rhsc);
+  VEC_safe_push (ce_s, stack, *results, &rhsc);
 }
 
 
@@ -3966,9 +3972,9 @@ static bool
 find_func_aliases_for_builtin_call (gimple t)
 {
   tree fndecl = gimple_call_fndecl (t);
-  VEC(ce_s, heap) *lhsc = NULL;
-  VEC(ce_s, heap) *rhsc = NULL;
   varinfo_t fi;
+  VEC(ce_s, stack) *lhsc = VEC_alloc (ce_s, stack, 32);
+  VEC(ce_s, stack) *rhsc = VEC_alloc (ce_s, stack, 32);
 
   if (fndecl != NULL_TREE
       && DECL_BUILT_IN_CLASS (fndecl) == BUILT_IN_NORMAL)
@@ -4007,16 +4013,16 @@ find_func_aliases_for_builtin_call (gimp
 	      else
 		get_constraint_for (dest, &rhsc);
 	      process_all_all_constraints (lhsc, rhsc);
-	      VEC_free (ce_s, heap, lhsc);
-	      VEC_free (ce_s, heap, rhsc);
+	      VEC_free (ce_s, stack, lhsc);
+	      VEC_free (ce_s, stack, rhsc);
 	    }
 	  get_constraint_for_ptr_offset (dest, NULL_TREE, &lhsc);
 	  get_constraint_for_ptr_offset (src, NULL_TREE, &rhsc);
 	  do_deref (&lhsc);
 	  do_deref (&rhsc);
 	  process_all_all_constraints (lhsc, rhsc);
-	  VEC_free (ce_s, heap, lhsc);
-	  VEC_free (ce_s, heap, rhsc);
+	  VEC_free (ce_s, stack, lhsc);
+	  VEC_free (ce_s, stack, rhsc);
 	  return true;
 	}
       case BUILT_IN_MEMSET:
@@ -4031,8 +4037,8 @@ find_func_aliases_for_builtin_call (gimp
 	      get_constraint_for (res, &lhsc);
 	      get_constraint_for (dest, &rhsc);
 	      process_all_all_constraints (lhsc, rhsc);
-	      VEC_free (ce_s, heap, lhsc);
-	      VEC_free (ce_s, heap, rhsc);
+	      VEC_free (ce_s, stack, lhsc);
+	      VEC_free (ce_s, stack, rhsc);
 	    }
 	  get_constraint_for_ptr_offset (dest, NULL_TREE, &lhsc);
 	  do_deref (&lhsc);
@@ -4050,7 +4056,7 @@ find_func_aliases_for_builtin_call (gimp
 	  ac.offset = 0;
 	  FOR_EACH_VEC_ELT (ce_s, lhsc, i, lhsp)
 	      process_constraint (new_constraint (*lhsp, ac));
-	  VEC_free (ce_s, heap, lhsc);
+	  VEC_free (ce_s, stack, lhsc);
 	  return true;
 	}
       /* All the following functions do not return pointers, do not
@@ -4096,7 +4102,7 @@ find_func_aliases_for_builtin_call (gimp
 		  get_constraint_for (frame, &rhsc);
 		  FOR_EACH_VEC_ELT (ce_s, rhsc, i, rhsp)
 		      process_constraint (new_constraint (lhs, *rhsp));
-		  VEC_free (ce_s, heap, rhsc);
+		  VEC_free (ce_s, stack, rhsc);
 
 		  /* Make the frame point to the function for
 		     the trampoline adjustment call.  */
@@ -4104,8 +4110,8 @@ find_func_aliases_for_builtin_call (gimp
 		  do_deref (&lhsc);
 		  get_constraint_for (nfunc, &rhsc);
 		  process_all_all_constraints (lhsc, rhsc);
-		  VEC_free (ce_s, heap, rhsc);
-		  VEC_free (ce_s, heap, lhsc);
+		  VEC_free (ce_s, stack, rhsc);
+		  VEC_free (ce_s, stack, lhsc);
 
 		  return true;
 		}
@@ -4124,8 +4130,8 @@ find_func_aliases_for_builtin_call (gimp
 	      get_constraint_for (tramp, &rhsc);
 	      do_deref (&rhsc);
 	      process_all_all_constraints (lhsc, rhsc);
-	      VEC_free (ce_s, heap, rhsc);
-	      VEC_free (ce_s, heap, lhsc);
+	      VEC_free (ce_s, stack, rhsc);
+	      VEC_free (ce_s, stack, lhsc);
 	    }
 	  return true;
 	}
@@ -4148,7 +4154,7 @@ find_func_aliases_for_builtin_call (gimp
 	      rhs.type = ADDRESSOF;
 	      FOR_EACH_VEC_ELT (ce_s, lhsc, i, lhsp)
 		  process_constraint (new_constraint (*lhsp, rhs));
-	      VEC_free (ce_s, heap, lhsc);
+	      VEC_free (ce_s, stack, lhsc);
 	      /* va_list is clobbered.  */
 	      make_constraint_to (get_call_clobber_vi (t)->id, valist);
 	      return true;
@@ -4193,9 +4199,9 @@ static void
 find_func_aliases_for_call (gimple t)
 {
   tree fndecl = gimple_call_fndecl (t);
-  VEC(ce_s, heap) *lhsc = NULL;
-  VEC(ce_s, heap) *rhsc = NULL;
   varinfo_t fi;
+  VEC(ce_s, stack) *lhsc = VEC_alloc (ce_s, stack, 32);
+  VEC(ce_s, stack) *rhsc = VEC_alloc (ce_s, stack, 32);
 
   if (fndecl != NULL_TREE
       && DECL_BUILT_IN (fndecl)
@@ -4206,7 +4212,7 @@ find_func_aliases_for_call (gimple t)
   if (!in_ipa_mode
       || (fndecl && !fi->is_fn_info))
     {
-      VEC(ce_s, heap) *rhsc = NULL;
+      VEC(ce_s, stack) *rhsc = VEC_alloc (ce_s, stack, 32);
       int flags = gimple_call_flags (t);
 
       /* Const functions can return their arguments and addresses
@@ -4225,7 +4231,7 @@ find_func_aliases_for_call (gimple t)
 	handle_rhs_call (t, &rhsc);
       if (gimple_call_lhs (t))
 	handle_lhs_call (t, gimple_call_lhs (t), flags, rhsc, fndecl);
-      VEC_free (ce_s, heap, rhsc);
+      VEC_free (ce_s, stack, rhsc);
     }
   else
     {
@@ -4263,11 +4269,11 @@ find_func_aliases_for_call (gimple t)
 	      && DECL_RESULT (fndecl)
 	      && DECL_BY_REFERENCE (DECL_RESULT (fndecl)))
 	    {
-	      VEC(ce_s, heap) *tem = NULL;
-	      VEC_safe_push (ce_s, heap, tem, &rhs);
+	      VEC(ce_s, stack) *tem = VEC_alloc (ce_s, stack, 32);
+	      VEC_safe_push (ce_s, stack, tem, &rhs);
 	      do_deref (&tem);
 	      rhs = *VEC_index (ce_s, tem, 0);
-	      VEC_free(ce_s, heap, tem);
+	      VEC_free(ce_s, stack, tem);
 	    }
 	  FOR_EACH_VEC_ELT (ce_s, lhsc, j, lhsp)
 	    process_constraint (new_constraint (*lhsp, rhs));
@@ -4286,7 +4292,7 @@ find_func_aliases_for_call (gimple t)
 	  lhs = get_function_part_constraint (fi, fi_result);
 	  FOR_EACH_VEC_ELT (ce_s, rhsc, j, rhsp)
 	    process_constraint (new_constraint (lhs, *rhsp));
-	  VEC_free (ce_s, heap, rhsc);
+	  VEC_free (ce_s, stack, rhsc);
 	}
 
       /* If we use a static chain, pass it along.  */
@@ -4312,10 +4318,10 @@ static void
 find_func_aliases (gimple origt)
 {
   gimple t = origt;
-  VEC(ce_s, heap) *lhsc = NULL;
-  VEC(ce_s, heap) *rhsc = NULL;
   struct constraint_expr *c;
   varinfo_t fi;
+  VEC(ce_s, stack) *lhsc = VEC_alloc (ce_s, stack, 32);
+  VEC(ce_s, stack) *rhsc = VEC_alloc (ce_s, stack, 32);
 
   /* Now build constraints expressions.  */
   if (gimple_code (t) == GIMPLE_PHI)
@@ -4395,18 +4401,19 @@ find_func_aliases (gimple origt)
 	  else
 	    {
 	      /* All other operations are merges.  */
-	      VEC (ce_s, heap) *tmp = NULL;
 	      struct constraint_expr *rhsp;
 	      unsigned i, j;
+	      VEC (ce_s, stack) *tmp = VEC_alloc (ce_s, stack, 32);
+
 	      get_constraint_for_rhs (gimple_assign_rhs1 (t), &rhsc);
 	      for (i = 2; i < gimple_num_ops (t); ++i)
 		{
 		  get_constraint_for_rhs (gimple_op (t, i), &tmp);
 		  FOR_EACH_VEC_ELT (ce_s, tmp, j, rhsp)
-		    VEC_safe_push (ce_s, heap, rhsc, rhsp);
+		    VEC_safe_push (ce_s, stack, rhsc, rhsp);
 		  VEC_truncate (ce_s, tmp, 0);
 		}
-	      VEC_free (ce_s, heap, tmp);
+	      VEC_free (ce_s, stack, tmp);
 	    }
 	  process_all_all_constraints (lhsc, rhsc);
 	}
@@ -4477,16 +4484,17 @@ find_func_aliases (gimple origt)
 	     any global memory.  */
 	  if (op)
 	    {
-	      VEC(ce_s, heap) *lhsc = NULL;
 	      struct constraint_expr rhsc, *lhsp;
 	      unsigned j;
+	      VEC(ce_s, stack) *lhsc = VEC_alloc (ce_s, stack, 32);
+
 	      get_constraint_for (op, &lhsc);
 	      rhsc.var = nonlocal_id;
 	      rhsc.offset = 0;
 	      rhsc.type = SCALAR;
 	      FOR_EACH_VEC_ELT (ce_s, lhsc, j, lhsp)
 		process_constraint (new_constraint (*lhsp, rhsc));
-	      VEC_free (ce_s, heap, lhsc);
+	      VEC_free (ce_s, stack, lhsc);
 	    }
 	}
       for (i = 0; i < gimple_asm_ninputs (t); ++i)
@@ -4510,8 +4518,8 @@ find_func_aliases (gimple origt)
 	}
     }
 
-  VEC_free (ce_s, heap, rhsc);
-  VEC_free (ce_s, heap, lhsc);
+  VEC_free (ce_s, stack, rhsc);
+  VEC_free (ce_s, stack, lhsc);
 }
 
 
@@ -4521,14 +4529,15 @@ find_func_aliases (gimple origt)
 static void
 process_ipa_clobber (varinfo_t fi, tree ptr)
 {
-  VEC(ce_s, heap) *ptrc = NULL;
   struct constraint_expr *c, lhs;
   unsigned i;
+  VEC(ce_s, stack) *ptrc = VEC_alloc (ce_s, stack, 32);
+
   get_constraint_for_rhs (ptr, &ptrc);
   lhs = get_function_part_constraint (fi, fi_clobbers);
   FOR_EACH_VEC_ELT (ce_s, ptrc, i, c)
     process_constraint (new_constraint (lhs, *c));
-  VEC_free (ce_s, heap, ptrc);
+  VEC_free (ce_s, stack, ptrc);
 }
 
 /* Walk statement T setting up clobber and use constraints according to the
@@ -4539,9 +4548,9 @@ static void
 find_func_clobbers (gimple origt)
 {
   gimple t = origt;
-  VEC(ce_s, heap) *lhsc = NULL;
-  VEC(ce_s, heap) *rhsc = NULL;
   varinfo_t fi;
+  VEC(ce_s, stack) *lhsc = VEC_alloc (ce_s, stack, 32);
+  VEC(ce_s, stack) *rhsc = VEC_alloc (ce_s, stack, 32);
 
   /* Add constraints for clobbered/used in IPA mode.
      We are not interested in what automatic variables are clobbered
@@ -4579,7 +4588,7 @@ find_func_clobbers (gimple origt)
 	  get_constraint_for_address_of (lhs, &rhsc);
 	  FOR_EACH_VEC_ELT (ce_s, rhsc, i, rhsp)
 	    process_constraint (new_constraint (lhsc, *rhsp));
-	  VEC_free (ce_s, heap, rhsc);
+	  VEC_free (ce_s, stack, rhsc);
 	}
     }
 
@@ -4607,7 +4616,7 @@ find_func_clobbers (gimple origt)
 	  get_constraint_for_address_of (rhs, &rhsc);
 	  FOR_EACH_VEC_ELT (ce_s, rhsc, i, rhsp)
 	    process_constraint (new_constraint (lhs, *rhsp));
-	  VEC_free (ce_s, heap, rhsc);
+	  VEC_free (ce_s, stack, rhsc);
 	}
     }
 
@@ -4647,12 +4656,12 @@ find_func_clobbers (gimple origt)
 	      lhs = get_function_part_constraint (fi, fi_clobbers);
 	      FOR_EACH_VEC_ELT (ce_s, lhsc, i, lhsp)
 		process_constraint (new_constraint (lhs, *lhsp));
-	      VEC_free (ce_s, heap, lhsc);
+	      VEC_free (ce_s, stack, lhsc);
 	      get_constraint_for_ptr_offset (src, NULL_TREE, &rhsc);
 	      lhs = get_function_part_constraint (fi, fi_uses);
 	      FOR_EACH_VEC_ELT (ce_s, rhsc, i, rhsp)
 		process_constraint (new_constraint (lhs, *rhsp));
-	      VEC_free (ce_s, heap, rhsc);
+	      VEC_free (ce_s, stack, rhsc);
 	      return;
 	    }
 	  /* The following function clobbers memory pointed to by
@@ -4666,7 +4675,7 @@ find_func_clobbers (gimple origt)
 	      lhs = get_function_part_constraint (fi, fi_clobbers);
 	      FOR_EACH_VEC_ELT (ce_s, lhsc, i, lhsp)
 		process_constraint (new_constraint (lhs, *lhsp));
-	      VEC_free (ce_s, heap, lhsc);
+	      VEC_free (ce_s, stack, lhsc);
 	      return;
 	    }
 	  /* The following functions clobber their second and third
@@ -4735,7 +4744,7 @@ find_func_clobbers (gimple origt)
 	  get_constraint_for_address_of (arg, &rhsc);
 	  FOR_EACH_VEC_ELT (ce_s, rhsc, j, rhsp)
 	    process_constraint (new_constraint (lhs, *rhsp));
-	  VEC_free (ce_s, heap, rhsc);
+	  VEC_free (ce_s, stack, rhsc);
 	}
 
       /* Build constraints for propagating clobbers/uses along the
@@ -4796,7 +4805,7 @@ find_func_clobbers (gimple origt)
 			    anything_id);
     }
 
-  VEC_free (ce_s, heap, rhsc);
+  VEC_free (ce_s, stack, rhsc);
 }
 
 
@@ -5436,9 +5445,10 @@ create_variable_info_for (tree decl, con
       if (in_ipa_mode
 	  && DECL_INITIAL (decl))
 	{
-	  VEC (ce_s, heap) *rhsc = NULL;
 	  struct constraint_expr lhs, *rhsp;
 	  unsigned i;
+	  VEC (ce_s, stack) *rhsc = VEC_alloc (ce_s, stack, 32);
+
 	  get_constraint_for_rhs (DECL_INITIAL (decl), &rhsc);
 	  lhs.var = vi->id;
 	  lhs.offset = 0;
@@ -5455,7 +5465,7 @@ create_variable_info_for (tree decl, con
 	      FOR_EACH_VEC_ELT (ce_s, rhsc, i, rhsp)
 		process_constraint (new_constraint (lhs, *rhsp));
 	    }
-	  VEC_free (ce_s, heap, rhsc);
+	  VEC_free (ce_s, stack, rhsc);
 	}
     }
 

=== modified file 'gcc/tree-ssa-sccvn.c'
--- gcc/tree-ssa-sccvn.c	2011-05-02 13:11:27 +0000
+++ gcc/tree-ssa-sccvn.c	2011-08-19 12:50:47 +0000
@@ -2221,7 +2221,7 @@ vn_phi_insert (gimple phi, tree result)
 /* Print set of components in strongly connected component SCC to OUT. */
 
 static void
-print_scc (FILE *out, VEC (tree, heap) *scc)
+print_scc (FILE *out, VEC (tree, stack) *scc)
 {
   tree var;
   unsigned int i;
@@ -3203,7 +3203,7 @@ compare_ops (const void *pa, const void 
    array will give you the members in RPO order.  */
 
 static void
-sort_scc (VEC (tree, heap) *scc)
+sort_scc (VEC (tree, stack) *scc)
 {
   VEC_qsort (tree, scc, compare_ops);
 }
@@ -3254,7 +3254,7 @@ copy_reference (vn_reference_t oref, vn_
 /* Process a strongly connected component in the SSA graph.  */
 
 static void
-process_scc (VEC (tree, heap) *scc)
+process_scc (VEC (tree, stack) *scc)
 {
   tree var;
   unsigned int i;
@@ -3334,8 +3334,8 @@ DEF_VEC_ALLOC_O(ssa_op_iter,heap);
 static bool
 extract_and_process_scc_for_name (tree name)
 {
-  VEC (tree, heap) *scc = NULL;
   tree x;
+  VEC (tree, stack) *scc = VEC_alloc (tree, stack, 16);
 
   /* Found an SCC, pop the components off the SCC stack and
      process them.  */
@@ -3344,7 +3344,7 @@ extract_and_process_scc_for_name (tree n
       x = VEC_pop (tree, sccstack);
 
       VN_INFO (x)->on_sccstack = false;
-      VEC_safe_push (tree, heap, scc, x);
+      VEC_safe_push (tree, stack, scc, x);
     } while (x != name);
 
   /* Bail out of SCCVN in case a SCC turns out to be incredibly large.  */
@@ -3366,7 +3366,7 @@ extract_and_process_scc_for_name (tree n
 
   process_scc (scc);
 
-  VEC_free (tree, heap, scc);
+  VEC_free (tree, stack, scc);
 
   return true;
 }

=== modified file 'gcc/vecir.h'
--- gcc/vecir.h	2010-05-15 19:02:11 +0000
+++ gcc/vecir.h	2011-08-19 12:50:47 +0000
@@ -28,6 +28,9 @@ along with GCC; see the file COPYING3.  
 DEF_VEC_P(tree);
 DEF_VEC_ALLOC_P(tree,gc);
 DEF_VEC_ALLOC_P(tree,heap);
+DEF_VEC_ALLOC_P_STACK(tree);
+#define VEC_tree_stack_alloc(alloc) \
+  VEC_stack_alloc (tree, alloc)
 
 /* A varray of gimple statements.  */
 DEF_VEC_P(gimple);
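
For readers unfamiliar with the idea behind this patch, here is a minimal sketch of a stack-first vector: a small inline buffer is used while the contents fit, and the data moves to the heap only on overflow. It is a plain C stand-in and not GCC's VEC API; `small_vec` and all of its functions are hypothetical names, and INLINE_CAP is 4 only to keep the example short.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

#define INLINE_CAP 4	/* illustrative; the patch uses 16 or 32 */

/* Hypothetical stack-first vector of ints.  */
struct small_vec
{
  int *data;			/* points at inline_buf until outgrown */
  size_t len, cap;
  int inline_buf[INLINE_CAP];	/* lives wherever the struct lives */
};

static void
small_vec_init (struct small_vec *v)
{
  v->data = v->inline_buf;
  v->len = 0;
  v->cap = INLINE_CAP;
}

static int
small_vec_on_heap (const struct small_vec *v)
{
  return v->data != v->inline_buf;
}

static void
small_vec_push (struct small_vec *v, int x)
{
  if (v->len == v->cap)
    {
      size_t new_cap = v->cap * 2;
      int *p;

      if (small_vec_on_heap (v))
	p = realloc (v->data, new_cap * sizeof *p);
      else
	{
	  /* First overflow: move the inline contents to the heap.  */
	  p = malloc (new_cap * sizeof *p);
	  if (p)
	    memcpy (p, v->data, v->len * sizeof *p);
	}
      assert (p != NULL);
      v->data = p;
      v->cap = new_cap;
    }
  v->data[v->len++] = x;
}

/* Release heap storage, if any, and reset to the inline buffer.  */
static void
small_vec_free (struct small_vec *v)
{
  if (small_vec_on_heap (v))
    free (v->data);
  small_vec_init (v);
}
```

With this shape, a caller that pushes at most INLINE_CAP elements never calls malloc() or free(), which is the effect the patch is after with its initial sizes of 16 and 32.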



* tree-ssa-structalias.c: alloc_pool for struct equiv_class_label
  2011-08-22  7:50 Various minor speed-ups Dimitrios Apostolou
                   ` (2 preceding siblings ...)
  2011-08-22  7:59 ` tree-ssa*: reduce malloc() calls by preallocating hot VECs on the stack Dimitrios Apostolou
@ 2011-08-22  8:05 ` Dimitrios Apostolou
  2011-08-22  9:01   ` Dimitrios Apostolou
  2011-08-22 10:25   ` Richard Guenther
  2011-08-22  8:43 ` Various minor speed-ups Dimitrios Apostolou
                   ` (2 subsequent siblings)
  6 siblings, 2 replies; 22+ messages in thread
From: Dimitrios Apostolou @ 2011-08-22  8:05 UTC (permalink / raw)
  To: Dimitrios Apostolou; +Cc: gcc-patches, Steven Bosscher


2011-08-22  Dimitrios Apostolou  <jimis@gmx.net>

 	* tree-ssa-structalias.c (equiv_class_add)
 	(perform_var_substitution, free_var_substitution_info): Created a
 	new equiv_class_pool allocator pool for struct
 	equiv_class_label. Changed the pointer_equiv_class_table and
 	location_equiv_class_table hash tables to not iterate freeing all
 	elements in the end, but just free the pool.



* Re: mem_attrs_htab
  2011-08-22  7:53 ` mem_attrs_htab Dimitrios Apostolou
@ 2011-08-22  8:37   ` Jakub Jelinek
  2011-08-22  9:39     ` mem_attrs_htab Dimitrios Apostolou
  0 siblings, 1 reply; 22+ messages in thread
From: Jakub Jelinek @ 2011-08-22  8:37 UTC (permalink / raw)
  To: Dimitrios Apostolou; +Cc: gcc-patches, Steven Bosscher

On Mon, Aug 22, 2011 at 10:32:35AM +0300, Dimitrios Apostolou wrote:
> --- gcc/emit-rtl.c	2011-05-29 17:40:05 +0000
> +++ gcc/emit-rtl.c	2011-08-21 04:44:25 +0000
> @@ -256,11 +256,10 @@ mem_attrs_htab_hash (const void *x)
>  {
>    const mem_attrs *const p = (const mem_attrs *) x;
>  
> -  return (p->alias ^ (p->align * 1000)
> -	  ^ (p->addrspace * 4000)
> -	  ^ ((p->offset ? INTVAL (p->offset) : 0) * 50000)
> -	  ^ ((p->size ? INTVAL (p->size) : 0) * 2500000)
> -	  ^ (size_t) iterative_hash_expr (p->expr, 0));
> +  /* By massively feeding the mem_attrs struct to iterative_hash() we
> +     disregard the p->offset and p->size rtx, but in total the hash is
> +     quick and good enough. */
> +  return iterative_hash_object (*p, iterative_hash_expr (p->expr, 0));
>  }
>  
>  /* Returns nonzero if the value represented by X (which is really a

This patch isn't against the trunk, where p->offset and p->size aren't rtxes
anymore, but HOST_WIDE_INTs.  Furthermore, it is a bad idea to hash
the p->expr address itself, it doesn't make any sense to hash on what
p->expr points to in that case.  And p->offset and p->size should be ignored
if the *known_p corresponding fields are false.  So, if you really think
using iterative_hash_object is a win, it should be something like:
  mem_attrs q = *p;
  q.expr = NULL;
  if (!q.offset_known_p) q.offset = 0;
  if (!q.size_known_p) q.size = 0;
  return iterative_hash_object (q, iterative_hash_expr (p->expr, 0));
(or better yet avoid q.expr = NULL and instead start hashing from the next
field after expr).  Hashing the struct padding might not be a good idea
either.

	Jakub
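
A minimal sketch of the points above, written as a variation on the snippet in this mail: instead of copying the struct and zeroing fields, it hashes field by field, which skips both the expr pointer and any struct padding. `struct attrs`, `hash_bytes` and `attrs_hash` are hypothetical names, the field layout is assumed, and the FNV-1a-style byte hash is only a stand-in for iterative_hash.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for mem_attrs; the layout is an assumption.  */
struct attrs
{
  const void *expr;
  long offset;
  long size;
  unsigned align;
  unsigned char offset_known_p;
  unsigned char size_known_p;
};

/* FNV-1a-style hash of N bytes at P, continuing from state H.  */
static uint32_t
hash_bytes (const void *p, size_t n, uint32_t h)
{
  const unsigned char *b = p;

  for (size_t i = 0; i < n; i++)
    h = (h ^ b[i]) * 16777619u;
  return h;
}

#define HASH_FIELD(h, f) hash_bytes (&(f), sizeof (f), (h))

/* EXPR_HASH is the hash of what expr points to, computed by the caller;
   the expr address itself is never hashed.  */
static uint32_t
attrs_hash (const struct attrs *p, uint32_t expr_hash)
{
  assert (p != NULL);

  /* Unknown offset/size hold no meaningful value: hash them as 0.  */
  long offset = p->offset_known_p ? p->offset : 0;
  long size = p->size_known_p ? p->size : 0;
  uint32_t h = expr_hash;

  h = HASH_FIELD (h, offset);
  h = HASH_FIELD (h, size);
  h = HASH_FIELD (h, p->align);
  h = HASH_FIELD (h, p->offset_known_p);
  h = HASH_FIELD (h, p->size_known_p);
  return h;
}
```

Two entries that differ only in their expr address, or in an offset that is not known, hash identically here, so they can still be found equal and shared in the table.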


* Re: Various minor speed-ups
  2011-08-22  7:50 Various minor speed-ups Dimitrios Apostolou
                   ` (3 preceding siblings ...)
  2011-08-22  8:05 ` tree-ssa-structalias.c: alloc_pool for struct equiv_class_label Dimitrios Apostolou
@ 2011-08-22  8:43 ` Dimitrios Apostolou
  2011-08-22 11:01   ` Richard Guenther
  2011-08-22  9:44 ` Dimitrios Apostolou
  2011-08-22  9:50 ` cse.c: preferable() Dimitrios Apostolou
  6 siblings, 1 reply; 22+ messages in thread
From: Dimitrios Apostolou @ 2011-08-22  8:43 UTC (permalink / raw)
  To: Dimitrios Apostolou; +Cc: gcc-patches, Steven Bosscher

[-- Attachment #1: Type: TEXT/PLAIN, Size: 200 bytes --]


2011-08-22  Dimitrios Apostolou  <jimis@gmx.net>

 	* tree-ssa-pre.c (phi_trans_add, init_pre, fini_pre): Added a pool
 	for phi_translate_table elements to avoid free() calls from
 	htab_delete().
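
As a rough illustration of what such a pool buys, here is a minimal fixed-size pool in plain C: elements are carved out of larger blocks, freed elements go onto a free list for re-use, and everything is released block by block at the end. This is a sketch and not GCC's alloc-pool implementation; every `fpool_*` name is hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

struct fpool_block { struct fpool_block *next; };  /* storage follows */
struct fpool_free { struct fpool_free *next; };    /* link in a free slot */

struct fpool
{
  size_t elt_size;		/* rounded up for alignment */
  size_t elts_per_block;
  struct fpool_block *blocks;
  struct fpool_free *free_list;
  size_t blocks_allocated;	/* number of malloc() calls so far */
};

#define FPOOL_ALIGN _Alignof (max_align_t)
#define FPOOL_ROUND(n) (((n) + FPOOL_ALIGN - 1) / FPOOL_ALIGN * FPOOL_ALIGN)

static void
fpool_init (struct fpool *p, size_t elt_size, size_t elts_per_block)
{
  assert (elts_per_block > 0);
  if (elt_size < sizeof (struct fpool_free))
    elt_size = sizeof (struct fpool_free);
  p->elt_size = FPOOL_ROUND (elt_size);
  p->elts_per_block = elts_per_block;
  p->blocks = NULL;
  p->free_list = NULL;
  p->blocks_allocated = 0;
}

static void *
fpool_alloc (struct fpool *p)
{
  if (!p->free_list)
    {
      /* One malloc() per block, then thread all slots onto the list.  */
      size_t header = FPOOL_ROUND (sizeof (struct fpool_block));
      char *raw = malloc (header + p->elt_size * p->elts_per_block);
      assert (raw != NULL);
      struct fpool_block *b = (struct fpool_block *) raw;
      b->next = p->blocks;
      p->blocks = b;
      p->blocks_allocated++;
      for (size_t i = 0; i < p->elts_per_block; i++)
	{
	  struct fpool_free *f
	    = (struct fpool_free *) (raw + header + i * p->elt_size);
	  f->next = p->free_list;
	  p->free_list = f;
	}
    }
  struct fpool_free *f = p->free_list;
  p->free_list = f->next;
  return f;
}

/* ELT must be non-NULL: unlike free(), this dereferences its argument.  */
static void
fpool_free (struct fpool *p, void *elt)
{
  struct fpool_free *f = elt;
  f->next = p->free_list;
  p->free_list = f;
}

/* Release every block at once; no per-element work.  */
static void
fpool_destroy (struct fpool *p)
{
  while (p->blocks)
    {
      struct fpool_block *next = p->blocks->next;
      free (p->blocks);
      p->blocks = next;
    }
  p->free_list = NULL;
}
```

In this sketch a hash table that stores pool elements needs no per-entry delete callback, since fpool_destroy() reclaims everything; a replaced entry is handed back with fpool_free() only when it is non-NULL.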


[-- Attachment #2: Type: TEXT/plain, Size: 1983 bytes --]

=== modified file 'gcc/tree-ssa-pre.c'
--- gcc/tree-ssa-pre.c	2011-05-04 09:04:53 +0000
+++ gcc/tree-ssa-pre.c	2011-08-17 08:43:23 +0000
@@ -515,6 +515,10 @@ typedef struct expr_pred_trans_d
 } *expr_pred_trans_t;
 typedef const struct expr_pred_trans_d *const_expr_pred_trans_t;
 
+/* Pool of memory for the above */
+
+static alloc_pool phi_translate_pool;
+
 /* Return the hash value for a phi translation table entry.  */
 
 static hashval_t
@@ -571,7 +575,8 @@ static inline void
 phi_trans_add (pre_expr e, pre_expr v, basic_block pred)
 {
   void **slot;
-  expr_pred_trans_t new_pair = XNEW (struct expr_pred_trans_d);
+  expr_pred_trans_t new_pair
+    = (expr_pred_trans_t) pool_alloc (phi_translate_pool);
   new_pair->e = e;
   new_pair->pred = pred;
   new_pair->v = v;
@@ -580,7 +585,8 @@ phi_trans_add (pre_expr e, pre_expr v, b
 
   slot = htab_find_slot_with_hash (phi_translate_table, new_pair,
 				   new_pair->hashcode, INSERT);
-  free (*slot);
+  if (*slot)
+    pool_free (phi_translate_pool, *slot);
   *slot = (void *) new_pair;
 }
 
@@ -4804,8 +4810,12 @@ init_pre (bool do_fre)
   calculate_dominance_info (CDI_DOMINATORS);
 
   bitmap_obstack_initialize (&grand_bitmap_obstack);
+  phi_translate_pool = create_alloc_pool ("phi_translate_table pool",
+					  sizeof (struct expr_pred_trans_d),
+					  4096);
+  /* NULL as free because we'll free the whole pool in the end. */
   phi_translate_table = htab_create (5110, expr_pred_trans_hash,
-				     expr_pred_trans_eq, free);
+				     expr_pred_trans_eq, NULL);
   expression_to_id = htab_create (num_ssa_names * 3,
 				  pre_expr_hash,
 				  pre_expr_eq, NULL);
@@ -4839,6 +4849,7 @@ fini_pre (bool do_fre)
   free_alloc_pool (bitmap_set_pool);
   free_alloc_pool (pre_expr_pool);
   htab_delete (phi_translate_table);
+  free_alloc_pool (phi_translate_pool);
   htab_delete (expression_to_id);
   VEC_free (unsigned, heap, name_to_id);
 



* Re: graphds.[ch]: alloc_pool for edges
  2011-08-22  7:53 ` graphds.[ch]: alloc_pool for edges Dimitrios Apostolou
@ 2011-08-22  8:46   ` Jakub Jelinek
  2011-08-22 10:11   ` Richard Guenther
  1 sibling, 0 replies; 22+ messages in thread
From: Jakub Jelinek @ 2011-08-22  8:46 UTC (permalink / raw)
  To: Dimitrios Apostolou; +Cc: gcc-patches, Steven Bosscher

On Mon, Aug 22, 2011 at 10:37:58AM +0300, Dimitrios Apostolou wrote:
> --- gcc/graphds.h	2009-02-20 15:20:38 +0000
> +++ gcc/graphds.h	2011-08-19 16:44:41 +0000
> @@ -18,6 +18,10 @@ You should have received a copy of the G
>  along with GCC; see the file COPYING3.  If not see
>  <http://www.gnu.org/licenses/>.  */
>  
> +
> +#include "alloc-pool.h"
> +
> +

This needs to be reflected in Makefile.in; unfortunately we don't have
automatic dependency generation for gcc.
So, create
GRAPHDS_H = graphds.h alloc-pool.h
and use $(GRAPHDS_H) everywhere where graphds.h has been used so far.

	Jakub


* Re: tree-ssa-structalias.c: alloc_pool for struct equiv_class_label
  2011-08-22  8:05 ` tree-ssa-structalias.c: alloc_pool for struct equiv_class_label Dimitrios Apostolou
@ 2011-08-22  9:01   ` Dimitrios Apostolou
  2011-08-22 10:25   ` Richard Guenther
  1 sibling, 0 replies; 22+ messages in thread
From: Dimitrios Apostolou @ 2011-08-22  9:01 UTC (permalink / raw)
  To: Dimitrios Apostolou; +Cc: gcc-patches, Steven Bosscher

[-- Attachment #1: Type: TEXT/PLAIN, Size: 475 bytes --]

Forgot the patch...

On Mon, 22 Aug 2011, Dimitrios Apostolou wrote:
>
> 2011-08-22  Dimitrios Apostolou  <jimis@gmx.net>
>
> 	* tree-ssa-structalias.c (equiv_class_add)
> 	(perform_var_substitution, free_var_substitution_info): Created a
> 	new equiv_class_pool allocator pool for struct
> 	equiv_class_label. Changed the pointer_equiv_class_table and
> 	location_equiv_class_table hash tables to not iterate freeing all
> 	elements in the end, but just free the pool.
>
>
>

[-- Attachment #2: Type: TEXT/plain, Size: 1816 bytes --]

=== modified file 'gcc/tree-ssa-structalias.c'
--- gcc/tree-ssa-structalias.c	2011-04-29 10:59:33 +0000
+++ gcc/tree-ssa-structalias.c	2011-08-18 06:53:12 +0000
@@ -1899,6 +1899,9 @@ static htab_t pointer_equiv_class_table;
    classes.  */
 static htab_t location_equiv_class_table;
 
+/* Pool of memory for storing the above */
+static alloc_pool equiv_class_pool;
+
 /* Hash function for a equiv_class_label_t */
 
 static hashval_t
@@ -1948,7 +1951,8 @@ equiv_class_add (htab_t table, unsigned 
 		 bitmap labels)
 {
   void **slot;
-  equiv_class_label_t ecl = XNEW (struct equiv_class_label);
+  equiv_class_label_t ecl
+    = (equiv_class_label_t) pool_alloc (equiv_class_pool);
 
   ecl->labels = labels;
   ecl->equivalence_class = equivalence_class;
@@ -2159,10 +2163,14 @@ perform_var_substitution (constraint_gra
   struct scc_info *si = init_scc_info (size);
 
   bitmap_obstack_initialize (&iteration_obstack);
+  equiv_class_pool = create_alloc_pool ("equiv_class_label pool",
+					sizeof (struct equiv_class_label),
+					64);
+  /* NULL free function, we'll free the whole pool at the end of the pass. */
   pointer_equiv_class_table = htab_create (511, equiv_class_label_hash,
-					   equiv_class_label_eq, free);
+					   equiv_class_label_eq, NULL);
   location_equiv_class_table = htab_create (511, equiv_class_label_hash,
-					    equiv_class_label_eq, free);
+					    equiv_class_label_eq, NULL);
   pointer_equiv_class = 1;
   location_equiv_class = 1;
 
@@ -2269,6 +2277,7 @@ free_var_substitution_info (struct scc_i
   sbitmap_free (graph->direct_nodes);
   htab_delete (pointer_equiv_class_table);
   htab_delete (location_equiv_class_table);
+  free_alloc_pool (equiv_class_pool);
   bitmap_obstack_release (&iteration_obstack);
 }
 



* Re: mem_attrs_htab
  2011-08-22  8:37   ` mem_attrs_htab Jakub Jelinek
@ 2011-08-22  9:39     ` Dimitrios Apostolou
  2011-08-22  9:43       ` mem_attrs_htab Jakub Jelinek
  0 siblings, 1 reply; 22+ messages in thread
From: Dimitrios Apostolou @ 2011-08-22  9:39 UTC (permalink / raw)
  To: Jakub Jelinek; +Cc: Dimitrios Apostolou, gcc-patches, Steven Bosscher

Hi Jakub,

I forgot to mention that all patches are against mid-July trunk; I was 
hoping I'd have no conflicts. Thanks for pointing this out. If there are 
conflicts with my other patches please let me know, and I'll post an 
updated version at a later date.

All your other concerns are valid and I'll try addressing them in the 
future. I didn't like hashing addresses either, and I was surprised I saw 
no regressions.


Dimitris


>
> This patch isn't against the trunk, where p->offset and p->size aren't rtxes
> anymore, but HOST_WIDE_INTs.  Furthermore, it is a bad idea to hash
> the p->expr address itself, it doesn't make any sense to hash on what
> p->expr points to in that case.  And p->offset and p->size should be ignored
> if the *known_p corresponding fields are false.  So, if you really think
> using iterative_hash_object is a win, it should be something like:
>  mem_attrs q = *p;
>  q.expr = NULL;
>  if (!q.offset_known_p) q.offset = 0;
>  if (!q.size_known_p) q.size = 0;
>  return iterative_hash_object (q, iterative_hash_expr (p->expr, 0));
> (or better yet avoid q.expr = NULL and instead start hashing from the next
> field after expr).  Hashing the struct padding might not be a good idea
> either.
>
> 	Jakub
>


* Re: mem_attrs_htab
  2011-08-22  9:39     ` mem_attrs_htab Dimitrios Apostolou
@ 2011-08-22  9:43       ` Jakub Jelinek
  2011-08-22 10:45         ` mem_attrs_htab Richard Guenther
  0 siblings, 1 reply; 22+ messages in thread
From: Jakub Jelinek @ 2011-08-22  9:43 UTC (permalink / raw)
  To: Dimitrios Apostolou; +Cc: gcc-patches, Steven Bosscher

On Mon, Aug 22, 2011 at 10:58:48AM +0300, Dimitrios Apostolou wrote:
> the future. I didn't like hashing addresses either, and I was
> surprised I saw no regressions.

Hashing on the expr address as well just results in smaller sharing
in the hash table (i.e. if the expr has different address, but is considered
equal).  The hashing of mem attrs is done just to reduce memory overhead.

	Jakub


* Re: Various minor speed-ups
  2011-08-22  7:50 Various minor speed-ups Dimitrios Apostolou
                   ` (4 preceding siblings ...)
  2011-08-22  8:43 ` Various minor speed-ups Dimitrios Apostolou
@ 2011-08-22  9:44 ` Dimitrios Apostolou
  2011-08-22  9:50 ` cse.c: preferable() Dimitrios Apostolou
  6 siblings, 0 replies; 22+ messages in thread
From: Dimitrios Apostolou @ 2011-08-22  9:44 UTC (permalink / raw)
  To: Dimitrios Apostolou; +Cc: gcc-patches, Steven Bosscher

[-- Attachment #1: Type: TEXT/PLAIN, Size: 425 bytes --]

For anyone concerned about memory usage: I didn't measure any real 
increase beyond a few KB. These are very hot allocation pools, and 
growing them in many small blocks of 10 elements is suboptimal.



2011-08-22  Dimitrios Apostolou  <jimis@gmx.net>

 	* cselib.c (cselib_init): Increased initial size of elt_list_pool,
 	elt_loc_list_pool, cselib_val_pool, value_pool allocation pools
 	since they are very frequently used.
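
A small worked example of the arithmetic, assuming each pool block costs one underlying allocation and that the pool grows one block at a time (both are assumptions about the pool, and the function names are hypothetical):

```c
#include <assert.h>
#include <stddef.h>

/* Number of blocks needed to hold N_ELTS elements.  */
static size_t
blocks_needed (size_t n_elts, size_t elts_per_block)
{
  assert (elts_per_block > 0);
  return (n_elts + elts_per_block - 1) / elts_per_block;
}

/* Elements allocated but unused in the last block.  */
static size_t
slack_elts (size_t n_elts, size_t elts_per_block)
{
  return blocks_needed (n_elts, elts_per_block) * elts_per_block - n_elts;
}
```

For 10000 live elements, blocks_needed() gives 1000 blocks at 10 elements per block but only 79 at 128 per block, while slack_elts() shows the cost: at most 127 unused elements per pool, here 112.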



[-- Attachment #2: Type: TEXT/plain, Size: 912 bytes --]

=== modified file 'gcc/cselib.c'
--- gcc/cselib.c	2011-05-31 19:14:21 +0000
+++ gcc/cselib.c	2011-08-17 14:03:56 +0000
@@ -2484,12 +2484,12 @@ void
 cselib_init (int record_what)
 {
   elt_list_pool = create_alloc_pool ("elt_list",
-				     sizeof (struct elt_list), 10);
+				     sizeof (struct elt_list), 128);
   elt_loc_list_pool = create_alloc_pool ("elt_loc_list",
-				         sizeof (struct elt_loc_list), 10);
+				         sizeof (struct elt_loc_list), 128);
   cselib_val_pool = create_alloc_pool ("cselib_val_list",
-				       sizeof (cselib_val), 10);
-  value_pool = create_alloc_pool ("value", RTX_CODE_SIZE (VALUE), 100);
+				       sizeof (cselib_val), 128);
+  value_pool = create_alloc_pool ("value", RTX_CODE_SIZE (VALUE), 128);
   cselib_record_memory = record_what & CSELIB_RECORD_MEMORY;
   cselib_preserve_constants = record_what & CSELIB_PRESERVE_CONSTANTS;
 



* cse.c: preferable()
  2011-08-22  7:50 Various minor speed-ups Dimitrios Apostolou
                   ` (5 preceding siblings ...)
  2011-08-22  9:44 ` Dimitrios Apostolou
@ 2011-08-22  9:50 ` Dimitrios Apostolou
  6 siblings, 0 replies; 22+ messages in thread
From: Dimitrios Apostolou @ 2011-08-22  9:50 UTC (permalink / raw)
  To: Dimitrios Apostolou; +Cc: gcc-patches, Steven Bosscher, christophe.jaillet

[-- Attachment #1: Type: TEXT/PLAIN, Size: 350 bytes --]

The attached patch is also posted at bug #19832, and I think it resolves 
it. It /maybe/ also offers a negligible speedup of 3-4 M instructions, or 
a couple of milliseconds. I am also posting it here for comments.


2011-08-13  Dimitrios Apostolou  <jimis@gmx.net>

     * cse.c (preferable): Make it more readable and slightly faster,
     without affecting its logic.


[-- Attachment #2: Type: TEXT/plain, Size: 1585 bytes --]

=== modified file 'gcc/cse.c'
--- gcc/cse.c	2011-06-02 21:52:46 +0000
+++ gcc/cse.c	2011-08-13 00:54:06 +0000
@@ -720,32 +720,25 @@ approx_reg_cost (rtx x)
 static int
 preferable (int cost_a, int regcost_a, int cost_b, int regcost_b)
 {
-  /* First, get rid of cases involving expressions that are entirely
-     unwanted.  */
-  if (cost_a != cost_b)
-    {
-      if (cost_a == MAX_COST)
-	return 1;
-      if (cost_b == MAX_COST)
-	return -1;
-    }
+  int cost_diff = cost_a - cost_b;
+  int regcost_diff = regcost_a - regcost_b;
 
-  /* Avoid extending lifetimes of hardregs.  */
-  if (regcost_a != regcost_b)
+  if (cost_diff != 0)
     {
-      if (regcost_a == MAX_COST)
-	return 1;
-      if (regcost_b == MAX_COST)
-	return -1;
+      /* If none of the expressions are entirely unwanted */
+      if ((cost_a != MAX_COST) && (cost_b != MAX_COST)
+	  /* AND only one of the regs is HARD_REG */
+	  && (regcost_diff != 0)
+	  && ((regcost_a == MAX_COST) || (regcost_b == MAX_COST))
+	  )
+	/* Then avoid extending lifetime of HARD_REG */
+	return regcost_diff;
+
+      return cost_diff;
     }
 
-  /* Normal operation costs take precedence.  */
-  if (cost_a != cost_b)
-    return cost_a - cost_b;
-  /* Only if these are identical consider effects on register pressure.  */
-  if (regcost_a != regcost_b)
-    return regcost_a - regcost_b;
-  return 0;
+  /* cost_a == costb, consider effects on register pressure */
+  return regcost_diff;
 }
 
 /* Internal function, to compute cost when X is not a register; called
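
One way to sanity-check the "without affecting its logic" claim is to transcribe both versions and compare them over a grid of sample costs. The sketch below does that under three assumptions: MAX_COST is INT_MAX, costs are non-negative (so the subtractions cannot overflow), and callers only look at the sign of the result, since the two versions can return different magnitudes. The names `preferable_old`, `preferable_new`, `sign` and `versions_agree` are introduced here for illustration.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

#define MAX_COST INT_MAX	/* assumed to match cse.c */

/* Pre-patch logic, transcribed from the removed lines.  */
static int
preferable_old (int cost_a, int regcost_a, int cost_b, int regcost_b)
{
  if (cost_a != cost_b)
    {
      if (cost_a == MAX_COST)
	return 1;
      if (cost_b == MAX_COST)
	return -1;
    }
  if (regcost_a != regcost_b)
    {
      if (regcost_a == MAX_COST)
	return 1;
      if (regcost_b == MAX_COST)
	return -1;
    }
  if (cost_a != cost_b)
    return cost_a - cost_b;
  if (regcost_a != regcost_b)
    return regcost_a - regcost_b;
  return 0;
}

/* Post-patch logic, transcribed from the added lines.  */
static int
preferable_new (int cost_a, int regcost_a, int cost_b, int regcost_b)
{
  /* Non-negative inputs keep the subtractions below from overflowing.  */
  assert (cost_a >= 0 && cost_b >= 0 && regcost_a >= 0 && regcost_b >= 0);

  int cost_diff = cost_a - cost_b;
  int regcost_diff = regcost_a - regcost_b;

  if (cost_diff != 0)
    {
      if (cost_a != MAX_COST && cost_b != MAX_COST
	  && regcost_diff != 0
	  && (regcost_a == MAX_COST || regcost_b == MAX_COST))
	return regcost_diff;
      return cost_diff;
    }
  return regcost_diff;
}

static int
sign (int x)
{
  return (x > 0) - (x < 0);
}

/* Return 1 if both versions agree in sign on every combination of the
   sample costs, 0 otherwise.  */
static int
versions_agree (void)
{
  static const int s[] = { 0, 1, 2, 7, 1000, MAX_COST - 1, MAX_COST };
  const size_t n = sizeof s / sizeof s[0];

  for (size_t a = 0; a < n; a++)
    for (size_t ra = 0; ra < n; ra++)
      for (size_t b = 0; b < n; b++)
	for (size_t rb = 0; rb < n; rb++)
	  if (sign (preferable_old (s[a], s[ra], s[b], s[rb]))
	      != sign (preferable_new (s[a], s[ra], s[b], s[rb])))
	    return 0;
  return 1;
}
```

A grid like this is not a proof, but it covers the MAX_COST corner cases on both the cost and the register-cost side.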



* Re: tree-ssa*: reduce malloc() calls by preallocating hot VECs on the stack
  2011-08-22  7:59 ` tree-ssa*: reduce malloc() calls by preallocating hot VECs on the stack Dimitrios Apostolou
@ 2011-08-22 10:07   ` Richard Guenther
  0 siblings, 0 replies; 22+ messages in thread
From: Richard Guenther @ 2011-08-22 10:07 UTC (permalink / raw)
  To: Dimitrios Apostolou; +Cc: gcc-patches, Steven Bosscher

On Mon, Aug 22, 2011 at 9:43 AM, Dimitrios Apostolou <jimis@gmx.net> wrote:
>
> 2011-08-22  Dimitrios Apostolou  <jimis@gmx.net>
>
>        Allocate some very frequently used vectors on the stack:
>        * vecir.h: Defined a tree vector on the stack.
>        * tree-ssa-sccvn.c (print_scc, sort_scc, process_scc)
>        (extract_and_process_scc_for_name): Allocate the scc vector on the
>        stack instead of the heap, giving it a minimal initial size
>        instead of 0.
>        * tree-ssa-structalias.c (get_constraint_for_1)
>        (get_constraint_for, get_constraint_for_rhs, do_deref)
>        (get_constraint_for_ssa_var, get_constraint_for_ptr_offset)
>        (get_constraint_for_component_ref, get_constraint_for_address_of)
>        (process_all_all_constraints, do_structure_copy)
>        (make_constraints_to, make_constraint_to, handle_rhs_call)
>        (handle_lhs_call, handle_const_call, handle_pure_call)
>        (find_func_aliases_for_builtin_call, find_func_aliases_for_call)
>        (find_func_aliases, process_ipa_clobber, find_func_clobbers)
>        (create_variable_info_for): Converted the rhsc, lhsc vectors from
>        heap to stack, with a minimal initial size, since they were very
>        frequently allocated.

Ok if bootstrapped and tested ok - please always state how you tested
a patch.

Thanks,
Richard.


* Re: graphds.[ch]: alloc_pool for edges
  2011-08-22  7:53 ` graphds.[ch]: alloc_pool for edges Dimitrios Apostolou
  2011-08-22  8:46   ` Jakub Jelinek
@ 2011-08-22 10:11   ` Richard Guenther
  1 sibling, 0 replies; 22+ messages in thread
From: Richard Guenther @ 2011-08-22 10:11 UTC (permalink / raw)
  To: Dimitrios Apostolou; +Cc: gcc-patches, Steven Bosscher

On Mon, Aug 22, 2011 at 9:37 AM, Dimitrios Apostolou <jimis@gmx.net> wrote:
> free() was called way too often before, this patch reduces it significantly.
> Minor speed-up here too, I don't mention it individually since numbers are
> within noise margins.

As there is no re-use in this pool, the natural allocator to use is an
obstack, which has even less overhead than an alloc_pool.

Richard.

>
> 2011-08-22  Dimitrios Apostolou  <jimis@gmx.net>
>
>        * graphds.h (struct graph): Added edge_pool as a pool for
>        allocating edges.
>        * graphds.c (new_graph): Initialise edge_pool.
>        (add_edge): Allocate edge from edge_pool rather than with malloc.
>        (free_graph): Instead of iterating across the graph freeing edges,
>        just destroy the edge_pool.
>
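
To make the distinction in this reply concrete, here is a minimal bump allocator in plain C. It hands out memory by advancing an offset inside a chunk, keeps no free list, and releases everything at once, which is the property that makes an obstack-style allocator cheap when elements are never individually re-used. It is a stand-in for the idea only and not the obstack API; every `arena_*` name is hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

struct arena_chunk
{
  struct arena_chunk *next;
  size_t used, cap;		/* bytes of storage after the header */
};

struct arena
{
  struct arena_chunk *chunks;	/* most recent chunk first */
  size_t chunk_size;
};

#define ARENA_ALIGN _Alignof (max_align_t)
#define ARENA_ROUND(n) (((n) + ARENA_ALIGN - 1) / ARENA_ALIGN * ARENA_ALIGN)
#define ARENA_HEADER ARENA_ROUND (sizeof (struct arena_chunk))

static void
arena_init (struct arena *a, size_t chunk_size)
{
  a->chunks = NULL;
  a->chunk_size = ARENA_ROUND (chunk_size);
}

static void *
arena_alloc (struct arena *a, size_t n)
{
  struct arena_chunk *c = a->chunks;

  n = ARENA_ROUND (n);
  if (!c || c->cap - c->used < n)
    {
      /* Start a new chunk; oversized requests get their own chunk.  */
      size_t cap = n > a->chunk_size ? n : a->chunk_size;
      c = malloc (ARENA_HEADER + cap);
      assert (c != NULL);
      c->next = a->chunks;
      c->used = 0;
      c->cap = cap;
      a->chunks = c;
    }
  void *p = (char *) c + ARENA_HEADER + c->used;
  c->used += n;			/* the whole allocation is this bump */
  return p;
}

/* There is no per-object free; release all chunks together.  */
static void
arena_release (struct arena *a)
{
  while (a->chunks)
    {
      struct arena_chunk *next = a->chunks->next;
      free (a->chunks);
      a->chunks = next;
    }
}
```

Compared with a pool that supports freeing individual elements, this needs no free-list link per element and no list maintenance on allocation, at the price of not being able to re-use a single slot.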


* Re: tree-ssa-structalias.c: alloc_pool for struct equiv_class_label
  2011-08-22  8:05 ` tree-ssa-structalias.c: alloc_pool for struct equiv_class_label Dimitrios Apostolou
  2011-08-22  9:01   ` Dimitrios Apostolou
@ 2011-08-22 10:25   ` Richard Guenther
  2011-08-22 11:26     ` Dimitrios Apostolou
  1 sibling, 1 reply; 22+ messages in thread
From: Richard Guenther @ 2011-08-22 10:25 UTC (permalink / raw)
  To: Dimitrios Apostolou; +Cc: gcc-patches, Steven Bosscher

On Mon, Aug 22, 2011 at 9:46 AM, Dimitrios Apostolou <jimis@gmx.net> wrote:
>
> 2011-08-22  Dimitrios Apostolou  <jimis@gmx.net>
>
>        * tree-ssa-structalias.c (equiv_class_add)
>        (perform_var_substitution, free_var_substitution_info): Created a
>        new equiv_class_pool allocator pool for struct
>        equiv_class_label. Changed the pointer_equiv_class_table and
>        location_equiv_class_table hash tables to not iterate freeing all
>        elements in the end, but just free the pool.

Did you check if the hash functions have ever called free()?  If so why
not use the pool free function so that entries can get re-used?  If not,
the natural allocator would be an obstack instead.

Richard.

>
>


* Re: mem_attrs_htab
  2011-08-22  9:43       ` mem_attrs_htab Jakub Jelinek
@ 2011-08-22 10:45         ` Richard Guenther
  2011-08-22 11:11           ` mem_attrs_htab Jakub Jelinek
  0 siblings, 1 reply; 22+ messages in thread
From: Richard Guenther @ 2011-08-22 10:45 UTC (permalink / raw)
  To: Jakub Jelinek; +Cc: Dimitrios Apostolou, gcc-patches, Steven Bosscher

On Mon, Aug 22, 2011 at 10:04 AM, Jakub Jelinek <jakub@redhat.com> wrote:
> On Mon, Aug 22, 2011 at 10:58:48AM +0300, Dimitrios Apostolou wrote:
>> the future. I didn't like hashing addresses either, and I was
>> surprised I saw no regressions.
>
> Hashing on the expr address as well just results in smaller sharing
> in the hash table (i.e. if the expr has different address, but is considered
> equal).  The hashing of mem attrs is done just to reduce memory overhead.

And at some point the idea popped up to just dump this whole re-using
mem-attrs.  There is at most a single function in RTL but the whole program
is available in SSA, so the memory overhead must be small.

Richard.

>        Jakub
>


* Re: Various minor speed-ups
  2011-08-22  8:43 ` Various minor speed-ups Dimitrios Apostolou
@ 2011-08-22 11:01   ` Richard Guenther
  0 siblings, 0 replies; 22+ messages in thread
From: Richard Guenther @ 2011-08-22 11:01 UTC (permalink / raw)
  To: Dimitrios Apostolou; +Cc: gcc-patches, Steven Bosscher

On Mon, Aug 22, 2011 at 9:50 AM, Dimitrios Apostolou <jimis@gmx.net> wrote:
>
> 2011-08-22  Dimitrios Apostolou  <jimis@gmx.net>
>
>        * tree-ssa-pre.c (phi_trans_add, init_pre, fini_pre): Added a pool
>        for phi_translate_table elements to avoid free() calls from
>        htab_delete().

Ok if bootstrap and test pass.

Richard.


* Re: mem_attrs_htab
  2011-08-22 10:45         ` mem_attrs_htab Richard Guenther
@ 2011-08-22 11:11           ` Jakub Jelinek
  2011-08-22 11:54             ` mem_attrs_htab Richard Guenther
  0 siblings, 1 reply; 22+ messages in thread
From: Jakub Jelinek @ 2011-08-22 11:11 UTC (permalink / raw)
  To: Richard Guenther; +Cc: Dimitrios Apostolou, gcc-patches, Steven Bosscher

On Mon, Aug 22, 2011 at 11:57:18AM +0200, Richard Guenther wrote:
> And at some point the idea popped up to just dump this whole re-using
> mem-attrs.  There is at most a single function in RTL but the whole program
> is available in SSA, so the memory overhead must be small.

Some functions are extremely large though.  Do you mean that MEM itself would be
enlarged to have the MEM_ATTRS field so that one operand is the address,
then expr, then HWI size, offset, etc.?  Because if the mem attrs aren't
shared any longer, it doesn't make sense to keep the indirection.
I still fear we have way too many MEMs in RTL that this would be noticeable.

	Jakub


* Re: tree-ssa-structalias.c: alloc_pool for struct equiv_class_label
  2011-08-22 10:25   ` Richard Guenther
@ 2011-08-22 11:26     ` Dimitrios Apostolou
  0 siblings, 0 replies; 22+ messages in thread
From: Dimitrios Apostolou @ 2011-08-22 11:26 UTC (permalink / raw)
  To: Richard Guenther; +Cc: Dimitrios Apostolou, gcc-patches, Steven Bosscher

[-- Attachment #1: Type: TEXT/PLAIN, Size: 1214 bytes --]

On Mon, 22 Aug 2011, Richard Guenther wrote:

> On Mon, Aug 22, 2011 at 9:46 AM, Dimitrios Apostolou <jimis@gmx.net> wrote:
>>
>> 2011-08-22  Dimitrios Apostolou  <jimis@gmx.net>
>>
>>        * tree-ssa-structalias.c (equiv_class_add)
>>        (perform_var_substitution, free_var_substitution_info): Created a
>>        new equiv_class_pool allocator pool for struct
>>        equiv_class_label. Changed the pointer_equiv_class_table and
>>        location_equiv_class_table hash tables to not iterate freeing all
>>        elements in the end, but just free the pool.
>
> Did you check if the hash functions have ever called free()?  If so why
> not use the pool free function so that entries can get re-used?  If not,
> the natural allocator would be an obstack instead.

I have not found any relevant call of htab_clear_slot(). I didn't consider 
obstacks at all for any of these cases; thanks for telling me, I'll see 
where I can use them. As I've noted, I have bootstrapped and tested all 
these changes at least on x86_64 with release-checking enabled, but I plan 
to test and measure all my changes together later, and hopefully on other 
platforms in the future.


Thanks,
Dimitris


* Re: mem_attrs_htab
  2011-08-22 11:11           ` mem_attrs_htab Jakub Jelinek
@ 2011-08-22 11:54             ` Richard Guenther
  2011-08-22 16:13               ` mem_attrs_htab Michael Matz
  0 siblings, 1 reply; 22+ messages in thread
From: Richard Guenther @ 2011-08-22 11:54 UTC (permalink / raw)
  To: Jakub Jelinek; +Cc: Dimitrios Apostolou, gcc-patches, Steven Bosscher

On Mon, Aug 22, 2011 at 12:07 PM, Jakub Jelinek <jakub@redhat.com> wrote:
> On Mon, Aug 22, 2011 at 11:57:18AM +0200, Richard Guenther wrote:
>> And at some point the idea popped up to just dump this whole re-using
>> mem-attrs.  There is at most a single function in RTL but the whole program
>> is available in SSA, so the memory overhead must be small.
>
> Some functions are extremely large though.  Do you mean that MEM itself would be
> enlarged to have the MEM_ATTRS field so that one operand is the address,
> then expr, then HWI size, offset, etc.?  Because if the mem attrs aren't
> shared any longer, it doesn't make sense to keep the indirection.
> I still fear we have way too many MEMs in RTL that this would be noticeable.

It would be interesting to have numbers about the amount of sharing that
happens - might be not trivial though, as some re-uses would be able to
simply modify the attr inplace.

Richard.

>        Jakub
>


* Re: mem_attrs_htab
  2011-08-22 11:54             ` mem_attrs_htab Richard Guenther
@ 2011-08-22 16:13               ` Michael Matz
  0 siblings, 0 replies; 22+ messages in thread
From: Michael Matz @ 2011-08-22 16:13 UTC (permalink / raw)
  To: Richard Guenther
  Cc: Jakub Jelinek, Dimitrios Apostolou, gcc-patches, Steven Bosscher

Hi,

On Mon, 22 Aug 2011, Richard Guenther wrote:

> > Some functions are extremely large though.  Do you mean that MEM 
> > itself would be enlarged to have the MEM_ATTRS field so that one 
> > operand is the address, then expr, then HWI size, offset, etc.? 
> >  Because if the mem attrs aren't shared any longer, it doesn't make 
> > sense to keep the indirection. I still fear we have way too many MEMs 
> > in RTL that this would be noticeable.
> 
> It would be interesting to have numbers about the amount of sharing that 
> happens

A pathetic amount.  From compiling cse.c combine.c tree.c and dwarf2out.c 
the top 10 users of MEMs per routine are:

With -O0:
MEMs  ATTR  name
970   485   combine_simplify_rtx
1011  464   simple_cst_equal
1047  612   mem_loc_descriptor
1173  442   walk_tree_1
1296  515   loc_list_from_tree
1431  690   simplify_comparison
1911  752   substitute_placeholder_in_expr
1951  745   substitute_in_expr
2503  1532  cse_insn
3242  2206  try_combine

With -O2:
MEMs  ATTR  name
514   502   gen_tagged_type_die
701   536   simplify_comparison
743   877   find_decls_types_r
851   840   dwarf2out_finish
863   784   loc_list_from_tree
916   839   combine_simplify_rtx
978   878   gen_subprogram_die
1650  1475  cse_insn
1720  1782  mem_loc_descriptor
2336  1792  try_combine

Summing doesn't make sense, but the routines with the largest differences 
(MEMs minus ATTR) are:
-O0
532 force_to_mode
547 simple_cst_equal
640 simplify_shift_const_1
731 walk_tree_1
741 simplify_comparison
781 loc_list_from_tree
971 cse_insn
1036 try_combine
1159 substitute_placeholder_in_expr
1206 substitute_in_expr

-O2
100 gen_subprogram_die
101 make_extraction
112 output_loc_sequence
122 if_then_else_cond
124 substitute_placeholder_in_expr
144 simplify_shift_const_1
165 simplify_comparison
175 cse_insn
205 simplify_if_then_else
544 try_combine

(Using -g or not doesn't make a difference).  I've counted all MEM rtx in 
the whole insn stream at finalization time (i.e. slightly fewer than are 
potentially generated during the RTL passes).  ATTR is the number 
of unique mem_attrs ever created by set_mem_attrs, reset to zero at each 
function start (including emptying the htab).
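
The counting scheme described above can be sketched roughly as follows.  
This is only an illustration, not the instrumentation actually used: 
struct attrs_sketch, share_counter, counter_reset and counter_record are 
hypothetical names, the fields are a stand-in for the real mem_attrs, and 
a fixed-size open-addressing table replaces GCC's htab.  Note also that 
the measurement counted MEMs at finalization time, whereas this sketch 
bumps both counters in one place.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for an attribute record; the real mem_attrs
   in GCC has different fields.  */
struct attrs_sketch {
  long offset;
  long size;
  unsigned align;
  int alias_set;
};

#define TABLE_SLOTS 1024  /* power of two, so masking works as modulo */

struct share_counter {
  struct attrs_sketch slots[TABLE_SLOTS];
  unsigned char used[TABLE_SLOTS];
  size_t total;   /* every record seen (roughly the MEMs column) */
  size_t unique;  /* distinct records stored (roughly the ATTR column) */
};

static int attrs_equal(const struct attrs_sketch *a,
                       const struct attrs_sketch *b)
{
  return a->offset == b->offset && a->size == b->size
         && a->align == b->align && a->alias_set == b->alias_set;
}

/* FNV-style mixing of the fields, reduced to a slot index.  */
static size_t attrs_hash(const struct attrs_sketch *a)
{
  uint64_t h = 1469598103934665603ULL;
  h = (h ^ (uint64_t) a->offset) * 1099511628211ULL;
  h = (h ^ (uint64_t) a->size) * 1099511628211ULL;
  h = (h ^ (uint64_t) a->align) * 1099511628211ULL;
  h = (h ^ (uint64_t) a->alias_set) * 1099511628211ULL;
  h ^= h >> 32;
  return (size_t) (h & (TABLE_SLOTS - 1));
}

/* Call at each function start: zeroes the counters and empties the table.  */
static void counter_reset(struct share_counter *c)
{
  assert(c != NULL);
  memset(c, 0, sizeof *c);
}

/* Record one attribute use.  Returns 1 if the record was new, 0 if an
   identical record was already stored (i.e. it would have been shared),
   and -1 if the table is full.  */
static int counter_record(struct share_counter *c,
                          const struct attrs_sketch *a)
{
  size_t start, i;

  assert(c != NULL && a != NULL);
  c->total++;
  start = attrs_hash(a);
  for (i = 0; i < TABLE_SLOTS; i++) {
    size_t slot = (start + i) & (TABLE_SLOTS - 1);  /* linear probing */
    if (!c->used[slot]) {
      c->slots[slot] = *a;
      c->used[slot] = 1;
      c->unique++;
      return 1;
    }
    if (attrs_equal(&c->slots[slot], a))
      return 0;
  }
  return -1;
}
```

After a function has been processed, total minus unique is the number of 
records that sharing avoided, which is the quantity the difference lists 
above report.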

That is, we save a whopping 48 kilobytes due to this fantastic hash table 
:-)  (offset by the need for a pointer in the MEM rtx)
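
As a rough sketch of that trade-off: with sharing, each MEM carries a 
pointer and each unique record is stored once; without sharing, each MEM 
would carry its own record.  The helper names and both size parameters 
below are assumptions for illustration only (they are not the real 
sizeof values in GCC), and the hash table's own overhead is ignored.

```c
#include <assert.h>

/* Number of attribute records that sharing avoided allocating
   (MEMs minus ATTR in the tables above).  May be negative when more
   records were created than MEMs survive to finalization.  */
static long shared_records(long mems, long attrs)
{
  return mems - attrs;
}

/* Net bytes saved by sharing for one function:
     without sharing:  mems * attrs_size
     with sharing:     attrs * attrs_size + mems * ptr_size
   attrs_size and ptr_size are caller-supplied assumptions.  */
static long net_bytes_saved(long mems, long attrs,
                            long attrs_size, long ptr_size)
{
  assert(mems >= 0 && attrs >= 0 && attrs_size > 0 && ptr_size > 0);
  return shared_records(mems, attrs) * attrs_size - mems * ptr_size;
}
```

For example, the -O0 try_combine row (3242 MEMs, 2206 ATTR) with an 
assumed 32-byte record and 8-byte pointer gives 
net_bytes_saved(3242, 2206, 32, 8) == 7216, i.e. about 7 kilobytes for 
the largest routine measured.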

Just remove the whole thing.  Same for the reg_attrs hash table (I haven't 
measured that one, though).

> - might be not trivial though, as some re-uses would be able to 
> simply modify the attr inplace.


Ciao,
Michael.

Thread overview: 22+ messages
2011-08-22  7:50 Various minor speed-ups Dimitrios Apostolou
2011-08-22  7:53 ` mem_attrs_htab Dimitrios Apostolou
2011-08-22  8:37   ` mem_attrs_htab Jakub Jelinek
2011-08-22  9:39     ` mem_attrs_htab Dimitrios Apostolou
2011-08-22  9:43       ` mem_attrs_htab Jakub Jelinek
2011-08-22 10:45         ` mem_attrs_htab Richard Guenther
2011-08-22 11:11           ` mem_attrs_htab Jakub Jelinek
2011-08-22 11:54             ` mem_attrs_htab Richard Guenther
2011-08-22 16:13               ` mem_attrs_htab Michael Matz
2011-08-22  7:53 ` graphds.[ch]: alloc_pool for edges Dimitrios Apostolou
2011-08-22  8:46   ` Jakub Jelinek
2011-08-22 10:11   ` Richard Guenther
2011-08-22  7:59 ` tree-ssa*: reduce malloc() calls by preallocating hot VECs on the stack Dimitrios Apostolou
2011-08-22 10:07   ` Richard Guenther
2011-08-22  8:05 ` tree-ssa-structalias.c: alloc_pool for struct equiv_class_label Dimitrios Apostolou
2011-08-22  9:01   ` Dimitrios Apostolou
2011-08-22 10:25   ` Richard Guenther
2011-08-22 11:26     ` Dimitrios Apostolou
2011-08-22  8:43 ` Various minor speed-ups Dimitrios Apostolou
2011-08-22 11:01   ` Richard Guenther
2011-08-22  9:44 ` Dimitrios Apostolou
2011-08-22  9:50 ` cse.c: preferable() Dimitrios Apostolou
