* Various minor speed-ups
@ 2011-08-22 7:50 Dimitrios Apostolou
From: Dimitrios Apostolou @ 2011-08-22 7:50 UTC (permalink / raw)
To: gcc-patches; +Cc: Steven Bosscher, Dimitrios Apostolou
Hello list,
the following patches are a selection of minor changes introduced at
various times during my GSoC project. Most are too simple or too
unimportant to post on their own, so I'll post them all together under
this thread. Nevertheless, they have been carefully selected from a pool of
other changes: these are the ones that *do* offer some (minor) speed
improvement while having the least impact on memory usage, if any.
They have all been tested on x86_64, and some also on i386. I have seen
no regressions in production builds.
Thanks,
Dimitris
* graphds.[ch]: alloc_pool for edges
From: Dimitrios Apostolou @ 2011-08-22 7:53 UTC (permalink / raw)
To: Dimitrios Apostolou; +Cc: gcc-patches, Steven Bosscher
[-- Attachment #1: Type: TEXT/PLAIN, Size: 524 bytes --]
free() was called far too often before; this patch reduces the number of
calls significantly. There is a minor speed-up here too, which I don't
report individually since the numbers are within noise margins.
2011-08-22 Dimitrios Apostolou <jimis@gmx.net>
* graphds.h (struct graph): Added edge_pool as a pool for
allocating edges.
* graphds.c (new_graph): Initialise edge_pool.
(add_edge): Allocate edge from edge_pool rather than with malloc.
(free_graph): Instead of iterating across the graph freeing edges,
just destroy the edge_pool.
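To illustrate why this helps: a pool allocator carves fixed-size objects out of large blocks, so destroying the whole graph is one walk over a short block list instead of one free() per edge. Below is a self-contained miniature sketch of that idea; it is not GCC's alloc-pool implementation (create_alloc_pool/pool_alloc/free_alloc_pool are more elaborate), and all names in it are illustrative.

```c
#include <stdlib.h>

/* Miniature object pool: objects are handed out from large blocks,
   and pool_free_all() releases everything at once.  */

struct pool_block { struct pool_block *next; };

struct mini_pool
{
  size_t obj_size;           /* size of each object */
  size_t objs_per_block;     /* objects carved from one block */
  size_t used;               /* objects handed out of current block */
  char *current;             /* current block's payload */
  struct pool_block *blocks; /* list of all blocks, for bulk free */
};

static void
pool_init (struct mini_pool *p, size_t obj_size, size_t objs_per_block)
{
  p->obj_size = obj_size;
  p->objs_per_block = objs_per_block;
  p->used = objs_per_block;  /* force a block allocation on first request */
  p->current = NULL;
  p->blocks = NULL;
}

static void *
pool_alloc_obj (struct mini_pool *p)
{
  if (p->used == p->objs_per_block)
    {
      /* Current block exhausted: grab a new one and chain it.  */
      struct pool_block *b
        = malloc (sizeof (struct pool_block)
                  + p->obj_size * p->objs_per_block);
      b->next = p->blocks;
      p->blocks = b;
      p->current = (char *) (b + 1);
      p->used = 0;
    }
  return p->current + p->obj_size * p->used++;
}

static void
pool_free_all (struct mini_pool *p)
{
  /* One free() per block, not per object.  */
  struct pool_block *b = p->blocks, *n;
  for (; b; b = n)
    {
      n = b->next;
      free (b);
    }
  p->blocks = NULL;
}
```

The trade-off is that individual objects cannot be freed early, which is fine for graph edges that all die together in free_graph.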
[-- Attachment #2: Type: TEXT/plain, Size: 1966 bytes --]
=== modified file 'gcc/graphds.c'
--- gcc/graphds.c 2009-11-25 10:55:54 +0000
+++ gcc/graphds.c 2011-08-19 16:44:41 +0000
@@ -62,7 +62,8 @@ new_graph (int n_vertices)
g->n_vertices = n_vertices;
g->vertices = XCNEWVEC (struct vertex, n_vertices);
-
+ g->edge_pool = create_alloc_pool ("edge_pool",
+ sizeof (struct graph_edge), 32);
return g;
}
@@ -71,7 +72,7 @@ new_graph (int n_vertices)
struct graph_edge *
add_edge (struct graph *g, int f, int t)
{
- struct graph_edge *e = XNEW (struct graph_edge);
+ struct graph_edge *e = (struct graph_edge *) pool_alloc (g->edge_pool);
struct vertex *vf = &g->vertices[f], *vt = &g->vertices[t];
@@ -326,19 +327,7 @@ for_each_edge (struct graph *g, graphds_
void
free_graph (struct graph *g)
{
- struct graph_edge *e, *n;
- struct vertex *v;
- int i;
-
- for (i = 0; i < g->n_vertices; i++)
- {
- v = &g->vertices[i];
- for (e = v->succ; e; e = n)
- {
- n = e->succ_next;
- free (e);
- }
- }
+ free_alloc_pool (g->edge_pool);
free (g->vertices);
free (g);
}
=== modified file 'gcc/graphds.h'
--- gcc/graphds.h 2009-02-20 15:20:38 +0000
+++ gcc/graphds.h 2011-08-19 16:44:41 +0000
@@ -18,6 +18,10 @@ You should have received a copy of the G
along with GCC; see the file COPYING3. If not see
<http://www.gnu.org/licenses/>. */
+
+#include "alloc-pool.h"
+
+
/* Structure representing edge of a graph. */
struct graph_edge
@@ -44,10 +48,10 @@ struct vertex
struct graph
{
- int n_vertices; /* Number of vertices. */
- struct vertex *vertices;
- /* The vertices. */
- htab_t indices; /* Fast lookup for indices. */
+ int n_vertices; /* Number of vertices. */
+ struct vertex *vertices; /* The vertices. */
+ htab_t indices; /* Fast lookup for indices. */
+ alloc_pool edge_pool; /* Pool for allocating edges. */
};
struct graph *new_graph (int);
* mem_attrs_htab
From: Dimitrios Apostolou @ 2011-08-22 7:53 UTC (permalink / raw)
To: Dimitrios Apostolou; +Cc: gcc-patches, Steven Bosscher
[-- Attachment #1: Type: TEXT/PLAIN, Size: 470 bytes --]
2011-08-22 Dimitrios Apostolou <jimis@gmx.net>
* emit-rtl.c (mem_attrs_htab_hash): Hash the whole struct by
feeding it to iterative_hash(). This disregards the offset and
size rtx fields of the mem_attrs struct, but overall the new hash
is a *huge* improvement over the previous one: it reduces the
collisions/searches ratio from 8 to 0.8 in some cases.
(init_emit_once): Slightly increase the initial size of
mem_attrs_htab, since it is frequently used and expanded many times.
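The idea behind the change can be sketched in isolation: instead of XORing a few hand-picked fields with magic multipliers, feed every byte of the object through a mixing function, which distributes values across buckets far better. The sketch below uses a simplified FNV-1a mixer as a stand-in for libiberty's iterative_hash(), and `struct attrs` is a hypothetical cut-down analogue of mem_attrs; neither is the actual GCC code. Note the zero-fill before hashing: hashing raw object bytes only gives stable results if padding bytes are deterministic.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Byte-wise mixing hash over a whole object (simplified FNV-1a,
   standing in for libiberty's iterative_hash).  */
static uint32_t
hash_bytes (const void *obj, size_t len, uint32_t initval)
{
  const unsigned char *p = (const unsigned char *) obj;
  uint32_t h = 2166136261u ^ initval;
  size_t i;
  for (i = 0; i < len; i++)
    {
      h ^= p[i];
      h *= 16777619u;  /* FNV prime */
    }
  return h;
}

/* Hypothetical cut-down analogue of mem_attrs, for illustration.  */
struct attrs
{
  int alias;
  unsigned int align;
  unsigned char addrspace;
};

static uint32_t
hash_attrs (const struct attrs *a)
{
  /* Copy fields into a zero-filled temporary so padding bytes are
     deterministic before hashing the whole object.  */
  struct attrs tmp;
  memset (&tmp, 0, sizeof tmp);
  tmp.alias = a->alias;
  tmp.align = a->align;
  tmp.addrspace = a->addrspace;
  return hash_bytes (&tmp, sizeof tmp, 0);
}
```

Because the multiply-by-odd-prime step is a bijection mod 2^32, any single differing byte guarantees a different final hash, which is what drives the collision ratio down compared to the old field-XOR scheme.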
[-- Attachment #2: Type: TEXT/plain, Size: 1222 bytes --]
=== modified file 'gcc/emit-rtl.c'
--- gcc/emit-rtl.c 2011-05-29 17:40:05 +0000
+++ gcc/emit-rtl.c 2011-08-21 04:44:25 +0000
@@ -256,11 +256,10 @@ mem_attrs_htab_hash (const void *x)
{
const mem_attrs *const p = (const mem_attrs *) x;
- return (p->alias ^ (p->align * 1000)
- ^ (p->addrspace * 4000)
- ^ ((p->offset ? INTVAL (p->offset) : 0) * 50000)
- ^ ((p->size ? INTVAL (p->size) : 0) * 2500000)
- ^ (size_t) iterative_hash_expr (p->expr, 0));
+ /* By massively feeding the mem_attrs struct to iterative_hash() we
+ disregard the p->offset and p->size rtx, but in total the hash is
+ quick and good enough. */
+ return iterative_hash_object (*p, iterative_hash_expr (p->expr, 0));
}
/* Returns nonzero if the value represented by X (which is really a
@@ -5494,7 +5500,7 @@ init_emit_once (void)
const_fixed_htab = htab_create_ggc (37, const_fixed_htab_hash,
const_fixed_htab_eq, NULL);
- mem_attrs_htab = htab_create_ggc (37, mem_attrs_htab_hash,
+ mem_attrs_htab = htab_create_ggc (128, mem_attrs_htab_hash,
mem_attrs_htab_eq, NULL);
reg_attrs_htab = htab_create_ggc (37, reg_attrs_htab_hash,
reg_attrs_htab_eq, NULL);
* tree-ssa*: reduce malloc() calls by preallocating hot VECs on the stack
From: Dimitrios Apostolou @ 2011-08-22 7:59 UTC (permalink / raw)
To: Dimitrios Apostolou; +Cc: gcc-patches, Steven Bosscher
[-- Attachment #1: Type: TEXT/PLAIN, Size: 1057 bytes --]
2011-08-22 Dimitrios Apostolou <jimis@gmx.net>
Allocate some very frequently used vectors on the stack:
* vecir.h: Define a stack-allocated tree vector.
* tree-ssa-sccvn.c (print_scc, sort_scc, process_scc)
(extract_and_process_scc_for_name): Allocate the scc vector on the
stack instead of the heap, giving it a minimal initial size
instead of 0.
* tree-ssa-structalias.c (get_constraint_for_1)
(get_constraint_for, get_constraint_for_rhs, do_deref)
(get_constraint_for_ssa_var, get_constraint_for_ptr_offset)
(get_constraint_for_component_ref, get_constraint_for_address_of)
(process_all_all_constraints, do_structure_copy)
(make_constraints_to, make_constraint_to, handle_rhs_call)
(handle_lhs_call, handle_const_call, handle_pure_call)
(find_func_aliases_for_builtin_call, find_func_aliases_for_call)
(find_func_aliases, process_ipa_clobber, find_func_clobbers)
(create_variable_info_for): Convert the rhsc and lhsc vectors from
heap to stack allocation, with a minimal initial size, since they
were allocated very frequently.
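The pattern behind VEC(ce_s, stack) can be sketched with a plain C small-vector: the vector starts out in a caller-provided stack buffer and only touches malloc() if it outgrows that buffer, so the common short-lived case does no heap allocation at all. The names below are illustrative, not GCC's VEC API.

```c
#include <stdlib.h>
#include <string.h>

/* Vector of ints that lives in a caller-supplied stack buffer until
   it overflows, then spills to the heap.  */
struct stack_vec
{
  int *data;    /* current storage: stack buffer or heap block */
  size_t len;
  size_t cap;
  int on_heap;  /* nonzero once we spilled to malloc */
};

static void
sv_init (struct stack_vec *v, int *buf, size_t cap)
{
  v->data = buf;
  v->len = 0;
  v->cap = cap;
  v->on_heap = 0;
}

static void
sv_push (struct stack_vec *v, int x)
{
  if (v->len == v->cap)
    {
      /* Outgrew the buffer: move to a doubled heap block.  */
      size_t ncap = v->cap * 2;
      int *nd = malloc (ncap * sizeof *nd);
      memcpy (nd, v->data, v->len * sizeof *nd);
      if (v->on_heap)
        free (v->data);
      v->data = nd;
      v->cap = ncap;
      v->on_heap = 1;
    }
  v->data[v->len++] = x;
}

static void
sv_free (struct stack_vec *v)
{
  if (v->on_heap)
    free (v->data);  /* common case: nothing to free */
  v->data = NULL;
  v->len = v->cap = 0;
}
```

Picking the initial size so that typical constraint vectors fit (32 elements in the patch) is what turns the frequent malloc()/free() pairs into pure stack traffic.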
[-- Attachment #2: Type: TEXT/plain, Size: 28822 bytes --]
=== modified file 'gcc/tree-ssa-structalias.c'
--- gcc/tree-ssa-structalias.c 2011-08-18 06:53:12 +0000
+++ gcc/tree-ssa-structalias.c 2011-08-19 09:43:41 +0000
@@ -477,11 +477,14 @@ struct constraint_expr
typedef struct constraint_expr ce_s;
DEF_VEC_O(ce_s);
-DEF_VEC_ALLOC_O(ce_s, heap);
-static void get_constraint_for_1 (tree, VEC(ce_s, heap) **, bool, bool);
-static void get_constraint_for (tree, VEC(ce_s, heap) **);
-static void get_constraint_for_rhs (tree, VEC(ce_s, heap) **);
-static void do_deref (VEC (ce_s, heap) **);
+DEF_VEC_ALLOC_O_STACK(ce_s);
+#define VEC_ce_s_stack_alloc(alloc) \
+ VEC_stack_alloc (ce_s, alloc)
+
+static void get_constraint_for_1 (tree, VEC(ce_s, stack) **, bool, bool);
+static void get_constraint_for (tree, VEC(ce_s, stack) **);
+static void get_constraint_for_rhs (tree, VEC(ce_s, stack) **);
+static void do_deref (VEC (ce_s, stack) **);
/* Our set constraints are made up of two constraint expressions, one
LHS, and one RHS.
@@ -2736,7 +2739,7 @@ new_scalar_tmp_constraint_exp (const cha
If address_p is true, the result will be taken its address of. */
static void
-get_constraint_for_ssa_var (tree t, VEC(ce_s, heap) **results, bool address_p)
+get_constraint_for_ssa_var (tree t, VEC(ce_s, stack) **results, bool address_p)
{
struct constraint_expr cexpr;
varinfo_t vi;
@@ -2776,12 +2779,12 @@ get_constraint_for_ssa_var (tree t, VEC(
for (; vi; vi = vi->next)
{
cexpr.var = vi->id;
- VEC_safe_push (ce_s, heap, *results, &cexpr);
+ VEC_safe_push (ce_s, stack, *results, &cexpr);
}
return;
}
- VEC_safe_push (ce_s, heap, *results, &cexpr);
+ VEC_safe_push (ce_s, stack, *results, &cexpr);
}
/* Process constraint T, performing various simplifications and then
@@ -2861,7 +2864,7 @@ bitpos_of_field (const tree fdecl)
static void
get_constraint_for_ptr_offset (tree ptr, tree offset,
- VEC (ce_s, heap) **results)
+ VEC (ce_s, stack) **results)
{
struct constraint_expr c;
unsigned int j, n;
@@ -2920,7 +2923,7 @@ get_constraint_for_ptr_offset (tree ptr,
c2.type = ADDRESSOF;
c2.offset = 0;
if (c2.var != c.var)
- VEC_safe_push (ce_s, heap, *results, &c2);
+ VEC_safe_push (ce_s, stack, *results, &c2);
temp = temp->next;
}
while (temp);
@@ -2955,7 +2958,7 @@ get_constraint_for_ptr_offset (tree ptr,
c2.var = temp->next->id;
c2.type = ADDRESSOF;
c2.offset = 0;
- VEC_safe_push (ce_s, heap, *results, &c2);
+ VEC_safe_push (ce_s, stack, *results, &c2);
}
c.var = temp->id;
c.offset = 0;
@@ -2974,7 +2977,7 @@ get_constraint_for_ptr_offset (tree ptr,
as the lhs. */
static void
-get_constraint_for_component_ref (tree t, VEC(ce_s, heap) **results,
+get_constraint_for_component_ref (tree t, VEC(ce_s, stack) **results,
bool address_p, bool lhs_p)
{
tree orig_t = t;
@@ -2999,7 +3002,7 @@ get_constraint_for_component_ref (tree t
temp.offset = 0;
temp.var = integer_id;
temp.type = SCALAR;
- VEC_safe_push (ce_s, heap, *results, &temp);
+ VEC_safe_push (ce_s, stack, *results, &temp);
return;
}
@@ -3021,7 +3024,7 @@ get_constraint_for_component_ref (tree t
temp.offset = 0;
temp.var = anything_id;
temp.type = ADDRESSOF;
- VEC_safe_push (ce_s, heap, *results, &temp);
+ VEC_safe_push (ce_s, stack, *results, &temp);
return;
}
}
@@ -3062,7 +3065,7 @@ get_constraint_for_component_ref (tree t
bitpos, bitmaxsize))
{
cexpr.var = curr->id;
- VEC_safe_push (ce_s, heap, *results, &cexpr);
+ VEC_safe_push (ce_s, stack, *results, &cexpr);
if (address_p)
break;
}
@@ -3077,7 +3080,7 @@ get_constraint_for_component_ref (tree t
while (curr->next != NULL)
curr = curr->next;
cexpr.var = curr->id;
- VEC_safe_push (ce_s, heap, *results, &cexpr);
+ VEC_safe_push (ce_s, stack, *results, &cexpr);
}
else if (VEC_length (ce_s, *results) == 0)
/* Assert that we found *some* field there. The user couldn't be
@@ -3090,7 +3093,7 @@ get_constraint_for_component_ref (tree t
cexpr.type = SCALAR;
cexpr.var = anything_id;
cexpr.offset = 0;
- VEC_safe_push (ce_s, heap, *results, &cexpr);
+ VEC_safe_push (ce_s, stack, *results, &cexpr);
}
}
else if (bitmaxsize == 0)
@@ -3136,7 +3139,7 @@ get_constraint_for_component_ref (tree t
This is needed so that we can handle dereferencing DEREF constraints. */
static void
-do_deref (VEC (ce_s, heap) **constraints)
+do_deref (VEC (ce_s, stack) **constraints)
{
struct constraint_expr *c;
unsigned int i = 0;
@@ -3163,7 +3166,7 @@ do_deref (VEC (ce_s, heap) **constraints
address of it. */
static void
-get_constraint_for_address_of (tree t, VEC (ce_s, heap) **results)
+get_constraint_for_address_of (tree t, VEC (ce_s, stack) **results)
{
struct constraint_expr *c;
unsigned int i;
@@ -3182,7 +3185,7 @@ get_constraint_for_address_of (tree t, V
/* Given a tree T, return the constraint expression for it. */
static void
-get_constraint_for_1 (tree t, VEC (ce_s, heap) **results, bool address_p,
+get_constraint_for_1 (tree t, VEC (ce_s, stack) **results, bool address_p,
bool lhs_p)
{
struct constraint_expr temp;
@@ -3214,7 +3217,7 @@ get_constraint_for_1 (tree t, VEC (ce_s,
temp.var = nonlocal_id;
temp.type = ADDRESSOF;
temp.offset = 0;
- VEC_safe_push (ce_s, heap, *results, &temp);
+ VEC_safe_push (ce_s, stack, *results, &temp);
return;
}
@@ -3224,7 +3227,7 @@ get_constraint_for_1 (tree t, VEC (ce_s,
temp.var = readonly_id;
temp.type = SCALAR;
temp.offset = 0;
- VEC_safe_push (ce_s, heap, *results, &temp);
+ VEC_safe_push (ce_s, stack, *results, &temp);
return;
}
@@ -3275,7 +3278,7 @@ get_constraint_for_1 (tree t, VEC (ce_s,
if (curr->offset - vi->offset < size)
{
cs.var = curr->id;
- VEC_safe_push (ce_s, heap, *results, &cs);
+ VEC_safe_push (ce_s, stack, *results, &cs);
}
else
break;
@@ -3310,17 +3313,17 @@ get_constraint_for_1 (tree t, VEC (ce_s,
{
unsigned int i;
tree val;
- VEC (ce_s, heap) *tmp = NULL;
+ VEC (ce_s, stack) *tmp = VEC_alloc (ce_s, stack, 32);
FOR_EACH_CONSTRUCTOR_VALUE (CONSTRUCTOR_ELTS (t), i, val)
{
struct constraint_expr *rhsp;
unsigned j;
get_constraint_for_1 (val, &tmp, address_p, lhs_p);
FOR_EACH_VEC_ELT (ce_s, tmp, j, rhsp)
- VEC_safe_push (ce_s, heap, *results, rhsp);
+ VEC_safe_push (ce_s, stack, *results, rhsp);
VEC_truncate (ce_s, tmp, 0);
}
- VEC_free (ce_s, heap, tmp);
+ VEC_free (ce_s, stack, tmp);
/* We do not know whether the constructor was complete,
so technically we have to add &NOTHING or &ANYTHING
like we do for an empty constructor as well. */
@@ -3341,7 +3344,7 @@ get_constraint_for_1 (tree t, VEC (ce_s,
temp.type = ADDRESSOF;
temp.var = nonlocal_id;
temp.offset = 0;
- VEC_safe_push (ce_s, heap, *results, &temp);
+ VEC_safe_push (ce_s, stack, *results, &temp);
return;
}
default:;
@@ -3351,13 +3354,13 @@ get_constraint_for_1 (tree t, VEC (ce_s,
temp.type = ADDRESSOF;
temp.var = anything_id;
temp.offset = 0;
- VEC_safe_push (ce_s, heap, *results, &temp);
+ VEC_safe_push (ce_s, stack, *results, &temp);
}
/* Given a gimple tree T, return the constraint expression vector for it. */
static void
-get_constraint_for (tree t, VEC (ce_s, heap) **results)
+get_constraint_for (tree t, VEC (ce_s, stack) **results)
{
gcc_assert (VEC_length (ce_s, *results) == 0);
@@ -3368,7 +3371,7 @@ get_constraint_for (tree t, VEC (ce_s, h
to be used as the rhs of a constraint. */
static void
-get_constraint_for_rhs (tree t, VEC (ce_s, heap) **results)
+get_constraint_for_rhs (tree t, VEC (ce_s, stack) **results)
{
gcc_assert (VEC_length (ce_s, *results) == 0);
@@ -3380,7 +3383,7 @@ get_constraint_for_rhs (tree t, VEC (ce_
entries in *LHSC. */
static void
-process_all_all_constraints (VEC (ce_s, heap) *lhsc, VEC (ce_s, heap) *rhsc)
+process_all_all_constraints (VEC (ce_s, stack) *lhsc, VEC (ce_s, stack) *rhsc)
{
struct constraint_expr *lhsp, *rhsp;
unsigned i, j;
@@ -3410,8 +3413,9 @@ static void
do_structure_copy (tree lhsop, tree rhsop)
{
struct constraint_expr *lhsp, *rhsp;
- VEC (ce_s, heap) *lhsc = NULL, *rhsc = NULL;
unsigned j;
+ VEC (ce_s, stack) *lhsc = VEC_alloc (ce_s, stack, 32);
+ VEC (ce_s, stack) *rhsc = VEC_alloc (ce_s, stack, 32);
get_constraint_for (lhsop, &lhsc);
get_constraint_for_rhs (rhsop, &rhsc);
@@ -3470,14 +3474,14 @@ do_structure_copy (tree lhsop, tree rhso
else
gcc_unreachable ();
- VEC_free (ce_s, heap, lhsc);
- VEC_free (ce_s, heap, rhsc);
+ VEC_free (ce_s, stack, lhsc);
+ VEC_free (ce_s, stack, rhsc);
}
/* Create constraints ID = { rhsc }. */
static void
-make_constraints_to (unsigned id, VEC(ce_s, heap) *rhsc)
+make_constraints_to (unsigned id, VEC(ce_s, stack) *rhsc)
{
struct constraint_expr *c;
struct constraint_expr includes;
@@ -3496,10 +3500,10 @@ make_constraints_to (unsigned id, VEC(ce
static void
make_constraint_to (unsigned id, tree op)
{
- VEC(ce_s, heap) *rhsc = NULL;
+ VEC(ce_s, stack) *rhsc = VEC_alloc (ce_s, stack, 32);
get_constraint_for_rhs (op, &rhsc);
make_constraints_to (id, rhsc);
- VEC_free (ce_s, heap, rhsc);
+ VEC_free (ce_s, stack, rhsc);
}
/* Create a constraint ID = &FROM. */
@@ -3690,7 +3694,7 @@ get_function_part_constraint (varinfo_t
RHS. */
static void
-handle_rhs_call (gimple stmt, VEC(ce_s, heap) **results)
+handle_rhs_call (gimple stmt, VEC(ce_s, stack) **results)
{
struct constraint_expr rhsc;
unsigned i;
@@ -3744,7 +3748,7 @@ handle_rhs_call (gimple stmt, VEC(ce_s,
rhsc.var = get_call_use_vi (stmt)->id;
rhsc.offset = 0;
rhsc.type = SCALAR;
- VEC_safe_push (ce_s, heap, *results, &rhsc);
+ VEC_safe_push (ce_s, stack, *results, &rhsc);
}
/* The static chain escapes as well. */
@@ -3756,22 +3760,23 @@ handle_rhs_call (gimple stmt, VEC(ce_s,
&& gimple_call_lhs (stmt) != NULL_TREE
&& TREE_ADDRESSABLE (TREE_TYPE (gimple_call_lhs (stmt))))
{
- VEC(ce_s, heap) *tmpc = NULL;
struct constraint_expr lhsc, *c;
+ VEC(ce_s, stack) *tmpc = VEC_alloc (ce_s, stack, 32);
+
get_constraint_for_address_of (gimple_call_lhs (stmt), &tmpc);
lhsc.var = escaped_id;
lhsc.offset = 0;
lhsc.type = SCALAR;
FOR_EACH_VEC_ELT (ce_s, tmpc, i, c)
process_constraint (new_constraint (lhsc, *c));
- VEC_free(ce_s, heap, tmpc);
+ VEC_free(ce_s, stack, tmpc);
}
/* Regular functions return nonlocal memory. */
rhsc.var = nonlocal_id;
rhsc.offset = 0;
rhsc.type = SCALAR;
- VEC_safe_push (ce_s, heap, *results, &rhsc);
+ VEC_safe_push (ce_s, stack, *results, &rhsc);
}
/* For non-IPA mode, generate constraints necessary for a call
@@ -3779,10 +3784,10 @@ handle_rhs_call (gimple stmt, VEC(ce_s,
the LHS point to global and escaped variables. */
static void
-handle_lhs_call (gimple stmt, tree lhs, int flags, VEC(ce_s, heap) *rhsc,
+handle_lhs_call (gimple stmt, tree lhs, int flags, VEC(ce_s, stack) *rhsc,
tree fndecl)
{
- VEC(ce_s, heap) *lhsc = NULL;
+ VEC(ce_s, stack) *lhsc = VEC_alloc (ce_s, stack, 32);
get_constraint_for (lhs, &lhsc);
/* If the store is to a global decl make sure to
@@ -3796,7 +3801,7 @@ handle_lhs_call (gimple stmt, tree lhs,
tmpc.var = escaped_id;
tmpc.offset = 0;
tmpc.type = SCALAR;
- VEC_safe_push (ce_s, heap, lhsc, &tmpc);
+ VEC_safe_push (ce_s, stack, lhsc, &tmpc);
}
/* If the call returns an argument unmodified override the rhs
@@ -3810,7 +3815,7 @@ handle_lhs_call (gimple stmt, tree lhs,
arg = gimple_call_arg (stmt, flags & ERF_RETURN_ARG_MASK);
get_constraint_for (arg, &rhsc);
process_all_all_constraints (lhsc, rhsc);
- VEC_free (ce_s, heap, rhsc);
+ VEC_free (ce_s, stack, rhsc);
}
else if (flags & ERF_NOALIAS)
{
@@ -3831,19 +3836,19 @@ handle_lhs_call (gimple stmt, tree lhs,
tmpc.var = vi->id;
tmpc.offset = 0;
tmpc.type = ADDRESSOF;
- VEC_safe_push (ce_s, heap, rhsc, &tmpc);
+ VEC_safe_push (ce_s, stack, rhsc, &tmpc);
}
process_all_all_constraints (lhsc, rhsc);
- VEC_free (ce_s, heap, lhsc);
+ VEC_free (ce_s, stack, lhsc);
}
/* For non-IPA mode, generate constraints necessary for a call of a
const function that returns a pointer in the statement STMT. */
static void
-handle_const_call (gimple stmt, VEC(ce_s, heap) **results)
+handle_const_call (gimple stmt, VEC(ce_s, stack) **results)
{
struct constraint_expr rhsc;
unsigned int k;
@@ -3858,34 +3863,35 @@ handle_const_call (gimple stmt, VEC(ce_s
rhsc.var = uses->id;
rhsc.offset = 0;
rhsc.type = SCALAR;
- VEC_safe_push (ce_s, heap, *results, &rhsc);
+ VEC_safe_push (ce_s, stack, *results, &rhsc);
}
/* May return arguments. */
for (k = 0; k < gimple_call_num_args (stmt); ++k)
{
tree arg = gimple_call_arg (stmt, k);
- VEC(ce_s, heap) *argc = NULL;
unsigned i;
struct constraint_expr *argp;
+ VEC(ce_s, stack) *argc = VEC_alloc (ce_s, stack, 32);
+
get_constraint_for_rhs (arg, &argc);
FOR_EACH_VEC_ELT (ce_s, argc, i, argp)
- VEC_safe_push (ce_s, heap, *results, argp);
- VEC_free(ce_s, heap, argc);
+ VEC_safe_push (ce_s, stack, *results, argp);
+ VEC_free(ce_s, stack, argc);
}
/* May return addresses of globals. */
rhsc.var = nonlocal_id;
rhsc.offset = 0;
rhsc.type = ADDRESSOF;
- VEC_safe_push (ce_s, heap, *results, &rhsc);
+ VEC_safe_push (ce_s, stack, *results, &rhsc);
}
/* For non-IPA mode, generate constraints necessary for a call to a
pure function in statement STMT. */
static void
-handle_pure_call (gimple stmt, VEC(ce_s, heap) **results)
+handle_pure_call (gimple stmt, VEC(ce_s, stack) **results)
{
struct constraint_expr rhsc;
unsigned i;
@@ -3920,12 +3926,12 @@ handle_pure_call (gimple stmt, VEC(ce_s,
rhsc.var = uses->id;
rhsc.offset = 0;
rhsc.type = SCALAR;
- VEC_safe_push (ce_s, heap, *results, &rhsc);
+ VEC_safe_push (ce_s, stack, *results, &rhsc);
}
rhsc.var = nonlocal_id;
rhsc.offset = 0;
rhsc.type = SCALAR;
- VEC_safe_push (ce_s, heap, *results, &rhsc);
+ VEC_safe_push (ce_s, stack, *results, &rhsc);
}
@@ -3966,9 +3972,9 @@ static bool
find_func_aliases_for_builtin_call (gimple t)
{
tree fndecl = gimple_call_fndecl (t);
- VEC(ce_s, heap) *lhsc = NULL;
- VEC(ce_s, heap) *rhsc = NULL;
varinfo_t fi;
+ VEC(ce_s, stack) *lhsc = VEC_alloc (ce_s, stack, 32);
+ VEC(ce_s, stack) *rhsc = VEC_alloc (ce_s, stack, 32);
if (fndecl != NULL_TREE
&& DECL_BUILT_IN_CLASS (fndecl) == BUILT_IN_NORMAL)
@@ -4007,16 +4013,16 @@ find_func_aliases_for_builtin_call (gimp
else
get_constraint_for (dest, &rhsc);
process_all_all_constraints (lhsc, rhsc);
- VEC_free (ce_s, heap, lhsc);
- VEC_free (ce_s, heap, rhsc);
+ VEC_free (ce_s, stack, lhsc);
+ VEC_free (ce_s, stack, rhsc);
}
get_constraint_for_ptr_offset (dest, NULL_TREE, &lhsc);
get_constraint_for_ptr_offset (src, NULL_TREE, &rhsc);
do_deref (&lhsc);
do_deref (&rhsc);
process_all_all_constraints (lhsc, rhsc);
- VEC_free (ce_s, heap, lhsc);
- VEC_free (ce_s, heap, rhsc);
+ VEC_free (ce_s, stack, lhsc);
+ VEC_free (ce_s, stack, rhsc);
return true;
}
case BUILT_IN_MEMSET:
@@ -4031,8 +4037,8 @@ find_func_aliases_for_builtin_call (gimp
get_constraint_for (res, &lhsc);
get_constraint_for (dest, &rhsc);
process_all_all_constraints (lhsc, rhsc);
- VEC_free (ce_s, heap, lhsc);
- VEC_free (ce_s, heap, rhsc);
+ VEC_free (ce_s, stack, lhsc);
+ VEC_free (ce_s, stack, rhsc);
}
get_constraint_for_ptr_offset (dest, NULL_TREE, &lhsc);
do_deref (&lhsc);
@@ -4050,7 +4056,7 @@ find_func_aliases_for_builtin_call (gimp
ac.offset = 0;
FOR_EACH_VEC_ELT (ce_s, lhsc, i, lhsp)
process_constraint (new_constraint (*lhsp, ac));
- VEC_free (ce_s, heap, lhsc);
+ VEC_free (ce_s, stack, lhsc);
return true;
}
/* All the following functions do not return pointers, do not
@@ -4096,7 +4102,7 @@ find_func_aliases_for_builtin_call (gimp
get_constraint_for (frame, &rhsc);
FOR_EACH_VEC_ELT (ce_s, rhsc, i, rhsp)
process_constraint (new_constraint (lhs, *rhsp));
- VEC_free (ce_s, heap, rhsc);
+ VEC_free (ce_s, stack, rhsc);
/* Make the frame point to the function for
the trampoline adjustment call. */
@@ -4104,8 +4110,8 @@ find_func_aliases_for_builtin_call (gimp
do_deref (&lhsc);
get_constraint_for (nfunc, &rhsc);
process_all_all_constraints (lhsc, rhsc);
- VEC_free (ce_s, heap, rhsc);
- VEC_free (ce_s, heap, lhsc);
+ VEC_free (ce_s, stack, rhsc);
+ VEC_free (ce_s, stack, lhsc);
return true;
}
@@ -4124,8 +4130,8 @@ find_func_aliases_for_builtin_call (gimp
get_constraint_for (tramp, &rhsc);
do_deref (&rhsc);
process_all_all_constraints (lhsc, rhsc);
- VEC_free (ce_s, heap, rhsc);
- VEC_free (ce_s, heap, lhsc);
+ VEC_free (ce_s, stack, rhsc);
+ VEC_free (ce_s, stack, lhsc);
}
return true;
}
@@ -4148,7 +4154,7 @@ find_func_aliases_for_builtin_call (gimp
rhs.type = ADDRESSOF;
FOR_EACH_VEC_ELT (ce_s, lhsc, i, lhsp)
process_constraint (new_constraint (*lhsp, rhs));
- VEC_free (ce_s, heap, lhsc);
+ VEC_free (ce_s, stack, lhsc);
/* va_list is clobbered. */
make_constraint_to (get_call_clobber_vi (t)->id, valist);
return true;
@@ -4193,9 +4199,9 @@ static void
find_func_aliases_for_call (gimple t)
{
tree fndecl = gimple_call_fndecl (t);
- VEC(ce_s, heap) *lhsc = NULL;
- VEC(ce_s, heap) *rhsc = NULL;
varinfo_t fi;
+ VEC(ce_s, stack) *lhsc = VEC_alloc (ce_s, stack, 32);
+ VEC(ce_s, stack) *rhsc = VEC_alloc (ce_s, stack, 32);
if (fndecl != NULL_TREE
&& DECL_BUILT_IN (fndecl)
@@ -4206,7 +4212,7 @@ find_func_aliases_for_call (gimple t)
if (!in_ipa_mode
|| (fndecl && !fi->is_fn_info))
{
- VEC(ce_s, heap) *rhsc = NULL;
+ VEC(ce_s, stack) *rhsc = VEC_alloc (ce_s, stack, 32);
int flags = gimple_call_flags (t);
/* Const functions can return their arguments and addresses
@@ -4225,7 +4231,7 @@ find_func_aliases_for_call (gimple t)
handle_rhs_call (t, &rhsc);
if (gimple_call_lhs (t))
handle_lhs_call (t, gimple_call_lhs (t), flags, rhsc, fndecl);
- VEC_free (ce_s, heap, rhsc);
+ VEC_free (ce_s, stack, rhsc);
}
else
{
@@ -4263,11 +4269,11 @@ find_func_aliases_for_call (gimple t)
&& DECL_RESULT (fndecl)
&& DECL_BY_REFERENCE (DECL_RESULT (fndecl)))
{
- VEC(ce_s, heap) *tem = NULL;
- VEC_safe_push (ce_s, heap, tem, &rhs);
+ VEC(ce_s, stack) *tem = VEC_alloc (ce_s, stack, 32);
+ VEC_safe_push (ce_s, stack, tem, &rhs);
do_deref (&tem);
rhs = *VEC_index (ce_s, tem, 0);
- VEC_free(ce_s, heap, tem);
+ VEC_free(ce_s, stack, tem);
}
FOR_EACH_VEC_ELT (ce_s, lhsc, j, lhsp)
process_constraint (new_constraint (*lhsp, rhs));
@@ -4286,7 +4292,7 @@ find_func_aliases_for_call (gimple t)
lhs = get_function_part_constraint (fi, fi_result);
FOR_EACH_VEC_ELT (ce_s, rhsc, j, rhsp)
process_constraint (new_constraint (lhs, *rhsp));
- VEC_free (ce_s, heap, rhsc);
+ VEC_free (ce_s, stack, rhsc);
}
/* If we use a static chain, pass it along. */
@@ -4312,10 +4318,10 @@ static void
find_func_aliases (gimple origt)
{
gimple t = origt;
- VEC(ce_s, heap) *lhsc = NULL;
- VEC(ce_s, heap) *rhsc = NULL;
struct constraint_expr *c;
varinfo_t fi;
+ VEC(ce_s, stack) *lhsc = VEC_alloc (ce_s, stack, 32);
+ VEC(ce_s, stack) *rhsc = VEC_alloc (ce_s, stack, 32);
/* Now build constraints expressions. */
if (gimple_code (t) == GIMPLE_PHI)
@@ -4395,18 +4401,19 @@ find_func_aliases (gimple origt)
else
{
/* All other operations are merges. */
- VEC (ce_s, heap) *tmp = NULL;
struct constraint_expr *rhsp;
unsigned i, j;
+ VEC (ce_s, stack) *tmp = VEC_alloc (ce_s, stack, 32);
+
get_constraint_for_rhs (gimple_assign_rhs1 (t), &rhsc);
for (i = 2; i < gimple_num_ops (t); ++i)
{
get_constraint_for_rhs (gimple_op (t, i), &tmp);
FOR_EACH_VEC_ELT (ce_s, tmp, j, rhsp)
- VEC_safe_push (ce_s, heap, rhsc, rhsp);
+ VEC_safe_push (ce_s, stack, rhsc, rhsp);
VEC_truncate (ce_s, tmp, 0);
}
- VEC_free (ce_s, heap, tmp);
+ VEC_free (ce_s, stack, tmp);
}
process_all_all_constraints (lhsc, rhsc);
}
@@ -4477,16 +4484,17 @@ find_func_aliases (gimple origt)
any global memory. */
if (op)
{
- VEC(ce_s, heap) *lhsc = NULL;
struct constraint_expr rhsc, *lhsp;
unsigned j;
+ VEC(ce_s, stack) *lhsc = VEC_alloc (ce_s, stack, 32);
+
get_constraint_for (op, &lhsc);
rhsc.var = nonlocal_id;
rhsc.offset = 0;
rhsc.type = SCALAR;
FOR_EACH_VEC_ELT (ce_s, lhsc, j, lhsp)
process_constraint (new_constraint (*lhsp, rhsc));
- VEC_free (ce_s, heap, lhsc);
+ VEC_free (ce_s, stack, lhsc);
}
}
for (i = 0; i < gimple_asm_ninputs (t); ++i)
@@ -4510,8 +4518,8 @@ find_func_aliases (gimple origt)
}
}
- VEC_free (ce_s, heap, rhsc);
- VEC_free (ce_s, heap, lhsc);
+ VEC_free (ce_s, stack, rhsc);
+ VEC_free (ce_s, stack, lhsc);
}
@@ -4521,14 +4529,15 @@ find_func_aliases (gimple origt)
static void
process_ipa_clobber (varinfo_t fi, tree ptr)
{
- VEC(ce_s, heap) *ptrc = NULL;
struct constraint_expr *c, lhs;
unsigned i;
+ VEC(ce_s, stack) *ptrc = VEC_alloc (ce_s, stack, 32);
+
get_constraint_for_rhs (ptr, &ptrc);
lhs = get_function_part_constraint (fi, fi_clobbers);
FOR_EACH_VEC_ELT (ce_s, ptrc, i, c)
process_constraint (new_constraint (lhs, *c));
- VEC_free (ce_s, heap, ptrc);
+ VEC_free (ce_s, stack, ptrc);
}
/* Walk statement T setting up clobber and use constraints according to the
@@ -4539,9 +4548,9 @@ static void
find_func_clobbers (gimple origt)
{
gimple t = origt;
- VEC(ce_s, heap) *lhsc = NULL;
- VEC(ce_s, heap) *rhsc = NULL;
varinfo_t fi;
+ VEC(ce_s, stack) *lhsc = VEC_alloc (ce_s, stack, 32);
+ VEC(ce_s, stack) *rhsc = VEC_alloc (ce_s, stack, 32);
/* Add constraints for clobbered/used in IPA mode.
We are not interested in what automatic variables are clobbered
@@ -4579,7 +4588,7 @@ find_func_clobbers (gimple origt)
get_constraint_for_address_of (lhs, &rhsc);
FOR_EACH_VEC_ELT (ce_s, rhsc, i, rhsp)
process_constraint (new_constraint (lhsc, *rhsp));
- VEC_free (ce_s, heap, rhsc);
+ VEC_free (ce_s, stack, rhsc);
}
}
@@ -4607,7 +4616,7 @@ find_func_clobbers (gimple origt)
get_constraint_for_address_of (rhs, &rhsc);
FOR_EACH_VEC_ELT (ce_s, rhsc, i, rhsp)
process_constraint (new_constraint (lhs, *rhsp));
- VEC_free (ce_s, heap, rhsc);
+ VEC_free (ce_s, stack, rhsc);
}
}
@@ -4647,12 +4656,12 @@ find_func_clobbers (gimple origt)
lhs = get_function_part_constraint (fi, fi_clobbers);
FOR_EACH_VEC_ELT (ce_s, lhsc, i, lhsp)
process_constraint (new_constraint (lhs, *lhsp));
- VEC_free (ce_s, heap, lhsc);
+ VEC_free (ce_s, stack, lhsc);
get_constraint_for_ptr_offset (src, NULL_TREE, &rhsc);
lhs = get_function_part_constraint (fi, fi_uses);
FOR_EACH_VEC_ELT (ce_s, rhsc, i, rhsp)
process_constraint (new_constraint (lhs, *rhsp));
- VEC_free (ce_s, heap, rhsc);
+ VEC_free (ce_s, stack, rhsc);
return;
}
/* The following function clobbers memory pointed to by
@@ -4666,7 +4675,7 @@ find_func_clobbers (gimple origt)
lhs = get_function_part_constraint (fi, fi_clobbers);
FOR_EACH_VEC_ELT (ce_s, lhsc, i, lhsp)
process_constraint (new_constraint (lhs, *lhsp));
- VEC_free (ce_s, heap, lhsc);
+ VEC_free (ce_s, stack, lhsc);
return;
}
/* The following functions clobber their second and third
@@ -4735,7 +4744,7 @@ find_func_clobbers (gimple origt)
get_constraint_for_address_of (arg, &rhsc);
FOR_EACH_VEC_ELT (ce_s, rhsc, j, rhsp)
process_constraint (new_constraint (lhs, *rhsp));
- VEC_free (ce_s, heap, rhsc);
+ VEC_free (ce_s, stack, rhsc);
}
/* Build constraints for propagating clobbers/uses along the
@@ -4796,7 +4805,7 @@ find_func_clobbers (gimple origt)
anything_id);
}
- VEC_free (ce_s, heap, rhsc);
+ VEC_free (ce_s, stack, rhsc);
}
@@ -5436,9 +5445,10 @@ create_variable_info_for (tree decl, con
if (in_ipa_mode
&& DECL_INITIAL (decl))
{
- VEC (ce_s, heap) *rhsc = NULL;
struct constraint_expr lhs, *rhsp;
unsigned i;
+ VEC (ce_s, stack) *rhsc = VEC_alloc (ce_s, stack, 32);
+
get_constraint_for_rhs (DECL_INITIAL (decl), &rhsc);
lhs.var = vi->id;
lhs.offset = 0;
@@ -5455,7 +5465,7 @@ create_variable_info_for (tree decl, con
FOR_EACH_VEC_ELT (ce_s, rhsc, i, rhsp)
process_constraint (new_constraint (lhs, *rhsp));
}
- VEC_free (ce_s, heap, rhsc);
+ VEC_free (ce_s, stack, rhsc);
}
}
=== modified file 'gcc/tree-ssa-sccvn.c'
--- gcc/tree-ssa-sccvn.c 2011-05-02 13:11:27 +0000
+++ gcc/tree-ssa-sccvn.c 2011-08-19 12:50:47 +0000
@@ -2221,7 +2221,7 @@ vn_phi_insert (gimple phi, tree result)
/* Print set of components in strongly connected component SCC to OUT. */
static void
-print_scc (FILE *out, VEC (tree, heap) *scc)
+print_scc (FILE *out, VEC (tree, stack) *scc)
{
tree var;
unsigned int i;
@@ -3203,7 +3203,7 @@ compare_ops (const void *pa, const void
array will give you the members in RPO order. */
static void
-sort_scc (VEC (tree, heap) *scc)
+sort_scc (VEC (tree, stack) *scc)
{
VEC_qsort (tree, scc, compare_ops);
}
@@ -3254,7 +3254,7 @@ copy_reference (vn_reference_t oref, vn_
/* Process a strongly connected component in the SSA graph. */
static void
-process_scc (VEC (tree, heap) *scc)
+process_scc (VEC (tree, stack) *scc)
{
tree var;
unsigned int i;
@@ -3334,8 +3334,8 @@ DEF_VEC_ALLOC_O(ssa_op_iter,heap);
static bool
extract_and_process_scc_for_name (tree name)
{
- VEC (tree, heap) *scc = NULL;
tree x;
+ VEC (tree, stack) *scc = VEC_alloc (tree, stack, 16);
/* Found an SCC, pop the components off the SCC stack and
process them. */
@@ -3344,7 +3344,7 @@ extract_and_process_scc_for_name (tree n
x = VEC_pop (tree, sccstack);
VN_INFO (x)->on_sccstack = false;
- VEC_safe_push (tree, heap, scc, x);
+ VEC_safe_push (tree, stack, scc, x);
} while (x != name);
/* Bail out of SCCVN in case a SCC turns out to be incredibly large. */
@@ -3366,7 +3366,7 @@ extract_and_process_scc_for_name (tree n
process_scc (scc);
- VEC_free (tree, heap, scc);
+ VEC_free (tree, stack, scc);
return true;
}
=== modified file 'gcc/vecir.h'
--- gcc/vecir.h 2010-05-15 19:02:11 +0000
+++ gcc/vecir.h 2011-08-19 12:50:47 +0000
@@ -28,6 +28,9 @@ along with GCC; see the file COPYING3.
DEF_VEC_P(tree);
DEF_VEC_ALLOC_P(tree,gc);
DEF_VEC_ALLOC_P(tree,heap);
+DEF_VEC_ALLOC_P_STACK(tree);
+#define VEC_tree_stack_alloc(alloc) \
+ VEC_stack_alloc (tree, alloc)
/* A varray of gimple statements. */
DEF_VEC_P(gimple);
^ permalink raw reply [flat|nested] 22+ messages in thread
* tree-ssa-structalias.c: alloc_pool for struct equiv_class_label
2011-08-22 7:50 Various minor speed-ups Dimitrios Apostolou
` (2 preceding siblings ...)
2011-08-22 7:59 ` tree-ssa*: reduce malloc() calls by preallocating hot VECs on the stack Dimitrios Apostolou
@ 2011-08-22 8:05 ` Dimitrios Apostolou
2011-08-22 9:01 ` Dimitrios Apostolou
2011-08-22 10:25 ` Richard Guenther
2011-08-22 8:43 ` Various minor speed-ups Dimitrios Apostolou
` (2 subsequent siblings)
6 siblings, 2 replies; 22+ messages in thread
From: Dimitrios Apostolou @ 2011-08-22 8:05 UTC (permalink / raw)
To: Dimitrios Apostolou; +Cc: gcc-patches, Steven Bosscher
2011-08-22 Dimitrios Apostolou <jimis@gmx.net>
* tree-ssa-structalias.c (equiv_class_add)
(perform_var_substitution, free_var_substitution_info): Created a
new equiv_class_pool allocator pool for struct
equiv_class_label. Changed the pointer_equiv_class_table and
location_equiv_class_table hash tables to not iterate freeing all
elements in the end, but just free the pool.
* Re: mem_attrs_htab
2011-08-22 7:53 ` mem_attrs_htab Dimitrios Apostolou
@ 2011-08-22 8:37 ` Jakub Jelinek
2011-08-22 9:39 ` mem_attrs_htab Dimitrios Apostolou
0 siblings, 1 reply; 22+ messages in thread
From: Jakub Jelinek @ 2011-08-22 8:37 UTC (permalink / raw)
To: Dimitrios Apostolou; +Cc: gcc-patches, Steven Bosscher
On Mon, Aug 22, 2011 at 10:32:35AM +0300, Dimitrios Apostolou wrote:
> --- gcc/emit-rtl.c 2011-05-29 17:40:05 +0000
> +++ gcc/emit-rtl.c 2011-08-21 04:44:25 +0000
> @@ -256,11 +256,10 @@ mem_attrs_htab_hash (const void *x)
> {
> const mem_attrs *const p = (const mem_attrs *) x;
>
> - return (p->alias ^ (p->align * 1000)
> - ^ (p->addrspace * 4000)
> - ^ ((p->offset ? INTVAL (p->offset) : 0) * 50000)
> - ^ ((p->size ? INTVAL (p->size) : 0) * 2500000)
> - ^ (size_t) iterative_hash_expr (p->expr, 0));
> + /* By massively feeding the mem_attrs struct to iterative_hash() we
> + disregard the p->offset and p->size rtx, but in total the hash is
> + quick and good enough. */
> + return iterative_hash_object (*p, iterative_hash_expr (p->expr, 0));
> }
>
> /* Returns nonzero if the value represented by X (which is really a
This patch isn't against the trunk, where p->offset and p->size aren't rtxes
anymore, but HOST_WIDE_INTs. Furthermore, it is a bad idea to hash
the p->expr address itself, it doesn't make any sense to hash on what
p->expr points to in that case. And p->offset and p->size should be ignored
if the *known_p corresponding fields are false. So, if you really think
using iterative_hash_object is a win, it should be something like:
mem_attrs q = *p;
q.expr = NULL;
if (!q.offset_known_p) q.offset = 0;
if (!q.size_known_p) q.size = 0;
return iterative_hash_object (q, iterative_hash_expr (p->expr, 0));
(or better yet avoid q.expr = NULL and instead start hashing from the next
field after expr). Hashing the struct padding might not be a good idea
either.
Jakub
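The suggested scheme can be sketched standalone. The struct below is a simplified stand-in for trunk's mem_attrs, and the FNV-1a loop stands in for GCC's iterative_hash, so the layout, field names and hash function are all illustrative, not the real definitions:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Simplified stand-in for trunk mem_attrs: expr first, then the
   HOST_WIDE_INT-like fields and their *known_p flags.  */
typedef struct {
  void *expr;     /* hash its contents via iterative_hash_expr, never its address */
  long offset;    /* meaningful only if offset_known_p */
  long size;      /* meaningful only if size_known_p */
  int offset_known_p;
  int size_known_p;
  unsigned align;
} attrs;

/* FNV-1a, standing in for iterative_hash (addr, len, initval).  */
static unsigned iter_hash (const void *p, size_t len, unsigned h)
{
  const unsigned char *c = (const unsigned char *) p;
  h ^= 2166136261u;
  while (len--) { h ^= *c++; h *= 16777619u; }
  return h;
}

/* The suggestion above: copy the struct, zero the fields that must be
   ignored (and the padding bytes), then hash from the field after expr
   so the pointer value never enters the hash.  */
static unsigned attrs_hash (const attrs *p)
{
  attrs q;
  memset (&q, 0, sizeof q);     /* also zeroes any padding */
  if (p->offset_known_p) { q.offset_known_p = 1; q.offset = p->offset; }
  if (p->size_known_p) { q.size_known_p = 1; q.size = p->size; }
  q.align = p->align;
  return iter_hash ((const char *) &q + offsetof (attrs, offset),
                    sizeof q - offsetof (attrs, offset),
                    0 /* really iterative_hash_expr (p->expr, 0) */);
}
```

With this, two attrs that agree on the meaningful fields hash identically even when their expr pointers (or the ignored size field) differ.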
* Re: Various minor speed-ups
2011-08-22 7:50 Various minor speed-ups Dimitrios Apostolou
` (3 preceding siblings ...)
2011-08-22 8:05 ` tree-ssa-structalias.c: alloc_pool for struct equiv_class_label Dimitrios Apostolou
@ 2011-08-22 8:43 ` Dimitrios Apostolou
2011-08-22 11:01 ` Richard Guenther
2011-08-22 9:44 ` Dimitrios Apostolou
2011-08-22 9:50 ` cse.c: preferable() Dimitrios Apostolou
6 siblings, 1 reply; 22+ messages in thread
From: Dimitrios Apostolou @ 2011-08-22 8:43 UTC (permalink / raw)
To: Dimitrios Apostolou; +Cc: gcc-patches, Steven Bosscher
[-- Attachment #1: Type: TEXT/PLAIN, Size: 200 bytes --]
2011-08-22 Dimitrios Apostolou <jimis@gmx.net>
* tree-ssa-pre.c (phi_trans_add, init_pre, fini_pre): Added a pool
for phi_translate_table elements to avoid free() calls from
htab_delete().
[-- Attachment #2: Type: TEXT/plain, Size: 1983 bytes --]
=== modified file 'gcc/tree-ssa-pre.c'
--- gcc/tree-ssa-pre.c 2011-05-04 09:04:53 +0000
+++ gcc/tree-ssa-pre.c 2011-08-17 08:43:23 +0000
@@ -515,6 +515,10 @@ typedef struct expr_pred_trans_d
} *expr_pred_trans_t;
typedef const struct expr_pred_trans_d *const_expr_pred_trans_t;
+/* Pool of memory for the above */
+
+static alloc_pool phi_translate_pool;
+
/* Return the hash value for a phi translation table entry. */
static hashval_t
@@ -571,7 +575,8 @@ static inline void
phi_trans_add (pre_expr e, pre_expr v, basic_block pred)
{
void **slot;
- expr_pred_trans_t new_pair = XNEW (struct expr_pred_trans_d);
+ expr_pred_trans_t new_pair
+ = (expr_pred_trans_t) pool_alloc (phi_translate_pool);
new_pair->e = e;
new_pair->pred = pred;
new_pair->v = v;
@@ -580,7 +585,8 @@ phi_trans_add (pre_expr e, pre_expr v, b
slot = htab_find_slot_with_hash (phi_translate_table, new_pair,
new_pair->hashcode, INSERT);
- free (*slot);
+ if (*slot)
+ pool_free (phi_translate_pool, *slot);
*slot = (void *) new_pair;
}
@@ -4804,8 +4810,12 @@ init_pre (bool do_fre)
calculate_dominance_info (CDI_DOMINATORS);
bitmap_obstack_initialize (&grand_bitmap_obstack);
+ phi_translate_pool = create_alloc_pool ("phi_translate_table pool",
+ sizeof (struct expr_pred_trans_d),
+ 4096);
+ /* NULL as free because we'll free the whole pool in the end. */
phi_translate_table = htab_create (5110, expr_pred_trans_hash,
- expr_pred_trans_eq, free);
+ expr_pred_trans_eq, NULL);
expression_to_id = htab_create (num_ssa_names * 3,
pre_expr_hash,
pre_expr_eq, NULL);
@@ -4839,6 +4849,7 @@ fini_pre (bool do_fre)
free_alloc_pool (bitmap_set_pool);
free_alloc_pool (pre_expr_pool);
htab_delete (phi_translate_table);
+ free_alloc_pool (phi_translate_pool);
htab_delete (expression_to_id);
VEC_free (unsigned, heap, name_to_id);
* Re: graphds.[ch]: alloc_pool for edges
2011-08-22 7:53 ` graphds.[ch]: alloc_pool for edges Dimitrios Apostolou
@ 2011-08-22 8:46 ` Jakub Jelinek
2011-08-22 10:11 ` Richard Guenther
1 sibling, 0 replies; 22+ messages in thread
From: Jakub Jelinek @ 2011-08-22 8:46 UTC (permalink / raw)
To: Dimitrios Apostolou; +Cc: gcc-patches, Steven Bosscher
On Mon, Aug 22, 2011 at 10:37:58AM +0300, Dimitrios Apostolou wrote:
> --- gcc/graphds.h 2009-02-20 15:20:38 +0000
> +++ gcc/graphds.h 2011-08-19 16:44:41 +0000
> @@ -18,6 +18,10 @@ You should have received a copy of the G
> along with GCC; see the file COPYING3. If not see
> <http://www.gnu.org/licenses/>. */
>
> +
> +#include "alloc-pool.h"
> +
> +
This needs to be reflected in Makefile.in; we unfortunately don't have
automatic dependency generation for gcc.
So, create
GRAPHDS_H = graphds.h alloc-pool.h
and use $(GRAPHDS_H) everywhere where graphds.h has been used so far.
Jakub
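Concretely, the Makefile.in change would look something like this. Only the GRAPHDS_H variable is as prescribed above; the example prerequisite lists are illustrative, not copied from the real Makefile.in:

```make
# New variable bundling graphds.h with the header it now includes:
GRAPHDS_H = graphds.h alloc-pool.h

# ...then every rule that listed plain graphds.h uses the variable, e.g.:
graphds.o : graphds.c $(CONFIG_H) $(SYSTEM_H) coretypes.h $(GRAPHDS_H)
```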
* Re: tree-ssa-structalias.c: alloc_pool for struct equiv_class_label
2011-08-22 8:05 ` tree-ssa-structalias.c: alloc_pool for struct equiv_class_label Dimitrios Apostolou
@ 2011-08-22 9:01 ` Dimitrios Apostolou
2011-08-22 10:25 ` Richard Guenther
1 sibling, 0 replies; 22+ messages in thread
From: Dimitrios Apostolou @ 2011-08-22 9:01 UTC (permalink / raw)
To: Dimitrios Apostolou; +Cc: gcc-patches, Steven Bosscher
[-- Attachment #1: Type: TEXT/PLAIN, Size: 475 bytes --]
Forgot the patch...
On Mon, 22 Aug 2011, Dimitrios Apostolou wrote:
>
> 2011-08-22 Dimitrios Apostolou <jimis@gmx.net>
>
> * tree-ssa-structalias.c (equiv_class_add)
> (perform_var_substitution, free_var_substitution_info): Created a
> new equiv_class_pool allocator pool for struct
> equiv_class_label. Changed the pointer_equiv_class_table and
> location_equiv_class_table hash tables to not iterate freeing all
> elements in the end, but just free the pool.
>
>
>
[-- Attachment #2: Type: TEXT/plain, Size: 1816 bytes --]
=== modified file 'gcc/tree-ssa-structalias.c'
--- gcc/tree-ssa-structalias.c 2011-04-29 10:59:33 +0000
+++ gcc/tree-ssa-structalias.c 2011-08-18 06:53:12 +0000
@@ -1899,6 +1899,9 @@ static htab_t pointer_equiv_class_table;
classes. */
static htab_t location_equiv_class_table;
+/* Pool of memory for storing the above */
+static alloc_pool equiv_class_pool;
+
/* Hash function for a equiv_class_label_t */
static hashval_t
@@ -1948,7 +1951,8 @@ equiv_class_add (htab_t table, unsigned
bitmap labels)
{
void **slot;
- equiv_class_label_t ecl = XNEW (struct equiv_class_label);
+ equiv_class_label_t ecl
+ = (equiv_class_label_t) pool_alloc (equiv_class_pool);
ecl->labels = labels;
ecl->equivalence_class = equivalence_class;
@@ -2159,10 +2163,14 @@ perform_var_substitution (constraint_gra
struct scc_info *si = init_scc_info (size);
bitmap_obstack_initialize (&iteration_obstack);
+ equiv_class_pool = create_alloc_pool ("equiv_class_label pool",
+ sizeof (struct equiv_class_label),
+ 64);
+ /* NULL free function, we'll free the whole pool at the end of the pass. */
pointer_equiv_class_table = htab_create (511, equiv_class_label_hash,
- equiv_class_label_eq, free);
+ equiv_class_label_eq, NULL);
location_equiv_class_table = htab_create (511, equiv_class_label_hash,
- equiv_class_label_eq, free);
+ equiv_class_label_eq, NULL);
pointer_equiv_class = 1;
location_equiv_class = 1;
@@ -2269,6 +2277,7 @@ free_var_substitution_info (struct scc_i
sbitmap_free (graph->direct_nodes);
htab_delete (pointer_equiv_class_table);
htab_delete (location_equiv_class_table);
+ free_alloc_pool (equiv_class_pool);
bitmap_obstack_release (&iteration_obstack);
}
* Re: mem_attrs_htab
2011-08-22 8:37 ` mem_attrs_htab Jakub Jelinek
@ 2011-08-22 9:39 ` Dimitrios Apostolou
2011-08-22 9:43 ` mem_attrs_htab Jakub Jelinek
0 siblings, 1 reply; 22+ messages in thread
From: Dimitrios Apostolou @ 2011-08-22 9:39 UTC (permalink / raw)
To: Jakub Jelinek; +Cc: Dimitrios Apostolou, gcc-patches, Steven Bosscher
Hi Jakub,
I forgot to mention that all patches are against mid-July trunk; I was
hoping I'd have no conflicts. Anyway, thanks for letting me know. If there
are conflicts with my other patches, please let me know and I'll post an
updated version at a later date.
All your other concerns are valid and I'll try addressing them in the
future. I didn't like hashing addresses either, and I was surprised I saw
no regressions.
Dimitris
>
> This patch isn't against the trunk, where p->offset and p->size aren't rtxes
> anymore, but HOST_WIDE_INTs. Furthermore, it is a bad idea to hash
> the p->expr address itself, it doesn't make any sense to hash on what
> p->expr points to in that case. And p->offset and p->size should be ignored
> if the *known_p corresponding fields are false. So, if you really think
> using iterative_hash_object is a win, it should be something like:
> mem_attrs q = *p;
> q.expr = NULL;
> if (!q.offset_known_p) q.offset = 0;
> if (!q.size_known_p) q.size = 0;
> return iterative_hash_object (q, iterative_hash_expr (p->expr, 0));
> (or better yet avoid q.expr = NULL and instead start hashing from the next
> field after expr). Hashing the struct padding might not be a good idea
> either.
>
> Jakub
>
* Re: mem_attrs_htab
2011-08-22 9:39 ` mem_attrs_htab Dimitrios Apostolou
@ 2011-08-22 9:43 ` Jakub Jelinek
2011-08-22 10:45 ` mem_attrs_htab Richard Guenther
0 siblings, 1 reply; 22+ messages in thread
From: Jakub Jelinek @ 2011-08-22 9:43 UTC (permalink / raw)
To: Dimitrios Apostolou; +Cc: gcc-patches, Steven Bosscher
On Mon, Aug 22, 2011 at 10:58:48AM +0300, Dimitrios Apostolou wrote:
> the future. I didn't like hashing addresses either, and I was
> surprised I saw no regressions.
Hashing on the expr address as well just results in less sharing
in the hash table (i.e. when the expr has a different address but is
considered equal). The hashing of mem attrs is done just to reduce memory
overhead.
Jakub
* Re: Various minor speed-ups
2011-08-22 7:50 Various minor speed-ups Dimitrios Apostolou
` (4 preceding siblings ...)
2011-08-22 8:43 ` Various minor speed-ups Dimitrios Apostolou
@ 2011-08-22 9:44 ` Dimitrios Apostolou
2011-08-22 9:50 ` cse.c: preferable() Dimitrios Apostolou
6 siblings, 0 replies; 22+ messages in thread
From: Dimitrios Apostolou @ 2011-08-22 9:44 UTC (permalink / raw)
To: Dimitrios Apostolou; +Cc: gcc-patches, Steven Bosscher
[-- Attachment #1: Type: TEXT/PLAIN, Size: 425 bytes --]
For whoever is concerned about memory usage: I didn't measure a real
increase besides a few KB. These are very hot allocation pools, and
allocating so many blocks of only 10 elements each is suboptimal.
2011-08-22 Dimitrios Apostolou <jimis@gmx.net>
* cselib.c (cselib_init): Increased initial size of elt_list_pool,
elt_loc_list_pool, cselib_val_pool, value_pool allocation pools
since they are very frequently used.
[-- Attachment #2: Type: TEXT/plain, Size: 912 bytes --]
=== modified file 'gcc/cselib.c'
--- gcc/cselib.c 2011-05-31 19:14:21 +0000
+++ gcc/cselib.c 2011-08-17 14:03:56 +0000
@@ -2484,12 +2484,12 @@ void
cselib_init (int record_what)
{
elt_list_pool = create_alloc_pool ("elt_list",
- sizeof (struct elt_list), 10);
+ sizeof (struct elt_list), 128);
elt_loc_list_pool = create_alloc_pool ("elt_loc_list",
- sizeof (struct elt_loc_list), 10);
+ sizeof (struct elt_loc_list), 128);
cselib_val_pool = create_alloc_pool ("cselib_val_list",
- sizeof (cselib_val), 10);
- value_pool = create_alloc_pool ("value", RTX_CODE_SIZE (VALUE), 100);
+ sizeof (cselib_val), 128);
+ value_pool = create_alloc_pool ("value", RTX_CODE_SIZE (VALUE), 128);
cselib_record_memory = record_what & CSELIB_RECORD_MEMORY;
cselib_preserve_constants = record_what & CSELIB_PRESERVE_CONSTANTS;
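The trade-off behind these block sizes can be seen with a toy pool (hypothetical code, far simpler than GCC's alloc-pool.c): a pool requests memory one block at a time, so 10-element blocks cost one malloc() per 10 objects, while 128-element blocks amortise the calls much further:

```c
#include <assert.h>
#include <stdlib.h>

/* Toy fixed-object-size pool: objects are carved out of big blocks, so
   malloc() is hit once per block_nobjs objects instead of once per
   object.  A sketch only -- GCC's alloc-pool.c also keeps a free list
   and releases the blocks in free_alloc_pool().  */
typedef struct toy_pool {
  size_t obj_size;     /* size of each object */
  size_t block_nobjs;  /* objects per malloc'd block */
  size_t nleft;        /* objects still unused in the current block */
  char *next;          /* next unused object in the current block */
  size_t nmallocs;     /* how often malloc() was hit, for the demo */
} toy_pool;

static void toy_pool_init (toy_pool *p, size_t obj_size, size_t block_nobjs)
{
  p->obj_size = obj_size;
  p->block_nobjs = block_nobjs;
  p->nleft = 0;
  p->next = NULL;
  p->nmallocs = 0;
}

static void *toy_pool_alloc (toy_pool *p)
{
  void *obj;
  if (p->nleft == 0)
    {
      /* Blocks are never freed in this toy.  */
      p->next = (char *) malloc (p->obj_size * p->block_nobjs);
      p->nleft = p->block_nobjs;
      p->nmallocs++;
    }
  obj = p->next;
  p->next += p->obj_size;
  p->nleft--;
  return obj;
}
```

For 1000 allocations of 16-byte objects, a block size of 10 costs 100 malloc() calls; a block size of 128 costs 8.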
* cse.c: preferable()
2011-08-22 7:50 Various minor speed-ups Dimitrios Apostolou
` (5 preceding siblings ...)
2011-08-22 9:44 ` Dimitrios Apostolou
@ 2011-08-22 9:50 ` Dimitrios Apostolou
6 siblings, 0 replies; 22+ messages in thread
From: Dimitrios Apostolou @ 2011-08-22 9:50 UTC (permalink / raw)
To: Dimitrios Apostolou; +Cc: gcc-patches, Steven Bosscher, christophe.jaillet
[-- Attachment #1: Type: TEXT/PLAIN, Size: 350 bytes --]
The attached patch is also posted at bug #19832 and I think it resolves
it, as well as /maybe/ offering a negligible speed-up of 3-4 M
instructions, or a couple of milliseconds. I also post it here for
comments.
2011-08-13 Dimitrios Apostolou <jimis@gmx.net>
* cse.c (preferable): Make it more readable and slightly faster,
without affecting its logic.
[-- Attachment #2: Type: TEXT/plain, Size: 1585 bytes --]
=== modified file 'gcc/cse.c'
--- gcc/cse.c 2011-06-02 21:52:46 +0000
+++ gcc/cse.c 2011-08-13 00:54:06 +0000
@@ -720,32 +720,25 @@ approx_reg_cost (rtx x)
static int
preferable (int cost_a, int regcost_a, int cost_b, int regcost_b)
{
- /* First, get rid of cases involving expressions that are entirely
- unwanted. */
- if (cost_a != cost_b)
- {
- if (cost_a == MAX_COST)
- return 1;
- if (cost_b == MAX_COST)
- return -1;
- }
+ int cost_diff = cost_a - cost_b;
+ int regcost_diff = regcost_a - regcost_b;
- /* Avoid extending lifetimes of hardregs. */
- if (regcost_a != regcost_b)
+ if (cost_diff != 0)
{
- if (regcost_a == MAX_COST)
- return 1;
- if (regcost_b == MAX_COST)
- return -1;
+ /* If none of the expressions are entirely unwanted */
+ if ((cost_a != MAX_COST) && (cost_b != MAX_COST)
+ /* AND only one of the regs is HARD_REG */
+ && (regcost_diff != 0)
+ && ((regcost_a == MAX_COST) || (regcost_b == MAX_COST))
+ )
+ /* Then avoid extending lifetime of HARD_REG */
+ return regcost_diff;
+
+ return cost_diff;
}
- /* Normal operation costs take precedence. */
- if (cost_a != cost_b)
- return cost_a - cost_b;
- /* Only if these are identical consider effects on register pressure. */
- if (regcost_a != regcost_b)
- return regcost_a - regcost_b;
- return 0;
+ /* cost_a == costb, consider effects on register pressure */
+ return regcost_diff;
}
/* Internal function, to compute cost when X is not a register; called
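As a sanity check on the rewrite, the two versions can be compared side by side in a standalone harness (the MAX_COST value and the harness itself are illustrative, not GCC code). The two versions agree in sign over the whole grid below, and sign is all that preferable()'s callers use:

```c
#include <assert.h>

#define MAX_COST 1000  /* illustrative; GCC's real value is much larger */

/* preferable () as it was before the patch.  */
static int preferable_old (int cost_a, int regcost_a, int cost_b, int regcost_b)
{
  if (cost_a != cost_b)
    {
      if (cost_a == MAX_COST) return 1;
      if (cost_b == MAX_COST) return -1;
    }
  if (regcost_a != regcost_b)
    {
      if (regcost_a == MAX_COST) return 1;
      if (regcost_b == MAX_COST) return -1;
    }
  if (cost_a != cost_b) return cost_a - cost_b;
  if (regcost_a != regcost_b) return regcost_a - regcost_b;
  return 0;
}

/* preferable () as rewritten by the patch.  */
static int preferable_new (int cost_a, int regcost_a, int cost_b, int regcost_b)
{
  int cost_diff = cost_a - cost_b;
  int regcost_diff = regcost_a - regcost_b;
  if (cost_diff != 0)
    {
      if (cost_a != MAX_COST && cost_b != MAX_COST
          && regcost_diff != 0
          && (regcost_a == MAX_COST || regcost_b == MAX_COST))
        return regcost_diff;
      return cost_diff;
    }
  return regcost_diff;
}

static int sign (int x) { return (x > 0) - (x < 0); }
```

Note the old version may return ±1 where the new one returns an arbitrary positive or negative difference; only the sign is preserved, which is what the callers compare against zero.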
* Re: tree-ssa*: reduce malloc() calls by preallocating hot VECs on the stack
2011-08-22 7:59 ` tree-ssa*: reduce malloc() calls by preallocating hot VECs on the stack Dimitrios Apostolou
@ 2011-08-22 10:07 ` Richard Guenther
0 siblings, 0 replies; 22+ messages in thread
From: Richard Guenther @ 2011-08-22 10:07 UTC (permalink / raw)
To: Dimitrios Apostolou; +Cc: gcc-patches, Steven Bosscher
On Mon, Aug 22, 2011 at 9:43 AM, Dimitrios Apostolou <jimis@gmx.net> wrote:
>
> 2011-08-22 Dimitrios Apostolou <jimis@gmx.net>
>
> Allocate some very frequently used vectors on the stack:
> * vecir.h: Defined a tree vector on the stack.
> * tree-ssa-sccvn.c (print_scc, sort_scc, process_scc)
> (extract_and_process_scc_for_name): Allocate the scc vector on the
> stack instead of the heap, giving it a minimal initial size
> instead of 0.
> * tree-ssa-structalias.c (get_constraint_for_1)
> (get_constraint_for, get_constraint_for_rhs, do_deref)
> (get_constraint_for_ssa_var, get_constraint_for_ptr_offset)
> (get_constraint_for_component_ref, get_constraint_for_address_of)
> (process_all_all_constraints, do_structure_copy)
> (make_constraints_to, make_constraint_to, handle_rhs_call)
> (handle_lhs_call, handle_const_call, handle_pure_call)
> (find_func_aliases_for_builtin_call, find_func_aliases_for_call)
> (find_func_aliases, process_ipa_clobber, find_func_clobbers)
> (create_variable_info_for): Converted the rhsc, lhsc vectors from
> heap to stack, with a minimal initial size, since they were very
> frequently allocated.
Ok if bootstrapped and tested ok - please always state how you tested
a patch.
Thanks,
Richard.
* Re: graphds.[ch]: alloc_pool for edges
2011-08-22 7:53 ` graphds.[ch]: alloc_pool for edges Dimitrios Apostolou
2011-08-22 8:46 ` Jakub Jelinek
@ 2011-08-22 10:11 ` Richard Guenther
1 sibling, 0 replies; 22+ messages in thread
From: Richard Guenther @ 2011-08-22 10:11 UTC (permalink / raw)
To: Dimitrios Apostolou; +Cc: gcc-patches, Steven Bosscher
On Mon, Aug 22, 2011 at 9:37 AM, Dimitrios Apostolou <jimis@gmx.net> wrote:
> free() was called way too often before, this patch reduces it significantly.
> Minor speed-up here too, I don't mention it individually since numbers are
> within noise margins.
As there is no re-use in this pool, the natural allocator to use is an
obstack, which has even less overhead than an alloc_pool.
Richard.
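A minimal sketch of that suggestion using glibc's &lt;obstack.h&gt;, against a stand-in edge struct (the names below are illustrative, not graphds' actual ones):

```c
#include <assert.h>
#include <obstack.h>
#include <stdlib.h>

/* glibc's obstack requires these two macros before use.  */
#define obstack_chunk_alloc malloc
#define obstack_chunk_free free

/* Stand-in for graphds' edge type.  */
struct toy_edge { int src, dest; struct toy_edge *succ_next; };

struct obstack edge_obstack;

/* Sketch of add_edge with an obstack instead of malloc.  */
struct toy_edge *new_edge (int src, int dest)
{
  struct toy_edge *e
    = (struct toy_edge *) obstack_alloc (&edge_obstack, sizeof *e);
  e->src = src;
  e->dest = dest;
  e->succ_next = NULL;
  return e;
}

/* Sketch of free_graph: one call releases every edge at once, with no
   per-edge free() and none of alloc_pool's free-list bookkeeping.  */
void free_all_edges (void)
{
  obstack_free (&edge_obstack, NULL);
}
```

The caller runs obstack_init (&edge_obstack) once up front; teardown is then a single obstack_free, which matches the "no re-use" allocation pattern here.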
>
> 2011-08-22 Dimitrios Apostolou <jimis@gmx.net>
>
> * graphds.h (struct graph): Added edge_pool as a pool for
> allocating edges.
> * graphds.c (new_graph): Initialise edge_pool.
> (add_edge): Allocate edge from edge_pool rather than with malloc.
> (free_graph): Instead of iterating across the graph freeing edges,
> just destroy the edge_pool.
>
* Re: tree-ssa-structalias.c: alloc_pool for struct equiv_class_label
2011-08-22 8:05 ` tree-ssa-structalias.c: alloc_pool for struct equiv_class_label Dimitrios Apostolou
2011-08-22 9:01 ` Dimitrios Apostolou
@ 2011-08-22 10:25 ` Richard Guenther
2011-08-22 11:26 ` Dimitrios Apostolou
1 sibling, 1 reply; 22+ messages in thread
From: Richard Guenther @ 2011-08-22 10:25 UTC (permalink / raw)
To: Dimitrios Apostolou; +Cc: gcc-patches, Steven Bosscher
On Mon, Aug 22, 2011 at 9:46 AM, Dimitrios Apostolou <jimis@gmx.net> wrote:
>
> 2011-08-22 Dimitrios Apostolou <jimis@gmx.net>
>
> * tree-ssa-structalias.c (equiv_class_add)
> (perform_var_substitution, free_var_substitution_info): Created a
> new equiv_class_pool allocator pool for struct
> equiv_class_label. Changed the pointer_equiv_class_table and
> location_equiv_class_table hash tables to not iterate freeing all
> elements in the end, but just free the pool.
Did you check whether the hash tables ever call free() on a slot? If so,
why not use the pool free function so that entries can get re-used? If not,
the natural allocator would be an obstack instead.
Richard.
>
>
* Re: mem_attrs_htab
2011-08-22 9:43 ` mem_attrs_htab Jakub Jelinek
@ 2011-08-22 10:45 ` Richard Guenther
2011-08-22 11:11 ` mem_attrs_htab Jakub Jelinek
0 siblings, 1 reply; 22+ messages in thread
From: Richard Guenther @ 2011-08-22 10:45 UTC (permalink / raw)
To: Jakub Jelinek; +Cc: Dimitrios Apostolou, gcc-patches, Steven Bosscher
On Mon, Aug 22, 2011 at 10:04 AM, Jakub Jelinek <jakub@redhat.com> wrote:
> On Mon, Aug 22, 2011 at 10:58:48AM +0300, Dimitrios Apostolou wrote:
>> the future. I didn't like hashing addresses either, and I was
>> surprised I saw no regressions.
>
> Hashing on the expr address as well just results in smaller sharing
> in the hash table (i.e. if the expr has different address, but is considered
> equal). The hashing of mem attrs is done just to reduce memory overhead.
And at some point the idea popped up to just dump this whole re-using
mem-attrs. There is at most a single function in RTL but the whole program
is available in SSA, so the memory overhead must be small.
Richard.
> Jakub
>
* Re: Various minor speed-ups
2011-08-22 8:43 ` Various minor speed-ups Dimitrios Apostolou
@ 2011-08-22 11:01 ` Richard Guenther
0 siblings, 0 replies; 22+ messages in thread
From: Richard Guenther @ 2011-08-22 11:01 UTC (permalink / raw)
To: Dimitrios Apostolou; +Cc: gcc-patches, Steven Bosscher
On Mon, Aug 22, 2011 at 9:50 AM, Dimitrios Apostolou <jimis@gmx.net> wrote:
>
> 2011-08-22 Dimitrios Apostolou <jimis@gmx.net>
>
> * tree-ssa-pre.c (phi_trans_add, init_pre, fini_pre): Added a pool
> for phi_translate_table elements to avoid free() calls from
> htab_delete().
Ok if bootstrap and test pass.
Richard.
* Re: mem_attrs_htab
2011-08-22 10:45 ` mem_attrs_htab Richard Guenther
@ 2011-08-22 11:11 ` Jakub Jelinek
2011-08-22 11:54 ` mem_attrs_htab Richard Guenther
0 siblings, 1 reply; 22+ messages in thread
From: Jakub Jelinek @ 2011-08-22 11:11 UTC (permalink / raw)
To: Richard Guenther; +Cc: Dimitrios Apostolou, gcc-patches, Steven Bosscher
On Mon, Aug 22, 2011 at 11:57:18AM +0200, Richard Guenther wrote:
> And at some point the idea popped up to just dump this whole re-using
> mem-attrs. There is at most a single function in RTL but the whole program
> is available in SSA, so the memory overhead must be small.
Some functions are extremely large though. Do you mean that MEM itself would be
enlarged to have the MEM_ATTRS field so that one operand is the address,
then expr, then HWI size, offset, etc.? Because if the mem attrs aren't
shared any longer, it doesn't make sense to keep the indirection.
I still fear we have so many MEMs in RTL that this would be noticeable.
Jakub
* Re: tree-ssa-structalias.c: alloc_pool for struct equiv_class_label
2011-08-22 10:25 ` Richard Guenther
@ 2011-08-22 11:26 ` Dimitrios Apostolou
0 siblings, 0 replies; 22+ messages in thread
From: Dimitrios Apostolou @ 2011-08-22 11:26 UTC (permalink / raw)
To: Richard Guenther; +Cc: Dimitrios Apostolou, gcc-patches, Steven Bosscher
[-- Attachment #1: Type: TEXT/PLAIN, Size: 1214 bytes --]
On Mon, 22 Aug 2011, Richard Guenther wrote:
> On Mon, Aug 22, 2011 at 9:46 AM, Dimitrios Apostolou <jimis@gmx.net> wrote:
>>
>> 2011-08-22  Dimitrios Apostolou  <jimis@gmx.net>
>>
>>         * tree-ssa-structalias.c (equiv_class_add)
>>         (perform_var_substitution, free_var_substitution_info): Created a
>>         new equiv_class_pool allocator pool for struct
>>         equiv_class_label. Changed the pointer_equiv_class_table and
>>         location_equiv_class_table hash tables to not iterate freeing all
>>         elements in the end, but just free the pool.
>
> Did you check if the hash functions have ever called free()? If so why
> not use the pool free function so that entries can get re-used? If not,
> the natural allocator would be an obstack instead.
I have not found any relevant call of htab_clear_slot(). I didn't consider
obstacks at all for these cases; thanks for telling me, I'll see where
I can use them. As I've noted, I have bootstrapped and tested all these
changes at least on x86_64 with release checking enabled, but I plan to
test and measure all my changes together later, and hopefully on other
platforms in the future.
Thanks,
Dimitris
* Re: mem_attrs_htab
2011-08-22 11:11 ` mem_attrs_htab Jakub Jelinek
@ 2011-08-22 11:54 ` Richard Guenther
2011-08-22 16:13 ` mem_attrs_htab Michael Matz
0 siblings, 1 reply; 22+ messages in thread
From: Richard Guenther @ 2011-08-22 11:54 UTC (permalink / raw)
To: Jakub Jelinek; +Cc: Dimitrios Apostolou, gcc-patches, Steven Bosscher
On Mon, Aug 22, 2011 at 12:07 PM, Jakub Jelinek <jakub@redhat.com> wrote:
> On Mon, Aug 22, 2011 at 11:57:18AM +0200, Richard Guenther wrote:
>> And at some point the idea popped up to just dump this whole re-using
>> mem-attrs. There is at most a single function in RTL but the whole program
>> is available in SSA, so the memory overhead must be small.
>
> Some functions are extremely large though. Do you mean that MEM itself would be
> enlarged to have the MEM_ATTRS field so that one operand is the address,
> then expr, then HWI size, offset, etc.? Because if the mem attrs aren't
> shared any longer, it doesn't make sense to keep the indirection.
> I still fear we have way too many MEMs in RTL that this would be noticeable.
It would be interesting to have numbers about the amount of sharing that
happens - it might not be trivial though, as some re-uses would be able to
simply modify the attr in place.
Richard.
> Jakub
>
* Re: mem_attrs_htab
2011-08-22 11:54 ` mem_attrs_htab Richard Guenther
@ 2011-08-22 16:13 ` Michael Matz
0 siblings, 0 replies; 22+ messages in thread
From: Michael Matz @ 2011-08-22 16:13 UTC (permalink / raw)
To: Richard Guenther
Cc: Jakub Jelinek, Dimitrios Apostolou, gcc-patches, Steven Bosscher
[-- Attachment #1: Type: TEXT/PLAIN, Size: 2539 bytes --]
Hi,
On Mon, 22 Aug 2011, Richard Guenther wrote:
> > Some functions are extremely large though. Do you mean that MEM
> > itself would be enlarged to have the MEM_ATTRS field so that one
> > operand is the address, then expr, then HWI size, offset, etc.?
> > Because if the mem attrs aren't shared any longer, it doesn't make
> > sense to keep the indirection. I still fear we have way too many MEMs
> > in RTL that this would be noticeable.
>
> It would be interesting to have numbers about the amount of sharing that
> happens
A pathetic amount. From compiling cse.c combine.c tree.c and dwarf2out.c
the top 10 users of MEMs per routine are:
With -O0:
MEMs ATTR name
970 485 combine_simplify_rtx
1011 464 simple_cst_equal
1047 612 mem_loc_descriptor
1173 442 walk_tree_1
1296 515 loc_list_from_tree
1431 690 simplify_comparison
1911 752 substitute_placeholder_in_expr
1951 745 substitute_in_expr
2503 1532 cse_insn
3242 2206 try_combine
With -O2:
MEMs ATTR name
514 502 gen_tagged_type_die
701 536 simplify_comparison
743 877 find_decls_types_r
851 840 dwarf2out_finish
863 784 loc_list_from_tree
916 839 combine_simplify_rtx
978 878 gen_subprogram_die
1650 1475 cse_insn
1720 1782 mem_loc_descriptor
2336 1792 try_combine
Summing doesn't make sense, but these are the routines with the largest
differences:
-O0
532 force_to_mode
547 simple_cst_equal
640 simplify_shift_const_1
731 walk_tree_1
741 simplify_comparison
781 loc_list_from_tree
971 cse_insn
1036 try_combine
1159 substitute_placeholder_in_expr
1206 substitute_in_expr
-O2
100 gen_subprogram_die
101 make_extraction
112 output_loc_sequence
122 if_then_else_cond
124 substitute_placeholder_in_expr
144 simplify_shift_const_1
165 simplify_comparison
175 cse_insn
205 simplify_if_then_else
544 try_combine
(Using -g or not doesn't make a difference). I've counted all MEM rtx in
the whole insn stream at finalization time (i.e. slightly less than
potentially are actually generated during RTL passes). ATTR is the number
of unique mem_attrs ever created by set_mem_attrs, reset to zero at each
function start (including emptying the htab).
That is, we save a whopping 48 kilobytes due to this fantastic hash table
:-) (offset by the need for a pointer in the MEM rtx).
Just remove the whole thing. Same for the reg_attrs hash table (I haven't
measured that one, though).
> - might be not trivial though, as some re-uses would be able to
> simply modify the attr inplace.
Ciao,
Michael.