* [PATCH 4/6] ipa: Multiple predicates for loop properties, with frequencies
2020-09-29 18:12 [PATCH 0/6] IPA cleanups and IPA-CP improvements for 548.exchange2_r Martin Jambor
@ 2020-09-21 14:25 ` Martin Jambor
2020-09-29 22:18 ` Jan Hubicka
2020-09-21 14:25 ` [PATCH 6/6] ipa-cp: Separate and increase the large-unit parameter Martin Jambor
` (4 subsequent siblings)
5 siblings, 1 reply; 17+ messages in thread
From: Martin Jambor @ 2020-09-21 14:25 UTC
To: GCC Patches; +Cc: Jan Hubicka
This patch improves the ability of IPA to reason about the conditions
under which loops in a function have known iteration counts or strides.
It replaces the single predicates, which currently hold a conjunction of
the predicates for all loops, with vectors capable of holding multiple
predicates, each paired with the cumulative frequency of the loops that
have the property.
IPA-CP then uses these frequencies to boost its heuristic score much
more aggressively for cloning opportunities which make iteration counts
or strides of frequently executed loops compile-time constants.
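The bookkeeping described above can be illustrated with a minimal
standalone sketch. It is not the GCC code: toy_predicate (a plain bit
mask), freq_pred, add_freq_pred, may_be_true, known_loop_freq and
loop_hint_bonus are hypothetical names, double stands in for sreal, and
the bonus is added only when a frequency is non-zero, which simplifies
the hint-flag test used in the patch.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for GCC's predicate: a bit mask of conditions.
using toy_predicate = std::uint32_t;

struct freq_pred
{
  double freq;         // cumulative frequency of loops sharing PRED
  toy_predicate pred;  // the property is known when this is false
};

// If PRED is already in V, add FREQ to its entry.  Otherwise append a new
// entry, unless V already holds MAX_PREDS entries.
void
add_freq_pred (std::vector<freq_pred> &v, toy_predicate pred, double freq,
               std::size_t max_preds)
{
  for (freq_pred &f : v)
    if (f.pred == pred)
      {
        f.freq += freq;
        return;
      }
  if (v.size () >= max_preds)
    return;  // too many different predicates to account for
  v.push_back ({freq, pred});
}

// Toy evaluation: PRED may be true if any of its conditions may hold.
bool
may_be_true (toy_predicate pred, toy_predicate possible_truths)
{
  return (pred & possible_truths) != 0;
}

// Sum the frequencies of all entries whose predicate is known to be false
// in the given context.
double
known_loop_freq (const std::vector<freq_pred> &v,
                 toy_predicate possible_truths)
{
  double sum = 0;
  for (const freq_pred &f : v)
    if (!may_be_true (f.pred, possible_truths))
      sum += f.freq;
  return sum;
}

// Scale the per-loop bonus by the frequencies, on top of the flat bonus.
// Assumption: truncation models the conversion back to an integer.
int
loop_hint_bonus (double known_iters, double known_strides, int bonus_for_one)
{
  int result = 0;
  if (known_iters > 0 || known_strides > 0)
    result += bonus_for_one;
  result += static_cast<int> (known_iters * bonus_for_one);
  result += static_cast<int> (known_strides * bonus_for_one);
  return result;
}
```

The cap on the vector length plays the role of the new
ipa-max-loop-predicates parameter: loops whose predicate does not fit
are simply not counted.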
gcc/ChangeLog:
2020-09-03 Martin Jambor <mjambor@suse.cz>
* ipa-fnsummary.h (ipa_freqcounting_predicate): New type.
(ipa_fn_summary): Change the type of loop_iterations and loop_strides
to vectors of ipa_freqcounting_predicate.
(ipa_fn_summary::ipa_fn_summary): Construct the new vectors.
(ipa_call_estimates): New fields loops_with_known_iterations and
loops_with_known_strides.
* ipa-cp.c (hint_time_bonus): Multiply param_ipa_cp_loop_hint_bonus
with the expected frequencies of loops with known iteration count or
stride.
* ipa-fnsummary.c (add_freqcounting_predicate): New function.
(ipa_fn_summary::~ipa_fn_summary): Release the new vectors instead of
just two predicates.
(remap_hint_predicate_after_duplication): Replace with function
remap_freqcounting_preds_after_dup.
(ipa_fn_summary_t::duplicate): Use it or duplicate new vectors.
(ipa_dump_fn_summary): Dump the new vectors.
(analyze_function_body): Compute the loop property vectors.
(ipa_call_context::estimate_size_and_time): Calculate also
loops_with_known_iterations and loops_with_known_strides. Adjusted
dumping accordingly.
(remap_hint_predicate): Replace with function
remap_freqcounting_predicate.
(ipa_merge_fn_summary_after_inlining): Use it.
(inline_read_section): Stream loopcounting vectors instead of two
simple predicates.
(ipa_fn_summary_write): Likewise.
* params.opt (ipa-max-loop-predicates): New parameter.
* doc/invoke.texi (ipa-max-loop-predicates): Document new param.
gcc/testsuite/ChangeLog:
2020-09-03 Martin Jambor <mjambor@suse.cz>
* gcc.dg/ipa/ipcp-loophint-1.c: New test.
---
gcc/doc/invoke.texi | 4 +
gcc/ipa-cp.c | 9 +
gcc/ipa-fnsummary.c | 318 ++++++++++++++-------
gcc/ipa-fnsummary.h | 38 ++-
gcc/params.opt | 4 +
gcc/testsuite/gcc.dg/ipa/ipcp-loophint-1.c | 29 ++
6 files changed, 288 insertions(+), 114 deletions(-)
create mode 100644 gcc/testsuite/gcc.dg/ipa/ipcp-loophint-1.c
diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index 226b0e1dc91..829598228ac 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -13433,6 +13433,10 @@ of iterations of a loop known, it adds a bonus of
@option{ipa-cp-loop-hint-bonus} to the profitability score of
the candidate.
+@item ipa-max-loop-predicates
+The maximum number of different predicates IPA will use to describe when
+loops in a function have known properties.
+
@item ipa-max-aa-steps
During its analysis of function bodies, IPA-CP employs alias analysis
in order to track values pointed to by function parameters. In order
diff --git a/gcc/ipa-cp.c b/gcc/ipa-cp.c
index 77c84a6ed5d..f6320c787de 100644
--- a/gcc/ipa-cp.c
+++ b/gcc/ipa-cp.c
@@ -3205,6 +3205,15 @@ hint_time_bonus (cgraph_node *node, const ipa_call_estimates &estimates)
ipa_hints hints = estimates.hints;
if (hints & (INLINE_HINT_loop_iterations | INLINE_HINT_loop_stride))
result += opt_for_fn (node->decl, param_ipa_cp_loop_hint_bonus);
+
+ sreal bonus_for_one = opt_for_fn (node->decl, param_ipa_cp_loop_hint_bonus);
+
+ if (hints & INLINE_HINT_loop_iterations)
+ result += (estimates.loops_with_known_iterations * bonus_for_one).to_int ();
+
+ if (hints & INLINE_HINT_loop_stride)
+ result += (estimates.loops_with_known_strides * bonus_for_one).to_int ();
+
return result;
}
diff --git a/gcc/ipa-fnsummary.c b/gcc/ipa-fnsummary.c
index 6082f34d63f..bbbb94aa930 100644
--- a/gcc/ipa-fnsummary.c
+++ b/gcc/ipa-fnsummary.c
@@ -310,6 +310,36 @@ set_hint_predicate (predicate **p, predicate new_predicate)
}
}
+/* Find if NEW_PREDICATE is already in V and if so, increment its freq.
+ Otherwise add a new item to the vector with this predicate and freq equal
+ to ADD_FREQ, unless the number of predicates would exceed MAX_NUM_PREDICATES
+ in which case the function does nothing. */
+
+static void
+add_freqcounting_predicate (vec<ipa_freqcounting_predicate, va_gc> **v,
+ const predicate &new_predicate, sreal add_freq,
+ unsigned max_num_predicates)
+{
+ if (new_predicate == false || new_predicate == true)
+ return;
+ ipa_freqcounting_predicate *f;
+ for (int i = 0; vec_safe_iterate (*v, i, &f); i++)
+ if (new_predicate == f->predicate)
+ {
+ f->freq += add_freq;
+ return;
+ }
+ if (vec_safe_length (*v) >= max_num_predicates)
+ /* Too many different predicates to account for. */
+ return;
+
+ ipa_freqcounting_predicate fcp;
+ fcp.predicate = NULL;
+ set_hint_predicate (&fcp.predicate, new_predicate);
+ fcp.freq = add_freq;
+ vec_safe_push (*v, fcp);
+ return;
+}
/* Compute what conditions may or may not hold given information about
parameters. RET_CLAUSE returns truths that may hold in a specialized copy,
@@ -710,13 +740,17 @@ ipa_call_summary::~ipa_call_summary ()
ipa_fn_summary::~ipa_fn_summary ()
{
- if (loop_iterations)
- edge_predicate_pool.remove (loop_iterations);
- if (loop_stride)
- edge_predicate_pool.remove (loop_stride);
+ unsigned len = vec_safe_length (loop_iterations);
+ for (unsigned i = 0; i < len; i++)
+ edge_predicate_pool.remove ((*loop_iterations)[i].predicate);
+ len = vec_safe_length (loop_strides);
+ for (unsigned i = 0; i < len; i++)
+ edge_predicate_pool.remove ((*loop_strides)[i].predicate);
vec_free (conds);
vec_free (size_time_table);
vec_free (call_size_time_table);
+ vec_free (loop_iterations);
+ vec_free (loop_strides);
}
void
@@ -729,24 +763,33 @@ ipa_fn_summary_t::remove_callees (cgraph_node *node)
ipa_call_summaries->remove (e);
}
-/* Same as remap_predicate_after_duplication but handle hint predicate *P.
- Additionally care about allocating new memory slot for updated predicate
- and set it to NULL when it becomes true or false (and thus uninteresting).
- */
+/* Duplicate predicates in loop hint vector, allocating memory for them and
+ remove and deallocate any uninteresting (true or false) ones. Return the
+ result. */
-static void
-remap_hint_predicate_after_duplication (predicate **p,
- clause_t possible_truths)
+static vec<ipa_freqcounting_predicate, va_gc> *
+remap_freqcounting_preds_after_dup (vec<ipa_freqcounting_predicate, va_gc> *v,
+ clause_t possible_truths)
{
- predicate new_predicate;
+ if (vec_safe_length (v) == 0)
+ return NULL;
- if (!*p)
- return;
+ vec<ipa_freqcounting_predicate, va_gc> *res = v->copy ();
+ int len = res->length();
+ for (int i = len - 1; i >= 0; i--)
+ {
+ predicate new_predicate
+ = (*res)[i].predicate->remap_after_duplication (possible_truths);
+ /* We do not want to free previous predicate; it is used by node
+ origin. */
+ (*res)[i].predicate = NULL;
+ set_hint_predicate (&(*res)[i].predicate, new_predicate);
- new_predicate = (*p)->remap_after_duplication (possible_truths);
- /* We do not want to free previous predicate; it is used by node origin. */
- *p = NULL;
- set_hint_predicate (p, new_predicate);
+ if (!(*res)[i].predicate)
+ res->unordered_remove (i);
+ }
+
+ return res;
}
@@ -859,9 +902,11 @@ ipa_fn_summary_t::duplicate (cgraph_node *src,
optimized_out_size += es->call_stmt_size * ipa_fn_summary::size_scale;
edge_set_predicate (edge, &new_predicate);
}
- remap_hint_predicate_after_duplication (&info->loop_iterations,
+ info->loop_iterations
+ = remap_freqcounting_preds_after_dup (info->loop_iterations,
possible_truths);
- remap_hint_predicate_after_duplication (&info->loop_stride,
+ info->loop_strides
+ = remap_freqcounting_preds_after_dup (info->loop_strides,
possible_truths);
/* If inliner or someone after inliner will ever start producing
@@ -873,17 +918,21 @@ ipa_fn_summary_t::duplicate (cgraph_node *src,
else
{
info->size_time_table = vec_safe_copy (info->size_time_table);
- if (info->loop_iterations)
+ info->loop_iterations = vec_safe_copy (info->loop_iterations);
+ info->loop_strides = vec_safe_copy (info->loop_strides);
+
+ ipa_freqcounting_predicate *f;
+ for (int i = 0; vec_safe_iterate (info->loop_iterations, i, &f); i++)
{
- predicate p = *info->loop_iterations;
- info->loop_iterations = NULL;
- set_hint_predicate (&info->loop_iterations, p);
+ predicate p = *f->predicate;
+ f->predicate = NULL;
+ set_hint_predicate (&f->predicate, p);
}
- if (info->loop_stride)
+ for (int i = 0; vec_safe_iterate (info->loop_strides, i, &f); i++)
{
- predicate p = *info->loop_stride;
- info->loop_stride = NULL;
- set_hint_predicate (&info->loop_stride, p);
+ predicate p = *f->predicate;
+ f->predicate = NULL;
+ set_hint_predicate (&f->predicate, p);
}
}
if (!dst->inlined_to)
@@ -1045,15 +1094,28 @@ ipa_dump_fn_summary (FILE *f, struct cgraph_node *node)
}
fprintf (f, "\n");
}
- if (s->loop_iterations)
+ ipa_freqcounting_predicate *fcp;
+ bool first_fcp = true;
+ for (int i = 0; vec_safe_iterate (s->loop_iterations, i, &fcp); i++)
{
- fprintf (f, " loop iterations:");
- s->loop_iterations->dump (f, s->conds);
+ if (first_fcp)
+ {
+ fprintf (f, " loop iterations:");
+ first_fcp = false;
+ }
+ fprintf (f, " %3.2f for ", fcp->freq.to_double ());
+ fcp->predicate->dump (f, s->conds);
}
- if (s->loop_stride)
+ first_fcp = true;
+ for (int i = 0; vec_safe_iterate (s->loop_strides, i, &fcp); i++)
{
- fprintf (f, " loop stride:");
- s->loop_stride->dump (f, s->conds);
+ if (first_fcp)
+ {
+ fprintf (f, " loop strides:");
+ first_fcp = false;
+ }
+ fprintf (f, " %3.2f for :", fcp->freq.to_double ());
+ fcp->predicate->dump (f, s->conds);
}
fprintf (f, " calls:\n");
dump_ipa_call_summary (f, 4, node, s);
@@ -2543,12 +2605,13 @@ analyze_function_body (struct cgraph_node *node, bool early)
if (fbi.info)
compute_bb_predicates (&fbi, node, info, params_summary);
+ const profile_count entry_count = ENTRY_BLOCK_PTR_FOR_FN (cfun)->count;
order = XNEWVEC (int, n_basic_blocks_for_fn (cfun));
nblocks = pre_and_rev_post_order_compute (NULL, order, false);
for (n = 0; n < nblocks; n++)
{
bb = BASIC_BLOCK_FOR_FN (cfun, order[n]);
- freq = bb->count.to_sreal_scale (ENTRY_BLOCK_PTR_FOR_FN (cfun)->count);
+ freq = bb->count.to_sreal_scale (entry_count);
if (clobber_only_eh_bb_p (bb))
{
if (dump_file && (dump_flags & TDF_DETAILS))
@@ -2790,23 +2853,28 @@ analyze_function_body (struct cgraph_node *node, bool early)
if (nonconstant_names.exists () && !early)
{
+ ipa_fn_summary *s = ipa_fn_summaries->get (node);
class loop *loop;
- predicate loop_iterations = true;
- predicate loop_stride = true;
+ unsigned max_loop_predicates = opt_for_fn (node->decl,
+ param_ipa_max_loop_predicates);
if (dump_file && (dump_flags & TDF_DETAILS))
flow_loops_dump (dump_file, NULL, 0);
scev_initialize ();
FOR_EACH_LOOP (loop, 0)
{
+ predicate loop_iterations = true;
+ sreal header_freq;
edge ex;
unsigned int j;
class tree_niter_desc niter_desc;
- if (loop->header->aux)
- bb_predicate = *(predicate *) loop->header->aux;
- else
- bb_predicate = false;
+ if (!loop->header->aux)
+ continue;
+ profile_count phdr_count = loop_preheader_edge (loop)->count ();
+ sreal phdr_freq = phdr_count.to_sreal_scale (entry_count);
+
+ bb_predicate = *(predicate *) loop->header->aux;
auto_vec<edge> exits = get_loop_exit_edges (loop);
FOR_EACH_VEC_ELT (exits, j, ex)
if (number_of_iterations_exit (loop, ex, &niter_desc, false)
@@ -2821,10 +2889,10 @@ analyze_function_body (struct cgraph_node *node, bool early)
will_be_nonconstant = bb_predicate & will_be_nonconstant;
if (will_be_nonconstant != true
&& will_be_nonconstant != false)
- /* This is slightly inprecise. We may want to represent each
- loop with independent predicate. */
loop_iterations &= will_be_nonconstant;
}
+ add_freqcounting_predicate (&s->loop_iterations, loop_iterations,
+ phdr_freq, max_loop_predicates);
}
/* To avoid quadratic behavior we analyze stride predicates only
@@ -2833,14 +2901,17 @@ analyze_function_body (struct cgraph_node *node, bool early)
for (loop = loops_for_fn (cfun)->tree_root->inner;
loop != NULL; loop = loop->next)
{
+ predicate loop_stride = true;
basic_block *body = get_loop_body (loop);
+ profile_count phdr_count = loop_preheader_edge (loop)->count ();
+ sreal phdr_freq = phdr_count.to_sreal_scale (entry_count);
for (unsigned i = 0; i < loop->num_nodes; i++)
{
gimple_stmt_iterator gsi;
- if (body[i]->aux)
- bb_predicate = *(predicate *) body[i]->aux;
- else
- bb_predicate = false;
+ if (!body[i]->aux)
+ continue;
+
+ bb_predicate = *(predicate *) body[i]->aux;
for (gsi = gsi_start_bb (body[i]); !gsi_end_p (gsi);
gsi_next (&gsi))
{
@@ -2869,16 +2940,13 @@ analyze_function_body (struct cgraph_node *node, bool early)
will_be_nonconstant = bb_predicate & will_be_nonconstant;
if (will_be_nonconstant != true
&& will_be_nonconstant != false)
- /* This is slightly inprecise. We may want to represent
- each loop with independent predicate. */
loop_stride = loop_stride & will_be_nonconstant;
}
}
+ add_freqcounting_predicate (&s->loop_strides, loop_stride,
+ phdr_freq, max_loop_predicates);
free (body);
}
- ipa_fn_summary *s = ipa_fn_summaries->get (node);
- set_hint_predicate (&s->loop_iterations, loop_iterations);
- set_hint_predicate (&s->loop_stride, loop_stride);
scev_finalize ();
}
FOR_ALL_BB_FN (bb, my_function)
@@ -3551,6 +3619,8 @@ ipa_call_context::estimate_size_and_time (ipa_call_estimates *estimates,
sreal time = 0;
int min_size = 0;
ipa_hints hints = 0;
+ sreal loops_with_known_iterations = 0;
+ sreal loops_with_known_strides = 0;
int i;
if (dump_file && (dump_flags & TDF_DETAILS))
@@ -3643,16 +3713,27 @@ ipa_call_context::estimate_size_and_time (ipa_call_estimates *estimates,
if (est_hints)
{
- if (info->loop_iterations
- && !info->loop_iterations->evaluate (m_possible_truths))
- hints |= INLINE_HINT_loop_iterations;
- if (info->loop_stride
- && !info->loop_stride->evaluate (m_possible_truths))
- hints |= INLINE_HINT_loop_stride;
if (info->scc_no)
hints |= INLINE_HINT_in_scc;
if (DECL_DECLARED_INLINE_P (m_node->decl))
hints |= INLINE_HINT_declared_inline;
+
+ ipa_freqcounting_predicate *fcp;
+ for (i = 0; vec_safe_iterate (info->loop_iterations, i, &fcp); i++)
+ if (!fcp->predicate->evaluate (m_possible_truths))
+ {
+ hints |= INLINE_HINT_loop_iterations;
+ loops_with_known_iterations += fcp->freq;
+ }
+ estimates->loops_with_known_iterations = loops_with_known_iterations;
+
+ for (i = 0; vec_safe_iterate (info->loop_strides, i, &fcp); i++)
+ if (!fcp->predicate->evaluate (m_possible_truths))
+ {
+ hints |= INLINE_HINT_loop_stride;
+ loops_with_known_strides += fcp->freq;
+ }
+ estimates->loops_with_known_strides = loops_with_known_strides;
}
size = RDIV (size, ipa_fn_summary::size_scale);
@@ -3660,12 +3741,15 @@ ipa_call_context::estimate_size_and_time (ipa_call_estimates *estimates,
if (dump_file && (dump_flags & TDF_DETAILS))
{
+ fprintf (dump_file, "\n size:%i", (int) size);
if (est_times)
- fprintf (dump_file, "\n size:%i time:%f nonspec time:%f\n",
- (int) size, time.to_double (),
- nonspecialized_time.to_double ());
- else
- fprintf (dump_file, "\n size:%i (time not estimated)\n", (int) size);
+ fprintf (dump_file, " time:%f nonspec time:%f",
+ time.to_double (), nonspecialized_time.to_double ());
+ if (est_hints)
+ fprintf (dump_file, " loops with known iterations:%f "
+ "known strides:%f", loops_with_known_iterations.to_double (),
+ loops_with_known_strides.to_double ());
+ fprintf (dump_file, "\n");
}
if (est_times)
{
@@ -3865,32 +3949,29 @@ remap_edge_summaries (struct cgraph_edge *inlined_edge,
}
}
-/* Same as remap_predicate, but set result into hint *HINT. */
+/* Run remap_after_inlining on each predicate in V. */
static void
-remap_hint_predicate (class ipa_fn_summary *info,
- class ipa_node_params *params_summary,
- class ipa_fn_summary *callee_info,
- predicate **hint,
- vec<int> operand_map,
- vec<int> offset_map,
- clause_t possible_truths,
- predicate *toplev_predicate)
-{
- predicate p;
+remap_freqcounting_predicate (class ipa_fn_summary *info,
+ class ipa_node_params *params_summary,
+ class ipa_fn_summary *callee_info,
+ vec<ipa_freqcounting_predicate, va_gc> *v,
+ vec<int> operand_map,
+ vec<int> offset_map,
+ clause_t possible_truths,
+ predicate *toplev_predicate)
- if (!*hint)
- return;
- p = (*hint)->remap_after_inlining
- (info, params_summary, callee_info,
- operand_map, offset_map,
- possible_truths, *toplev_predicate);
- if (p != false && p != true)
+{
+ ipa_freqcounting_predicate *fcp;
+ for (int i = 0; vec_safe_iterate (v, i, &fcp); i++)
{
- if (!*hint)
- set_hint_predicate (hint, p);
- else
- **hint &= p;
+ predicate p
+ = fcp->predicate->remap_after_inlining (info, params_summary,
+ callee_info, operand_map,
+ offset_map, possible_truths,
+ *toplev_predicate);
+ if (p != false && p != true)
+ *fcp->predicate &= p;
}
}
@@ -3998,12 +4079,12 @@ ipa_merge_fn_summary_after_inlining (struct cgraph_edge *edge)
remap_edge_summaries (edge, edge->callee, info, params_summary,
callee_info, operand_map,
offset_map, clause, &toplev_predicate);
- remap_hint_predicate (info, params_summary, callee_info,
- &callee_info->loop_iterations,
- operand_map, offset_map, clause, &toplev_predicate);
- remap_hint_predicate (info, params_summary, callee_info,
- &callee_info->loop_stride,
- operand_map, offset_map, clause, &toplev_predicate);
+ remap_freqcounting_predicate (info, params_summary, callee_info,
+ info->loop_iterations, operand_map,
+ offset_map, clause, &toplev_predicate);
+ remap_freqcounting_predicate (info, params_summary, callee_info,
+ info->loop_strides, operand_map,
+ offset_map, clause, &toplev_predicate);
HOST_WIDE_INT stack_frame_offset = ipa_get_stack_frame_offset (edge->callee);
HOST_WIDE_INT peak = stack_frame_offset + callee_info->estimated_stack_size;
@@ -4334,12 +4415,34 @@ inline_read_section (struct lto_file_decl_data *file_data, const char *data,
info->size_time_table->quick_push (e);
}
- p.stream_in (&ib);
- if (info)
- set_hint_predicate (&info->loop_iterations, p);
- p.stream_in (&ib);
- if (info)
- set_hint_predicate (&info->loop_stride, p);
+ count2 = streamer_read_uhwi (&ib);
+ for (j = 0; j < count2; j++)
+ {
+ p.stream_in (&ib);
+ sreal fcp_freq = sreal::stream_in (&ib);
+ if (info)
+ {
+ ipa_freqcounting_predicate fcp;
+ fcp.predicate = NULL;
+ set_hint_predicate (&fcp.predicate, p);
+ fcp.freq = fcp_freq;
+ vec_safe_push (info->loop_iterations, fcp);
+ }
+ }
+ count2 = streamer_read_uhwi (&ib);
+ for (j = 0; j < count2; j++)
+ {
+ p.stream_in (&ib);
+ sreal fcp_freq = sreal::stream_in (&ib);
+ if (info)
+ {
+ ipa_freqcounting_predicate fcp;
+ fcp.predicate = NULL;
+ set_hint_predicate (&fcp.predicate, p);
+ fcp.freq = fcp_freq;
+ vec_safe_push (info->loop_strides, fcp);
+ }
+ }
for (e = node->callees; e; e = e->next_callee)
read_ipa_call_summary (&ib, e, info != NULL);
for (e = node->indirect_calls; e; e = e->next_callee)
@@ -4502,14 +4605,19 @@ ipa_fn_summary_write (void)
e->exec_predicate.stream_out (ob);
e->nonconst_predicate.stream_out (ob);
}
- if (info->loop_iterations)
- info->loop_iterations->stream_out (ob);
- else
- streamer_write_uhwi (ob, 0);
- if (info->loop_stride)
- info->loop_stride->stream_out (ob);
- else
- streamer_write_uhwi (ob, 0);
+ ipa_freqcounting_predicate *fcp;
+ streamer_write_uhwi (ob, vec_safe_length (info->loop_iterations));
+ for (i = 0; vec_safe_iterate (info->loop_iterations, i, &fcp); i++)
+ {
+ fcp->predicate->stream_out (ob);
+ fcp->freq.stream_out (ob);
+ }
+ streamer_write_uhwi (ob, vec_safe_length (info->loop_strides));
+ for (i = 0; vec_safe_iterate (info->loop_strides, i, &fcp); i++)
+ {
+ fcp->predicate->stream_out (ob);
+ fcp->freq.stream_out (ob);
+ }
for (edge = cnode->callees; edge; edge = edge->next_callee)
write_ipa_call_summary (ob, edge);
for (edge = cnode->indirect_calls; edge; edge = edge->next_callee)
diff --git a/gcc/ipa-fnsummary.h b/gcc/ipa-fnsummary.h
index ccb6b432f0b..f4dd5b85ab9 100644
--- a/gcc/ipa-fnsummary.h
+++ b/gcc/ipa-fnsummary.h
@@ -101,6 +101,19 @@ public:
}
};
+/* Structure to capture how frequently some interesting events occur given a
+ particular predicate. The structure is used to estimate how often we
+ encounter loops with known iteration count or stride in various
+ contexts. */
+
+struct GTY(()) ipa_freqcounting_predicate
+{
+ /* The described event happens with this frequency... */
+ sreal freq;
+ /* ...when this predicate evaluates to false. */
+ class predicate * GTY((skip)) predicate;
+};
+
/* Function inlining information. */
class GTY(()) ipa_fn_summary
{
@@ -112,8 +125,9 @@ public:
inlinable (false), single_caller (false),
fp_expressions (false), estimated_stack_size (false),
time (0), conds (NULL),
- size_time_table (NULL), call_size_time_table (NULL), loop_iterations (NULL),
- loop_stride (NULL), growth (0), scc_no (0)
+ size_time_table (NULL), call_size_time_table (NULL),
+ loop_iterations (NULL), loop_strides (NULL),
+ growth (0), scc_no (0)
{
}
@@ -125,7 +139,7 @@ public:
estimated_stack_size (s.estimated_stack_size),
time (s.time), conds (s.conds), size_time_table (s.size_time_table),
call_size_time_table (NULL),
- loop_iterations (s.loop_iterations), loop_stride (s.loop_stride),
+ loop_iterations (s.loop_iterations), loop_strides (s.loop_strides),
growth (s.growth), scc_no (s.scc_no)
{}
@@ -164,12 +178,10 @@ public:
vec<size_time_entry, va_gc> *size_time_table;
vec<size_time_entry, va_gc> *call_size_time_table;
- /* Predicate on when some loop in the function becomes to have known
- bounds. */
- predicate * GTY((skip)) loop_iterations;
- /* Predicate on when some loop in the function becomes to have known
- stride. */
- predicate * GTY((skip)) loop_stride;
+ /* Predicates on when some loops in the function can have known bounds. */
+ vec<ipa_freqcounting_predicate, va_gc> *loop_iterations;
+ /* Predicates on when some loops in the function can have known strides. */
+ vec<ipa_freqcounting_predicate, va_gc> *loop_strides;
/* Estimated growth for inlining all copies of the function before start
of small functions inlining.
This value will get out of date as the callers are duplicated, but
@@ -308,6 +320,14 @@ struct ipa_call_estimates
/* Further discovered reasons why to inline or specialize the give calls. */
ipa_hints hints;
+
+ /* Frequency how often a loop with known number of iterations is encountered.
+ Calculated with hints. */
+ sreal loops_with_known_iterations;
+
+ /* Frequency how often a loop with known strides is encountered. Calculated
+ with hints. */
+ sreal loops_with_known_strides;
};
class ipa_cached_call_context;
diff --git a/gcc/params.opt b/gcc/params.opt
index 5bc7e1619c5..acb59f17e45 100644
--- a/gcc/params.opt
+++ b/gcc/params.opt
@@ -234,6 +234,10 @@ Maximum number of aggregate content items for a parameter in jump functions and
Common Joined UInteger Var(param_ipa_max_param_expr_ops) Init(10) Param Optimization
Maximum number of operations in a parameter expression that can be handled by IPA analysis.
+-param=ipa-max-loop-predicates=
+Common Joined UInteger Var(param_ipa_max_loop_predicates) Init(16) Param Optimization
+Maximum number of different predicates used to track properties of loops in IPA analysis.
+
-param=ipa-max-switch-predicate-bounds=
Common Joined UInteger Var(param_ipa_max_switch_predicate_bounds) Init(5) Param Optimization
Maximal number of boundary endpoints of case ranges of switch statement used during IPA function summary generation.
diff --git a/gcc/testsuite/gcc.dg/ipa/ipcp-loophint-1.c b/gcc/testsuite/gcc.dg/ipa/ipcp-loophint-1.c
new file mode 100644
index 00000000000..6d049af68af
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/ipa/ipcp-loophint-1.c
@@ -0,0 +1,29 @@
+/* { dg-do compile } */
+/* { dg-options "-O3 -fdump-ipa-cp-details" } */
+
+extern int *o, *p, *q, *r;
+
+#define FUNCTIONS fa(), fb(), fc(), fd(), fe(), ff(), fg()
+
+extern void FUNCTIONS;
+
+void foo (int c)
+{
+ FUNCTIONS;
+ FUNCTIONS;
+ for (int i = 0; i < 100; i++)
+ {
+ for (int j = 0; j < c; j++)
+ o[i] = p[i] + q[i] * r[i];
+ }
+ FUNCTIONS;
+ FUNCTIONS;
+}
+
+void bar()
+{
+ foo (8);
+ p[4]++;
+}
+
+/* { dg-final { scan-ipa-dump {with known iterations:[1-9]} "cp" } } */
--
2.28.0
* [PATCH 1/6] ipa: Bundle vectors describing argument values
2020-09-29 18:12 [PATCH 0/6] IPA cleanups and IPA-CP improvements for 548.exchange2_r Martin Jambor
` (2 preceding siblings ...)
2020-09-21 14:25 ` [PATCH 5/6] ipa-cp: Add dumping of overall_size after cloning Martin Jambor
@ 2020-09-28 18:47 ` Martin Jambor
2020-10-02 11:54 ` Jan Hubicka
2020-09-28 18:47 ` [PATCH 3/6] ipa: Bundle estimates of ipa_call_context::estimate_size_and_time Martin Jambor
2020-09-28 18:47 ` [PATCH 2/6] ipa: Introduce ipa_cached_call_context Martin Jambor
5 siblings, 1 reply; 17+ messages in thread
From: Martin Jambor @ 2020-09-28 18:47 UTC
To: GCC Patches; +Cc: Jan Hubicka
Hi,
this large patch is a mostly mechanical change which replaces uses of
separate vectors describing known scalar values (usually called
known_vals or known_csts), known aggregate values (known_aggs), known
virtual call contexts (known_contexts) and known value
ranges (known_value_ranges) with uses of one of two new types,
ipa_call_arg_values or ipa_auto_call_arg_values, both of which simply
contain these vectors.
The need for two distinct types comes from the fact that when the vectors
are constructed from jump functions or lattices, we really should use
auto_vecs with embedded storage allocated on the stack. On the other hand,
the bundle in ipa_call_context can be allocated on the heap when it is in
the cache, once for each call graph node.
ipa_call_context is constructible from ipa_auto_call_arg_values but
then its vectors must not be resized, otherwise the vectors will stop
pointing to the stack ones. Unfortunately, I don't think the
structure embedded in ipa_call_context can be made constant because we
need to manipulate and deallocate it when in cache.
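The relationship between an owning bundle and a bundle that only refers
to its storage can be shown with a small standalone sketch. It is an
analogy only, not the layout used in the patch: owned_arg_values,
borrowed_arg_values and sum_known_vals are hypothetical names, and
std::vector and long stand in for GCC's vec types and tree values.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical owner of the storage, in the role of the stack-allocated
// bundle built from jump functions or lattices.
struct owned_arg_values
{
  std::vector<long> known_vals;
  std::vector<long> known_aggs;
};

// Hypothetical bundle that only refers to storage owned elsewhere.  It is
// valid only while the owner is alive and its vectors are not resized.
struct borrowed_arg_values
{
  const long *known_vals = nullptr;
  std::size_t n_vals = 0;
  const long *known_aggs = nullptr;
  std::size_t n_aggs = 0;

  borrowed_arg_values () = default;

  explicit borrowed_arg_values (const owned_arg_values &o)
    : known_vals (o.known_vals.data ()), n_vals (o.known_vals.size ()),
      known_aggs (o.known_aggs.data ()), n_aggs (o.known_aggs.size ())
  {}
};

// Example consumer which works on the borrowed bundle only.
long
sum_known_vals (const borrowed_arg_values &a)
{
  long sum = 0;
  for (std::size_t i = 0; i < a.n_vals; i++)
    sum += a.known_vals[i];
  return sum;
}
```

Resizing either side after the borrowed bundle has been built breaks the
link between the two, which is the hazard the paragraph above warns
about.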
gcc/ChangeLog:
2020-09-01 Martin Jambor <mjambor@suse.cz>
* ipa-prop.h (ipa_auto_call_arg_values): New type.
(class ipa_call_arg_values): Likewise.
(ipa_get_indirect_edge_target): Replaced vector arguments with
ipa_call_arg_values in declaration. Added an overload for
ipa_auto_call_arg_values.
* ipa-fnsummary.h (ipa_call_context): Removed members m_known_vals,
m_known_contexts, m_known_aggs, duplicate_from, release and equal_to,
new members m_avals, store_to_cache and equivalent_to_p. Adjusted
constructor arguments.
(estimate_ipcp_clone_size_and_time): Replaced vector arguments
with ipa_auto_call_arg_values in declaration.
(evaluate_properties_for_edge): Likewise.
* ipa-cp.c (ipa_get_indirect_edge_target): Adjusted to work on
ipa_call_arg_values rather than on separate vectors. Added an
overload for ipa_auto_call_arg_values.
(devirtualization_time_bonus): Adjusted to work on
ipa_auto_call_arg_values rather than on separate vectors.
(gather_context_independent_values): Adjusted to work on
ipa_auto_call_arg_values rather than on separate vectors.
(perform_estimation_of_a_value): Likewise.
(estimate_local_effects): Likewise.
(modify_known_vectors_with_val): Adjusted both variants to work on
ipa_auto_call_arg_values and rename them to
copy_known_vectors_add_val.
(decide_about_value): Adjusted to work on ipa_call_arg_values rather
than on separate vectors.
(decide_whether_version_node): Likewise.
* ipa-fnsummary.c (evaluate_conditions_for_known_args): Likewise.
(evaluate_properties_for_edge): Likewise.
(ipa_fn_summary_t::duplicate): Likewise.
(estimate_edge_devirt_benefit): Adjusted to work on
ipa_call_arg_values rather than on separate vectors.
(estimate_edge_size_and_time): Likewise.
(estimate_calls_size_and_time_1): Likewise.
(summarize_calls_size_and_time): Adjusted calls to
estimate_edge_size_and_time.
(estimate_calls_size_and_time): Adjusted to work on
ipa_call_arg_values rather than on separate vectors.
(ipa_call_context::ipa_call_context): Construct from a pointer to
ipa_auto_call_arg_values instead of individual vectors.
(ipa_call_context::duplicate_from): Adjusted to access vectors within
m_avals.
(ipa_call_context::release): Likewise.
(ipa_call_context::equal_to): Likewise.
(ipa_call_context::estimate_size_and_time): Adjusted to work on
ipa_call_arg_values rather than on separate vectors.
(estimate_ipcp_clone_size_and_time): Adjusted to work with
ipa_auto_call_arg_values rather than on separate vectors.
(ipa_merge_fn_summary_after_inlining): Likewise. Adjusted call to
estimate_edge_size_and_time.
(ipa_update_overall_fn_summary): Adjusted call to
estimate_edge_size_and_time.
* ipa-inline-analysis.c (do_estimate_edge_time): Adjusted to work with
ipa_auto_call_arg_values rather than with separate vectors.
(do_estimate_edge_size): Likewise.
(do_estimate_edge_hints): Likewise.
* ipa-prop.c (ipa_auto_call_arg_values::~ipa_auto_call_arg_values):
New destructor.
---
gcc/ipa-cp.c | 245 ++++++++++-----------
gcc/ipa-fnsummary.c | 446 +++++++++++++++++---------------------
gcc/ipa-fnsummary.h | 27 +--
gcc/ipa-inline-analysis.c | 41 +---
gcc/ipa-prop.c | 10 +
gcc/ipa-prop.h | 112 +++++++++-
6 files changed, 452 insertions(+), 429 deletions(-)
diff --git a/gcc/ipa-cp.c b/gcc/ipa-cp.c
index b3e7d41ea10..292dd7e5bdf 100644
--- a/gcc/ipa-cp.c
+++ b/gcc/ipa-cp.c
@@ -3117,30 +3117,40 @@ ipa_get_indirect_edge_target_1 (struct cgraph_edge *ie,
return target;
}
-
-/* If an indirect edge IE can be turned into a direct one based on KNOWN_CSTS,
- KNOWN_CONTEXTS (which can be vNULL) or KNOWN_AGGS (which also can be vNULL)
- return the destination. */
+/* If an indirect edge IE can be turned into a direct one based on data in
+ AVALS, return the destination. Store into *SPECULATIVE a boolean determining
+ whether the discovered target is only a speculative guess. */
tree
ipa_get_indirect_edge_target (struct cgraph_edge *ie,
- vec<tree> known_csts,
- vec<ipa_polymorphic_call_context> known_contexts,
- vec<ipa_agg_value_set> known_aggs,
+ ipa_call_arg_values *avals,
bool *speculative)
{
- return ipa_get_indirect_edge_target_1 (ie, known_csts, known_contexts,
- known_aggs, NULL, speculative);
+ return ipa_get_indirect_edge_target_1 (ie, avals->m_known_vals,
+ avals->m_known_contexts,
+ avals->m_known_aggs,
+ NULL, speculative);
}
-/* Calculate devirtualization time bonus for NODE, assuming we know KNOWN_CSTS
- and KNOWN_CONTEXTS. */
+/* The same functionality as above overloaded for ipa_auto_call_arg_values. */
+
+tree
+ipa_get_indirect_edge_target (struct cgraph_edge *ie,
+ ipa_auto_call_arg_values *avals,
+ bool *speculative)
+{
+ return ipa_get_indirect_edge_target_1 (ie, avals->m_known_vals,
+ avals->m_known_contexts,
+ avals->m_known_aggs,
+ NULL, speculative);
+}
+
+/* Calculate devirtualization time bonus for NODE, assuming we know information
+ about arguments stored in AVALS. */
static int
devirtualization_time_bonus (struct cgraph_node *node,
- vec<tree> known_csts,
- vec<ipa_polymorphic_call_context> known_contexts,
- vec<ipa_agg_value_set> known_aggs)
+ ipa_auto_call_arg_values *avals)
{
struct cgraph_edge *ie;
int res = 0;
@@ -3153,8 +3163,7 @@ devirtualization_time_bonus (struct cgraph_node *node,
tree target;
bool speculative;
- target = ipa_get_indirect_edge_target (ie, known_csts, known_contexts,
- known_aggs, &speculative);
+ target = ipa_get_indirect_edge_target (ie, avals, &speculative);
if (!target)
continue;
@@ -3306,32 +3315,27 @@ context_independent_aggregate_values (class ipcp_param_lattices *plats)
return res;
}
-/* Allocate KNOWN_CSTS, KNOWN_CONTEXTS and, if non-NULL, KNOWN_AGGS and
- populate them with values of parameters that are known independent of the
- context. INFO describes the function. If REMOVABLE_PARAMS_COST is
- non-NULL, the movement cost of all removable parameters will be stored in
- it. */
+/* Grow vectors in AVALS and fill them with information about values of
+ parameters that are known to be independent of the context. Only calculate
+ m_known_aggs if CALCULATE_AGGS is true. INFO describes the function. If
+ REMOVABLE_PARAMS_COST is non-NULL, the movement cost of all removable
+ parameters will be stored in it.
+
+ TODO: Also grow context independent value range vectors. */
static bool
gather_context_independent_values (class ipa_node_params *info,
- vec<tree> *known_csts,
- vec<ipa_polymorphic_call_context>
- *known_contexts,
- vec<ipa_agg_value_set> *known_aggs,
+ ipa_auto_call_arg_values *avals,
+ bool calculate_aggs,
int *removable_params_cost)
{
int i, count = ipa_get_param_count (info);
bool ret = false;
- known_csts->create (0);
- known_contexts->create (0);
- known_csts->safe_grow_cleared (count, true);
- known_contexts->safe_grow_cleared (count, true);
- if (known_aggs)
- {
- known_aggs->create (0);
- known_aggs->safe_grow_cleared (count, true);
- }
+ avals->m_known_vals.safe_grow_cleared (count, true);
+ avals->m_known_contexts.safe_grow_cleared (count, true);
+ if (calculate_aggs)
+ avals->m_known_aggs.safe_grow_cleared (count, true);
if (removable_params_cost)
*removable_params_cost = 0;
@@ -3345,7 +3349,7 @@ gather_context_independent_values (class ipa_node_params *info,
{
ipcp_value<tree> *val = lat->values;
gcc_checking_assert (TREE_CODE (val->value) != TREE_BINFO);
- (*known_csts)[i] = val->value;
+ avals->m_known_vals[i] = val->value;
if (removable_params_cost)
*removable_params_cost
+= estimate_move_cost (TREE_TYPE (val->value), false);
@@ -3363,15 +3367,15 @@ gather_context_independent_values (class ipa_node_params *info,
/* Do not account known context as reason for cloning. We can see
if it permits devirtualization. */
if (ctxlat->is_single_const ())
- (*known_contexts)[i] = ctxlat->values->value;
+ avals->m_known_contexts[i] = ctxlat->values->value;
- if (known_aggs)
+ if (calculate_aggs)
{
vec<ipa_agg_value> agg_items;
struct ipa_agg_value_set *agg;
agg_items = context_independent_aggregate_values (plats);
- agg = &(*known_aggs)[i];
+ agg = &avals->m_known_aggs[i];
agg->items = agg_items;
agg->by_ref = plats->aggs_by_ref;
ret |= !agg_items.is_empty ();
@@ -3381,25 +3385,23 @@ gather_context_independent_values (class ipa_node_params *info,
return ret;
}
-/* Perform time and size measurement of NODE with the context given in
- KNOWN_CSTS, KNOWN_CONTEXTS and KNOWN_AGGS, calculate the benefit and cost
- given BASE_TIME of the node without specialization, REMOVABLE_PARAMS_COST of
- all context-independent removable parameters and EST_MOVE_COST of estimated
- movement of the considered parameter and store it into VAL. */
+/* Perform time and size measurement of NODE with the context given in AVALS,
+ calculate the benefit compared to the node without specialization and store
+ it into VAL. Take into account REMOVABLE_PARAMS_COST of all
+ context-independent or unused removable parameters and EST_MOVE_COST, the
+ estimated movement of the considered parameter. */
static void
-perform_estimation_of_a_value (cgraph_node *node, vec<tree> known_csts,
- vec<ipa_polymorphic_call_context> known_contexts,
- vec<ipa_agg_value_set> known_aggs,
- int removable_params_cost,
- int est_move_cost, ipcp_value_base *val)
+perform_estimation_of_a_value (cgraph_node *node,
+ ipa_auto_call_arg_values *avals,
+ int removable_params_cost, int est_move_cost,
+ ipcp_value_base *val)
{
int size, time_benefit;
sreal time, base_time;
ipa_hints hints;
- estimate_ipcp_clone_size_and_time (node, known_csts, known_contexts,
- known_aggs, &size, &time,
+ estimate_ipcp_clone_size_and_time (node, avals, &size, &time,
&base_time, &hints);
base_time -= time;
if (base_time > 65535)
@@ -3412,8 +3414,7 @@ perform_estimation_of_a_value (cgraph_node *node, vec<tree> known_csts,
time_benefit = 0;
else
time_benefit = base_time.to_int ()
- + devirtualization_time_bonus (node, known_csts, known_contexts,
- known_aggs)
+ + devirtualization_time_bonus (node, avals)
+ hint_time_bonus (node, hints)
+ removable_params_cost + est_move_cost;
@@ -3454,9 +3455,6 @@ estimate_local_effects (struct cgraph_node *node)
{
class ipa_node_params *info = IPA_NODE_REF (node);
int i, count = ipa_get_param_count (info);
- vec<tree> known_csts;
- vec<ipa_polymorphic_call_context> known_contexts;
- vec<ipa_agg_value_set> known_aggs;
bool always_const;
int removable_params_cost;
@@ -3466,11 +3464,10 @@ estimate_local_effects (struct cgraph_node *node)
if (dump_file && (dump_flags & TDF_DETAILS))
fprintf (dump_file, "\nEstimating effects for %s.\n", node->dump_name ());
- always_const = gather_context_independent_values (info, &known_csts,
- &known_contexts, &known_aggs,
+ ipa_auto_call_arg_values avals;
+ always_const = gather_context_independent_values (info, &avals, true,
&removable_params_cost);
- int devirt_bonus = devirtualization_time_bonus (node, known_csts,
- known_contexts, known_aggs);
+ int devirt_bonus = devirtualization_time_bonus (node, &avals);
if (always_const || devirt_bonus
|| (removable_params_cost && node->can_change_signature))
{
@@ -3482,8 +3479,7 @@ estimate_local_effects (struct cgraph_node *node)
init_caller_stats (&stats);
node->call_for_symbol_thunks_and_aliases (gather_caller_stats, &stats,
false);
- estimate_ipcp_clone_size_and_time (node, known_csts, known_contexts,
- known_aggs, &size, &time,
+ estimate_ipcp_clone_size_and_time (node, &avals, &size, &time,
&base_time, &hints);
time -= devirt_bonus;
time -= hint_time_bonus (node, hints);
@@ -3536,18 +3532,17 @@ estimate_local_effects (struct cgraph_node *node)
if (lat->bottom
|| !lat->values
- || known_csts[i])
+ || avals.m_known_vals[i])
continue;
for (val = lat->values; val; val = val->next)
{
gcc_checking_assert (TREE_CODE (val->value) != TREE_BINFO);
- known_csts[i] = val->value;
+ avals.m_known_vals[i] = val->value;
int emc = estimate_move_cost (TREE_TYPE (val->value), true);
- perform_estimation_of_a_value (node, known_csts, known_contexts,
- known_aggs,
- removable_params_cost, emc, val);
+ perform_estimation_of_a_value (node, &avals, removable_params_cost,
+ emc, val);
if (dump_file && (dump_flags & TDF_DETAILS))
{
@@ -3559,7 +3554,7 @@ estimate_local_effects (struct cgraph_node *node)
val->local_time_benefit, val->local_size_cost);
}
}
- known_csts[i] = NULL_TREE;
+ avals.m_known_vals[i] = NULL_TREE;
}
for (i = 0; i < count; i++)
@@ -3574,15 +3569,14 @@ estimate_local_effects (struct cgraph_node *node)
if (ctxlat->bottom
|| !ctxlat->values
- || !known_contexts[i].useless_p ())
+ || !avals.m_known_contexts[i].useless_p ())
continue;
for (val = ctxlat->values; val; val = val->next)
{
- known_contexts[i] = val->value;
- perform_estimation_of_a_value (node, known_csts, known_contexts,
- known_aggs,
- removable_params_cost, 0, val);
+ avals.m_known_contexts[i] = val->value;
+ perform_estimation_of_a_value (node, &avals, removable_params_cost,
+ 0, val);
if (dump_file && (dump_flags & TDF_DETAILS))
{
@@ -3594,20 +3588,18 @@ estimate_local_effects (struct cgraph_node *node)
val->local_time_benefit, val->local_size_cost);
}
}
- known_contexts[i] = ipa_polymorphic_call_context ();
+ avals.m_known_contexts[i] = ipa_polymorphic_call_context ();
}
for (i = 0; i < count; i++)
{
class ipcp_param_lattices *plats = ipa_get_parm_lattices (info, i);
- struct ipa_agg_value_set *agg;
- struct ipcp_agg_lattice *aglat;
if (plats->aggs_bottom || !plats->aggs)
continue;
- agg = &known_aggs[i];
- for (aglat = plats->aggs; aglat; aglat = aglat->next)
+ ipa_agg_value_set *agg = &avals.m_known_aggs[i];
+ for (ipcp_agg_lattice *aglat = plats->aggs; aglat; aglat = aglat->next)
{
ipcp_value<tree> *val;
if (aglat->bottom || !aglat->values
@@ -3624,8 +3616,7 @@ estimate_local_effects (struct cgraph_node *node)
item.value = val->value;
agg->items.safe_push (item);
- perform_estimation_of_a_value (node, known_csts, known_contexts,
- known_aggs,
+ perform_estimation_of_a_value (node, &avals,
removable_params_cost, 0, val);
if (dump_file && (dump_flags & TDF_DETAILS))
@@ -3645,10 +3636,6 @@ estimate_local_effects (struct cgraph_node *node)
}
}
}
-
- known_csts.release ();
- known_contexts.release ();
- ipa_release_agg_values (known_aggs);
}
@@ -5372,31 +5359,34 @@ copy_useful_known_contexts (vec<ipa_polymorphic_call_context> known_contexts)
return vNULL;
}
-/* Copy KNOWN_CSTS and modify the copy according to VAL and INDEX. If
- non-empty, replace KNOWN_CONTEXTS with its copy too. */
+/* Copy known scalar values from AVALS into KNOWN_CSTS and modify the copy
+ according to VAL and INDEX. If non-empty, replace KNOWN_CONTEXTS with its
+ copy too. */
static void
-modify_known_vectors_with_val (vec<tree> *known_csts,
- vec<ipa_polymorphic_call_context> *known_contexts,
- ipcp_value<tree> *val,
- int index)
+copy_known_vectors_add_val (ipa_auto_call_arg_values *avals,
+ vec<tree> *known_csts,
+ vec<ipa_polymorphic_call_context> *known_contexts,
+ ipcp_value<tree> *val, int index)
{
- *known_csts = known_csts->copy ();
- *known_contexts = copy_useful_known_contexts (*known_contexts);
+ *known_csts = avals->m_known_vals.copy ();
+ *known_contexts = copy_useful_known_contexts (avals->m_known_contexts);
(*known_csts)[index] = val->value;
}
-/* Replace KNOWN_CSTS with its copy. Also copy KNOWN_CONTEXTS and modify the
- copy according to VAL and INDEX. */
+/* Copy known scalar values from AVALS into KNOWN_CSTS. Similarly, copy
+ contexts to KNOWN_CONTEXTS and modify the copy according to VAL and
+ INDEX. */
static void
-modify_known_vectors_with_val (vec<tree> *known_csts,
- vec<ipa_polymorphic_call_context> *known_contexts,
- ipcp_value<ipa_polymorphic_call_context> *val,
- int index)
+copy_known_vectors_add_val (ipa_auto_call_arg_values *avals,
+ vec<tree> *known_csts,
+ vec<ipa_polymorphic_call_context> *known_contexts,
+ ipcp_value<ipa_polymorphic_call_context> *val,
+ int index)
{
- *known_csts = known_csts->copy ();
- *known_contexts = known_contexts->copy ();
+ *known_csts = avals->m_known_vals.copy ();
+ *known_contexts = avals->m_known_contexts.copy ();
(*known_contexts)[index] = val->value;
}
@@ -5433,16 +5423,15 @@ ipcp_val_agg_replacement_ok_p (ipa_agg_replacement_value *,
return offset == -1;
}
-/* Decide whether to create a special version of NODE for value VAL of parameter
- at the given INDEX. If OFFSET is -1, the value is for the parameter itself,
- otherwise it is stored at the given OFFSET of the parameter. KNOWN_CSTS,
- KNOWN_CONTEXTS and KNOWN_AGGS describe the other already known values. */
+/* Decide whether to create a special version of NODE for value VAL of
+ parameter at the given INDEX. If OFFSET is -1, the value is for the
+ parameter itself, otherwise it is stored at the given OFFSET of the
+ parameter. AVALS describes the other already known values. */
template <typename valtype>
static bool
decide_about_value (struct cgraph_node *node, int index, HOST_WIDE_INT offset,
- ipcp_value<valtype> *val, vec<tree> known_csts,
- vec<ipa_polymorphic_call_context> known_contexts)
+ ipcp_value<valtype> *val, ipa_auto_call_arg_values *avals)
{
struct ipa_agg_replacement_value *aggvals;
int freq_sum, caller_count;
@@ -5492,13 +5481,16 @@ decide_about_value (struct cgraph_node *node, int index, HOST_WIDE_INT offset,
fprintf (dump_file, " Creating a specialized node of %s.\n",
node->dump_name ());
+ vec<tree> known_csts;
+ vec<ipa_polymorphic_call_context> known_contexts;
+
callers = gather_edges_for_value (val, node, caller_count);
if (offset == -1)
- modify_known_vectors_with_val (&known_csts, &known_contexts, val, index);
+ copy_known_vectors_add_val (avals, &known_csts, &known_contexts, val, index);
else
{
- known_csts = known_csts.copy ();
- known_contexts = copy_useful_known_contexts (known_contexts);
+ known_csts = avals->m_known_vals.copy ();
+ known_contexts = copy_useful_known_contexts (avals->m_known_contexts);
}
find_more_scalar_values_for_callers_subset (node, known_csts, callers);
find_more_contexts_for_caller_subset (node, &known_contexts, callers);
@@ -5522,8 +5514,6 @@ decide_whether_version_node (struct cgraph_node *node)
{
class ipa_node_params *info = IPA_NODE_REF (node);
int i, count = ipa_get_param_count (info);
- vec<tree> known_csts;
- vec<ipa_polymorphic_call_context> known_contexts;
bool ret = false;
if (count == 0)
@@ -5533,8 +5523,8 @@ decide_whether_version_node (struct cgraph_node *node)
fprintf (dump_file, "\nEvaluating opportunities for %s.\n",
node->dump_name ());
- gather_context_independent_values (info, &known_csts, &known_contexts,
- NULL, NULL);
+ ipa_auto_call_arg_values avals;
+ gather_context_independent_values (info, &avals, false, NULL);
for (i = 0; i < count;i++)
{
@@ -5543,12 +5533,11 @@ decide_whether_version_node (struct cgraph_node *node)
ipcp_lattice<ipa_polymorphic_call_context> *ctxlat = &plats->ctxlat;
if (!lat->bottom
- && !known_csts[i])
+ && !avals.m_known_vals[i])
{
ipcp_value<tree> *val;
for (val = lat->values; val; val = val->next)
- ret |= decide_about_value (node, i, -1, val, known_csts,
- known_contexts);
+ ret |= decide_about_value (node, i, -1, val, &avals);
}
if (!plats->aggs_bottom)
@@ -5557,22 +5546,20 @@ decide_whether_version_node (struct cgraph_node *node)
ipcp_value<tree> *val;
for (aglat = plats->aggs; aglat; aglat = aglat->next)
if (!aglat->bottom && aglat->values
- /* If the following is false, the one value is in
- known_aggs. */
+ /* If the following is false, the one value has been considered
+ for cloning for all contexts. */
&& (plats->aggs_contain_variable
|| !aglat->is_single_const ()))
for (val = aglat->values; val; val = val->next)
- ret |= decide_about_value (node, i, aglat->offset, val,
- known_csts, known_contexts);
+ ret |= decide_about_value (node, i, aglat->offset, val, &avals);
}
if (!ctxlat->bottom
- && known_contexts[i].useless_p ())
+ && avals.m_known_contexts[i].useless_p ())
{
ipcp_value<ipa_polymorphic_call_context> *val;
for (val = ctxlat->values; val; val = val->next)
- ret |= decide_about_value (node, i, -1, val, known_csts,
- known_contexts);
+ ret |= decide_about_value (node, i, -1, val, &avals);
}
info = IPA_NODE_REF (node);
@@ -5595,11 +5582,9 @@ decide_whether_version_node (struct cgraph_node *node)
if (!adjust_callers_for_value_intersection (callers, node))
{
/* If node is not called by anyone, or all its caller edges are
- self-recursive, the node is not really be in use, no need to
- do cloning. */
+ self-recursive, the node is not really in use, no need to do
+ cloning. */
callers.release ();
- known_csts.release ();
- known_contexts.release ();
info->do_clone_for_all_contexts = false;
return ret;
}
@@ -5608,6 +5593,9 @@ decide_whether_version_node (struct cgraph_node *node)
fprintf (dump_file, " - Creating a specialized node of %s "
"for all known contexts.\n", node->dump_name ());
+ vec<tree> known_csts = avals.m_known_vals.copy ();
+ vec<ipa_polymorphic_call_context> known_contexts
+ = copy_useful_known_contexts (avals.m_known_contexts);
find_more_scalar_values_for_callers_subset (node, known_csts, callers);
find_more_contexts_for_caller_subset (node, &known_contexts, callers);
ipa_agg_replacement_value *aggvals
@@ -5625,11 +5613,6 @@ decide_whether_version_node (struct cgraph_node *node)
IPA_NODE_REF (clone)->is_all_contexts_clone = true;
ret = true;
}
- else
- {
- known_csts.release ();
- known_contexts.release ();
- }
return ret;
}
diff --git a/gcc/ipa-fnsummary.c b/gcc/ipa-fnsummary.c
index 4c1c1f91482..e8645aa0a1b 100644
--- a/gcc/ipa-fnsummary.c
+++ b/gcc/ipa-fnsummary.c
@@ -320,19 +320,18 @@ set_hint_predicate (predicate **p, predicate new_predicate)
is always false in the second and also builtin_constant_p tests cannot use
the fact that parameter is indeed a constant.
- KNOWN_VALS is partial mapping of parameters of NODE to constant values.
- KNOWN_AGGS is a vector of aggregate known offset/value set for each
- parameter. Return clause of possible truths. When INLINE_P is true, assume
- that we are inlining.
+ When INLINE_P is true, assume that we are inlining. AVALS contains known
+ information about argument values. The function does not modify its content
+ and so AVALS could also be of type ipa_call_arg_values but so far all
+ callers work with the auto version and so we avoid the conversion for
+ convenience.
- ERROR_MARK means compile time invariant. */
+ ERROR_MARK value of an argument means compile time invariant. */
static void
evaluate_conditions_for_known_args (struct cgraph_node *node,
bool inline_p,
- vec<tree> known_vals,
- vec<value_range> known_value_ranges,
- vec<ipa_agg_value_set> known_aggs,
+ ipa_auto_call_arg_values *avals,
clause_t *ret_clause,
clause_t *ret_nonspec_clause)
{
@@ -351,38 +350,33 @@ evaluate_conditions_for_known_args (struct cgraph_node *node,
/* We allow call stmt to have fewer arguments than the callee function
(especially for K&R style programs). So bound check here (we assume
- known_aggs vector, if non-NULL, has the same length as
- known_vals). */
- gcc_checking_assert (!known_aggs.length () || !known_vals.length ()
- || (known_vals.length () == known_aggs.length ()));
+ m_known_aggs vector is either empty or has the same length as
+ m_known_vals). */
+ gcc_checking_assert (!avals->m_known_aggs.length ()
+ || !avals->m_known_vals.length ()
+ || (avals->m_known_vals.length ()
+ == avals->m_known_aggs.length ()));
if (c->agg_contents)
{
- struct ipa_agg_value_set *agg;
-
if (c->code == predicate::changed
&& !c->by_ref
- && c->operand_num < (int)known_vals.length ()
- && (known_vals[c->operand_num] == error_mark_node))
+ && (avals->safe_sval_at (c->operand_num) == error_mark_node))
continue;
- if (c->operand_num < (int)known_aggs.length ())
+ if (ipa_agg_value_set *agg = avals->safe_aggval_at (c->operand_num))
{
- agg = &known_aggs[c->operand_num];
- val = ipa_find_agg_cst_for_param (agg,
- c->operand_num
- < (int) known_vals.length ()
- ? known_vals[c->operand_num]
- : NULL,
- c->offset, c->by_ref);
+ tree sval = avals->safe_sval_at (c->operand_num);
+ val = ipa_find_agg_cst_for_param (agg, sval, c->offset,
+ c->by_ref);
}
else
val = NULL_TREE;
}
- else if (c->operand_num < (int) known_vals.length ())
+ else
{
- val = known_vals[c->operand_num];
- if (val == error_mark_node && c->code != predicate::changed)
+ val = avals->safe_sval_at (c->operand_num);
+ if (val && val == error_mark_node && c->code != predicate::changed)
val = NULL_TREE;
}
@@ -446,53 +440,54 @@ evaluate_conditions_for_known_args (struct cgraph_node *node,
continue;
}
}
- if (c->operand_num < (int) known_value_ranges.length ()
+ if (c->operand_num < (int) avals->m_known_value_ranges.length ()
&& !c->agg_contents
- && !known_value_ranges[c->operand_num].undefined_p ()
- && !known_value_ranges[c->operand_num].varying_p ()
- && TYPE_SIZE (c->type)
- == TYPE_SIZE (known_value_ranges[c->operand_num].type ())
&& (!val || TREE_CODE (val) != INTEGER_CST))
{
- value_range vr = known_value_ranges[c->operand_num];
- if (!useless_type_conversion_p (c->type, vr.type ()))
+ value_range vr = avals->m_known_value_ranges[c->operand_num];
+ if (!vr.undefined_p ()
+ && !vr.varying_p ()
+ && (TYPE_SIZE (c->type) == TYPE_SIZE (vr.type ())))
{
- value_range res;
- range_fold_unary_expr (&res, NOP_EXPR,
- c->type, &vr, vr.type ());
- vr = res;
- }
- tree type = c->type;
-
- for (j = 0; vec_safe_iterate (c->param_ops, j, &op); j++)
- {
- if (vr.varying_p () || vr.undefined_p ())
- break;
-
- value_range res;
- if (!op->val[0])
- range_fold_unary_expr (&res, op->code, op->type, &vr, type);
- else if (!op->val[1])
+ if (!useless_type_conversion_p (c->type, vr.type ()))
{
- value_range op0 (op->val[0], op->val[0]);
- range_fold_binary_expr (&res, op->code, op->type,
- op->index ? &op0 : &vr,
- op->index ? &vr : &op0);
+ value_range res;
+ range_fold_unary_expr (&res, NOP_EXPR,
+ c->type, &vr, vr.type ());
+ vr = res;
+ }
+ tree type = c->type;
+
+ for (j = 0; vec_safe_iterate (c->param_ops, j, &op); j++)
+ {
+ if (vr.varying_p () || vr.undefined_p ())
+ break;
+
+ value_range res;
+ if (!op->val[0])
+ range_fold_unary_expr (&res, op->code, op->type, &vr, type);
+ else if (!op->val[1])
+ {
+ value_range op0 (op->val[0], op->val[0]);
+ range_fold_binary_expr (&res, op->code, op->type,
+ op->index ? &op0 : &vr,
+ op->index ? &vr : &op0);
+ }
+ else
+ gcc_unreachable ();
+ type = op->type;
+ vr = res;
+ }
+ if (!vr.varying_p () && !vr.undefined_p ())
+ {
+ value_range res;
+ value_range val_vr (c->val, c->val);
+ range_fold_binary_expr (&res, c->code, boolean_type_node,
+ &vr,
+ &val_vr);
+ if (res.zero_p ())
+ continue;
}
- else
- gcc_unreachable ();
- type = op->type;
- vr = res;
- }
- if (!vr.varying_p () && !vr.undefined_p ())
- {
- value_range res;
- value_range val_vr (c->val, c->val);
- range_fold_binary_expr (&res, c->code, boolean_type_node,
- &vr,
- &val_vr);
- if (res.zero_p ())
- continue;
}
}
@@ -538,24 +533,20 @@ fre_will_run_p (struct cgraph_node *node)
(if non-NULL) conditions evaluated for nonspecialized clone called
in a given context.
- KNOWN_VALS_PTR and KNOWN_AGGS_PTR must be non-NULL and will be filled by
- known constant and aggregate values of parameters.
-
- KNOWN_CONTEXT_PTR, if non-NULL, will be filled by polymorphic call contexts
- of parameter used by a polymorphic call. */
+ Vectors in AVALS will be populated with useful known information about
+ argument values - information not known to have any uses will be omitted -
+ except for m_known_contexts which will only be calculated if
+ COMPUTE_CONTEXTS is true. */
void
evaluate_properties_for_edge (struct cgraph_edge *e, bool inline_p,
clause_t *clause_ptr,
clause_t *nonspec_clause_ptr,
- vec<tree> *known_vals_ptr,
- vec<ipa_polymorphic_call_context>
- *known_contexts_ptr,
- vec<ipa_agg_value_set> *known_aggs_ptr)
+ ipa_auto_call_arg_values *avals,
+ bool compute_contexts)
{
struct cgraph_node *callee = e->callee->ultimate_alias_target ();
class ipa_fn_summary *info = ipa_fn_summaries->get (callee);
- auto_vec<value_range, 32> known_value_ranges;
class ipa_edge_args *args;
if (clause_ptr)
@@ -563,7 +554,7 @@ evaluate_properties_for_edge (struct cgraph_edge *e, bool inline_p,
if (ipa_node_params_sum
&& !e->call_stmt_cannot_inline_p
- && (info->conds || known_contexts_ptr)
+ && (info->conds || compute_contexts)
&& (args = IPA_EDGE_REF (e)) != NULL)
{
struct cgraph_node *caller;
@@ -608,15 +599,15 @@ evaluate_properties_for_edge (struct cgraph_edge *e, bool inline_p,
if (cst)
{
gcc_checking_assert (TREE_CODE (cst) != TREE_BINFO);
- if (!known_vals_ptr->length ())
- vec_safe_grow_cleared (known_vals_ptr, count, true);
- (*known_vals_ptr)[i] = cst;
+ if (!avals->m_known_vals.length ())
+ avals->m_known_vals.safe_grow_cleared (count, true);
+ avals->m_known_vals[i] = cst;
}
else if (inline_p && !es->param[i].change_prob)
{
- if (!known_vals_ptr->length ())
- vec_safe_grow_cleared (known_vals_ptr, count, true);
- (*known_vals_ptr)[i] = error_mark_node;
+ if (!avals->m_known_vals.length ())
+ avals->m_known_vals.safe_grow_cleared (count, true);
+ avals->m_known_vals[i] = error_mark_node;
}
/* If we failed to get simple constant, try value range. */
@@ -624,19 +615,20 @@ evaluate_properties_for_edge (struct cgraph_edge *e, bool inline_p,
&& vrp_will_run_p (caller)
&& ipa_is_param_used_by_ipa_predicates (callee_pi, i))
{
- value_range vr
+ value_range vr
= ipa_value_range_from_jfunc (caller_parms_info, e, jf,
ipa_get_type (callee_pi,
i));
if (!vr.undefined_p () && !vr.varying_p ())
{
- if (!known_value_ranges.length ())
+ if (!avals->m_known_value_ranges.length ())
{
- known_value_ranges.safe_grow (count, true);
+ avals->m_known_value_ranges.safe_grow (count, true);
for (int i = 0; i < count; ++i)
- new (&known_value_ranges[i]) value_range ();
+ new (&avals->m_known_value_ranges[i])
+ value_range ();
}
- known_value_ranges[i] = vr;
+ avals->m_known_value_ranges[i] = vr;
}
}
@@ -648,25 +640,25 @@ evaluate_properties_for_edge (struct cgraph_edge *e, bool inline_p,
caller, &jf->agg);
if (agg.items.length ())
{
- if (!known_aggs_ptr->length ())
- vec_safe_grow_cleared (known_aggs_ptr, count, true);
- (*known_aggs_ptr)[i] = agg;
+ if (!avals->m_known_aggs.length ())
+ avals->m_known_aggs.safe_grow_cleared (count, true);
+ avals->m_known_aggs[i] = agg;
}
}
}
/* For calls used in polymorphic calls we further determine
polymorphic call context. */
- if (known_contexts_ptr
+ if (compute_contexts
&& ipa_is_param_used_by_polymorphic_call (callee_pi, i))
{
ipa_polymorphic_call_context
ctx = ipa_context_from_jfunc (caller_parms_info, e, i, jf);
if (!ctx.useless_p ())
{
- if (!known_contexts_ptr->length ())
- known_contexts_ptr->safe_grow_cleared (count, true);
- (*known_contexts_ptr)[i]
+ if (!avals->m_known_contexts.length ())
+ avals->m_known_contexts.safe_grow_cleared (count, true);
+ avals->m_known_contexts[i]
= ipa_context_from_jfunc (caller_parms_info, e, i, jf);
}
}
@@ -685,18 +677,14 @@ evaluate_properties_for_edge (struct cgraph_edge *e, bool inline_p,
cst = NULL;
if (cst)
{
- if (!known_vals_ptr->length ())
- vec_safe_grow_cleared (known_vals_ptr, count, true);
- (*known_vals_ptr)[i] = cst;
+ if (!avals->m_known_vals.length ())
+ avals->m_known_vals.safe_grow_cleared (count, true);
+ avals->m_known_vals[i] = cst;
}
}
}
- evaluate_conditions_for_known_args (callee, inline_p,
- *known_vals_ptr,
- known_value_ranges,
- *known_aggs_ptr,
- clause_ptr,
+ evaluate_conditions_for_known_args (callee, inline_p, avals, clause_ptr,
nonspec_clause_ptr);
}
@@ -781,7 +769,7 @@ ipa_fn_summary_t::duplicate (cgraph_node *src,
vec<size_time_entry, va_gc> *entry = info->size_time_table;
/* Use SRC parm info since it may not be copied yet. */
class ipa_node_params *parms_info = IPA_NODE_REF (src);
- vec<tree> known_vals = vNULL;
+ ipa_auto_call_arg_values avals;
int count = ipa_get_param_count (parms_info);
int i, j;
clause_t possible_truths;
@@ -792,7 +780,7 @@ ipa_fn_summary_t::duplicate (cgraph_node *src,
struct cgraph_edge *edge, *next;
info->size_time_table = 0;
- known_vals.safe_grow_cleared (count, true);
+ avals.m_known_vals.safe_grow_cleared (count, true);
for (i = 0; i < count; i++)
{
struct ipa_replace_map *r;
@@ -801,20 +789,17 @@ ipa_fn_summary_t::duplicate (cgraph_node *src,
{
if (r->parm_num == i)
{
- known_vals[i] = r->new_tree;
+ avals.m_known_vals[i] = r->new_tree;
break;
}
}
}
evaluate_conditions_for_known_args (dst, false,
- known_vals,
- vNULL,
- vNULL,
+ &avals,
&possible_truths,
/* We are going to specialize,
so ignore nonspec truths. */
NULL);
- known_vals.release ();
info->account_size_time (0, 0, true_pred, true_pred);
@@ -3054,15 +3039,14 @@ compute_fn_summary_for_current (void)
return 0;
}
-/* Estimate benefit devirtualizing indirect edge IE, provided KNOWN_VALS,
- KNOWN_CONTEXTS and KNOWN_AGGS. */
+/* Estimate benefit devirtualizing indirect edge IE and return true if it can
+ be devirtualized and inlined, provided m_known_vals, m_known_contexts and
+ m_known_aggs in AVALS. Return false straight away if AVALS is NULL. */
static bool
estimate_edge_devirt_benefit (struct cgraph_edge *ie,
int *size, int *time,
- vec<tree> known_vals,
- vec<ipa_polymorphic_call_context> known_contexts,
- vec<ipa_agg_value_set> known_aggs)
+ ipa_call_arg_values *avals)
{
tree target;
struct cgraph_node *callee;
@@ -3070,13 +3054,13 @@ estimate_edge_devirt_benefit (struct cgraph_edge *ie,
enum availability avail;
bool speculative;
- if (!known_vals.length () && !known_contexts.length ())
+ if (!avals
+ || (!avals->m_known_vals.length () && !avals->m_known_contexts.length ()))
return false;
if (!opt_for_fn (ie->caller->decl, flag_indirect_inlining))
return false;
- target = ipa_get_indirect_edge_target (ie, known_vals, known_contexts,
- known_aggs, &speculative);
+ target = ipa_get_indirect_edge_target (ie, avals, &speculative);
if (!target || speculative)
return false;
@@ -3100,17 +3084,13 @@ estimate_edge_devirt_benefit (struct cgraph_edge *ie,
}
/* Increase SIZE, MIN_SIZE (if non-NULL) and TIME for size and time needed to
- handle edge E with probability PROB.
- Set HINTS if edge may be devirtualized.
- KNOWN_VALS, KNOWN_AGGS and KNOWN_CONTEXTS describe context of the call
- site. */
+ handle edge E with probability PROB. Set HINTS accordingly if edge may be
+ devirtualized. AVALS, if non-NULL, describes the context of the call site
+ as far as values of parameters are concerned. */
static inline void
estimate_edge_size_and_time (struct cgraph_edge *e, int *size, int *min_size,
- sreal *time,
- vec<tree> known_vals,
- vec<ipa_polymorphic_call_context> known_contexts,
- vec<ipa_agg_value_set> known_aggs,
+ sreal *time, ipa_call_arg_values *avals,
ipa_hints *hints)
{
class ipa_call_summary *es = ipa_call_summaries->get (e);
@@ -3119,8 +3099,7 @@ estimate_edge_size_and_time (struct cgraph_edge *e, int *size, int *min_size,
int cur_size;
if (!e->callee && hints && e->maybe_hot_p ()
- && estimate_edge_devirt_benefit (e, &call_size, &call_time,
- known_vals, known_contexts, known_aggs))
+ && estimate_edge_devirt_benefit (e, &call_size, &call_time, avals))
*hints |= INLINE_HINT_indirect_call;
cur_size = call_size * ipa_fn_summary::size_scale;
*size += cur_size;
@@ -3132,9 +3111,9 @@ estimate_edge_size_and_time (struct cgraph_edge *e, int *size, int *min_size,
/* Increase SIZE, MIN_SIZE and TIME for size and time needed to handle all
- calls in NODE. POSSIBLE_TRUTHS, KNOWN_VALS, KNOWN_AGGS and KNOWN_CONTEXTS
- describe context of the call site.
-
+ calls in NODE. POSSIBLE_TRUTHS and AVALS describe the context of the call
+ site.
+
Helper for estimate_calls_size_and_time which does the same but
(in most cases) faster. */
@@ -3143,9 +3122,7 @@ estimate_calls_size_and_time_1 (struct cgraph_node *node, int *size,
int *min_size, sreal *time,
ipa_hints *hints,
clause_t possible_truths,
- vec<tree> known_vals,
- vec<ipa_polymorphic_call_context> known_contexts,
- vec<ipa_agg_value_set> known_aggs)
+ ipa_call_arg_values *avals)
{
struct cgraph_edge *e;
for (e = node->callees; e; e = e->next_callee)
@@ -3154,10 +3131,8 @@ estimate_calls_size_and_time_1 (struct cgraph_node *node, int *size,
{
gcc_checking_assert (!ipa_call_summaries->get (e));
estimate_calls_size_and_time_1 (e->callee, size, min_size, time,
- hints,
- possible_truths,
- known_vals, known_contexts,
- known_aggs);
+ hints, possible_truths, avals);
+
continue;
}
class ipa_call_summary *es = ipa_call_summaries->get (e);
@@ -3175,9 +3150,7 @@ estimate_calls_size_and_time_1 (struct cgraph_node *node, int *size,
so we do not need to compute probabilities. */
estimate_edge_size_and_time (e, size,
es->predicate ? NULL : min_size,
- time,
- known_vals, known_contexts,
- known_aggs, hints);
+ time, avals, hints);
}
}
for (e = node->indirect_calls; e; e = e->next_callee)
@@ -3187,9 +3160,7 @@ estimate_calls_size_and_time_1 (struct cgraph_node *node, int *size,
|| es->predicate->evaluate (possible_truths))
estimate_edge_size_and_time (e, size,
es->predicate ? NULL : min_size,
- time,
- known_vals, known_contexts, known_aggs,
- hints);
+ time, avals, hints);
}
}
@@ -3211,8 +3182,7 @@ summarize_calls_size_and_time (struct cgraph_node *node,
int size = 0;
sreal time = 0;
- estimate_edge_size_and_time (e, &size, NULL, &time,
- vNULL, vNULL, vNULL, NULL);
+ estimate_edge_size_and_time (e, &size, NULL, &time, NULL, NULL);
struct predicate pred = true;
class ipa_call_summary *es = ipa_call_summaries->get (e);
@@ -3226,8 +3196,7 @@ summarize_calls_size_and_time (struct cgraph_node *node,
int size = 0;
sreal time = 0;
- estimate_edge_size_and_time (e, &size, NULL, &time,
- vNULL, vNULL, vNULL, NULL);
+ estimate_edge_size_and_time (e, &size, NULL, &time, NULL, NULL);
struct predicate pred = true;
class ipa_call_summary *es = ipa_call_summaries->get (e);
@@ -3238,17 +3207,15 @@ summarize_calls_size_and_time (struct cgraph_node *node,
}
/* Increase SIZE, MIN_SIZE and TIME for size and time needed to handle all
- calls in NODE. POSSIBLE_TRUTHS, KNOWN_VALS, KNOWN_AGGS and KNOWN_CONTEXTS
- describe context of the call site. */
+ calls in NODE. POSSIBLE_TRUTHS and AVALS (the latter if non-NULL) describe
+ context of the call site. */
static void
estimate_calls_size_and_time (struct cgraph_node *node, int *size,
int *min_size, sreal *time,
ipa_hints *hints,
clause_t possible_truths,
- vec<tree> known_vals,
- vec<ipa_polymorphic_call_context> known_contexts,
- vec<ipa_agg_value_set> known_aggs)
+ ipa_call_arg_values *avals)
{
class ipa_fn_summary *sum = ipa_fn_summaries->get (node);
bool use_table = true;
@@ -3267,9 +3234,10 @@ estimate_calls_size_and_time (struct cgraph_node *node, int *size,
use_table = false;
/* If there is an indirect edge that may be optimized, we need
to go the slow way. */
- else if ((known_vals.length ()
- || known_contexts.length ()
- || known_aggs.length ()) && hints)
+ else if (avals && hints
+ && (avals->m_known_vals.length ()
+ || avals->m_known_contexts.length ()
+ || avals->m_known_aggs.length ()))
{
class ipa_node_params *params_summary = IPA_NODE_REF (node);
unsigned int nargs = params_summary
@@ -3278,13 +3246,13 @@ estimate_calls_size_and_time (struct cgraph_node *node, int *size,
for (unsigned int i = 0; i < nargs && use_table; i++)
{
if (ipa_is_param_used_by_indirect_call (params_summary, i)
- && ((known_vals.length () > i && known_vals[i])
- || (known_aggs.length () > i
- && known_aggs[i].items.length ())))
+ && (avals->safe_sval_at (i)
+ || (avals->m_known_aggs.length () > i
+ && avals->m_known_aggs[i].items.length ())))
use_table = false;
else if (ipa_is_param_used_by_polymorphic_call (params_summary, i)
- && (known_contexts.length () > i
- && !known_contexts[i].useless_p ()))
+ && (avals->m_known_contexts.length () > i
+ && !avals->m_known_contexts[i].useless_p ()))
use_table = false;
}
}
@@ -3327,8 +3295,7 @@ estimate_calls_size_and_time (struct cgraph_node *node, int *size,
< ipa_fn_summary::max_size_time_table_size)
{
estimate_calls_size_and_time_1 (node, &old_size, NULL, &old_time, NULL,
- possible_truths, known_vals,
- known_contexts, known_aggs);
+ possible_truths, avals);
gcc_assert (*size == old_size);
if (time && (*time - old_time > 1 || *time - old_time < -1)
&& dump_file)
@@ -3340,31 +3307,22 @@ estimate_calls_size_and_time (struct cgraph_node *node, int *size,
/* Slow path by walking all edges. */
else
estimate_calls_size_and_time_1 (node, size, min_size, time, hints,
- possible_truths, known_vals, known_contexts,
- known_aggs);
+ possible_truths, avals);
}
-/* Default constructor for ipa call context.
- Memory allocation of known_vals, known_contexts
- and known_aggs vectors is owned by the caller, but can
- be release by ipa_call_context::release.
-
- inline_param_summary is owned by the caller. */
-ipa_call_context::ipa_call_context (cgraph_node *node,
- clause_t possible_truths,
+/* Main constructor for ipa call context. Memory allocation of ARG_VALUES
+ is owned by the caller. INLINE_PARAM_SUMMARY is also owned by the
+ caller. */
+
+ipa_call_context::ipa_call_context (cgraph_node *node, clause_t possible_truths,
clause_t nonspec_possible_truths,
- vec<tree> known_vals,
- vec<ipa_polymorphic_call_context>
- known_contexts,
- vec<ipa_agg_value_set> known_aggs,
vec<inline_param_summary>
- inline_param_summary)
+ inline_param_summary,
+ ipa_auto_call_arg_values *arg_values)
: m_node (node), m_possible_truths (possible_truths),
m_nonspec_possible_truths (nonspec_possible_truths),
m_inline_param_summary (inline_param_summary),
- m_known_vals (known_vals),
- m_known_contexts (known_contexts),
- m_known_aggs (known_aggs)
+ m_avals (arg_values)
{
}
@@ -3395,47 +3353,50 @@ ipa_call_context::duplicate_from (const ipa_call_context &ctx)
break;
}
}
- m_known_vals = vNULL;
- if (ctx.m_known_vals.exists ())
+ m_avals.m_known_vals = vNULL;
+ if (ctx.m_avals.m_known_vals.exists ())
{
- unsigned int n = MIN (ctx.m_known_vals.length (), nargs);
+ unsigned int n = MIN (ctx.m_avals.m_known_vals.length (), nargs);
for (unsigned int i = 0; i < n; i++)
if (ipa_is_param_used_by_indirect_call (params_summary, i)
- && ctx.m_known_vals[i])
+ && ctx.m_avals.m_known_vals[i])
{
- m_known_vals = ctx.m_known_vals.copy ();
+ m_avals.m_known_vals = ctx.m_avals.m_known_vals.copy ();
break;
}
}
- m_known_contexts = vNULL;
- if (ctx.m_known_contexts.exists ())
+ m_avals.m_known_contexts = vNULL;
+ if (ctx.m_avals.m_known_contexts.exists ())
{
- unsigned int n = MIN (ctx.m_known_contexts.length (), nargs);
+ unsigned int n = MIN (ctx.m_avals.m_known_contexts.length (), nargs);
for (unsigned int i = 0; i < n; i++)
if (ipa_is_param_used_by_polymorphic_call (params_summary, i)
- && !ctx.m_known_contexts[i].useless_p ())
+ && !ctx.m_avals.m_known_contexts[i].useless_p ())
{
- m_known_contexts = ctx.m_known_contexts.copy ();
+ m_avals.m_known_contexts = ctx.m_avals.m_known_contexts.copy ();
break;
}
}
- m_known_aggs = vNULL;
- if (ctx.m_known_aggs.exists ())
+ m_avals.m_known_aggs = vNULL;
+ if (ctx.m_avals.m_known_aggs.exists ())
{
- unsigned int n = MIN (ctx.m_known_aggs.length (), nargs);
+ unsigned int n = MIN (ctx.m_avals.m_known_aggs.length (), nargs);
for (unsigned int i = 0; i < n; i++)
if (ipa_is_param_used_by_indirect_call (params_summary, i)
- && !ctx.m_known_aggs[i].is_empty ())
+ && !ctx.m_avals.m_known_aggs[i].is_empty ())
{
- m_known_aggs = ipa_copy_agg_values (ctx.m_known_aggs);
+ m_avals.m_known_aggs
+ = ipa_copy_agg_values (ctx.m_avals.m_known_aggs);
break;
}
}
+
+ m_avals.m_known_value_ranges = vNULL;
}
/* Release memory used by known_vals/contexts/aggs vectors.
@@ -3449,11 +3410,11 @@ ipa_call_context::release (bool all)
/* See if context is initialized at first place. */
if (!m_node)
return;
- ipa_release_agg_values (m_known_aggs, all);
+ ipa_release_agg_values (m_avals.m_known_aggs, all);
if (all)
{
- m_known_vals.release ();
- m_known_contexts.release ();
+ m_avals.m_known_vals.release ();
+ m_avals.m_known_contexts.release ();
m_inline_param_summary.release ();
}
}
@@ -3499,77 +3460,81 @@ ipa_call_context::equal_to (const ipa_call_context &ctx)
return false;
}
}
- if (m_known_vals.exists () || ctx.m_known_vals.exists ())
+ if (m_avals.m_known_vals.exists () || ctx.m_avals.m_known_vals.exists ())
{
for (unsigned int i = 0; i < nargs; i++)
{
if (!ipa_is_param_used_by_indirect_call (params_summary, i))
continue;
- if (i >= m_known_vals.length () || !m_known_vals[i])
+ if (i >= m_avals.m_known_vals.length () || !m_avals.m_known_vals[i])
{
- if (i < ctx.m_known_vals.length () && ctx.m_known_vals[i])
+ if (i < ctx.m_avals.m_known_vals.length ()
+ && ctx.m_avals.m_known_vals[i])
return false;
continue;
}
- if (i >= ctx.m_known_vals.length () || !ctx.m_known_vals[i])
+ if (i >= ctx.m_avals.m_known_vals.length ()
+ || !ctx.m_avals.m_known_vals[i])
{
- if (i < m_known_vals.length () && m_known_vals[i])
+ if (i < m_avals.m_known_vals.length () && m_avals.m_known_vals[i])
return false;
continue;
}
- if (m_known_vals[i] != ctx.m_known_vals[i])
+ if (m_avals.m_known_vals[i] != ctx.m_avals.m_known_vals[i])
return false;
}
}
- if (m_known_contexts.exists () || ctx.m_known_contexts.exists ())
+ if (m_avals.m_known_contexts.exists ()
+ || ctx.m_avals.m_known_contexts.exists ())
{
for (unsigned int i = 0; i < nargs; i++)
{
if (!ipa_is_param_used_by_polymorphic_call (params_summary, i))
continue;
- if (i >= m_known_contexts.length ()
- || m_known_contexts[i].useless_p ())
+ if (i >= m_avals.m_known_contexts.length ()
+ || m_avals.m_known_contexts[i].useless_p ())
{
- if (i < ctx.m_known_contexts.length ()
- && !ctx.m_known_contexts[i].useless_p ())
+ if (i < ctx.m_avals.m_known_contexts.length ()
+ && !ctx.m_avals.m_known_contexts[i].useless_p ())
return false;
continue;
}
- if (i >= ctx.m_known_contexts.length ()
- || ctx.m_known_contexts[i].useless_p ())
+ if (i >= ctx.m_avals.m_known_contexts.length ()
+ || ctx.m_avals.m_known_contexts[i].useless_p ())
{
- if (i < m_known_contexts.length ()
- && !m_known_contexts[i].useless_p ())
+ if (i < m_avals.m_known_contexts.length ()
+ && !m_avals.m_known_contexts[i].useless_p ())
return false;
continue;
}
- if (!m_known_contexts[i].equal_to
- (ctx.m_known_contexts[i]))
+ if (!m_avals.m_known_contexts[i].equal_to
+ (ctx.m_avals.m_known_contexts[i]))
return false;
}
}
- if (m_known_aggs.exists () || ctx.m_known_aggs.exists ())
+ if (m_avals.m_known_aggs.exists () || ctx.m_avals.m_known_aggs.exists ())
{
for (unsigned int i = 0; i < nargs; i++)
{
if (!ipa_is_param_used_by_indirect_call (params_summary, i))
continue;
- if (i >= m_known_aggs.length () || m_known_aggs[i].is_empty ())
+ if (i >= m_avals.m_known_aggs.length ()
+ || m_avals.m_known_aggs[i].is_empty ())
{
- if (i < ctx.m_known_aggs.length ()
- && !ctx.m_known_aggs[i].is_empty ())
+ if (i < ctx.m_avals.m_known_aggs.length ()
+ && !ctx.m_avals.m_known_aggs[i].is_empty ())
return false;
continue;
}
- if (i >= ctx.m_known_aggs.length ()
- || ctx.m_known_aggs[i].is_empty ())
+ if (i >= ctx.m_avals.m_known_aggs.length ()
+ || ctx.m_avals.m_known_aggs[i].is_empty ())
{
- if (i < m_known_aggs.length ()
- && !m_known_aggs[i].is_empty ())
+ if (i < m_avals.m_known_aggs.length ()
+ && !m_avals.m_known_aggs[i].is_empty ())
return false;
continue;
}
- if (!m_known_aggs[i].equal_to (ctx.m_known_aggs[i]))
+ if (!m_avals.m_known_aggs[i].equal_to (ctx.m_avals.m_known_aggs[i]))
return false;
}
}
@@ -3619,7 +3584,7 @@ ipa_call_context::estimate_size_and_time (int *ret_size,
estimate_calls_size_and_time (m_node, &size, &min_size,
ret_time ? &time : NULL,
ret_hints ? &hints : NULL, m_possible_truths,
- m_known_vals, m_known_contexts, m_known_aggs);
+ &m_avals);
sreal nonspecialized_time = time;
@@ -3726,22 +3691,16 @@ ipa_call_context::estimate_size_and_time (int *ret_size,
void
estimate_ipcp_clone_size_and_time (struct cgraph_node *node,
- vec<tree> known_vals,
- vec<ipa_polymorphic_call_context>
- known_contexts,
- vec<ipa_agg_value_set> known_aggs,
+ ipa_auto_call_arg_values *avals,
int *ret_size, sreal *ret_time,
sreal *ret_nonspec_time,
ipa_hints *hints)
{
clause_t clause, nonspec_clause;
- /* TODO: Also pass known value ranges. */
- evaluate_conditions_for_known_args (node, false, known_vals, vNULL,
- known_aggs, &clause, &nonspec_clause);
- ipa_call_context ctx (node, clause, nonspec_clause,
- known_vals, known_contexts,
- known_aggs, vNULL);
+ evaluate_conditions_for_known_args (node, false, avals, &clause,
+ &nonspec_clause);
+ ipa_call_context ctx (node, clause, nonspec_clause, vNULL, avals);
ctx.estimate_size_and_time (ret_size, NULL, ret_time,
ret_nonspec_time, hints);
}
@@ -3970,10 +3929,8 @@ ipa_merge_fn_summary_after_inlining (struct cgraph_edge *edge)
if (callee_info->conds)
{
- auto_vec<tree, 32> known_vals;
- auto_vec<ipa_agg_value_set, 32> known_aggs;
- evaluate_properties_for_edge (edge, true, &clause, NULL,
- &known_vals, NULL, &known_aggs);
+ ipa_auto_call_arg_values avals;
+ evaluate_properties_for_edge (edge, true, &clause, NULL, &avals, false);
}
if (ipa_node_params_sum && callee_info->conds)
{
@@ -4067,8 +4024,7 @@ ipa_merge_fn_summary_after_inlining (struct cgraph_edge *edge)
int edge_size = 0;
sreal edge_time = 0;
- estimate_edge_size_and_time (edge, &edge_size, NULL, &edge_time, vNULL,
- vNULL, vNULL, 0);
+ estimate_edge_size_and_time (edge, &edge_size, NULL, &edge_time, NULL, 0);
/* Unaccount size and time of the optimized out call. */
info->account_size_time (-edge_size, -edge_time,
es->predicate ? *es->predicate : true,
@@ -4110,7 +4066,7 @@ ipa_update_overall_fn_summary (struct cgraph_node *node, bool reset)
estimate_calls_size_and_time (node, &size_info->size, &info->min_size,
&info->time, NULL,
~(clause_t) (1 << predicate::false_condition),
- vNULL, vNULL, vNULL);
+ NULL);
size_info->size = RDIV (size_info->size, ipa_fn_summary::size_scale);
info->min_size = RDIV (info->min_size, ipa_fn_summary::size_scale);
}
diff --git a/gcc/ipa-fnsummary.h b/gcc/ipa-fnsummary.h
index 4e1f841afad..6893858d18e 100644
--- a/gcc/ipa-fnsummary.h
+++ b/gcc/ipa-fnsummary.h
@@ -297,10 +297,8 @@ public:
ipa_call_context (cgraph_node *node,
clause_t possible_truths,
clause_t nonspec_possible_truths,
- vec<tree> known_vals,
- vec<ipa_polymorphic_call_context> known_contexts,
- vec<ipa_agg_value_set> known_aggs,
- vec<inline_param_summary> m_inline_param_summary);
+ vec<inline_param_summary> inline_param_summary,
+ ipa_auto_call_arg_values *arg_values);
ipa_call_context ()
: m_node(NULL)
{
@@ -328,14 +326,9 @@ private:
/* Inline summary maintains info about change probabilities. */
vec<inline_param_summary> m_inline_param_summary;
- /* The following is used only to resolve indirect calls. */
-
- /* Vector describing known values of parameters. */
- vec<tree> m_known_vals;
- /* Vector describing known polymorphic call contexts. */
- vec<ipa_polymorphic_call_context> m_known_contexts;
- /* Vector describing known aggregate values. */
- vec<ipa_agg_value_set> m_known_aggs;
+ /* Even after having calculated clauses, the information about argument
+ values is used to resolve indirect calls. */
+ ipa_call_arg_values m_avals;
};
extern fast_call_summary <ipa_call_summary *, va_heap> *ipa_call_summaries;
@@ -349,9 +342,7 @@ void ipa_free_fn_summary (void);
void ipa_free_size_summary (void);
void inline_analyze_function (struct cgraph_node *node);
void estimate_ipcp_clone_size_and_time (struct cgraph_node *,
- vec<tree>,
- vec<ipa_polymorphic_call_context>,
- vec<ipa_agg_value_set>,
+ ipa_auto_call_arg_values *,
int *, sreal *, sreal *,
ipa_hints *);
void ipa_merge_fn_summary_after_inlining (struct cgraph_edge *edge);
@@ -365,10 +356,8 @@ void evaluate_properties_for_edge (struct cgraph_edge *e,
bool inline_p,
clause_t *clause_ptr,
clause_t *nonspec_clause_ptr,
- vec<tree> *known_vals_ptr,
- vec<ipa_polymorphic_call_context>
- *known_contexts_ptr,
- vec<ipa_agg_value_set> *);
+ ipa_auto_call_arg_values *avals,
+ bool compute_contexts);
void ipa_fnsummary_c_finalize (void);
HOST_WIDE_INT ipa_get_stack_frame_offset (struct cgraph_node *node);
diff --git a/gcc/ipa-inline-analysis.c b/gcc/ipa-inline-analysis.c
index 148efbc09ef..d2ae8196d09 100644
--- a/gcc/ipa-inline-analysis.c
+++ b/gcc/ipa-inline-analysis.c
@@ -184,20 +184,16 @@ do_estimate_edge_time (struct cgraph_edge *edge, sreal *ret_nonspec_time)
ipa_hints hints;
struct cgraph_node *callee;
clause_t clause, nonspec_clause;
- auto_vec<tree, 32> known_vals;
- auto_vec<ipa_polymorphic_call_context, 32> known_contexts;
- auto_vec<ipa_agg_value_set, 32> known_aggs;
+ ipa_auto_call_arg_values avals;
class ipa_call_summary *es = ipa_call_summaries->get (edge);
int min_size = -1;
callee = edge->callee->ultimate_alias_target ();
gcc_checking_assert (edge->inline_failed);
- evaluate_properties_for_edge (edge, true,
- &clause, &nonspec_clause, &known_vals,
- &known_contexts, &known_aggs);
- ipa_call_context ctx (callee, clause, nonspec_clause, known_vals,
- known_contexts, known_aggs, es->param);
+ evaluate_properties_for_edge (edge, true, &clause, &nonspec_clause,
+ &avals, true);
+ ipa_call_context ctx (callee, clause, nonspec_clause, es->param, &avals);
if (node_context_cache != NULL)
{
node_context_summary *e = node_context_cache->get_create (callee);
@@ -255,7 +251,6 @@ do_estimate_edge_time (struct cgraph_edge *edge, sreal *ret_nonspec_time)
: edge->caller->count.ipa ())))
hints |= INLINE_HINT_known_hot;
- ctx.release ();
gcc_checking_assert (size >= 0);
gcc_checking_assert (time >= 0);
@@ -307,9 +302,6 @@ do_estimate_edge_size (struct cgraph_edge *edge)
int size;
struct cgraph_node *callee;
clause_t clause, nonspec_clause;
- auto_vec<tree, 32> known_vals;
- auto_vec<ipa_polymorphic_call_context, 32> known_contexts;
- auto_vec<ipa_agg_value_set, 32> known_aggs;
/* When we do caching, use do_estimate_edge_time to populate the entry. */
@@ -325,14 +317,11 @@ do_estimate_edge_size (struct cgraph_edge *edge)
/* Early inliner runs without caching, go ahead and do the dirty work. */
gcc_checking_assert (edge->inline_failed);
- evaluate_properties_for_edge (edge, true,
- &clause, &nonspec_clause,
- &known_vals, &known_contexts,
- &known_aggs);
- ipa_call_context ctx (callee, clause, nonspec_clause, known_vals,
- known_contexts, known_aggs, vNULL);
+ ipa_auto_call_arg_values avals;
+ evaluate_properties_for_edge (edge, true, &clause, &nonspec_clause,
+ &avals, true);
+ ipa_call_context ctx (callee, clause, nonspec_clause, vNULL, &avals);
ctx.estimate_size_and_time (&size, NULL, NULL, NULL, NULL);
- ctx.release ();
return size;
}
@@ -346,9 +335,6 @@ do_estimate_edge_hints (struct cgraph_edge *edge)
ipa_hints hints;
struct cgraph_node *callee;
clause_t clause, nonspec_clause;
- auto_vec<tree, 32> known_vals;
- auto_vec<ipa_polymorphic_call_context, 32> known_contexts;
- auto_vec<ipa_agg_value_set, 32> known_aggs;
/* When we do caching, use do_estimate_edge_time to populate the entry. */
@@ -364,14 +350,11 @@ do_estimate_edge_hints (struct cgraph_edge *edge)
/* Early inliner runs without caching, go ahead and do the dirty work. */
gcc_checking_assert (edge->inline_failed);
- evaluate_properties_for_edge (edge, true,
- &clause, &nonspec_clause,
- &known_vals, &known_contexts,
- &known_aggs);
- ipa_call_context ctx (callee, clause, nonspec_clause, known_vals,
- known_contexts, known_aggs, vNULL);
+ ipa_auto_call_arg_values avals;
+ evaluate_properties_for_edge (edge, true, &clause, &nonspec_clause,
+ &avals, true);
+ ipa_call_context ctx (callee, clause, nonspec_clause, vNULL, &avals);
ctx.estimate_size_and_time (NULL, NULL, NULL, NULL, &hints);
- ctx.release ();
hints |= simple_edge_hints (edge);
return hints;
}
diff --git a/gcc/ipa-prop.c b/gcc/ipa-prop.c
index b28c78eeab4..230625a89bb 100644
--- a/gcc/ipa-prop.c
+++ b/gcc/ipa-prop.c
@@ -5795,4 +5795,14 @@ ipa_agg_value::equal_to (const ipa_agg_value &other)
return offset == other.offset
&& operand_equal_p (value, other.value, 0);
}
+
+/* Destructor also removing individual aggregate values. */
+
+ipa_auto_call_arg_values::~ipa_auto_call_arg_values ()
+{
+ ipa_release_agg_values (m_known_aggs, false);
+}
+
+
+
#include "gt-ipa-prop.h"
diff --git a/gcc/ipa-prop.h b/gcc/ipa-prop.h
index 23fcf905ef3..8b2edf6300c 100644
--- a/gcc/ipa-prop.h
+++ b/gcc/ipa-prop.h
@@ -433,6 +433,107 @@ ipa_get_jf_ancestor_type_preserved (struct ipa_jump_func *jfunc)
return jfunc->value.ancestor.agg_preserved;
}
+/* Class for allocating a bundle of various potentially known properties about
+ actual arguments of a particular call on stack for the usual case and on
+ heap only if there are unusually many arguments. The data is deallocated
+ when the instance of this class goes out of scope or is otherwise
+ destructed. */
+
+class ipa_auto_call_arg_values
+{
+public:
+ ~ipa_auto_call_arg_values ();
+
+ /* If m_known_vals (vector of known "scalar" values) is sufficiently long,
+ return its element at INDEX, otherwise return NULL. */
+ tree safe_sval_at (int index)
+ {
+ /* TODO: Assert non-negative index here and test. */
+ if ((unsigned) index < m_known_vals.length ())
+ return m_known_vals[index];
+ return NULL;
+ }
+
+ /* If m_known_aggs is sufficiently long, return the pointer to its element
+ at INDEX, otherwise return NULL. */
+ ipa_agg_value_set *safe_aggval_at (int index)
+ {
+ /* TODO: Assert non-negative index here and test. */
+ if ((unsigned) index < m_known_aggs.length ())
+ return &m_known_aggs[index];
+ return NULL;
+ }
+
+ /* Vector describing known values of parameters. */
+ auto_vec<tree, 32> m_known_vals;
+
+ /* Vector describing known polymorphic call contexts. */
+ auto_vec<ipa_polymorphic_call_context, 32> m_known_contexts;
+
+ /* Vector describing known aggregate values. */
+ auto_vec<ipa_agg_value_set, 32> m_known_aggs;
+
+ /* Vector describing known value ranges of arguments. */
+ auto_vec<value_range, 32> m_known_value_ranges;
+};
+
+/* Class bundling the various potentially known properties about actual
+ arguments of a particular call. This variant does not deallocate the
+ bundled data in any way. */
+
+class ipa_call_arg_values
+{
+public:
+ /* Default constructor, setting the vectors to empty ones. */
+ ipa_call_arg_values ()
+ {}
+
+ /* Construct this general variant of the bundle from the variant which uses
+ auto_vecs to hold the vectors. This means that vectors of objects
+ constructed with this constructor should not be changed because if they
+ get reallocated, the member vectors and the underlying auto_vecs would get
+ out of sync. */
+ ipa_call_arg_values (ipa_auto_call_arg_values *aavals)
+ : m_known_vals (aavals->m_known_vals),
+ m_known_contexts (aavals->m_known_contexts),
+ m_known_aggs (aavals->m_known_aggs),
+ m_known_value_ranges (aavals->m_known_value_ranges)
+ {}
+
+ /* If m_known_vals (vector of known "scalar" values) is sufficiently long,
+ return its element at INDEX, otherwise return NULL. */
+ tree safe_sval_at (int index)
+ {
+ /* TODO: Assert non-negative index here and test. */
+ if ((unsigned) index < m_known_vals.length ())
+ return m_known_vals[index];
+ return NULL;
+ }
+
+ /* If m_known_aggs is sufficiently long, return the pointer to its element
+ at INDEX, otherwise return NULL. */
+ ipa_agg_value_set *safe_aggval_at (int index)
+ {
+ /* TODO: Assert non-negative index here and test. */
+ if ((unsigned) index < m_known_aggs.length ())
+ return &m_known_aggs[index];
+ return NULL;
+ }
+
+ /* Vector describing known values of parameters. */
+ vec<tree> m_known_vals = vNULL;
+
+ /* Vector describing known polymorphic call contexts. */
+ vec<ipa_polymorphic_call_context> m_known_contexts = vNULL;
+
+ /* Vector describing known aggregate values. */
+ vec<ipa_agg_value_set> m_known_aggs = vNULL;
+
+ /* Vector describing known value ranges of arguments. */
+ vec<value_range> m_known_value_ranges = vNULL;
+};
+
+
/* Summary describing a single formal parameter. */
struct GTY(()) ipa_param_descriptor
@@ -970,12 +1071,13 @@ void ipa_initialize_node_params (struct cgraph_node *node);
bool ipa_propagate_indirect_call_infos (struct cgraph_edge *cs,
vec<cgraph_edge *> *new_edges);
-/* Indirect edge and binfo processing. */
+/* Indirect edge processing and target discovery. */
tree ipa_get_indirect_edge_target (struct cgraph_edge *ie,
- vec<tree>,
- vec<ipa_polymorphic_call_context>,
- vec<ipa_agg_value_set>,
- bool *);
+ ipa_call_arg_values *avals,
+ bool *speculative);
+tree ipa_get_indirect_edge_target (struct cgraph_edge *ie,
+ ipa_auto_call_arg_values *avals,
+ bool *speculative);
struct cgraph_edge *ipa_make_edge_direct_to_target (struct cgraph_edge *, tree,
bool speculative = false);
tree ipa_impossible_devirt_target (struct cgraph_edge *, tree);
--
2.28.0
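
The patch above pairs an owning bundle (ipa_auto_call_arg_values) with a
non-owning one (ipa_call_arg_values) that is constructed from it. A minimal
sketch of that relationship outside GCC might look as follows. All names
here (auto_arg_values, arg_values, safe_val_at) are hypothetical, std::vector
stands in for auto_vec, a raw pointer plus length stands in for vec, and 0
stands in for a NULL tree:

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical owning bundle: its storage is released automatically when
// the object goes out of scope, like the auto_vec members in the patch.
struct auto_arg_values
{
  std::vector<int> known_vals;
};

// Hypothetical non-owning variant: it only refers to storage held by an
// auto_arg_values and never frees it.  As the comment in the patch warns,
// the owner must not be resized while a view exists.
struct arg_values
{
  arg_values () = default;

  explicit arg_values (auto_arg_values *owner)
    : data (owner->known_vals.data ()), length (owner->known_vals.size ())
  {}

  // Return the element at INDEX if the vector is long enough, otherwise 0
  // (the stand-in for NULL in this sketch).
  int safe_val_at (int index) const
  {
    if (index >= 0 && static_cast<std::size_t> (index) < length)
      return data[index];
    return 0;
  }

  const int *data = nullptr;
  std::size_t length = 0;
};
```

Unlike the accessors in the patch, this sketch rejects negative indices
explicitly instead of relying on the unsigned cast.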
* [PATCH 3/6] ipa: Bundle estimates of ipa_call_context::estimate_size_and_time
2020-09-29 18:12 [PATCH 0/6] IPA cleanups and IPA-CP improvements for 548.exchange2_r Martin Jambor
` (3 preceding siblings ...)
2020-09-28 18:47 ` [PATCH 1/6] ipa: Bundle vectors describing argument values Martin Jambor
@ 2020-09-28 18:47 ` Martin Jambor
2020-09-29 18:39 ` Jan Hubicka
2020-09-28 18:47 ` [PATCH 2/6] ipa: Introduce ipa_cached_call_context Martin Jambor
5 siblings, 1 reply; 17+ messages in thread
From: Martin Jambor @ 2020-09-28 18:47 UTC (permalink / raw)
To: GCC Patches; +Cc: Jan Hubicka
A subsequent patch adds another two estimates computed by
ipa_call_context::estimate_size_and_time. Because the function has a
separate output parameter for each value it computes, that would leave it
with too many parameters. Therefore, this patch collapses all those output
parameters into one output structure.
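
As a rough illustration of the interface change described above (not the
actual GCC code), the sketch below replaces per-value output pointers with a
single structure plus two flags that select the optional work. The names
call_estimates and time_delta are hypothetical, double stands in for sreal,
and the numbers are made up:

```cpp
#include <cassert>

// Hypothetical stand-in for ipa_call_estimates.
struct call_estimates
{
  int size = 0;
  int min_size = 0;
  double time = 0;
  double nonspecialized_time = 0;
  unsigned hints = 0;
};

// Fill ESTIMATES.  The size fields are always computed; the time fields
// only if EST_TIMES is true and the hints only if EST_HINTS is true.
void
estimate_size_and_time (call_estimates *estimates,
			bool est_times = true, bool est_hints = true)
{
  estimates->size = 12;
  estimates->min_size = 4;
  if (est_times)
    {
      estimates->nonspecialized_time = 10.0;
      estimates->time = 7.5;
    }
  if (est_hints)
    estimates->hints = 0x1;
}

// A caller derives the time saved by specialization from the two time
// fields and caps it, similar to perform_estimation_of_a_value.
double
time_delta (const call_estimates &e)
{
  double delta = e.nonspecialized_time - e.time;
  return delta > 65535 ? 65535 : delta;
}
```

With this shape, adding another estimate later means adding a field rather
than changing every caller's argument list.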
gcc/ChangeLog:
2020-09-02 Martin Jambor <mjambor@suse.cz>
* ipa-inline-analysis.c (do_estimate_edge_time): Adjusted to use
ipa_call_estimates.
(do_estimate_edge_size): Likewise.
(do_estimate_edge_hints): Likewise.
* ipa-fnsummary.h (struct ipa_call_estimates): New type.
(ipa_call_context::estimate_size_and_time): Adjusted declaration.
(estimate_ipcp_clone_size_and_time): Likewise.
* ipa-cp.c (hint_time_bonus): Changed the type of the second argument
to ipa_call_estimates.
(perform_estimation_of_a_value): Adjusted to use ipa_call_estimates.
(estimate_local_effects): Likewise.
* ipa-fnsummary.c (ipa_call_context::estimate_size_and_time): Adjusted
to return estimates in a single ipa_call_estimates parameter.
(estimate_ipcp_clone_size_and_time): Likewise.
---
gcc/ipa-cp.c | 45 ++++++++++++++---------------
gcc/ipa-fnsummary.c | 60 +++++++++++++++++++--------------------
gcc/ipa-fnsummary.h | 36 +++++++++++++++++------
gcc/ipa-inline-analysis.c | 47 +++++++++++++++++-------------
4 files changed, 105 insertions(+), 83 deletions(-)
diff --git a/gcc/ipa-cp.c b/gcc/ipa-cp.c
index 292dd7e5bdf..77c84a6ed5d 100644
--- a/gcc/ipa-cp.c
+++ b/gcc/ipa-cp.c
@@ -3196,12 +3196,13 @@ devirtualization_time_bonus (struct cgraph_node *node,
return res;
}
-/* Return time bonus incurred because of HINTS. */
+/* Return time bonus incurred because of hints stored in ESTIMATES. */
static int
-hint_time_bonus (cgraph_node *node, ipa_hints hints)
+hint_time_bonus (cgraph_node *node, const ipa_call_estimates &estimates)
{
int result = 0;
+ ipa_hints hints = estimates.hints;
if (hints & (INLINE_HINT_loop_iterations | INLINE_HINT_loop_stride))
result += opt_for_fn (node->decl, param_ipa_cp_loop_hint_bonus);
return result;
@@ -3397,15 +3398,13 @@ perform_estimation_of_a_value (cgraph_node *node,
int removable_params_cost, int est_move_cost,
ipcp_value_base *val)
{
- int size, time_benefit;
- sreal time, base_time;
- ipa_hints hints;
+ int time_benefit;
+ ipa_call_estimates estimates;
- estimate_ipcp_clone_size_and_time (node, avals, &size, &time,
- &base_time, &hints);
- base_time -= time;
- if (base_time > 65535)
- base_time = 65535;
+ estimate_ipcp_clone_size_and_time (node, avals, &estimates);
+ sreal time_delta = estimates.nonspecialized_time - estimates.time;
+ if (time_delta > 65535)
+ time_delta = 65535;
/* Extern inline functions have no cloning local time benefits because they
will be inlined anyway. The only reason to clone them is if it enables
@@ -3413,11 +3412,12 @@ perform_estimation_of_a_value (cgraph_node *node,
if (DECL_EXTERNAL (node->decl) && DECL_DECLARED_INLINE_P (node->decl))
time_benefit = 0;
else
- time_benefit = base_time.to_int ()
+ time_benefit = time_delta.to_int ()
+ devirtualization_time_bonus (node, avals)
- + hint_time_bonus (node, hints)
+ + hint_time_bonus (node, estimates)
+ removable_params_cost + est_move_cost;
+ int size = estimates.size;
gcc_checking_assert (size >=0);
/* The inliner-heuristics based estimates may think that in certain
contexts some functions do not have any size at all but we want
@@ -3472,23 +3472,21 @@ estimate_local_effects (struct cgraph_node *node)
|| (removable_params_cost && node->can_change_signature))
{
struct caller_statistics stats;
- ipa_hints hints;
- sreal time, base_time;
- int size;
+ ipa_call_estimates estimates;
init_caller_stats (&stats);
node->call_for_symbol_thunks_and_aliases (gather_caller_stats, &stats,
false);
- estimate_ipcp_clone_size_and_time (node, &avals, &size, &time,
- &base_time, &hints);
- time -= devirt_bonus;
- time -= hint_time_bonus (node, hints);
- time -= removable_params_cost;
- size -= stats.n_calls * removable_params_cost;
+ estimate_ipcp_clone_size_and_time (node, &avals, &estimates);
+ sreal time = estimates.nonspecialized_time - estimates.time;
+ time += devirt_bonus;
+ time += hint_time_bonus (node, estimates);
+ time += removable_params_cost;
+ int size = estimates.size - stats.n_calls * removable_params_cost;
if (dump_file)
fprintf (dump_file, " - context independent values, size: %i, "
- "time_benefit: %f\n", size, (base_time - time).to_double ());
+ "time_benefit: %f\n", size, (time).to_double ());
if (size <= 0 || node->local)
{
@@ -3499,8 +3497,7 @@ estimate_local_effects (struct cgraph_node *node)
"known contexts, code not going to grow.\n");
}
else if (good_cloning_opportunity_p (node,
- MIN ((base_time - time).to_int (),
- 65536),
+ MIN ((time).to_int (), 65536),
stats.freq_sum, stats.count_sum,
size))
{
diff --git a/gcc/ipa-fnsummary.c b/gcc/ipa-fnsummary.c
index 4ef7d2570e9..6082f34d63f 100644
--- a/gcc/ipa-fnsummary.c
+++ b/gcc/ipa-fnsummary.c
@@ -3536,18 +3536,14 @@ ipa_call_context::equal_to (const ipa_call_context &ctx)
return true;
}
-/* Estimate size and time needed to execute call in the given context.
- Additionally determine hints determined by the context. Finally compute
- minimal size needed for the call that is independent on the call context and
- can be used for fast estimates. Return the values in RET_SIZE,
- RET_MIN_SIZE, RET_TIME and RET_HINTS. */
+/* Fill in the selected fields in ESTIMATES with values estimated for a call in
+ this context. Always compute size and min_size. Only compute time and
+ nonspecialized_time if EST_TIMES is true. Only compute hints if EST_HINTS
+ is true. */
void
-ipa_call_context::estimate_size_and_time (int *ret_size,
- int *ret_min_size,
- sreal *ret_time,
- sreal *ret_nonspecialized_time,
- ipa_hints *ret_hints)
+ipa_call_context::estimate_size_and_time (ipa_call_estimates *estimates,
+ bool est_times, bool est_hints)
{
class ipa_fn_summary *info = ipa_fn_summaries->get (m_node);
size_time_entry *e;
@@ -3577,8 +3573,8 @@ ipa_call_context::estimate_size_and_time (int *ret_size,
if (m_node->callees || m_node->indirect_calls)
estimate_calls_size_and_time (m_node, &size, &min_size,
- ret_time ? &time : NULL,
- ret_hints ? &hints : NULL, m_possible_truths,
+ est_times ? &time : NULL,
+ est_hints ? &hints : NULL, m_possible_truths,
&m_avals);
sreal nonspecialized_time = time;
@@ -3605,7 +3601,7 @@ ipa_call_context::estimate_size_and_time (int *ret_size,
known to be constant in a specialized setting. */
if (nonconst)
size += e->size;
- if (!ret_time)
+ if (!est_times)
continue;
nonspecialized_time += e->time;
if (!nonconst)
@@ -3645,7 +3641,7 @@ ipa_call_context::estimate_size_and_time (int *ret_size,
if (time > nonspecialized_time)
time = nonspecialized_time;
- if (ret_hints)
+ if (est_hints)
{
if (info->loop_iterations
&& !info->loop_iterations->evaluate (m_possible_truths))
@@ -3663,18 +3659,23 @@ ipa_call_context::estimate_size_and_time (int *ret_size,
min_size = RDIV (min_size, ipa_fn_summary::size_scale);
if (dump_file && (dump_flags & TDF_DETAILS))
- fprintf (dump_file, "\n size:%i time:%f nonspec time:%f\n", (int) size,
- time.to_double (), nonspecialized_time.to_double ());
- if (ret_time)
- *ret_time = time;
- if (ret_nonspecialized_time)
- *ret_nonspecialized_time = nonspecialized_time;
- if (ret_size)
- *ret_size = size;
- if (ret_min_size)
- *ret_min_size = min_size;
- if (ret_hints)
- *ret_hints = hints;
+ {
+ if (est_times)
+ fprintf (dump_file, "\n size:%i time:%f nonspec time:%f\n",
+ (int) size, time.to_double (),
+ nonspecialized_time.to_double ());
+ else
+ fprintf (dump_file, "\n size:%i (time not estimated)\n", (int) size);
+ }
+ if (est_times)
+ {
+ estimates->time = time;
+ estimates->nonspecialized_time = nonspecialized_time;
+ }
+ estimates->size = size;
+ estimates->min_size = min_size;
+ if (est_hints)
+ estimates->hints = hints;
return;
}
@@ -3687,17 +3688,14 @@ ipa_call_context::estimate_size_and_time (int *ret_size,
void
estimate_ipcp_clone_size_and_time (struct cgraph_node *node,
ipa_auto_call_arg_values *avals,
- int *ret_size, sreal *ret_time,
- sreal *ret_nonspec_time,
- ipa_hints *hints)
+ ipa_call_estimates *estimates)
{
clause_t clause, nonspec_clause;
evaluate_conditions_for_known_args (node, false, avals, &clause,
&nonspec_clause);
ipa_call_context ctx (node, clause, nonspec_clause, vNULL, avals);
- ctx.estimate_size_and_time (ret_size, NULL, ret_time,
- ret_nonspec_time, hints);
+ ctx.estimate_size_and_time (estimates);
}
/* Return stack frame offset where frame of NODE is supposed to start inside
diff --git a/gcc/ipa-fnsummary.h b/gcc/ipa-fnsummary.h
index 020a6f0425d..ccb6b432f0b 100644
--- a/gcc/ipa-fnsummary.h
+++ b/gcc/ipa-fnsummary.h
@@ -287,6 +287,29 @@ public:
ipa_call_summary *dst_data);
};
+/* Estimated execution times, code sizes and other information about the
+ code executing a call described by ipa_call_context. */
+
+struct ipa_call_estimates
+{
+ /* Estimated size needed to execute call in the given context. */
+ int size;
+
+ /* Minimal size needed for the call that is independent of the call context
+ and can be used for fast estimates. */
+ int min_size;
+
+ /* Estimated time needed to execute call in the given context. */
+ sreal time;
+
+ /* Estimated time needed to execute the function when not ignoring
+ computations known to be constant in this context. */
+ sreal nonspecialized_time;
+
+ /* Further discovered reasons to inline or specialize the given call. */
+ ipa_hints hints;
+};
+
class ipa_cached_call_context;
/* This object describe a context of call. That is a summary of known
@@ -305,10 +328,8 @@ public:
: m_node(NULL)
{
}
- void estimate_size_and_time (int *ret_size, int *ret_min_size,
- sreal *ret_time,
- sreal *ret_nonspecialized_time,
- ipa_hints *ret_hints);
+ void estimate_size_and_time (ipa_call_estimates *estimates,
+ bool est_times = true, bool est_hints = true);
bool equal_to (const ipa_call_context &);
bool exists_p ()
{
@@ -353,10 +374,9 @@ void ipa_dump_hints (FILE *f, ipa_hints);
void ipa_free_fn_summary (void);
void ipa_free_size_summary (void);
void inline_analyze_function (struct cgraph_node *node);
-void estimate_ipcp_clone_size_and_time (struct cgraph_node *,
- ipa_auto_call_arg_values *,
- int *, sreal *, sreal *,
- ipa_hints *);
+void estimate_ipcp_clone_size_and_time (struct cgraph_node *node,
+ ipa_auto_call_arg_values *avals,
+ ipa_call_estimates *estimates);
void ipa_merge_fn_summary_after_inlining (struct cgraph_edge *edge);
void ipa_update_overall_fn_summary (struct cgraph_node *node, bool reset = true);
void compute_fn_summary (struct cgraph_node *, bool);
diff --git a/gcc/ipa-inline-analysis.c b/gcc/ipa-inline-analysis.c
index b7af77f7b9b..acbf82e84d9 100644
--- a/gcc/ipa-inline-analysis.c
+++ b/gcc/ipa-inline-analysis.c
@@ -208,16 +208,12 @@ do_estimate_edge_time (struct cgraph_edge *edge, sreal *ret_nonspec_time)
&& !opt_for_fn (callee->decl, flag_profile_partial_training)
&& !callee->count.ipa_p ())
{
- sreal chk_time, chk_nonspec_time;
- int chk_size, chk_min_size;
-
- ipa_hints chk_hints;
- ctx.estimate_size_and_time (&chk_size, &chk_min_size,
- &chk_time, &chk_nonspec_time,
- &chk_hints);
- gcc_assert (chk_size == size && chk_time == time
- && chk_nonspec_time == nonspec_time
- && chk_hints == hints);
+ ipa_call_estimates chk_estimates;
+ ctx.estimate_size_and_time (&chk_estimates);
+ gcc_assert (chk_estimates.size == size
+ && chk_estimates.time == time
+ && chk_estimates.nonspecialized_time == nonspec_time
+ && chk_estimates.hints == hints);
}
}
else
@@ -227,18 +223,28 @@ do_estimate_edge_time (struct cgraph_edge *edge, sreal *ret_nonspec_time)
else
node_context_cache_clear++;
e->entry.ctx.release ();
- ctx.estimate_size_and_time (&size, &min_size,
- &time, &nonspec_time, &hints);
+ ipa_call_estimates estimates;
+ ctx.estimate_size_and_time (&estimates);
+ size = estimates.size;
e->entry.size = size;
+ time = estimates.time;
e->entry.time = time;
+ nonspec_time = estimates.nonspecialized_time;
e->entry.nonspec_time = nonspec_time;
+ hints = estimates.hints;
e->entry.hints = hints;
e->entry.ctx.duplicate_from (ctx);
}
}
else
- ctx.estimate_size_and_time (&size, &min_size,
- &time, &nonspec_time, &hints);
+ {
+ ipa_call_estimates estimates;
+ ctx.estimate_size_and_time (&estimates);
+ size = estimates.size;
+ time = estimates.time;
+ nonspec_time = estimates.nonspecialized_time;
+ hints = estimates.hints;
+ }
/* When we have profile feedback, we can quite safely identify hot
edges and for those we disable size limits. Don't do that when
@@ -321,8 +327,9 @@ do_estimate_edge_size (struct cgraph_edge *edge)
evaluate_properties_for_edge (edge, true, &clause, &nonspec_clause,
&avals, true);
ipa_call_context ctx (callee, clause, nonspec_clause, vNULL, &avals);
- ctx.estimate_size_and_time (&size, NULL, NULL, NULL, NULL);
- return size;
+ ipa_call_estimates estimates;
+ ctx.estimate_size_and_time (&estimates, false, false);
+ return estimates.size;
}
@@ -332,7 +339,6 @@ do_estimate_edge_size (struct cgraph_edge *edge)
ipa_hints
do_estimate_edge_hints (struct cgraph_edge *edge)
{
- ipa_hints hints;
struct cgraph_node *callee;
clause_t clause, nonspec_clause;
@@ -341,7 +347,7 @@ do_estimate_edge_hints (struct cgraph_edge *edge)
if (edge_growth_cache != NULL)
{
do_estimate_edge_time (edge);
- hints = edge_growth_cache->get (edge)->hints;
+ ipa_hints hints = edge_growth_cache->get (edge)->hints;
gcc_checking_assert (hints);
return hints - 1;
}
@@ -354,8 +360,9 @@ do_estimate_edge_hints (struct cgraph_edge *edge)
evaluate_properties_for_edge (edge, true, &clause, &nonspec_clause,
&avals, true);
ipa_call_context ctx (callee, clause, nonspec_clause, vNULL, &avals);
- ctx.estimate_size_and_time (NULL, NULL, NULL, NULL, &hints);
- hints |= simple_edge_hints (edge);
+ ipa_call_estimates estimates;
+ ctx.estimate_size_and_time (&estimates, false, true);
+ ipa_hints hints = estimates.hints | simple_edge_hints (edge);
return hints;
}
--
2.28.0
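
For readers skimming the interface change above, here is a minimal
standalone sketch of the calling convention it introduces: one result
structure plus two flags in place of five nullable out-pointers.  It is
an illustration only, not GCC code.  The type call_estimates, the
helper estimate_size_only and all constant values are hypothetical;
double stands in for sreal and unsigned for ipa_hints so that the
example compiles outside the GCC tree.

```cpp
// Simplified stand-in for ipa_call_estimates (hypothetical type).
// Unlike the struct in the patch, the fields are zero-initialized here
// so that skipped fields have a defined value.
struct call_estimates
{
  int size = 0;
  int min_size = 0;
  double time = 0.0;                // sreal in the patch
  double nonspecialized_time = 0.0; // sreal in the patch
  unsigned hints = 0;               // ipa_hints in the patch
};

// Hypothetical estimator with made-up constants.  The size fields are
// always written; time and hints are written only on request, mirroring
// the est_times and est_hints parameters of the patched method.
void
estimate_size_and_time (call_estimates *estimates,
                        bool est_times = true, bool est_hints = true)
{
  estimates->size = 12;
  estimates->min_size = 4;
  if (est_times)
    {
      estimates->time = 3.5;
      estimates->nonspecialized_time = 5.0;
    }
  if (est_hints)
    estimates->hints = 1u;
}

// Size-only query, analogous to the patched do_estimate_edge_size.
int
estimate_size_only ()
{
  call_estimates estimates;
  estimate_size_and_time (&estimates, false, false);
  return estimates.size;
}
```

Note that in the patch itself the time and hints fields are left
unwritten when the corresponding flag is false, so callers passing
false should read only the fields they asked for.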