* [PATCH] Setup predicate for switch default case in IPA (PR ipa/91089)
@ 2019-07-12 8:51 Feng Xue OS
2019-08-29 15:14 ` Martin Jambor
0 siblings, 1 reply; 11+ messages in thread
From: Feng Xue OS @ 2019-07-12 8:51 UTC (permalink / raw)
To: gcc-patches, Martin Jambor, Jan Hubicka
IPA does not construct executability predicate for default case of switch statement.
So execution cost of default case is not properly evaluated in IPA-cp, this might
prevent function clone for function containing switch statement, if certain non-default
case is proved to be executed after constant propagation.
This patch is composed to deduce predicate for default case, if it turns out to be a
relative simple one, for example, we can try to merge case range, and use
comparison upon range bounds, and also range analysis information to simplify
predicate.
Feng
----
diff --git a/gcc/ChangeLog b/gcc/ChangeLog
index 3d92250b520..4de2f568990 100644
--- a/gcc/ChangeLog
+++ b/gcc/ChangeLog
@@ -1,3 +1,10 @@
+2019-07-12 Feng Xue <fxue@os.amperecomputing.com>
+
+ PR ipa/91089
+ * ipa-fnsummary.c (set_switch_stmt_execution_predicate): Add predicate
+ for switch default case using range analysis information.
+ * params.def (PARAM_IPA_MAX_SWITCH_PREDICATE_BOUNDS): New.
+
2019-07-11 Sunil K Pandey <sunil.k.pandey@intel.com>
PR target/90980
diff --git a/gcc/ipa-fnsummary.c b/gcc/ipa-fnsummary.c
index 09986211a1d..141fdce53c2 100644
--- a/gcc/ipa-fnsummary.c
+++ b/gcc/ipa-fnsummary.c
@@ -1289,30 +1289,35 @@ set_switch_stmt_execution_predicate (struct ipa_func_body_info *fbi,
if (!unmodified_parm_or_parm_agg_item (fbi, last, op, &index, &size, &aggpos))
return;
+ auto_vec<std::pair<tree, tree> > ranges;
+ tree type = TREE_TYPE (op);
+ tree one_cst = build_one_cst (type);
+ int bound_limit = PARAM_VALUE (PARAM_IPA_MAX_SWITCH_PREDICATE_BOUNDS);
+ int bound_count = 0;
+ value_range_base vr;
+
+ get_range_info (op, vr);
+
FOR_EACH_EDGE (e, ei, bb->succs)
{
e->aux = edge_predicate_pool.allocate ();
*(predicate *) e->aux = false;
}
n = gimple_switch_num_labels (last);
- for (case_idx = 0; case_idx < n; ++case_idx)
+ for (case_idx = 1; case_idx < n; ++case_idx)
{
tree cl = gimple_switch_label (last, case_idx);
- tree min, max;
+ tree min = CASE_LOW (cl);
+ tree max = CASE_HIGH (cl);
predicate p;
- e = gimple_switch_edge (cfun, last, case_idx);
- min = CASE_LOW (cl);
- max = CASE_HIGH (cl);
-
- /* For default we might want to construct predicate that none
- of cases is met, but it is bit hard to do not having negations
- of conditionals handy. */
- if (!min && !max)
- p = true;
- else if (!max)
- p = add_condition (summary, index, size, &aggpos, EQ_EXPR,
- unshare_expr_without_location (min));
+ if (!max)
+ {
+ max = min;
+
+ p = add_condition (summary, index, size, &aggpos, EQ_EXPR,
+ unshare_expr_without_location (min));
+ }
else
{
predicate p1, p2;
@@ -1322,9 +1327,112 @@ set_switch_stmt_execution_predicate (struct ipa_func_body_info *fbi,
unshare_expr_without_location (max));
p = p1 & p2;
}
+
+ e = gimple_switch_edge (cfun, last, case_idx);
*(class predicate *) e->aux
= p.or_with (summary->conds, *(class predicate *) e->aux);
+
+ /* If there are too many disjoint case ranges, predicate for default
+ case might become too complicated. So add a limit here. */
+ if (bound_count > bound_limit)
+ continue;
+
+ bool new_range = true;
+
+ if (!ranges.is_empty ())
+ {
+ tree last_max_plus_one
+ = int_const_binop (PLUS_EXPR, ranges.last ().second, one_cst);
+
+ /* Merge case ranges if they are continuous. */
+ if (tree_int_cst_equal (min, last_max_plus_one))
+ new_range = false;
+ else if (vr.kind () == VR_ANTI_RANGE)
+ {
+ tree min_minus_one = int_const_binop (MINUS_EXPR, min, one_cst);
+
+ /* If anti-range of switch index can connect two disjoint case
+ ranges, combine them to one range. */
+ if (tree_int_cst_lt (vr.max (), min_minus_one))
+ vr.set_undefined ();
+ else if (tree_int_cst_le (vr.min (), last_max_plus_one))
+ new_range = false;
+ }
+ }
+
+ /* Create/extend a case range. And we count endpoints of range set,
+ this number nearly equals to number of conditions that we will create
+ for predicate of default case. */
+ if (new_range)
+ {
+ bound_count += (min == max) ? 1 : 2;
+ ranges.safe_push (std::make_pair (min, max));
+ }
+ else
+ {
+ bound_count += (ranges.last ().first == ranges.last ().second);
+ ranges.last ().second = max;
+ }
+ }
+
+ e = gimple_switch_edge (cfun, last, 0);
+ if (bound_count > bound_limit)
+ {
+ *(class predicate *) e->aux = true;
+ return;
+ }
+
+ tree vr_min = vr.kind () == VR_RANGE ? vr.min () : TYPE_MIN_VALUE (type);
+ tree vr_max = vr.kind () == VR_RANGE ? vr.max () : TYPE_MAX_VALUE (type);
+ predicate p_seg = true;
+ predicate p_all = false;
+
+ /* Construct predicate to represent default range set that is negation of
+ all case ranges. Case range is classified as containing single/non-single
+ values. Suppose a piece of case ranges in the following.
+
+ [D1...D2] [S1] ... [Sn] [D3...D4]
+
+ To represent default case's range sets between two non-single value
+ case ranges (From D2 to D3), we construct predicate as:
+
+ D2 < x < D3 && x != S1 && ... && x != Sn
+ */
+ for (size_t i = 0; i < ranges.length (); i++)
+ {
+ tree min = ranges[i].first;
+ tree max = ranges[i].second;
+
+ if (min == max)
+ p_seg &= add_condition (summary, index, size, &aggpos, NE_EXPR,
+ unshare_expr_without_location (min));
+ else
+ {
+ /* Do not create sub-predicate for range that is beyond low bound
+ of switch index. */
+ if (tree_int_cst_lt (vr_min, min))
+ {
+ p_seg &= add_condition (summary, index, size, &aggpos, LT_EXPR,
+ unshare_expr_without_location (min));
+ p_all = p_all.or_with (summary->conds, p_seg);
+ }
+
+ /* Do not create sub-predicate for range that is beyond up bound
+ of switch index. */
+ if (tree_int_cst_le (vr_max, max))
+ {
+ p_seg = false;
+ break;
+ }
+
+ p_seg = add_condition (summary, index, size, &aggpos, GT_EXPR,
+ unshare_expr_without_location (max));
+ }
}
+
+ p_all = p_all.or_with (summary->conds, p_seg);
+ *(class predicate *) e->aux
+ = p_all.or_with (summary->conds, *(class predicate *) e->aux);
}
diff --git a/gcc/params.def b/gcc/params.def
index 4567c17ba60..ff6920f287a 100644
--- a/gcc/params.def
+++ b/gcc/params.def
@@ -1121,6 +1121,14 @@ DEFPARAM (PARAM_IPA_MAX_AA_STEPS,
"parameter analysis based on alias analysis in any given function.",
25000, 0, 0)
+DEFPARAM (PARAM_IPA_MAX_SWITCH_PREDICATE_BOUNDS,
+ "ipa-max-switch-predicate-bounds",
+ "Maximal number of boundary endpoints of case ranges of switch "
+ "statement. For switch exceeding this limit, do not construct cost-"
+ "evaluating predicate for default case during IPA function summary"
+ "generation.",
+ 5, 0, 0)
+
/* WHOPR partitioning configuration. */
DEFPARAM (PARAM_LTO_PARTITIONS,
diff --git a/gcc/testsuite/ChangeLog b/gcc/testsuite/ChangeLog
index 6c40496db9c..ccb92bb8f7f 100644
--- a/gcc/testsuite/ChangeLog
+++ b/gcc/testsuite/ChangeLog
@@ -1,3 +1,8 @@
+2019-07-12 Feng Xue <fxue@os.amperecomputing.com>
+
+ PR ipa/91089
+ * gcc.dg/ipa/pr91089.c: New test.
+
2019-07-11 Sunil K Pandey <sunil.k.pandey@intel.com>
PR target/90980
diff --git a/gcc/testsuite/gcc.dg/ipa/pr91089.c b/gcc/testsuite/gcc.dg/ipa/pr91089.c
new file mode 100644
index 00000000000..fa3111f565f
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/ipa/pr91089.c
@@ -0,0 +1,62 @@
+/* { dg-do compile } */
+/* { dg-options "-O3 -fdump-ipa-cp-details -fdump-ipa-fnsummary-details --param ipa-max-switch-predicate-bounds=10 -fno-inline" } */
+
+int fn();
+
+int data;
+
+int callee(int i)
+{
+ switch (i)
+ {
+ case -126: return i + 13;
+ case -127: return i + 5;
+ case -8: return i * i;
+ case 0: return i % 9;
+ case 5:
+ case 7:
+ case 6: return 3;
+ default:
+ fn();
+ fn();
+ fn();
+ fn();
+ fn();
+ fn();
+ fn();
+ fn();
+ fn();
+ fn();
+ fn();
+ fn();
+ fn();
+ fn();
+ fn();
+ fn();
+ fn();
+ fn();
+ fn();
+ }
+
+ return data += i;
+}
+
+int caller()
+{
+ return callee (-127) +
+ callee (-126) +
+ callee (-8) +
+ callee (0) +
+ callee (5) +
+ callee (6) +
+ callee (7) +
+ callee (100);
+}
+
+/* { dg-final { scan-ipa-dump-times "Creating a specialized node of callee" 7 "cp" } } */
+/* { dg-final { scan-ipa-dump "op0 < -127" "fnsummary" } } */
+/* { dg-final { scan-ipa-dump "op0 > -126" "fnsummary" } } */
+/* { dg-final { scan-ipa-dump "op0 != -8" "fnsummary" } } */
+/* { dg-final { scan-ipa-dump "op0 != 0" "fnsummary" } } */
+/* { dg-final { scan-ipa-dump "op0 < 5" "fnsummary" } } */
+/* { dg-final { scan-ipa-dump "op0 > 7" "fnsummary" } } */
--
2.17.1
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH] Setup predicate for switch default case in IPA (PR ipa/91089)
2019-07-12 8:51 [PATCH] Setup predicate for switch default case in IPA (PR ipa/91089) Feng Xue OS
@ 2019-08-29 15:14 ` Martin Jambor
2019-08-30 8:32 ` Feng Xue OS
2019-09-02 3:06 ` [PATCH V2] " Feng Xue OS
0 siblings, 2 replies; 11+ messages in thread
From: Martin Jambor @ 2019-08-29 15:14 UTC (permalink / raw)
To: Feng Xue OS, gcc-patches, Jan Hubicka
Hi,
On Fri, Jul 12 2019, Feng Xue OS wrote:
> IPA does not construct executability predicate for default case of switch statement.
> So execution cost of default case is not properly evaluated in IPA-cp, this might
> prevent function clone for function containing switch statement, if certain non-default
> case is proved to be executed after constant propagation.
>
> This patch is composed to deduce predicate for default case, if it turns out to be a
> relative simple one, for example, we can try to merge case range, and use
> comparison upon range bounds, and also range analysis information to simplify
> predicate.
>
I have read through the patch and it looks OK to me but I cannot approve
it, you have to ping Honza for that. Since you decided to use the value
range info, it would be nice if you could also add a testcase where it
plays a role. Also, please don't post changelog entries as a part of
the patch, it basically guarantees it will not apply for anybody, not
even for you when you update your trunk.
Thanks for working on this,
Martin
> Feng
>
> ----
> diff --git a/gcc/ChangeLog b/gcc/ChangeLog
> index 3d92250b520..4de2f568990 100644
> --- a/gcc/ChangeLog
> +++ b/gcc/ChangeLog
> @@ -1,3 +1,10 @@
> +2019-07-12 Feng Xue <fxue@os.amperecomputing.com>
> +
> + PR ipa/91089
> + * ipa-fnsummary.c (set_switch_stmt_execution_predicate): Add predicate
> + for switch default case using range analysis information.
> + * params.def (PARAM_IPA_MAX_SWITCH_PREDICATE_BOUNDS): New.
> +
> 2019-07-11 Sunil K Pandey <sunil.k.pandey@intel.com>
>
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH] Setup predicate for switch default case in IPA (PR ipa/91089)
2019-08-29 15:14 ` Martin Jambor
@ 2019-08-30 8:32 ` Feng Xue OS
2019-09-02 3:06 ` [PATCH V2] " Feng Xue OS
1 sibling, 0 replies; 11+ messages in thread
From: Feng Xue OS @ 2019-08-30 8:32 UTC (permalink / raw)
To: Martin Jambor, gcc-patches, Jan Hubicka
That's good. Thanks for your comments.
Feng
________________________________________
From: Martin Jambor <mjambor@suse.cz>
Sent: Thursday, August 29, 2019 11:00 PM
To: Feng Xue OS; gcc-patches@gcc.gnu.org; Jan Hubicka
Subject: Re: [PATCH] Setup predicate for switch default case in IPA (PR ipa/91089)
Hi,
On Fri, Jul 12 2019, Feng Xue OS wrote:
> IPA does not construct executability predicate for default case of switch statement.
> So execution cost of default case is not properly evaluated in IPA-cp, this might
> prevent function clone for function containing switch statement, if certain non-default
> case is proved to be executed after constant propagation.
>
> This patch is composed to deduce predicate for default case, if it turns out to be a
> relative simple one, for example, we can try to merge case range, and use
> comparison upon range bounds, and also range analysis information to simplify
> predicate.
>
I have read through the patch and it looks OK to me but I cannot approve
it, you have to ping Honza for that. Since you decided to use the value
range info, it would be nice if you could also add a testcase where it
plays a role. Also, please don't post changelog entries as a part of
the patch, it basically guarantees it will not apply for anybody, not
even for you when you update your trunk.
Thanks for working on this,
Martin
> Feng
>
> ----
> diff --git a/gcc/ChangeLog b/gcc/ChangeLog
> index 3d92250b520..4de2f568990 100644
> --- a/gcc/ChangeLog
> +++ b/gcc/ChangeLog
> @@ -1,3 +1,10 @@
> +2019-07-12 Feng Xue <fxue@os.amperecomputing.com>
> +
> + PR ipa/91089
> + * ipa-fnsummary.c (set_switch_stmt_execution_predicate): Add predicate
> + for switch default case using range analysis information.
> + * params.def (PARAM_IPA_MAX_SWITCH_PREDICATE_BOUNDS): New.
> +
> 2019-07-11 Sunil K Pandey <sunil.k.pandey@intel.com>
>
^ permalink raw reply [flat|nested] 11+ messages in thread
* [PATCH V2] Setup predicate for switch default case in IPA (PR ipa/91089)
2019-08-29 15:14 ` Martin Jambor
2019-08-30 8:32 ` Feng Xue OS
@ 2019-09-02 3:06 ` Feng Xue OS
2019-09-14 15:57 ` Jan Hubicka
1 sibling, 1 reply; 11+ messages in thread
From: Feng Xue OS @ 2019-09-02 3:06 UTC (permalink / raw)
To: Martin Jambor, Jan Hubicka, gcc-patches
> I have read through the patch and it looks OK to me but I cannot approve
> it, you have to ping Honza for that. Since you decided to use the value
> range info, it would be nice if you could also add a testcase where it
> plays a role.
It is somewhat hard to write a testcase to show role of range info only with
this patch. If another patch "Generalized predicate/condition for parameter
reference in IPA (PR ipa/91088)" is accepted, it will become easy and I will
update this testcase to show that.
And this new version also resolves a problem that we might generate very
complicated predicate for basic block in convergence point reached from
all switch cases.
switch (index)
/ | \
case1 case2 ... default
\ | /
convergence point
For switch/if statement, current algorithm synthesizes predicate of
convergence point by OR-combining predicates of all its cases/branches, which
might give us a very complicated one. Actually, we can get simplified predicate
by using the fact that predicate for a basic block must also hold true for its post
dominators. To be specific, convergence point should include (at least equal to)
predicate of the switch/if statement.
Honza & Martin,
Would please review this new patch when you have time? Thanks.
Feng
-----
diff --git a/gcc/ipa-fnsummary.c b/gcc/ipa-fnsummary.c
index 278bf606661..dc5752fc005 100644
--- a/gcc/ipa-fnsummary.c
+++ b/gcc/ipa-fnsummary.c
@@ -1198,7 +1198,8 @@ set_cond_stmt_execution_predicate (struct ipa_func_body_info *fbi,
/* invert_tree_comparison will return ERROR_MARK on FP
comparsions that are not EQ/NE instead of returning proper
unordered one. Be sure it is not confused with NON_CONSTANT. */
- if (this_code != ERROR_MARK)
+ if (this_code != ERROR_MARK
+ && !dominated_by_p (CDI_POST_DOMINATORS, bb, e->dest))
{
predicate p
= add_condition (summary, index, size, &aggpos, this_code,
@@ -1269,13 +1270,31 @@ set_switch_stmt_execution_predicate (struct ipa_func_body_info *fbi,
if (!unmodified_parm_or_parm_agg_item (fbi, last, op, &index, &size, &aggpos))
return;
+ auto_vec<std::pair<tree, tree> > ranges;
+ tree type = TREE_TYPE (op);
+ tree one_cst = build_one_cst (type);
+ int bound_limit = PARAM_VALUE (PARAM_IPA_MAX_SWITCH_PREDICATE_BOUNDS);
+ int bound_count = 0;
+ value_range_base vr;
+
+ get_range_info (op, vr);
+
FOR_EACH_EDGE (e, ei, bb->succs)
{
e->aux = edge_predicate_pool.allocate ();
*(predicate *) e->aux = false;
}
+
+ e = gimple_switch_edge (cfun, last, 0);
+ /* Set BOUND_COUNT to maximum count to bypass computing predicate for
+ default case if its target basic block is in convergence point of all
+ switch cases, which can be determined by checking whether it
+ post-dominates the switch statement. */
+ if (dominated_by_p (CDI_POST_DOMINATORS, bb, e->dest))
+ bound_count = INT_MAX;
+
n = gimple_switch_num_labels (last);
- for (case_idx = 0; case_idx < n; ++case_idx)
+ for (case_idx = 1; case_idx < n; ++case_idx)
{
tree cl = gimple_switch_label (last, case_idx);
tree min, max;
@@ -1285,10 +1304,10 @@ set_switch_stmt_execution_predicate (struct ipa_func_body_info *fbi,
min = CASE_LOW (cl);
max = CASE_HIGH (cl);
- /* For default we might want to construct predicate that none
- of cases is met, but it is bit hard to do not having negations
- of conditionals handy. */
- if (!min && !max)
+ /* The case's target basic block is in convergence point of all switch
+ cases, its predicate should be at least as that of the switch
+ statement. */
+ if (dominated_by_p (CDI_POST_DOMINATORS, bb, e->dest))
p = true;
else if (!max)
p = add_condition (summary, index, size, &aggpos, EQ_EXPR,
@@ -1304,7 +1323,111 @@ set_switch_stmt_execution_predicate (struct ipa_func_body_info *fbi,
}
*(class predicate *) e->aux
= p.or_with (summary->conds, *(class predicate *) e->aux);
+
+ /* If there are too many disjoint case ranges, predicate for default
+ case might become too complicated. So add a limit here. */
+ if (bound_count > bound_limit)
+ continue;
+
+ bool new_range = true;
+
+ if (!ranges.is_empty ())
+ {
+ tree last_max_plus_one
+ = int_const_binop (PLUS_EXPR, ranges.last ().second, one_cst);
+
+ /* Merge case ranges if they are continuous. */
+ if (tree_int_cst_equal (min, last_max_plus_one))
+ new_range = false;
+ else if (vr.kind () == VR_ANTI_RANGE)
+ {
+ tree min_minus_one = int_const_binop (MINUS_EXPR, min, one_cst);
+
+ /* If two disjoint case ranges can be connected by anti-range
+ of switch index, combine them to one range. */
+ if (tree_int_cst_lt (vr.max (), min_minus_one))
+ vr.set_undefined ();
+ else if (tree_int_cst_le (vr.min (), last_max_plus_one))
+ new_range = false;
+ }
+ }
+
+ if (!max)
+ max = min;
+
+ /* Create/extend a case range. And we count endpoints of range set,
+ this number nearly equals to number of conditions that we will create
+ for predicate of default case. */
+ if (new_range)
+ {
+ bound_count += (min == max) ? 1 : 2;
+ ranges.safe_push (std::make_pair (min, max));
+ }
+ else
+ {
+ bound_count += (ranges.last ().first == ranges.last ().second);
+ ranges.last ().second = max;
+ }
+ }
+
+ e = gimple_switch_edge (cfun, last, 0);
+ if (bound_count > bound_limit)
+ {
+ *(class predicate *) e->aux = true;
+ return;
+ }
+
+ tree vr_min = vr.kind () == VR_RANGE ? vr.min () : TYPE_MIN_VALUE (type);
+ tree vr_max = vr.kind () == VR_RANGE ? vr.max () : TYPE_MAX_VALUE (type);
+ predicate p_seg = true;
+ predicate p_all = false;
+
+ /* Construct predicate to represent default range set that is negation of
+ all case ranges. Case range is classified as containing single/non-single
+ values. Suppose a piece of case ranges in the following.
+
+ [D1...D2] [S1] ... [Sn] [D3...D4]
+
+ To represent default case's range sets between two non-single value
+ case ranges (From D2 to D3), we construct predicate as:
+
+ D2 < x < D3 && x != S1 && ... && x != Sn
+ */
+ for (size_t i = 0; i < ranges.length (); i++)
+ {
+ tree min = ranges[i].first;
+ tree max = ranges[i].second;
+
+ if (min == max)
+ p_seg &= add_condition (summary, index, size, &aggpos, NE_EXPR,
+ unshare_expr_without_location (min));
+ else
+ {
+ /* Do not create sub-predicate for range that is beyond low bound
+ of switch index. */
+ if (tree_int_cst_lt (vr_min, min))
+ {
+ p_seg &= add_condition (summary, index, size, &aggpos, LT_EXPR,
+ unshare_expr_without_location (min));
+ p_all = p_all.or_with (summary->conds, p_seg);
+ }
+
+ /* Do not create sub-predicate for range that is beyond up bound
+ of switch index. */
+ if (tree_int_cst_le (vr_max, max))
+ {
+ p_seg = false;
+ break;
+ }
+
+ p_seg = add_condition (summary, index, size, &aggpos, GT_EXPR,
+ unshare_expr_without_location (max));
+ }
}
+
+ p_all = p_all.or_with (summary->conds, p_seg);
+ *(class predicate *) e->aux
+ = p_all.or_with (summary->conds, *(class predicate *) e->aux);
}
@@ -1354,10 +1477,10 @@ compute_bb_predicates (struct ipa_func_body_info *fbi,
break;
}
}
- if (p == false)
- gcc_checking_assert (!bb->aux);
- else
+ if (p != false)
{
+ basic_block pdom_bb;
+
if (!bb->aux)
{
done = false;
@@ -1376,6 +1499,34 @@ compute_bb_predicates (struct ipa_func_body_info *fbi,
*((predicate *) bb->aux) = p;
}
}
+
+ /* For switch/if statement, we can OR-combine predicates of all
+ its cases/branches to get predicate for basic block in their
+ convergence point, but sometimes this will generate very
+ complicated predicate. Actually, we can get simplified
+ predicate in another way by using the fact that predicate
+ for a basic block must also hold true for its post dominators.
+ To be specific, basic block in convergence point of
+ conditional statement should include predicate of the
+ statement. */
+ pdom_bb = get_immediate_dominator (CDI_POST_DOMINATORS, bb);
+ if (pdom_bb == EXIT_BLOCK_PTR_FOR_FN (my_function) || !pdom_bb)
+ ;
+ else if (!pdom_bb->aux)
+ {
+ done = false;
+ pdom_bb->aux = edge_predicate_pool.allocate ();
+ *((predicate *) pdom_bb->aux) = p;
+ }
+ else if (p != *(predicate *) pdom_bb->aux)
+ {
+ p = p.or_with (summary->conds, *(predicate *)pdom_bb->aux);
+ if (p != *(predicate *) pdom_bb->aux)
+ {
+ done = false;
+ *((predicate *) pdom_bb->aux) = p;
+ }
+ }
}
}
}
@@ -1980,6 +2131,7 @@ analyze_function_body (struct cgraph_node *node, bool early)
if (opt_for_fn (node->decl, optimize))
{
calculate_dominance_info (CDI_DOMINATORS);
+ calculate_dominance_info (CDI_POST_DOMINATORS);
if (!early)
loop_optimizer_init (LOOPS_NORMAL | LOOPS_HAVE_RECORDED_EXITS);
else
@@ -2360,6 +2512,7 @@ analyze_function_body (struct cgraph_node *node, bool early)
else if (!ipa_edge_args_sum)
ipa_free_all_node_params ();
free_dominance_info (CDI_DOMINATORS);
+ free_dominance_info (CDI_POST_DOMINATORS);
}
if (dump_file)
{
diff --git a/gcc/params.def b/gcc/params.def
index 13001a7bb2d..1461aa4849d 100644
--- a/gcc/params.def
+++ b/gcc/params.def
@@ -1123,6 +1123,14 @@ DEFPARAM (PARAM_IPA_MAX_AA_STEPS,
"parameter analysis based on alias analysis in any given function.",
25000, 0, 0)
+DEFPARAM (PARAM_IPA_MAX_SWITCH_PREDICATE_BOUNDS,
+ "ipa-max-switch-predicate-bounds",
+ "Maximal number of boundary endpoints of case ranges of switch "
+ "statement. For switch exceeding this limit, do not construct cost-"
+ "evaluating predicate for default case during IPA function summary"
+ "generation.",
+ 5, 0, 0)
+
/* WHOPR partitioning configuration. */
DEFPARAM (PARAM_LTO_PARTITIONS,
diff --git a/gcc/testsuite/gcc.dg/ipa/pr91089.c b/gcc/testsuite/gcc.dg/ipa/pr91089.c
new file mode 100644
index 00000000000..dbd9b8c94fe
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/ipa/pr91089.c
@@ -0,0 +1,111 @@
+/* { dg-do compile } */
+/* { dg-options "-O3 -fdump-ipa-cp-details -fdump-ipa-fnsummary-details --param ipa-max-switch-predicate-bounds=10 -fno-inline" } */
+
+int fn();
+
+int data;
+
+int callee1(int i)
+{
+ switch (i)
+ {
+ case -126: return i + 13;
+ case -127: return i + 5;
+ case -8: return i * i;
+ case 0: return i % 9;
+ case 5:
+ case 7:
+ case 6: return 3;
+ default:
+ fn();
+ fn();
+ fn();
+ fn();
+ fn();
+ fn();
+ fn();
+ fn();
+ fn();
+ fn();
+ fn();
+ fn();
+ fn();
+ fn();
+ fn();
+ fn();
+ fn();
+ fn();
+ fn();
+ }
+
+ return data += i;
+}
+
+int fn2();
+
+int callee2(int i)
+{
+ switch (i )
+ {
+ case 0:
+ fn();
+ fn();
+ fn();
+ case 1:
+ fn();
+ fn();
+ case -1:
+ fn();
+ case -2:
+ fn();
+ fn();
+ fn();
+ fn();
+ fn();
+ fn();
+ fn();
+ fn();
+ fn();
+ fn();
+ fn();
+ fn();
+ fn();
+ fn();
+ fn();
+ fn();
+ data += i;
+ break;
+ }
+
+ if (i == 1000)
+ {
+ int j;
+
+ for (j = 0; j < 100; j++)
+ fn2();
+ }
+ return i + 3;
+}
+
+int caller()
+{
+ return callee1 (-127) +
+ callee1 (-126) +
+ callee1 (-8) +
+ callee1 (0) +
+ callee1 (5) +
+ callee1 (6) +
+ callee1 (7) +
+ callee1 (100) +
+ callee2 (9);
+ callee2 (13);
+}
+
+/* { dg-final { scan-ipa-dump-times "Creating a specialized node of callee" 7 "cp" } } */
+/* { dg-final { scan-ipa-dump "op0 < -127" "fnsummary" } } */
+/* { dg-final { scan-ipa-dump "op0 > -126" "fnsummary" } } */
+/* { dg-final { scan-ipa-dump "op0 != -8" "fnsummary" } } */
+/* { dg-final { scan-ipa-dump "op0 != 0" "fnsummary" } } */
+/* { dg-final { scan-ipa-dump "op0 < 5" "fnsummary" } } */
+/* { dg-final { scan-ipa-dump "op0 > 7" "fnsummary" } } */
+/* { dg-final { scan-ipa-dump "loop depth: 1 .+ time:\[ \]*\[0-9\]+ predicate: \\(op0 == 1000\\)" "fnsummary" } } */
--
2.17.1
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH V2] Setup predicate for switch default case in IPA (PR ipa/91089)
2019-09-02 3:06 ` [PATCH V2] " Feng Xue OS
@ 2019-09-14 15:57 ` Jan Hubicka
2019-09-16 10:11 ` [PATCH 1/2 V3] " Feng Xue OS
2019-09-16 10:16 ` [PATCH 2/2 V3] Use post-dom info to update if/switch predicate " Feng Xue OS
0 siblings, 2 replies; 11+ messages in thread
From: Jan Hubicka @ 2019-09-14 15:57 UTC (permalink / raw)
To: Feng Xue OS; +Cc: Martin Jambor, gcc-patches
> It is somewhat hard to write a testcase to show role of range info only with
> this patch. If another patch "Generalized predicate/condition for parameter
> reference in IPA (PR ipa/91088)" is accepted, it will become easy and I will
> update this testcase to show that.
>
> And this new version also resolves a problem that we might generate very
> complicated predicate for basic block in convergence point reached from
> all switch cases.
>
> switch (index)
> / | \
> case1 case2 ... default
> \ | /
> convergence point
>
> For switch/if statement, current algorithm synthesizes predicate of
> convergence point by OR-combining predicates of all its cases/branches, which
> might give us a very complicated one. Actually, we can get simplified predicate
> by using the fact that predicate for a basic block must also hold true for its post
> dominators. To be specific, convergence point should include (at least equal to)
> predicate of the switch/if statement.
>
> Honza & Martin,
>
> Would please review this new patch when you have time? Thanks.
>
> Feng
>
> -----
> diff --git a/gcc/ipa-fnsummary.c b/gcc/ipa-fnsummary.c
> index 278bf606661..dc5752fc005 100644
> --- a/gcc/ipa-fnsummary.c
> +++ b/gcc/ipa-fnsummary.c
> @@ -1198,7 +1198,8 @@ set_cond_stmt_execution_predicate (struct ipa_func_body_info *fbi,
> /* invert_tree_comparison will return ERROR_MARK on FP
> comparsions that are not EQ/NE instead of returning proper
> unordered one. Be sure it is not confused with NON_CONSTANT. */
> - if (this_code != ERROR_MARK)
> + if (this_code != ERROR_MARK
> + && !dominated_by_p (CDI_POST_DOMINATORS, bb, e->dest))
> {
> predicate p
> = add_condition (summary, index, size, &aggpos, this_code,
So this change is handling the diamond conditional you describe above?
This is bit of hack since it leaves the edge predicate unnecesarily
conservative though I see it saves some conditions to be inserted and
makes things to go smoother.
Please add a comment that explain this and reffers to the other places
where we do this (in the switch handling below).
> @@ -1269,13 +1270,31 @@ set_switch_stmt_execution_predicate (struct ipa_func_body_info *fbi,
> if (!unmodified_parm_or_parm_agg_item (fbi, last, op, &index, &size, &aggpos))
> return;
>
> + auto_vec<std::pair<tree, tree> > ranges;
> + tree type = TREE_TYPE (op);
> + tree one_cst = build_one_cst (type);
> + int bound_limit = PARAM_VALUE (PARAM_IPA_MAX_SWITCH_PREDICATE_BOUNDS);
> + int bound_count = 0;
> + value_range_base vr;
> +
> + get_range_info (op, vr);
> +
> FOR_EACH_EDGE (e, ei, bb->succs)
> {
> e->aux = edge_predicate_pool.allocate ();
> *(predicate *) e->aux = false;
> }
> +
> + e = gimple_switch_edge (cfun, last, 0);
> + /* Set BOUND_COUNT to maximum count to bypass computing predicate for
> + default case if its target basic block is in convergence point of all
> + switch cases, which can be determined by checking whether it
> + post-dominates the switch statement. */
> + if (dominated_by_p (CDI_POST_DOMINATORS, bb, e->dest))
> + bound_count = INT_MAX;
> +
> n = gimple_switch_num_labels (last);
> - for (case_idx = 0; case_idx < n; ++case_idx)
> + for (case_idx = 1; case_idx < n; ++case_idx)
> {
> tree cl = gimple_switch_label (last, case_idx);
> tree min, max;
> @@ -1285,10 +1304,10 @@ set_switch_stmt_execution_predicate (struct ipa_func_body_info *fbi,
> min = CASE_LOW (cl);
> max = CASE_HIGH (cl);
>
> - /* For default we might want to construct predicate that none
> - of cases is met, but it is bit hard to do not having negations
> - of conditionals handy. */
> - if (!min && !max)
> + /* The case's target basic block is in convergence point of all switch
> + cases, its predicate should be at least as that of the switch
> + statement. */
> + if (dominated_by_p (CDI_POST_DOMINATORS, bb, e->dest))
> p = true;
> else if (!max)
> p = add_condition (summary, index, size, &aggpos, EQ_EXPR,
> @@ -1304,7 +1323,111 @@ set_switch_stmt_execution_predicate (struct ipa_func_body_info *fbi,
> }
> *(class predicate *) e->aux
> = p.or_with (summary->conds, *(class predicate *) e->aux);
> +
> + /* If there are too many disjoint case ranges, predicate for default
> + case might become too complicated. So add a limit here. */
> + if (bound_count > bound_limit)
> + continue;
> +
> + bool new_range = true;
> +
> + if (!ranges.is_empty ())
> + {
> + tree last_max_plus_one
> + = int_const_binop (PLUS_EXPR, ranges.last ().second, one_cst);
> +
> + /* Merge case ranges if they are continuous. */
> + if (tree_int_cst_equal (min, last_max_plus_one))
> + new_range = false;
> + else if (vr.kind () == VR_ANTI_RANGE)
> + {
> + tree min_minus_one = int_const_binop (MINUS_EXPR, min, one_cst);
> +
> + /* If two disjoint case ranges can be connected by anti-range
> + of switch index, combine them to one range. */
> + if (tree_int_cst_lt (vr.max (), min_minus_one))
> + vr.set_undefined ();
> + else if (tree_int_cst_le (vr.min (), last_max_plus_one))
> + new_range = false;
> + }
I believe the case ranges always has to be INTEGER_CST. In this case all
this can be written using wide ints and produce a lot better code
(avoiding need to build & lookup & share all temporary tree codes).
You can take a look at the tree-switch-conversion code which does quite
some of this wide int handling.
Richard may have an opinion on this.
> @@ -1376,6 +1499,34 @@ compute_bb_predicates (struct ipa_func_body_info *fbi,
> *((predicate *) bb->aux) = p;
> }
> }
> +
> + /* For switch/if statement, we can OR-combine predicates of all
> + its cases/branches to get predicate for basic block in their
> + convergence point, but sometimes this will generate very
> + complicated predicate. Actually, we can get simplified
> + predicate in another way by using the fact that predicate
> + for a basic block must also hold true for its post dominators.
> + To be specific, basic block in convergence point of
> + conditional statement should include predicate of the
> + statement. */
> + pdom_bb = get_immediate_dominator (CDI_POST_DOMINATORS, bb);
> + if (pdom_bb == EXIT_BLOCK_PTR_FOR_FN (my_function) || !pdom_bb)
> + ;
> + else if (!pdom_bb->aux)
> + {
> + done = false;
> + pdom_bb->aux = edge_predicate_pool.allocate ();
> + *((predicate *) pdom_bb->aux) = p;
> + }
> + else if (p != *(predicate *) pdom_bb->aux)
> + {
> + p = p.or_with (summary->conds, *(predicate *)pdom_bb->aux);
> + if (p != *(predicate *) pdom_bb->aux)
> + {
> + done = false;
> + *((predicate *) pdom_bb->aux) = p;
> + }
> + }
So this basically the idea is to or BB's predicate to the
post-dominator predicate. This seems safe to do (we need to be careful
so that the dataflow still converges), but I would preffer to get this
in separately possibly with a testcase that shows an improvement in the
dataflow answer.
Honza
> }
> }
> }
> @@ -1980,6 +2131,7 @@ analyze_function_body (struct cgraph_node *node, bool early)
> if (opt_for_fn (node->decl, optimize))
> {
> calculate_dominance_info (CDI_DOMINATORS);
> + calculate_dominance_info (CDI_POST_DOMINATORS);
> if (!early)
> loop_optimizer_init (LOOPS_NORMAL | LOOPS_HAVE_RECORDED_EXITS);
> else
> @@ -2360,6 +2512,7 @@ analyze_function_body (struct cgraph_node *node, bool early)
> else if (!ipa_edge_args_sum)
> ipa_free_all_node_params ();
> free_dominance_info (CDI_DOMINATORS);
> + free_dominance_info (CDI_POST_DOMINATORS);
> }
> if (dump_file)
> {
> diff --git a/gcc/params.def b/gcc/params.def
> index 13001a7bb2d..1461aa4849d 100644
> --- a/gcc/params.def
> +++ b/gcc/params.def
> @@ -1123,6 +1123,14 @@ DEFPARAM (PARAM_IPA_MAX_AA_STEPS,
> "parameter analysis based on alias analysis in any given function.",
> 25000, 0, 0)
>
> +DEFPARAM (PARAM_IPA_MAX_SWITCH_PREDICATE_BOUNDS,
> + "ipa-max-switch-predicate-bounds",
> + "Maximal number of boundary endpoints of case ranges of switch "
> + "statement. For switch exceeding this limit, do not construct cost-"
> + "evaluating predicate for default case during IPA function summary"
> + "generation.",
> + 5, 0, 0)
> +
> /* WHOPR partitioning configuration. */
>
> DEFPARAM (PARAM_LTO_PARTITIONS,
> diff --git a/gcc/testsuite/gcc.dg/ipa/pr91089.c b/gcc/testsuite/gcc.dg/ipa/pr91089.c
> new file mode 100644
> index 00000000000..dbd9b8c94fe
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/ipa/pr91089.c
> @@ -0,0 +1,111 @@
> +/* { dg-do compile } */
> +/* { dg-options "-O3 -fdump-ipa-cp-details -fdump-ipa-fnsummary-details --param ipa-max-switch-predicate-bounds=10 -fno-inline" } */
> +
> +int fn();
> +
> +int data;
> +
> +int callee1(int i)
> +{
> + switch (i)
> + {
> + case -126: return i + 13;
> + case -127: return i + 5;
> + case -8: return i * i;
> + case 0: return i % 9;
> + case 5:
> + case 7:
> + case 6: return 3;
> + default:
> + fn();
> + fn();
> + fn();
> + fn();
> + fn();
> + fn();
> + fn();
> + fn();
> + fn();
> + fn();
> + fn();
> + fn();
> + fn();
> + fn();
> + fn();
> + fn();
> + fn();
> + fn();
> + fn();
> + }
> +
> + return data += i;
> +}
> +
> +int fn2();
> +
> +int callee2(int i)
> +{
> + switch (i )
> + {
> + case 0:
> + fn();
> + fn();
> + fn();
> + case 1:
> + fn();
> + fn();
> + case -1:
> + fn();
> + case -2:
> + fn();
> + fn();
> + fn();
> + fn();
> + fn();
> + fn();
> + fn();
> + fn();
> + fn();
> + fn();
> + fn();
> + fn();
> + fn();
> + fn();
> + fn();
> + fn();
> + data += i;
> + break;
> + }
> +
> + if (i == 1000)
> + {
> + int j;
> +
> + for (j = 0; j < 100; j++)
> + fn2();
> + }
> + return i + 3;
> +}
> +
> +int caller()
> +{
> + return callee1 (-127) +
> + callee1 (-126) +
> + callee1 (-8) +
> + callee1 (0) +
> + callee1 (5) +
> + callee1 (6) +
> + callee1 (7) +
> + callee1 (100) +
> + callee2 (9);
> + callee2 (13);
> +}
> +
> +/* { dg-final { scan-ipa-dump-times "Creating a specialized node of callee" 7 "cp" } } */
> +/* { dg-final { scan-ipa-dump "op0 < -127" "fnsummary" } } */
> +/* { dg-final { scan-ipa-dump "op0 > -126" "fnsummary" } } */
> +/* { dg-final { scan-ipa-dump "op0 != -8" "fnsummary" } } */
> +/* { dg-final { scan-ipa-dump "op0 != 0" "fnsummary" } } */
> +/* { dg-final { scan-ipa-dump "op0 < 5" "fnsummary" } } */
> +/* { dg-final { scan-ipa-dump "op0 > 7" "fnsummary" } } */
> +/* { dg-final { scan-ipa-dump "loop depth: 1 .+ time:\[ \]*\[0-9\]+ predicate: \\(op0 == 1000\\)" "fnsummary" } } */
> --
> 2.17.1
^ permalink raw reply [flat|nested] 11+ messages in thread
* [PATCH 1/2 V3] Setup predicate for switch default case in IPA (PR ipa/91089)
2019-09-14 15:57 ` Jan Hubicka
@ 2019-09-16 10:11 ` Feng Xue OS
2019-09-16 15:04 ` Jan Hubicka
2019-09-16 10:16 ` [PATCH 2/2 V3] Use post-dom info to update if/switch predicate " Feng Xue OS
1 sibling, 1 reply; 11+ messages in thread
From: Feng Xue OS @ 2019-09-16 10:11 UTC (permalink / raw)
To: Jan Hubicka; +Cc: Martin Jambor, gcc-patches
>> + if (this_code != ERROR_MARK
>> + && !dominated_by_p (CDI_POST_DOMINATORS, bb, e->dest))
>> {
>> predicate p
>> = add_condition (summary, index, size, &aggpos, this_code,
> So this change is handling the diamond conditional you describe above?
> This is bit of hack since it leaves the edge predicate unnecesarily
> conservative though I see it saves some conditions to be inserted and
> makes things to go smoother.
>
> Please add a comment that explain this and reffers to the other places
> where we do this (in the switch handling below).
Done.
> I believe the case ranges always has to be INTEGER_CST. In this case all
> this can be written using wide ints and produce a lot better code
> (avoiding need to build & lookup & share all temporary tree codes).
> You can take a look at the tree-switch-conversion code which does quite
> some of this wide int handling.
>
> Richard may have an opinion on this.
Done.
> So this basically the idea is to or BB's predicate to the
> post-dominator predicate. This seems safe to do (we need to be careful
> so that the dataflow still converges), but I would preffer to get this
> in separately possibly with a testcase that shows an improvement in the
> dataflow answer.
I've split the patch to two.
Thanks for your comments.
Feng
---
diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index 1391a562c35..7f312c96f37 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -11960,6 +11960,12 @@ not spend too much time analyzing huge functions, it gives up and
consider all memory clobbered after examining
@option{ipa-max-aa-steps} statements modifying memory.
+@item ipa-max-switch-predicate-bounds
+Maximal number of boundary endpoints of case ranges of switch statement.
+For switch exceeding this limit, IPA-CP will not construct cloning cost
+predicate, which is used to estimate cloning benefit, for default case
+of the switch statement.
+
@item lto-partitions
Specify desired number of partitions produced during WHOPR compilation.
The number of partitions should exceed the number of CPUs used for compilation.
diff --git a/gcc/ipa-fnsummary.c b/gcc/ipa-fnsummary.c
index 278bf606661..1bf1806eaf8 100644
--- a/gcc/ipa-fnsummary.c
+++ b/gcc/ipa-fnsummary.c
@@ -1269,13 +1269,21 @@ set_switch_stmt_execution_predicate (struct ipa_func_body_info *fbi,
if (!unmodified_parm_or_parm_agg_item (fbi, last, op, &index, &size, &aggpos))
return;
+ auto_vec<std::pair<tree, tree> > ranges;
+ tree type = TREE_TYPE (op);
+ int bound_limit = PARAM_VALUE (PARAM_IPA_MAX_SWITCH_PREDICATE_BOUNDS);
+ int bound_count = 0;
+ wide_int vr_wmin, vr_wmax;
+ value_range_kind vr_type = get_range_info (op, &vr_wmin, &vr_wmax);
+
FOR_EACH_EDGE (e, ei, bb->succs)
{
e->aux = edge_predicate_pool.allocate ();
*(predicate *) e->aux = false;
}
+
n = gimple_switch_num_labels (last);
- for (case_idx = 0; case_idx < n; ++case_idx)
+ for (case_idx = 1; case_idx < n; ++case_idx)
{
tree cl = gimple_switch_label (last, case_idx);
tree min, max;
@@ -1285,12 +1293,7 @@ set_switch_stmt_execution_predicate (struct ipa_func_body_info *fbi,
min = CASE_LOW (cl);
max = CASE_HIGH (cl);
- /* For default we might want to construct predicate that none
- of cases is met, but it is bit hard to do not having negations
- of conditionals handy. */
- if (!min && !max)
- p = true;
- else if (!max)
+ if (!max)
p = add_condition (summary, index, size, &aggpos, EQ_EXPR,
unshare_expr_without_location (min));
else
@@ -1304,7 +1307,113 @@ set_switch_stmt_execution_predicate (struct ipa_func_body_info *fbi,
}
*(class predicate *) e->aux
= p.or_with (summary->conds, *(class predicate *) e->aux);
+
+ /* If there are too many disjoint case ranges, predicate for default
+ case might become too complicated. So add a limit here. */
+ if (bound_count > bound_limit)
+ continue;
+
+ bool new_range = true;
+
+ if (!ranges.is_empty ())
+ {
+ wide_int curr_wmin = wi::to_wide (min);
+ wide_int last_wmax = wi::to_wide (ranges.last ().second);
+
+ /* Merge case ranges if they are continuous. */
+ if (curr_wmin == last_wmax + 1)
+ new_range = false;
+ else if (vr_type == VR_ANTI_RANGE)
+ {
+ /* If two disjoint case ranges can be connected by anti-range
+ of switch index, combine them to one range. */
+ if (wi::lt_p (vr_wmax, curr_wmin - 1, TYPE_SIGN (type)))
+ vr_type = VR_UNDEFINED;
+ else if (wi::le_p (vr_wmin, last_wmax + 1, TYPE_SIGN (type)))
+ new_range = false;
+ }
+ }
+
+ if (!max)
+ max = min;
+
+ /* Create/extend a case range. And we count endpoints of range set,
+ this number nearly equals to number of conditions that we will create
+ for predicate of default case. */
+ if (new_range)
+ {
+ bound_count += (min == max) ? 1 : 2;
+ ranges.safe_push (std::make_pair (min, max));
+ }
+ else
+ {
+ bound_count += (ranges.last ().first == ranges.last ().second);
+ ranges.last ().second = max;
+ }
+ }
+
+ e = gimple_switch_edge (cfun, last, 0);
+ if (bound_count > bound_limit)
+ {
+ *(class predicate *) e->aux = true;
+ return;
}
+
+ predicate p_seg = true;
+ predicate p_all = false;
+
+ if (vr_type != VR_RANGE)
+ {
+ vr_wmin = wi::to_wide (TYPE_MIN_VALUE (type));
+ vr_wmax = wi::to_wide (TYPE_MAX_VALUE (type));
+ }
+
+ /* Construct predicate to represent default range set that is negation of
+ all case ranges. Case range is classified as containing single/non-single
+ values. Suppose a piece of case ranges in the following.
+
+ [D1...D2] [S1] ... [Sn] [D3...D4]
+
+ To represent default case's range sets between two non-single value
+ case ranges (From D2 to D3), we construct predicate as:
+
+ D2 < x < D3 && x != S1 && ... && x != Sn
+ */
+ for (size_t i = 0; i < ranges.length (); i++)
+ {
+ tree min = ranges[i].first;
+ tree max = ranges[i].second;
+
+ if (min == max)
+ p_seg &= add_condition (summary, index, size, &aggpos, NE_EXPR,
+ unshare_expr_without_location (min));
+ else
+ {
+ /* Do not create sub-predicate for range that is beyond low bound
+ of switch index. */
+ if (wi::lt_p (vr_wmin, wi::to_wide (min), TYPE_SIGN (type)))
+ {
+ p_seg &= add_condition (summary, index, size, &aggpos, LT_EXPR,
+ unshare_expr_without_location (min));
+ p_all = p_all.or_with (summary->conds, p_seg);
+ }
+
+ /* Do not create sub-predicate for range that is beyond up bound
+ of switch index. */
+ if (wi::le_p (vr_wmax, wi::to_wide (max), TYPE_SIGN (type)))
+ {
+ p_seg = false;
+ break;
+ }
+
+ p_seg = add_condition (summary, index, size, &aggpos, GT_EXPR,
+ unshare_expr_without_location (max));
+ }
+ }
+
+ p_all = p_all.or_with (summary->conds, p_seg);
+ *(class predicate *) e->aux
+ = p_all.or_with (summary->conds, *(class predicate *) e->aux);
}
diff --git a/gcc/params.def b/gcc/params.def
index 13001a7bb2d..1461aa4849d 100644
--- a/gcc/params.def
+++ b/gcc/params.def
@@ -1123,6 +1123,14 @@ DEFPARAM (PARAM_IPA_MAX_AA_STEPS,
"parameter analysis based on alias analysis in any given function.",
25000, 0, 0)
+DEFPARAM (PARAM_IPA_MAX_SWITCH_PREDICATE_BOUNDS,
+ "ipa-max-switch-predicate-bounds",
+ "Maximal number of boundary endpoints of case ranges of switch "
+ "statement. For switch exceeding this limit, do not construct cost-"
+ "evaluating predicate for default case during IPA function summary"
+ "generation.",
+ 5, 0, 0)
+
/* WHOPR partitioning configuration. */
DEFPARAM (PARAM_LTO_PARTITIONS,
diff --git a/gcc/testsuite/gcc.dg/ipa/pr91089.c b/gcc/testsuite/gcc.dg/ipa/pr91089.c
new file mode 100644
index 00000000000..e9e206fc24d
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/ipa/pr91089.c
@@ -0,0 +1,62 @@
+/* { dg-do compile } */
+/* { dg-options "-O3 -fdump-ipa-cp-details -fdump-ipa-fnsummary-details --param ipa-max-switch-predicate-bounds=10 -fno-inline" } */
+
+int fn ();
+
+int data;
+
+int callee (int i)
+{
+ switch (i)
+ {
+ case -126: return i + 13;
+ case -127: return i + 5;
+ case -8: return i * i;
+ case 0: return i % 9;
+ case 5:
+ case 7:
+ case 6: return 3;
+ default:
+ fn ();
+ fn ();
+ fn ();
+ fn ();
+ fn ();
+ fn ();
+ fn ();
+ fn ();
+ fn ();
+ fn ();
+ fn ();
+ fn ();
+ fn ();
+ fn ();
+ fn ();
+ fn ();
+ fn ();
+ fn ();
+ fn ();
+ }
+
+ return data += i;
+}
+
+int caller ()
+{
+ return callee (-127) +
+ callee (-126) +
+ callee (-8) +
+ callee (0) +
+ callee (5) +
+ callee (6) +
+ callee (7) +
+ callee (100);
+}
+
+/* { dg-final { scan-ipa-dump-times "Creating a specialized node of callee" 7 "cp" } } */
+/* { dg-final { scan-ipa-dump "op0 < -127" "fnsummary" } } */
+/* { dg-final { scan-ipa-dump "op0 > -126" "fnsummary" } } */
+/* { dg-final { scan-ipa-dump "op0 != -8" "fnsummary" } } */
+/* { dg-final { scan-ipa-dump "op0 != 0" "fnsummary" } } */
+/* { dg-final { scan-ipa-dump "op0 < 5" "fnsummary" } } */
+/* { dg-final { scan-ipa-dump "op0 > 7" "fnsummary" } } */
--
2.17.1
^ permalink raw reply [flat|nested] 11+ messages in thread
* [PATCH 2/2 V3] Use post-dom info to update if/switch predicate (PR ipa/91089)
2019-09-14 15:57 ` Jan Hubicka
2019-09-16 10:11 ` [PATCH 1/2 V3] " Feng Xue OS
@ 2019-09-16 10:16 ` Feng Xue OS
2019-09-16 15:05 ` Jan Hubicka
1 sibling, 1 reply; 11+ messages in thread
From: Feng Xue OS @ 2019-09-16 10:16 UTC (permalink / raw)
To: Jan Hubicka; +Cc: Martin Jambor, gcc-patches
This one is split from original patch.
Feng
---
diff --git a/gcc/ipa-fnsummary.c b/gcc/ipa-fnsummary.c
index 1bf1806eaf8..5423756d275 100644
--- a/gcc/ipa-fnsummary.c
+++ b/gcc/ipa-fnsummary.c
@@ -1197,8 +1197,14 @@ set_cond_stmt_execution_predicate (struct ipa_func_body_info *fbi,
? code : inverted_code);
/* invert_tree_comparison will return ERROR_MARK on FP
comparsions that are not EQ/NE instead of returning proper
- unordered one. Be sure it is not confused with NON_CONSTANT. */
- if (this_code != ERROR_MARK)
+ unordered one. Be sure it is not confused with NON_CONSTANT.
+
+ And if the edge's target is the final block of diamond CFG graph
+ of this conditional statement, we do not need to compute
+ predicate for the edge because the final block's predicate must
+ be at least as that of the first block of the statement. */
+ if (this_code != ERROR_MARK
+ && !dominated_by_p (CDI_POST_DOMINATORS, bb, e->dest))
{
predicate p
= add_condition (summary, index, size, &aggpos, this_code,
@@ -1282,6 +1288,14 @@ set_switch_stmt_execution_predicate (struct ipa_func_body_info *fbi,
*(predicate *) e->aux = false;
}
+ e = gimple_switch_edge (cfun, last, 0);
+ /* Set BOUND_COUNT to maximum count to bypass computing predicate for
+ default case if its target basic block is in convergence point of all
+ switch cases, which can be determined by checking whether it
+ post-dominates the switch statement. */
+ if (dominated_by_p (CDI_POST_DOMINATORS, bb, e->dest))
+ bound_count = INT_MAX;
+
n = gimple_switch_num_labels (last);
for (case_idx = 1; case_idx < n; ++case_idx)
{
@@ -1293,7 +1307,12 @@ set_switch_stmt_execution_predicate (struct ipa_func_body_info *fbi,
min = CASE_LOW (cl);
max = CASE_HIGH (cl);
- if (!max)
+ /* The case's target basic block is in convergence point of all switch
+ cases, its predicate should be at least as that of the switch
+ statement. */
+ if (dominated_by_p (CDI_POST_DOMINATORS, bb, e->dest))
+ p = true;
+ else if (!max)
p = add_condition (summary, index, size, &aggpos, EQ_EXPR,
unshare_expr_without_location (min));
else
@@ -1463,10 +1482,10 @@ compute_bb_predicates (struct ipa_func_body_info *fbi,
break;
}
}
- if (p == false)
- gcc_checking_assert (!bb->aux);
- else
+ if (p != false)
{
+ basic_block pdom_bb;
+
if (!bb->aux)
{
done = false;
@@ -1485,6 +1504,34 @@ compute_bb_predicates (struct ipa_func_body_info *fbi,
*((predicate *) bb->aux) = p;
}
}
+
+ /* For switch/if statement, we can OR-combine predicates of all
+ its cases/branches to get predicate for basic block in their
+ convergence point, but sometimes this will generate very
+ complicated predicate. Actually, we can get simplified
+ predicate in another way by using the fact that predicate
+ for a basic block must also hold true for its post dominators.
+ To be specific, basic block in convergence point of
+ conditional statement should include predicate of the
+ statement. */
+ pdom_bb = get_immediate_dominator (CDI_POST_DOMINATORS, bb);
+ if (pdom_bb == EXIT_BLOCK_PTR_FOR_FN (my_function) || !pdom_bb)
+ ;
+ else if (!pdom_bb->aux)
+ {
+ done = false;
+ pdom_bb->aux = edge_predicate_pool.allocate ();
+ *((predicate *) pdom_bb->aux) = p;
+ }
+ else if (p != *(predicate *) pdom_bb->aux)
+ {
+ p = p.or_with (summary->conds, *(predicate *)pdom_bb->aux);
+ if (p != *(predicate *) pdom_bb->aux)
+ {
+ done = false;
+ *((predicate *) pdom_bb->aux) = p;
+ }
+ }
}
}
}
@@ -2089,6 +2136,7 @@ analyze_function_body (struct cgraph_node *node, bool early)
if (opt_for_fn (node->decl, optimize))
{
calculate_dominance_info (CDI_DOMINATORS);
+ calculate_dominance_info (CDI_POST_DOMINATORS);
if (!early)
loop_optimizer_init (LOOPS_NORMAL | LOOPS_HAVE_RECORDED_EXITS);
else
@@ -2469,6 +2517,7 @@ analyze_function_body (struct cgraph_node *node, bool early)
else if (!ipa_edge_args_sum)
ipa_free_all_node_params ();
free_dominance_info (CDI_DOMINATORS);
+ free_dominance_info (CDI_POST_DOMINATORS);
}
if (dump_file)
{
diff --git a/gcc/testsuite/gcc.dg/ipa/pr91089.c b/gcc/testsuite/gcc.dg/ipa/pr91089.c
index e9e206fc24d..92b5550aa76 100644
--- a/gcc/testsuite/gcc.dg/ipa/pr91089.c
+++ b/gcc/testsuite/gcc.dg/ipa/pr91089.c
@@ -41,6 +41,52 @@ int callee (int i)
return data += i;
}
+int fn2 ();
+
+int callee_complex_predicate (int i)
+{
+ switch (i )
+ {
+ case 0:
+ fn ();
+ fn ();
+ fn ();
+ case 1:
+ fn ();
+ fn ();
+ case -1:
+ fn ();
+ case -2:
+ fn ();
+ fn ();
+ fn ();
+ fn ();
+ fn ();
+ fn ();
+ fn ();
+ fn ();
+ fn ();
+ fn ();
+ fn ();
+ fn ();
+ fn ();
+ fn ();
+ fn ();
+ fn ();
+ data += i;
+ break;
+ }
+
+ if (i == 1000)
+ {
+ int j;
+
+ for (j = 0; j < 100; j++)
+ fn2 ();
+ }
+ return i + 3;
+}
+
int caller ()
{
return callee (-127) +
@@ -60,3 +106,4 @@ int caller ()
/* { dg-final { scan-ipa-dump "op0 != 0" "fnsummary" } } */
/* { dg-final { scan-ipa-dump "op0 < 5" "fnsummary" } } */
/* { dg-final { scan-ipa-dump "op0 > 7" "fnsummary" } } */
+/* { dg-final { scan-ipa-dump "loop depth: 1 .+ time:\[ \]*\[0-9\]+ predicate: \\(op0 == 1000\\)" "fnsummary" } } */
--
2.17.1
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH 1/2 V3] Setup predicate for switch default case in IPA (PR ipa/91089)
2019-09-16 10:11 ` [PATCH 1/2 V3] " Feng Xue OS
@ 2019-09-16 15:04 ` Jan Hubicka
2019-09-17 13:51 ` Feng Xue OS
0 siblings, 1 reply; 11+ messages in thread
From: Jan Hubicka @ 2019-09-16 15:04 UTC (permalink / raw)
To: Feng Xue OS; +Cc: Martin Jambor, gcc-patches
Hi,
patch is OK with change below provided that you add a ChangeLog entry
(it is usual to write those into the emails) and that it passes
bootstrap®testing (it is also usual to indicate this).
Do you have rights to the svn repository and copyright assignment done?
>
> +DEFPARAM (PARAM_IPA_MAX_SWITCH_PREDICATE_BOUNDS,
> + "ipa-max-switch-predicate-bounds",
> + "Maximal number of boundary endpoints of case ranges of switch "
> + "statement. For switch exceeding this limit, do not construct cost-"
> + "evaluating predicate for default case during IPA function summary"
> + "generation.",
I think those texts are intended to not be too long. I think the first
sentence extended by " used during IPA functoin summary generation" is
good enough.
Thank you,
Honza
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH 2/2 V3] Use post-dom info to update if/switch predicate (PR ipa/91089)
2019-09-16 10:16 ` [PATCH 2/2 V3] Use post-dom info to update if/switch predicate " Feng Xue OS
@ 2019-09-16 15:05 ` Jan Hubicka
2019-09-17 13:54 ` Feng Xue OS
0 siblings, 1 reply; 11+ messages in thread
From: Jan Hubicka @ 2019-09-16 15:05 UTC (permalink / raw)
To: Feng Xue OS; +Cc: Martin Jambor, gcc-patches
> This one is split from original patch.
This patch is also OK (modulo the changelog entry and testing). Please
wait with comitting this for bit over a day after commiting the first so
the periodic benchmarks picks each change differently.
Thanks,
Honza
>
> Feng
> ---
> diff --git a/gcc/ipa-fnsummary.c b/gcc/ipa-fnsummary.c
> index 1bf1806eaf8..5423756d275 100644
> --- a/gcc/ipa-fnsummary.c
> +++ b/gcc/ipa-fnsummary.c
> @@ -1197,8 +1197,14 @@ set_cond_stmt_execution_predicate (struct ipa_func_body_info *fbi,
> ? code : inverted_code);
> /* invert_tree_comparison will return ERROR_MARK on FP
> comparsions that are not EQ/NE instead of returning proper
> - unordered one. Be sure it is not confused with NON_CONSTANT. */
> - if (this_code != ERROR_MARK)
> + unordered one. Be sure it is not confused with NON_CONSTANT.
> +
> + And if the edge's target is the final block of diamond CFG graph
> + of this conditional statement, we do not need to compute
> + predicate for the edge because the final block's predicate must
> + be at least as that of the first block of the statement. */
> + if (this_code != ERROR_MARK
> + && !dominated_by_p (CDI_POST_DOMINATORS, bb, e->dest))
> {
> predicate p
> = add_condition (summary, index, size, &aggpos, this_code,
> @@ -1282,6 +1288,14 @@ set_switch_stmt_execution_predicate (struct ipa_func_body_info *fbi,
> *(predicate *) e->aux = false;
> }
>
> + e = gimple_switch_edge (cfun, last, 0);
> + /* Set BOUND_COUNT to maximum count to bypass computing predicate for
> + default case if its target basic block is in convergence point of all
> + switch cases, which can be determined by checking whether it
> + post-dominates the switch statement. */
> + if (dominated_by_p (CDI_POST_DOMINATORS, bb, e->dest))
> + bound_count = INT_MAX;
> +
> n = gimple_switch_num_labels (last);
> for (case_idx = 1; case_idx < n; ++case_idx)
> {
> @@ -1293,7 +1307,12 @@ set_switch_stmt_execution_predicate (struct ipa_func_body_info *fbi,
> min = CASE_LOW (cl);
> max = CASE_HIGH (cl);
>
> - if (!max)
> + /* The case's target basic block is in convergence point of all switch
> + cases, its predicate should be at least as that of the switch
> + statement. */
> + if (dominated_by_p (CDI_POST_DOMINATORS, bb, e->dest))
> + p = true;
> + else if (!max)
> p = add_condition (summary, index, size, &aggpos, EQ_EXPR,
> unshare_expr_without_location (min));
> else
> @@ -1463,10 +1482,10 @@ compute_bb_predicates (struct ipa_func_body_info *fbi,
> break;
> }
> }
> - if (p == false)
> - gcc_checking_assert (!bb->aux);
> - else
> + if (p != false)
> {
> + basic_block pdom_bb;
> +
> if (!bb->aux)
> {
> done = false;
> @@ -1485,6 +1504,34 @@ compute_bb_predicates (struct ipa_func_body_info *fbi,
> *((predicate *) bb->aux) = p;
> }
> }
> +
> + /* For switch/if statement, we can OR-combine predicates of all
> + its cases/branches to get predicate for basic block in their
> + convergence point, but sometimes this will generate very
> + complicated predicate. Actually, we can get simplified
> + predicate in another way by using the fact that predicate
> + for a basic block must also hold true for its post dominators.
> + To be specific, basic block in convergence point of
> + conditional statement should include predicate of the
> + statement. */
> + pdom_bb = get_immediate_dominator (CDI_POST_DOMINATORS, bb);
> + if (pdom_bb == EXIT_BLOCK_PTR_FOR_FN (my_function) || !pdom_bb)
> + ;
> + else if (!pdom_bb->aux)
> + {
> + done = false;
> + pdom_bb->aux = edge_predicate_pool.allocate ();
> + *((predicate *) pdom_bb->aux) = p;
> + }
> + else if (p != *(predicate *) pdom_bb->aux)
> + {
> + p = p.or_with (summary->conds, *(predicate *)pdom_bb->aux);
> + if (p != *(predicate *) pdom_bb->aux)
> + {
> + done = false;
> + *((predicate *) pdom_bb->aux) = p;
> + }
> + }
> }
> }
> }
> @@ -2089,6 +2136,7 @@ analyze_function_body (struct cgraph_node *node, bool early)
> if (opt_for_fn (node->decl, optimize))
> {
> calculate_dominance_info (CDI_DOMINATORS);
> + calculate_dominance_info (CDI_POST_DOMINATORS);
> if (!early)
> loop_optimizer_init (LOOPS_NORMAL | LOOPS_HAVE_RECORDED_EXITS);
> else
> @@ -2469,6 +2517,7 @@ analyze_function_body (struct cgraph_node *node, bool early)
> else if (!ipa_edge_args_sum)
> ipa_free_all_node_params ();
> free_dominance_info (CDI_DOMINATORS);
> + free_dominance_info (CDI_POST_DOMINATORS);
> }
> if (dump_file)
> {
> diff --git a/gcc/testsuite/gcc.dg/ipa/pr91089.c b/gcc/testsuite/gcc.dg/ipa/pr91089.c
> index e9e206fc24d..92b5550aa76 100644
> --- a/gcc/testsuite/gcc.dg/ipa/pr91089.c
> +++ b/gcc/testsuite/gcc.dg/ipa/pr91089.c
> @@ -41,6 +41,52 @@ int callee (int i)
> return data += i;
> }
>
> +int fn2 ();
> +
> +int callee_complex_predicate (int i)
> +{
> + switch (i )
> + {
> + case 0:
> + fn ();
> + fn ();
> + fn ();
> + case 1:
> + fn ();
> + fn ();
> + case -1:
> + fn ();
> + case -2:
> + fn ();
> + fn ();
> + fn ();
> + fn ();
> + fn ();
> + fn ();
> + fn ();
> + fn ();
> + fn ();
> + fn ();
> + fn ();
> + fn ();
> + fn ();
> + fn ();
> + fn ();
> + fn ();
> + data += i;
> + break;
> + }
> +
> + if (i == 1000)
> + {
> + int j;
> +
> + for (j = 0; j < 100; j++)
> + fn2 ();
> + }
> + return i + 3;
> +}
> +
> int caller ()
> {
> return callee (-127) +
> @@ -60,3 +106,4 @@ int caller ()
> /* { dg-final { scan-ipa-dump "op0 != 0" "fnsummary" } } */
> /* { dg-final { scan-ipa-dump "op0 < 5" "fnsummary" } } */
> /* { dg-final { scan-ipa-dump "op0 > 7" "fnsummary" } } */
> +/* { dg-final { scan-ipa-dump "loop depth: 1 .+ time:\[ \]*\[0-9\]+ predicate: \\(op0 == 1000\\)" "fnsummary" } } */
> --
> 2.17.1
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH 1/2 V3] Setup predicate for switch default case in IPA (PR ipa/91089)
2019-09-16 15:04 ` Jan Hubicka
@ 2019-09-17 13:51 ` Feng Xue OS
0 siblings, 0 replies; 11+ messages in thread
From: Feng Xue OS @ 2019-09-17 13:51 UTC (permalink / raw)
To: Jan Hubicka; +Cc: Martin Jambor, gcc-patches
> Hi,
> patch is OK with change below provided that you add a ChangeLog entry
> (it is usual to write those into the emails) and that it passes
> bootstrap®testing (it is also usual to indicate this).
> Do you have rights to the svn repository and copyright assignment done?
Yes, I have managed to commit some patches before. Thanks for your review.
Feng
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH 2/2 V3] Use post-dom info to update if/switch predicate (PR ipa/91089)
2019-09-16 15:05 ` Jan Hubicka
@ 2019-09-17 13:54 ` Feng Xue OS
0 siblings, 0 replies; 11+ messages in thread
From: Feng Xue OS @ 2019-09-17 13:54 UTC (permalink / raw)
To: Jan Hubicka; +Cc: Martin Jambor, gcc-patches
> This patch is also OK (modulo the changelog entry and testing). Please
> wait with comitting this for bit over a day after commiting the first so
> the periodic benchmarks picks each change differently.
Understood. Thanks for this.
Feng
^ permalink raw reply [flat|nested] 11+ messages in thread
end of thread, other threads:[~2019-09-17 13:54 UTC | newest]
Thread overview: 11+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-07-12 8:51 [PATCH] Setup predicate for switch default case in IPA (PR ipa/91089) Feng Xue OS
2019-08-29 15:14 ` Martin Jambor
2019-08-30 8:32 ` Feng Xue OS
2019-09-02 3:06 ` [PATCH V2] " Feng Xue OS
2019-09-14 15:57 ` Jan Hubicka
2019-09-16 10:11 ` [PATCH 1/2 V3] " Feng Xue OS
2019-09-16 15:04 ` Jan Hubicka
2019-09-17 13:51 ` Feng Xue OS
2019-09-16 10:16 ` [PATCH 2/2 V3] Use post-dom info to update if/switch predicate " Feng Xue OS
2019-09-16 15:05 ` Jan Hubicka
2019-09-17 13:54 ` Feng Xue OS
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).