* [PATCH] Setup predicate for switch default case in IPA (PR ipa/91089) @ 2019-07-12 8:51 Feng Xue OS 2019-08-29 15:14 ` Martin Jambor 0 siblings, 1 reply; 11+ messages in thread From: Feng Xue OS @ 2019-07-12 8:51 UTC (permalink / raw) To: gcc-patches, Martin Jambor, Jan Hubicka IPA does not construct executability predicate for default case of switch statement. So execution cost of default case is not properly evaluated in IPA-cp, this might prevent function clone for function containing switch statement, if certain non-default case is proved to be executed after constant propagation. This patch is composed to deduce predicate for default case, if it turns out to be a relative simple one, for example, we can try to merge case range, and use comparison upon range bounds, and also range analysis information to simplify predicate. Feng ---- diff --git a/gcc/ChangeLog b/gcc/ChangeLog index 3d92250b520..4de2f568990 100644 --- a/gcc/ChangeLog +++ b/gcc/ChangeLog @@ -1,3 +1,10 @@ +2019-07-12 Feng Xue <fxue@os.amperecomputing.com> + + PR ipa/91089 + * ipa-fnsummary.c (set_switch_stmt_execution_predicate): Add predicate + for switch default case using range analysis information. + * params.def (PARAM_IPA_MAX_SWITCH_PREDICATE_BOUNDS): New. + 2019-07-11 Sunil K Pandey <sunil.k.pandey@intel.com> PR target/90980 diff --git a/gcc/ipa-fnsummary.c b/gcc/ipa-fnsummary.c index 09986211a1d..141fdce53c2 100644 --- a/gcc/ipa-fnsummary.c +++ b/gcc/ipa-fnsummary.c @@ -1289,30 +1289,35 @@ set_switch_stmt_execution_predicate (struct ipa_func_body_info *fbi, if (!unmodified_parm_or_parm_agg_item (fbi, last, op, &index, &size, &aggpos)) return; + auto_vec<std::pair<tree, tree> > ranges; + tree type = TREE_TYPE (op); + tree one_cst = build_one_cst (type); + int bound_limit = PARAM_VALUE (PARAM_IPA_MAX_SWITCH_PREDICATE_BOUNDS); + int bound_count = 0; + value_range_base vr; + + get_range_info (op, vr); + FOR_EACH_EDGE (e, ei, bb->succs) { e->aux = edge_predicate_pool.allocate (); *(predicate *) e->aux = false; } n = gimple_switch_num_labels (last); - for (case_idx = 0; case_idx < n; ++case_idx) + for (case_idx = 1; case_idx < n; ++case_idx) { tree cl = gimple_switch_label (last, case_idx); - tree min, max; + tree min = CASE_LOW (cl); + tree max = CASE_HIGH (cl); predicate p; - e = gimple_switch_edge (cfun, last, case_idx); - min = CASE_LOW (cl); - max = CASE_HIGH (cl); - - /* For default we might want to construct predicate that none - of cases is met, but it is bit hard to do not having negations - of conditionals handy. */ - if (!min && !max) - p = true; - else if (!max) - p = add_condition (summary, index, size, &aggpos, EQ_EXPR, - unshare_expr_without_location (min)); + if (!max) + { + max = min; + + p = add_condition (summary, index, size, &aggpos, EQ_EXPR, + unshare_expr_without_location (min)); + } else { predicate p1, p2; @@ -1322,9 +1327,112 @@ set_switch_stmt_execution_predicate (struct ipa_func_body_info *fbi, unshare_expr_without_location (max)); p = p1 & p2; } + + e = gimple_switch_edge (cfun, last, case_idx); *(class predicate *) e->aux = p.or_with (summary->conds, *(class predicate *) e->aux); + + /* If there are too many disjoint case ranges, predicate for default + case might become too complicated. So add a limit here. */ + if (bound_count > bound_limit) + continue; + + bool new_range = true; + + if (!ranges.is_empty ()) + { + tree last_max_plus_one + = int_const_binop (PLUS_EXPR, ranges.last ().second, one_cst); + + /* Merge case ranges if they are continuous. */ + if (tree_int_cst_equal (min, last_max_plus_one)) + new_range = false; + else if (vr.kind () == VR_ANTI_RANGE) + { + tree min_minus_one = int_const_binop (MINUS_EXPR, min, one_cst); + + /* If anti-range of switch index can connect two disjoint case + ranges, combine them to one range. */ + if (tree_int_cst_lt (vr.max (), min_minus_one)) + vr.set_undefined (); + else if (tree_int_cst_le (vr.min (), last_max_plus_one)) + new_range = false; + } + } + + /* Create/extend a case range. And we count endpoints of range set, + this number nearly equals to number of conditions that we will create + for predicate of default case. */ + if (new_range) + { + bound_count += (min == max) ? 1 : 2; + ranges.safe_push (std::make_pair (min, max)); + } + else + { + bound_count += (ranges.last ().first == ranges.last ().second); + ranges.last ().second = max; + } + } + + e = gimple_switch_edge (cfun, last, 0); + if (bound_count > bound_limit) + { + *(class predicate *) e->aux = true; + return; + } + + tree vr_min = vr.kind () == VR_RANGE ? vr.min () : TYPE_MIN_VALUE (type); + tree vr_max = vr.kind () == VR_RANGE ? vr.max () : TYPE_MAX_VALUE (type); + predicate p_seg = true; + predicate p_all = false; + + /* Construct predicate to represent default range set that is negation of + all case ranges. Case range is classified as containing single/non-single + values. Suppose a piece of case ranges in the following. + + [D1...D2] [S1] ... [Sn] [D3...D4] + + To represent default case's range sets between two non-single value + case ranges (From D2 to D3), we construct predicate as: + + D2 < x < D3 && x != S1 && ... && x != Sn + */ + for (size_t i = 0; i < ranges.length (); i++) + { + tree min = ranges[i].first; + tree max = ranges[i].second; + + if (min == max) + p_seg &= add_condition (summary, index, size, &aggpos, NE_EXPR, + unshare_expr_without_location (min)); + else + { + /* Do not create sub-predicate for range that is beyond low bound + of switch index. */ + if (tree_int_cst_lt (vr_min, min)) + { + p_seg &= add_condition (summary, index, size, &aggpos, LT_EXPR, + unshare_expr_without_location (min)); + p_all = p_all.or_with (summary->conds, p_seg); + } + + /* Do not create sub-predicate for range that is beyond up bound + of switch index. */ + if (tree_int_cst_le (vr_max, max)) + { + p_seg = false; + break; + } + + p_seg = add_condition (summary, index, size, &aggpos, GT_EXPR, + unshare_expr_without_location (max)); + } } + + p_all = p_all.or_with (summary->conds, p_seg); + *(class predicate *) e->aux + = p_all.or_with (summary->conds, *(class predicate *) e->aux); } diff --git a/gcc/params.def b/gcc/params.def index 4567c17ba60..ff6920f287a 100644 --- a/gcc/params.def +++ b/gcc/params.def @@ -1121,6 +1121,14 @@ DEFPARAM (PARAM_IPA_MAX_AA_STEPS, "parameter analysis based on alias analysis in any given function.", 25000, 0, 0) +DEFPARAM (PARAM_IPA_MAX_SWITCH_PREDICATE_BOUNDS, + "ipa-max-switch-predicate-bounds", + "Maximal number of boundary endpoints of case ranges of switch " + "statement. For switch exceeding this limit, do not construct cost-" + "evaluating predicate for default case during IPA function summary" + "generation.", + 5, 0, 0) + /* WHOPR partitioning configuration. */ DEFPARAM (PARAM_LTO_PARTITIONS, diff --git a/gcc/testsuite/ChangeLog b/gcc/testsuite/ChangeLog index 6c40496db9c..ccb92bb8f7f 100644 --- a/gcc/testsuite/ChangeLog +++ b/gcc/testsuite/ChangeLog @@ -1,3 +1,8 @@ +2019-07-12 Feng Xue <fxue@os.amperecomputing.com> + + PR ipa/91089 + * gcc.dg/ipa/pr91089.c: New test. + 2019-07-11 Sunil K Pandey <sunil.k.pandey@intel.com> PR target/90980 diff --git a/gcc/testsuite/gcc.dg/ipa/pr91089.c b/gcc/testsuite/gcc.dg/ipa/pr91089.c new file mode 100644 index 00000000000..fa3111f565f --- /dev/null +++ b/gcc/testsuite/gcc.dg/ipa/pr91089.c @@ -0,0 +1,62 @@ +/* { dg-do compile } */ +/* { dg-options "-O3 -fdump-ipa-cp-details -fdump-ipa-fnsummary-details --param ipa-max-switch-predicate-bounds=10 -fno-inline" } */ + +int fn(); + +int data; + +int callee(int i) +{ + switch (i) + { + case -126: return i + 13; + case -127: return i + 5; + case -8: return i * i; + case 0: return i % 9; + case 5: + case 7: + case 6: return 3; + default: + fn(); + fn(); + fn(); + fn(); + fn(); + fn(); + fn(); + fn(); + fn(); + fn(); + fn(); + fn(); + fn(); + fn(); + fn(); + fn(); + fn(); + fn(); + fn(); + } + + return data += i; +} + +int caller() +{ + return callee (-127) + + callee (-126) + + callee (-8) + + callee (0) + + callee (5) + + callee (6) + + callee (7) + + callee (100); +} + +/* { dg-final { scan-ipa-dump-times "Creating a specialized node of callee" 7 "cp" } } */ +/* { dg-final { scan-ipa-dump "op0 < -127" "fnsummary" } } */ +/* { dg-final { scan-ipa-dump "op0 > -126" "fnsummary" } } */ +/* { dg-final { scan-ipa-dump "op0 != -8" "fnsummary" } } */ +/* { dg-final { scan-ipa-dump "op0 != 0" "fnsummary" } } */ +/* { dg-final { scan-ipa-dump "op0 < 5" "fnsummary" } } */ +/* { dg-final { scan-ipa-dump "op0 > 7" "fnsummary" } } */ -- 2.17.1 ^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH] Setup predicate for switch default case in IPA (PR ipa/91089) 2019-07-12 8:51 [PATCH] Setup predicate for switch default case in IPA (PR ipa/91089) Feng Xue OS @ 2019-08-29 15:14 ` Martin Jambor 2019-08-30 8:32 ` Feng Xue OS 2019-09-02 3:06 ` [PATCH V2] " Feng Xue OS 0 siblings, 2 replies; 11+ messages in thread From: Martin Jambor @ 2019-08-29 15:14 UTC (permalink / raw) To: Feng Xue OS, gcc-patches, Jan Hubicka Hi, On Fri, Jul 12 2019, Feng Xue OS wrote: > IPA does not construct executability predicate for default case of switch statement. > So execution cost of default case is not properly evaluated in IPA-cp, this might > prevent function clone for function containing switch statement, if certain non-default > case is proved to be executed after constant propagation. > > This patch is composed to deduce predicate for default case, if it turns out to be a > relative simple one, for example, we can try to merge case range, and use > comparison upon range bounds, and also range analysis information to simplify > predicate. > I have read through the patch and it looks OK to me but I cannot approve it, you have to ping Honza for that. Since you decided to use the value range info, it would be nice if you could also add a testcase where it plays a role. Also, please don't post changelog entries as a part of the patch, it basically guarantees it will not apply for anybody, not even for you when you update your trunk. Thanks for working on this, Martin > Feng > > ---- > diff --git a/gcc/ChangeLog b/gcc/ChangeLog > index 3d92250b520..4de2f568990 100644 > --- a/gcc/ChangeLog > +++ b/gcc/ChangeLog > @@ -1,3 +1,10 @@ > +2019-07-12 Feng Xue <fxue@os.amperecomputing.com> > + > + PR ipa/91089 > + * ipa-fnsummary.c (set_switch_stmt_execution_predicate): Add predicate > + for switch default case using range analysis information. > + * params.def (PARAM_IPA_MAX_SWITCH_PREDICATE_BOUNDS): New. > + > 2019-07-11 Sunil K Pandey <sunil.k.pandey@intel.com> > ^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH] Setup predicate for switch default case in IPA (PR ipa/91089) 2019-08-29 15:14 ` Martin Jambor @ 2019-08-30 8:32 ` Feng Xue OS 2019-09-02 3:06 ` [PATCH V2] " Feng Xue OS 1 sibling, 0 replies; 11+ messages in thread From: Feng Xue OS @ 2019-08-30 8:32 UTC (permalink / raw) To: Martin Jambor, gcc-patches, Jan Hubicka That's good. Thanks for your comments. Feng ________________________________________ From: Martin Jambor <mjambor@suse.cz> Sent: Thursday, August 29, 2019 11:00 PM To: Feng Xue OS; gcc-patches@gcc.gnu.org; Jan Hubicka Subject: Re: [PATCH] Setup predicate for switch default case in IPA (PR ipa/91089) Hi, On Fri, Jul 12 2019, Feng Xue OS wrote: > IPA does not construct executability predicate for default case of switch statement. > So execution cost of default case is not properly evaluated in IPA-cp, this might > prevent function clone for function containing switch statement, if certain non-default > case is proved to be executed after constant propagation. > > This patch is composed to deduce predicate for default case, if it turns out to be a > relative simple one, for example, we can try to merge case range, and use > comparison upon range bounds, and also range analysis information to simplify > predicate. > I have read through the patch and it looks OK to me but I cannot approve it, you have to ping Honza for that. Since you decided to use the value range info, it would be nice if you could also add a testcase where it plays a role. Also, please don't post changelog entries as a part of the patch, it basically guarantees it will not apply for anybody, not even for you when you update your trunk. Thanks for working on this, Martin > Feng > > ---- > diff --git a/gcc/ChangeLog b/gcc/ChangeLog > index 3d92250b520..4de2f568990 100644 > --- a/gcc/ChangeLog > +++ b/gcc/ChangeLog > @@ -1,3 +1,10 @@ > +2019-07-12 Feng Xue <fxue@os.amperecomputing.com> > + > + PR ipa/91089 > + * ipa-fnsummary.c (set_switch_stmt_execution_predicate): Add predicate > + for switch default case using range analysis information. > + * params.def (PARAM_IPA_MAX_SWITCH_PREDICATE_BOUNDS): New. > + > 2019-07-11 Sunil K Pandey <sunil.k.pandey@intel.com> > ^ permalink raw reply [flat|nested] 11+ messages in thread
* [PATCH V2] Setup predicate for switch default case in IPA (PR ipa/91089) 2019-08-29 15:14 ` Martin Jambor 2019-08-30 8:32 ` Feng Xue OS @ 2019-09-02 3:06 ` Feng Xue OS 2019-09-14 15:57 ` Jan Hubicka 1 sibling, 1 reply; 11+ messages in thread From: Feng Xue OS @ 2019-09-02 3:06 UTC (permalink / raw) To: Martin Jambor, Jan Hubicka, gcc-patches > I have read through the patch and it looks OK to me but I cannot approve > it, you have to ping Honza for that. Since you decided to use the value > range info, it would be nice if you could also add a testcase where it > plays a role. It is somewhat hard to write a testcase to show role of range info only with this patch. If another patch "Generalized predicate/condition for parameter reference in IPA (PR ipa/91088)" is accepted, it will become easy and I will update this testcase to show that. And this new version also resolves a problem that we might generate very complicated predicate for basic block in convergence point reached from all switch cases. switch (index) / | \ case1 case2 ... default \ | / convergence point For switch/if statement, current algorithm synthesizes predicate of convergence point by OR-combining predicates of all its cases/branches, which might give us a very complicated one. Actually, we can get simplified predicate by using the fact that predicate for a basic block must also hold true for its post dominators. To be specific, convergence point should include (at least equal to) predicate of the switch/if statement. Honza & Martin, Would please review this new patch when you have time? Thanks. Feng ----- diff --git a/gcc/ipa-fnsummary.c b/gcc/ipa-fnsummary.c index 278bf606661..dc5752fc005 100644 --- a/gcc/ipa-fnsummary.c +++ b/gcc/ipa-fnsummary.c @@ -1198,7 +1198,8 @@ set_cond_stmt_execution_predicate (struct ipa_func_body_info *fbi, /* invert_tree_comparison will return ERROR_MARK on FP comparsions that are not EQ/NE instead of returning proper unordered one. Be sure it is not confused with NON_CONSTANT. */ - if (this_code != ERROR_MARK) + if (this_code != ERROR_MARK + && !dominated_by_p (CDI_POST_DOMINATORS, bb, e->dest)) { predicate p = add_condition (summary, index, size, &aggpos, this_code, @@ -1269,13 +1270,31 @@ set_switch_stmt_execution_predicate (struct ipa_func_body_info *fbi, if (!unmodified_parm_or_parm_agg_item (fbi, last, op, &index, &size, &aggpos)) return; + auto_vec<std::pair<tree, tree> > ranges; + tree type = TREE_TYPE (op); + tree one_cst = build_one_cst (type); + int bound_limit = PARAM_VALUE (PARAM_IPA_MAX_SWITCH_PREDICATE_BOUNDS); + int bound_count = 0; + value_range_base vr; + + get_range_info (op, vr); + FOR_EACH_EDGE (e, ei, bb->succs) { e->aux = edge_predicate_pool.allocate (); *(predicate *) e->aux = false; } + + e = gimple_switch_edge (cfun, last, 0); + /* Set BOUND_COUNT to maximum count to bypass computing predicate for + default case if its target basic block is in convergence point of all + switch cases, which can be determined by checking whether it + post-dominates the switch statement. */ + if (dominated_by_p (CDI_POST_DOMINATORS, bb, e->dest)) + bound_count = INT_MAX; + n = gimple_switch_num_labels (last); - for (case_idx = 0; case_idx < n; ++case_idx) + for (case_idx = 1; case_idx < n; ++case_idx) { tree cl = gimple_switch_label (last, case_idx); tree min, max; @@ -1285,10 +1304,10 @@ set_switch_stmt_execution_predicate (struct ipa_func_body_info *fbi, min = CASE_LOW (cl); max = CASE_HIGH (cl); - /* For default we might want to construct predicate that none - of cases is met, but it is bit hard to do not having negations - of conditionals handy. */ - if (!min && !max) + /* The case's target basic block is in convergence point of all switch + cases, its predicate should be at least as that of the switch + statement. */ + if (dominated_by_p (CDI_POST_DOMINATORS, bb, e->dest)) p = true; else if (!max) p = add_condition (summary, index, size, &aggpos, EQ_EXPR, @@ -1304,7 +1323,111 @@ set_switch_stmt_execution_predicate (struct ipa_func_body_info *fbi, } *(class predicate *) e->aux = p.or_with (summary->conds, *(class predicate *) e->aux); + + /* If there are too many disjoint case ranges, predicate for default + case might become too complicated. So add a limit here. */ + if (bound_count > bound_limit) + continue; + + bool new_range = true; + + if (!ranges.is_empty ()) + { + tree last_max_plus_one + = int_const_binop (PLUS_EXPR, ranges.last ().second, one_cst); + + /* Merge case ranges if they are continuous. */ + if (tree_int_cst_equal (min, last_max_plus_one)) + new_range = false; + else if (vr.kind () == VR_ANTI_RANGE) + { + tree min_minus_one = int_const_binop (MINUS_EXPR, min, one_cst); + + /* If two disjoint case ranges can be connected by anti-range + of switch index, combine them to one range. */ + if (tree_int_cst_lt (vr.max (), min_minus_one)) + vr.set_undefined (); + else if (tree_int_cst_le (vr.min (), last_max_plus_one)) + new_range = false; + } + } + + if (!max) + max = min; + + /* Create/extend a case range. And we count endpoints of range set, + this number nearly equals to number of conditions that we will create + for predicate of default case. */ + if (new_range) + { + bound_count += (min == max) ? 1 : 2; + ranges.safe_push (std::make_pair (min, max)); + } + else + { + bound_count += (ranges.last ().first == ranges.last ().second); + ranges.last ().second = max; + } + } + + e = gimple_switch_edge (cfun, last, 0); + if (bound_count > bound_limit) + { + *(class predicate *) e->aux = true; + return; + } + + tree vr_min = vr.kind () == VR_RANGE ? vr.min () : TYPE_MIN_VALUE (type); + tree vr_max = vr.kind () == VR_RANGE ? vr.max () : TYPE_MAX_VALUE (type); + predicate p_seg = true; + predicate p_all = false; + + /* Construct predicate to represent default range set that is negation of + all case ranges. Case range is classified as containing single/non-single + values. Suppose a piece of case ranges in the following. + + [D1...D2] [S1] ... [Sn] [D3...D4] + + To represent default case's range sets between two non-single value + case ranges (From D2 to D3), we construct predicate as: + + D2 < x < D3 && x != S1 && ... && x != Sn + */ + for (size_t i = 0; i < ranges.length (); i++) + { + tree min = ranges[i].first; + tree max = ranges[i].second; + + if (min == max) + p_seg &= add_condition (summary, index, size, &aggpos, NE_EXPR, + unshare_expr_without_location (min)); + else + { + /* Do not create sub-predicate for range that is beyond low bound + of switch index. */ + if (tree_int_cst_lt (vr_min, min)) + { + p_seg &= add_condition (summary, index, size, &aggpos, LT_EXPR, + unshare_expr_without_location (min)); + p_all = p_all.or_with (summary->conds, p_seg); + } + + /* Do not create sub-predicate for range that is beyond up bound + of switch index. */ + if (tree_int_cst_le (vr_max, max)) + { + p_seg = false; + break; + } + + p_seg = add_condition (summary, index, size, &aggpos, GT_EXPR, + unshare_expr_without_location (max)); + } } + + p_all = p_all.or_with (summary->conds, p_seg); + *(class predicate *) e->aux + = p_all.or_with (summary->conds, *(class predicate *) e->aux); } @@ -1354,10 +1477,10 @@ compute_bb_predicates (struct ipa_func_body_info *fbi, break; } } - if (p == false) - gcc_checking_assert (!bb->aux); - else + if (p != false) { + basic_block pdom_bb; + if (!bb->aux) { done = false; @@ -1376,6 +1499,34 @@ compute_bb_predicates (struct ipa_func_body_info *fbi, *((predicate *) bb->aux) = p; } } + + /* For switch/if statement, we can OR-combine predicates of all + its cases/branches to get predicate for basic block in their + convergence point, but sometimes this will generate very + complicated predicate. Actually, we can get simplified + predicate in another way by using the fact that predicate + for a basic block must also hold true for its post dominators. + To be specific, basic block in convergence point of + conditional statement should include predicate of the + statement. */ + pdom_bb = get_immediate_dominator (CDI_POST_DOMINATORS, bb); + if (pdom_bb == EXIT_BLOCK_PTR_FOR_FN (my_function) || !pdom_bb) + ; + else if (!pdom_bb->aux) + { + done = false; + pdom_bb->aux = edge_predicate_pool.allocate (); + *((predicate *) pdom_bb->aux) = p; + } + else if (p != *(predicate *) pdom_bb->aux) + { + p = p.or_with (summary->conds, *(predicate *)pdom_bb->aux); + if (p != *(predicate *) pdom_bb->aux) + { + done = false; + *((predicate *) pdom_bb->aux) = p; + } + } } } } @@ -1980,6 +2131,7 @@ analyze_function_body (struct cgraph_node *node, bool early) if (opt_for_fn (node->decl, optimize)) { calculate_dominance_info (CDI_DOMINATORS); + calculate_dominance_info (CDI_POST_DOMINATORS); if (!early) loop_optimizer_init (LOOPS_NORMAL | LOOPS_HAVE_RECORDED_EXITS); else @@ -2360,6 +2512,7 @@ analyze_function_body (struct cgraph_node *node, bool early) else if (!ipa_edge_args_sum) ipa_free_all_node_params (); free_dominance_info (CDI_DOMINATORS); + free_dominance_info (CDI_POST_DOMINATORS); } if (dump_file) { diff --git a/gcc/params.def b/gcc/params.def index 13001a7bb2d..1461aa4849d 100644 --- a/gcc/params.def +++ b/gcc/params.def @@ -1123,6 +1123,14 @@ DEFPARAM (PARAM_IPA_MAX_AA_STEPS, "parameter analysis based on alias analysis in any given function.", 25000, 0, 0) +DEFPARAM (PARAM_IPA_MAX_SWITCH_PREDICATE_BOUNDS, + "ipa-max-switch-predicate-bounds", + "Maximal number of boundary endpoints of case ranges of switch " + "statement. For switch exceeding this limit, do not construct cost-" + "evaluating predicate for default case during IPA function summary" + "generation.", + 5, 0, 0) + /* WHOPR partitioning configuration. */ DEFPARAM (PARAM_LTO_PARTITIONS, diff --git a/gcc/testsuite/gcc.dg/ipa/pr91089.c b/gcc/testsuite/gcc.dg/ipa/pr91089.c new file mode 100644 index 00000000000..dbd9b8c94fe --- /dev/null +++ b/gcc/testsuite/gcc.dg/ipa/pr91089.c @@ -0,0 +1,111 @@ +/* { dg-do compile } */ +/* { dg-options "-O3 -fdump-ipa-cp-details -fdump-ipa-fnsummary-details --param ipa-max-switch-predicate-bounds=10 -fno-inline" } */ + +int fn(); + +int data; + +int callee1(int i) +{ + switch (i) + { + case -126: return i + 13; + case -127: return i + 5; + case -8: return i * i; + case 0: return i % 9; + case 5: + case 7: + case 6: return 3; + default: + fn(); + fn(); + fn(); + fn(); + fn(); + fn(); + fn(); + fn(); + fn(); + fn(); + fn(); + fn(); + fn(); + fn(); + fn(); + fn(); + fn(); + fn(); + fn(); + } + + return data += i; +} + +int fn2(); + +int callee2(int i) +{ + switch (i ) + { + case 0: + fn(); + fn(); + fn(); + case 1: + fn(); + fn(); + case -1: + fn(); + case -2: + fn(); + fn(); + fn(); + fn(); + fn(); + fn(); + fn(); + fn(); + fn(); + fn(); + fn(); + fn(); + fn(); + fn(); + fn(); + fn(); + data += i; + break; + } + + if (i == 1000) + { + int j; + + for (j = 0; j < 100; j++) + fn2(); + } + return i + 3; +} + +int caller() +{ + return callee1 (-127) + + callee1 (-126) + + callee1 (-8) + + callee1 (0) + + callee1 (5) + + callee1 (6) + + callee1 (7) + + callee1 (100) + + callee2 (9); + callee2 (13); +} + +/* { dg-final { scan-ipa-dump-times "Creating a specialized node of callee" 7 "cp" } } */ +/* { dg-final { scan-ipa-dump "op0 < -127" "fnsummary" } } */ +/* { dg-final { scan-ipa-dump "op0 > -126" "fnsummary" } } */ +/* { dg-final { scan-ipa-dump "op0 != -8" "fnsummary" } } */ +/* { dg-final { scan-ipa-dump "op0 != 0" "fnsummary" } } */ +/* { dg-final { scan-ipa-dump "op0 < 5" "fnsummary" } } */ +/* { dg-final { scan-ipa-dump "op0 > 7" "fnsummary" } } */ +/* { dg-final { scan-ipa-dump "loop depth: 1 .+ time:\[ \]*\[0-9\]+ predicate: \\(op0 == 1000\\)" "fnsummary" } } */ -- 2.17.1 ^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH V2] Setup predicate for switch default case in IPA (PR ipa/91089) 2019-09-02 3:06 ` [PATCH V2] " Feng Xue OS @ 2019-09-14 15:57 ` Jan Hubicka 2019-09-16 10:11 ` [PATCH 1/2 V3] " Feng Xue OS 2019-09-16 10:16 ` [PATCH 2/2 V3] Use post-dom info to update if/switch predicate " Feng Xue OS 0 siblings, 2 replies; 11+ messages in thread From: Jan Hubicka @ 2019-09-14 15:57 UTC (permalink / raw) To: Feng Xue OS; +Cc: Martin Jambor, gcc-patches > It is somewhat hard to write a testcase to show role of range info only with > this patch. If another patch "Generalized predicate/condition for parameter > reference in IPA (PR ipa/91088)" is accepted, it will become easy and I will > update this testcase to show that. > > And this new version also resolves a problem that we might generate very > complicated predicate for basic block in convergence point reached from > all switch cases. > > switch (index) > / | \ > case1 case2 ... default > \ | / > convergence point > > For switch/if statement, current algorithm synthesizes predicate of > convergence point by OR-combining predicates of all its cases/branches, which > might give us a very complicated one. Actually, we can get simplified predicate > by using the fact that predicate for a basic block must also hold true for its post > dominators. To be specific, convergence point should include (at least equal to) > predicate of the switch/if statement. > > Honza & Martin, > > Would please review this new patch when you have time? Thanks. > > Feng > > ----- > diff --git a/gcc/ipa-fnsummary.c b/gcc/ipa-fnsummary.c > index 278bf606661..dc5752fc005 100644 > --- a/gcc/ipa-fnsummary.c > +++ b/gcc/ipa-fnsummary.c > @@ -1198,7 +1198,8 @@ set_cond_stmt_execution_predicate (struct ipa_func_body_info *fbi, > /* invert_tree_comparison will return ERROR_MARK on FP > comparsions that are not EQ/NE instead of returning proper > unordered one. Be sure it is not confused with NON_CONSTANT. */ > - if (this_code != ERROR_MARK) > + if (this_code != ERROR_MARK > + && !dominated_by_p (CDI_POST_DOMINATORS, bb, e->dest)) > { > predicate p > = add_condition (summary, index, size, &aggpos, this_code, So this change is handling the diamond conditional you describe above? This is bit of hack since it leaves the edge predicate unnecesarily conservative though I see it saves some conditions to be inserted and makes things to go smoother. Please add a comment that explain this and reffers to the other places where we do this (in the switch handling below). > @@ -1269,13 +1270,31 @@ set_switch_stmt_execution_predicate (struct ipa_func_body_info *fbi, > if (!unmodified_parm_or_parm_agg_item (fbi, last, op, &index, &size, &aggpos)) > return; > > + auto_vec<std::pair<tree, tree> > ranges; > + tree type = TREE_TYPE (op); > + tree one_cst = build_one_cst (type); > + int bound_limit = PARAM_VALUE (PARAM_IPA_MAX_SWITCH_PREDICATE_BOUNDS); > + int bound_count = 0; > + value_range_base vr; > + > + get_range_info (op, vr); > + > FOR_EACH_EDGE (e, ei, bb->succs) > { > e->aux = edge_predicate_pool.allocate (); > *(predicate *) e->aux = false; > } > + > + e = gimple_switch_edge (cfun, last, 0); > + /* Set BOUND_COUNT to maximum count to bypass computing predicate for > + default case if its target basic block is in convergence point of all > + switch cases, which can be determined by checking whether it > + post-dominates the switch statement. */ > + if (dominated_by_p (CDI_POST_DOMINATORS, bb, e->dest)) > + bound_count = INT_MAX; > + > n = gimple_switch_num_labels (last); > - for (case_idx = 0; case_idx < n; ++case_idx) > + for (case_idx = 1; case_idx < n; ++case_idx) > { > tree cl = gimple_switch_label (last, case_idx); > tree min, max; > @@ -1285,10 +1304,10 @@ set_switch_stmt_execution_predicate (struct ipa_func_body_info *fbi, > min = CASE_LOW (cl); > max = CASE_HIGH (cl); > > - /* For default we might want to construct predicate that none > - of cases is met, but it is bit hard to do not having negations > - of conditionals handy. */ > - if (!min && !max) > + /* The case's target basic block is in convergence point of all switch > + cases, its predicate should be at least as that of the switch > + statement. */ > + if (dominated_by_p (CDI_POST_DOMINATORS, bb, e->dest)) > p = true; > else if (!max) > p = add_condition (summary, index, size, &aggpos, EQ_EXPR, > @@ -1304,7 +1323,111 @@ set_switch_stmt_execution_predicate (struct ipa_func_body_info *fbi, > } > *(class predicate *) e->aux > = p.or_with (summary->conds, *(class predicate *) e->aux); > + > + /* If there are too many disjoint case ranges, predicate for default > + case might become too complicated. So add a limit here. */ > + if (bound_count > bound_limit) > + continue; > + > + bool new_range = true; > + > + if (!ranges.is_empty ()) > + { > + tree last_max_plus_one > + = int_const_binop (PLUS_EXPR, ranges.last ().second, one_cst); > + > + /* Merge case ranges if they are continuous. */ > + if (tree_int_cst_equal (min, last_max_plus_one)) > + new_range = false; > + else if (vr.kind () == VR_ANTI_RANGE) > + { > + tree min_minus_one = int_const_binop (MINUS_EXPR, min, one_cst); > + > + /* If two disjoint case ranges can be connected by anti-range > + of switch index, combine them to one range. */ > + if (tree_int_cst_lt (vr.max (), min_minus_one)) > + vr.set_undefined (); > + else if (tree_int_cst_le (vr.min (), last_max_plus_one)) > + new_range = false; > + } I believe the case ranges always has to be INTEGER_CST. In this case all this can be written using wide ints and produce a lot better code (avoiding need to build & lookup & share all temporary tree codes). You can take a look at the tree-switch-conversion code which does quite some of this wide int handling. Richard may have an opinion on this. > @@ -1376,6 +1499,34 @@ compute_bb_predicates (struct ipa_func_body_info *fbi, > *((predicate *) bb->aux) = p; > } > } > + > + /* For switch/if statement, we can OR-combine predicates of all > + its cases/branches to get predicate for basic block in their > + convergence point, but sometimes this will generate very > + complicated predicate. Actually, we can get simplified > + predicate in another way by using the fact that predicate > + for a basic block must also hold true for its post dominators. > + To be specific, basic block in convergence point of > + conditional statement should include predicate of the > + statement. */ > + pdom_bb = get_immediate_dominator (CDI_POST_DOMINATORS, bb); > + if (pdom_bb == EXIT_BLOCK_PTR_FOR_FN (my_function) || !pdom_bb) > + ; > + else if (!pdom_bb->aux) > + { > + done = false; > + pdom_bb->aux = edge_predicate_pool.allocate (); > + *((predicate *) pdom_bb->aux) = p; > + } > + else if (p != *(predicate *) pdom_bb->aux) > + { > + p = p.or_with (summary->conds, *(predicate *)pdom_bb->aux); > + if (p != *(predicate *) pdom_bb->aux) > + { > + done = false; > + *((predicate *) pdom_bb->aux) = p; > + } > + } So this basically the idea is to or BB's predicate to the post-dominator predicate. This seems safe to do (we need to be careful so that the dataflow still converges), but I would preffer to get this in separately possibly with a testcase that shows an improvement in the dataflow answer. Honza > } > } > } > @@ -1980,6 +2131,7 @@ analyze_function_body (struct cgraph_node *node, bool early) > if (opt_for_fn (node->decl, optimize)) > { > calculate_dominance_info (CDI_DOMINATORS); > + calculate_dominance_info (CDI_POST_DOMINATORS); > if (!early) > loop_optimizer_init (LOOPS_NORMAL | LOOPS_HAVE_RECORDED_EXITS); > else > @@ -2360,6 +2512,7 @@ analyze_function_body (struct cgraph_node *node, bool early) > else if (!ipa_edge_args_sum) > ipa_free_all_node_params (); > free_dominance_info (CDI_DOMINATORS); > + free_dominance_info (CDI_POST_DOMINATORS); > } > if (dump_file) > { > diff --git a/gcc/params.def b/gcc/params.def > index 13001a7bb2d..1461aa4849d 100644 > --- a/gcc/params.def > +++ b/gcc/params.def > @@ -1123,6 +1123,14 @@ DEFPARAM (PARAM_IPA_MAX_AA_STEPS, > "parameter analysis based on alias analysis in any given function.", > 25000, 0, 0) > > +DEFPARAM (PARAM_IPA_MAX_SWITCH_PREDICATE_BOUNDS, > + "ipa-max-switch-predicate-bounds", > + "Maximal number of boundary endpoints of case ranges of switch " > + "statement. For switch exceeding this limit, do not construct cost-" > + "evaluating predicate for default case during IPA function summary" > + "generation.", > + 5, 0, 0) > + > /* WHOPR partitioning configuration. */ > > DEFPARAM (PARAM_LTO_PARTITIONS, > diff --git a/gcc/testsuite/gcc.dg/ipa/pr91089.c b/gcc/testsuite/gcc.dg/ipa/pr91089.c > new file mode 100644 > index 00000000000..dbd9b8c94fe > --- /dev/null > +++ b/gcc/testsuite/gcc.dg/ipa/pr91089.c > @@ -0,0 +1,111 @@ > +/* { dg-do compile } */ > +/* { dg-options "-O3 -fdump-ipa-cp-details -fdump-ipa-fnsummary-details --param ipa-max-switch-predicate-bounds=10 -fno-inline" } */ > + > +int fn(); > + > +int data; > + > +int callee1(int i) > +{ > + switch (i) > + { > + case -126: return i + 13; > + case -127: return i + 5; > + case -8: return i * i; > + case 0: return i % 9; > + case 5: > + case 7: > + case 6: return 3; > + default: > + fn(); > + fn(); > + fn(); > + fn(); > + fn(); > + fn(); > + fn(); > + fn(); > + fn(); > + fn(); > + fn(); > + fn(); > + fn(); > + fn(); > + fn(); > + fn(); > + fn(); > + fn(); > + fn(); > + } > + > + return data += i; > +} > + > +int fn2(); > + > +int callee2(int i) > +{ > + switch (i ) > + { > + case 0: > + fn(); > + fn(); > + fn(); > + case 1: > + fn(); > + fn(); > + case -1: > + fn(); > + case -2: > + fn(); > + fn(); > + fn(); > + fn(); > + fn(); > + fn(); > + fn(); > + fn(); > + fn(); > + fn(); > + fn(); > + fn(); > + fn(); > + fn(); > + fn(); > + fn(); > + data += i; > + break; > + } > + > + if (i == 1000) > + { > + int j; > + > + for (j = 0; j < 100; j++) > + fn2(); > + } > + return i + 3; > +} > + > +int caller() > +{ > + return callee1 (-127) + > + callee1 (-126) + > + callee1 (-8) + > + callee1 (0) + > + callee1 (5) + > + callee1 (6) + > + callee1 (7) + > + callee1 (100) + > + callee2 (9); > + callee2 (13); > +} > + > +/* { dg-final { scan-ipa-dump-times "Creating a specialized node of callee" 7 "cp" } } */ > +/* { dg-final { scan-ipa-dump "op0 < -127" "fnsummary" } } */ > +/* { dg-final { scan-ipa-dump "op0 > -126" "fnsummary" } } */ > +/* { dg-final { scan-ipa-dump "op0 != -8" "fnsummary" } } */ > +/* { dg-final { scan-ipa-dump "op0 != 0" "fnsummary" } } */ > +/* { dg-final { scan-ipa-dump "op0 < 5" "fnsummary" } } */ > +/* { dg-final { scan-ipa-dump "op0 > 7" "fnsummary" } } */ > +/* { dg-final { scan-ipa-dump "loop depth: 1 .+ time:\[ \]*\[0-9\]+ predicate: \\(op0 == 1000\\)" "fnsummary" } } */ > -- > 2.17.1 ^ permalink raw reply [flat|nested] 11+ messages in thread
* [PATCH 1/2 V3] Setup predicate for switch default case in IPA (PR ipa/91089) 2019-09-14 15:57 ` Jan Hubicka @ 2019-09-16 10:11 ` Feng Xue OS 2019-09-16 15:04 ` Jan Hubicka 2019-09-16 10:16 ` [PATCH 2/2 V3] Use post-dom info to update if/switch predicate " Feng Xue OS 1 sibling, 1 reply; 11+ messages in thread From: Feng Xue OS @ 2019-09-16 10:11 UTC (permalink / raw) To: Jan Hubicka; +Cc: Martin Jambor, gcc-patches >> + if (this_code != ERROR_MARK >> + && !dominated_by_p (CDI_POST_DOMINATORS, bb, e->dest)) >> { >> predicate p >> = add_condition (summary, index, size, &aggpos, this_code, > So this change is handling the diamond conditional you describe above? > This is bit of hack since it leaves the edge predicate unnecesarily > conservative though I see it saves some conditions to be inserted and > makes things to go smoother. > > Please add a comment that explain this and reffers to the other places > where we do this (in the switch handling below). Done. > I believe the case ranges always has to be INTEGER_CST. In this case all > this can be written using wide ints and produce a lot better code > (avoiding need to build & lookup & share all temporary tree codes). > You can take a look at the tree-switch-conversion code which does quite > some of this wide int handling. > > Richard may have an opinion on this. Done. > So this basically the idea is to or BB's predicate to the > post-dominator predicate. This seems safe to do (we need to be careful > so that the dataflow still converges), but I would preffer to get this > in separately possibly with a testcase that shows an improvement in the > dataflow answer. I've split the patch to two. Thanks for your comments. Feng --- diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi index 1391a562c35..7f312c96f37 100644 --- a/gcc/doc/invoke.texi +++ b/gcc/doc/invoke.texi @@ -11960,6 +11960,12 @@ not spend too much time analyzing huge functions, it gives up and consider all memory clobbered after examining @option{ipa-max-aa-steps} statements modifying memory. +@item ipa-max-switch-predicate-bounds +Maximal number of boundary endpoints of case ranges of switch statement. +For switch exceeding this limit, IPA-CP will not construct cloning cost +predicate, which is used to estimate cloning benefit, for default case +of the switch statement. + @item lto-partitions Specify desired number of partitions produced during WHOPR compilation. The number of partitions should exceed the number of CPUs used for compilation. diff --git a/gcc/ipa-fnsummary.c b/gcc/ipa-fnsummary.c index 278bf606661..1bf1806eaf8 100644 --- a/gcc/ipa-fnsummary.c +++ b/gcc/ipa-fnsummary.c @@ -1269,13 +1269,21 @@ set_switch_stmt_execution_predicate (struct ipa_func_body_info *fbi, if (!unmodified_parm_or_parm_agg_item (fbi, last, op, &index, &size, &aggpos)) return; + auto_vec<std::pair<tree, tree> > ranges; + tree type = TREE_TYPE (op); + int bound_limit = PARAM_VALUE (PARAM_IPA_MAX_SWITCH_PREDICATE_BOUNDS); + int bound_count = 0; + wide_int vr_wmin, vr_wmax; + value_range_kind vr_type = get_range_info (op, &vr_wmin, &vr_wmax); + FOR_EACH_EDGE (e, ei, bb->succs) { e->aux = edge_predicate_pool.allocate (); *(predicate *) e->aux = false; } + n = gimple_switch_num_labels (last); - for (case_idx = 0; case_idx < n; ++case_idx) + for (case_idx = 1; case_idx < n; ++case_idx) { tree cl = gimple_switch_label (last, case_idx); tree min, max; @@ -1285,12 +1293,7 @@ set_switch_stmt_execution_predicate (struct ipa_func_body_info *fbi, min = CASE_LOW (cl); max = CASE_HIGH (cl); - /* For default we might want to construct predicate that none - of cases is met, but it is bit hard to do not having negations - of conditionals handy. */ - if (!min && !max) - p = true; - else if (!max) + if (!max) p = add_condition (summary, index, size, &aggpos, EQ_EXPR, unshare_expr_without_location (min)); else @@ -1304,7 +1307,113 @@ set_switch_stmt_execution_predicate (struct ipa_func_body_info *fbi, } *(class predicate *) e->aux = p.or_with (summary->conds, *(class predicate *) e->aux); + + /* If there are too many disjoint case ranges, predicate for default + case might become too complicated. So add a limit here. */ + if (bound_count > bound_limit) + continue; + + bool new_range = true; + + if (!ranges.is_empty ()) + { + wide_int curr_wmin = wi::to_wide (min); + wide_int last_wmax = wi::to_wide (ranges.last ().second); + + /* Merge case ranges if they are continuous. */ + if (curr_wmin == last_wmax + 1) + new_range = false; + else if (vr_type == VR_ANTI_RANGE) + { + /* If two disjoint case ranges can be connected by anti-range + of switch index, combine them to one range. */ + if (wi::lt_p (vr_wmax, curr_wmin - 1, TYPE_SIGN (type))) + vr_type = VR_UNDEFINED; + else if (wi::le_p (vr_wmin, last_wmax + 1, TYPE_SIGN (type))) + new_range = false; + } + } + + if (!max) + max = min; + + /* Create/extend a case range. And we count endpoints of range set, + this number nearly equals to number of conditions that we will create + for predicate of default case. */ + if (new_range) + { + bound_count += (min == max) ? 1 : 2; + ranges.safe_push (std::make_pair (min, max)); + } + else + { + bound_count += (ranges.last ().first == ranges.last ().second); + ranges.last ().second = max; + } + } + + e = gimple_switch_edge (cfun, last, 0); + if (bound_count > bound_limit) + { + *(class predicate *) e->aux = true; + return; } + + predicate p_seg = true; + predicate p_all = false; + + if (vr_type != VR_RANGE) + { + vr_wmin = wi::to_wide (TYPE_MIN_VALUE (type)); + vr_wmax = wi::to_wide (TYPE_MAX_VALUE (type)); + } + + /* Construct predicate to represent default range set that is negation of + all case ranges. Case range is classified as containing single/non-single + values. Suppose a piece of case ranges in the following. + + [D1...D2] [S1] ... [Sn] [D3...D4] + + To represent default case's range sets between two non-single value + case ranges (From D2 to D3), we construct predicate as: + + D2 < x < D3 && x != S1 && ... && x != Sn + */ + for (size_t i = 0; i < ranges.length (); i++) + { + tree min = ranges[i].first; + tree max = ranges[i].second; + + if (min == max) + p_seg &= add_condition (summary, index, size, &aggpos, NE_EXPR, + unshare_expr_without_location (min)); + else + { + /* Do not create sub-predicate for range that is beyond low bound + of switch index. */ + if (wi::lt_p (vr_wmin, wi::to_wide (min), TYPE_SIGN (type))) + { + p_seg &= add_condition (summary, index, size, &aggpos, LT_EXPR, + unshare_expr_without_location (min)); + p_all = p_all.or_with (summary->conds, p_seg); + } + + /* Do not create sub-predicate for range that is beyond up bound + of switch index. */ + if (wi::le_p (vr_wmax, wi::to_wide (max), TYPE_SIGN (type))) + { + p_seg = false; + break; + } + + p_seg = add_condition (summary, index, size, &aggpos, GT_EXPR, + unshare_expr_without_location (max)); + } + } + + p_all = p_all.or_with (summary->conds, p_seg); + *(class predicate *) e->aux + = p_all.or_with (summary->conds, *(class predicate *) e->aux); } diff --git a/gcc/params.def b/gcc/params.def index 13001a7bb2d..1461aa4849d 100644 --- a/gcc/params.def +++ b/gcc/params.def @@ -1123,6 +1123,14 @@ DEFPARAM (PARAM_IPA_MAX_AA_STEPS, "parameter analysis based on alias analysis in any given function.", 25000, 0, 0) +DEFPARAM (PARAM_IPA_MAX_SWITCH_PREDICATE_BOUNDS, + "ipa-max-switch-predicate-bounds", + "Maximal number of boundary endpoints of case ranges of switch " + "statement. For switch exceeding this limit, do not construct cost-" + "evaluating predicate for default case during IPA function summary" + "generation.", + 5, 0, 0) + /* WHOPR partitioning configuration. */ DEFPARAM (PARAM_LTO_PARTITIONS, diff --git a/gcc/testsuite/gcc.dg/ipa/pr91089.c b/gcc/testsuite/gcc.dg/ipa/pr91089.c new file mode 100644 index 00000000000..e9e206fc24d --- /dev/null +++ b/gcc/testsuite/gcc.dg/ipa/pr91089.c @@ -0,0 +1,62 @@ +/* { dg-do compile } */ +/* { dg-options "-O3 -fdump-ipa-cp-details -fdump-ipa-fnsummary-details --param ipa-max-switch-predicate-bounds=10 -fno-inline" } */ + +int fn (); + +int data; + +int callee (int i) +{ + switch (i) + { + case -126: return i + 13; + case -127: return i + 5; + case -8: return i * i; + case 0: return i % 9; + case 5: + case 7: + case 6: return 3; + default: + fn (); + fn (); + fn (); + fn (); + fn (); + fn (); + fn (); + fn (); + fn (); + fn (); + fn (); + fn (); + fn (); + fn (); + fn (); + fn (); + fn (); + fn (); + fn (); + } + + return data += i; +} + +int caller () +{ + return callee (-127) + + callee (-126) + + callee (-8) + + callee (0) + + callee (5) + + callee (6) + + callee (7) + + callee (100); +} + +/* { dg-final { scan-ipa-dump-times "Creating a specialized node of callee" 7 "cp" } } */ +/* { dg-final { scan-ipa-dump "op0 < -127" "fnsummary" } } */ +/* { dg-final { scan-ipa-dump "op0 > -126" "fnsummary" } } */ +/* { dg-final { scan-ipa-dump "op0 != -8" "fnsummary" } } */ +/* { dg-final { scan-ipa-dump "op0 != 0" "fnsummary" } } */ +/* { dg-final { scan-ipa-dump "op0 < 5" "fnsummary" } } */ +/* { dg-final { scan-ipa-dump "op0 > 7" "fnsummary" } } */ -- 2.17.1 ^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH 1/2 V3] Setup predicate for switch default case in IPA (PR ipa/91089) 2019-09-16 10:11 ` [PATCH 1/2 V3] " Feng Xue OS @ 2019-09-16 15:04 ` Jan Hubicka 2019-09-17 13:51 ` Feng Xue OS 0 siblings, 1 reply; 11+ messages in thread From: Jan Hubicka @ 2019-09-16 15:04 UTC (permalink / raw) To: Feng Xue OS; +Cc: Martin Jambor, gcc-patches Hi, patch is OK with change below provided that you add a ChangeLog entry (it is usual to write those into the emails) and that it passes bootstrap®testing (it is also usual to indicate this). Do you have rights to the svn repository and copyright assignment done? > > +DEFPARAM (PARAM_IPA_MAX_SWITCH_PREDICATE_BOUNDS, > + "ipa-max-switch-predicate-bounds", > + "Maximal number of boundary endpoints of case ranges of switch " > + "statement. For switch exceeding this limit, do not construct cost-" > + "evaluating predicate for default case during IPA function summary" > + "generation.", I think those texts are intended to not be too long. I think the first sentence extended by " used during IPA functoin summary generation" is good enough. Thank you, Honza ^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH 1/2 V3] Setup predicate for switch default case in IPA (PR ipa/91089) 2019-09-16 15:04 ` Jan Hubicka @ 2019-09-17 13:51 ` Feng Xue OS 0 siblings, 0 replies; 11+ messages in thread From: Feng Xue OS @ 2019-09-17 13:51 UTC (permalink / raw) To: Jan Hubicka; +Cc: Martin Jambor, gcc-patches > Hi, > patch is OK with change below provided that you add a ChangeLog entry > (it is usual to write those into the emails) and that it passes > bootstrap®testing (it is also usual to indicate this). > Do you have rights to the svn repository and copyright assignment done? Yes, I have managed to commit some patches before. Thanks for your review. Feng ^ permalink raw reply [flat|nested] 11+ messages in thread
* [PATCH 2/2 V3] Use post-dom info to update if/switch predicate (PR ipa/91089) 2019-09-14 15:57 ` Jan Hubicka 2019-09-16 10:11 ` [PATCH 1/2 V3] " Feng Xue OS @ 2019-09-16 10:16 ` Feng Xue OS 2019-09-16 15:05 ` Jan Hubicka 1 sibling, 1 reply; 11+ messages in thread From: Feng Xue OS @ 2019-09-16 10:16 UTC (permalink / raw) To: Jan Hubicka; +Cc: Martin Jambor, gcc-patches This one is split from original patch. Feng --- diff --git a/gcc/ipa-fnsummary.c b/gcc/ipa-fnsummary.c index 1bf1806eaf8..5423756d275 100644 --- a/gcc/ipa-fnsummary.c +++ b/gcc/ipa-fnsummary.c @@ -1197,8 +1197,14 @@ set_cond_stmt_execution_predicate (struct ipa_func_body_info *fbi, ? code : inverted_code); /* invert_tree_comparison will return ERROR_MARK on FP comparsions that are not EQ/NE instead of returning proper - unordered one. Be sure it is not confused with NON_CONSTANT. */ - if (this_code != ERROR_MARK) + unordered one. Be sure it is not confused with NON_CONSTANT. + + And if the edge's target is the final block of diamond CFG graph + of this conditional statement, we do not need to compute + predicate for the edge because the final block's predicate must + be at least as that of the first block of the statement. */ + if (this_code != ERROR_MARK + && !dominated_by_p (CDI_POST_DOMINATORS, bb, e->dest)) { predicate p = add_condition (summary, index, size, &aggpos, this_code, @@ -1282,6 +1288,14 @@ set_switch_stmt_execution_predicate (struct ipa_func_body_info *fbi, *(predicate *) e->aux = false; } + e = gimple_switch_edge (cfun, last, 0); + /* Set BOUND_COUNT to maximum count to bypass computing predicate for + default case if its target basic block is in convergence point of all + switch cases, which can be determined by checking whether it + post-dominates the switch statement. */ + if (dominated_by_p (CDI_POST_DOMINATORS, bb, e->dest)) + bound_count = INT_MAX; + n = gimple_switch_num_labels (last); for (case_idx = 1; case_idx < n; ++case_idx) { @@ -1293,7 +1307,12 @@ set_switch_stmt_execution_predicate (struct ipa_func_body_info *fbi, min = CASE_LOW (cl); max = CASE_HIGH (cl); - if (!max) + /* The case's target basic block is in convergence point of all switch + cases, its predicate should be at least as that of the switch + statement. */ + if (dominated_by_p (CDI_POST_DOMINATORS, bb, e->dest)) + p = true; + else if (!max) p = add_condition (summary, index, size, &aggpos, EQ_EXPR, unshare_expr_without_location (min)); else @@ -1463,10 +1482,10 @@ compute_bb_predicates (struct ipa_func_body_info *fbi, break; } } - if (p == false) - gcc_checking_assert (!bb->aux); - else + if (p != false) { + basic_block pdom_bb; + if (!bb->aux) { done = false; @@ -1485,6 +1504,34 @@ compute_bb_predicates (struct ipa_func_body_info *fbi, *((predicate *) bb->aux) = p; } } + + /* For switch/if statement, we can OR-combine predicates of all + its cases/branches to get predicate for basic block in their + convergence point, but sometimes this will generate very + complicated predicate. Actually, we can get simplified + predicate in another way by using the fact that predicate + for a basic block must also hold true for its post dominators. + To be specific, basic block in convergence point of + conditional statement should include predicate of the + statement. */ + pdom_bb = get_immediate_dominator (CDI_POST_DOMINATORS, bb); + if (pdom_bb == EXIT_BLOCK_PTR_FOR_FN (my_function) || !pdom_bb) + ; + else if (!pdom_bb->aux) + { + done = false; + pdom_bb->aux = edge_predicate_pool.allocate (); + *((predicate *) pdom_bb->aux) = p; + } + else if (p != *(predicate *) pdom_bb->aux) + { + p = p.or_with (summary->conds, *(predicate *)pdom_bb->aux); + if (p != *(predicate *) pdom_bb->aux) + { + done = false; + *((predicate *) pdom_bb->aux) = p; + } + } } } } @@ -2089,6 +2136,7 @@ analyze_function_body (struct cgraph_node *node, bool early) if (opt_for_fn (node->decl, optimize)) { calculate_dominance_info (CDI_DOMINATORS); + calculate_dominance_info (CDI_POST_DOMINATORS); if (!early) loop_optimizer_init (LOOPS_NORMAL | LOOPS_HAVE_RECORDED_EXITS); else @@ -2469,6 +2517,7 @@ analyze_function_body (struct cgraph_node *node, bool early) else if (!ipa_edge_args_sum) ipa_free_all_node_params (); free_dominance_info (CDI_DOMINATORS); + free_dominance_info (CDI_POST_DOMINATORS); } if (dump_file) { diff --git a/gcc/testsuite/gcc.dg/ipa/pr91089.c b/gcc/testsuite/gcc.dg/ipa/pr91089.c index e9e206fc24d..92b5550aa76 100644 --- a/gcc/testsuite/gcc.dg/ipa/pr91089.c +++ b/gcc/testsuite/gcc.dg/ipa/pr91089.c @@ -41,6 +41,52 @@ int callee (int i) return data += i; } +int fn2 (); + +int callee_complex_predicate (int i) +{ + switch (i ) + { + case 0: + fn (); + fn (); + fn (); + case 1: + fn (); + fn (); + case -1: + fn (); + case -2: + fn (); + fn (); + fn (); + fn (); + fn (); + fn (); + fn (); + fn (); + fn (); + fn (); + fn (); + fn (); + fn (); + fn (); + fn (); + fn (); + data += i; + break; + } + + if (i == 1000) + { + int j; + + for (j = 0; j < 100; j++) + fn2 (); + } + return i + 3; +} + int caller () { return callee (-127) + @@ -60,3 +106,4 @@ int caller () /* { dg-final { scan-ipa-dump "op0 != 0" "fnsummary" } } */ /* { dg-final { scan-ipa-dump "op0 < 5" "fnsummary" } } */ /* { dg-final { scan-ipa-dump "op0 > 7" "fnsummary" } } */ +/* { dg-final { scan-ipa-dump "loop depth: 1 .+ time:\[ \]*\[0-9\]+ predicate: \\(op0 == 1000\\)" "fnsummary" } } */ -- 2.17.1 ^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH 2/2 V3] Use post-dom info to update if/switch predicate (PR ipa/91089) 2019-09-16 10:16 ` [PATCH 2/2 V3] Use post-dom info to update if/switch predicate " Feng Xue OS @ 2019-09-16 15:05 ` Jan Hubicka 2019-09-17 13:54 ` Feng Xue OS 0 siblings, 1 reply; 11+ messages in thread From: Jan Hubicka @ 2019-09-16 15:05 UTC (permalink / raw) To: Feng Xue OS; +Cc: Martin Jambor, gcc-patches > This one is split from original patch. This patch is also OK (modulo the changelog entry and testing). Please wait with comitting this for bit over a day after commiting the first so the periodic benchmarks picks each change differently. Thanks, Honza > > Feng > --- > diff --git a/gcc/ipa-fnsummary.c b/gcc/ipa-fnsummary.c > index 1bf1806eaf8..5423756d275 100644 > --- a/gcc/ipa-fnsummary.c > +++ b/gcc/ipa-fnsummary.c > @@ -1197,8 +1197,14 @@ set_cond_stmt_execution_predicate (struct ipa_func_body_info *fbi, > ? code : inverted_code); > /* invert_tree_comparison will return ERROR_MARK on FP > comparsions that are not EQ/NE instead of returning proper > - unordered one. Be sure it is not confused with NON_CONSTANT. */ > - if (this_code != ERROR_MARK) > + unordered one. Be sure it is not confused with NON_CONSTANT. > + > + And if the edge's target is the final block of diamond CFG graph > + of this conditional statement, we do not need to compute > + predicate for the edge because the final block's predicate must > + be at least as that of the first block of the statement. */ > + if (this_code != ERROR_MARK > + && !dominated_by_p (CDI_POST_DOMINATORS, bb, e->dest)) > { > predicate p > = add_condition (summary, index, size, &aggpos, this_code, > @@ -1282,6 +1288,14 @@ set_switch_stmt_execution_predicate (struct ipa_func_body_info *fbi, > *(predicate *) e->aux = false; > } > > + e = gimple_switch_edge (cfun, last, 0); > + /* Set BOUND_COUNT to maximum count to bypass computing predicate for > + default case if its target basic block is in convergence point of all > + switch cases, which can be determined by checking whether it > + post-dominates the switch statement. */ > + if (dominated_by_p (CDI_POST_DOMINATORS, bb, e->dest)) > + bound_count = INT_MAX; > + > n = gimple_switch_num_labels (last); > for (case_idx = 1; case_idx < n; ++case_idx) > { > @@ -1293,7 +1307,12 @@ set_switch_stmt_execution_predicate (struct ipa_func_body_info *fbi, > min = CASE_LOW (cl); > max = CASE_HIGH (cl); > > - if (!max) > + /* The case's target basic block is in convergence point of all switch > + cases, its predicate should be at least as that of the switch > + statement. */ > + if (dominated_by_p (CDI_POST_DOMINATORS, bb, e->dest)) > + p = true; > + else if (!max) > p = add_condition (summary, index, size, &aggpos, EQ_EXPR, > unshare_expr_without_location (min)); > else > @@ -1463,10 +1482,10 @@ compute_bb_predicates (struct ipa_func_body_info *fbi, > break; > } > } > - if (p == false) > - gcc_checking_assert (!bb->aux); > - else > + if (p != false) > { > + basic_block pdom_bb; > + > if (!bb->aux) > { > done = false; > @@ -1485,6 +1504,34 @@ compute_bb_predicates (struct ipa_func_body_info *fbi, > *((predicate *) bb->aux) = p; > } > } > + > + /* For switch/if statement, we can OR-combine predicates of all > + its cases/branches to get predicate for basic block in their > + convergence point, but sometimes this will generate very > + complicated predicate. Actually, we can get simplified > + predicate in another way by using the fact that predicate > + for a basic block must also hold true for its post dominators. > + To be specific, basic block in convergence point of > + conditional statement should include predicate of the > + statement. */ > + pdom_bb = get_immediate_dominator (CDI_POST_DOMINATORS, bb); > + if (pdom_bb == EXIT_BLOCK_PTR_FOR_FN (my_function) || !pdom_bb) > + ; > + else if (!pdom_bb->aux) > + { > + done = false; > + pdom_bb->aux = edge_predicate_pool.allocate (); > + *((predicate *) pdom_bb->aux) = p; > + } > + else if (p != *(predicate *) pdom_bb->aux) > + { > + p = p.or_with (summary->conds, *(predicate *)pdom_bb->aux); > + if (p != *(predicate *) pdom_bb->aux) > + { > + done = false; > + *((predicate *) pdom_bb->aux) = p; > + } > + } > } > } > } > @@ -2089,6 +2136,7 @@ analyze_function_body (struct cgraph_node *node, bool early) > if (opt_for_fn (node->decl, optimize)) > { > calculate_dominance_info (CDI_DOMINATORS); > + calculate_dominance_info (CDI_POST_DOMINATORS); > if (!early) > loop_optimizer_init (LOOPS_NORMAL | LOOPS_HAVE_RECORDED_EXITS); > else > @@ -2469,6 +2517,7 @@ analyze_function_body (struct cgraph_node *node, bool early) > else if (!ipa_edge_args_sum) > ipa_free_all_node_params (); > free_dominance_info (CDI_DOMINATORS); > + free_dominance_info (CDI_POST_DOMINATORS); > } > if (dump_file) > { > diff --git a/gcc/testsuite/gcc.dg/ipa/pr91089.c b/gcc/testsuite/gcc.dg/ipa/pr91089.c > index e9e206fc24d..92b5550aa76 100644 > --- a/gcc/testsuite/gcc.dg/ipa/pr91089.c > +++ b/gcc/testsuite/gcc.dg/ipa/pr91089.c > @@ -41,6 +41,52 @@ int callee (int i) > return data += i; > } > > +int fn2 (); > + > +int callee_complex_predicate (int i) > +{ > + switch (i ) > + { > + case 0: > + fn (); > + fn (); > + fn (); > + case 1: > + fn (); > + fn (); > + case -1: > + fn (); > + case -2: > + fn (); > + fn (); > + fn (); > + fn (); > + fn (); > + fn (); > + fn (); > + fn (); > + fn (); > + fn (); > + fn (); > + fn (); > + fn (); > + fn (); > + fn (); > + fn (); > + data += i; > + break; > + } > + > + if (i == 1000) > + { > + int j; > + > + for (j = 0; j < 100; j++) > + fn2 (); > + } > + return i + 3; > +} > + > int caller () > { > return callee (-127) + > @@ -60,3 +106,4 @@ int caller () > /* { dg-final { scan-ipa-dump "op0 != 0" "fnsummary" } } */ > /* { dg-final { scan-ipa-dump "op0 < 5" "fnsummary" } } */ > /* { dg-final { scan-ipa-dump "op0 > 7" "fnsummary" } } */ > +/* { dg-final { scan-ipa-dump "loop depth: 1 .+ time:\[ \]*\[0-9\]+ predicate: \\(op0 == 1000\\)" "fnsummary" } } */ > -- > 2.17.1 ^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH 2/2 V3] Use post-dom info to update if/switch predicate (PR ipa/91089) 2019-09-16 15:05 ` Jan Hubicka @ 2019-09-17 13:54 ` Feng Xue OS 0 siblings, 0 replies; 11+ messages in thread From: Feng Xue OS @ 2019-09-17 13:54 UTC (permalink / raw) To: Jan Hubicka; +Cc: Martin Jambor, gcc-patches > This patch is also OK (modulo the changelog entry and testing). Please > wait with comitting this for bit over a day after commiting the first so > the periodic benchmarks picks each change differently. Understood. Thanks for this. Feng ^ permalink raw reply [flat|nested] 11+ messages in thread
end of thread, other threads:[~2019-09-17 13:54 UTC | newest] Thread overview: 11+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2019-07-12 8:51 [PATCH] Setup predicate for switch default case in IPA (PR ipa/91089) Feng Xue OS 2019-08-29 15:14 ` Martin Jambor 2019-08-30 8:32 ` Feng Xue OS 2019-09-02 3:06 ` [PATCH V2] " Feng Xue OS 2019-09-14 15:57 ` Jan Hubicka 2019-09-16 10:11 ` [PATCH 1/2 V3] " Feng Xue OS 2019-09-16 15:04 ` Jan Hubicka 2019-09-17 13:51 ` Feng Xue OS 2019-09-16 10:16 ` [PATCH 2/2 V3] Use post-dom info to update if/switch predicate " Feng Xue OS 2019-09-16 15:05 ` Jan Hubicka 2019-09-17 13:54 ` Feng Xue OS
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).