From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (qmail 47367 invoked by alias); 2 Sep 2019 03:06:20 -0000 Mailing-List: contact gcc-patches-help@gcc.gnu.org; run by ezmlm Precedence: bulk List-Id: List-Archive: List-Post: List-Help: Sender: gcc-patches-owner@gcc.gnu.org Received: (qmail 47356 invoked by uid 89); 2 Sep 2019 03:06:19 -0000 Authentication-Results: sourceware.org; auth=none X-Spam-SWARE-Status: No, score=-24.6 required=5.0 tests=AWL,BAYES_00,GIT_PATCH_0,GIT_PATCH_1,GIT_PATCH_2,GIT_PATCH_3,RCVD_IN_DNSWL_NONE,SPF_HELO_PASS,SPF_PASS autolearn=ham version=3.3.1 spammy=connected, nearly, fnoinline, fno-inline X-HELO: NAM02-BL2-obe.outbound.protection.outlook.com Received: from mail-eopbgr750138.outbound.protection.outlook.com (HELO NAM02-BL2-obe.outbound.protection.outlook.com) (40.107.75.138) by sourceware.org (qpsmtpd/0.93/v0.84-503-g423c35a) with ESMTP; Mon, 02 Sep 2019 03:06:17 +0000 ARC-Seal: i=1; a=rsa-sha256; s=arcselector9901; d=microsoft.com; cv=none; b=JwaJ1FTJU5mO2c+TD7j+TaIIZYNogcWEjZN90YSgK9Fr9lmgc3BxhIcecwBWwu4TLAA+ucibB3wHh+c/aeptx9QCBDwRknD8ouYgnHwj6cUSoTxO6B1GCC+KJOGETWEAVAKutugYuyLk5G38VNQZ9KXeg+tZh9adZ8uyRHWHOs08QJ2v7pssNaRjuptAVcmhqDrRTtUQcNXqtCbyFQ/5ukb4I5aL+M5gSBUfiN+dF3y8I9VY9uBX1yah7nAE3GXnhzkZJHDBjVSUAfoosdX68KClcN9Qa0w2urBEQkeQltY2HS1Ot0eYzhGQ/bERwTG+i9tyN7V7aru9e1U9qB31Uw== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=microsoft.com; s=arcselector9901; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-SenderADCheck; bh=4VUS1RQHjmnk/sf0xUcmXE03hEPkgeydUjlxoaHKGJ0=; b=fcRCI1jw9UJNrHOgrdAm1CbzdaFinzXjAVkU22t6aP1fuUZKZ1Z5hkTWzMOx44vKewSXOOSwePJjs3jUXDR3/Xsm9/Au5I5hud7lTx0Z8x93BIrmifSNz661Ch4wakCpfNVU+q0IIne7NOaJKBEzObiJDlqocTQtRhZpU9tWL8xOPRNJaiChsnLETSfpZDCraay1+HAVZM/BTXr2l2g784cx8h1V+TZYxtC8XpLY72VbmUXITrRFd8kGHwmFaM2ZbMNuCd/e0BCI+FMb42wku4iL42i0W3K4o1u3YXt+JWk8ppf43KSeHvY6AZvfAg1Wb62+TQeP3c4KHmB+lWIPnQ== ARC-Authentication-Results: i=1; mx.microsoft.com 1; spf=pass smtp.mailfrom=os.amperecomputing.com; dmarc=pass action=none header.from=os.amperecomputing.com; dkim=pass header.d=os.amperecomputing.com; arc=none DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=os.amperecomputing.com; s=selector2; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-SenderADCheck; bh=4VUS1RQHjmnk/sf0xUcmXE03hEPkgeydUjlxoaHKGJ0=; b=EsbooWhDx1Mj+gEfnwvExE71BRtUtSN9GOnXlnBz4qWCKxL0v39eGxc3AXIWXckyADUCxA/VvkZL+ceLIVV+zYLC8yk+rDjQ3j625Rfn8qzBjWwGrs1X0nHZh084n3c7r0iC8VP94ERPmX+CiteaDPq+xk9Qu+G0N1gIUJp1ThE= Received: from BYAPR01MB4869.prod.exchangelabs.com (20.177.228.18) by BYAPR01MB4261.prod.exchangelabs.com (52.135.239.84) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.2220.19; Mon, 2 Sep 2019 03:06:13 +0000 Received: from BYAPR01MB4869.prod.exchangelabs.com ([fe80::60eb:f69d:f5b6:cc27]) by BYAPR01MB4869.prod.exchangelabs.com ([fe80::60eb:f69d:f5b6:cc27%2]) with mapi id 15.20.2199.021; Mon, 2 Sep 2019 03:06:13 +0000 From: Feng Xue OS To: Martin Jambor , Jan Hubicka , "gcc-patches@gcc.gnu.org" Subject: [PATCH V2] Setup predicate for switch default case in IPA (PR ipa/91089) Date: Mon, 02 Sep 2019 03:06:00 -0000 Message-ID: References: , In-Reply-To: authentication-results: spf=none (sender IP is ) smtp.mailfrom=fxue@os.amperecomputing.com; x-ms-oob-tlc-oobclassifiers: OLM:10000; received-spf: None (protection.outlook.com: os.amperecomputing.com does not designate permitted sender hosts) x-ms-exchange-senderadcheck: 1 x-ms-exchange-transport-forked: True Content-Type: text/plain; charset="iso-8859-1" Content-Transfer-Encoding: quoted-printable MIME-Version: 1.0 X-MS-Exchange-CrossTenant-mailboxtype: HOSTED X-MS-Exchange-CrossTenant-userprincipalname: tjETyMh4RcOXHkhGJE20UgL1TZgYMgfhUJ7ICZerBAlcQ83xdxL4MQe+PEYvE5KxDxeYj4n5E/MgCRP/b6JFSVQLKpVokmjW5atEwcEtj54= X-SW-Source: 2019-09/txt/msg00021.txt.bz2 > I have read through the patch and it looks OK to me but I cannot approve > it, you have to ping Honza for that. Since you decided to use the value > range info, it would be nice if you could also add a testcase where it > plays a role.=20 It is somewhat hard to write a testcase to show role of range info only with this patch. If another patch "Generalized predicate/condition for parameter reference in IPA (PR ipa/91088)" is accepted, it will become easy and I will update this testcase to show that. And this new version also resolves a problem that we might generate very complicated predicate for basic block in convergence point reached from all switch cases. switch (index) / | \ case1 case2 ... default \ | / convergence point For switch/if statement, current algorithm synthesizes predicate of convergence point by OR-combining predicates of all its cases/branches, whi= ch might give us a very complicated one. Actually, we can get simplified pred= icate by using the fact that predicate for a basic block must also hold true for = its post dominators. To be specific, convergence point should include (at least equ= al to) predicate of the switch/if statement. Honza & Martin, =20=20=20=20 Would please review this new patch when you have time? Thanks. Feng ----- diff --git a/gcc/ipa-fnsummary.c b/gcc/ipa-fnsummary.c index 278bf606661..dc5752fc005 100644 --- a/gcc/ipa-fnsummary.c +++ b/gcc/ipa-fnsummary.c @@ -1198,7 +1198,8 @@ set_cond_stmt_execution_predicate (struct ipa_func_bo= dy_info *fbi, /* invert_tree_comparison will return ERROR_MARK on FP comparsions that are not EQ/NE instead of returning proper unordered one. Be sure it is not confused with NON_CONSTANT. */ - if (this_code !=3D ERROR_MARK) + if (this_code !=3D ERROR_MARK + && !dominated_by_p (CDI_POST_DOMINATORS, bb, e->dest)) { predicate p =3D add_condition (summary, index, size, &aggpos, this_code, @@ -1269,13 +1270,31 @@ set_switch_stmt_execution_predicate (struct ipa_fun= c_body_info *fbi, if (!unmodified_parm_or_parm_agg_item (fbi, last, op, &index, &size, &ag= gpos)) return; =20 + auto_vec > ranges; + tree type =3D TREE_TYPE (op); + tree one_cst =3D build_one_cst (type); + int bound_limit =3D PARAM_VALUE (PARAM_IPA_MAX_SWITCH_PREDICATE_BOUNDS); + int bound_count =3D 0; + value_range_base vr; + + get_range_info (op, vr); + FOR_EACH_EDGE (e, ei, bb->succs) { e->aux =3D edge_predicate_pool.allocate (); *(predicate *) e->aux =3D false; } + + e =3D gimple_switch_edge (cfun, last, 0); + /* Set BOUND_COUNT to maximum count to bypass computing predicate for + default case if its target basic block is in convergence point of all + switch cases, which can be determined by checking whether it + post-dominates the switch statement. */ + if (dominated_by_p (CDI_POST_DOMINATORS, bb, e->dest)) + bound_count =3D INT_MAX; + n =3D gimple_switch_num_labels (last); - for (case_idx =3D 0; case_idx < n; ++case_idx) + for (case_idx =3D 1; case_idx < n; ++case_idx) { tree cl =3D gimple_switch_label (last, case_idx); tree min, max; @@ -1285,10 +1304,10 @@ set_switch_stmt_execution_predicate (struct ipa_fun= c_body_info *fbi, min =3D CASE_LOW (cl); max =3D CASE_HIGH (cl); =20 - /* For default we might want to construct predicate that none - of cases is met, but it is bit hard to do not having negations - of conditionals handy. */ - if (!min && !max) + /* The case's target basic block is in convergence point of all swit= ch + cases, its predicate should be at least as that of the switch + statement. */ + if (dominated_by_p (CDI_POST_DOMINATORS, bb, e->dest)) p =3D true; else if (!max) p =3D add_condition (summary, index, size, &aggpos, EQ_EXPR, @@ -1304,7 +1323,111 @@ set_switch_stmt_execution_predicate (struct ipa_fun= c_body_info *fbi, } *(class predicate *) e->aux =3D p.or_with (summary->conds, *(class predicate *) e->aux); + + /* If there are too many disjoint case ranges, predicate for default + case might become too complicated. So add a limit here. */ + if (bound_count > bound_limit) + continue; + + bool new_range =3D true; + + if (!ranges.is_empty ()) + { + tree last_max_plus_one + =3D int_const_binop (PLUS_EXPR, ranges.last ().second, one_cst); + + /* Merge case ranges if they are continuous. */ + if (tree_int_cst_equal (min, last_max_plus_one)) + new_range =3D false; + else if (vr.kind () =3D=3D VR_ANTI_RANGE) + { + tree min_minus_one =3D int_const_binop (MINUS_EXPR, min, one_cst); + + /* If two disjoint case ranges can be connected by anti-range + of switch index, combine them to one range. */ + if (tree_int_cst_lt (vr.max (), min_minus_one)) + vr.set_undefined (); + else if (tree_int_cst_le (vr.min (), last_max_plus_one)) + new_range =3D false; + } + } + + if (!max) + max =3D min; + + /* Create/extend a case range. And we count endpoints of range set, + this number nearly equals to number of conditions that we will create + for predicate of default case. */ + if (new_range) + { + bound_count +=3D (min =3D=3D max) ? 1 : 2; + ranges.safe_push (std::make_pair (min, max)); + } + else + { + bound_count +=3D (ranges.last ().first =3D=3D ranges.last ().second); + ranges.last ().second =3D max; + } + } + + e =3D gimple_switch_edge (cfun, last, 0); + if (bound_count > bound_limit) + { + *(class predicate *) e->aux =3D true; + return; + } + + tree vr_min =3D vr.kind () =3D=3D VR_RANGE ? vr.min () : TYPE_MIN_VALUE = (type); + tree vr_max =3D vr.kind () =3D=3D VR_RANGE ? vr.max () : TYPE_MAX_VALUE = (type); + predicate p_seg =3D true; + predicate p_all =3D false; + + /* Construct predicate to represent default range set that is negation of + all case ranges. Case range is classified as containing single/non-s= ingle + values. Suppose a piece of case ranges in the following. + + [D1...D2] [S1] ... [Sn] [D3...D4] + + To represent default case's range sets between two non-single value + case ranges (From D2 to D3), we construct predicate as: + + D2 < x < D3 && x !=3D S1 && ... && x !=3D Sn + */ + for (size_t i =3D 0; i < ranges.length (); i++) + { + tree min =3D ranges[i].first; + tree max =3D ranges[i].second; + + if (min =3D=3D max) + p_seg &=3D add_condition (summary, index, size, &aggpos, NE_EXPR, + unshare_expr_without_location (min)); + else + { + /* Do not create sub-predicate for range that is beyond low bound + of switch index. */ + if (tree_int_cst_lt (vr_min, min)) + { + p_seg &=3D add_condition (summary, index, size, &aggpos, LT_EXPR, + unshare_expr_without_location (min)); + p_all =3D p_all.or_with (summary->conds, p_seg); + } + + /* Do not create sub-predicate for range that is beyond up bound + of switch index. */ + if (tree_int_cst_le (vr_max, max)) + { + p_seg =3D false; + break; + } + + p_seg =3D add_condition (summary, index, size, &aggpos, GT_EXPR, + unshare_expr_without_location (max)); + } } + + p_all =3D p_all.or_with (summary->conds, p_seg); + *(class predicate *) e->aux + =3D p_all.or_with (summary->conds, *(class predicate *) e->aux); } =20 =20 @@ -1354,10 +1477,10 @@ compute_bb_predicates (struct ipa_func_body_info *f= bi, break; } } - if (p =3D=3D false) - gcc_checking_assert (!bb->aux); - else + if (p !=3D false) { + basic_block pdom_bb; + if (!bb->aux) { done =3D false; @@ -1376,6 +1499,34 @@ compute_bb_predicates (struct ipa_func_body_info *fb= i, *((predicate *) bb->aux) =3D p; } } + + /* For switch/if statement, we can OR-combine predicates of all + its cases/branches to get predicate for basic block in their + convergence point, but sometimes this will generate very + complicated predicate. Actually, we can get simplified + predicate in another way by using the fact that predicate + for a basic block must also hold true for its post dominators. + To be specific, basic block in convergence point of + conditional statement should include predicate of the + statement. */ + pdom_bb =3D get_immediate_dominator (CDI_POST_DOMINATORS, bb); + if (pdom_bb =3D=3D EXIT_BLOCK_PTR_FOR_FN (my_function) || !pdom_bb) + ; + else if (!pdom_bb->aux) + { + done =3D false; + pdom_bb->aux =3D edge_predicate_pool.allocate (); + *((predicate *) pdom_bb->aux) =3D p; + } + else if (p !=3D *(predicate *) pdom_bb->aux) + { + p =3D p.or_with (summary->conds, *(predicate *)pdom_bb->aux); + if (p !=3D *(predicate *) pdom_bb->aux) + { + done =3D false; + *((predicate *) pdom_bb->aux) =3D p; + } + } } } } @@ -1980,6 +2131,7 @@ analyze_function_body (struct cgraph_node *node, bool= early) if (opt_for_fn (node->decl, optimize)) { calculate_dominance_info (CDI_DOMINATORS); + calculate_dominance_info (CDI_POST_DOMINATORS); if (!early) loop_optimizer_init (LOOPS_NORMAL | LOOPS_HAVE_RECORDED_EXITS); else @@ -2360,6 +2512,7 @@ analyze_function_body (struct cgraph_node *node, bool= early) else if (!ipa_edge_args_sum) ipa_free_all_node_params (); free_dominance_info (CDI_DOMINATORS); + free_dominance_info (CDI_POST_DOMINATORS); } if (dump_file) { diff --git a/gcc/params.def b/gcc/params.def index 13001a7bb2d..1461aa4849d 100644 --- a/gcc/params.def +++ b/gcc/params.def @@ -1123,6 +1123,14 @@ DEFPARAM (PARAM_IPA_MAX_AA_STEPS, "parameter analysis based on alias analysis in any given function.", 25000, 0, 0) =20 +DEFPARAM (PARAM_IPA_MAX_SWITCH_PREDICATE_BOUNDS, + "ipa-max-switch-predicate-bounds", + "Maximal number of boundary endpoints of case ranges of switch " + "statement. For switch exceeding this limit, do not construct cost-" + "evaluating predicate for default case during IPA function summary" + "generation.", + 5, 0, 0) + /* WHOPR partitioning configuration. */ =20 DEFPARAM (PARAM_LTO_PARTITIONS, diff --git a/gcc/testsuite/gcc.dg/ipa/pr91089.c b/gcc/testsuite/gcc.dg/ipa/= pr91089.c new file mode 100644 index 00000000000..dbd9b8c94fe --- /dev/null +++ b/gcc/testsuite/gcc.dg/ipa/pr91089.c @@ -0,0 +1,111 @@ +/* { dg-do compile } */ +/* { dg-options "-O3 -fdump-ipa-cp-details -fdump-ipa-fnsummary-details --= param ipa-max-switch-predicate-bounds=3D10 -fno-inline" } */ + +int fn(); + +int data; + +int callee1(int i) +{ + switch (i) + { + case -126: return i + 13; + case -127: return i + 5; + case -8: return i * i; + case 0: return i % 9; + case 5: + case 7: + case 6: return 3; + default: + fn(); + fn(); + fn(); + fn(); + fn(); + fn(); + fn(); + fn(); + fn(); + fn(); + fn(); + fn(); + fn(); + fn(); + fn(); + fn(); + fn(); + fn(); + fn(); + } + + return data +=3D i; +} + +int fn2(); + +int callee2(int i) +{ + switch (i ) + { + case 0: + fn(); + fn(); + fn(); + case 1: + fn(); + fn(); + case -1: + fn(); + case -2: + fn(); + fn(); + fn(); + fn(); + fn(); + fn(); + fn(); + fn(); + fn(); + fn(); + fn(); + fn(); + fn(); + fn(); + fn(); + fn(); + data +=3D i; + break; + } + + if (i =3D=3D 1000) + { + int j; + + for (j =3D 0; j < 100; j++) + fn2(); + } + return i + 3; +}=09=09 + +int caller() +{ + return callee1 (-127) + + callee1 (-126) + + callee1 (-8) + + callee1 (0) + + callee1 (5) + + callee1 (6) + + callee1 (7) + + callee1 (100) + + callee2 (9); + callee2 (13); +} +=20 +/* { dg-final { scan-ipa-dump-times "Creating a specialized node of callee= " 7 "cp" } } */ +/* { dg-final { scan-ipa-dump "op0 < -127" "fnsummary" } } */ +/* { dg-final { scan-ipa-dump "op0 > -126" "fnsummary" } } */ +/* { dg-final { scan-ipa-dump "op0 !=3D -8" "fnsummary" } } */ +/* { dg-final { scan-ipa-dump "op0 !=3D 0" "fnsummary" } } */ +/* { dg-final { scan-ipa-dump "op0 < 5" "fnsummary" } } */ +/* { dg-final { scan-ipa-dump "op0 > 7" "fnsummary" } } */ +/* { dg-final { scan-ipa-dump "loop depth: 1 .+ time:\[ \]*\[0-9\]+ predic= ate: \\(op0 =3D=3D 1000\\)" "fnsummary" } } */ --=20 2.17.1