From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (qmail 99615 invoked by alias); 30 May 2019 23:35:50 -0000 Mailing-List: contact gcc-patches-help@gcc.gnu.org; run by ezmlm Precedence: bulk List-Id: List-Archive: List-Post: List-Help: Sender: gcc-patches-owner@gcc.gnu.org Received: (qmail 99607 invoked by uid 89); 30 May 2019 23:35:50 -0000 Authentication-Results: sourceware.org; auth=none X-Spam-SWARE-Status: No, score=-13.3 required=5.0 tests=AWL,BAYES_00,GIT_PATCH_0,GIT_PATCH_1,GIT_PATCH_2,GIT_PATCH_3,SPF_HELO_PASS,UNSUBSCRIBE_BODY autolearn=ham version=3.3.1 spammy= X-HELO: mx1.redhat.com Received: from mx1.redhat.com (HELO mx1.redhat.com) (209.132.183.28) by sourceware.org (qpsmtpd/0.93/v0.84-503-g423c35a) with ESMTP; Thu, 30 May 2019 23:35:48 +0000 Received: from smtp.corp.redhat.com (int-mx01.intmail.prod.int.phx2.redhat.com [10.5.11.11]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mx1.redhat.com (Postfix) with ESMTPS id A32EB859FB; Thu, 30 May 2019 23:35:46 +0000 (UTC) Received: from localhost.localdomain (ovpn-112-56.rdu2.redhat.com [10.10.112.56]) by smtp.corp.redhat.com (Postfix) with ESMTP id 0963587B1; Thu, 30 May 2019 23:35:44 +0000 (UTC) Subject: Re: [PATCH V2] A jump threading opportunity for condition branch To: Jiufu Guo Cc: Richard Biener , gcc-patches@gcc.gnu.org, Jakub Jelinek , Daniel Berlin , segher@kernel.crashing.org, wschmidt@linux.ibm.com References: <1558446288-52444-1-git-send-email-guojiufu@linux.ibm.com> <08cc117a-52c7-adb7-2bef-1aef6379e5c6@redhat.com> From: Jeff Law Openpgp: preference=signencrypt Message-ID: Date: Thu, 30 May 2019 23:55:00 -0000 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:60.0) Gecko/20100101 Thunderbird/60.4.0 MIME-Version: 1.0 In-Reply-To: Content-Type: multipart/mixed; boundary="------------6497B2ACE7B7C7A3859A86BB" X-IsSubscribed: yes X-SW-Source: 2019-05/txt/msg02069.txt.bz2 This is a multi-part message in MIME format. --------------6497B2ACE7B7C7A3859A86BB Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: 8bit Content-length: 12087 On 5/30/19 9:03 AM, Jiufu Guo wrote: > Jeff Law writes: > >> On 5/29/19 6:36 AM, Richard Biener wrote: >>> On Tue, 28 May 2019, Jiufu Guo wrote: >>> >>>> Hi, >>>> >>>> This patch implements a new opportunity of jump threading for PR77820. >>>> In this optimization, conditional jumps are merged with unconditional >>>> jump. And then moving CMP result to GPR is eliminated. >>>> >>>> This version is based on the proposal of Richard, Jeff and Andrew, and >>>> refined to incorporate comments. Thanks for the reviews! >>>> >>>> Bootstrapped and tested on powerpc64le and powerpc64be with no >>>> regressions (one case is improved) and new testcases are added. Is this >>>> ok for trunk? >>>> >>>> Example of this opportunity looks like below: >>>> >>>> >>>> p0 = a CMP b >>>> goto ; >>>> >>>> >>>> p1 = c CMP d >>>> goto ; >>>> >>>> >>>> # phi = PHI >>>> if (phi != 0) goto ; else goto ; >>>> >>>> Could be transformed to: >>>> >>>> >>>> p0 = a CMP b >>>> if (p0 != 0) goto ; else goto ; >>>> >>>> >>>> p1 = c CMP d >>>> if (p1 != 0) goto ; else goto ; >>>> >>>> >>>> This optimization eliminates: >>>> 1. saving CMP result: p0 = a CMP b. >>>> 2. additional CMP on branch: if (phi != 0). >>>> 3. converting CMP result if there is phi = (INT_CONV) p0 if there is. >>>> >>>> Thanks! >>>> Jiufu Guo >>>> >>>> >>>> [gcc] >>>> 2019-05-28 Jiufu Guo >>>> Lijia He >>>> >>>> PR tree-optimization/77820 >>>> * tree-ssa-threadedge.c >>>> (edge_forwards_cmp_to_conditional_jump_through_empty_bb_p): New >>>> function. >>>> (thread_across_edge): Add call to >>>> edge_forwards_cmp_to_conditional_jump_through_empty_bb_p. >>>> >>>> [gcc/testsuite] >>>> 2019-05-28 Jiufu Guo >>>> Lijia He >>>> >>>> PR tree-optimization/77820 >>>> * gcc.dg/tree-ssa/phi_on_compare-1.c: New testcase. >>>> * gcc.dg/tree-ssa/phi_on_compare-2.c: New testcase. >>>> * gcc.dg/tree-ssa/phi_on_compare-3.c: New testcase. >>>> * gcc.dg/tree-ssa/phi_on_compare-4.c: New testcase. >>>> * gcc.dg/tree-ssa/split-path-6.c: Update testcase. >>>> >>>> --- >>>> gcc/testsuite/gcc.dg/tree-ssa/phi_on_compare-1.c | 30 ++++++++++ >>>> gcc/testsuite/gcc.dg/tree-ssa/phi_on_compare-2.c | 23 +++++++ >>>> gcc/testsuite/gcc.dg/tree-ssa/phi_on_compare-3.c | 25 ++++++++ >>>> gcc/testsuite/gcc.dg/tree-ssa/phi_on_compare-4.c | 40 +++++++++++++ >>>> gcc/testsuite/gcc.dg/tree-ssa/split-path-6.c | 2 +- >>>> gcc/tree-ssa-threadedge.c | 76 +++++++++++++++++++++++- >>>> 6 files changed, 192 insertions(+), 4 deletions(-) >>>> create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/phi_on_compare-1.c >>>> create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/phi_on_compare-2.c >>>> create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/phi_on_compare-3.c >>>> create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/phi_on_compare-4.c >>>> >>>> diff --git a/gcc/testsuite/gcc.dg/tree-ssa/phi_on_compare-1.c b/gcc/testsuite/gcc.dg/tree-ssa/phi_on_compare-1.c >>>> new file mode 100644 >>>> index 0000000..5227c87 >>>> --- /dev/null >>>> +++ b/gcc/testsuite/gcc.dg/tree-ssa/phi_on_compare-1.c >>>> @@ -0,0 +1,30 @@ >>>> +/* { dg-do compile } */ >>>> +/* { dg-options "-Ofast -fdump-tree-vrp1" } */ >>>> + >>>> +void g (int); >>>> +void g1 (int); >>>> + >>>> +void >>>> +f (long a, long b, long c, long d, long x) >>>> +{ >>>> + _Bool t; >>>> + if (x) >>>> + { >>>> + g (a + 1); >>>> + t = a < b; >>>> + c = d + x; >>>> + } >>>> + else >>>> + { >>>> + g (b + 1); >>>> + a = c + d; >>>> + t = c > d; >>>> + } >>>> + >>>> + if (t) >>>> + g1 (c); >>>> + >>>> + g (a); >>>> +} >>>> + >>>> +/* { dg-final { scan-tree-dump-times "Removing basic block" 1 "vrp1" } } */ >>>> diff --git a/gcc/testsuite/gcc.dg/tree-ssa/phi_on_compare-2.c b/gcc/testsuite/gcc.dg/tree-ssa/phi_on_compare-2.c >>>> new file mode 100644 >>>> index 0000000..eaf89bb >>>> --- /dev/null >>>> +++ b/gcc/testsuite/gcc.dg/tree-ssa/phi_on_compare-2.c >>>> @@ -0,0 +1,23 @@ >>>> +/* { dg-do compile } */ >>>> +/* { dg-options "-Ofast -fdump-tree-vrp1" } */ >>>> + >>>> +void g (void); >>>> +void g1 (void); >>>> + >>>> +void >>>> +f (long a, long b, long c, long d, int x) >>>> +{ >>>> + _Bool t; >>>> + if (x) >>>> + t = c < d; >>>> + else >>>> + t = a < b; >>>> + >>>> + if (t) >>>> + { >>>> + g1 (); >>>> + g (); >>>> + } >>>> +} >>>> + >>>> +/* { dg-final { scan-tree-dump-times "Removing basic block" 1 "vrp1" } } */ >>>> diff --git a/gcc/testsuite/gcc.dg/tree-ssa/phi_on_compare-3.c b/gcc/testsuite/gcc.dg/tree-ssa/phi_on_compare-3.c >>>> new file mode 100644 >>>> index 0000000..d5a1e0b >>>> --- /dev/null >>>> +++ b/gcc/testsuite/gcc.dg/tree-ssa/phi_on_compare-3.c >>>> @@ -0,0 +1,25 @@ >>>> +/* { dg-do compile } */ >>>> +/* { dg-options "-Ofast -fdump-tree-vrp1" } */ >>>> + >>>> +void g (void); >>>> +void g1 (void); >>>> + >>>> +void >>>> +f (long a, long b, long c, long d, int x) >>>> +{ >>>> + int t; >>>> + if (x) >>>> + t = a < b; >>>> + else if (d == x) >>>> + t = c < b; >>>> + else >>>> + t = d > c; >>>> + >>>> + if (t) >>>> + { >>>> + g1 (); >>>> + g (); >>>> + } >>>> +} >>>> + >>>> +/* { dg-final { scan-tree-dump-times "Removing basic block" 1 "vrp1" } } */ >>>> diff --git a/gcc/testsuite/gcc.dg/tree-ssa/phi_on_compare-4.c b/gcc/testsuite/gcc.dg/tree-ssa/phi_on_compare-4.c >>>> new file mode 100644 >>>> index 0000000..53acabc >>>> --- /dev/null >>>> +++ b/gcc/testsuite/gcc.dg/tree-ssa/phi_on_compare-4.c >>>> @@ -0,0 +1,40 @@ >>>> +/* { dg-do compile } */ >>>> +/* { dg-options "-Ofast -fdump-tree-vrp1" } */ >>>> + >>>> +void g (int); >>>> +void g1 (int); >>>> + >>>> +void >>>> +f (long a, long b, long c, long d, int x) >>>> +{ >>>> + int t; >>>> + _Bool l1 = 0, l2 = 0; >>>> + if (x) >>>> + { >>>> + g (a); >>>> + c = a + b; >>>> + t = a < b; >>>> + l1 = 1; >>>> + } >>>> + else >>>> + { >>>> + g1 (b); >>>> + t = c > d; >>>> + d = c + b; >>>> + l2 = 1; >>>> + } >>>> + >>>> + if (t) >>>> + { >>>> + if (l1 | l2) >>>> + g1 (c); >>>> + } >>>> + else >>>> + { >>>> + g (d); >>>> + g1 (a + b); >>>> + } >>>> + g (c + d); >>>> +} >>>> + >>>> +/* { dg-final { scan-tree-dump-times "Removing basic block" 1 "vrp1" } } */ >>>> diff --git a/gcc/testsuite/gcc.dg/tree-ssa/split-path-6.c b/gcc/testsuite/gcc.dg/tree-ssa/split-path-6.c >>>> index e9b4f26..1d7b587 100644 >>>> --- a/gcc/testsuite/gcc.dg/tree-ssa/split-path-6.c >>>> +++ b/gcc/testsuite/gcc.dg/tree-ssa/split-path-6.c >>>> @@ -69,4 +69,4 @@ lookharder (string) >>>> } >>>> } >>>> >>>> -/* { dg-final { scan-tree-dump-times "Duplicating join block" 3 "split-paths" } } */ >>>> +/* { dg-final { scan-tree-dump-times "Duplicating join block" 2 "split-paths" } } */ >>>> diff --git a/gcc/tree-ssa-threadedge.c b/gcc/tree-ssa-threadedge.c >>>> index c3ea2d6..36c413a 100644 >>>> --- a/gcc/tree-ssa-threadedge.c >>>> +++ b/gcc/tree-ssa-threadedge.c >>>> @@ -1157,6 +1157,74 @@ thread_through_normal_block (edge e, >>>> return 0; >>>> } >>>> >>>> +/* There are basic blocks look like: >>>> + >>>> + p0 = a CMP b ; or p0 = (INT) (a CMP b) >>>> + goto ; >>>> + >>>> + >>>> + p1 = c CMP d >>>> + goto ; >>>> + >>>> + >>>> + # phi = PHI >>>> + if (phi != 0) goto ; else goto ; >>>> + >>>> + Then, edge (P0,X) or (P1,X) could be marked as EDGE_START_JUMP_THREAD >>>> + And edge (X,Y), (X,Z) is EDGE_COPY_SRC_JOINER_BLOCK >>>> + >>>> + Return true if E is (P0,X) or (P1,X) */ >>>> + >>>> +bool >>>> +edge_forwards_cmp_to_conditional_jump_through_empty_bb_p (edge e) >>>> +{ >>>> + basic_block bb = e->dest; >>>> + >>>> + /* See if there is only one stmt which is gcond. */ >>>> + gimple *gs = last_and_only_stmt (bb); >>>> + if (gs == NULL || gimple_code (gs) != GIMPLE_COND) >>>> + return false; >>> gcond *gs; >>> if (!(gs = safe_dyn_cast (last_and_only_stmt (bb)))) >>> return false; >>> >>> makes the following gimple_cond_ accesses more efficient when >>> checking is enabled. >>> >>>> + >>>> + /* See if gcond's cond is "(phi !=/== 0/1)" in the basic block. */ >>>> + tree cond = gimple_cond_lhs (gs); >>>> + enum tree_code code = gimple_cond_code (gs); >>>> + tree rhs = gimple_cond_rhs (gs); >>>> + if (TREE_CODE (cond) != SSA_NAME >>>> + || (code != NE_EXPR && code != EQ_EXPR) >>>> + || (!integer_onep (rhs) && !integer_zerop (rhs))) >>>> + return false; >>>> + gphi *phi = dyn_cast (SSA_NAME_DEF_STMT (cond)); >>>> + if (phi == NULL || gimple_bb (phi) != bb) >>>> + return false; >>>> + >>>> + /* Check if phi's incoming value is CMP. */ >>>> + gimple *def; >>>> + tree value = PHI_ARG_DEF_FROM_EDGE (phi, e); >>>> + if (TREE_CODE (value) == SSA_NAME >>>> + && has_single_use (value) >>>> + && is_gimple_assign (SSA_NAME_DEF_STMT (value))) >>>> + def = SSA_NAME_DEF_STMT (value); >>> Same is true here and below if you rewrite to >>> >>> gassign *def; >>> tree value = PHI_ARG_DEF_FROM_EDGE (phi, e); >>> if (TREE_CODE (value) != SSA_NAME >>> || !has_single_use (value) >>> || !(def = dyn_cast (SSA_NAME_DEF_STMT (value)))) >>> return false; >>> >>> Otherwise it looks good. I'd like to have Jeffs opinion and >>> final ACK here because we touch jump-threading and he's most >>> familiar with that detail and the place you hook into. >> I've got the full thread to look over. At a high level I wouldn't have >> guessed it'd be this easy to get the threader handle this, but >> occasionally we are surprised in a good way. Anyway, I'll be looking >> through the full discussion. >> >> Jeff > > Hi Jeff, Richard and all, > > Thanks a lot for your great comments in all threads. Based on those > comments, I refined the code as below: > > bool > edge_forwards_cmp_to_conditional_jump_through_empty_bb_p (edge e) > { > /* See if there is only one stmt which is gcond. */ > gcond *gs; > if (!(gs = safe_dyn_cast (last_and_only_stmt (e->dest)))) > return false; > > /* See if gcond's cond is "(phi !=/== 0/1)" in the basic block. */ > tree cond = gimple_cond_lhs (gs); > enum tree_code code = gimple_cond_code (gs); > tree rhs = gimple_cond_rhs (gs); > if (TREE_CODE (cond) != SSA_NAME > || (code != NE_EXPR && code != EQ_EXPR) > || (!integer_onep (rhs) && !integer_zerop (rhs))) > return false; > gphi *phi = dyn_cast (SSA_NAME_DEF_STMT (cond)); > if (phi == NULL || gimple_bb (phi) != e->dest) > return false; > > /* Check if phi's incoming value is CMP. */ > gassign *def; > tree value = PHI_ARG_DEF_FROM_EDGE (phi, e); > if (TREE_CODE (value) != SSA_NAME > || !has_single_use (value) > || !(def = dyn_cast (SSA_NAME_DEF_STMT (value)))) > return false; > > /* Or if it is (INT) (a CMP b). */ > if (CONVERT_EXPR_CODE_P (gimple_assign_rhs_code (def))) > { > value = gimple_assign_rhs1 (def); > if (TREE_CODE (value) != SSA_NAME > || !has_single_use (value) > || !(def = dyn_cast (SSA_NAME_DEF_STMT (value)))) > return false; > } > > if (TREE_CODE_CLASS (gimple_assign_rhs_code (def)) != tcc_comparison) > return false; > > if (!single_succ_p (e->src)) > return false; > > return true; > } > > > In this code, I put "if (!single_succ_p (e->src))" there, which may be > helpful for reducing the copy block. Sounds good. My testing did show a regression on the sh port. For pr51244-20 on the SH port your changes make the ultimate resulting code better, but compromise the test. We can restore the shape of the CFG and get the testing coverage for sh_treg_combine by disabling VRP and DOM. Can you please include the attached patch in your next update? Jeff --------------6497B2ACE7B7C7A3859A86BB Content-Type: text/plain; charset=UTF-8; name="P" Content-Transfer-Encoding: base64 Content-Disposition: inline; filename="P" Content-length: 777 ZGlmZiAtLWdpdCBhL2djYy90ZXN0c3VpdGUvZ2NjLnRhcmdldC9zaC9wcjUx MjQ0LTIwLmMgYi9nY2MvdGVzdHN1aXRlL2djYy50YXJnZXQvc2gvcHI1MTI0 NC0yMC5jCmluZGV4IGMzNDIxNjMxNjBiLi5iZTI2NWNkMTZhZiAxMDA2NDQK LS0tIGEvZ2NjL3Rlc3RzdWl0ZS9nY2MudGFyZ2V0L3NoL3ByNTEyNDQtMjAu YworKysgYi9nY2MvdGVzdHN1aXRlL2djYy50YXJnZXQvc2gvcHI1MTI0NC0y MC5jCkBAIC0xLDcgKzEsNyBAQAogLyogQ2hlY2sgdGhhdCB0aGUgU0ggc3Bl Y2lmaWMgc2hfdHJlZ19jb21iaW5lIFJUTCBvcHRpbWl6YXRpb24gcGFzcyB3 b3JrcyBhcwogICAgZXhwZWN0ZWQuICAqLwogLyogeyBkZy1kbyBjb21waWxl IH0gICovCi0vKiB7IGRnLW9wdGlvbnMgIi1PMiIgfSAqLworLyogeyBkZy1v cHRpb25zICItTzIgLWZuby10cmVlLWRvbWluYXRvci1vcHRzIC1mbm8tdHJl ZS12cnAiIH0gKi8KIAogLyogeyBkZy1maW5hbCB7IHNjYW4tYXNzZW1ibGVy LW5vdCAibm90XHQiIH0gfSAqLwogLyogeyBkZy1maW5hbCB7IHNjYW4tYXNz ZW1ibGVyLXRpbWVzICJjbXAvZXEiIDIgfSB9ICovCg== --------------6497B2ACE7B7C7A3859A86BB--