From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mail-wr1-x431.google.com (mail-wr1-x431.google.com [IPv6:2a00:1450:4864:20::431]) by sourceware.org (Postfix) with ESMTPS id 881AB3858C55 for ; Sat, 1 Oct 2022 04:48:21 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.1 sourceware.org 881AB3858C55 Authentication-Results: sourceware.org; dmarc=pass (p=none dis=none) header.from=linaro.org Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=linaro.org Received: by mail-wr1-x431.google.com with SMTP id b4so2611455wrs.1 for ; Fri, 30 Sep 2022 21:48:21 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=list-id:mime-version:subject:message-id:cc:to:date:from:from:to:cc :subject:date; bh=N9QqQBdHoOHmUZhpsmqh5AnOoH2Z9Oo1QdoVSPp8OrU=; b=nVabhyUCfKztfqdk+Grw4W1GAydRVl4u11kGqRw5ZUbvSql7zvHa7Ym7lNAs/onS6A jxBRKeEOVLqUE1hvCTct1XXco28oeEz998quS6AkLU2+KkyPNV4kXRV3iz10u4XLWwun knPGb7pELs0vI5GKW7jMxH1eYakhoE0FXBQNdWhX8xJjeEM/gQMR3Ya++SllPltNim79 bTaLjYy+BGqdgtyj+8kBXoQyMG2Su464Zc+0LRloR/SbRmZZatz7DnI4XWyTCKX56Cs/ 04BY8EtkQUdatAfbQpbkxn6vqhLhR1a0SvO4RVcy79fTRfyWS+5gODcfVY5bHtcxRHVu 8t7w== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=list-id:mime-version:subject:message-id:cc:to:date:from :x-gm-message-state:from:to:cc:subject:date; bh=N9QqQBdHoOHmUZhpsmqh5AnOoH2Z9Oo1QdoVSPp8OrU=; b=nrfJ05fzsLyAVLsWWUIgwRnlaRQXPd/mA8LpKvEYLizbGbCEsnNtyipwczaxTjhwTe SeEUlOKeeQG9GVrU5O5Y0CeuNSQ2nsgQUH2GjDLJhVB1bZhnYYk1KNTzR2dYQuit8fcX 17Z7qMoyCLHOic7ZxIkIODkKpMcWzEbVa32QRVLB5SfYGBTyN6yeRnRsylW1fDNNNqxT JgYueRNS0qjyVE6dZrMMHfiFkQIUcAdQPfv69bWAD/JMdwkT/BSzvUj7ExWojbG7SUku /L+ijfZHSOtZI7Bi+DATwkKgWHdUMazqDuIWY5CmMRLKn/5350Ad1xJFqTz0PUL9iXB6 RbTg== X-Gm-Message-State: ACrzQf0XPL1MvfIimdL3cCjqNUPykvdiCYw7o3WPFNEH6qiLPJHG6Knr zaI4cQFoFdgjTPbUx5wVudF5Vw== X-Google-Smtp-Source: AMsMyM6B5737HgJn+TwFFIKu30S2AtFq9TaszYtr5hI5LUJgU8Z+xHuWw7Atyc46yveeVGX9ZutPuA== X-Received: by 2002:a5d:6d8c:0:b0:22a:e993:c37c with SMTP id l12-20020a5d6d8c000000b0022ae993c37cmr7594299wrs.592.1664599700168; Fri, 30 Sep 2022 21:48:20 -0700 (PDT) Received: from jenkins.jenkins (ci.linaro.org. [88.99.136.175]) by smtp.gmail.com with ESMTPSA id x6-20020adfdcc6000000b0022add371ed2sm3950321wrm.55.2022.09.30.21.48.19 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Fri, 30 Sep 2022 21:48:19 -0700 (PDT) From: ci_notify@linaro.org X-Google-Original-From: linaro-infrastructure-errors@lists.linaro.org Date: Sat, 1 Oct 2022 04:48:19 +0000 (UTC) To: Jeff Law Cc: gcc-regression@gcc.gnu.org Message-ID: <551270414.5142.1664599699725@jenkins.jenkins> Subject: [TCWG CI] Failure after basepoints/gcc-13-3004-g1214196da79: More gimple const/copy propagation opportunities MIME-Version: 1.0 Content-Type: multipart/mixed; boundary="----=_Part_5141_359364139.1664599699166" X-Jenkins-Job: TCWG Build tcwg_gnu_cross_build/master-aarch64 X-Jenkins-Result: SUCCESS List-ID: X-Spam-Status: No, score=-13.6 required=5.0 tests=BAYES_00,DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,GIT_PATCH_0,RCVD_IN_DNSWL_NONE,SPF_HELO_NONE,SPF_PASS,TXREP autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org ------=_Part_5141_359364139.1664599699166 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: quoted-printable Failure after basepoints/gcc-13-3004-g1214196da79: More gimple const/copy p= ropagation opportunities: Results changed to -10 # true: 0 # build_abe binutils: 1 # build_abe stage1: # FAILED # First few build errors in logs: # 00:02:18 ../../../../../../gcc/gcc/tree-ssa-dom.cc:689:13: error: =E2=80= =98dst=E2=80=99 was not declared in this scope; did you mean =E2=80=98dse= =E2=80=99? # 00:02:18 ../../../../../../gcc/gcc/tree-ssa-dom.cc:689:33: error: =E2=80= =98phi=E2=80=99 was not declared in this scope; did you mean =E2=80=98gphi= =E2=80=99? # 00:02:18 make[2]: *** [Makefile:1147: tree-ssa-dom.o] Error 1 # 00:05:06 make[1]: *** [Makefile:4565: all-gcc] Error 2 # 00:05:06 make: *** [Makefile:1004: all] Error 2 from -10 # true: 0 # build_abe binutils: 1 # build_abe stage1: 2 # build_abe linux: 3 # build_abe glibc: 4 # build_abe stage2: 5 # build_abe gdb: 6 # build_abe qemu: 7 THIS IS THE END OF INTERESTING STUFF. BELOW ARE LINKS TO BUILDS, REPRODUCT= ION INSTRUCTIONS, AND THE RAW COMMIT. For latest status see comments in https://linaro.atlassian.net/browse/GNU-6= 92 . Status of basepoints/gcc-13-3004-g1214196da79 commit for tcwg_gnu_cross_bui= ld: commit 1214196da79aabbe5c14ed36e5a28012e141f04c Author: Jeff Law Date: Fri Sep 30 19:26:31 2022 -0400 More gimple const/copy propagation opportunities =20 While investigating a benchmark for optimization opportunities I came a= cross single block loop which either iterates precisely once or forever. = This is an interesting scenario as we can ignore the infinite looping path= and treat any PHI nodes as degenerates. So more concretely let's consider= this trivial testcase: =20 volatile void abort (void); =20 void foo(int a) { int b =3D 0; =20 while (1) { if (!a) break; b =3D 1; } =20 if (b !=3D 0) abort (); } =20 Quick analysis shows that b's initial value is 0 and its value only cha= nges if we enter an infinite loop. So if we get to the test b !=3D 0, the = only possible value b could have would be 0 and the test and its true arm c= an be eliminated. =20 The DOM3 dump looks something like this: =20 ;; basic block 2, loop depth 0, count 118111600 (estimated locally), = maybe hot ;; prev block 0, next block 3, flags: (NEW, VISITED) ;; pred: ENTRY [always] count:118111600 (estimated locally) (= FALLTHRU,EXECUTABLE) ;; succ: 3 [always] count:118111600 (estimated locally) (FALL= THRU,EXECUTABLE) =20 ;; basic block 3, loop depth 1, count 1073741824 (estimated locally),= maybe hot ;; prev block 2, next block 4, flags: (NEW, VISITED) ;; pred: 2 [always] count:118111600 (estimated locally) (FALL= THRU,EXECUTABLE) ;; 3 [89.0% (guessed)] count:955630224 (estimated local= ly) (FALSE_VALUE,EXECUTABLE) # b_1 =3D PHI <0(2), 1(3)> if (a_3(D) =3D=3D 0) goto ; [11.00%] else goto ; [89.00%] ;; succ: 4 [11.0% (guessed)] count:118111600 (estimated local= ly) (TRUE_VALUE,EXECUTABLE) ;; 3 [89.0% (guessed)] count:955630224 (estimated local= ly) (FALSE_VALUE,EXECUTABLE) =20 ;; basic block 4, loop depth 0, count 118111600 (estimated locally), = maybe hot ;; prev block 3, next block 5, flags: (NEW, VISITED) ;; pred: 3 [11.0% (guessed)] count:118111600 (estimated local= ly) (TRUE_VALUE,EXECUTABLE) if (b_1 !=3D 0) goto ; [0.00%] else goto ; [100.00%] ;; succ: 5 [never] count:0 (precise) (TRUE_VALUE,EXECUTABLE) ;; 6 [always] count:118111600 (estimated locally) (FALS= E_VALUE,EXECUTABLE) =20 This is a good representative of what the benchmark code looks like. =20 The primary effect we want to capture is to realize that the test if (b= _1 !=3D 0) is always false and optimize it accordingly. =20 In the benchmark, this opportunity is well hidden until after the loop = optimizers have completed, so the first chance to capture this case is in D= OM3. Furthermore, DOM wants loops normalized with latch blocks/edges. So = instead of bb3 looping back to itself, there's an intermediate empty block = during DOM. =20 I originally thought this was likely to only affect the benchmark. But= when I instrumented the optimization and bootstrapped GCC, much to my surp= rise there were several hundred similar cases identified in GCC itself. So= it's not as benchmark specific as I'd initially feared. =20 Anyway, detecting this in DOM is pretty simple. We detect the infinit= e loop, including the latch block. Once we've done that, we walk the PHI n= odes and attach equivalences to the appropriate outgoing edge. That's all= we need to do as the rest of DOM is already prepared to handle equivalence= s on edges. =20 gcc/ * tree-ssa-dom.cc (single_block_loop_p): New function. (record_edge_info): Also record equivalences for the outgoing edge of a single block loop where the condition is an invariant= . =20 gcc/testsuite/ * gcc.dg/infinite-loop.c: New test. * master-aarch64 ** Failure after basepoints/gcc-13-3004-g1214196da79: More gimple const/cop= y propagation opportunities: ** https://ci.linaro.org/job/tcwg_gnu_cross_build-build-master-aarch64/1606= / Bad build: https://ci.linaro.org/job/tcwg_gnu_cross_build-build-master-aar= ch64/1606/artifact/artifacts Good build: https://ci.linaro.org/job/tcwg_gnu_cross_build-build-master-aar= ch64/1605/artifact/artifacts Reproduce current build: mkdir -p investigate-gcc-1214196da79aabbe5c14ed36e5a28012e141f04c cd investigate-gcc-1214196da79aabbe5c14ed36e5a28012e141f04c # Fetch scripts git clone https://git.linaro.org/toolchain/jenkins-scripts # Fetch manifests for bad and good builds mkdir -p bad/artifacts good/artifacts curl -o bad/artifacts/manifest.sh https://ci.linaro.org/job/tcwg_gnu_cross_= build-build-master-aarch64/1606/artifact/artifacts/manifest.sh --fail curl -o good/artifacts/manifest.sh https://ci.linaro.org/job/tcwg_gnu_cross= _build-build-master-aarch64/1605/artifact/artifacts/manifest.sh --fail # Reproduce bad build (cd bad; ../jenkins-scripts/tcwg_gnu-build.sh ^^ true %%rr[top_artifacts] a= rtifacts) # Reproduce good build (cd good; ../jenkins-scripts/tcwg_gnu-build.sh ^^ true %%rr[top_artifacts] = artifacts) Full commit (up to 1000 lines): commit 1214196da79aabbe5c14ed36e5a28012e141f04c Author: Jeff Law Date: Fri Sep 30 19:26:31 2022 -0400 More gimple const/copy propagation opportunities =20 While investigating a benchmark for optimization opportunities I came a= cross single block loop which either iterates precisely once or forever. = This is an interesting scenario as we can ignore the infinite looping path= and treat any PHI nodes as degenerates. So more concretely let's consider= this trivial testcase: =20 volatile void abort (void); =20 void foo(int a) { int b =3D 0; =20 while (1) { if (!a) break; b =3D 1; } =20 if (b !=3D 0) abort (); } =20 Quick analysis shows that b's initial value is 0 and its value only cha= nges if we enter an infinite loop. So if we get to the test b !=3D 0, the = only possible value b could have would be 0 and the test and its true arm c= an be eliminated. =20 The DOM3 dump looks something like this: =20 ;; basic block 2, loop depth 0, count 118111600 (estimated locally), = maybe hot ;; prev block 0, next block 3, flags: (NEW, VISITED) ;; pred: ENTRY [always] count:118111600 (estimated locally) (= FALLTHRU,EXECUTABLE) ;; succ: 3 [always] count:118111600 (estimated locally) (FALL= THRU,EXECUTABLE) =20 ;; basic block 3, loop depth 1, count 1073741824 (estimated locally),= maybe hot ;; prev block 2, next block 4, flags: (NEW, VISITED) ;; pred: 2 [always] count:118111600 (estimated locally) (FALL= THRU,EXECUTABLE) ;; 3 [89.0% (guessed)] count:955630224 (estimated local= ly) (FALSE_VALUE,EXECUTABLE) # b_1 =3D PHI <0(2), 1(3)> if (a_3(D) =3D=3D 0) goto ; [11.00%] else goto ; [89.00%] ;; succ: 4 [11.0% (guessed)] count:118111600 (estimated local= ly) (TRUE_VALUE,EXECUTABLE) ;; 3 [89.0% (guessed)] count:955630224 (estimated local= ly) (FALSE_VALUE,EXECUTABLE) =20 ;; basic block 4, loop depth 0, count 118111600 (estimated locally), = maybe hot ;; prev block 3, next block 5, flags: (NEW, VISITED) ;; pred: 3 [11.0% (guessed)] count:118111600 (estimated local= ly) (TRUE_VALUE,EXECUTABLE) if (b_1 !=3D 0) goto ; [0.00%] else goto ; [100.00%] ;; succ: 5 [never] count:0 (precise) (TRUE_VALUE,EXECUTABLE) ;; 6 [always] count:118111600 (estimated locally) (FALS= E_VALUE,EXECUTABLE) =20 This is a good representative of what the benchmark code looks like. =20 The primary effect we want to capture is to realize that the test if (b= _1 !=3D 0) is always false and optimize it accordingly. =20 In the benchmark, this opportunity is well hidden until after the loop = optimizers have completed, so the first chance to capture this case is in D= OM3. Furthermore, DOM wants loops normalized with latch blocks/edges. So = instead of bb3 looping back to itself, there's an intermediate empty block = during DOM. =20 I originally thought this was likely to only affect the benchmark. But= when I instrumented the optimization and bootstrapped GCC, much to my surp= rise there were several hundred similar cases identified in GCC itself. So= it's not as benchmark specific as I'd initially feared. =20 Anyway, detecting this in DOM is pretty simple. We detect the infinit= e loop, including the latch block. Once we've done that, we walk the PHI n= odes and attach equivalences to the appropriate outgoing edge. That's all= we need to do as the rest of DOM is already prepared to handle equivalence= s on edges. =20 gcc/ * tree-ssa-dom.cc (single_block_loop_p): New function. (record_edge_info): Also record equivalences for the outgoing edge of a single block loop where the condition is an invariant= . =20 gcc/testsuite/ * gcc.dg/infinite-loop.c: New test. --- gcc/testsuite/gcc.dg/infinite-loop.c | 26 +++++++ gcc/tree-ssa-dom.cc | 133 +++++++++++++++++++++++++++++++= +++- 2 files changed, 157 insertions(+), 2 deletions(-) diff --git a/gcc/testsuite/gcc.dg/infinite-loop.c b/gcc/testsuite/gcc.dg/in= finite-loop.c new file mode 100644 index 00000000000..25037a2027e --- /dev/null +++ b/gcc/testsuite/gcc.dg/infinite-loop.c @@ -0,0 +1,26 @@ +/* { dg-do link } */ +/* { dg-options "-O2" } */ +void link_error (void); + +void __attribute__ ((noinline,noipa)) +foo(int a) +{ + int b =3D 0; + + while (1) + { + if (!a) + break; + b =3D 1; + } + + if (b !=3D 0) + link_error (); +} + +int +main() +{ + foo (0); +} + diff --git a/gcc/tree-ssa-dom.cc b/gcc/tree-ssa-dom.cc index fa43dbe6c44..8d8312ca350 100644 --- a/gcc/tree-ssa-dom.cc +++ b/gcc/tree-ssa-dom.cc @@ -426,6 +426,74 @@ free_all_edge_infos (void) } } =20 +/* Return TRUE if BB has precisely two preds, one of which + is a backedge from a forwarder block where the forwarder + block is a direct successor of BB. Being a forwarder + block, it has no side effects other than transfer of + control. Otherwise return FALSE. */ + +static bool +single_block_loop_p (basic_block bb) +{ + /* Two preds. */ + if (EDGE_COUNT (bb->preds) !=3D 2) + return false; + + /* One and only one of the edges must be marked with + EDGE_DFS_BACK. */ + basic_block pred =3D NULL; + unsigned int count =3D 0; + if (EDGE_PRED (bb, 0)->flags & EDGE_DFS_BACK) + { + pred =3D EDGE_PRED (bb, 0)->src; + count++; + } + if (EDGE_PRED (bb, 1)->flags & EDGE_DFS_BACK) + { + pred =3D EDGE_PRED (bb, 1)->src; + count++; + } + + if (count !=3D 1) + return false; + + /* Now examine PRED. It should have a single predecessor which + is BB and a single successor that is also BB. */ + if (EDGE_COUNT (pred->preds) !=3D 1 + || EDGE_COUNT (pred->succs) !=3D 1 + || EDGE_PRED (pred, 0)->src !=3D bb + || EDGE_SUCC (pred, 0)->dest !=3D bb) + return false; + + /* This looks good from a CFG standpoint. Now look at the guts + of PRED. Basically we want to verify there are no PHI nodes + and no real statements. */ + if (! gimple_seq_empty_p (phi_nodes (pred))) + return false; + + gimple_stmt_iterator gsi; + for (gsi =3D gsi_last_bb (pred); !gsi_end_p (gsi); gsi_prev (&gsi)) + { + gimple *stmt =3D gsi_stmt (gsi); + + switch (gimple_code (stmt)) +=09{ +=09 case GIMPLE_LABEL: +=09 if (DECL_NONLOCAL (gimple_label_label (as_a (stmt)))) +=09 return false; +=09 break; + +=09 case GIMPLE_DEBUG: +=09 break; + +=09 default: +=09 return false; +=09} + } + + return true; +} + /* We have finished optimizing BB, record any information implied by taking a specific outgoing edge from BB. */ =20 @@ -435,6 +503,13 @@ record_edge_info (basic_block bb) gimple_stmt_iterator gsi =3D gsi_last_bb (bb); class edge_info *edge_info; =20 + /* Free all the outgoing edge info data associated with + BB's outgoing edges. */ + edge e; + edge_iterator ei; + FOR_EACH_EDGE (e, ei, bb->succs) + free_dom_edge_info (e); + if (! gsi_end_p (gsi)) { gimple *stmt =3D gsi_stmt (gsi); @@ -450,8 +525,6 @@ record_edge_info (basic_block bb) =09 int i; int n_labels =3D gimple_switch_num_labels (switch_stmt); =09 tree *info =3D XCNEWVEC (tree, last_basic_block_for_fn (cfun)); -=09 edge e; -=09 edge_iterator ei; =20 =09 for (i =3D 0; i < n_labels; i++) =09=09{ @@ -583,6 +656,62 @@ record_edge_info (basic_block bb) if (can_infer_simple_equiv && TREE_CODE (inverted) =3D=3D EQ= _EXPR) =09=09edge_info->record_simple_equiv (op0, op1); } + +=09 /* If this block is a single block loop, then we may be able to +=09 record some equivalences on the loop's exit edge. */ +=09 if (single_block_loop_p (bb)) +=09 { +=09 /* We know it's a single block loop. Now look at the loop +=09=09 exit condition. What we're looking for is whether or not +=09=09 the exit condition is loop invariant which we can detect +=09=09 by checking if all the SSA_NAMEs referenced are defined +=09=09 outside the loop. */ +=09 if ((TREE_CODE (op0) !=3D SSA_NAME +=09=09 || gimple_bb (SSA_NAME_DEF_STMT (op0)) !=3D bb) +=09=09 && (TREE_CODE (op1) !=3D SSA_NAME +=09=09 || gimple_bb (SSA_NAME_DEF_STMT (op1)) !=3D bb)) +=09=09{ +=09=09 /* At this point we know the exit condition is loop +=09=09 invariant. The only way to get out of the loop is +=09=09 if never traverses the backedge to begin with. This +=09=09 implies that any PHI nodes create equivalances we can +=09=09 attach to the loop exit edge. */ +=09=09 int alternative +=09=09 =3D (EDGE_PRED (bb, 0)->flags & EDGE_DFS_BACK) ? 1 : 0; + +=09=09 gphi_iterator gsi; +=09=09 for (gsi =3D gsi_start_phis (bb); +=09=09 !gsi_end_p (gsi); +=09=09 gsi_next (&gsi)) +=09=09 { +=09=09 /* If the other alternative is the same as the result, +=09=09=09 then this is a degenerate and can be ignored. */ +=09=09 if (dst =3D=3D PHI_ARG_DEF (phi, !alternative)) +=09=09=09continue; + +=09=09 /* Now get the EDGE_INFO class so we can append +=09=09=09 it to our list. We want the successor edge +=09=09=09 where the destination is not the source of +=09=09=09 an incoming edge. */ +=09=09 gphi *phi =3D gsi.phi (); +=09=09 tree src =3D PHI_ARG_DEF (phi, alternative); +=09=09 tree dst =3D PHI_RESULT (phi); + +=09=09 if (EDGE_SUCC (bb, 0)->dest +=09=09=09 !=3D EDGE_PRED (bb, !alternative)->src) +=09=09=09edge_info =3D (class edge_info *)EDGE_SUCC (bb, 0)->aux; +=09=09 else +=09=09=09edge_info =3D (class edge_info *)EDGE_SUCC (bb, 1)->aux; + +=09=09 /* Note that since this processing is done independently +=09=09=09 of other edge equivalency processing, we may not +=09=09=09 have an EDGE_INFO structure set up yet. */ +=09=09 if (edge_info =3D=3D NULL) +=09=09=09edge_info =3D new class edge_info (false_edge); +=09=09 edge_info->record_simple_equiv (dst, src); +=09=09 } +=09=09} +=09 } } } } ------=_Part_5141_359364139.1664599699166--