From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mail-wm1-x32c.google.com (mail-wm1-x32c.google.com [IPv6:2a00:1450:4864:20::32c]) by sourceware.org (Postfix) with ESMTPS id 859053858D1E for ; Tue, 4 Oct 2022 06:36:18 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.1 sourceware.org 859053858D1E Authentication-Results: sourceware.org; dmarc=pass (p=none dis=none) header.from=linaro.org Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=linaro.org Received: by mail-wm1-x32c.google.com with SMTP id r187-20020a1c44c4000000b003bcde03bd44so345291wma.5 for ; Mon, 03 Oct 2022 23:36:18 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=list-id:mime-version:subject:message-id:cc:to:date:from:from:to:cc :subject:date; bh=J3frua+CXRU7P0szgYTX3gkmV6NQLXkY3gKvhlkcdcs=; b=iGolVRTSqUQ4w6K5MReztpr9bLbtlGK+YXuQHkzNxPLo/qeB4yrkBsKXdB7ojeSDRP yiEAy/5mf9l8H2OEp4zRasQYGTXhr1R9McTCcDTz/BAuFjMQ4Dq0a9qdj4hKA1M9bkkT V1jJ/Amq+qxr0Ez9gn44cSIrInTDLmkIa59ey8sEvO3L8C9vKk7UOJgTuFMCPNgtE+S9 KPhLidAyX7tGrzz6ZEHChlqrVkDLWcxBSph91ND0xWF1E+xUvsPZ1zEDTk6U0x/WIwlB SjG2uIAeAPcAnMKS/QR/9ZZuDhma+djhZlT69RKTs9lgqOD7vHUbgsOexJzgGUJ72dPn is9w== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=list-id:mime-version:subject:message-id:cc:to:date:from :x-gm-message-state:from:to:cc:subject:date; bh=J3frua+CXRU7P0szgYTX3gkmV6NQLXkY3gKvhlkcdcs=; b=1NJfZ9a0ahdb4I8mu8Qomu1nklamJ5ZehHOQ6LW3oJ2iOPn9T4rpqDc/2AZVuiQ2U1 4TLAeIHkcxDrGEn9hQI3fu95aWWAgHl06SJv//tIzJwjsW+ydm5Kz4n80aN3ZcrZPYOw IKarGOjrjDwRd7aVXtC6KQgpnutYviBp4j+HzDdprvZf8dPZIdn9GphcT7Lj5xuWDILq 896lHcrivZaz2HCcLhg+/vKbu0INwKLIFjvpfYUzgxzah3MR+MP7WLak5PBeLsZaarDW L/sjcQ5bo1Y5kU2PNbYeN6iFk7+g48JpO5qvKakRL0hJBrjA+4dzFS3Ksl0mfHCC+9gB HZ7A== X-Gm-Message-State: ACrzQf2ZHgJmHwzMuLZrdTNa7fE2exBHlW3tm6wCkDpq27ofEv19o5FI e50To/7q7OcjJDK0uCVSO4UrQA== X-Google-Smtp-Source: AMsMyM4a+fgHXqlmKQXOzYlYGfWfBl5nThKvYtL0hC8AbRPs5lidPF5UXeS8sXiiTdUuA/P6Ks8+ng== X-Received: by 2002:a1c:e90b:0:b0:3b4:fb6c:7654 with SMTP id q11-20020a1ce90b000000b003b4fb6c7654mr8949676wmc.98.1664865377043; Mon, 03 Oct 2022 23:36:17 -0700 (PDT) Received: from jenkins.jenkins (ci.linaro.org. [88.99.136.175]) by smtp.gmail.com with ESMTPSA id 12-20020a05600c230c00b003b492753826sm13289540wmo.43.2022.10.03.23.36.16 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 03 Oct 2022 23:36:16 -0700 (PDT) From: ci_notify@linaro.org X-Google-Original-From: linaro-infrastructure-errors@lists.linaro.org Date: Tue, 4 Oct 2022 06:36:15 +0000 (UTC) To: Jeff Law Cc: gcc-regression@gcc.gnu.org Message-ID: <1112270105.6320.1664865376432@jenkins.jenkins> Subject: [TCWG CI] Failure after basepoints/gcc-13-3004-g1214196da79: More gimple const/copy propagation opportunities MIME-Version: 1.0 Content-Type: multipart/mixed; boundary="----=_Part_6319_1431674154.1664865375607" X-Jenkins-Job: TCWG Build tcwg_gcc_bootstrap/master-arm-bootstrap_debug X-Jenkins-Result: SUCCESS List-ID: X-Spam-Status: No, score=-13.6 required=5.0 tests=BAYES_00,DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,GIT_PATCH_0,RCVD_IN_DNSWL_NONE,SPF_HELO_NONE,SPF_PASS,TXREP autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org ------=_Part_6319_1431674154.1664865375607 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: quoted-printable Failure after basepoints/gcc-13-3004-g1214196da79: More gimple const/copy p= ropagation opportunities: Results changed to -10 # true: 0 # build_abe binutils: 1 # build_abe bootstrap_debug: # FAILED # First few build errors in logs: # 00:01:45 ../../../../../../gcc/gcc/tree-ssa-dom.cc:689:13: error: =E2=80= =98dst=E2=80=99 was not declared in this scope; did you mean =E2=80=98dse= =E2=80=99? # 00:01:45 ../../../../../../gcc/gcc/tree-ssa-dom.cc:689:33: error: =E2=80= =98phi=E2=80=99 was not declared in this scope; did you mean =E2=80=98gphi= =E2=80=99? # 00:01:45 make[3]: *** [Makefile:1147: tree-ssa-dom.o] Error 1 # 00:03:00 make[2]: *** [Makefile:4937: all-stage1-gcc] Error 2 # 00:03:00 make[1]: *** [Makefile:25433: stage1-bubble] Error 2 # 00:03:00 make: *** [Makefile:1064: all] Error 2 from -10 # true: 0 # build_abe binutils: 1 # build_abe bootstrap_debug: 2 THIS IS THE END OF INTERESTING STUFF. BELOW ARE LINKS TO BUILDS, REPRODUCT= ION INSTRUCTIONS, AND THE RAW COMMIT. For latest status see comments in https://linaro.atlassian.net/browse/GNU-6= 92 . Status of basepoints/gcc-13-3004-g1214196da79 commit for tcwg_gcc_bootstrap= : commit 1214196da79aabbe5c14ed36e5a28012e141f04c Author: Jeff Law Date: Fri Sep 30 19:26:31 2022 -0400 More gimple const/copy propagation opportunities =20 While investigating a benchmark for optimization opportunities I came a= cross single block loop which either iterates precisely once or forever. = This is an interesting scenario as we can ignore the infinite looping path= and treat any PHI nodes as degenerates. So more concretely let's consider= this trivial testcase: =20 volatile void abort (void); =20 void foo(int a) { int b =3D 0; =20 while (1) { if (!a) break; b =3D 1; } =20 if (b !=3D 0) abort (); } =20 Quick analysis shows that b's initial value is 0 and its value only cha= nges if we enter an infinite loop. So if we get to the test b !=3D 0, the = only possible value b could have would be 0 and the test and its true arm c= an be eliminated. =20 The DOM3 dump looks something like this: =20 ;; basic block 2, loop depth 0, count 118111600 (estimated locally), = maybe hot ;; prev block 0, next block 3, flags: (NEW, VISITED) ;; pred: ENTRY [always] count:118111600 (estimated locally) (= FALLTHRU,EXECUTABLE) ;; succ: 3 [always] count:118111600 (estimated locally) (FALL= THRU,EXECUTABLE) =20 ;; basic block 3, loop depth 1, count 1073741824 (estimated locally),= maybe hot ;; prev block 2, next block 4, flags: (NEW, VISITED) ;; pred: 2 [always] count:118111600 (estimated locally) (FALL= THRU,EXECUTABLE) ;; 3 [89.0% (guessed)] count:955630224 (estimated local= ly) (FALSE_VALUE,EXECUTABLE) # b_1 =3D PHI <0(2), 1(3)> if (a_3(D) =3D=3D 0) goto ; [11.00%] else goto ; [89.00%] ;; succ: 4 [11.0% (guessed)] count:118111600 (estimated local= ly) (TRUE_VALUE,EXECUTABLE) ;; 3 [89.0% (guessed)] count:955630224 (estimated local= ly) (FALSE_VALUE,EXECUTABLE) =20 ;; basic block 4, loop depth 0, count 118111600 (estimated locally), = maybe hot ;; prev block 3, next block 5, flags: (NEW, VISITED) ;; pred: 3 [11.0% (guessed)] count:118111600 (estimated local= ly) (TRUE_VALUE,EXECUTABLE) if (b_1 !=3D 0) goto ; [0.00%] else goto ; [100.00%] ;; succ: 5 [never] count:0 (precise) (TRUE_VALUE,EXECUTABLE) ;; 6 [always] count:118111600 (estimated locally) (FALS= E_VALUE,EXECUTABLE) =20 This is a good representative of what the benchmark code looks like. =20 The primary effect we want to capture is to realize that the test if (b= _1 !=3D 0) is always false and optimize it accordingly. =20 In the benchmark, this opportunity is well hidden until after the loop = optimizers have completed, so the first chance to capture this case is in D= OM3. Furthermore, DOM wants loops normalized with latch blocks/edges. So = instead of bb3 looping back to itself, there's an intermediate empty block = during DOM. =20 I originally thought this was likely to only affect the benchmark. But= when I instrumented the optimization and bootstrapped GCC, much to my surp= rise there were several hundred similar cases identified in GCC itself. So= it's not as benchmark specific as I'd initially feared. =20 Anyway, detecting this in DOM is pretty simple. We detect the infinit= e loop, including the latch block. Once we've done that, we walk the PHI n= odes and attach equivalences to the appropriate outgoing edge. That's all= we need to do as the rest of DOM is already prepared to handle equivalence= s on edges. =20 gcc/ * tree-ssa-dom.cc (single_block_loop_p): New function. (record_edge_info): Also record equivalences for the outgoing edge of a single block loop where the condition is an invariant= . =20 gcc/testsuite/ * gcc.dg/infinite-loop.c: New test. * master-arm-bootstrap_debug ** Failure after basepoints/gcc-13-3004-g1214196da79: More gimple const/cop= y propagation opportunities: ** https://ci.linaro.org/job/tcwg_gcc_bootstrap-build-master-arm-bootstrap_= debug/475/ Bad build: https://ci.linaro.org/job/tcwg_gcc_bootstrap-build-master-arm-b= ootstrap_debug/475/artifact/artifacts Good build: https://ci.linaro.org/job/tcwg_gcc_bootstrap-build-master-arm-b= ootstrap_debug/474/artifact/artifacts Reproduce current build: mkdir -p investigate-gcc-1214196da79aabbe5c14ed36e5a28012e141f04c cd investigate-gcc-1214196da79aabbe5c14ed36e5a28012e141f04c # Fetch scripts git clone https://git.linaro.org/toolchain/jenkins-scripts # Fetch manifests for bad and good builds mkdir -p bad/artifacts good/artifacts curl -o bad/artifacts/manifest.sh https://ci.linaro.org/job/tcwg_gcc_bootst= rap-build-master-arm-bootstrap_debug/475/artifact/artifacts/manifest.sh --f= ail curl -o good/artifacts/manifest.sh https://ci.linaro.org/job/tcwg_gcc_boots= trap-build-master-arm-bootstrap_debug/474/artifact/artifacts/manifest.sh --= fail # Reproduce bad build (cd bad; ../jenkins-scripts/tcwg_gnu-build.sh ^^ true %%rr[top_artifacts] a= rtifacts) # Reproduce good build (cd good; ../jenkins-scripts/tcwg_gnu-build.sh ^^ true %%rr[top_artifacts] = artifacts) Full commit (up to 1000 lines): commit 1214196da79aabbe5c14ed36e5a28012e141f04c Author: Jeff Law Date: Fri Sep 30 19:26:31 2022 -0400 More gimple const/copy propagation opportunities =20 While investigating a benchmark for optimization opportunities I came a= cross single block loop which either iterates precisely once or forever. = This is an interesting scenario as we can ignore the infinite looping path= and treat any PHI nodes as degenerates. So more concretely let's consider= this trivial testcase: =20 volatile void abort (void); =20 void foo(int a) { int b =3D 0; =20 while (1) { if (!a) break; b =3D 1; } =20 if (b !=3D 0) abort (); } =20 Quick analysis shows that b's initial value is 0 and its value only cha= nges if we enter an infinite loop. So if we get to the test b !=3D 0, the = only possible value b could have would be 0 and the test and its true arm c= an be eliminated. =20 The DOM3 dump looks something like this: =20 ;; basic block 2, loop depth 0, count 118111600 (estimated locally), = maybe hot ;; prev block 0, next block 3, flags: (NEW, VISITED) ;; pred: ENTRY [always] count:118111600 (estimated locally) (= FALLTHRU,EXECUTABLE) ;; succ: 3 [always] count:118111600 (estimated locally) (FALL= THRU,EXECUTABLE) =20 ;; basic block 3, loop depth 1, count 1073741824 (estimated locally),= maybe hot ;; prev block 2, next block 4, flags: (NEW, VISITED) ;; pred: 2 [always] count:118111600 (estimated locally) (FALL= THRU,EXECUTABLE) ;; 3 [89.0% (guessed)] count:955630224 (estimated local= ly) (FALSE_VALUE,EXECUTABLE) # b_1 =3D PHI <0(2), 1(3)> if (a_3(D) =3D=3D 0) goto ; [11.00%] else goto ; [89.00%] ;; succ: 4 [11.0% (guessed)] count:118111600 (estimated local= ly) (TRUE_VALUE,EXECUTABLE) ;; 3 [89.0% (guessed)] count:955630224 (estimated local= ly) (FALSE_VALUE,EXECUTABLE) =20 ;; basic block 4, loop depth 0, count 118111600 (estimated locally), = maybe hot ;; prev block 3, next block 5, flags: (NEW, VISITED) ;; pred: 3 [11.0% (guessed)] count:118111600 (estimated local= ly) (TRUE_VALUE,EXECUTABLE) if (b_1 !=3D 0) goto ; [0.00%] else goto ; [100.00%] ;; succ: 5 [never] count:0 (precise) (TRUE_VALUE,EXECUTABLE) ;; 6 [always] count:118111600 (estimated locally) (FALS= E_VALUE,EXECUTABLE) =20 This is a good representative of what the benchmark code looks like. =20 The primary effect we want to capture is to realize that the test if (b= _1 !=3D 0) is always false and optimize it accordingly. =20 In the benchmark, this opportunity is well hidden until after the loop = optimizers have completed, so the first chance to capture this case is in D= OM3. Furthermore, DOM wants loops normalized with latch blocks/edges. So = instead of bb3 looping back to itself, there's an intermediate empty block = during DOM. =20 I originally thought this was likely to only affect the benchmark. But= when I instrumented the optimization and bootstrapped GCC, much to my surp= rise there were several hundred similar cases identified in GCC itself. So= it's not as benchmark specific as I'd initially feared. =20 Anyway, detecting this in DOM is pretty simple. We detect the infinit= e loop, including the latch block. Once we've done that, we walk the PHI n= odes and attach equivalences to the appropriate outgoing edge. That's all= we need to do as the rest of DOM is already prepared to handle equivalence= s on edges. =20 gcc/ * tree-ssa-dom.cc (single_block_loop_p): New function. (record_edge_info): Also record equivalences for the outgoing edge of a single block loop where the condition is an invariant= . =20 gcc/testsuite/ * gcc.dg/infinite-loop.c: New test. --- gcc/testsuite/gcc.dg/infinite-loop.c | 26 +++++++ gcc/tree-ssa-dom.cc | 133 +++++++++++++++++++++++++++++++= +++- 2 files changed, 157 insertions(+), 2 deletions(-) diff --git a/gcc/testsuite/gcc.dg/infinite-loop.c b/gcc/testsuite/gcc.dg/in= finite-loop.c new file mode 100644 index 00000000000..25037a2027e --- /dev/null +++ b/gcc/testsuite/gcc.dg/infinite-loop.c @@ -0,0 +1,26 @@ +/* { dg-do link } */ +/* { dg-options "-O2" } */ +void link_error (void); + +void __attribute__ ((noinline,noipa)) +foo(int a) +{ + int b =3D 0; + + while (1) + { + if (!a) + break; + b =3D 1; + } + + if (b !=3D 0) + link_error (); +} + +int +main() +{ + foo (0); +} + diff --git a/gcc/tree-ssa-dom.cc b/gcc/tree-ssa-dom.cc index fa43dbe6c44..8d8312ca350 100644 --- a/gcc/tree-ssa-dom.cc +++ b/gcc/tree-ssa-dom.cc @@ -426,6 +426,74 @@ free_all_edge_infos (void) } } =20 +/* Return TRUE if BB has precisely two preds, one of which + is a backedge from a forwarder block where the forwarder + block is a direct successor of BB. Being a forwarder + block, it has no side effects other than transfer of + control. Otherwise return FALSE. */ + +static bool +single_block_loop_p (basic_block bb) +{ + /* Two preds. */ + if (EDGE_COUNT (bb->preds) !=3D 2) + return false; + + /* One and only one of the edges must be marked with + EDGE_DFS_BACK. */ + basic_block pred =3D NULL; + unsigned int count =3D 0; + if (EDGE_PRED (bb, 0)->flags & EDGE_DFS_BACK) + { + pred =3D EDGE_PRED (bb, 0)->src; + count++; + } + if (EDGE_PRED (bb, 1)->flags & EDGE_DFS_BACK) + { + pred =3D EDGE_PRED (bb, 1)->src; + count++; + } + + if (count !=3D 1) + return false; + + /* Now examine PRED. It should have a single predecessor which + is BB and a single successor that is also BB. */ + if (EDGE_COUNT (pred->preds) !=3D 1 + || EDGE_COUNT (pred->succs) !=3D 1 + || EDGE_PRED (pred, 0)->src !=3D bb + || EDGE_SUCC (pred, 0)->dest !=3D bb) + return false; + + /* This looks good from a CFG standpoint. Now look at the guts + of PRED. Basically we want to verify there are no PHI nodes + and no real statements. */ + if (! gimple_seq_empty_p (phi_nodes (pred))) + return false; + + gimple_stmt_iterator gsi; + for (gsi =3D gsi_last_bb (pred); !gsi_end_p (gsi); gsi_prev (&gsi)) + { + gimple *stmt =3D gsi_stmt (gsi); + + switch (gimple_code (stmt)) +=09{ +=09 case GIMPLE_LABEL: +=09 if (DECL_NONLOCAL (gimple_label_label (as_a (stmt)))) +=09 return false; +=09 break; + +=09 case GIMPLE_DEBUG: +=09 break; + +=09 default: +=09 return false; +=09} + } + + return true; +} + /* We have finished optimizing BB, record any information implied by taking a specific outgoing edge from BB. */ =20 @@ -435,6 +503,13 @@ record_edge_info (basic_block bb) gimple_stmt_iterator gsi =3D gsi_last_bb (bb); class edge_info *edge_info; =20 + /* Free all the outgoing edge info data associated with + BB's outgoing edges. */ + edge e; + edge_iterator ei; + FOR_EACH_EDGE (e, ei, bb->succs) + free_dom_edge_info (e); + if (! gsi_end_p (gsi)) { gimple *stmt =3D gsi_stmt (gsi); @@ -450,8 +525,6 @@ record_edge_info (basic_block bb) =09 int i; int n_labels =3D gimple_switch_num_labels (switch_stmt); =09 tree *info =3D XCNEWVEC (tree, last_basic_block_for_fn (cfun)); -=09 edge e; -=09 edge_iterator ei; =20 =09 for (i =3D 0; i < n_labels; i++) =09=09{ @@ -583,6 +656,62 @@ record_edge_info (basic_block bb) if (can_infer_simple_equiv && TREE_CODE (inverted) =3D=3D EQ= _EXPR) =09=09edge_info->record_simple_equiv (op0, op1); } + +=09 /* If this block is a single block loop, then we may be able to +=09 record some equivalences on the loop's exit edge. */ +=09 if (single_block_loop_p (bb)) +=09 { +=09 /* We know it's a single block loop. Now look at the loop +=09=09 exit condition. What we're looking for is whether or not +=09=09 the exit condition is loop invariant which we can detect +=09=09 by checking if all the SSA_NAMEs referenced are defined +=09=09 outside the loop. */ +=09 if ((TREE_CODE (op0) !=3D SSA_NAME +=09=09 || gimple_bb (SSA_NAME_DEF_STMT (op0)) !=3D bb) +=09=09 && (TREE_CODE (op1) !=3D SSA_NAME +=09=09 || gimple_bb (SSA_NAME_DEF_STMT (op1)) !=3D bb)) +=09=09{ +=09=09 /* At this point we know the exit condition is loop +=09=09 invariant. The only way to get out of the loop is +=09=09 if never traverses the backedge to begin with. This +=09=09 implies that any PHI nodes create equivalances we can +=09=09 attach to the loop exit edge. */ +=09=09 int alternative +=09=09 =3D (EDGE_PRED (bb, 0)->flags & EDGE_DFS_BACK) ? 1 : 0; + +=09=09 gphi_iterator gsi; +=09=09 for (gsi =3D gsi_start_phis (bb); +=09=09 !gsi_end_p (gsi); +=09=09 gsi_next (&gsi)) +=09=09 { +=09=09 /* If the other alternative is the same as the result, +=09=09=09 then this is a degenerate and can be ignored. */ +=09=09 if (dst =3D=3D PHI_ARG_DEF (phi, !alternative)) +=09=09=09continue; + +=09=09 /* Now get the EDGE_INFO class so we can append +=09=09=09 it to our list. We want the successor edge +=09=09=09 where the destination is not the source of +=09=09=09 an incoming edge. */ +=09=09 gphi *phi =3D gsi.phi (); +=09=09 tree src =3D PHI_ARG_DEF (phi, alternative); +=09=09 tree dst =3D PHI_RESULT (phi); + +=09=09 if (EDGE_SUCC (bb, 0)->dest +=09=09=09 !=3D EDGE_PRED (bb, !alternative)->src) +=09=09=09edge_info =3D (class edge_info *)EDGE_SUCC (bb, 0)->aux; +=09=09 else +=09=09=09edge_info =3D (class edge_info *)EDGE_SUCC (bb, 1)->aux; + +=09=09 /* Note that since this processing is done independently +=09=09=09 of other edge equivalency processing, we may not +=09=09=09 have an EDGE_INFO structure set up yet. */ +=09=09 if (edge_info =3D=3D NULL) +=09=09=09edge_info =3D new class edge_info (false_edge); +=09=09 edge_info->record_simple_equiv (dst, src); +=09=09 } +=09=09} +=09 } } } } ------=_Part_6319_1431674154.1664865375607--