From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mail-lf1-x134.google.com (mail-lf1-x134.google.com [IPv6:2a00:1450:4864:20::134]) by sourceware.org (Postfix) with ESMTPS id 541623858D35 for ; Mon, 22 May 2023 12:56:53 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org 541623858D35 Authentication-Results: sourceware.org; dmarc=pass (p=none dis=none) header.from=gmail.com Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=gmail.com Received: by mail-lf1-x134.google.com with SMTP id 2adb3069b0e04-4f3edc05aa5so1722736e87.3 for ; Mon, 22 May 2023 05:56:53 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20221208; t=1684760212; x=1687352212; h=content-transfer-encoding:cc:to:subject:message-id:date:from :in-reply-to:references:mime-version:from:to:cc:subject:date :message-id:reply-to; bh=sRMsf2nSE4dpSvJAmM77tDREgZQ0kbeY79L9EB2Bzgg=; b=RkL8gBA8zrnT064bjzQO5eUtkRHy5PPk3+ruzQ+T+4/t9LtBPaWJ3s2dp7IPRJAXAP Sgt3vpVCdMnByLXCh7Wg6/3DiTNb2+4jKvpoV4dT5fJcmbrtveHxrxzN2kdmoYD1vUjg m/VX6NpUXccw5ZkqyUmrbi515aA8rm2n55bAd17TehM4TJ+7iSCXy4ve76nn3L30n4Iy w8AWZ957VjDnK/4lDwC2J2ndxnHwckhWWIowL8sSM7X9ctDR2CM2v2xJaUOvPF/1fDoX LkiZssT7cNL1X0zMNSxTc9D8fKYfywiQ+LrpbpY8hLEywwnQKPhLAp0ALzRdJwky7fob vTyA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20221208; t=1684760212; x=1687352212; h=content-transfer-encoding:cc:to:subject:message-id:date:from :in-reply-to:references:mime-version:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=sRMsf2nSE4dpSvJAmM77tDREgZQ0kbeY79L9EB2Bzgg=; b=F10iegul/9dPe4HS+M8AvF4IWvNkRa7q/ybjVc32RXqOWpyS5gR14ZKNTxq17KYT+p TUbj5IbXWxrgidtp1z4P7JYEhofuQflLw1tNe+JonItBkpwFDaHFuwMSky6BM1XuXj3o PcD1bDS7l4EkAV86FgL/WL2gGCC/m16lN7fELVhZPpKl2clDjRWnVhSPGerWHP/4xwPC g503+ULlquZTeMwvFJNKO/dIsEAQbhwFDeb6ThpN7jyjKdgjQ/k5+XEiXQ4FBTeqhUHe aLRCqqadNvY9iwJ4UByRallUNWMJMLy0SF1Ac6QBINoSDug1QKtbf8CfNvwyWN3F9hOS 35XA== X-Gm-Message-State: AC+VfDyNmP+JTMgPbn93+PhN8MpzFhIbcLMvA9ci19LtJONintV4DbQs kuGqy21zEWs6yWrIsbvMcjVqwYz0czBiMXQzsns= X-Google-Smtp-Source: ACHHUZ7zWgYDTxgxGCvUnbBS23h6vG4gyLqnwVVyVOiTM5pWDtkkb/xjPaJncHkXqjxwTjsPy36rTmAbqU5cyQ2t1EQ= X-Received: by 2002:a19:ac04:0:b0:4f4:b592:74ab with SMTP id g4-20020a19ac04000000b004f4b59274abmr312797lfc.62.1684760211350; Mon, 22 May 2023 05:56:51 -0700 (PDT) MIME-Version: 1.0 References: In-Reply-To: From: Richard Biener Date: Mon, 22 May 2023 14:56:39 +0200 Message-ID: Subject: Re: [PATCH v1] tree-ssa-sink: Improve code sinking pass. To: Ajit Agarwal Cc: gcc-patches , Segher Boessenkool , Peter Bergner , Jeff Law Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable X-Spam-Status: No, score=-7.2 required=5.0 tests=BAYES_00,DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,FREEMAIL_FROM,GIT_PATCH_0,RCVD_IN_DNSWL_NONE,SPF_HELO_NONE,SPF_PASS,TXREP,T_SCC_BODY_TEXT_LINE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org List-Id: On Thu, May 18, 2023 at 9:14=E2=80=AFAM Ajit Agarwal wrote: > > Hello All: > > This patch improves code sinking pass to sink statements before call to r= educe > register pressure. > Review comments are incorporated. > > Bootstrapped and regtested on powerpc64-linux-gnu. > > Thanks & Regards > Ajit > > > tree-ssa-sink: Improve code sinking pass. > > Code Sinking sinks the blocks after call. This increases > register pressure for callee-saved registers. Improves > code sinking before call in the use blocks or immediate > dominator of use blocks. > > 2023-05-18 Ajit Kumar Agarwal > > gcc/ChangeLog: > > * tree-ssa-sink.cc (statement_sink_location): Modifed to > move statements before calls. > (block_call_p): New function. > (def_use_same_block): New function. > (select_best_block): Add heuristics to select the best > blocks in the immediate post dominator. > > gcc/testsuite/ChangeLog: > > * gcc.dg/tree-ssa/ssa-sink-20.c: New testcase. > * gcc.dg/tree-ssa/ssa-sink-21.c: New testcase. > --- > gcc/testsuite/gcc.dg/tree-ssa/ssa-sink-20.c | 16 ++ > gcc/testsuite/gcc.dg/tree-ssa/ssa-sink-21.c | 20 +++ > gcc/tree-ssa-sink.cc | 159 ++++++++++++++++++-- > 3 files changed, 185 insertions(+), 10 deletions(-) > create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/ssa-sink-20.c > create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/ssa-sink-21.c > > diff --git a/gcc/testsuite/gcc.dg/tree-ssa/ssa-sink-20.c b/gcc/testsuite/= gcc.dg/tree-ssa/ssa-sink-20.c > new file mode 100644 > index 00000000000..716bc1f9257 > --- /dev/null > +++ b/gcc/testsuite/gcc.dg/tree-ssa/ssa-sink-20.c > @@ -0,0 +1,16 @@ > +/* { dg-do compile } */ > +/* { dg-options "-O2 -fdump-tree-sink -fdump-tree-optimized -fdump-tree-= sink-stats" } */ > + > +void bar(); > +int j; > +void foo(int a, int b, int c, int d, int e, int f) > +{ > + int l; > + l =3D a + b + c + d +e + f; > + if (a !=3D 5) > + { > + bar(); > + j =3D l; > + } > +} > +/* { dg-final { scan-tree-dump-times "Sunk statements: 5" 1 "sink" } } *= / this doesn't verify the place we sink to? > diff --git a/gcc/testsuite/gcc.dg/tree-ssa/ssa-sink-21.c b/gcc/testsuite/= gcc.dg/tree-ssa/ssa-sink-21.c > new file mode 100644 > index 00000000000..ff41e2ea8ae > --- /dev/null > +++ b/gcc/testsuite/gcc.dg/tree-ssa/ssa-sink-21.c > @@ -0,0 +1,20 @@ > +/* { dg-do compile } */ > +/* { dg-options "-O2 -fdump-tree-sink-stats -fdump-tree-sink-stats" } */ > + > +void bar(); > +int j, x; > +void foo(int a, int b, int c, int d, int e, int f) > +{ > + int l; > + l =3D a + b + c + d +e + f; > + if (a !=3D 5) > + { > + bar(); > + if (b !=3D 3) > + x =3D 3; > + else > + x =3D 5; > + j =3D l; > + } > +} > +/* { dg-final { scan-tree-dump-times "Sunk statements: 5" 1 "sink" } } *= / likewise. So both tests already pass before the patch? > diff --git a/gcc/tree-ssa-sink.cc b/gcc/tree-ssa-sink.cc > index 87b1d40c174..76556e7795b 100644 > --- a/gcc/tree-ssa-sink.cc > +++ b/gcc/tree-ssa-sink.cc > @@ -171,6 +171,72 @@ nearest_common_dominator_of_uses (def_operand_p def_= p, bool *debug_stmts) > return commondom; > } > > +/* Return TRUE if immediate uses of the defs in > + USE occur in the same block as USE, FALSE otherwise. */ > + > +bool > +def_use_same_block (gimple *stmt) > +{ > + use_operand_p use_p; > + def_operand_p def_p; > + imm_use_iterator imm_iter; > + ssa_op_iter iter; > + > + FOR_EACH_SSA_DEF_OPERAND (def_p, stmt, iter, SSA_OP_DEF) > + { > + FOR_EACH_IMM_USE_FAST (use_p, imm_iter, DEF_FROM_PTR (def_p)) > + { > + if (is_gimple_debug (USE_STMT (use_p))) > + continue; > + > + if (use_p use_p is never null > + && (gimple_bb (USE_STMT (use_p)) =3D=3D gimple_bb (stmt))) > + return true; the function behavior is obviously odd ... > + } > + } > + return false; > +} > + > +/* Return TRUE if the block has only calls, FALSE otherwise. */ > + > +bool > +block_call_p (basic_block bb) > +{ > + int i =3D 0; > + bool is_call =3D false; > + gimple_stmt_iterator gsi =3D gsi_last_bb (bb); > + gimple *last_stmt =3D gsi_stmt (gsi); > + > + if (last_stmt && gimple_code (last_stmt) =3D=3D GIMPLE_COND) > + { > + if (!gsi_end_p (gsi)) > + gsi_prev (&gsi); > + > + for (; !gsi_end_p (gsi);) > + { > + gimple *stmt =3D gsi_stmt (gsi); > + > + /* We have already seen a call. */ > + if (is_call) > + return false; Likewise. Do you want to check whether a block has a single stmt and that is a call and that is followed by a condition? It looks like a very convoluted way to write this. > + > + if (is_gimple_call (stmt)) > + is_call =3D true; > + else > + return false; > + > + if (!gsi_end_p (gsi)) > + gsi_prev (&gsi); > + > + ++i; > + } > + } > + if (is_call && i =3D=3D 1) > + return true; > + > + return false; > +} > + > /* Given EARLY_BB and LATE_BB, two blocks in a path through the dominato= r > tree, return the best basic block between them (inclusive) to place > statements. > @@ -190,7 +256,8 @@ nearest_common_dominator_of_uses (def_operand_p def_p= , bool *debug_stmts) > static basic_block > select_best_block (basic_block early_bb, > basic_block late_bb, > - gimple *stmt) > + gimple *stmt, > + gimple *use) please update the function comment > { > basic_block best_bb =3D late_bb; > basic_block temp_bb =3D late_bb; > @@ -230,14 +297,47 @@ select_best_block (basic_block early_bb, > if (threshold > 100) > threshold =3D 100; > } > - > /* If BEST_BB is at the same nesting level, then require it to have > significantly lower execution frequency to avoid gratuitous movemen= t. */ > if (bb_loop_depth (best_bb) =3D=3D bb_loop_depth (early_bb) > /* If result of comparsion is unknown, prefer EARLY_BB. > Thus use !(...>=3D..) rather than (...<...) */ > && !(best_bb->count * 100 >=3D early_bb->count * threshold)) > - return best_bb; > + { > + basic_block new_best_bb =3D get_immediate_dominator (CDI_DOMINATOR= S, best_bb); > + /* Return best_bb if def and use are in same block otherwise new_b= est_bb. > + > + Things to consider: > + > + new_best_bb is not equal to best_bb and early_bb. > + > + stmt is not call. > + > + new_best_bb doesnt have any phis. > + > + use basic block is not equal to early_bb. > + > + use basic block post dominates to new_best_bb. > + > + new_best_bb dominates early_bb. */ > + if (new_best_bb && use > + && (new_best_bb !=3D best_bb) > + && (new_best_bb !=3D early_bb) > + && !is_gimple_call (stmt) > + && gsi_end_p (gsi_start_phis (new_best_bb)) > + && (gimple_bb (use) !=3D early_bb) > + && !is_gimple_call (use) > + && dominated_by_p (CDI_POST_DOMINATORS, new_best_bb, gimple_bb(= use)) > + && dominated_by_p (CDI_DOMINATORS, new_best_bb, early_bb) > + && block_call_p (new_best_bb)) > + { > + if (def_use_same_block (use)) > + return best_bb; given the odd implementation of the predicates this matches very very specific cases. Consider if (..) { foo(); bar(); ... =3D l; } and C++ where foo and bar might throw. You then likely want to sink before foo (). What's the reason to only consider blocks with exactly 'call; cond;' ? > + > + return new_best_bb; > + } > + return best_bb; > + } > > /* No better block found, so return EARLY_BB, which happens to be the > statement's original block. */ > @@ -439,7 +539,7 @@ statement_sink_location (gimple *stmt, basic_block fr= ombb, > if (!dominated_by_p (CDI_DOMINATORS, commondom, frombb)) > return false; > > - commondom =3D select_best_block (frombb, commondom, stmt); > + commondom =3D select_best_block (frombb, commondom, stmt, NULL); > > if (commondom =3D=3D frombb) > return false; > @@ -456,19 +556,58 @@ statement_sink_location (gimple *stmt, basic_block = frombb, > continue; > break; > } > + > use =3D USE_STMT (one_use); > > if (gimple_code (use) !=3D GIMPLE_PHI) > { > - sinkbb =3D select_best_block (frombb, gimple_bb (use), stmt); > + sinkbb =3D select_best_block (frombb, gimple_bb (use), stmt, us= e); > > if (sinkbb =3D=3D frombb) > return false; > > - if (sinkbb =3D=3D gimple_bb (use)) > - *togsi =3D gsi_for_stmt (use); > - else > - *togsi =3D gsi_after_labels (sinkbb); > + gimple *def_stmt =3D SSA_NAME_DEF_STMT (DEF_FROM_PTR (def_p)); > + > + if ((gimple_bb (def_stmt) =3D=3D gimple_bb (use)) > + && (gimple_bb (use) !=3D sinkbb)) > + sinkbb =3D gimple_bb (use); > + > + if (sinkbb =3D=3D gimple_bb (use)) > + { > + gimple_stmt_iterator gsi =3D gsi_last_bb (sinkbb); > + gimple *def_stmt =3D SSA_NAME_DEF_STMT (DEF_FROM_PTR (def= _p)); > + gimple *last_stmt =3D gsi_stmt (gsi); > + > + /* Update sinking point as stmt before call if the sinkin= g block > + has only calls. Otherwise update sinking point as the = use > + stmt. */ > + if (gsi_stmt (gsi) =3D=3D use > + && !is_gimple_call (last_stmt) > + && (gimple_code (last_stmt) !=3D GIMPLE_SWITCH) > + && (gimple_code (last_stmt) !=3D GIMPLE_COND) > + && (gimple_code (last_stmt) !=3D GIMPLE_GOTO) > + && (!gimple_vdef (use) || !def_use_same_block (def_st= mt))) > + { > + if (!gsi_end_p (gsi)) > + gsi_prev (&gsi); > + > + gimple *stmt =3D gsi_stmt (gsi); > + > + if (!gsi_end_p (gsi)) > + gsi_prev (&gsi); > + > + if (gsi_end_p (gsi) && stmt && is_gimple_call (stmt) > + && gsi_end_p (gsi_start_phis (sinkbb)) > + && !is_gimple_call (def_stmt)) > + *togsi =3D gsi_for_stmt (stmt); > + else > + *togsi =3D gsi_for_stmt (use); > + } > + else > + *togsi =3D gsi_for_stmt(use); > + } > + else > + *togsi =3D gsi_after_labels (sinkbb); This is very convoluted. I think that in the end you want to compute (once= ) the position of the first call in each block. Since we're waking the CFG backw= ards in post-dominator order this information can be gathered during this walk. This would determine the location to sink to iff the use stmt is dominated = by this location (you can for example use gimple_uid to mark stmts before it). The alternative is to simply always sink to the start of blocks even for th= e use stmt block in case that has a call before the use (but you still need t= o efficiently compute that). Richard. > > return true; > } > @@ -480,7 +619,7 @@ statement_sink_location (gimple *stmt, basic_block fr= ombb, > if (!sinkbb) > return false; > > - sinkbb =3D select_best_block (frombb, sinkbb, stmt); > + sinkbb =3D select_best_block (frombb, sinkbb, stmt, NULL); > if (!sinkbb || sinkbb =3D=3D frombb) > return false; > > -- > 2.31.1 >