public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed
From: Richard Biener <richard.guenther@gmail.com>
To: Ajit Agarwal <aagarwa1@linux.ibm.com>
Cc: gcc-patches <gcc-patches@gcc.gnu.org>,
	 Segher Boessenkool <segher@kernel.crashing.org>,
	Peter Bergner <bergner@linux.ibm.com>,
	 Jeff Law <jeffreyalaw@gmail.com>
Subject: Re: [PATCH v1] tree-ssa-sink: Improve code sinking pass.
Date: Mon, 22 May 2023 14:56:39 +0200	[thread overview]
Message-ID: <CAFiYyc12gMBXcN+Buetb3FrN+EhJ-AnS99hfduESZ4gncCPUVA@mail.gmail.com> (raw)
In-Reply-To: <acdb8e2b-7c95-0fd2-2189-51f661f15bc5@linux.ibm.com>

On Thu, May 18, 2023 at 9:14 AM Ajit Agarwal <aagarwa1@linux.ibm.com> wrote:
>
> Hello All:
>
> This patch improves code sinking pass to sink statements before call to reduce
> register pressure.
> Review comments are incorporated.
>
> Bootstrapped and regtested on powerpc64-linux-gnu.
>
> Thanks & Regards
> Ajit
>
>
> tree-ssa-sink: Improve code sinking pass.
>
> Code Sinking sinks the blocks after call. This increases
> register pressure for callee-saved registers. Improves
> code sinking before call in the use blocks or immediate
> dominator of use blocks.
>
> 2023-05-18  Ajit Kumar Agarwal  <aagarwa1@linux.ibm.com>
>
> gcc/ChangeLog:
>
>         * tree-ssa-sink.cc (statement_sink_location): Modifed to
>         move statements before calls.
>         (block_call_p): New function.
>         (def_use_same_block): New function.
>         (select_best_block): Add heuristics to select the best
>         blocks in the immediate post dominator.
>
> gcc/testsuite/ChangeLog:
>
>         * gcc.dg/tree-ssa/ssa-sink-20.c: New testcase.
>         * gcc.dg/tree-ssa/ssa-sink-21.c: New testcase.
> ---
>  gcc/testsuite/gcc.dg/tree-ssa/ssa-sink-20.c |  16 ++
>  gcc/testsuite/gcc.dg/tree-ssa/ssa-sink-21.c |  20 +++
>  gcc/tree-ssa-sink.cc                        | 159 ++++++++++++++++++--
>  3 files changed, 185 insertions(+), 10 deletions(-)
>  create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/ssa-sink-20.c
>  create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/ssa-sink-21.c
>
> diff --git a/gcc/testsuite/gcc.dg/tree-ssa/ssa-sink-20.c b/gcc/testsuite/gcc.dg/tree-ssa/ssa-sink-20.c
> new file mode 100644
> index 00000000000..716bc1f9257
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/tree-ssa/ssa-sink-20.c
> @@ -0,0 +1,16 @@
> +/* { dg-do compile } */
> +/* { dg-options "-O2 -fdump-tree-sink -fdump-tree-optimized -fdump-tree-sink-stats" } */
> +
> +void bar();
> +int j;
> +void foo(int a, int b, int c, int d, int e, int f)
> +{
> +  int l;
> +  l = a + b + c + d +e + f;
> +  if (a != 5)
> +    {
> +      bar();
> +      j = l;
> +    }
> +}
> +/* { dg-final { scan-tree-dump-times "Sunk statements: 5" 1 "sink" } } */

this doesn't verify the place we sink to?

> diff --git a/gcc/testsuite/gcc.dg/tree-ssa/ssa-sink-21.c b/gcc/testsuite/gcc.dg/tree-ssa/ssa-sink-21.c
> new file mode 100644
> index 00000000000..ff41e2ea8ae
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/tree-ssa/ssa-sink-21.c
> @@ -0,0 +1,20 @@
> +/* { dg-do compile } */
> +/* { dg-options "-O2 -fdump-tree-sink-stats -fdump-tree-sink-stats" } */
> +
> +void bar();
> +int j, x;
> +void foo(int a, int b, int c, int d, int e, int f)
> +{
> +  int l;
> +  l = a + b + c + d +e + f;
> +  if (a != 5)
> +    {
> +      bar();
> +      if (b != 3)
> +        x = 3;
> +      else
> +        x = 5;
> +      j = l;
> +    }
> +}
> +/* { dg-final { scan-tree-dump-times "Sunk statements: 5" 1 "sink" } } */

likewise.  So both tests already pass before the patch?

> diff --git a/gcc/tree-ssa-sink.cc b/gcc/tree-ssa-sink.cc
> index 87b1d40c174..76556e7795b 100644
> --- a/gcc/tree-ssa-sink.cc
> +++ b/gcc/tree-ssa-sink.cc
> @@ -171,6 +171,72 @@ nearest_common_dominator_of_uses (def_operand_p def_p, bool *debug_stmts)
>    return commondom;
>  }
>
> +/* Return TRUE if immediate uses of the defs in
> +   USE occur in the same block as USE, FALSE otherwise.  */
> +
> +bool
> +def_use_same_block (gimple *stmt)
> +{
> +  use_operand_p use_p;
> +  def_operand_p def_p;
> +  imm_use_iterator imm_iter;
> +  ssa_op_iter iter;
> +
> +  FOR_EACH_SSA_DEF_OPERAND (def_p, stmt, iter, SSA_OP_DEF)
> +    {
> +      FOR_EACH_IMM_USE_FAST (use_p, imm_iter, DEF_FROM_PTR (def_p))
> +       {
> +         if (is_gimple_debug (USE_STMT (use_p)))
> +           continue;
> +
> +         if (use_p

use_p is never null

> +             && (gimple_bb (USE_STMT (use_p)) == gimple_bb (stmt)))
> +           return true;

the function behavior is obviously odd ...

> +       }
> +     }
> +  return false;
> +}
> +
> +/* Return TRUE if the block has only calls, FALSE otherwise. */
> +
> +bool
> +block_call_p (basic_block bb)
> +{
> +  int i = 0;
> +  bool is_call = false;
> +  gimple_stmt_iterator gsi = gsi_last_bb (bb);
> +  gimple *last_stmt = gsi_stmt (gsi);
> +
> +  if (last_stmt && gimple_code (last_stmt) == GIMPLE_COND)
> +    {
> +      if (!gsi_end_p (gsi))
> +       gsi_prev (&gsi);
> +
> +       for (; !gsi_end_p (gsi);)
> +        {
> +          gimple *stmt = gsi_stmt (gsi);
> +
> +          /* We have already seen a call.  */
> +          if (is_call)
> +            return false;

Likewise.  Do you want to check whether a block has
a single stmt and that is a call and that is followed by
a condition?  It looks like a very convoluted way to write this.

> +
> +          if (is_gimple_call (stmt))
> +            is_call = true;
> +          else
> +            return false;
> +
> +          if (!gsi_end_p (gsi))
> +            gsi_prev (&gsi);
> +
> +           ++i;
> +       }
> +     }
> +  if (is_call && i == 1)
> +    return true;
> +
> +  return false;
> +}
> +
>  /* Given EARLY_BB and LATE_BB, two blocks in a path through the dominator
>     tree, return the best basic block between them (inclusive) to place
>     statements.
> @@ -190,7 +256,8 @@ nearest_common_dominator_of_uses (def_operand_p def_p, bool *debug_stmts)
>  static basic_block
>  select_best_block (basic_block early_bb,
>                    basic_block late_bb,
> -                  gimple *stmt)
> +                  gimple *stmt,
> +                  gimple *use)

please update the function comment

>  {
>    basic_block best_bb = late_bb;
>    basic_block temp_bb = late_bb;
> @@ -230,14 +297,47 @@ select_best_block (basic_block early_bb,
>        if (threshold > 100)
>         threshold = 100;
>      }
> -
>    /* If BEST_BB is at the same nesting level, then require it to have
>       significantly lower execution frequency to avoid gratuitous movement.  */
>    if (bb_loop_depth (best_bb) == bb_loop_depth (early_bb)
>        /* If result of comparsion is unknown, prefer EARLY_BB.
>          Thus use !(...>=..) rather than (...<...)  */
>        && !(best_bb->count * 100 >= early_bb->count * threshold))
> -    return best_bb;
> +    {
> +      basic_block new_best_bb = get_immediate_dominator (CDI_DOMINATORS, best_bb);
> +      /* Return best_bb if def and use are in same block otherwise new_best_bb.
> +
> +        Things to consider:
> +
> +          new_best_bb is not equal to best_bb and early_bb.
> +
> +          stmt is not call.
> +
> +          new_best_bb doesnt have any phis.
> +
> +          use basic block is not equal to early_bb.
> +
> +          use basic block post dominates to new_best_bb.
> +
> +          new_best_bb dominates early_bb. */
> +      if (new_best_bb && use
> +         && (new_best_bb != best_bb)
> +         && (new_best_bb != early_bb)
> +         && !is_gimple_call (stmt)
> +         && gsi_end_p (gsi_start_phis (new_best_bb))
> +         && (gimple_bb (use) != early_bb)
> +         && !is_gimple_call (use)
> +         && dominated_by_p (CDI_POST_DOMINATORS, new_best_bb, gimple_bb(use))
> +         && dominated_by_p (CDI_DOMINATORS, new_best_bb, early_bb)
> +         && block_call_p (new_best_bb))
> +       {
> +         if (def_use_same_block (use))
> +           return best_bb;

given the odd implementation of the predicates this matches very very
specific cases.

Consider

 if (..)
  {
    foo();
    bar();
    ... = l;
  }

and C++ where foo and bar might throw.  You then likely want to sink
before foo ().

What's the reason to only consider blocks with exactly 'call; cond;' ?

> +
> +         return new_best_bb;
> +       }
> +       return best_bb;
> +    }
>
>    /* No better block found, so return EARLY_BB, which happens to be the
>       statement's original block.  */
> @@ -439,7 +539,7 @@ statement_sink_location (gimple *stmt, basic_block frombb,
>        if (!dominated_by_p (CDI_DOMINATORS, commondom, frombb))
>         return false;
>
> -      commondom = select_best_block (frombb, commondom, stmt);
> +      commondom = select_best_block (frombb, commondom, stmt, NULL);
>
>        if (commondom == frombb)
>         return false;
> @@ -456,19 +556,58 @@ statement_sink_location (gimple *stmt, basic_block frombb,
>             continue;
>           break;
>         }
> +
>        use = USE_STMT (one_use);
>
>        if (gimple_code (use) != GIMPLE_PHI)
>         {
> -         sinkbb = select_best_block (frombb, gimple_bb (use), stmt);
> +         sinkbb = select_best_block (frombb, gimple_bb (use), stmt, use);
>
>           if (sinkbb == frombb)
>             return false;
>
> -         if (sinkbb == gimple_bb (use))
> -           *togsi = gsi_for_stmt (use);
> -         else
> -           *togsi = gsi_after_labels (sinkbb);
> +          gimple *def_stmt = SSA_NAME_DEF_STMT (DEF_FROM_PTR (def_p));
> +
> +          if ((gimple_bb (def_stmt) == gimple_bb (use))
> +               && (gimple_bb (use) != sinkbb))
> +            sinkbb = gimple_bb (use);
> +
> +           if (sinkbb == gimple_bb (use))
> +             {
> +               gimple_stmt_iterator gsi = gsi_last_bb (sinkbb);
> +               gimple *def_stmt = SSA_NAME_DEF_STMT (DEF_FROM_PTR (def_p));
> +               gimple *last_stmt = gsi_stmt (gsi);
> +
> +               /* Update sinking point as stmt before call if the sinking block
> +                  has only calls. Otherwise update sinking point as the use
> +                  stmt. */
> +               if (gsi_stmt (gsi) == use
> +                   && !is_gimple_call (last_stmt)
> +                   && (gimple_code (last_stmt) != GIMPLE_SWITCH)
> +                   && (gimple_code (last_stmt) != GIMPLE_COND)
> +                   && (gimple_code (last_stmt) != GIMPLE_GOTO)
> +                   && (!gimple_vdef (use) || !def_use_same_block (def_stmt)))
> +                 {
> +                   if (!gsi_end_p (gsi))
> +                     gsi_prev (&gsi);
> +
> +                   gimple *stmt = gsi_stmt (gsi);
> +
> +                   if (!gsi_end_p (gsi))
> +                     gsi_prev (&gsi);
> +
> +                   if (gsi_end_p (gsi) && stmt && is_gimple_call (stmt)
> +                       && gsi_end_p (gsi_start_phis (sinkbb))
> +                       && !is_gimple_call (def_stmt))
> +                     *togsi = gsi_for_stmt (stmt);
> +                   else
> +                     *togsi = gsi_for_stmt (use);
> +                  }
> +               else
> +                 *togsi = gsi_for_stmt(use);
> +              }
> +            else
> +               *togsi = gsi_after_labels (sinkbb);

This is very convoluted.  I think that in the end you want to compute (once) the
position of the first call in each block.  Since we're waking the CFG backwards
in post-dominator order this information can be gathered during this walk.
This would determine the location to sink to iff the use stmt is dominated by
this location (you can for example use gimple_uid to mark stmts before it).

The alternative is to simply always sink to the start of blocks even for the
use stmt block in case that has a call before the use (but you still need to
efficiently compute that).

Richard.

>
>           return true;
>         }
> @@ -480,7 +619,7 @@ statement_sink_location (gimple *stmt, basic_block frombb,
>    if (!sinkbb)
>      return false;
>
> -  sinkbb = select_best_block (frombb, sinkbb, stmt);
> +  sinkbb = select_best_block (frombb, sinkbb, stmt, NULL);
>    if (!sinkbb || sinkbb == frombb)
>      return false;
>
> --
> 2.31.1
>

  parent reply	other threads:[~2023-05-22 12:56 UTC|newest]

Thread overview: 7+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-05-18  7:14 Ajit Agarwal
2023-05-18 16:57 ` Segher Boessenkool
2023-05-22 12:56 ` Richard Biener [this message]
2023-05-30  5:06   ` Ajit Agarwal
2023-05-30  7:04     ` Richard Biener
2023-05-30  7:32       ` Ajit Agarwal
2023-05-30 11:24         ` Richard Biener

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=CAFiYyc12gMBXcN+Buetb3FrN+EhJ-AnS99hfduESZ4gncCPUVA@mail.gmail.com \
    --to=richard.guenther@gmail.com \
    --cc=aagarwa1@linux.ibm.com \
    --cc=bergner@linux.ibm.com \
    --cc=gcc-patches@gcc.gnu.org \
    --cc=jeffreyalaw@gmail.com \
    --cc=segher@kernel.crashing.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).