public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed
From: Richard Biener <richard.guenther@gmail.com>
To: Ajit Agarwal <aagarwa1@linux.ibm.com>
Cc: gcc-patches <gcc-patches@gcc.gnu.org>,
	Jeff Law <jeffreyalaw@gmail.com>,
	 Segher Boessenkool <segher@kernel.crashing.org>,
	Peter Bergner <bergner@linux.ibm.com>
Subject: Re: [PATCH v8] tree-ssa-sink: Improve code sinking pass
Date: Tue, 17 Oct 2023 10:33:38 +0200	[thread overview]
Message-ID: <CAFiYyc2KNgHt8Ciwszok7AjLWMqxJ5xjhw_-NasEpn5tFa0jrA@mail.gmail.com> (raw)
In-Reply-To: <79f04438-7473-2b01-d26a-9357ad9318af@linux.ibm.com>

On Thu, Oct 12, 2023 at 10:42 AM Ajit Agarwal <aagarwa1@linux.ibm.com> wrote:
>
> This patch improves code sinking pass to sink statements before call to reduce
> register pressure.
> Review comments are incorporated. Synced and modified with latest trunk sources.
>
> For example :
>
> void bar();
> int j;
> void foo(int a, int b, int c, int d, int e, int f)
> {
>   int l;
>   l = a + b + c + d +e + f;
>   if (a != 5)
>     {
>       bar();
>       j = l;
>     }
> }
>
> Code Sinking does the following:
>
> void bar();
> int j;
> void foo(int a, int b, int c, int d, int e, int f)
> {
>   int l;
>
>   if (a != 5)
>     {
>       l = a + b + c + d +e + f;
>       bar();
>       j = l;
>     }
> }
>
> Bootstrapped regtested on powerpc64-linux-gnu.
>
> Thanks & Regards
> Ajit
>
> tree-ssa-sink: Improve code sinking pass
>
> Currently, code sinking will sink code after function calls.  This increases
> register pressure for callee-saved registers.  The following patch improves
> code sinking by placing the sunk code before calls in the use block or in
> the immediate dominator of the use blocks.

The patch no longer does what the description above says.

More comments below.

> 2023-10-12  Ajit Kumar Agarwal  <aagarwa1@linux.ibm.com>
>
> gcc/ChangeLog:
>
>         PR tree-optimization/81953
>         * tree-ssa-sink.cc (statement_sink_location): Move statements before
>         calls.
>         (select_best_block): Add heuristics to select the best blocks in the
>         immediate post dominator.
>
> gcc/testsuite/ChangeLog:
>
>         PR tree-optimization/81953
>         * gcc.dg/tree-ssa/ssa-sink-20.c: New test.
>         * gcc.dg/tree-ssa/ssa-sink-21.c: New test.
> ---
>  gcc/testsuite/gcc.dg/tree-ssa/ssa-sink-21.c | 15 ++++++++
>  gcc/testsuite/gcc.dg/tree-ssa/ssa-sink-22.c | 19 ++++++++++
>  gcc/tree-ssa-sink.cc                        | 39 ++++++++++++---------
>  3 files changed, 56 insertions(+), 17 deletions(-)
>  create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/ssa-sink-21.c
>  create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/ssa-sink-22.c
>
> diff --git a/gcc/testsuite/gcc.dg/tree-ssa/ssa-sink-21.c b/gcc/testsuite/gcc.dg/tree-ssa/ssa-sink-21.c
> new file mode 100644
> index 00000000000..d3b79ca5803
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/tree-ssa/ssa-sink-21.c
> @@ -0,0 +1,15 @@
> +/* { dg-do compile } */
> +/* { dg-options "-O2 -fdump-tree-sink-stats" } */
> +void bar();
> +int j;
> +void foo(int a, int b, int c, int d, int e, int f)
> +{
> +  int l;
> +  l = a + b + c + d +e + f;
> +  if (a != 5)
> +    {
> +      bar();
> +      j = l;
> +    }
> +}
> +/* { dg-final { scan-tree-dump {l_12\s+=\s+_4\s+\+\s+f_11\(D\);\n\s+bar\s+\(\)} sink1 } } */
> diff --git a/gcc/testsuite/gcc.dg/tree-ssa/ssa-sink-22.c b/gcc/testsuite/gcc.dg/tree-ssa/ssa-sink-22.c
> new file mode 100644
> index 00000000000..84e7938c54f
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/tree-ssa/ssa-sink-22.c
> @@ -0,0 +1,19 @@
> +/* { dg-do compile } */
> +/* { dg-options "-O2 -fdump-tree-sink-stats" } */
> +void bar();
> +int j, x;
> +void foo(int a, int b, int c, int d, int e, int f)
> +{
> +  int l;
> +  l = a + b + c + d +e + f;
> +  if (a != 5)
> +    {
> +      bar();
> +      if (b != 3)
> +        x = 3;
> +      else
> +        x = 5;
> +      j = l;
> +    }
> +}
> +/* { dg-final { scan-tree-dump {l_13\s+=\s+_4\s+\+\s+f_12\(D\);\n\s+bar\s+\(\)} sink1 } } */
> diff --git a/gcc/tree-ssa-sink.cc b/gcc/tree-ssa-sink.cc
> index a360c5cdd6e..95298bc8402 100644
> --- a/gcc/tree-ssa-sink.cc
> +++ b/gcc/tree-ssa-sink.cc
> @@ -174,7 +174,8 @@ nearest_common_dominator_of_uses (def_operand_p def_p, bool *debug_stmts)
>
>  /* Given EARLY_BB and LATE_BB, two blocks in a path through the dominator
>     tree, return the best basic block between them (inclusive) to place
> -   statements.
> +   statements. The best basic block should be an immediate dominator of
> +   best basic block if the use stmt is after the call.
>
>     We want the most control dependent block in the shallowest loop nest.
>
> @@ -196,6 +197,16 @@ select_best_block (basic_block early_bb,
>    basic_block best_bb = late_bb;
>    basic_block temp_bb = late_bb;
>    int threshold;
> +  /* Get the sinking threshold.  If the statement to be moved has memory
> +     operands, then increase the threshold by 7% as those are even more
> +     profitable to avoid, clamping at 100%.  */
> +  threshold = param_sink_frequency_threshold;
> +  if (gimple_vuse (stmt) || gimple_vdef (stmt))
> +    {
> +      threshold += 7;
> +      if (threshold > 100)
> +       threshold = 100;
> +    }
>
>    while (temp_bb != early_bb)
>      {
> @@ -204,6 +215,14 @@ select_best_block (basic_block early_bb,
>        if (bb_loop_depth (temp_bb) < bb_loop_depth (best_bb))
>         best_bb = temp_bb;
>
> +      /* if we have temp_bb post dominated by use block block then immediate
> +       * dominator would be our best block.  */
> +      if (!gimple_vuse (stmt)
> +         && bb_loop_depth (temp_bb) == bb_loop_depth (early_bb)
> +         && !(temp_bb->count * 100 >= early_bb->count * threshold)
> +         && dominated_by_p (CDI_DOMINATORS, late_bb, temp_bb))

this isn't a post-dominance check, in fact this always returns true.  This
also overrides the best found loop depth which probably means finding
both inside the same loop doesn't work.

What's the intent of the change?

> +       best_bb = temp_bb;
> +
>        /* Walk up the dominator tree, hopefully we'll find a shallower
>          loop nest.  */
>        temp_bb = get_immediate_dominator (CDI_DOMINATORS, temp_bb);
> @@ -233,17 +252,6 @@ select_best_block (basic_block early_bb,
>        && !dominated_by_p (CDI_DOMINATORS, best_bb->loop_father->latch, best_bb))
>      return early_bb;
>
> -  /* Get the sinking threshold.  If the statement to be moved has memory
> -     operands, then increase the threshold by 7% as those are even more
> -     profitable to avoid, clamping at 100%.  */
> -  threshold = param_sink_frequency_threshold;
> -  if (gimple_vuse (stmt) || gimple_vdef (stmt))
> -    {
> -      threshold += 7;
> -      if (threshold > 100)
> -       threshold = 100;
> -    }
> -
>    /* If BEST_BB is at the same nesting level, then require it to have
>       significantly lower execution frequency to avoid gratuitous movement.  */
>    if (bb_loop_depth (best_bb) == bb_loop_depth (early_bb)
> @@ -430,6 +438,7 @@ statement_sink_location (gimple *stmt, basic_block frombb,
>             continue;
>           break;
>         }
> +
>        use = USE_STMT (one_use);
>
>        if (gimple_code (use) != GIMPLE_PHI)
> @@ -439,10 +448,7 @@ statement_sink_location (gimple *stmt, basic_block frombb,
>           if (sinkbb == frombb)
>             return false;
>
> -         if (sinkbb == gimple_bb (use))
> -           *togsi = gsi_for_stmt (use);
> -         else
> -           *togsi = gsi_after_labels (sinkbb);
> +         *togsi = gsi_after_labels (sinkbb);
>
>           return true;
>         }
> @@ -825,7 +831,6 @@ pass_sink_code::execute (function *fun)
>    mark_dfs_back_edges (fun);
>    memset (&sink_stats, 0, sizeof (sink_stats));
>    calculate_dominance_info (CDI_DOMINATORS);
> -
>    virtual_operand_live vop_live;
>
>    int *rpo = XNEWVEC (int, n_basic_blocks_for_fn (cfun));
> --
> 2.39.3
>

  reply	other threads:[~2023-10-17  8:36 UTC|newest]

Thread overview: 8+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-10-12  8:42 Ajit Agarwal
2023-10-17  8:33 ` Richard Biener [this message]
2023-10-17  8:53   ` Ajit Agarwal
2023-10-17  9:17     ` Richard Biener
2023-10-17 13:23       ` Ajit Agarwal
2023-10-30 12:21       ` Ajit Agarwal
2023-10-30 13:09         ` Ajit Agarwal
  -- strict thread matches above, loose matches on Subject: below --
2023-07-18 13:33 Ajit Agarwal

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=CAFiYyc2KNgHt8Ciwszok7AjLWMqxJ5xjhw_-NasEpn5tFa0jrA@mail.gmail.com \
    --to=richard.guenther@gmail.com \
    --cc=aagarwa1@linux.ibm.com \
    --cc=bergner@linux.ibm.com \
    --cc=gcc-patches@gcc.gnu.org \
    --cc=jeffreyalaw@gmail.com \
    --cc=segher@kernel.crashing.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).