public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed
From: Richard Biener <richard.guenther@gmail.com>
To: Richard Biener via Gcc-patches <gcc-patches@gcc.gnu.org>,
	Richard Biener <richard.guenther@gmail.com>,
	 Richard Sandiford <richard.sandiford@arm.com>
Subject: Re: [PATCH 4/5] if-conv: Apply VN to hoisted conversions
Date: Fri, 19 Nov 2021 11:09:20 +0100	[thread overview]
Message-ID: <CAFiYyc3R_ao1y1netZzPSyJdn+=dTe8QfLP9-7=ZZZnL7a0ToA@mail.gmail.com> (raw)
In-Reply-To: <mptee7g7z8q.fsf@arm.com>

On Tue, Nov 16, 2021 at 7:05 PM Richard Sandiford
<richard.sandiford@arm.com> wrote:
>
> Richard Biener via Gcc-patches <gcc-patches@gcc.gnu.org> writes:
> > On Mon, Nov 15, 2021 at 3:00 PM Richard Sandiford
> > <richard.sandiford@arm.com> wrote:
> >>
> >> Richard Biener via Gcc-patches <gcc-patches@gcc.gnu.org> writes:
> >> > On Fri, Nov 12, 2021 at 7:05 PM Richard Sandiford via Gcc-patches
> >> > <gcc-patches@gcc.gnu.org> wrote:
> >> >>
> >> >> This patch is a prerequisite for a later one.  At the moment,
> >> >> if-conversion converts predicated POINTER_PLUS_EXPRs into
> >> >> non-wrapping forms, which for:
> >> >>
> >> >>     … = base + offset
> >> >>
> >> >> becomes:
> >> >>
> >> >>     tmp = (unsigned long) base
> >> >>     … = tmp + offset
> >> >>
> >> >> It then hoists these conversions out of the loop where possible.
> >> >>
> >> >> However, because “base” is a valid gimple operand, there can be
> >> >> multiple POINTER_PLUS_EXPRs with the same base, which can in turn
> >> >> lead to multiple instances of the same conversion.  The later VN pass
> >> >> is (and I think needs to be) restricted to the new if-converted code,
> >> >> whereas here we're deliberately inserting the conversions before the
> >> >> .LOOP_VECTORIZED condition:
> >> >>
> >> >>         /* If we versioned loop then make sure to insert invariant
> >> >>            stmts before the .LOOP_VECTORIZED check since the vectorizer
> >> >>            will re-use that for things like runtime alias versioning
> >> >>            whose condition can end up using those invariants.  */
> >> >>
> >> >> We can therefore enter the vectoriser with redundant conversions.
> >> >>
> >> >> The easiest fix seemed to be to defer the hoisting until after VN.
> >> >> This catches other hoisting opportunities too.
> >> >>
> >> >> Hoisting the code from the (artificial) loop in pr99102.c means
> >> >> that it's no longer worth vectorising.  The patch forces vectorisation
> >> >> instead of relying on the cost model.
> >> >>
> >> >> The patch also reverts pr87007-4.c and pr87007-5.c back to their
> >> >> original forms, undoing changes in 783dc66f9ccb0019c3dad.
> >> >> The code at the time the tests were added was:
> >> >>
> >> >>         testl   %edi, %edi
> >> >>         je      .L10
> >> >>         vxorps  %xmm1, %xmm1, %xmm1
> >> >>         vsqrtsd d3(%rip), %xmm1, %xmm0
> >> >>         vsqrtsd d2(%rip), %xmm1, %xmm1
> >> >>         ...
> >> >> .L10:
> >> >>         ret
> >> >>
> >> >> with the operations being hoisted, and the vxorps was specifically
> >> >> wanted (compared to the previous code).  This patch restores the code
> >> >> to that form, with the hoisted operations and the vxorps.
> >> >>
> >> >> Regstrapped on aarch64-linux-gnu and x86_64-linux-gnu.  OK to install?
> >> >>
> >> >> Richard
> >> >>
> >> >>
> >> >> gcc/
> >> >>         * tree-if-conv.c: Include tree-eh.h.
> >> >>         (predicate_statements): Remove pe argument.  Don't hoist
> >> >>         statements here.
> >> >>         (combine_blocks): Remove pe argument.
> >> >>         (ifcvt_can_hoist, ifcvt_can_hoist_further): New functions.
> >> >>         (ifcvt_hoist_invariants): Likewise.
> >> >>         (tree_if_conversion): Update call to combine_blocks.  Call
> >> >>         ifcvt_hoist_invariants after VN.
> >> >>
> >> >> gcc/testsuite/
> >> >>         * gcc.dg/vect/pr99102.c: Add -fno-vect-cost-model.
> >> >>
> >> >>         Revert:
> >> >>
> >> >>         2020-09-09  Richard Biener  <rguenther@suse.de>
> >> >>
> >> >>         * gcc.target/i386/pr87007-4.c: Adjust.
> >> >>         * gcc.target/i386/pr87007-5.c: Likewise.
> >> >> ---
> >> >>  gcc/testsuite/gcc.dg/vect/pr99102.c       |   2 +-
> >> >>  gcc/testsuite/gcc.target/i386/pr87007-4.c |   2 +-
> >> >>  gcc/testsuite/gcc.target/i386/pr87007-5.c |   2 +-
> >> >>  gcc/tree-if-conv.c                        | 122 ++++++++++++++++++++--
> >> >>  4 files changed, 114 insertions(+), 14 deletions(-)
> >> >>
> >> >> diff --git a/gcc/testsuite/gcc.dg/vect/pr99102.c b/gcc/testsuite/gcc.dg/vect/pr99102.c
> >> >> index 6c1a13f0783..0d030d15c86 100644
> >> >> --- a/gcc/testsuite/gcc.dg/vect/pr99102.c
> >> >> +++ b/gcc/testsuite/gcc.dg/vect/pr99102.c
> >> >> @@ -1,4 +1,4 @@
> >> >> -/* { dg-options "-O2 -ftree-vectorize -fdump-tree-vect-details" } */
> >> >> +/* { dg-options "-O2 -ftree-vectorize -fno-vect-cost-model -fdump-tree-vect-details" } */
> >> >>  /* { dg-additional-options "-msve-vector-bits=256" { target aarch64_sve256_hw } } */
> >> >>  long a[44];
> >> >>  short d, e = -7;
> >> >> diff --git a/gcc/testsuite/gcc.target/i386/pr87007-4.c b/gcc/testsuite/gcc.target/i386/pr87007-4.c
> >> >> index 9c4b8005af3..e91bdcbac44 100644
> >> >> --- a/gcc/testsuite/gcc.target/i386/pr87007-4.c
> >> >> +++ b/gcc/testsuite/gcc.target/i386/pr87007-4.c
> >> >> @@ -15,4 +15,4 @@ foo (int n, int k)
> >> >>        d1 = ceil (d3);
> >> >>  }
> >> >>
> >> >> -/* { dg-final { scan-assembler-times "vxorps\[^\n\r\]*xmm\[0-9\]" 0 } } */
> >> >> +/* { dg-final { scan-assembler-times "vxorps\[^\n\r\]*xmm\[0-9\]" 1 } } */
> >> >> diff --git a/gcc/testsuite/gcc.target/i386/pr87007-5.c b/gcc/testsuite/gcc.target/i386/pr87007-5.c
> >> >> index e4d956a5d7f..20d13cf650b 100644
> >> >> --- a/gcc/testsuite/gcc.target/i386/pr87007-5.c
> >> >> +++ b/gcc/testsuite/gcc.target/i386/pr87007-5.c
> >> >> @@ -15,4 +15,4 @@ foo (int n, int k)
> >> >>        d1 = sqrt (d3);
> >> >>  }
> >> >>
> >> >> -/* { dg-final { scan-assembler-times "vxorps\[^\n\r\]*xmm\[0-9\]" 0 } } */
> >> >> +/* { dg-final { scan-assembler-times "vxorps\[^\n\r\]*xmm\[0-9\]" 1 } } */
> >> >> diff --git a/gcc/tree-if-conv.c b/gcc/tree-if-conv.c
> >> >> index e88ddc9f788..0ad557a2f4d 100644
> >> >> --- a/gcc/tree-if-conv.c
> >> >> +++ b/gcc/tree-if-conv.c
> >> >> @@ -121,6 +121,7 @@ along with GCC; see the file COPYING3.  If not see
> >> >>  #include "tree-cfgcleanup.h"
> >> >>  #include "tree-ssa-dse.h"
> >> >>  #include "tree-vectorizer.h"
> >> >> +#include "tree-eh.h"
> >> >>
> >> >>  /* Only handle PHIs with no more arguments unless we are asked to by
> >> >>     simd pragma.  */
> >> >> @@ -2496,7 +2497,7 @@ predicate_rhs_code (gassign *stmt, tree mask, tree cond,
> >> >>  */
> >> >>
> >> >>  static void
> >> >> -predicate_statements (loop_p loop, edge pe)
> >> >> +predicate_statements (loop_p loop)
> >> >>  {
> >> >>    unsigned int i, orig_loop_num_nodes = loop->num_nodes;
> >> >>    auto_vec<int, 1> vect_sizes;
> >> >> @@ -2597,13 +2598,7 @@ predicate_statements (loop_p loop, edge pe)
> >> >>                 {
> >> >>                   gassign *stmt2 = as_a <gassign *> (gsi_stmt (gsi2));
> >> >>                   gsi_remove (&gsi2, false);
> >> >> -                 /* Make sure to move invariant conversions out of the
> >> >> -                    loop.  */
> >> >> -                 if (CONVERT_EXPR_CODE_P (gimple_assign_rhs_code (stmt2))
> >> >> -                     && expr_invariant_in_loop_p (loop,
> >> >> -                                                  gimple_assign_rhs1 (stmt2)))
> >> >> -                   gsi_insert_on_edge_immediate (pe, stmt2);
> >> >> -                 else if (first)
> >> >> +                 if (first)
> >> >>                     {
> >> >>                       gsi_insert_before (&gsi, stmt2, GSI_NEW_STMT);
> >> >>                       first = false;
> >> >> @@ -2684,7 +2679,7 @@ remove_conditions_and_labels (loop_p loop)
> >> >>     blocks.  Replace PHI nodes with conditional modify expressions.  */
> >> >>
> >> >>  static void
> >> >> -combine_blocks (class loop *loop, edge pe)
> >> >> +combine_blocks (class loop *loop)
> >> >>  {
> >> >>    basic_block bb, exit_bb, merge_target_bb;
> >> >>    unsigned int orig_loop_num_nodes = loop->num_nodes;
> >> >> @@ -2697,7 +2692,7 @@ combine_blocks (class loop *loop, edge pe)
> >> >>    predicate_all_scalar_phis (loop);
> >> >>
> >> >>    if (need_to_predicate || need_to_rewrite_undefined)
> >> >> -    predicate_statements (loop, pe);
> >> >> +    predicate_statements (loop);
> >> >>
> >> >>    /* Merge basic blocks.  */
> >> >>    exit_bb = NULL;
> >> >> @@ -3181,6 +3176,109 @@ ifcvt_local_dce (class loop *loop)
> >> >>      }
> >> >>  }
> >> >>
> >> >> +/* Return true if STMT can be hoisted from if-converted loop LOOP.  */
> >> >> +
> >> >> +static bool
> >> >> +ifcvt_can_hoist (class loop *loop, gimple *stmt)
> >> >> +{
> >> >> +  if (auto *call = dyn_cast<gcall *> (stmt))
> >> >> +    {
> >> >> +      if (gimple_call_internal_p (call)
> >> >> +         && internal_fn_mask_index (gimple_call_internal_fn (call)) >= 0)
> >> >> +       return false;
> >> >> +    }
> >> >> +  else if (auto *assign = dyn_cast<gassign *> (stmt))
> >> >> +    {
> >> >> +      if (gimple_assign_rhs_code (assign) == COND_EXPR)
> >> >> +       return false;
> >> >> +    }
> >> >> +  else
> >> >> +    return false;
> >> >> +
> >> >> +  if (gimple_has_side_effects (stmt)
> >> >> +      || gimple_could_trap_p (stmt)
> >> >> +      || stmt_could_throw_p (cfun, stmt)
> >> >> +      || gimple_vdef (stmt)
> >> >> +      || gimple_vuse (stmt))
> >> >> +    return false;
> >> >> +
> >> >> +  int num_args = gimple_num_args (stmt);
> >> >> +  for (int i = 0; i < num_args; ++i)
> >> >> +    if (!expr_invariant_in_loop_p (loop, gimple_arg (stmt, i)))
> >> >> +      return false;
> >> >> +
> >> >> +  return true;
> >> >> +}
> >> >> +
> >> >> +/* PE is the preferred hoisting edge selected by tree_if_conversion, which
> >> >> +   s known to be different from (and to dominate) the preheader edge of the
> >> >> +   if-converted loop.  We already know that STMT can be inserted on the loop
> >> >> +   preheader edge.  Return true if we prefer to insert it on PE instead.  */
> >> >> +
> >> >> +static bool
> >> >> +ifcvt_can_hoist_further (edge pe, gimple *stmt)
> >> >> +{
> >> >> +  /* As explained in tree_if_conversion, we want to hoist invariant
> >> >> +     conversions further so that they can be reused by alias analysis.  */
> >> >> +  auto *assign = dyn_cast<gassign *> (stmt);
> >> >> +  if (assign
> >> >> +      && CONVERT_EXPR_CODE_P (gimple_assign_rhs_code (assign)))
> >> >> +    {
> >> >> +      tree rhs = gimple_assign_rhs1 (assign);
> >> >> +      if (is_gimple_min_invariant (rhs))
> >> >> +       return true;
> >> >> +
> >> >> +      if (TREE_CODE (rhs) == SSA_NAME)
> >> >> +       {
> >> >> +         basic_block def_bb = gimple_bb (SSA_NAME_DEF_STMT (rhs));
> >> >> +         if (!def_bb || dominated_by_p (CDI_DOMINATORS, pe->dest, def_bb))
> >> >> +           return true;
> >> >> +       }
> >> >> +    }
> >> >> +  return false;
> >> >> +}
> >> >> +
> >> >> +/* Hoist invariant statements from LOOP.  PE is the preferred edge for
> >> >> +   hoisting conversions, as selected by tree_if_conversion; see there
> >> >> +   for details.  */
> >> >> +
> >> >> +static void
> >> >> +ifcvt_hoist_invariants (class loop *loop, edge pe)
> >> >> +{
> >> >> +  gimple_stmt_iterator hoist_gsi = {};
> >> >> +  gimple_stmt_iterator hoist_gsi_pe = {};
> >> >> +  unsigned int num_blocks = loop->num_nodes;
> >> >> +  basic_block *body = get_loop_body (loop);
> >> >> +  for (unsigned int i = 0; i < num_blocks; ++i)
> >> >> +    for (gimple_stmt_iterator gsi = gsi_start_bb (body[i]); !gsi_end_p (gsi);)
> >> >> +      {
> >> >> +       gimple *stmt = gsi_stmt (gsi);
> >> >> +       if (ifcvt_can_hoist (loop, stmt))
> >> >> +         {
> >> >> +           /* Once we've hoisted one statement, insert other statements
> >> >> +              after it.  */
> >> >> +           edge e = loop_preheader_edge (loop);
> >> >> +           gimple_stmt_iterator *hoist_gsi_ptr = &hoist_gsi;
> >> >> +           if (e != pe && ifcvt_can_hoist_further (pe, stmt))
> >> >
> >> > One issue with hoisting to loop_preheader_edge instead of 'pe'
> >> > is that eventually the loop versioning code will pick up the defs
> >> > through the versioning condition and that will break because the
> >> > versioning condition will be inserted in place of the
> >> > IFN_VECTORIZED_LOOP, which is _before_ the preheader.
> >> > The code basically expects the preheader to be empty.
> >>
> >> Do you mean that the versioning-for-aliases code assumes that all
> >> loop-invariant definitions are available in the versioning condition?
> >
> > Yes.  Or well, it assumes it can use the .LOOP_VECTORIZED condition
> > as versioning condition and there simply use insert the condition without
> > further checks.  And the checks are generated utlimatively from SCEV
> > which just expands things up to the loops preheader.
> >
> >> I hadn't realised that's what the comment was saying.  It sounded more
> >> like an optimisation.
> >
> > The hoisting itself is, but that we don't just hoist to the loops preheader
> > is for correctness, not an optimization ... it's indeed a bit messy though.
>
> Ah, OK.
>
> How does this version look?  I was in two minds about whether to keep
> the expr_invariant_in_loop_p path, but it seemed like the more natural
> test for the case that it handles.  I'd be happy to drop it and use
> ifcvt_available_on_edge_p instead if you prefer though.
>
> Tested as before.

OK.

Thanks,
Richard.

> Thanks,
> Richard
>
>
> gcc/
>         * tree-if-conv.c: Include tree-eh.h.
>         (predicate_statements): Remove pe argument.  Don't hoist
>         statements here.
>         (combine_blocks): Remove pe argument.
>         (ifcvt_available_on_edge_p, ifcvt_can_hoist): New functions.
>         (ifcvt_hoist_invariants): Likewise.
>         (tree_if_conversion): Update call to combine_blocks.  Call
>         ifcvt_hoist_invariants after VN.
>
> gcc/testsuite/
>         * gcc.dg/vect/pr99102.c: Add -fno-vect-cost-model.
>
>         Revert:
>
>         2020-09-09  Richard Biener  <rguenther@suse.de>
>
>         * gcc.target/i386/pr87007-4.c: Adjust.
>         * gcc.target/i386/pr87007-5.c: Likewise.
> ---
>  gcc/testsuite/gcc.dg/vect/pr99102.c       |   2 +-
>  gcc/testsuite/gcc.target/i386/pr87007-4.c |   2 +-
>  gcc/testsuite/gcc.target/i386/pr87007-5.c |   2 +-
>  gcc/tree-if-conv.c                        | 112 +++++++++++++++++++---
>  4 files changed, 104 insertions(+), 14 deletions(-)
>
> diff --git a/gcc/testsuite/gcc.dg/vect/pr99102.c b/gcc/testsuite/gcc.dg/vect/pr99102.c
> index 6c1a13f0783..0d030d15c86 100644
> --- a/gcc/testsuite/gcc.dg/vect/pr99102.c
> +++ b/gcc/testsuite/gcc.dg/vect/pr99102.c
> @@ -1,4 +1,4 @@
> -/* { dg-options "-O2 -ftree-vectorize -fdump-tree-vect-details" } */
> +/* { dg-options "-O2 -ftree-vectorize -fno-vect-cost-model -fdump-tree-vect-details" } */
>  /* { dg-additional-options "-msve-vector-bits=256" { target aarch64_sve256_hw } } */
>  long a[44];
>  short d, e = -7;
> diff --git a/gcc/testsuite/gcc.target/i386/pr87007-4.c b/gcc/testsuite/gcc.target/i386/pr87007-4.c
> index 9c4b8005af3..e91bdcbac44 100644
> --- a/gcc/testsuite/gcc.target/i386/pr87007-4.c
> +++ b/gcc/testsuite/gcc.target/i386/pr87007-4.c
> @@ -15,4 +15,4 @@ foo (int n, int k)
>        d1 = ceil (d3);
>  }
>
> -/* { dg-final { scan-assembler-times "vxorps\[^\n\r\]*xmm\[0-9\]" 0 } } */
> +/* { dg-final { scan-assembler-times "vxorps\[^\n\r\]*xmm\[0-9\]" 1 } } */
> diff --git a/gcc/testsuite/gcc.target/i386/pr87007-5.c b/gcc/testsuite/gcc.target/i386/pr87007-5.c
> index e4d956a5d7f..20d13cf650b 100644
> --- a/gcc/testsuite/gcc.target/i386/pr87007-5.c
> +++ b/gcc/testsuite/gcc.target/i386/pr87007-5.c
> @@ -15,4 +15,4 @@ foo (int n, int k)
>        d1 = sqrt (d3);
>  }
>
> -/* { dg-final { scan-assembler-times "vxorps\[^\n\r\]*xmm\[0-9\]" 0 } } */
> +/* { dg-final { scan-assembler-times "vxorps\[^\n\r\]*xmm\[0-9\]" 1 } } */
> diff --git a/gcc/tree-if-conv.c b/gcc/tree-if-conv.c
> index e88ddc9f788..d0ca04600ce 100644
> --- a/gcc/tree-if-conv.c
> +++ b/gcc/tree-if-conv.c
> @@ -121,6 +121,7 @@ along with GCC; see the file COPYING3.  If not see
>  #include "tree-cfgcleanup.h"
>  #include "tree-ssa-dse.h"
>  #include "tree-vectorizer.h"
> +#include "tree-eh.h"
>
>  /* Only handle PHIs with no more arguments unless we are asked to by
>     simd pragma.  */
> @@ -2496,7 +2497,7 @@ predicate_rhs_code (gassign *stmt, tree mask, tree cond,
>  */
>
>  static void
> -predicate_statements (loop_p loop, edge pe)
> +predicate_statements (loop_p loop)
>  {
>    unsigned int i, orig_loop_num_nodes = loop->num_nodes;
>    auto_vec<int, 1> vect_sizes;
> @@ -2597,13 +2598,7 @@ predicate_statements (loop_p loop, edge pe)
>                 {
>                   gassign *stmt2 = as_a <gassign *> (gsi_stmt (gsi2));
>                   gsi_remove (&gsi2, false);
> -                 /* Make sure to move invariant conversions out of the
> -                    loop.  */
> -                 if (CONVERT_EXPR_CODE_P (gimple_assign_rhs_code (stmt2))
> -                     && expr_invariant_in_loop_p (loop,
> -                                                  gimple_assign_rhs1 (stmt2)))
> -                   gsi_insert_on_edge_immediate (pe, stmt2);
> -                 else if (first)
> +                 if (first)
>                     {
>                       gsi_insert_before (&gsi, stmt2, GSI_NEW_STMT);
>                       first = false;
> @@ -2684,7 +2679,7 @@ remove_conditions_and_labels (loop_p loop)
>     blocks.  Replace PHI nodes with conditional modify expressions.  */
>
>  static void
> -combine_blocks (class loop *loop, edge pe)
> +combine_blocks (class loop *loop)
>  {
>    basic_block bb, exit_bb, merge_target_bb;
>    unsigned int orig_loop_num_nodes = loop->num_nodes;
> @@ -2697,7 +2692,7 @@ combine_blocks (class loop *loop, edge pe)
>    predicate_all_scalar_phis (loop);
>
>    if (need_to_predicate || need_to_rewrite_undefined)
> -    predicate_statements (loop, pe);
> +    predicate_statements (loop);
>
>    /* Merge basic blocks.  */
>    exit_bb = NULL;
> @@ -3181,6 +3176,99 @@ ifcvt_local_dce (class loop *loop)
>      }
>  }
>
> +/* Return true if VALUE is already available on edge PE.  */
> +
> +static bool
> +ifcvt_available_on_edge_p (edge pe, tree value)
> +{
> +  if (is_gimple_min_invariant (value))
> +    return true;
> +
> +  if (TREE_CODE (value) == SSA_NAME)
> +    {
> +      basic_block def_bb = gimple_bb (SSA_NAME_DEF_STMT (value));
> +      if (!def_bb || dominated_by_p (CDI_DOMINATORS, pe->dest, def_bb))
> +       return true;
> +    }
> +
> +  return false;
> +}
> +
> +/* Return true if STMT can be hoisted from if-converted loop LOOP to
> +   edge PE.  */
> +
> +static bool
> +ifcvt_can_hoist (class loop *loop, edge pe, gimple *stmt)
> +{
> +  if (auto *call = dyn_cast<gcall *> (stmt))
> +    {
> +      if (gimple_call_internal_p (call)
> +         && internal_fn_mask_index (gimple_call_internal_fn (call)) >= 0)
> +       return false;
> +    }
> +  else if (auto *assign = dyn_cast<gassign *> (stmt))
> +    {
> +      if (gimple_assign_rhs_code (assign) == COND_EXPR)
> +       return false;
> +    }
> +  else
> +    return false;
> +
> +  if (gimple_has_side_effects (stmt)
> +      || gimple_could_trap_p (stmt)
> +      || stmt_could_throw_p (cfun, stmt)
> +      || gimple_vdef (stmt)
> +      || gimple_vuse (stmt))
> +    return false;
> +
> +  int num_args = gimple_num_args (stmt);
> +  if (pe != loop_preheader_edge (loop))
> +    {
> +      for (int i = 0; i < num_args; ++i)
> +       if (!ifcvt_available_on_edge_p (pe, gimple_arg (stmt, i)))
> +         return false;
> +    }
> +  else
> +    {
> +      for (int i = 0; i < num_args; ++i)
> +       if (!expr_invariant_in_loop_p (loop, gimple_arg (stmt, i)))
> +         return false;
> +    }
> +
> +  return true;
> +}
> +
> +/* Hoist invariant statements from LOOP to edge PE.  */
> +
> +static void
> +ifcvt_hoist_invariants (class loop *loop, edge pe)
> +{
> +  gimple_stmt_iterator hoist_gsi = {};
> +  unsigned int num_blocks = loop->num_nodes;
> +  basic_block *body = get_loop_body (loop);
> +  for (unsigned int i = 0; i < num_blocks; ++i)
> +    for (gimple_stmt_iterator gsi = gsi_start_bb (body[i]); !gsi_end_p (gsi);)
> +      {
> +       gimple *stmt = gsi_stmt (gsi);
> +       if (ifcvt_can_hoist (loop, pe, stmt))
> +         {
> +           /* Once we've hoisted one statement, insert other statements
> +              after it.  */
> +           gsi_remove (&gsi, false);
> +           if (hoist_gsi.ptr)
> +             gsi_insert_after (&hoist_gsi, stmt, GSI_NEW_STMT);
> +           else
> +             {
> +               gsi_insert_on_edge_immediate (pe, stmt);
> +               hoist_gsi = gsi_for_stmt (stmt);
> +             }
> +           continue;
> +         }
> +       gsi_next (&gsi);
> +      }
> +  free (body);
> +}
> +
>  /* If-convert LOOP when it is legal.  For the moment this pass has no
>     profitability analysis.  Returns non-zero todo flags when something
>     changed.  */
> @@ -3275,7 +3363,7 @@ tree_if_conversion (class loop *loop, vec<gimple *> *preds)
>    /* Now all statements are if-convertible.  Combine all the basic
>       blocks into one huge basic block doing the if-conversion
>       on-the-fly.  */
> -  combine_blocks (loop, pe);
> +  combine_blocks (loop);
>
>    /* Perform local CSE, this esp. helps the vectorizer analysis if loads
>       and stores are involved.  CSE only the loop body, not the entry
> @@ -3297,6 +3385,8 @@ tree_if_conversion (class loop *loop, vec<gimple *> *preds)
>    ifcvt_local_dce (loop);
>    BITMAP_FREE (exit_bbs);
>
> +  ifcvt_hoist_invariants (loop, pe);
> +
>    todo |= TODO_cleanup_cfg;
>
>   cleanup:
> --
> 2.25.1
>

      reply	other threads:[~2021-11-19 10:09 UTC|newest]

Thread overview: 6+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-11-12 18:04 Richard Sandiford
2021-11-15 12:47 ` Richard Biener
2021-11-15 13:59   ` Richard Sandiford
2021-11-15 14:11     ` Richard Biener
2021-11-16 18:05       ` Richard Sandiford
2021-11-19 10:09         ` Richard Biener [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to='CAFiYyc3R_ao1y1netZzPSyJdn+=dTe8QfLP9-7=ZZZnL7a0ToA@mail.gmail.com' \
    --to=richard.guenther@gmail.com \
    --cc=gcc-patches@gcc.gnu.org \
    --cc=richard.sandiford@arm.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).