public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed
From: Richard Biener <richard.guenther@gmail.com>
To: Richard Biener <richard.guenther@gmail.com>,
	Szabolcs Nagy <szabolcs.nagy@arm.com>,
	 gcc-patches@gcc.gnu.org, Kyrylo.Tkachov@arm.com,
	richard.earnshaw@arm.com,  richard.sandiford@arm.com
Subject: Re: [PATCH 02/11] Handle epilogues that contain jumps
Date: Tue, 17 Oct 2023 11:19:27 +0200	[thread overview]
Message-ID: <CAFiYyc2Fm7vRL12=P-qv-Guw9cyR9pJV=Ec=+G2LGXdXeKPWtA@mail.gmail.com> (raw)
In-Reply-To: <mpth6mwcl4t.fsf@arm.com>

On Thu, Oct 12, 2023 at 10:15 AM Richard Sandiford
<richard.sandiford@arm.com> wrote:
>
> Richard Biener <richard.guenther@gmail.com> writes:
> > On Tue, Aug 22, 2023 at 12:42 PM Szabolcs Nagy via Gcc-patches
> > <gcc-patches@gcc.gnu.org> wrote:
> >>
> >> From: Richard Sandiford <richard.sandiford@arm.com>
> >>
> >> The prologue/epilogue pass allows the prologue sequence
> >> to contain jumps.  The sequence is then partitioned into
> >> basic blocks using find_many_sub_basic_blocks.
> >>
> >> This patch treats epilogues in the same way.  It's needed for
> >> a follow-on aarch64 patch that adds conditional code to both
> >> the prologue and the epilogue.
> >>
> >> Tested on aarch64-linux-gnu (including with a follow-on patch)
> >> and x86_64-linux-gnu.  OK to install?
> >>
> >> Richard
> >>
> >> gcc/
> >>         * function.cc (thread_prologue_and_epilogue_insns): Handle
> >>         epilogues that contain jumps.
> >> ---
> >>
> >> This is a previously approved patch that was not committed
> >> because it was not needed at the time, but i'd like to commit
> >> it as it is needed for the followup aarch64 eh_return changes:
> >>
> >> https://gcc.gnu.org/pipermail/gcc-patches/2022-November/605769.html
> >>
> >> ---
> >>  gcc/function.cc | 10 ++++++++++
> >>  1 file changed, 10 insertions(+)
> >>
> >> diff --git a/gcc/function.cc b/gcc/function.cc
> >> index dd2c1136e07..70d1cd65303 100644
> >> --- a/gcc/function.cc
> >> +++ b/gcc/function.cc
> >> @@ -6120,6 +6120,11 @@ thread_prologue_and_epilogue_insns (void)
> >>                   && returnjump_p (BB_END (e->src)))
> >>                 e->flags &= ~EDGE_FALLTHRU;
> >>             }
> >> +
> >> +         auto_sbitmap blocks (last_basic_block_for_fn (cfun));
> >> +         bitmap_clear (blocks);
> >> +           bitmap_set_bit (blocks, BLOCK_FOR_INSN (epilogue_seq)->index);
> >> +         find_many_sub_basic_blocks (blocks);
> >>         }
> >>        else if (next_active_insn (BB_END (exit_fallthru_edge->src)))
> >>         {
> >> @@ -6218,6 +6223,11 @@ thread_prologue_and_epilogue_insns (void)
> >>           set_insn_locations (seq, epilogue_location);
> >>
> >>           emit_insn_before (seq, insn);
> >> +
> >> +         auto_sbitmap blocks (last_basic_block_for_fn (cfun));
> >> +         bitmap_clear (blocks);
> >> +         bitmap_set_bit (blocks, BLOCK_FOR_INSN (insn)->index);
> >> +         find_many_sub_basic_blocks (blocks);
> >
> > I'll note that clearing a full sbitmap to pass down a single basic block
> > to find_many_sub_basic_blocks is a quite expensive operation.  May I suggest
> > to add an overload operating on a single basic block?  It's only
> >
> >   FOR_EACH_BB_FN (bb, cfun)
> >     SET_STATE (bb,
> >                bitmap_bit_p (blocks, bb->index) ? BLOCK_TO_SPLIT :
> > BLOCK_ORIGINAL);
> >
> > using the bitmap, so factoring the rest of the function and customizing this
> > walk would do the trick.  Note that the whole function could be refactored to
> > handle single blocks more efficiently.
>
> Sorry for the late reply, but does this look OK?  Tested on
> aarch64-linux-gnu and x86_64-linux-gnu.

LGTM, not sure if I'm qualified enough to approve though (I think you
are more qualified here, so ..)

Thanks,
Richard.

> Thanks,
> Richard
>
> ---
>
> The prologue/epilogue pass allows the prologue sequence to contain
> jumps.  The sequence is then partitioned into basic blocks using
> find_many_sub_basic_blocks.
>
> This patch treats epilogues in a similar way.  Since only one block
> might need to be split, the patch (re)introduces a find_sub_basic_blocks
> routine to handle a single block.
>
> The new routine hard-codes the assumption that split_block will chain
> the new block immediately after the original block.  The routine doesn't
> try to replicate the fix for PR81030, since that was specific to
> gimple->rtl expansion.
>
> The patch is needed for follow-on aarch64 patches that add conditional
> code to the epilogue.  The tests are part of those patches.
>
> gcc/
>         * cfgbuild.h (find_sub_basic_blocks): Declare.
>         * cfgbuild.cc (update_profile_for_new_sub_basic_block): New function,
>         split out from...
>         (find_many_sub_basic_blocks): ...here.
>         (find_sub_basic_blocks): New function.
>         * function.cc (thread_prologue_and_epilogue_insns): Handle
>         epilogues that contain jumps.
> ---
>  gcc/cfgbuild.cc | 95 +++++++++++++++++++++++++++++++++----------------
>  gcc/cfgbuild.h  |  1 +
>  gcc/function.cc |  4 +++
>  3 files changed, 70 insertions(+), 30 deletions(-)
>
> diff --git a/gcc/cfgbuild.cc b/gcc/cfgbuild.cc
> index 15ed4deb5f7..9a6b34fb4b1 100644
> --- a/gcc/cfgbuild.cc
> +++ b/gcc/cfgbuild.cc
> @@ -693,6 +693,43 @@ compute_outgoing_frequencies (basic_block b)
>      }
>  }
>
> +/* Update the profile information for BB, which was created by splitting
> +   an RTL block that had a non-final jump.  */
> +
> +static void
> +update_profile_for_new_sub_basic_block (basic_block bb)
> +{
> +  edge e;
> +  edge_iterator ei;
> +
> +  bool initialized_src = false, uninitialized_src = false;
> +  bb->count = profile_count::zero ();
> +  FOR_EACH_EDGE (e, ei, bb->preds)
> +    {
> +      if (e->count ().initialized_p ())
> +       {
> +         bb->count += e->count ();
> +         initialized_src = true;
> +       }
> +      else
> +       uninitialized_src = true;
> +    }
> +  /* When some edges are missing with read profile, this is
> +     most likely because RTL expansion introduced loop.
> +     When profile is guessed we may have BB that is reachable
> +     from unlikely path as well as from normal path.
> +
> +     TODO: We should handle loops created during BB expansion
> +     correctly here.  For now we assume all those loop to cycle
> +     precisely once.  */
> +  if (!initialized_src
> +      || (uninitialized_src
> +          && profile_status_for_fn (cfun) < PROFILE_GUESSED))
> +    bb->count = profile_count::uninitialized ();
> +
> +  compute_outgoing_frequencies (bb);
> +}
> +
>  /* Assume that some pass has inserted labels or control flow
>     instructions within a basic block.  Split basic blocks as needed
>     and create edges.  */
> @@ -744,40 +781,15 @@ find_many_sub_basic_blocks (sbitmap blocks)
>    if (profile_status_for_fn (cfun) != PROFILE_ABSENT)
>      FOR_BB_BETWEEN (bb, min, max->next_bb, next_bb)
>        {
> -       edge e;
> -       edge_iterator ei;
> -
>         if (STATE (bb) == BLOCK_ORIGINAL)
>           continue;
>         if (STATE (bb) == BLOCK_NEW)
>           {
> -           bool initialized_src = false, uninitialized_src = false;
> -           bb->count = profile_count::zero ();
> -           FOR_EACH_EDGE (e, ei, bb->preds)
> -             {
> -               if (e->count ().initialized_p ())
> -                 {
> -                   bb->count += e->count ();
> -                   initialized_src = true;
> -                 }
> -               else
> -                 uninitialized_src = true;
> -             }
> -           /* When some edges are missing with read profile, this is
> -              most likely because RTL expansion introduced loop.
> -              When profile is guessed we may have BB that is reachable
> -              from unlikely path as well as from normal path.
> -
> -              TODO: We should handle loops created during BB expansion
> -              correctly here.  For now we assume all those loop to cycle
> -              precisely once.  */
> -           if (!initialized_src
> -               || (uninitialized_src
> -                    && profile_status_for_fn (cfun) < PROFILE_GUESSED))
> -             bb->count = profile_count::uninitialized ();
> +           update_profile_for_new_sub_basic_block (bb);
> +           continue;
>           }
> -       /* If nothing changed, there is no need to create new BBs.  */
> -       else if (EDGE_COUNT (bb->succs) == n_succs[bb->index])
> +       /* If nothing changed, there is no need to create new BBs.  */
> +       if (EDGE_COUNT (bb->succs) == n_succs[bb->index])
>           {
>             /* In rare occassions RTL expansion might have mistakely assigned
>                a probabilities different from what is in CFG.  This happens
> @@ -788,10 +800,33 @@ find_many_sub_basic_blocks (sbitmap blocks)
>               update_br_prob_note (bb);
>             continue;
>           }
> -
>         compute_outgoing_frequencies (bb);
>        }
>
>    FOR_EACH_BB_FN (bb, cfun)
>      SET_STATE (bb, 0);
>  }
> +
> +/* Like find_many_sub_basic_blocks, but look only within BB.  */
> +
> +void
> +find_sub_basic_blocks (basic_block bb)
> +{
> +  basic_block end_bb = bb->next_bb;
> +  find_bb_boundaries (bb);
> +  if (bb->next_bb == end_bb)
> +    return;
> +
> +  /* Re-scan and wire in all edges.  This expects simple (conditional)
> +     jumps at the end of each new basic blocks.  */
> +  make_edges (bb, end_bb->prev_bb, 1);
> +
> +  /* Update branch probabilities.  Expect only (un)conditional jumps
> +     to be created with only the forward edges.  */
> +  if (profile_status_for_fn (cfun) != PROFILE_ABSENT)
> +    {
> +      compute_outgoing_frequencies (bb);
> +      for (bb = bb->next_bb; bb != end_bb; bb = bb->next_bb)
> +       update_profile_for_new_sub_basic_block (bb);
> +    }
> +}
> diff --git a/gcc/cfgbuild.h b/gcc/cfgbuild.h
> index 51d3eccb1ae..4191fb3fcba 100644
> --- a/gcc/cfgbuild.h
> +++ b/gcc/cfgbuild.h
> @@ -24,5 +24,6 @@ extern bool inside_basic_block_p (const rtx_insn *);
>  extern bool control_flow_insn_p (const rtx_insn *);
>  extern void rtl_make_eh_edge (sbitmap, basic_block, rtx);
>  extern void find_many_sub_basic_blocks (sbitmap);
> +extern void find_sub_basic_blocks (basic_block);
>
>  #endif /* GCC_CFGBUILD_H */
> diff --git a/gcc/function.cc b/gcc/function.cc
> index 336af28fb22..afb0b33da9e 100644
> --- a/gcc/function.cc
> +++ b/gcc/function.cc
> @@ -6112,6 +6112,8 @@ thread_prologue_and_epilogue_insns (void)
>                   && returnjump_p (BB_END (e->src)))
>                 e->flags &= ~EDGE_FALLTHRU;
>             }
> +
> +         find_sub_basic_blocks (BLOCK_FOR_INSN (epilogue_seq));
>         }
>        else if (next_active_insn (BB_END (exit_fallthru_edge->src)))
>         {
> @@ -6210,6 +6212,8 @@ thread_prologue_and_epilogue_insns (void)
>           set_insn_locations (seq, epilogue_location);
>
>           emit_insn_before (seq, insn);
> +
> +         find_sub_basic_blocks (BLOCK_FOR_INSN (insn));
>         }
>      }
>
> --
> 2.25.1
>

  reply	other threads:[~2023-10-17  9:22 UTC|newest]

Thread overview: 26+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-08-22 10:38 [PATCH 00/11] aarch64 GCS preliminary patches Szabolcs Nagy
2023-08-22 10:38 ` [PATCH 01/11] aarch64: AARCH64_ISA_RCPC was defined twice Szabolcs Nagy
2023-09-05 14:30   ` Richard Sandiford
2023-08-22 10:38 ` [PATCH 02/11] Handle epilogues that contain jumps Szabolcs Nagy
2023-08-22 11:03   ` Richard Biener
2023-10-12  8:14     ` Richard Sandiford
2023-10-17  9:19       ` Richard Biener [this message]
2023-10-19 15:16         ` Jeff Law
2023-08-22 10:38 ` [PATCH 03/11] aarch64: Use br instead of ret for eh_return Szabolcs Nagy
2023-08-23  9:28   ` Richard Sandiford
2023-08-24  9:43     ` Richard Sandiford
2023-08-22 10:38 ` [PATCH 04/11] aarch64: Do not force a stack frame for EH returns Szabolcs Nagy
2023-09-05 14:33   ` Richard Sandiford
2023-08-22 10:38 ` [PATCH 05/11] aarch64: Add eh_return compile tests Szabolcs Nagy
2023-09-05 14:43   ` Richard Sandiford
2023-08-22 10:38 ` [PATCH 06/11] aarch64: Fix pac-ret eh_return tests Szabolcs Nagy
2023-09-05 14:56   ` Richard Sandiford
2023-08-22 10:38 ` [PATCH 07/11] aarch64: Disable branch-protection for pcs tests Szabolcs Nagy
2023-09-05 14:58   ` Richard Sandiford
2023-08-22 10:39 ` [PATCH 08/11] aarch64,arm: Remove accepted_branch_protection_string Szabolcs Nagy
2023-08-22 10:39 ` [PATCH 09/11] aarch64,arm: Fix branch-protection= parsing Szabolcs Nagy
2023-08-22 10:39 ` [PATCH 10/11] aarch64: Fix branch-protection error message tests Szabolcs Nagy
2023-09-05 15:00   ` Richard Sandiford
2023-10-13 10:29     ` Richard Earnshaw (lists)
2023-10-23 12:28       ` Szabolcs Nagy
2023-08-22 10:39 ` [PATCH 11/11] aarch64,arm: Move branch-protection data to targets Szabolcs Nagy

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to='CAFiYyc2Fm7vRL12=P-qv-Guw9cyR9pJV=Ec=+G2LGXdXeKPWtA@mail.gmail.com' \
    --to=richard.guenther@gmail.com \
    --cc=Kyrylo.Tkachov@arm.com \
    --cc=gcc-patches@gcc.gnu.org \
    --cc=richard.earnshaw@arm.com \
    --cc=richard.sandiford@arm.com \
    --cc=szabolcs.nagy@arm.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).