RE: [PATCH]middle-end: Fix dominators updates when peeling with multiple exits [PR113144]

public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed

From: Tamar Christina <Tamar.Christina@arm.com>
To: Richard Biener <rguenther@suse.de>
Cc: "gcc-patches@gcc.gnu.org" <gcc-patches@gcc.gnu.org>,
	nd <nd@arm.com>, "jlaw@ventanamicro.com" <jlaw@ventanamicro.com>
Subject: RE: [PATCH]middle-end: Fix dominators updates when peeling with multiple exits [PR113144]
Date: Tue, 9 Jan 2024 13:26:37 +0000	[thread overview]
Message-ID: <VI1PR08MB5325E8D2FF908B2062A09816FF6A2@VI1PR08MB5325.eurprd08.prod.outlook.com> (raw)
In-Reply-To: <8pn44s40-pqs9-s99r-6qs4-1p9432692p7q@fhfr.qr>



> -----Original Message-----
> From: Richard Biener <rguenther@suse.de>
> Sent: Tuesday, January 9, 2024 12:26 PM
> To: Tamar Christina <Tamar.Christina@arm.com>
> Cc: gcc-patches@gcc.gnu.org; nd <nd@arm.com>; jlaw@ventanamicro.com
> Subject: RE: [PATCH]middle-end: Fix dominators updates when peeling with
> multiple exits [PR113144]
> 
> On Tue, 9 Jan 2024, Tamar Christina wrote:
> 
> > > This makes it quadratic in the number of vectorized early exit loops
> > > in a function.  The vectorizer CFG manipulation operates in a local
> > > enough bubble that programmatic updating of dominators should be
> > > possible (after all we manage to produce correct SSA form!), the
> > > proposed change gets us too far off to a point where re-computating
> > > dominance info is likely cheaper (but no, we shouldn't do this either).
> > >
> > > Can you instead give manual updating a try again?  I think
> > > versioning should produce up-to-date dominator info, it's only
> > > when you redirect branches during peeling that you'd need
> > > adjustments - but IIRC we're never introducing new merges?
> > >
> > > IIRC we can't wipe dominators during transform since we query them
> > > during code generation.  We possibly could code generate all
> > > CFG manipulations of all vectorized loops, recompute all dominators
> > > and then do code generation of all vectorized loops.
> > >
> > > But then we're doing a loop transform and the exits will ultimatively
> > > end up in the same place, so the CFG and dominator update is bound to
> > > where the original exits went to.
> >
> > Yeah that's a fair point, the issue is specifically with at_exit.  So how about:
> >
> > When we peel at_exit we are moving the new loop at the exit of the previous
> > loop.  This means that the blocks outside the loop dat the previous loop used to
> > dominate are no longer being dominated by it.
> 
> Hmm, indeed.  Note this does make the dominator update O(function-size)
> and when vectorizing multiple loops in a function this becomes
> quadratic.  That's quite unfortunate so I wonder if we can delay the
> update to the parts we do not need up-to-date dominators during
> vectorization (of course it gets fragile with having only partly
> correct dominators).

Fair, I created https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113290 and will
tackle it when I add SLP support in GCC 15.

I think the problem is, and the reason we do early dominator correction and
validation is because the same function is used by loop distribution.

But you're right that during vectorization we perform dominators update twice
now.

So Maybe we should have a parameter to indicate whether dominators should
be updated?

Thanks,
Tamar

> 
> > The new dominators however are hard to predict since if the loop has multiple
> > exits and all the exits are an "early" one then we always execute the scalar
> > loop.  In this case the scalar loop can completely dominate the new loop.
> >
> > If we later have skip_vector then there's an additional skip edge added that
> > might change the dominators.
> >
> > The previous patch would force an update of all blocks reachable from the new
> > exits.  This one updates *only* blocks that we know the scalar exits dominated.
> >
> > For the examples this reduces the blocks to update from 18 to 3.
> >
> > Bootstrapped Regtested on aarch64-none-linux-gnu, x86_64-pc-linux-gnu
> > and no issues normally and with --enable-checking=release --enable-lto
> > --with-build-config=bootstrap-O3 --enable-checking=yes,rtl,extra.
> >
> > Ok for master?
> 
> See below.
> 
> > Thanks,
> > Tamar
> >
> > gcc/ChangeLog:
> >
> > 	PR tree-optimization/113144
> > 	PR tree-optimization/113145
> > 	* tree-vect-loop-manip.cc (slpeel_tree_duplicate_loop_to_edge_cfg):
> > 	Update all BB that the original exits dominated.
> >
> > gcc/testsuite/ChangeLog:
> >
> > 	PR tree-optimization/113144
> > 	PR tree-optimization/113145
> > 	* gcc.dg/vect/vect-early-break_94-pr113144.c: New test.
> >
> > --- inline copy of patch ---
> >
> > diff --git a/gcc/testsuite/gcc.dg/vect/vect-early-break_94-pr113144.c
> b/gcc/testsuite/gcc.dg/vect/vect-early-break_94-pr113144.c
> > new file mode 100644
> > index
> 0000000000000000000000000000000000000000..903fe7be6621e81db6f294
> 41e4309fa213d027c5
> > --- /dev/null
> > +++ b/gcc/testsuite/gcc.dg/vect/vect-early-break_94-pr113144.c
> > @@ -0,0 +1,41 @@
> > +/* { dg-do compile } */
> > +/* { dg-add-options vect_early_break } */
> > +/* { dg-require-effective-target vect_early_break } */
> > +/* { dg-require-effective-target vect_int } */
> > +
> > +/* { dg-final { scan-tree-dump "LOOP VECTORIZED" "vect" } } */
> > +
> > +long tar_atol256_max, tar_atol256_size, tar_atosl_min;
> > +char tar_atol256_s;
> > +void __errno_location();
> > +
> > +
> > +inline static long tar_atol256(long min) {
> > +  char c;
> > +  int sign;
> > +  c = tar_atol256_s;
> > +  sign = c;
> > +  while (tar_atol256_size) {
> > +    if (c != sign)
> > +      return sign ? min : tar_atol256_max;
> > +    c = tar_atol256_size--;
> > +  }
> > +  if ((c & 128) != (sign & 128))
> > +    return sign ? min : tar_atol256_max;
> > +  return 0;
> > +}
> > +
> > +inline static long tar_atol(long min) {
> > +  return tar_atol256(min);
> > +}
> > +
> > +long tar_atosl() {
> > +  long n = tar_atol(-1);
> > +  if (tar_atosl_min) {
> > +    __errno_location();
> > +    return 0;
> > +  }
> > +  if (n > 0)
> > +    return 0;
> > +  return n;
> > +}
> > diff --git a/gcc/tree-vect-loop-manip.cc b/gcc/tree-vect-loop-manip.cc
> > index
> 76d4979c0b3b374dcaacf6825a95a8714114a63b..9bacaa182a3919cae1cb99dfc
> 5ae4923e1f93376 100644
> > --- a/gcc/tree-vect-loop-manip.cc
> > +++ b/gcc/tree-vect-loop-manip.cc
> > @@ -1719,8 +1719,6 @@ slpeel_tree_duplicate_loop_to_edge_cfg (class loop
> *loop, edge loop_exit,
> >  	  /* Now link the alternative exits.  */
> >  	  if (multiple_exits_p)
> >  	    {
> > -	      set_immediate_dominator (CDI_DOMINATORS, new_preheader,
> > -				       main_loop_exit_block);
> >  	      for (auto gsi_from = gsi_start_phis (loop->header),
> >  		   gsi_to = gsi_start_phis (new_preheader);
> >  		   !gsi_end_p (gsi_from) && !gsi_end_p (gsi_to);
> > @@ -1776,7 +1774,14 @@ slpeel_tree_duplicate_loop_to_edge_cfg (class loop
> *loop, edge loop_exit,
> >  	{
> >  	  update_loop = new_loop;
> >  	  for (edge e : get_loop_exit_edges (loop))
> > -	    doms.safe_push (e->dest);
> > +	    {
> > +	      /* Basic blocks that the old loop dominated are now dominated by
> > +		 the new loop and so we have to update those.  */
> > +	      for (auto bb : get_all_dominated_blocks (CDI_DOMINATORS, e->src))
> > +		if (!flow_bb_inside_loop_p (loop, bb))
> > +		  doms.safe_push (bb);
> > +	      doms.safe_push (e->dest);
> > +	    }
> 
> I think you'll get duplicate blocks that way.  Maybe simplify this
> all by instead doing
> 
>           auto doms = get_all_dominated_blocks (CDI_DOMINATORS, loop->header);
>           for (unsigned i = 0; i < doms.length (); ++i)
>             if (flow_bb_inside_loop_p (loop, doms[i]))
>               doms.unordered_remove (i);
> 
> ?
> 
> OK with that change, but really we should see to avoid this
> quadraticness :/  It's probably not too bad right now given we have
> quite some restrictions on vectorizing loops with multiple exits,
> but I suggest you try an artificial testcase with the "same"
> loop repeated N times to see whether dominance compute creeps up
> in the profile.
> 
> Richard.

next prev parent reply	other threads:[~2024-01-09 13:26 UTC|newest]

Thread overview: 9+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-12-29 14:44 Tamar Christina
2024-01-08 12:19 ` Richard Biener
2024-01-09 11:47   ` Tamar Christina
2024-01-09 12:25     ` Richard Biener
2024-01-09 13:26       ` Tamar Christina [this message]
2024-01-09 13:34         ` Richard Biener
2024-01-09 13:50           ` Richard Biener
2024-01-09 13:58             ` Tamar Christina
2024-01-09 14:14               ` Richard Biener

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=VI1PR08MB5325E8D2FF908B2062A09816FF6A2@VI1PR08MB5325.eurprd08.prod.outlook.com \
    --to=tamar.christina@arm.com \
    --cc=gcc-patches@gcc.gnu.org \
    --cc=jlaw@ventanamicro.com \
    --cc=nd@arm.com \
    --cc=rguenther@suse.de \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).