public inbox for gcc@gcc.gnu.org
 help / color / mirror / Atom feed
* Successive hoisting and AVAIL_OUT in at least one successor heuristic
@ 2021-05-06  9:39 Prathamesh Kulkarni
  2021-05-06 10:13 ` Richard Biener
  0 siblings, 1 reply; 7+ messages in thread
From: Prathamesh Kulkarni @ 2021-05-06  9:39 UTC (permalink / raw)
  To: GCC Development, Richard Biener

Hi Richard,
I was just wondering if second (and higher) order hoistings may defeat
the "AVAIL_OUT in at least one successor heuristic" ?

For eg:
bb2:
if (cond1) goto bb3 else goto bb4;

bb3:
if (cond2) goto bb5 else goto bb6;

bb5:
return x + y;

bb6:
return x + y;

bb4:
if (cond3) goto bb7 else goto bb8;

bb7:
return x + y;

bb8:
return x + y;

In above CFG, IIUC, we will end up hoisting x + y in bb2 even if bb3
or bb4 don't originally
contain x + y since during first hoisting pass x + y will be hoisted
from bb5, bb6 into bb3,
and likewise from bb7, bb8 into bb4 and during second hoisting pass it
will hoist x + y into bb2, since x + y is now available in bb3 and
bb4.

This might become an issue if successive hoistings will end up
hoisting the expression "too far" resulting in long range hoisting ?
Also if we're hoisting from post-dom block in which case the
expression may not be truly ANTIC, and eventually end up being
inserted into a successor block by successive hoisting passes.

I was wondering if we should check that the expression is "originally"
in AVAIL_OUT in at least one successor of block to avoid considering
expressions inserted by hoisting / PRE ?
We could do that by "marking" blocks during first hoisting pass (or
before hoisting / PRE),
that intersect with AVAIL_OUT of at least one successor and use that
info to check the heuristic during successive hoisting passes ?

Thanks,
Prathamesh

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: Successive hoisting and AVAIL_OUT in at least one successor heuristic
  2021-05-06  9:39 Successive hoisting and AVAIL_OUT in at least one successor heuristic Prathamesh Kulkarni
@ 2021-05-06 10:13 ` Richard Biener
  2021-05-06 10:55   ` Prathamesh Kulkarni
  0 siblings, 1 reply; 7+ messages in thread
From: Richard Biener @ 2021-05-06 10:13 UTC (permalink / raw)
  To: Prathamesh Kulkarni; +Cc: GCC Development

On Thu, 6 May 2021, Prathamesh Kulkarni wrote:

> Hi Richard,
> I was just wondering if second (and higher) order hoistings may defeat
> the "AVAIL_OUT in at least one successor heuristic" ?
> 
> For eg:
> bb2:
> if (cond1) goto bb3 else goto bb4;
> 
> bb3:
> if (cond2) goto bb5 else goto bb6;
> 
> bb5:
> return x + y;
> 
> bb6:
> return x + y;
> 
> bb4:
> if (cond3) goto bb7 else goto bb8;
> 
> bb7:
> return x + y;
> 
> bb8:
> return x + y;
> 
> In above CFG, IIUC, we will end up hoisting x + y in bb2 even if bb3
> or bb4 don't originally
> contain x + y since during first hoisting pass x + y will be hoisted
> from bb5, bb6 into bb3,
> and likewise from bb7, bb8 into bb4 and during second hoisting pass it
> will hoist x + y into bb2, since x + y is now available in bb3 and
> bb4.

But that's intended - the logic is _supposed_ to do "everything
at once", thus repeated runs of hoisting should not hoist more.
Which is also why we no longer iterate hoist insertion.

> This might become an issue if successive hoistings will end up
> hoisting the expression "too far" resulting in long range hoisting ?

I think this "long range hoisting" argument is a red herring...

> Also if we're hoisting from post-dom block in which case the
> expression may not be truly ANTIC, and eventually end up being
> inserted into a successor block by successive hoisting passes.

But this shouldn't happen - can you show a testcase where it does?

> I was wondering if we should check that the expression is "originally"
> in AVAIL_OUT in at least one successor of block to avoid considering
> expressions inserted by hoisting / PRE ?

No, why should we?

Richard.

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: Successive hoisting and AVAIL_OUT in at least one successor heuristic
  2021-05-06 10:13 ` Richard Biener
@ 2021-05-06 10:55   ` Prathamesh Kulkarni
  2021-05-06 11:31     ` Richard Biener
  2021-05-06 13:21     ` Michael Matz
  0 siblings, 2 replies; 7+ messages in thread
From: Prathamesh Kulkarni @ 2021-05-06 10:55 UTC (permalink / raw)
  To: Richard Biener; +Cc: GCC Development

On Thu, 6 May 2021 at 15:43, Richard Biener <rguenther@suse.de> wrote:
>
> On Thu, 6 May 2021, Prathamesh Kulkarni wrote:
>
> > Hi Richard,
> > I was just wondering if second (and higher) order hoistings may defeat
> > the "AVAIL_OUT in at least one successor heuristic" ?
> >
> > For eg:
> > bb2:
> > if (cond1) goto bb3 else goto bb4;
> >
> > bb3:
> > if (cond2) goto bb5 else goto bb6;
> >
> > bb5:
> > return x + y;
> >
> > bb6:
> > return x + y;
> >
> > bb4:
> > if (cond3) goto bb7 else goto bb8;
> >
> > bb7:
> > return x + y;
> >
> > bb8:
> > return x + y;
> >
> > In above CFG, IIUC, we will end up hoisting x + y in bb2 even if bb3
> > or bb4 don't originally
> > contain x + y since during first hoisting pass x + y will be hoisted
> > from bb5, bb6 into bb3,
> > and likewise from bb7, bb8 into bb4 and during second hoisting pass it
> > will hoist x + y into bb2, since x + y is now available in bb3 and
> > bb4.
>
> But that's intended - the logic is _supposed_ to do "everything
> at once", thus repeated runs of hoisting should not hoist more.
> Which is also why we no longer iterate hoist insertion.
>
> > This might become an issue if successive hoistings will end up
> > hoisting the expression "too far" resulting in long range hoisting ?
>
> I think this "long range hoisting" argument is a red herring...
>
> > Also if we're hoisting from post-dom block in which case the
> > expression may not be truly ANTIC, and eventually end up being
> > inserted into a successor block by successive hoisting passes.
>
> But this shouldn't happen - can you show a testcase where it does?
Well, I was thinking of this test-case:

int f(int cond1, int cond2, int cond3, int x, int y)
{
  void f1();
  void f2(int);
  void f3();

  if (cond1)
    f1 ();
  else
    {
      if (cond2)
        f2 (x + y);
      else
        f3 ();
    }

  return x + y;
}

The input to PRE pass is:

f (int cond1, int cond2, int cond3, int x, int y)
{
  int _1;
  int _11;

  <bb 2> [local count: 1073741824]:
  if (cond1_3(D) != 0)
    goto <bb 3>; [33.00%]
  else
    goto <bb 4>; [67.00%]

  <bb 3> [local count: 354334800]:
  f1 ();
  goto <bb 7>; [100.00%]

  <bb 4> [local count: 719407025]:
  if (cond2_4(D) != 0)
    goto <bb 5>; [50.00%]
  else
    goto <bb 6>; [50.00%]

  <bb 5> [local count: 359703512]:
  _1 = x_7(D) + y_8(D);
  f2 (_1);
  goto <bb 7>; [100.00%]

  <bb 6> [local count: 359703512]:
  f3 ();

  <bb 7> [local count: 1073741824]:
  _11 = x_7(D) + y_8(D);
  return _11;
}

pre dump shows:
Inserting expression in block 4 for code hoisting:
{plus_expr,x_7(D),y_8(D)} (0001)
Inserted _12 = x_7(D) + y_8(D);
 in predecessor 4 (0001)
Found partial redundancy for expression {plus_expr,x_7(D),y_8(D)} (0001)
Inserted _13 = x_7(D) + y_8(D);
 in predecessor 3 (0001)
Created phi prephitmp_14 = PHI <_13(3), _12(5), _12(6)>
 in block 7 (0001)
Starting insert iteration 2
Inserting expression in block 2 for code hoisting:
{plus_expr,x_7(D),y_8(D)} (0001)
Inserted _15 = x_7(D) + y_8(D);
 in predecessor 2 (0001)

So IIUC, during first insert iteration, it is hoisting x + y into bb4
and PRE inserts x + y into bb3.
And during second iteration, it hoists x + y into bb2.
So we are effectively hoisting x + y from bb5, bb7 into bb2, which
doesn't seem to benefit other
two paths (bb2->bb3->bb7 and bb2->bb4->bb6->bb7) since they don't contain x + y.
I am not sure if we should be hoisting x + y into bb2 for this case ?

Thanks,
Prathamesh
>
> > I was wondering if we should check that the expression is "originally"
> > in AVAIL_OUT in at least one successor of block to avoid considering
> > expressions inserted by hoisting / PRE ?
>
> No, why should we?
>
> Richard.

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: Successive hoisting and AVAIL_OUT in at least one successor heuristic
  2021-05-06 10:55   ` Prathamesh Kulkarni
@ 2021-05-06 11:31     ` Richard Biener
  2021-05-07  6:55       ` Prathamesh Kulkarni
  2021-05-06 13:21     ` Michael Matz
  1 sibling, 1 reply; 7+ messages in thread
From: Richard Biener @ 2021-05-06 11:31 UTC (permalink / raw)
  To: Prathamesh Kulkarni; +Cc: GCC Development

On Thu, 6 May 2021, Prathamesh Kulkarni wrote:

> On Thu, 6 May 2021 at 15:43, Richard Biener <rguenther@suse.de> wrote:
> >
> > On Thu, 6 May 2021, Prathamesh Kulkarni wrote:
> >
> > > Hi Richard,
> > > I was just wondering if second (and higher) order hoistings may defeat
> > > the "AVAIL_OUT in at least one successor heuristic" ?
> > >
> > > For eg:
> > > bb2:
> > > if (cond1) goto bb3 else goto bb4;
> > >
> > > bb3:
> > > if (cond2) goto bb5 else goto bb6;
> > >
> > > bb5:
> > > return x + y;
> > >
> > > bb6:
> > > return x + y;
> > >
> > > bb4:
> > > if (cond3) goto bb7 else goto bb8;
> > >
> > > bb7:
> > > return x + y;
> > >
> > > bb8:
> > > return x + y;
> > >
> > > In above CFG, IIUC, we will end up hoisting x + y in bb2 even if bb3
> > > or bb4 don't originally
> > > contain x + y since during first hoisting pass x + y will be hoisted
> > > from bb5, bb6 into bb3,
> > > and likewise from bb7, bb8 into bb4 and during second hoisting pass it
> > > will hoist x + y into bb2, since x + y is now available in bb3 and
> > > bb4.
> >
> > But that's intended - the logic is _supposed_ to do "everything
> > at once", thus repeated runs of hoisting should not hoist more.
> > Which is also why we no longer iterate hoist insertion.
> >
> > > This might become an issue if successive hoistings will end up
> > > hoisting the expression "too far" resulting in long range hoisting ?
> >
> > I think this "long range hoisting" argument is a red herring...
> >
> > > Also if we're hoisting from post-dom block in which case the
> > > expression may not be truly ANTIC, and eventually end up being
> > > inserted into a successor block by successive hoisting passes.
> >
> > But this shouldn't happen - can you show a testcase where it does?
> Well, I was thinking of this test-case:
> 
> int f(int cond1, int cond2, int cond3, int x, int y)
> {
>   void f1();
>   void f2(int);
>   void f3();
> 
>   if (cond1)
>     f1 ();
>   else
>     {
>       if (cond2)
>         f2 (x + y);
>       else
>         f3 ();
>     }
> 
>   return x + y;
> }
> 
> The input to PRE pass is:
> 
> f (int cond1, int cond2, int cond3, int x, int y)
> {
>   int _1;
>   int _11;
> 
>   <bb 2> [local count: 1073741824]:
>   if (cond1_3(D) != 0)
>     goto <bb 3>; [33.00%]
>   else
>     goto <bb 4>; [67.00%]
> 
>   <bb 3> [local count: 354334800]:
>   f1 ();
>   goto <bb 7>; [100.00%]
> 
>   <bb 4> [local count: 719407025]:
>   if (cond2_4(D) != 0)
>     goto <bb 5>; [50.00%]
>   else
>     goto <bb 6>; [50.00%]
> 
>   <bb 5> [local count: 359703512]:
>   _1 = x_7(D) + y_8(D);
>   f2 (_1);
>   goto <bb 7>; [100.00%]
> 
>   <bb 6> [local count: 359703512]:
>   f3 ();
> 
>   <bb 7> [local count: 1073741824]:
>   _11 = x_7(D) + y_8(D);
>   return _11;
> }
> 
> pre dump shows:
> Inserting expression in block 4 for code hoisting:
> {plus_expr,x_7(D),y_8(D)} (0001)
> Inserted _12 = x_7(D) + y_8(D);
>  in predecessor 4 (0001)
> Found partial redundancy for expression {plus_expr,x_7(D),y_8(D)} (0001)
> Inserted _13 = x_7(D) + y_8(D);
>  in predecessor 3 (0001)

You are clearly looking at old GCC - GCC 11+ first do all PRE
insertion and only then do a _single_ hoist insert run.  But still
it happens exactly as designed - there's a partial redundancy
for the x + y add on BB7 predecessors so we insert in BB6 and BB3.
That in turn leaves us with a perfect opportunity for hoisting
where I don't see any reason do _not_ perform it since it saves
us two copies of x + y.

Trunk:

Starting iteration 1
Starting insert iteration 1
Found partial redundancy for expression {plus_expr,x_7(D),y_8(D)} (0001)
Inserted _12 = x_7(D) + y_8(D);
 in predecessor 3 (0001)
Inserted _13 = x_7(D) + y_8(D);
 in predecessor 6 (0001)
Created phi prephitmp_14 = PHI <_12(3), _1(5), _13(6)>
 in block 7 (0001)
Inserting expression in block 4 for code hoisting: 
{plus_expr,x_7(D),y_8(D)} (0001)
Inserted _15 = x_7(D) + y_8(D);
 in predecessor 4 (0001)
Inserting expression in block 2 for code hoisting: 
{plus_expr,x_7(D),y_8(D)} (0001)
Inserted _16 = x_7(D) + y_8(D);
 in predecessor 2 (0001)

(the hoist insert into BB4 is spurious)

> Created phi prephitmp_14 = PHI <_13(3), _12(5), _12(6)>
>  in block 7 (0001)
> Starting insert iteration 2
> Inserting expression in block 2 for code hoisting:
> {plus_expr,x_7(D),y_8(D)} (0001)
> Inserted _15 = x_7(D) + y_8(D);
>  in predecessor 2 (0001)
> 
> So IIUC, during first insert iteration, it is hoisting x + y into bb4
> and PRE inserts x + y into bb3.
> And during second iteration, it hoists x + y into bb2.
> So we are effectively hoisting x + y from bb5, bb7 into bb2, which
> doesn't seem to benefit other
> two paths (bb2->bb3->bb7 and bb2->bb4->bb6->bb7) since they don't contain x + y.
> I am not sure if we should be hoisting x + y into bb2 for this case ?

We should.  We're optimizing the partially redundant compute in BB5
and the insertion point with the least amount of code generated is
in BB2.

Richard.

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: Successive hoisting and AVAIL_OUT in at least one successor heuristic
  2021-05-06 10:55   ` Prathamesh Kulkarni
  2021-05-06 11:31     ` Richard Biener
@ 2021-05-06 13:21     ` Michael Matz
  2021-05-07  7:33       ` Prathamesh Kulkarni
  1 sibling, 1 reply; 7+ messages in thread
From: Michael Matz @ 2021-05-06 13:21 UTC (permalink / raw)
  To: Prathamesh Kulkarni; +Cc: Richard Biener, GCC Development

Hello,

On Thu, 6 May 2021, Prathamesh Kulkarni via Gcc wrote:

> Well, I was thinking of this test-case:
> 
> int f(int cond1, int cond2, int cond3, int x, int y)
> {
>   void f1();
>   void f2(int);
>   void f3();
> 
>   if (cond1)
>     f1 ();
>   else
>     {
>       if (cond2)
>         f2 (x + y);
>       else
>         f3 ();
>     }
> 
>   return x + y;
> }
> ...
> And during second iteration, it hoists x + y into bb2. So we are 
> effectively hoisting x + y from bb5, bb7 into bb2, which doesn't seem to 
> benefit other two paths (bb2->bb3->bb7 and bb2->bb4->bb6->bb7) since 
> they don't contain x + y.

But bb7 eventually does, so it also doesn't make those paths worse.  
That's the nature of partial redundancy elimination: it doesn't require 
benefits on all paths, only on one of them.  That's in difference to full 
redundancy eliminaton.

As with all hoistings it can increase register pressure.  The counter 
measure to this is not to limit the hoisting (because that can have 
ripple down effects when some hoists don't happen anymore), but rather to 
tackle the register pressure problem somewhen later, in the register 
allocator (or some preparatory pass).  But I'm not even sure if this is 
the reason you're wondering about how PRE hoists.


Ciao,
Michael.

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: Successive hoisting and AVAIL_OUT in at least one successor heuristic
  2021-05-06 11:31     ` Richard Biener
@ 2021-05-07  6:55       ` Prathamesh Kulkarni
  0 siblings, 0 replies; 7+ messages in thread
From: Prathamesh Kulkarni @ 2021-05-07  6:55 UTC (permalink / raw)
  To: Richard Biener; +Cc: GCC Development

On Thu, 6 May 2021 at 17:01, Richard Biener <rguenther@suse.de> wrote:
>
> On Thu, 6 May 2021, Prathamesh Kulkarni wrote:
>
> > On Thu, 6 May 2021 at 15:43, Richard Biener <rguenther@suse.de> wrote:
> > >
> > > On Thu, 6 May 2021, Prathamesh Kulkarni wrote:
> > >
> > > > Hi Richard,
> > > > I was just wondering if second (and higher) order hoistings may defeat
> > > > the "AVAIL_OUT in at least one successor heuristic" ?
> > > >
> > > > For eg:
> > > > bb2:
> > > > if (cond1) goto bb3 else goto bb4;
> > > >
> > > > bb3:
> > > > if (cond2) goto bb5 else goto bb6;
> > > >
> > > > bb5:
> > > > return x + y;
> > > >
> > > > bb6:
> > > > return x + y;
> > > >
> > > > bb4:
> > > > if (cond3) goto bb7 else goto bb8;
> > > >
> > > > bb7:
> > > > return x + y;
> > > >
> > > > bb8:
> > > > return x + y;
> > > >
> > > > In above CFG, IIUC, we will end up hoisting x + y in bb2 even if bb3
> > > > or bb4 don't originally
> > > > contain x + y since during first hoisting pass x + y will be hoisted
> > > > from bb5, bb6 into bb3,
> > > > and likewise from bb7, bb8 into bb4 and during second hoisting pass it
> > > > will hoist x + y into bb2, since x + y is now available in bb3 and
> > > > bb4.
> > >
> > > But that's intended - the logic is _supposed_ to do "everything
> > > at once", thus repeated runs of hoisting should not hoist more.
> > > Which is also why we no longer iterate hoist insertion.
> > >
> > > > This might become an issue if successive hoistings will end up
> > > > hoisting the expression "too far" resulting in long range hoisting ?
> > >
> > > I think this "long range hoisting" argument is a red herring...
> > >
> > > > Also if we're hoisting from post-dom block in which case the
> > > > expression may not be truly ANTIC, and eventually end up being
> > > > inserted into a successor block by successive hoisting passes.
> > >
> > > But this shouldn't happen - can you show a testcase where it does?
> > Well, I was thinking of this test-case:
> >
> > int f(int cond1, int cond2, int cond3, int x, int y)
> > {
> >   void f1();
> >   void f2(int);
> >   void f3();
> >
> >   if (cond1)
> >     f1 ();
> >   else
> >     {
> >       if (cond2)
> >         f2 (x + y);
> >       else
> >         f3 ();
> >     }
> >
> >   return x + y;
> > }
> >
> > The input to PRE pass is:
> >
> > f (int cond1, int cond2, int cond3, int x, int y)
> > {
> >   int _1;
> >   int _11;
> >
> >   <bb 2> [local count: 1073741824]:
> >   if (cond1_3(D) != 0)
> >     goto <bb 3>; [33.00%]
> >   else
> >     goto <bb 4>; [67.00%]
> >
> >   <bb 3> [local count: 354334800]:
> >   f1 ();
> >   goto <bb 7>; [100.00%]
> >
> >   <bb 4> [local count: 719407025]:
> >   if (cond2_4(D) != 0)
> >     goto <bb 5>; [50.00%]
> >   else
> >     goto <bb 6>; [50.00%]
> >
> >   <bb 5> [local count: 359703512]:
> >   _1 = x_7(D) + y_8(D);
> >   f2 (_1);
> >   goto <bb 7>; [100.00%]
> >
> >   <bb 6> [local count: 359703512]:
> >   f3 ();
> >
> >   <bb 7> [local count: 1073741824]:
> >   _11 = x_7(D) + y_8(D);
> >   return _11;
> > }
> >
> > pre dump shows:
> > Inserting expression in block 4 for code hoisting:
> > {plus_expr,x_7(D),y_8(D)} (0001)
> > Inserted _12 = x_7(D) + y_8(D);
> >  in predecessor 4 (0001)
> > Found partial redundancy for expression {plus_expr,x_7(D),y_8(D)} (0001)
> > Inserted _13 = x_7(D) + y_8(D);
> >  in predecessor 3 (0001)
>
> You are clearly looking at old GCC - GCC 11+ first do all PRE
Oops, sorry! I was looking at an old branch.
> insertion and only then do a _single_ hoist insert run.  But still
> it happens exactly as designed - there's a partial redundancy
> for the x + y add on BB7 predecessors so we insert in BB6 and BB3.
> That in turn leaves us with a perfect opportunity for hoisting
> where I don't see any reason do _not_ perform it since it saves
> us two copies of x + y.
>
> Trunk:
>
> Starting iteration 1
> Starting insert iteration 1
> Found partial redundancy for expression {plus_expr,x_7(D),y_8(D)} (0001)
> Inserted _12 = x_7(D) + y_8(D);
>  in predecessor 3 (0001)
> Inserted _13 = x_7(D) + y_8(D);
>  in predecessor 6 (0001)
> Created phi prephitmp_14 = PHI <_12(3), _1(5), _13(6)>
>  in block 7 (0001)
> Inserting expression in block 4 for code hoisting:
> {plus_expr,x_7(D),y_8(D)} (0001)
> Inserted _15 = x_7(D) + y_8(D);
>  in predecessor 4 (0001)
> Inserting expression in block 2 for code hoisting:
> {plus_expr,x_7(D),y_8(D)} (0001)
> Inserted _16 = x_7(D) + y_8(D);
>  in predecessor 2 (0001)
>
> (the hoist insert into BB4 is spurious)
>
> > Created phi prephitmp_14 = PHI <_13(3), _12(5), _12(6)>
> >  in block 7 (0001)
> > Starting insert iteration 2
> > Inserting expression in block 2 for code hoisting:
> > {plus_expr,x_7(D),y_8(D)} (0001)
> > Inserted _15 = x_7(D) + y_8(D);
> >  in predecessor 2 (0001)
> >
> > So IIUC, during first insert iteration, it is hoisting x + y into bb4
> > and PRE inserts x + y into bb3.
> > And during second iteration, it hoists x + y into bb2.
> > So we are effectively hoisting x + y from bb5, bb7 into bb2, which
> > doesn't seem to benefit other
> > two paths (bb2->bb3->bb7 and bb2->bb4->bb6->bb7) since they don't contain x + y.
> > I am not sure if we should be hoisting x + y into bb2 for this case ?
>
> We should.  We're optimizing the partially redundant compute in BB5
> and the insertion point with the least amount of code generated is
> in BB2.
Right, thanks for the clarification and sorry for the noise.

Regards,
Prathamesh
>
> Richard.

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: Successive hoisting and AVAIL_OUT in at least one successor heuristic
  2021-05-06 13:21     ` Michael Matz
@ 2021-05-07  7:33       ` Prathamesh Kulkarni
  0 siblings, 0 replies; 7+ messages in thread
From: Prathamesh Kulkarni @ 2021-05-07  7:33 UTC (permalink / raw)
  To: Michael Matz; +Cc: Richard Biener, GCC Development

On Thu, 6 May 2021 at 18:51, Michael Matz <matz@suse.de> wrote:
>
> Hello,
>
> On Thu, 6 May 2021, Prathamesh Kulkarni via Gcc wrote:
>
> > Well, I was thinking of this test-case:
> >
> > int f(int cond1, int cond2, int cond3, int x, int y)
> > {
> >   void f1();
> >   void f2(int);
> >   void f3();
> >
> >   if (cond1)
> >     f1 ();
> >   else
> >     {
> >       if (cond2)
> >         f2 (x + y);
> >       else
> >         f3 ();
> >     }
> >
> >   return x + y;
> > }
> > ...
> > And during second iteration, it hoists x + y into bb2. So we are
> > effectively hoisting x + y from bb5, bb7 into bb2, which doesn't seem to
> > benefit other two paths (bb2->bb3->bb7 and bb2->bb4->bb6->bb7) since
> > they don't contain x + y.
>
> But bb7 eventually does, so it also doesn't make those paths worse.
> That's the nature of partial redundancy elimination: it doesn't require
> benefits on all paths, only on one of them.  That's in difference to full
> redundancy eliminaton.
>
> As with all hoistings it can increase register pressure.  The counter
> measure to this is not to limit the hoisting (because that can have
> ripple down effects when some hoists don't happen anymore), but rather to
> tackle the register pressure problem somewhen later, in the register
> allocator (or some preparatory pass).  But I'm not even sure if this is
> the reason you're wondering about how PRE hoists.
Hi Michael,
Yes, my concern was primarily w.r.t increased register pressure, and
was trying to think
of a "hack" that will disable hoisting in block if it could
potentially increase chances of spills,
offsetting it's benefit (altho the issue isn't limited to hoisting, I
was special casing it because hoisting
seems to regress a couple of benchmarks on arm and aarch64).
But yes, that doesn't look like the right approach.

Thanks,
Prathamesh
>
>
> Ciao,
> Michael.

^ permalink raw reply	[flat|nested] 7+ messages in thread

end of thread, other threads:[~2021-05-07  7:34 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-05-06  9:39 Successive hoisting and AVAIL_OUT in at least one successor heuristic Prathamesh Kulkarni
2021-05-06 10:13 ` Richard Biener
2021-05-06 10:55   ` Prathamesh Kulkarni
2021-05-06 11:31     ` Richard Biener
2021-05-07  6:55       ` Prathamesh Kulkarni
2021-05-06 13:21     ` Michael Matz
2021-05-07  7:33       ` Prathamesh Kulkarni

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).