public inbox for gcc@gcc.gnu.org
 help / color / mirror / Atom feed
* [QUESTION] About RTL optimization at forward propagation
@ 2020-03-28  3:18 xiezhiheng
  2020-03-30  8:38 ` Richard Biener
  0 siblings, 1 reply; 3+ messages in thread
From: xiezhiheng @ 2020-03-28  3:18 UTC (permalink / raw)
  To: gcc

Hi,
  I find there exists some restricts in function fwprop preventing it to forward propagate addresses into loops.
/* Go through all the uses.  df_uses_create will create new ones at the
   end, and we'll go through them as well.

   Do not forward propagate addresses into loops until after unrolling.
   CSE did so because it was able to fix its own mess, but we are not.  */
for (i = 0; i < DF_USES_TABLE_SIZE (); i++)
  {
    if (!propagations_left)
      break;

    df_ref use = DF_USES_GET (i);
    if (use)
      {
        if (DF_REF_TYPE (use) == DF_REF_REG_USE
            || DF_REF_BB (use)->loop_father == NULL                    <<<<<<<<<<
            /* The outer most loop is not really a loop.  */
            || loop_outer (DF_REF_BB (use)->loop_father) == NULL)
          forward_propagate_into (use, fwprop_addr_p);

        else if (fwprop_addr_p)
          forward_propagate_into (use, false);
      }
  }

  And I have two questions.
  1) What are the reasons or background for not forward propagating addresses into loops ?
  2) Can we still forward propagate addresses if the def and use are in the same loop ?
  I mean something like:
diff -Nurp a/gcc/fwprop.c b/gcc/fwprop.c
--- a/gcc/fwprop.c      2020-03-27 03:17:50.704000000 -0400
+++ b/gcc/fwprop.c      2020-03-27 04:58:35.148000000 -0400
@@ -1573,10 +1573,12 @@ fwprop (bool fwprop_addr_p)
       df_ref use = DF_USES_GET (i);
       if (use)
        {
+         df_ref def = get_def_for_use (use);
          if (DF_REF_TYPE (use) == DF_REF_REG_USE
              || DF_REF_BB (use)->loop_father == NULL
              /* The outer most loop is not really a loop.  */
-             || loop_outer (DF_REF_BB (use)->loop_father) == NULL)
+             || loop_outer (DF_REF_BB (use)->loop_father) == NULL
+             || (def && DF_REF_BB (def)->loop_father == DF_REF_BB (use)->loop_father))
            forward_propagate_into (use, fwprop_addr_p);

          else if (fwprop_addr_p)

I would be grateful if anyone could help.

Best regards

^ permalink raw reply	[flat|nested] 3+ messages in thread

* Re: [QUESTION] About RTL optimization at forward propagation
  2020-03-28  3:18 [QUESTION] About RTL optimization at forward propagation xiezhiheng
@ 2020-03-30  8:38 ` Richard Biener
  2020-03-31  6:52   ` xiezhiheng
  0 siblings, 1 reply; 3+ messages in thread
From: Richard Biener @ 2020-03-30  8:38 UTC (permalink / raw)
  To: xiezhiheng; +Cc: gcc

On Sat, Mar 28, 2020 at 4:19 AM xiezhiheng <xiezhiheng@huawei.com> wrote:
>
> Hi,
>   I find there exists some restricts in function fwprop preventing it to forward propagate addresses into loops.
> /* Go through all the uses.  df_uses_create will create new ones at the
>    end, and we'll go through them as well.
>
>    Do not forward propagate addresses into loops until after unrolling.
>    CSE did so because it was able to fix its own mess, but we are not.  */
> for (i = 0; i < DF_USES_TABLE_SIZE (); i++)
>   {
>     if (!propagations_left)
>       break;
>
>     df_ref use = DF_USES_GET (i);
>     if (use)
>       {
>         if (DF_REF_TYPE (use) == DF_REF_REG_USE
>             || DF_REF_BB (use)->loop_father == NULL                    <<<<<<<<<<
>             /* The outer most loop is not really a loop.  */
>             || loop_outer (DF_REF_BB (use)->loop_father) == NULL)
>           forward_propagate_into (use, fwprop_addr_p);
>
>         else if (fwprop_addr_p)
>           forward_propagate_into (use, false);
>       }
>   }
>
>   And I have two questions.
>   1) What are the reasons or background for not forward propagating addresses into loops ?
>   2) Can we still forward propagate addresses if the def and use are in the same loop ?
>   I mean something like:

The condiition indeed looks odd, the canonical way would be to check

  !flow_loop_nested_p (DF_REF_BB (def)->loop_father, DF_REF_BB
(use)->loop_father)

which would allow propagating addresses defined in loops outside as well.
And loop_father should never be NULL I think.

> diff -Nurp a/gcc/fwprop.c b/gcc/fwprop.c
> --- a/gcc/fwprop.c      2020-03-27 03:17:50.704000000 -0400
> +++ b/gcc/fwprop.c      2020-03-27 04:58:35.148000000 -0400
> @@ -1573,10 +1573,12 @@ fwprop (bool fwprop_addr_p)
>        df_ref use = DF_USES_GET (i);
>        if (use)
>         {
> +         df_ref def = get_def_for_use (use);
>           if (DF_REF_TYPE (use) == DF_REF_REG_USE
>               || DF_REF_BB (use)->loop_father == NULL
>               /* The outer most loop is not really a loop.  */
> -             || loop_outer (DF_REF_BB (use)->loop_father) == NULL)
> +             || loop_outer (DF_REF_BB (use)->loop_father) == NULL
> +             || (def && DF_REF_BB (def)->loop_father == DF_REF_BB (use)->loop_father))
>             forward_propagate_into (use, fwprop_addr_p);
>
>           else if (fwprop_addr_p)
>
> I would be grateful if anyone could help.
>
> Best regards

^ permalink raw reply	[flat|nested] 3+ messages in thread

* RE: [QUESTION] About RTL optimization at forward propagation
  2020-03-30  8:38 ` Richard Biener
@ 2020-03-31  6:52   ` xiezhiheng
  0 siblings, 0 replies; 3+ messages in thread
From: xiezhiheng @ 2020-03-31  6:52 UTC (permalink / raw)
  To: Richard Biener; +Cc: gcc

> -----Original Message-----
> From: Richard Biener [mailto:richard.guenther@gmail.com]
> Sent: Monday, March 30, 2020 4:39 PM
> To: xiezhiheng <xiezhiheng@huawei.com>
> Cc: gcc@gcc.gnu.org
> Subject: Re: [QUESTION] About RTL optimization at forward propagation
> 
> On Sat, Mar 28, 2020 at 4:19 AM xiezhiheng <xiezhiheng@huawei.com>
> wrote:
> >
> > Hi,
> >   I find there exists some restricts in function fwprop preventing it to
> > forward propagate addresses into loops.
> > /* Go through all the uses.  df_uses_create will create new ones at the
> >    end, and we'll go through them as well.
> >
> >    Do not forward propagate addresses into loops until after unrolling.
> >    CSE did so because it was able to fix its own mess, but we are not.  */
> > for (i = 0; i < DF_USES_TABLE_SIZE (); i++)
> >   {
> >     if (!propagations_left)
> >       break;
> >
> >     df_ref use = DF_USES_GET (i);
> >     if (use)
> >       {
> >         if (DF_REF_TYPE (use) == DF_REF_REG_USE
> >             || DF_REF_BB (use)->loop_father == NULL
> > <<<<<<<<<<
> >             /* The outer most loop is not really a loop.  */
> >             || loop_outer (DF_REF_BB (use)->loop_father) == NULL)
> >           forward_propagate_into (use, fwprop_addr_p);
> >
> >         else if (fwprop_addr_p)
> >           forward_propagate_into (use, false);
> >       }
> >   }
> >
> >   And I have two questions.
> >   1) What are the reasons or background for not forward propagating
> > addresses into loops ?
> >   2) Can we still forward propagate addresses if the def and use are in the
> > same loop ?
> >   I mean something like:
> 
> The condiition indeed looks odd, the canonical way would be to check
> 
>   !flow_loop_nested_p (DF_REF_BB (def)->loop_father, DF_REF_BB
> (use)->loop_father)
> 
> which would allow propagating addresses defined in loops outside as well.
> And loop_father should never be NULL I think.

Thanks for replying!

I think your change will also propagate addresses from one loop to another.
If def's loop is less frequent than use's, maybe it'll get worse performance.
I suggest we can go with the following condition
  DF_REF_BB(def)->loop_father == DF_REF_BB(use)->loop_father
  || flow_loop_nested_p (DF_REF_BB(use)->loop_father,
                       DF_REF_BB(def)->loop_father)


Also I find that function should_replace_address returns true
when it has a positive gain.
/* OLD is a memory address.  Return whether it is good to use NEW instead,
   for a memory access in the given MODE.  */
static bool
should_replace_address (rtx old_rtx, rtx new_rtx, machine_mode mode,
                        addr_space_t as, bool speed)
{
  int gain;

  if (rtx_equal_p (old_rtx, new_rtx)
      || !memory_address_addr_space_p (mode, new_rtx, as))
    return false;

  /* Copy propagation is always ok.  */
  if (REG_P (old_rtx) && REG_P (new_rtx))
    return true;

  /* Prefer the new address if it is less expensive.  */
  gain = (address_cost (old_rtx, mode, as, speed)
          - address_cost (new_rtx, mode, as, speed));

  /* If the addresses have equivalent cost, prefer the new address
     if it has the highest `set_src_cost'.  That has the potential of
     eliminating the most insns without additional costs, and it
     is the same that cse.c used to do.  */
  if (gain == 0)
    gain = (set_src_cost (new_rtx, VOIDmode, speed)
            - set_src_cost (old_rtx, VOIDmode, speed));

  return (gain > 0);  <<<<<<<<
}

Actually we can choose the new address if they have the same cost,
because we may get some other benefits.
Compare
a) m = n + 10
  MEM[m + 20] = 0
and
b) m = n + 10
  MEM[n + 30] = 0
Although we have no gain in memory operation (they have the same
` address_cost' and ` set_src_cost'), we may have a chance to eliminate
the dead code ` m = n + 10' in (b).

Best regards!

> 
> > diff -Nurp a/gcc/fwprop.c b/gcc/fwprop.c
> > --- a/gcc/fwprop.c      2020-03-27 03:17:50.704000000 -0400
> > +++ b/gcc/fwprop.c      2020-03-27 04:58:35.148000000 -0400
> > @@ -1573,10 +1573,12 @@ fwprop (bool fwprop_addr_p)
> >        df_ref use = DF_USES_GET (i);
> >        if (use)
> >         {
> > +         df_ref def = get_def_for_use (use);
> >           if (DF_REF_TYPE (use) == DF_REF_REG_USE
> >               || DF_REF_BB (use)->loop_father == NULL
> >               /* The outer most loop is not really a loop.  */
> > -             || loop_outer (DF_REF_BB (use)->loop_father) == NULL)
> > +             || loop_outer (DF_REF_BB (use)->loop_father) == NULL
> > +             || (def && DF_REF_BB (def)->loop_father == DF_REF_BB
> > (use)->loop_father))
> >             forward_propagate_into (use, fwprop_addr_p);
> >
> >           else if (fwprop_addr_p)
> >
> > I would be grateful if anyone could help.
> >
> > Best regards

^ permalink raw reply	[flat|nested] 3+ messages in thread

end of thread, other threads:[~2020-03-31  6:52 UTC | newest]

Thread overview: 3+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-03-28  3:18 [QUESTION] About RTL optimization at forward propagation xiezhiheng
2020-03-30  8:38 ` Richard Biener
2020-03-31  6:52   ` xiezhiheng

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).