public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed
From: Richard Biener <rguenther@suse.de>
To: Aldy Hernandez <aldyh@redhat.com>
Cc: gcc-patches <gcc-patches@gcc.gnu.org>,
	 "MacLeod, Andrew" <amacleod@redhat.com>,
	Jeff Law <jeffreyalaw@gmail.com>
Subject: Re: [PATCH] Properly honor param_max_fsm_thread_path_insns in backwards threader
Date: Tue, 2 Aug 2022 11:45:52 +0000 (UTC)	[thread overview]
Message-ID: <nycvar.YFH.7.77.849.2208021141380.4208@jbgna.fhfr.qr> (raw)
In-Reply-To: <CAGm3qMVDA_sZGntOMyYJ=XwkEW2hbFDDYv5h0B6QncbwnRKc9Q@mail.gmail.com>

On Tue, 2 Aug 2022, Aldy Hernandez wrote:

> Unfortunately, this was before my time, so I don't know.
> 
> That being said, thanks for tackling these issues that my work
> triggered last release.  Much appreciated.

Ah.  But it was your r12-324-g69e5544210e3c0 that did

-  else if (n_insns > 1)
+  else if (!m_speed_p && n_insns > 1)

causing the breakage on the 12 branch.  That leads to a simpler
fix I guess.  Will re-test and also backport to GCC 12 if successful.

Richard.

> Aldy
> 
> On Tue, Aug 2, 2022 at 10:41 AM Richard Biener <rguenther@suse.de> wrote:
> >
> > I am trying to make sense of back_threader_profitability::profitable_path_p
> > and the first thing I notice is that we do
> >
> >   /* Threading is profitable if the path duplicated is hot but also
> >      in a case we separate cold path from hot path and permit optimization
> >      of the hot path later.  Be on the agressive side here. In some testcases,
> >      as in PR 78407 this leads to noticeable improvements.  */
> >   if (m_speed_p
> >       && ((taken_edge && optimize_edge_for_speed_p (taken_edge))
> >           || contains_hot_bb))
> >     {
> >       if (n_insns >= param_max_fsm_thread_path_insns)
> >         {
> >           if (dump_file && (dump_flags & TDF_DETAILS))
> >             fprintf (dump_file, "  FAIL: Jump-thread path not considered: "
> >                      "the number of instructions on the path "
> >                      "exceeds PARAM_MAX_FSM_THREAD_PATH_INSNS.\n");
> >           return false;
> >         }
> > ...
> >     }
> >   else if (!m_speed_p && n_insns > 1)
> >     {
> >       if (dump_file && (dump_flags & TDF_DETAILS))
> >         fprintf (dump_file, "  FAIL: Jump-thread path not considered: "
> >                  "duplication of %i insns is needed and optimizing for size.\n",
> >                  n_insns);
> >       return false;
> >     }
> > ...
> >   return true;
> >
> > thus we apply the n_insns >= param_max_fsm_thread_path_insns only
> > to "hot paths".  The comment above this isn't entirely clear whether
> > this is by design ("Be on the aggressive side here ...") but I think
> > this is a mistake.  In fact the "hot path" check seems entirely
> > useless since if the path is not hot we simply continue threading it.
> >
> > I have my reservations about how we compute hot (contains_hot_bb
> > in particular), but the following first refactors the above to apply
> > the size constraints always and then _not_ threading if the path
> > is not considered hot (but allow threading if n_insns <= 1 as with
> > the !m_speed_p case).
> >
> > As for contains_hot_bb - it might be that this consciously captures
> > the case where we separate a cold from a hot path even though the
> > threaded path itself is cold.  Consider
> >
> >    A
> >   / \ (unlikely)
> >  B   C
> >   \ /
> >    D
> >   / \
> >  ..  abort()
> >
> > when we want to thread A -> B -> D -> abort () and A (or D)
> > has a hot BB count then we have contains_hot_bb even though
> > the counts on the path itself are small.  In fact when we
> > thread the only relevant count for the resulting threaded
> > path is the count of A with the A->C probability applied
> > (that should also the count to subtract from the blocks
> > we copied - sth missing for the backwards threader as well).
> >
> > So I'm wondering how the logic computing contains_hot_bb
> > relates to the above comment before the costing block.
> > Anyone remembers?
> >
> > Bootstrap & regtest running on x86_64-unknown-linux-gnu.
> >
> >         * tree-ssa-threadbackwards.cc
> >         (back_threader_profitability::profitable_path_p): Apply
> >         size constraints to all paths.  Do not thread cold paths.
> > ---
> >  gcc/tree-ssa-threadbackward.cc | 53 +++++++++++++++++++++-------------
> >  1 file changed, 33 insertions(+), 20 deletions(-)
> >
> > diff --git a/gcc/tree-ssa-threadbackward.cc b/gcc/tree-ssa-threadbackward.cc
> > index 0519f2a8c4b..a887568032b 100644
> > --- a/gcc/tree-ssa-threadbackward.cc
> > +++ b/gcc/tree-ssa-threadbackward.cc
> > @@ -761,22 +761,43 @@ back_threader_profitability::profitable_path_p (const vec<basic_block> &m_path,
> >         *creates_irreducible_loop = true;
> >      }
> >
> > -  /* Threading is profitable if the path duplicated is hot but also
> > +  const int max_cold_insns = 1;
> > +  if (!m_speed_p && n_insns > max_cold_insns)
> > +    {
> > +      if (dump_file && (dump_flags & TDF_DETAILS))
> > +       fprintf (dump_file, "  FAIL: Jump-thread path not considered: "
> > +                "duplication of %i insns is needed and optimizing for size.\n",
> > +                n_insns);
> > +      return false;
> > +    }
> > +  else if (n_insns >= param_max_fsm_thread_path_insns)
> > +    {
> > +      if (dump_file && (dump_flags & TDF_DETAILS))
> > +       fprintf (dump_file, "  FAIL: Jump-thread path not considered: "
> > +                "the number of instructions on the path "
> > +                "exceeds PARAM_MAX_FSM_THREAD_PATH_INSNS.\n");
> > +      return false;
> > +    }
> > +
> > +  /* Threading is profitable if the path duplicated is small or hot but also
> >       in a case we separate cold path from hot path and permit optimization
> >       of the hot path later.  Be on the agressive side here. In some testcases,
> >       as in PR 78407 this leads to noticeable improvements.  */
> > -  if (m_speed_p
> > -      && ((taken_edge && optimize_edge_for_speed_p (taken_edge))
> > -         || contains_hot_bb))
> > +  if (!(n_insns <= max_cold_insns
> > +       || contains_hot_bb
> > +       || (taken_edge && optimize_edge_for_speed_p (taken_edge))))
> > +    {
> > +      if (dump_file && (dump_flags & TDF_DETAILS))
> > +       fprintf (dump_file, "  FAIL: Jump-thread path not considered: "
> > +                "path is not profitable to thread.\n");
> > +      return false;
> > +    }
> > +
> > +  /* If the path is not small to duplicate and either the entry or
> > +     the final destination is probably never executed avoid separating
> > +     the cold path since that can lead to spurious diagnostics.  */
> > +  if (n_insns > max_cold_insns)
> >      {
> > -      if (n_insns >= param_max_fsm_thread_path_insns)
> > -       {
> > -         if (dump_file && (dump_flags & TDF_DETAILS))
> > -           fprintf (dump_file, "  FAIL: Jump-thread path not considered: "
> > -                    "the number of instructions on the path "
> > -                    "exceeds PARAM_MAX_FSM_THREAD_PATH_INSNS.\n");
> > -         return false;
> > -       }
> >        if (taken_edge && probably_never_executed_edge_p (cfun, taken_edge))
> >         {
> >           if (dump_file && (dump_flags & TDF_DETAILS))
> > @@ -794,14 +815,6 @@ back_threader_profitability::profitable_path_p (const vec<basic_block> &m_path,
> >           return false;
> >         }
> >      }
> > -  else if (!m_speed_p && n_insns > 1)
> > -    {
> > -      if (dump_file && (dump_flags & TDF_DETAILS))
> > -       fprintf (dump_file, "  FAIL: Jump-thread path not considered: "
> > -                "duplication of %i insns is needed and optimizing for size.\n",
> > -                n_insns);
> > -      return false;
> > -    }
> >
> >    /* We avoid creating irreducible inner loops unless we thread through
> >       a multiway branch, in which case we have deemed it worth losing
> > --
> > 2.35.3
> >
> 
> 

-- 
Richard Biener <rguenther@suse.de>
SUSE Software Solutions Germany GmbH, Frankenstrasse 146, 90461 Nuernberg,
Germany; GF: Ivo Totev, Andrew Myers, Andrew McDonald, Boudien Moerman;
HRB 36809 (AG Nuernberg)

  reply	other threads:[~2022-08-02 11:45 UTC|newest]

Thread overview: 16+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
     [not found] <04261.122080204410800126@us-mta-529.us.mimecast.lan>
2022-08-02 10:21 ` Aldy Hernandez
2022-08-02 11:45   ` Richard Biener [this message]
2022-08-02 11:54     ` Aldy Hernandez
2022-08-02 11:59       ` Richard Biener
2022-08-02 12:03         ` Aldy Hernandez
2022-08-02 13:29           ` Richard Biener
2022-08-02 14:25             ` Aldy Hernandez
2022-08-02 16:31         ` Jan Hubicka
2022-08-03  9:58           ` Richard Biener
     [not found] <20220802130109.AD89F385381B@sourceware.org>
2022-08-05 16:35 ` Jeff Law
2022-08-02 13:00 Richard Biener
  -- strict thread matches above, loose matches on Subject: below --
2022-08-02 13:00 Richard Biener
2022-08-02 13:00 Richard Biener
2022-08-02  8:41 Richard Biener
2022-08-02  8:41 Richard Biener
2022-08-02  8:41 Richard Biener

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=nycvar.YFH.7.77.849.2208021141380.4208@jbgna.fhfr.qr \
    --to=rguenther@suse.de \
    --cc=aldyh@redhat.com \
    --cc=amacleod@redhat.com \
    --cc=gcc-patches@gcc.gnu.org \
    --cc=jeffreyalaw@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).