public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed
From: Aldy Hernandez <aldyh@redhat.com>
To: Richard Biener <rguenther@suse.de>
Cc: Andrew MacLeod <amacleod@redhat.com>,
	gcc-patches <gcc-patches@gcc.gnu.org>
Subject: Re: [PATCH] tree-optimization/106514 - revisit m_import compute in backward threading
Date: Wed, 10 Aug 2022 18:08:12 +0200	[thread overview]
Message-ID: <CAGm3qMXj3PYOU6DgMV+9dyFbP96euUMq0iw99=XO0FyhMdoMfw@mail.gmail.com> (raw)
In-Reply-To: <nycvar.YFH.7.77.849.2208100807100.3343@jbgna.fhfr.qr>

On Wed, Aug 10, 2022 at 12:46 PM Richard Biener <rguenther@suse.de> wrote:
>
> On Wed, 10 Aug 2022, Richard Biener wrote:
>
> > On Tue, 9 Aug 2022, Andrew MacLeod wrote:
> >
> > >
> > > On 8/9/22 09:01, Richard Biener wrote:
> > > > This revisits how we compute imports later used for the ranger path
> > > > query during backwards threading.  The compute_imports function
> > > > of the path solver ends up pulling the SSA def chain of regular
> > > > stmts without limit and since it starts with just the gori imports
> > > > of the path exit it misses some interesting names to translate
> > > > during path discovery.  In fact with a still empty path this
> > > > compute_imports function looks like not the correct tool.
> > >
> > > I don't really know how this works in practice.  Aldys off this week, so he
> > > can comment when he returns.
> > >
> > > The original premise was along the line of recognizing that only changes to a
> > > GORI import name to a block can affect the branch at the end of the block.
> > > ie, if the path doesn't change any import to block A, then the branch at the
> > > end of block A will not change either.    Likewise, if it does change an
> > > import, then we look at whether the branch can be threaded.    Beyond that
> > > basic premise, I dont know what all it does.
> >
> > Yep, I also think that's the idea.
>
> [...]
>
> > Anyway, the important result of the change is that the imports
> > set is vastly smaller since it is now constrained to the
> > actual path.  Plus it now contains the local defs in blocks
> > of the path that do not dominate the exit block which means
> > it might get more threading opportunities.  Which reminds me
> > to re-do the cc1files experiment for this change.
>
> Results are a bit unconclusive.  We have
> ethread 14044 -> 14058, first threadfull 36986 -> 37174,
> first thread 8060 -> 8049, dom 21723 -> 21822, second thread
> (after loop opts) 6242 -> 5582, second dom 8522 -> 8765,
> second threadfull 3072 -> 2998 which makes an overall drop
> in the number of threads from 98649 to 98448.

The threading dance between all passes is a bit fickle as you can see.
Particularly, DOM is a red herring.  If it starts firing up, as it
looks like from your numbers, it's usually because we regressed in the
backward threaders and DOM is picking up the slack.  Though a slightly
increase in DOM threading can be because we opened up more
opportunities for DOM because of better threading in the backward
threaders.  Sometimes it helps to run without DOM (or disable DOM
threading) to compare apples with apples...being careful of course,
that you're not dropping a bunch of threads in thread2 for instance,
and picking them up in threadfull2 or whatever.

Ughhh...I really hate that we have 20 million threading passes.

>
> I've isolated one testcase we threaded before but no longer after
> this change and that is
>
> void foo (int nest, int print_nest)
> {
>   _Bool t0 = nest != 0;
>   _Bool t1 = nest == print_nest;
>   _Bool t2 = t0 & t1;
>   if (t2)
>     __builtin_puts ("x");
>   nest++;
>   if (nest > 2)
>     __builtin_abort ();
>   if (print_nest == nest)
>     __builtin_puts ("y");
> }
>
> where we are able to thread from if (t2) to if (print_nest == nest)
> resolving that to false when t2 is true using the nest == print_nest
> relation.  Now, the reason is because of the imports added by
>
>   // Exported booleans along the path, may help conditionals.
>   if (m_resolve)
>     for (i = 0; i < m_path.length (); ++i)
>       {
>         basic_block bb = m_path[i];
>         tree name;
>         FOR_EACH_GORI_EXPORT_NAME (gori, bb, name)
>           if (TREE_CODE (TREE_TYPE (name)) == BOOLEAN_TYPE)
>             bitmap_set_bit (imports, SSA_NAME_VERSION (name));
>       }
>
> _BUT_ - it turns out that the m_path local to the solver is still
> the one from the last back_threader::find_taken_edge_switch
> calling m_solver->compute_ranges (path, m_imports), so for a possibly
> completely unrelated path.  Truncating the path also on the solver
> side before calling compute_imports makes the testcase fail to thread.
> I'll note the PHI handling also looks at the stale m_path.

Not good.  Thanks for noticing.

>
> I see the solver itself adds relations from edges on the path so
> the cruical item here seems to be to add imports for the path
> entry conditional, but those would likely be GORI imports for that
> block?  Unfortunately that fails to add t[012], the GORI exports
> seem to cover all that's needed but then exports might be too much
> here?  At least that's what the code in compute_imports effectively
> does (also for the entry block).  But I do wonder why compute_ranges
> does not add relations computed by the entry conditional ...
> That's probably because of
>
> void
> path_range_query::compute_outgoing_relations (basic_block bb, basic_block
> next)
> {
>   gimple *stmt = last_stmt (bb);
>
>   if (stmt
>       && gimple_code (stmt) == GIMPLE_COND
>       && (import_p (gimple_cond_lhs (stmt))
>           || import_p (gimple_cond_rhs (stmt))))
>
> where for a condition like above the names in the imports are not
> in the condition itself.  The gori.outgoing_edge_range_p compute
> of the import ranges is appearantly not affected by this
> restriction and picks up nest as ~[0, 0] on the respective path
> even though, while 'nest' is in imports, the temporaries are not.
> That would support the argument to drop the import_p checks in
> path_range_query::compute_outgoing_relations.

Perhaps Andrew can better answer your questions on  GORI imports /
exports here and elsewhere.

>
> I'm not sure it's worth fixing incrementally though, it's fallout
> of r12-5157-gbfa04d0ec958eb that in some way did the reverse of
> my patch.  Interestingly hybrid_jt_simplifier::compute_ranges_from_state
> still computes imports itself before calling compute_ranges.

Hmmm, compute_ranges_from_state(), which is the interface from the
forward threader into the path solver probably shouldn't be rolling
its own.  It looks like a poor man's calculation of imports.  It's
just the GORI imports to the final conditional and the SSA operands to
the final conditional (in case GORI didn't have it in the list of
imports??).  This may have been there originally because the
calculation of imports was in the backward threader, and the forward
threader was doing its own thing before calling the path solver.

ISTM, both the forward and backward threader should use
path_range_query::compute_imports.  In practice, it probably didn't
matter because the forward threader couldn't feed the path solver long
or complex enough paths for it to make a difference in the thread
count.

Aldy

>
> For comparing thread numbers fixing the current state incrementally
> would be nice I guess, I will test something but not further pursue it
> if not successful.
>
> Richard.


  reply	other threads:[~2022-08-10 16:08 UTC|newest]

Thread overview: 10+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
     [not found] <37255.122080909012202281@us-mta-359.us.mimecast.lan>
2022-08-09 18:52 ` Andrew MacLeod
2022-08-10  6:45   ` Richard Biener
2022-08-10 10:46     ` Richard Biener
2022-08-10 16:08       ` Aldy Hernandez [this message]
2022-08-11 17:46       ` Andrew MacLeod
2022-08-15  9:59         ` Richard Biener
2022-08-10 15:42     ` Aldy Hernandez
2022-08-09 13:01 Richard Biener
  -- strict thread matches above, loose matches on Subject: below --
2022-08-09 13:01 Richard Biener
2022-08-09 13:01 Richard Biener

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to='CAGm3qMXj3PYOU6DgMV+9dyFbP96euUMq0iw99=XO0FyhMdoMfw@mail.gmail.com' \
    --to=aldyh@redhat.com \
    --cc=amacleod@redhat.com \
    --cc=gcc-patches@gcc.gnu.org \
    --cc=rguenther@suse.de \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).