public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed
From: Jeff Law <law@redhat.com>
To: Segher Boessenkool <segher@kernel.crashing.org>
Cc: gcc-patches@gcc.gnu.org, bschmidt@redhat.com
Subject: Re: [PATCH 8/9] shrink-wrap: shrink-wrapping for separate components
Date: Mon, 12 Sep 2016 18:21:00 -0000	[thread overview]
Message-ID: <0782459b-852f-f667-ca59-ade837ca2988@redhat.com> (raw)
In-Reply-To: <20160909215719.GC26965@gate.crashing.org>

On 09/09/2016 03:57 PM, Segher Boessenkool wrote:
> On Thu, Sep 08, 2016 at 12:34:07PM -0600, Jeff Law wrote:
>> On 07/31/2016 07:42 PM, Segher Boessenkool wrote:
>>>
>>> Deciding what blocks should run with a certain component active so that
>>> the total cost of executing the prologues (and epilogues) is optimal, is
>>> not a computationally feasible problem.
>> Really?  It's just a dataflow problem is it not and one that ought to
>> converge very quickly I'd think.  Or is it more a function of having to
>> run it on so many independent components?  I'm still pondering the value
>> of having every GPR be an independent component :-)
>
> The cost function (as a function of which BBs runs with a certain component
> enabled) is not monotonic: the cost for having a component active for a
> subset of BBs can be higher, the same, or equal to the cost for all such
> nodes (where "cost" means "how often do we execute this prologue").
Understood.   You covered this reasonably well in another reply.  Thanks.

>> Cross jumping is rather simplistic, so I'm not surprised that it doesn't
>> catch all this stuff.
>
> I hoped it would, so I could have so much simpler code.  Sniff.
In general, I think our ability to identify and de-duplicate code is 
poor at best, especially at the RTL level.  It's never been a major area 
of focus.   So, yea.  Sniff.


>
>>> As a final optimisation, if a block needs a prologue and its immediate
>>> dominator has the block as a post-dominator, the dominator gets the
>>> prologue as well.
>> So why not just put it in the idom and not in the dominated block?
>
> That's what it does :-)
Then I must have mis-parsed.  Thanks for clarifying.

>
>
>>> void
>>> thread_prologue_and_epilogue_insns (void)
>>> {
>>> +  if (optimize > 1)
>>> +    {
>>> +      df_live_add_problem ();
>>> +      df_live_set_all_dirty ();
>>> +    }
>> Perhaps conditional on separate shrink wrapping?
>
> Actually, I think we need to do this once more, one of the times always
> (also when only using "regular" shrink-wrapping), because otherwise the
> info in the dump file is out of whack.  Everything is okay once the next
> pass starts, of course.  I'll have a look.
OK.  It struck me as a bit odd, so just a verify on your side that it 
was intended is fine with me.


>
>>> @@ -5932,6 +5936,13 @@ thread_prologue_and_epilogue_insns (void)
>>>
>>>   try_shrink_wrapping (&entry_edge, prologue_seq);
>>>
>>> +  df_analyze ();
>>> +  try_shrink_wrapping_separate (entry_edge->dest);
>>> +  if (crtl->shrink_wrapped_separate)
>>> +    prologue_seq = make_prologue_seq ();
>> Perhaps push the df_analyze call into try_shrink_wrapping_separate?
>
> Yeah possibly.
Your call.


>
>> ANd if it was successful, do you need to update the DF information?  Is
>> that perhaps the cause of some of the issues we're seeing with DCE,
>> regrename, the scheduler?
>
> See above.  The DF info is correct when the next pass starts (or ends,
> I do not remember; the dump file does show correct information).
Hmm, then explain again why DCE is mucking up?  I don't immediately see 
how EPILOGUE_BEG notes come into play with DCE.  It seems to rely on the 
DF data and AFAICT DF only cares about the EPILOGUE_BEG note in 
can_move_insns_across which shouldn't be used by DCE.




>
>> Related,
>> consider using an enum rather than magic constants in the target bits (I
>> noted seeing component #0 as a magic constant in the ppc implementation).
>
> But 0 is the hard register used there :-)
Oh, missed that.  Nevermind :-)

>> So it seems like there's a toplevel list of components that's owned by
>> the target and state at each block components that are owned by the
>> generic code.  That's fine.  Just make sure we doc that the toplevel
>> list of components is allocated by the backend (and where does it get
>> freed?)
>
> Every sbitmap is owned by the separate shrink-wrapping pass, not by the
> target.  As the documentation says:
I'm specifically referring to:

+@deftypefn {Target Hook} sbitmap 
TARGET_SHRINK_WRAP_GET_SEPARATE_COMPONENTS (void)

+@deftypefn {Target Hook} sbitmap TARGET_SHRINK_WRAP_COMPONENTS_FOR_BB 
(basic_block)

Which in the rs6000 implementation allocate and return an sbitmap.  I 
don't see anywhere we could reasonably free them in the target with the 
existing hooks, so it seems like they have to be free'd by the generic 
code. So ownership of those sbitmaps isn't particularly clear.


>
>> Consider using auto_sbitmap rather than manually managing
>> allocation/releasing of the per-block structures.  In fact, can't all of
>> SW become a class and we lose the explicit init/fini routines in favor
>> of a ctor/dtor?
>
> Yes, you can always add indirection.  I do not think the code becomes
> more readable that way (quite the opposite).  Explicit is *good*.
The GCC project is moving away from this kind of explicit 
allocation/deallocation and more towards a RAII.  Unless there is a 
clear need for the explicit allocation/deallocation, please put this 
stuff into a class with an appropriate ctor/dtor.

FWIW, I was a big opponent of how much stuff happens "behind your back" 
with some languages (including C++).  But over the last few years my 
personal stance has softened considerably after seeing how cleanly RAII 
solves certain problems.


>
>> Having looked at this in more detail now, I'm wondering if we want to
>> talk at all in the docs about when selection vs disqualifying happens.
>> ISTM that we build the set of components we want to insert for each
>> edge/block.  When that's complete, we then prune those results with the
>> disqualifying sets.
>
> We decide which blocks should have which components active.  From that,
> it is easy to derive on which edges we need a prologue or an epilogue.
> Now if the target says putting some prologue or epilogue on some certain
> edge will not work, we just don't do that component at all.  It doesn't
> happen very often.  In theory we could be smarter.
Right.  I was obviously over-simplifying, the key point I was verifying 
was that they're separate steps...  Which led to the question about when 
to disqualify for the R0 vs LR issue.


>
>> For the PPC R0 vs LR is the only thing that causes disqualification
>> right?
>
> Currently, yes.
>
>> Can't that be handled when we build the set of components we
>> want to insert for each edge/block?  Is there some advantage to handling
>> disqualifications after all the potential insertion points have been
>> handled?
>
> We do not know if an edge needs a prologue, epilogue, or neither, until
> we have decided whether *both* ends of that edge want the component active
> or not.
Right.  Hmm, maybe I'm not asking the question clearly.

Whether or not an edge needs a prologue or epilogue is a function not 
just of the state at the head or tail of the edge, but instead is a 
function of global dataflow propagation?  Thus we can't disqualify until 
after we've done the dataflow propagation?  Right?


Jeff

  reply	other threads:[~2016-09-12 18:02 UTC|newest]

Thread overview: 73+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2016-08-01  1:43 [PATCH v2 0/9] Separate shrink-wrapping Segher Boessenkool
2016-08-01  1:43 ` [PATCH 3/9] dce: Don't dead-code delete separately wrapped restores Segher Boessenkool
2016-09-08 17:52   ` Jeff Law
2016-09-09 15:59     ` Segher Boessenkool
2016-09-09 16:39       ` Jeff Law
2016-08-01  1:43 ` [PATCH 2/9] cfgcleanup: Don't confuse CFI when -fshrink-wrap-separate Segher Boessenkool
2016-09-08 17:51   ` Jeff Law
2016-08-01  1:43 ` [PATCH 1/9] separate shrink-wrap: New command-line flag, status flag, hooks, and doc Segher Boessenkool
2016-08-29  9:31   ` Bernd Schmidt
2016-08-29 14:30     ` Segher Boessenkool
2016-09-08 17:37     ` Jeff Law
2016-09-09 11:03       ` Bernd Schmidt
2016-09-09 15:13         ` Segher Boessenkool
2016-09-09 18:31           ` Jeff Law
2016-09-09 20:41             ` Segher Boessenkool
2016-09-12 16:49               ` Jeff Law
2016-09-09 15:51         ` Jeff Law
2016-09-09 15:40       ` Segher Boessenkool
2016-09-09 16:58         ` Jeff Law
2016-09-08 17:48   ` Jeff Law
2016-09-09 15:44     ` Segher Boessenkool
2016-08-01  1:57 ` [PATCH 4/9] regrename: Don't rename restores Segher Boessenkool
2016-09-08 17:54   ` Jeff Law
2016-09-09 21:05     ` Segher Boessenkool
2016-09-12 17:01       ` Jeff Law
2016-08-01  1:57 ` [PATCH 9/9] rs6000: Separate shrink-wrapping Segher Boessenkool
2016-08-01  1:57 ` [PATCH 8/9] shrink-wrap: shrink-wrapping for separate components Segher Boessenkool
2016-09-08 19:03   ` Jeff Law
2016-09-09 22:03     ` Segher Boessenkool
2016-09-12 18:21       ` Jeff Law [this message]
2016-09-14 13:45         ` Segher Boessenkool
2016-09-15 16:47           ` Jeff Law
2016-08-01  1:57 ` [PATCH 7/9] cprop: Leave RTX_FRAME_RELATED_P instructions alone Segher Boessenkool
2016-09-08 18:34   ` Jeff Law
2016-09-09 21:21     ` Segher Boessenkool
2016-08-01  2:12 ` [PATCH 5/9] regrename: Don't run if function was separately shrink-wrapped Segher Boessenkool
2016-09-08 17:54   ` Jeff Law
2016-08-01  2:12 ` [PATCH 6/9] sel-sched: Don't mess with register restores Segher Boessenkool
2016-08-04  7:33   ` Andrey Belevantsev
2016-09-08 17:55   ` Jeff Law
2016-09-09 21:13     ` Segher Boessenkool
2016-09-12 17:39       ` Jeff Law
2016-09-14 13:28         ` Segher Boessenkool
2016-08-04  0:05 ` [PATCH v2 0/9] Separate shrink-wrapping Segher Boessenkool
2016-08-24 16:04   ` Segher Boessenkool
2016-08-26 13:03 ` Bernd Schmidt
2016-08-26 13:48   ` David Malcolm
2016-08-26 13:55     ` Bernd Schmidt
2016-08-26 14:50   ` Segher Boessenkool
2016-08-26 15:03     ` Bernd Schmidt
2016-08-26 16:27       ` Segher Boessenkool
2016-09-08 16:58         ` Jeff Law
2016-09-09 15:26           ` Segher Boessenkool
2016-09-09 16:26             ` Jeff Law
2016-09-09 16:51               ` Segher Boessenkool
2016-09-09 17:22                 ` Jeff Law
2016-08-30 12:31       ` Michael Matz
2016-09-08 16:41         ` Jeff Law
2016-09-09  6:31           ` Segher Boessenkool
2016-09-09 15:28             ` Jeff Law
2016-09-09 15:43               ` Segher Boessenkool
2016-09-09 18:25                 ` Jeff Law
2016-09-09 20:29                   ` Segher Boessenkool
2016-09-08 17:20       ` Jeff Law
2016-09-09 15:33         ` Segher Boessenkool
2016-09-09 16:49           ` Jeff Law
2016-09-09 17:00             ` Segher Boessenkool
2016-09-09 17:44               ` Jeff Law
2016-09-09 19:36   ` Jeff Law
2016-09-09 21:00     ` Segher Boessenkool
2016-09-12 11:00       ` Bernd Schmidt
2016-09-12 16:59       ` Jeff Law
2016-09-14 13:22         ` Segher Boessenkool

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=0782459b-852f-f667-ca59-ade837ca2988@redhat.com \
    --to=law@redhat.com \
    --cc=bschmidt@redhat.com \
    --cc=gcc-patches@gcc.gnu.org \
    --cc=segher@kernel.crashing.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).