public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed
From: Jeff Law <law@redhat.com>
To: Jeff Law via Gcc-patches <gcc-patches@gcc.gnu.org>,
	richard.sandiford@arm.com
Subject: Re: [00/23] Make fwprop use an on-the-side RTL SSA representation
Date: Sun, 29 Nov 2020 23:45:17 -0700	[thread overview]
Message-ID: <ca3512c8-6398-d8a0-da50-0ad1e327d136@redhat.com> (raw)
In-Reply-To: <mptlfeogl6h.fsf@arm.com>



On 11/26/20 9:03 AM, Richard Sandiford wrote:
> Thanks for the reviews.
>
> Jeff Law via Gcc-patches <gcc-patches@gcc.gnu.org> writes:
>> On 11/13/20 1:10 AM, Richard Sandiford via Gcc-patches wrote:
>>> Just after GCC 10 stage 1 closed (oops), I posted a patch to add a new
>>> combine pass.  One of its main aims was to allow instructions to move
>>> around where necessary in order to make a combination possible.
>>> It also tried to parallelise instructions that use the same resource.
>>>
>>> That pass contained its own code for maintaining limited def-use chains.
>>> When I posted the patch, Segher asked why we wanted yet another piece
>>> of pass-specific code to do that.  Although I had specific reasons
>>> (which I explained at the time) I've gradually come round to agreeing
>>> that that was a flaw.
>>>
>>> This series of patches is the result of a Covid-time project to add
>>> a more general, pass-agnostic framework.  There are two parts:
>>> adding the framework itself, and using it to make fwprop.c faster.
>>>
>>> The framework part
>>> ------------------
>>>
>>> The framework provides an optional, on-the-side SSA view of existing
>>> RTL instructions.  Each instruction gets a list of definitions and a
>>> list of uses, with each use having a single definition.  Phi nodes
>>> handle cases in which there are multiple possible definitions of a
>>> register on entry to a basic block.  There are also routines for
>>> updating instructions while keeping the SSA representation intact.
>>>
>>> The aim is only to provide a different view of existing RTL instructions.
>>> Unlike gimple, and unlike (IIRC) the old RTL SSA project from way back,
>>> the new framework isn't a “native” SSA representation.  This means that
>>> all inputs to a phi node for a register R are also definitions of
>>> register R; no move operation is “hidden” in the phi node.
>> Hmm, I'm trying to parse what the last phrase means.  Does it mean that
>> the "hidden copy" problem for out-of-ssa is avoided?  And if so, how is
>> that maintained over time.  Things like copy-prop will tend to introduce
>> those issues even if they didn't originally exist.
> Yeah, the phi nodes simply say which definition of register R provides
> the value of R on a particular incoming edge.  That definition will
> itself be a phi node for R, an artificial definition of R created by DF
> (e.g. for incoming function arguments or for EH data registers), or an
> actual instruction that sets R.
>
> In other words, the SSA form is a purely on-the-side thing and the
> underlying RTL instructions are maintained in the same way as normal.
> The SSA form can be deleted at any time without performing a separate
> out-of-ssa step.  In that respect it's different from cfglayout,
> for example.
>
> One of the goals was to allow the SSA form to be used even after RA,
> where invisible copies would be more problematic.
Right.  But what I'm struggling a bit with is whether or not we have to
put restrictions on what passes can do with that on-the-side data
structure.  While I think we can have that on-the-side data structure be
conservatively correct, I think we have to make sure that we don't allow
changes to the on-the-side data structure to occur that ultimately we
can't reflect into RTL.

I may need to go back and re-read the lost copy problem literature.  But
it's definitely an area that I'm concerned about.


>> It's unfortunately that there's no DCE passes abutting
>> fwprop as DCE is really easy in an SSA world.
> fwprop.c calls delete_trivially_dead_insns, so it does some light DCE.
> One thing I wanted to do (but ran out of time to do) was get the main
> SSA insn-change routine (rtl_ssa::change_insns) to record when an
> instruction becomes dead, and then perform DCE as part of the later
> rtl_ssa::perform_pending_updates step.  This would be much cheaper
> than doing another full scan of the instruction stream (which is what
> delete_trivially_dead_insns needs needs to do).
>
> Unfortunately, I suspect we're relying on this delete_trivially_dead_insns
> call to delete instructions that became dead during earlier passes, not just
> those that become dead during fwprop.c.  So I guess we would need a full
> DCE at some point: making fwprop.c clean up its own mess might not be
> enough.
Oh, yea, if it's using delete_trivially_dead_insns, then, yea, it's got
a mini-DCE and using the SSA algorithm would seem to be a step forward.

I don't necessarily see that incoming dead code is that big of a
problem.  Ultimately it's still going to look like SSA definition with
no uses, in the on-the-side data structure, right?  So an SSA based DCE
should be able to clean up the mess from fwprop as well as any incoming
dead code.
>
>>> * The SSA code groups blocks into extended basic blocks, with the
>>>   EBBs rather than individual blocks having phi nodes.  
>> So I haven't looked at the patch, but the usual place to put PHIs is at
>> the dominance frontier.  But extra PHIs just increase time/memory and
>> shouldn't affect correctness.
> Yeah, the phis still are at dominance frontiers (except for certain
> cases where we use degenerate phis to maintain a linear RPO view;
> see the doc patch for more details about that).  It wasn't clear from
> the description above, but I was really talking about a pure data
> structure choice: once we have both BBs and EBBs, the phis naturally
> attach to the EBB data structure rather than the BB data structure,
> since second and subsequent BBs in an EBB have a single predecessor
> and so never need phi nodes.
Certainly its the case that the dominance frontier must be at the start
of an EBB.  So inserting PHIs at the start of EBBs should be correct. 
But my recollection was that if do it naively you end up with
unnecessary PHIs.    But I don't think we have to do a "no useless PHIs"
algorithm, we just have to do something sensible -- it's my suspicion
that all the work in the early days of SSA to minimize PHIs isn't as
important as it used to be.

>
>>> * The framework also provides live range information for registers
>>>   within an extended basic block and allows instructions to move within
>>>   their EBB.  It might be useful to allow further movement in future;
>>>   I just don't have a use case for it yet.
>> Yup.   You could do something like Click's algorithm to schedule the
>> instructions in a block to maximize CSE opportunities on top of this.
> Yeah.
I noticed that you've got a lot of the infrastructure to do this already
:-) 

>
>>> * One advantage of the new infrastructure is that it gives
>>>   recog_for_combine-like behaviour: if recog wants to add clobbers
>>>   of things like the flags register, the SSA code will make sure
>>>   that the flags register is free.
>> I look more at the intersection between combine and SSA as an
>> opportunity to combine on extended blocks, simplify the "does dataflow
>> allow this combination" logic, drop the need to build/maintain LOG_LINKS
>> and more generally simplify note distribution.
> Yeah, my ultimate goal (for GCC12, I hope this time for real) is still
> to provide an SSA version of combine.  Initially it could sit alongside
> the existing combine pass, and perhaps be run only after RA by default.
> (Testing locally, that seems to give nice results, and reduces the
> pressure to reimplement everything in combine.c in one go.)
>
> But the point above was instead that, at the moment, combine is the
> only pass that can add (say) new clobbers of the flags register as
> part of a recog.  I think ideally *all* passes should be able to do that.
> But passes would then need to track the live ranges of the flags
> register in order to tell when the flags register is free.  One of the
> side-benefits of the SSA stuff is that it can do this in amortised
> sublinear complexity.  So RTL SSA provides its own interface to recog
> that can do the same things as recog_for_combine does.
Sweet.


Jeff


  parent reply	other threads:[~2020-11-30  6:45 UTC|newest]

Thread overview: 88+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-11-13  8:10 Richard Sandiford
2020-11-13  8:11 ` [01/23] vec: Silence clang warning Richard Sandiford
2020-11-25 19:58   ` Jeff Law
2020-11-13  8:12 ` [02/23] rtlanal: Remove noop_move_p REG_EQUAL condition Richard Sandiford
2020-11-25 20:00   ` Jeff Law
2020-11-13  8:12 ` [03/23] reginfo: Add a global_reg_set Richard Sandiford
2020-11-25 20:01   ` Jeff Law
2020-11-13  8:13 ` [04/23] Move iterator_range to a new iterator-utils.h file Richard Sandiford
2020-11-25 20:02   ` Jeff Law
2020-11-13  8:13 ` [05/23] Add more iterator utilities Richard Sandiford
2020-11-25 20:12   ` Jeff Law
2020-11-13  8:14 ` [06/23] Add an RAII class for managing obstacks Richard Sandiford
2020-11-25 20:15   ` Jeff Law
2020-11-13  8:14 ` [07/23] Add a class that multiplexes two pointer types Richard Sandiford
2020-11-25 20:23   ` Jeff Law
2020-11-26 16:15     ` Richard Sandiford
2020-11-30  1:28       ` Jeff Law
2020-11-25 23:33   ` Martin Sebor
2020-11-26 17:06     ` Richard Sandiford
2020-11-27 18:12       ` Richard Sandiford
2020-11-28  0:17       ` Martin Sebor
2020-12-17  0:17         ` Richard Sandiford
2020-12-17 14:21           ` Tom Tromey
2020-12-17 15:38             ` Richard Sandiford
2020-12-17 15:44               ` Nathan Sidwell
2021-01-04 15:32                 ` Jeff Law
2020-11-13  8:15 ` [08/23] Add an alternative splay tree implementation Richard Sandiford
2020-12-02 20:36   ` Jeff Law
2020-12-17  0:29     ` Richard Sandiford
2021-01-04 15:27       ` Jeff Law
2021-01-01  8:25   ` Andreas Schwab
2021-01-04 14:53     ` Richard Sandiford
2021-01-04 15:02       ` Andreas Schwab
2021-01-04 15:42         ` Richard Sandiford
2021-01-05 12:13           ` Richard Biener
2020-11-13  8:15 ` [09/23] Add a cut-down version of std::span (array_slice) Richard Sandiford
2020-11-30 19:56   ` Jeff Law
2022-08-03 15:13   ` Martin Jambor
2022-08-03 15:31     ` Richard Sandiford
2022-08-10 16:03   ` Martin Jambor
2022-08-11  6:58     ` Richard Biener
2022-08-16  7:59       ` Richard Sandiford
2020-11-13  8:16 ` [10/23] Tweak the way that is_a is implemented Richard Sandiford
2020-12-02  5:15   ` Jeff Law
2020-11-13  8:16 ` [11/23] Split update_cfg_for_uncondjump out of combine Richard Sandiford
2020-11-30  6:14   ` Jeff Law
2020-11-13  8:17 ` [12/23] Export print-rtl.c:print_insn_with_notes Richard Sandiford
2020-11-25 20:24   ` Jeff Law
2020-11-13  8:18 ` [13/23] recog: Split out a register_asm_p function Richard Sandiford
2020-11-25 20:24   ` Jeff Law
2020-11-13  8:18 ` [14/23] simplify-rtx: Put simplify routines into a class Richard Sandiford
2020-11-30 19:54   ` Jeff Law
2020-11-13  8:19 ` [15/23] recog: Add a validate_change_xveclen function Richard Sandiford
2020-11-30 20:03   ` Jeff Law
2020-11-13  8:19 ` [16/23] recog: Add a way of temporarily undoing changes Richard Sandiford
2020-11-25 20:27   ` Jeff Law
2020-12-17  0:22     ` Richard Sandiford
2020-11-13  8:20 ` [17/23] recog: Add a class for propagating into insns Richard Sandiford
2020-12-03 22:32   ` Jeff Law
2020-11-13  8:20 ` [18/23] recog: Add an RAII class for undoing insn changes Richard Sandiford
2020-11-25 20:27   ` Jeff Law
2020-11-13  8:20 ` [19/23] rtlanal: Add some new helper classes Richard Sandiford
2020-12-13 17:30   ` Jeff Law
2020-12-14 16:37     ` Richard Sandiford
2020-12-14 20:02       ` Jeff Law
2020-11-13  8:21 ` [20/23] rtlanal: Add simple_regno_set Richard Sandiford
2020-11-25 20:31   ` Jeff Law
2020-12-17  0:47     ` Richard Sandiford
2021-01-04 15:28       ` Jeff Law
2020-11-13  8:22 ` [21/23] doc: Add documentation for rtl-ssa Richard Sandiford
2020-11-30  6:26   ` Jeff Law
2020-11-13  8:23 ` [PATCH 22/23] Add rtl-ssa Richard Sandiford
2020-12-16  3:31   ` Jeff Law
2020-12-17  0:33     ` Richard Sandiford
2020-12-19 20:01       ` Jeff Law
2020-11-13  8:24 ` [PATCH 23/23] fwprop: Rewrite to use RTL SSA Richard Sandiford
2020-12-16  3:52   ` Jeff Law
2020-12-17  0:34     ` Richard Sandiford
2020-11-25 19:58 ` [00/23] Make fwprop use an on-the-side RTL SSA representation Jeff Law
2020-11-26 16:03   ` Richard Sandiford
2020-11-27 15:56     ` Michael Matz
2020-11-27 16:31       ` Richard Sandiford
2020-11-30 21:13         ` Jeff Law
2020-12-01  0:03           ` Michael Matz
2020-12-01 10:15             ` Richard Sandiford
2020-12-02  0:25             ` Jeff Law
2020-11-30  6:45     ` Jeff Law [this message]
2020-11-30 14:12       ` Richard Sandiford

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=ca3512c8-6398-d8a0-da50-0ad1e327d136@redhat.com \
    --to=law@redhat.com \
    --cc=gcc-patches@gcc.gnu.org \
    --cc=richard.sandiford@arm.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).